OmegaConf
OmegaConf is a library to handle configurations in Python, typically making a bridge between:
- configuration files (YAML, JSON, …)
 - configuration objects in Python code (dictionaries, dataclasses, …)
 - command line arguments
 
Typically, a yaml configuration is used, loaded with OmegaConf, possibly overridden with command line arguments, and then used in the code. It can also handle command line arguments without any configuration file, merge multiple sources of configuration, etc…
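For instance, here is a minimal sketch (key names are illustrative) that merges defaults defined directly in code with command line overrides, without any configuration file:

```python
from omegaconf import OmegaConf

# Defaults defined directly in code (a plain dict here; a dataclass would also work)
defaults = OmegaConf.create({"model": {"type": "resnet50", "learning_rate": 0.001}})

# Overrides taken from the command line (key=value arguments)
cli = OmegaConf.from_cli()

# Later sources take precedence over earlier ones
cfg = OmegaConf.merge(defaults, cli)
print(OmegaConf.to_yaml(cfg))
```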
Example of a YAML configuration file:
```yaml
model:
  type: "resnet50"
  learning_rate: 0.001
  batch_size: 32
dataset:
  name: cifar10
logging:
  iterations: [100, 200, 1000]
  name: "bs_${model.batch_size}" # see tutorial for more interpolation
                                 # with default, env vars, etc...
```
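As a small illustration of the interpolation features mentioned in the comment above, values can also be pulled from environment variables via the oc.env resolver, optionally with a default (the keys below are illustrative, not part of the example config):

```yaml
paths:
  data_dir: "${oc.env:DATA_DIR,/tmp/data}"   # environment variable with a fallback default
  run_name: "${model.type}_${dataset.name}"  # composing other config values
```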
Example of a variety of command line invocations:

```bash
pip install omegaconf
python train.py
python train.py model.learning_rate=2e-4
python train.py dataset.name=CIFAR100 model.type=resnet18
```

Example of Python code using OmegaConf:
```python
from omegaconf import OmegaConf

cfg = OmegaConf.load("config.yml")
cfg.merge_with_cli()

print(cfg.model.type)
print(cfg.model["learning_rate"])
print(cfg.dataset.name)
print("# Modifying and dumping to yaml")
cfg.dataset.name = cfg.dataset.name.lower()
print(OmegaConf.to_yaml(cfg, resolve=True))  # resolve/interpolate
```

Example output
when run with:
```bash
python train.py model.learning_rate=2e-4 model.batch_size=8
```
The above code, with the above config.yml, will output:
```
resnet50
0.0002
cifar10
# Modifying and dumping to yaml
model:
  type: resnet50
  learning_rate: 0.0002
  batch_size: 8
dataset:
  name: cifar10
logging:
  iterations:
  - 100
  - 200
  - 1000
  name: bs_8
```

Hydra
Hydra builds on top of OmegaConf and handles the main entry point of your application.
It provides a decorator: annotate your main function with @hydra.main and it gets wrapped so that it automatically loads a config file, handles command line arguments, etc.
Compared to using OmegaConf directly, Hydra provides:
- automatic loading of configuration files (whose path/name can be passed as a command line argument) that get fed to the main function,
 - a concept of config directory that is used to load config files (the main one and possible config chunks),
 - at the command line, a distinction between overriding config values (with the OmegaConf syntax `key=value`) and adding new config values (with `+new_key=new_value`), which helps detect typos (see the examples right after this list),
 - the ability to compose multiple config files and override specific parts of the configuration (see config groups in the documentation).
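For example, with the example configuration above (model.dropout being a hypothetical new key):

```bash
python train.py model.learning_rate=0.01   # override an existing value
python train.py +model.dropout=0.5         # add a new key (errors if it already exists)
python train.py ++model.dropout=0.5        # add the key, or override it if it already exists
```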
 
Important
- Beware: Hydra expects you to use `.yaml` as the file extension for configuration files, not `.yml` (and config names do not include the extension).
 - Hydra automatically writes output files (logs, config files, etc…) in a new folder for each run. By default, these folders are created in `./outputs/YYYY-MM-DD/HH-MM-SS` relative to the current working directory (see the sketch after this list for customizing them).
 - Multiruns (parameter sweeps, see below) create folders within `./multirun/` instead of `./outputs/`.
 - Looking at the command line syntax page of the Hydra documentation is very useful, especially if you are not very familiar with the shell.
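If the default layout does not suit you, the run and sweep directories can be changed from the configuration; a minimal sketch (the exact pattern is up to you):

```yaml
hydra:
  run:
    dir: outputs/${hydra.job.name}/${now:%Y-%m-%d_%H-%M-%S}
  sweep:
    dir: multirun/${hydra.job.name}/${now:%Y-%m-%d_%H-%M-%S}
```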
 
Example of Python code using Hydra
```python
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="conf/", config_name="config", version_base="1.3")
def train(cfg: DictConfig) -> None:
    if 'quiet' in cfg and cfg.quiet:
        return
    print("# Accessing config values")
    print(cfg.model.type)
    print(cfg.model["learning_rate"])
    print(cfg.dataset.name)

    print("# Modifying and dumping to yaml")
    cfg.dataset.name = cfg.dataset.name.lower()
    print(repr(cfg))
    print(OmegaConf.to_yaml(cfg, resolve=True))  # resolve/interpolate

if __name__ == "__main__":
    train()
```

One can run the above code, using the conf/config.yaml OmegaConf configuration, with:
```bash
python train.py
```
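For reference, since config_path="conf/" and config_name="config", this assumes a layout like the following (relative to train.py):

```
.
├── train.py
└── conf/
    └── config.yaml
```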
Hydra multiruns (parameter sweeps), from the command line
Hydra provides an easy way to launch multiple runs with different configurations (parameter sweeps).
You can do it from the command line, using the -m option and specifying a comma-separated list of values for some keys.
For instance, with:
```bash
python train.py -m model.learning_rate=0.001,0.0001 model.batch_size=16,32,64,128,256 +quiet=True
```

Hydra will then run all the 10 combinations (2 learning rates x 5 batch sizes).
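Beyond plain comma-separated lists, the basic sweeper also understands a richer override grammar, for instance ranges and choices (a quick sketch; quote the values so the shell does not interpret the parentheses):

```bash
python train.py -m 'model.batch_size=range(16,80,16)'      # sweeps over 16, 32, 48, 64
python train.py -m 'model.type=choice(resnet18,resnet50)'
```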
Hydra multirun from the config file
If your sweeps are a core part of your experiments and you don’t want to pass them at the command line every time, you can also specify them in a config file, under the hydra key.
```yaml
# ...
hydra:
  sweeper:
    params:
      model.learning_rate: 0.1,0.01,0.001,0.0001
```

Still running with -m:
```bash
python train.py -m
```

Hydra multirun from the config groups (chunks, facets)
You can also use hydra config groups to specify sweeps.
For instance, you can create a file conf/hydra/sweeplr.yaml (the folder name is important, see config groups) with the following content:
```yaml
sweeper:
  params:
    model.learning_rate: 0.1,0.01,0.001,0.0001
```

Then, you can launch the multirun by asking to add this config with:
```bash
python train.py -m +hydra=sweeplr
```

To combine several chunks, you can pass a list:
```bash
python train.py -m +hydra='[sweeplr,sweepbatch]'
```

Above, the files are in the conf/hydra/ folder and can only contribute to the hydra key of the configuration.
To decorrelate these two aspects (directory name and config key), you can have files in a differently named folder that still contribute to the hydra part of the configuration, thanks to the @package directive. For instance, conf/sweeps/lr.yaml:

```yaml
# @package hydra.sweeper.params
model.learning_rate: 0.1,0.01,0.001,0.0001
```

or, targeting the global package explicitly, conf/sweeps/batch.yaml:

```yaml
# @package _global_
hydra:
  sweeper:
    params:
      model.batch_size: 16,32,64,128,256
```

Then you can launch the multirun with:
```bash
python train.py -m +sweeps=lr
python train.py -m +sweeps='[batch,lr]'
python train.py -m -cn config2 +sweeps='[batch,lr]'
```

Hydra structured configs
To have better type checking and autocompletion, you can define structured configs with python dataclasses (or pydantic models). See the structured config tutorial for all details.
We introduce here only one approach, which replaces the base yaml config file with a Python dataclass. Still, all overrides, multiruns, config groups, etc… work as before.
The main Python file imports the config dataclass and uses it as the type of the config argument of the main function. This way, tools can type check it.
```python
from config import Config   ######## <<<<-----

import hydra
from omegaconf import OmegaConf

@hydra.main(config_path="conf/", config_name="config", version_base="1.3")
def train(cfg: Config) -> None:  ###### <<<<-----
    if 'quiet' in cfg and cfg.quiet:
        return
    #print(cfg.problem)
    print(OmegaConf.to_yaml(cfg, resolve=True))  # resolve/interpolate

if __name__ == "__main__":
    train()
```

The configuration dataclass itself looks like this (it needs to be defined and then registered in Hydra's ConfigStore).
```python
from dataclasses import field
from dataclasses import dataclass
#from pydantic.dataclasses import dataclass  # for more pydantic features

@dataclass
class ModelConfig:
    type: str = "resnet50"
    learning_rate: float = 0.001
    batch_size: int = 32

@dataclass
class DatasetConfig:
    name: str = "cifar10"

@dataclass
class LoggingConfig:
    iterations: list[int] = (100, 200, 1000)
    name: str = "bs_${model.batch_size}"

@dataclass
class Config:
    model: ModelConfig = field(default_factory=ModelConfig)
    dataset: DatasetConfig = field(default_factory=DatasetConfig)
    logging: LoggingConfig = field(default_factory=LoggingConfig)
    quiet: bool | None = None

# Register it
from hydra.core.config_store import ConfigStore
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
```

Example with pydantic BaseModel’s
It might be better to use BaseModel from pydantic instead of a dataclass, for better validation.
BUT… it seems OmegaConf does not accept pydantic models directly.
```python
from pydantic import BaseModel

class ModelConfig(BaseModel):
    type: str = "resnet50"
    learning_rate: float = 0.001
    batch_size: int = 32

class DatasetConfig(BaseModel):
    name: str = "cifar10"

class LoggingConfig(BaseModel):
    iterations: list[int] = [100, 200, 1000]
    name: str = "bs_${model.batch_size}"

class Config(BaseModel):
    model: ModelConfig = ModelConfig()
    dataset: DatasetConfig = DatasetConfig()
    logging: LoggingConfig = LoggingConfig()
    quiet: bool | None = None

# Register it
from hydra.core.config_store import ConfigStore
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
```

If one wants to have the config file (Python dataclass) in the same folder as the usual config files (yaml), it is possible.
One just has to place it in conf/ and import it properly.
```python
from conf.config import Config   ######## <<<<-----

# ... the rest is unchanged...
```

Hydra typing but keeping yaml base config file
This approach is hybrid: you keep the base config file as yaml, but you define a dataclass for typing only. It is not DRY (Don’t Repeat Yourself) but it might be preferred in some cases. Indeed, the python typing dataclass becomes much cleaner:
cleaner config.py (using dataclasses, so not validating)
```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    type: str
    learning_rate: float
    batch_size: int

@dataclass
class DatasetConfig:
    name: str

@dataclass
class LoggingConfig:
    iterations: list[int]
    name: str

@dataclass
class Config:
    model: ModelConfig
    dataset: DatasetConfig
    logging: LoggingConfig
    quiet: bool | None
```

Then the main code is unchanged (it imports the Config dataclass from config.py).
It is recommended to use BaseModel from pydantic instead of dataclass for better validation.
```python
from pydantic import BaseModel

class ModelConfig(BaseModel):
    type: str
    learning_rate: float
    batch_size: int

class DatasetConfig(BaseModel):
    name: str

class LoggingConfig(BaseModel):
    iterations: list[int]
    name: str

class Config(BaseModel):
    model: ModelConfig
    dataset: DatasetConfig
    logging: LoggingConfig
    quiet: bool | None
```

And in the main code, to explicitly validate the config (to catch e.g. typos):
```python
...
def train(cfg: Config) -> None:
    Config.model_validate(cfg)
    ...
```
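If you also want to work with a fully typed object afterwards (not just trigger validation), one option, sketched below, is to convert the DictConfig to a plain container first and build the pydantic model from it:

```python
from omegaconf import OmegaConf

# inside the @hydra.main-decorated function:
def train(cfg) -> None:
    # Resolve interpolations, convert to a plain dict, then build a typed Config
    typed_cfg = Config.model_validate(OmegaConf.to_container(cfg, resolve=True))
    print(typed_cfg.model.learning_rate)
```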
Running slurm with Hydra (launchers)
See [Hydra submitit launcher](https://hydra.cc/docs/plugins/submitit_launcher/) for all details.
For this step we will typically run the script from the slurm front node (labslurm), which has shared file system access to the compute nodes (labcompute-<id>).
In addition to our config (which we keep, to still be able to run fully normally), we can create a new config (here slurm24.yaml, used below with -cn slurm24) that:
- inherits from the base config (so that we don’t repeat ourselves)
 - selects the slurm launcher
 - configures some default slurm parameters (partition, gpus, cpus, memory, time, etc…)
 
```yaml
defaults:
  - config
  - override hydra/launcher: submitit_slurm

hydra:
  launcher:
    nodes: 1
    name: ${hydra.job.name}
    _target_: hydra_plugins.hydra_submitit_launcher.submitit_launcher.SlurmLauncher
    partition: "GPU,GPU-DEPINFO"
    gres: "gpu:1"
    cpus_per_task: 2
    mem_per_cpu: 32G
    timeout_min: 1200
    constraint: "[gpu24G]"
```

We can also have some specific “groups” for typical overrides, for instance in conf/slurm/GPU.yaml:
```yaml
# @package hydra.launcher
partition: "GPU"
```

Then we can launch a slurm job, from the front, with:
```bash
pip install omegaconf hydra-core
pip install hydra-submitit-launcher --upgrade
pip install setuptools
```
```bash
python train.py -m -cn slurm24
python train.py -m -cn slurm24 +slurm=GPU
python train.py -m -cn slurm24 +slurm=GPU +sweeps=batch
```

(use uv run if using uv)
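If the project is managed with uv, the same commands are simply prefixed, for example:

```bash
uv run python train.py -m -cn slurm24 +slurm=GPU +sweeps=batch
```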
Running different configs/entry-points with slurm
The previous example used one slurm config (slurm24.yaml).
However, it requires creating a new slurm config file every time we have a new base config.
To separate concerns, we can put all the slurm config in a file, typically conf/slurm/gpu24.yaml:
```yaml
# @package _global_
defaults:
  - override /hydra/launcher: submitit_slurm

hydra:
  launcher:
    nodes: 1
    name: ${hydra.job.name}
    _target_: hydra_plugins.hydra_submitit_launcher.submitit_launcher.SlurmLauncher
    partition: "GPU,GPU-DEPINFO"
    gres: "gpu:1"
    cpus_per_task: 2
    mem_per_cpu: 32G
    timeout_min: 1200
    constraint: "[gpu24G]"
```

Then, we can launch any config with that slurm config, for instance:
```bash
pip install omegaconf hydra-core setuptools
pip install hydra-submitit-launcher --upgrade
```
```bash
python train.py -m +slurm=gpu24
python train.py -m +slurm=gpu24 +sweeps=batch
python train.py -m -cn config2 +slurm=gpu24 +sweeps=batch
```