# Welcome

<div data-full-width="false"><figure><picture><source srcset="/files/XElGRKvhdyH5unbX9hyH" media="(prefers-color-scheme: dark)"><img src="/files/0gIDppmJ23Mkkwy6g33e" alt=""></picture><figcaption></figcaption></figure></div>

### Hypster is a lightweight configuration framework for managing and **optimizing AI & ML workflows**

## Key Features

* :snake: **Pythonic API**: Intuitive & minimal syntax that feels natural to Python developers
* :nesting\_dolls: **Hierarchical, Conditional Configurations**: Support for nested and swappable configurations
* :triangular\_ruler: **Type Safety**: Built-in type hints and validation
* :mag: **Schema Exploration**: Inspect parameters, defaults, and active branches with `explore()`
* :test\_tube: **Hyperparameter Optimization Built-In**: Native, first-class Optuna support

> Show your support by giving us a [star](https://github.com/gilad-rubin/hypster)! ⭐

## How Does It Work?

{% stepper %}
{% step %}
**Install Hypster**

{% code overflow="wrap" %}

```bash
uv add hypster
```

{% endcode %}
{% endstep %}

{% step %}
**Define a configuration space**

{% code overflow="wrap" %}

```python
from hypster import HP
from my_app.llms import LLMClient


def llm_config(hp: HP) -> LLMClient:
    model_name = hp.select(["gpt-5.5", "gpt-5.5-mini"], name="model_name")
    temperature = hp.float(0.0, name="temperature", min=0.0, max=1.0)

    return LLMClient(model_name=model_name, temperature=temperature)
```

{% endcode %}
{% endstep %}

{% step %}
**Explore your configuration**

{% code overflow="wrap" %}

```python
from hypster import explore

explore(llm_config)
```

{% endcode %}
{% endstep %}

{% step %}
**Instantiate your runtime object**

{% code overflow="wrap" %}

```python
from hypster import instantiate

llm = instantiate(llm_config, values={"model_name": "gpt-5.5", "temperature": 0.7})
```

{% endcode %}
{% endstep %}

{% step %}
**Execute!**

{% code overflow="wrap" %}

```python
llm.invoke("What is Hypster?")
```

{% endcode %}
{% endstep %}
{% endstepper %}

## Discover Hypster

<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Getting Started</strong></td><td>How to create &#x26; instantiate Hypster configs</td><td></td><td><a href="/files/gvePptq2cRq2igl7WW6q">/files/gvePptq2cRq2igl7WW6q</a></td><td><a href="/pages/AOgX0cNsNavrnnfQlMx0">/pages/AOgX0cNsNavrnnfQlMx0</a></td></tr><tr><td><strong>Tutorials</strong></td><td>Step-by-step guides for ML &#x26; Generative AI use-cases</td><td></td><td><a href="/files/Nj5lJQCrmlAPCTmQbniQ">/files/Nj5lJQCrmlAPCTmQbniQ</a></td><td><a href="/pages/sG9e3bPGcSDJ519TnTmC">/pages/sG9e3bPGcSDJ519TnTmC</a></td></tr><tr><td><strong>Best Practices</strong></td><td>How to make the most out of Hypster</td><td></td><td><a href="/files/3HiBewQRJ2fliafwoYSY">/files/3HiBewQRJ2fliafwoYSY</a></td><td><a href="/pages/xpQxAIZEJ6Qsrj3Scmbh">/pages/xpQxAIZEJ6Qsrj3Scmbh</a></td></tr></tbody></table>

## Why Use Hypster?

In modern AI/ML development, we often need to handle **multiple configurations across different scenarios**. This is essential because:

1. We don't know in advance which **hyperparameters** will best optimize our performance metrics and satisfy our constraints.
2. We need to support multiple **"modes"** for different scenarios. For example:
   1. Local vs. Remote Environments, Development vs. Production Settings
   2. Different App Configurations for specific use-cases and populations

Hypster takes care of these challenges by providing a simple way to define configuration spaces and instantiate them into concrete workflows. This enables you to manage and optimize swappable runtime components in your codebase.

## Core Workflow

* **Define** ordinary Python config functions whose first argument is `hp: HP`.
* **Return** the initialized object your application will use whenever that object is cheap and safe to construct.
* **Choose** swappable components with named option dictionaries that map simple keys to config functions.
* **Explore** the active parameter tree with `explore(config)` before running a branch.
* **Instantiate** with `instantiate(config, values={...})` when you only need the returned object.
* **Log params** with `instantiate_with_params(config, values={...})` when you need a stable replay record.

## Design Notes

Hypster treats `values=` as a reproducibility surface. Unknown values and values for inactive branches raise by default, because silently accepting them can make an experiment impossible to replay. Use `explore(config, values=...)` to inspect a branch before instantiating it.

Because exploration and interactive controls execute the config function to discover the current branch, avoid doing work there that should happen only once or only after the user confirms a run. Build expensive clients, load indexes, write files, call paid APIs, and train models after `instantiate()` returns.

## Additional Reading

* [Introducing Hypster](https://medium.com/@giladrubin/introducing-hypster-a-pythonic-framework-for-managing-configurations-to-build-highly-optimized-ai-5ee004dbd6a5)
* [Implementing Modular RAG With Haystack & Hypster](https://towardsdatascience.com/implementing-modular-rag-with-haystack-and-hypster-d2f0ecc88b8f)
* [5 Pillars for Hyper-Optimized AI Workflows](https://medium.com/@giladrubin/5-pillars-for-a-hyper-optimized-ai-workflow-21fcaefe48ca)

## AI-Readable Docs

GitBook publishes Hypster's docs as an agent-friendly index at [llms.txt](https://gilad-rubin.gitbook.io/hypster/llms.txt) and as a full Markdown export at [llms-full.txt](https://gilad-rubin.gitbook.io/hypster/llms-full.txt).


# Installation

Hypster's core package has no runtime dependencies and supports Python 3.10, 3.11, and 3.12.

## Install With uv

{% code overflow="wrap" %}

```bash
uv add hypster
```

{% endcode %}

Install Optuna support when you want hyperparameter optimization:

{% code overflow="wrap" %}

```bash
uv add "hypster[optuna]"
```

{% endcode %}

Install the notebook visualization extra when you want Hypster's interactive instantiation UI:

{% code overflow="wrap" %}

```bash
uv add "hypster[viz]"
```

{% endcode %}

The `viz` extra installs `anywidget`, `ipywidgets`, and `jupyterlab_widgets` for Jupyter Notebook, JupyterLab, and VS Code notebooks.

If you are starting a notebook project from scratch, install the notebook frontend and Hypster widget runtime together:

{% code overflow="wrap" %}

```bash
uv add "hypster[viz]" jupyterlab
```

{% endcode %}

## Install With pip

Check that `python` points at a supported interpreter first:

{% code overflow="wrap" %}

```bash
python --version
```

{% endcode %}

Hypster supports Python 3.10, 3.11, and 3.12.

{% code overflow="wrap" %}

```bash
pip install hypster
```

{% endcode %}

Optional extras:

{% code overflow="wrap" %}

```bash
pip install "hypster[optuna]"
pip install "hypster[viz]"
```

{% endcode %}

For a new JupyterLab environment:

{% code overflow="wrap" %}

```bash
python -m pip install "hypster[viz]" jupyterlab
```

{% endcode %}

## Verify The Install

Run a smoke test in the same environment where your project runs:

{% code overflow="wrap" %}

```python
from pathlib import Path

from hypster import HP, explore, instantiate


def config(hp: HP) -> Path:
    data_dir = hp.text("data", name="data_dir")
    split = hp.select(["train", "validation"], name="split")
    return Path(data_dir) / f"{split}.csv"


explore(config)
path = instantiate(config, values={"split": "validation"})
assert path == Path("data/validation.csv")
```

{% endcode %}

Expected tree:

{% code overflow="wrap" %}

```
config
├── data_dir: text = 'data'
└── split: select = 'train'  (options: ['train', 'validation'])
```

{% endcode %}

## Check The Version

With uv:

{% code overflow="wrap" %}

```bash
uv run python -c "import hypster; print(hypster.__version__)"
```

{% endcode %}

With pip or a manually managed interpreter:

{% code overflow="wrap" %}

```bash
python -c "import hypster; print(hypster.__version__)"
```

{% endcode %}

Inside Python:

{% code overflow="wrap" %}

```python
import hypster

print(hypster.__version__)
```

{% endcode %}

## Development Setup

{% code overflow="wrap" %}

```bash
git clone https://github.com/gilad-rubin/hypster.git
cd hypster
uv sync --all-extras --dev
uv run pytest
```

{% endcode %}

Hypster's maintainer tooling lives in local `uv` dependency groups rather than a published `dev` extra, so a `pip`-based setup installs runtime extras and maintainer tools separately:

{% code overflow="wrap" %}

```bash
git clone https://github.com/gilad-rubin/hypster.git
cd hypster
python -m pip install -e ".[viz,optuna]"
python -m pip install pytest pytest-cov "coverage[toml]" ruff mypy pre-commit pytest-codspeed
```

{% endcode %}

Adjust the extras in the first command if you only need a subset of the optional runtime integrations.

## Troubleshooting

If `import hypster` fails, check that your package manager installed Hypster into the interpreter running your code:

{% code overflow="wrap" %}

```bash
python -m pip show hypster
python -c "import hypster; print(hypster.__version__)"
```

{% endcode %}
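
The same check can be run from inside Python, which helps when a notebook kernel and your shell use different interpreters. The sketch below uses only the standard library; `describe_package` is a hypothetical helper name, not part of Hypster.

{% code overflow="wrap" %}

```python
import importlib.util
import sys


def describe_package(name):
    """Report which interpreter is running and whether it can import `name`."""
    spec = importlib.util.find_spec(name)  # None when the package is not importable
    return {
        "interpreter": sys.executable,
        "installed": spec is not None,
        "location": getattr(spec, "origin", None),
    }


# Run this in the environment (or notebook kernel) where the import fails.
print(describe_package("hypster"))
```

{% endcode %}

If `installed` is `False`, install Hypster with the interpreter shown under `interpreter`, for example with `python -m pip` from that same environment.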

If Optuna imports fail, install the optional extra:

{% code overflow="wrap" %}

```bash
python -m pip install "hypster[optuna]"
```

{% endcode %}

If `interact()` says the visualization extra is missing, install the notebook widget extra and restart the notebook kernel:

{% code overflow="wrap" %}

```bash
python -m pip install "hypster[viz]"
```

{% endcode %}

For notebook UI issues, make sure your notebook frontend is installed:

{% code overflow="wrap" %}

```bash
# JupyterLab
uv add jupyterlab

# Classic Jupyter Notebook
uv add notebook
```

{% endcode %}

With pip:

{% code overflow="wrap" %}

```bash
python -m pip install -U jupyterlab
python -m pip install -U notebook
```

{% endcode %}

In VS Code notebooks, the Jupyter extension may ask to enable downloads for `anywidget` support files the first time the widget renders. Accept that prompt, then rerun the cell.

Hypster config functions are plain Python. No CLI, project initialization, or config file format is required.


# Define a Config Function

A Hypster configuration space is an ordinary Python function whose first argument is named `hp`. It is not a DSL or a separate config file format: use normal Python control flow, lists, helper functions, imports, and object construction.

{% code overflow="wrap" %}

```python
from hypster import HP

def config(hp: HP):
    ...
```

{% endcode %}

`hp` must be the first positional parameter; keyword-only `hp` is rejected before the config executes. The `hp: HP` annotation is recommended, but unannotated config functions are still valid.
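
To make the rule concrete, here is a minimal sketch of how such a signature check could be written with the standard library. It illustrates the rule only and is not Hypster's actual validation code; `hp_is_first_positional` is a hypothetical helper.

{% code overflow="wrap" %}

```python
import inspect


def hp_is_first_positional(func) -> bool:
    """Return True when the first parameter is a positional parameter named `hp`."""
    params = list(inspect.signature(func).parameters.values())
    if not params:
        return False
    first = params[0]
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return first.name == "hp" and first.kind in positional_kinds


def good(hp):
    ...


def keyword_only(*, hp):
    ...


def wrong_order(data, hp):
    ...


print(hp_is_first_positional(good))          # True
print(hp_is_first_positional(keyword_only))  # False
print(hp_is_first_positional(wrong_order))   # False
```

{% endcode %}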

The function can return anything your application needs, but the most common pattern is to return the initialized runtime object your application will use. Prefer a return type annotation when the output is a meaningful object.

Because Hypster discovers available parameters by running this function, keep config bodies fast and predictable. Avoid side effects such as writes, training, paid API calls, network calls, database access, or heavyweight initialization inside the config.

## Add Parameters

Use `hp.*` calls for values that should be visible, overrideable, replayable, searchable, or rendered in a UI.

{% code overflow="wrap" %}

```python
from hypster import HP
from my_app.llms import LLMClient

def llm_config(hp: HP) -> LLMClient:
    model_name = hp.select(["gpt-5.5", "gpt-5.5-mini"], name="model_name")
    temperature = hp.float(0.2, name="temperature", min=0.0, max=2.0)
    max_tokens = hp.int(1024, name="max_tokens", min=1, max=16_384)

    return LLMClient(
        model_name=model_name,
        temperature=temperature,
        max_tokens=max_tokens,
    )
```

{% endcode %}

Every public parameter needs an explicit `name=...`. Names must be valid Python identifiers, so use `batch_size`, not `batch-size` or `model.learning_rate`.
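
You can screen candidate names before using them with Python's built-in `str.isidentifier()`. This is a plain-Python sketch of the identifier rule; Hypster may apply additional checks that are not shown here, and `valid_names` is a hypothetical helper.

{% code overflow="wrap" %}

```python
def valid_names(candidates):
    """Keep only the names that are valid Python identifiers."""
    return [name for name in candidates if name.isidentifier()]


candidates = ["batch_size", "batch-size", "model.learning_rate", "lr2", "2lr"]

print(valid_names(candidates))  # ['batch_size', 'lr2']
```

{% endcode %}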

## Use Branches

Hypster is define-by-run. Only parameters touched by the active branch are selected for that run.

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP

def model_config(hp: HP) -> ClassifierMixin:
    family = hp.select(["linear", "forest"], name="family", default="forest", options_only=True)

    if family == "linear":
        C = hp.float(1.0, name="C", min=1e-4, max=100.0)
        return LogisticRegression(C=C, max_iter=1000)

    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    return RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
```

{% endcode %}

If `family="linear"`, `n_estimators` and `max_depth` are unreachable. Hypster raises if you pass values for inactive branches.

This define-by-run model is what makes normal Python branches work: Hypster runs the function with the selected values and records the `hp.*` calls it reaches.
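
The idea can be sketched with a toy recorder. `ToyHP` and `trace` below are hypothetical stand-ins written for this illustration; they are not Hypster's classes and skip validation, bounds, and nesting. They only show why a value supplied for an inactive branch is never reached.

{% code overflow="wrap" %}

```python
class ToyHP:
    """Toy stand-in for an HP object; not Hypster's implementation."""

    def __init__(self, values=None):
        self.values = dict(values or {})
        self.reached = {}  # every parameter the config actually touched

    def _resolve(self, name, default):
        value = self.values.get(name, default)
        self.reached[name] = value
        return value

    def select(self, options, *, name, default=None):
        fallback = options[0] if default is None else default
        return self._resolve(name, fallback)

    def int(self, default, *, name):
        return self._resolve(name, default)

    def float(self, default, *, name):
        return self._resolve(name, default)


def model_config(hp):
    family = hp.select(["linear", "forest"], name="family", default="forest")
    if family == "linear":
        return {"C": hp.float(1.0, name="C")}
    return {"n_estimators": hp.int(200, name="n_estimators")}


def trace(config, values=None):
    """Run the config once and report reached parameters and unused overrides."""
    hp = ToyHP(values)
    config(hp)
    unreachable = sorted(set(hp.values) - set(hp.reached))
    return hp.reached, unreachable


reached, unreachable = trace(model_config, {"family": "linear", "n_estimators": 500})
print(reached)      # {'family': 'linear', 'C': 1.0}
print(unreachable)  # ['n_estimators']
```

{% endcode %}

Because the forest branch never runs when `family="linear"`, the recorder never sees `n_estimators`, which is the situation Hypster reports as an error.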

## Compose With Nesting

The branch example above works, but for reusable components we recommend composition with `hp.nest()`. Each model gets its own config function, the parent config only chooses which child to run, and interactive UIs can render the nested child parameters as a contained group.

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP

def logistic_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    return RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)

model_options = {
    "linear": logistic_model,
    "forest": forest_model,
}

def model_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")
```

{% endcode %}

Nested parameters use dotted paths in `values=`:

{% code overflow="wrap" %}

```python
{"family": "forest", "model.n_estimators": 500}
```

{% endcode %}

Compared with the inline branch version, this keeps whichever model is selected, forest or linear, under the same `model` group, with each model's parameters defined in its own reusable child config. That makes larger configs easier to scan, test, reuse, and render in interactive UIs.

## Use Dict-Backed Selects For Swappable Components

The nested model example uses a dict-backed `select`: `values=` records the simple key such as `"forest"`, while the config receives the mapped function `forest_model`.

Keep the options mapping in a named variable such as `model_options`, `optimizer_options`, or `retriever_options`. That makes long option sets easier to scan, test, and reuse, while the parent config stays focused on the runtime decision. Add `options_only=True` when values outside the mapping should be rejected.

## Return Initialized Objects

It is often best to let a config return the initialized object your application will use:

{% code overflow="wrap" %}

```python
from sklearn.ensemble import RandomForestClassifier
from hypster import HP

def classifier_config(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    min_samples_leaf = hp.int(2, name="min_samples_leaf", min=1, max=50)
    random_state = hp.int(42, name="random_state", min=0)

    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=random_state,
    )
```

{% endcode %}

{% hint style="info" %}
Initialized in-memory objects are a good fit when construction is cheap. For SDK clients, remote retrievers, loaded indexes, database handles, training jobs, or writes, return lightweight settings and build the side-effectful object after `instantiate()`. See [Best Practices](/hypster/in-depth/basic-best-practices).
{% endhint %}

## Next Step

Use [Explore a Configuration Space](/hypster/getting-started/exploring-a-configuration-space) to inspect the parameter tree before running it.


# Explore a Configuration Space

Use `explore()` to inspect the parameters that are reachable for the active branch of a config function. Because Hypster configs are pure Python rather than a declarative DSL, exploration discovers the schema by executing the config with a schema-recording `HP`. It follows the same Python conditionals, loops, and helper calls as `instantiate()`.

That execution should be cheap and safe. Keep side effects, paid API calls, database work, file writes, training, and costly object construction outside the config paths you explore.

## Print The Parameter Tree

{% code overflow="wrap" %}

```python
from hypster import HP, explore
from my_app.backends import Application, LocalBackend, RemoteBackend

def local_backend(hp: HP) -> LocalBackend:
    threads = hp.int(4, name="threads", min=1, max=64)
    cache = hp.bool(True, name="cache")
    return LocalBackend(threads=threads, cache=cache)

def remote_backend(hp: HP) -> RemoteBackend:
    endpoint = hp.text("https://api.example.com", name="endpoint")
    timeout = hp.float(10.0, name="timeout", min=0.1, max=120.0)
    return RemoteBackend(endpoint=endpoint, timeout=timeout)

backend_options = {
    "local": local_backend,
    "remote": remote_backend,
}

def app_config(hp: HP) -> Application:
    selected_config = hp.select(backend_options, name="backend", default="local", options_only=True)
    backend = hp.nest(selected_config, name="backend_settings")
    return Application(backend=backend)

explore(app_config)
```

{% endcode %}

Expected output:

{% code overflow="wrap" %}

```
app_config
├── backend: select = "local"  (options: ["local", "remote"])
└── backend_settings
    ├── threads: int = 4  (1-64)
    └── cache: bool = True
```

{% endcode %}

## Explore A Different Conditional Branch

Pass `values=` to choose a branch before tracing it:

{% code overflow="wrap" %}

```python
explore(
    app_config,
    values={"backend": "remote", "backend_settings.timeout": 30.0},
)
```

{% endcode %}

Expected output:

{% code overflow="wrap" %}

```
app_config
├── backend: select = "remote"  (options: ["local", "remote"])
└── backend_settings
    ├── endpoint: text = "https://api.example.com"
    └── timeout: float = 30.0  (0.1-120.0)
```

{% endcode %}

## Get Structured Metadata

Use `return_schema=True` when you want to inspect the schema in code:

{% code overflow="wrap" %}

```python
info = explore(app_config, return_schema=True)

print(info.defaults())
print(info.to_dict())
```

{% endcode %}

For programmatic inspection before instantiation, use `schema = explore(config, return_schema=True)` and read `schema.to_dict()["parameters"]`. Use plain `explore(config)` when a printed tree is enough.

`defaults()` returns a flat dictionary of the active branch's default parameter values:

{% code overflow="wrap" %}

```python
{
    "backend": "local",
    "backend_settings.threads": 4,
    "backend_settings.cache": True,
}
```

{% endcode %}

## When To Use `explore()` vs `instantiate()`

| Use                          | API                                   |
| ---------------------------- | ------------------------------------- |
| Print the active tree        | `explore(config)`                     |
| Build a UI or schema export  | `explore(config, return_schema=True)` |
| Inspect a conditional branch | `explore(config, values={...})`       |
| Get the runtime object       | `instantiate(config, values={...})`   |

By default, `explore()` raises when `values=` contains unknown names or overrides for a branch that was not reached. Pass `on_unknown="warn"` to inspect while warning, or `on_unknown="ignore"` to silence it.
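
The three behaviors can be pictured with a small standalone sketch. `check_unknown` is a hypothetical function, and the `"raise"` spelling for the default mode is an assumption made for this example; only `"warn"` and `"ignore"` are named above.

{% code overflow="wrap" %}

```python
import warnings


def check_unknown(values, reachable, on_unknown="raise"):
    """Handle override names that were not reached, in the chosen mode."""
    unknown = sorted(set(values) - set(reachable))
    if not unknown:
        return []
    message = f"Unknown or unreachable parameters: {unknown}"
    if on_unknown == "raise":  # assumed name for the default mode
        raise ValueError(message)
    if on_unknown == "warn":
        warnings.warn(message, stacklevel=2)
    elif on_unknown != "ignore":
        raise ValueError(f"Unsupported on_unknown mode: {on_unknown!r}")
    return unknown


reachable = {"backend", "backend_settings.threads", "backend_settings.cache"}
overrides = {"backend": "local", "backend_settings.timeout": 30.0}

# "ignore" lets inspection continue and reports what was skipped.
print(check_unknown(overrides, reachable, on_unknown="ignore"))
# ['backend_settings.timeout']
```

{% endcode %}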

## Notes

* Exploration runs your config function. Keep config functions fast and side-effect-free, just like you would for HPO or interactive UI generation.
* Avoid paid API calls, database calls, file writes, training loops, and costly resource initialization in code paths that `explore()` will execute.
* `explore()` records `hp.*` parameters, nested groups, defaults, selected values, options, and numeric bounds.
* Select choices are converted to JSON-friendly values in `to_dict()`.
* `explore()` does not instantiate external services unless your config function does so directly.


# Instantiate and Replay

Use `instantiate()` to execute a config function and get its returned runtime value.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    return RandomForestClassifier(n_estimators=n_estimators, random_state=42)

model_options = {"linear": linear_model, "forest": forest_model}

def model_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")

model = instantiate(model_config, values={"family": "forest", "model.n_estimators": 500})
assert isinstance(model, RandomForestClassifier)
```

{% endcode %}

If you want to inspect a branch before running it, use [`explore()`](/hypster/getting-started/exploring-a-configuration-space).

## Log Selected Params

Use `instantiate_with_params()` when you need a replayable record for experiment tracking tools such as MLflow, Weights & Biases, or a database table. The returned `params` dictionary includes every reachable `hp.*` parameter on that run, including defaults the user did not override.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate, instantiate_with_params
from my_app.llms import OpenAIClient

def llm_config(hp: HP) -> OpenAIClient:
    model_name = hp.select(["gpt-5.5-mini", "gpt-5.5"], name="model_name")
    temperature = hp.float(0.2, name="temperature", min=0.0, max=2.0)
    return OpenAIClient(model=model_name, temperature=temperature)

run = instantiate_with_params(llm_config, values={"model_name": "gpt-5.5"})

assert run.value.model == "gpt-5.5"
assert run.params == {"model_name": "gpt-5.5", "temperature": 0.2}

replayed = instantiate(llm_config, values=run.params)
assert replayed.model == run.value.model
```

{% endcode %}

`instantiate_with_params()` accepts the same `values` and `on_unknown` arguments as `instantiate()`, plus any direct execution arguments your config requires. It does not change what your config returns; it adds a sidecar for logging and replay.
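
When the params are flat and JSON-friendly, as in the example above, persisting them can be as simple as a JSON round trip. The sketch below assumes exactly that and uses a literal dictionary in place of `run.params`; `dump_params` and `load_params` are hypothetical helpers, not Hypster APIs.

{% code overflow="wrap" %}

```python
import json


def dump_params(params):
    """Serialize params with sorted keys so records diff cleanly."""
    return json.dumps(params, sort_keys=True)


def load_params(text):
    """Read a stored record back into a dictionary for `values=`."""
    return json.loads(text)


# Stand-in for `run.params` from the example above.
params = {"model_name": "gpt-5.5", "temperature": 0.2}

record = dump_params(params)
print(record)  # {"model_name": "gpt-5.5", "temperature": 0.2}

restored = load_params(record)
```

{% endcode %}

The restored dictionary has the same shape as the original, so it can be passed back as `values=` to replay the run.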

## Unknown Parameters

Unknown or conditionally unreachable values raise by default:

{% code overflow="wrap" %}

```python
instantiate(model_config, values={"n_trees": 200})
# ValueError: Unknown or unreachable parameters:
#   - 'n_trees': Unknown parameter
#
# Run explore(config, values=...) to inspect the active branch.
# Nested dict values are interpreted as parameter paths; use dict-backed select keys for objects.
```

{% endcode %}

Use `on_unknown="warn"` or `on_unknown="ignore"` only when you intentionally want softer handling:

{% code overflow="wrap" %}

```python
instantiate(model_config, values={"n_trees": 200}, on_unknown="warn")
```

{% endcode %}

## Dotted Keys vs Nested Dicts

Nested values can be provided with dotted keys or nested dictionaries:

{% code overflow="wrap" %}

```python
def child_config(hp: HP):
    return {"x": hp.int(1, name="x")}

def parent_config(hp: HP):
    return {"child": hp.nest(child_config, name="child")}

assert instantiate(parent_config, values={"child.x": 10}) == {"child": {"x": 10}}
assert instantiate(parent_config, values={"child": {"x": 10}}) == {"child": {"x": 10}}
```

{% endcode %}

Do not provide both forms for the same final path in the same call. Hypster raises duplicate-path errors to keep logs unambiguous. The nested scope name itself is not a leaf parameter, so `values={"child": 10}` raises as unknown or unreachable.

See [Values & Overrides](/hypster/in-depth/values-and-overrides) for more examples.
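
One way to picture the equivalence of the two forms is a flattening step that turns nested dictionaries into dotted paths and rejects a path that appears twice. This is a sketch of the concept under that assumption, not Hypster's implementation; `flatten_values` is a hypothetical helper.

{% code overflow="wrap" %}

```python
def flatten_values(values, prefix=""):
    """Flatten nested dicts into dotted paths, rejecting duplicate final paths."""
    flat = {}
    for key, value in values.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            items = flatten_values(value, prefix=f"{path}.")
        else:
            items = {path: value}
        for item_path, item_value in items.items():
            if item_path in flat:
                raise ValueError(f"Duplicate path: {item_path!r}")
            flat[item_path] = item_value
    return flat


print(flatten_values({"child.x": 10}))         # {'child.x': 10}
print(flatten_values({"child": {"x": 10}}))    # {'child.x': 10}

try:
    flatten_values({"child.x": 10, "child": {"x": 11}})
except ValueError as error:
    print(error)  # Duplicate path: 'child.x'
```

{% endcode %}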

## Nullable Parameters

`None` is an explicit value in Hypster, not a marker for "not selected". Use `allow_none=True` when `None` is part of a parameter's domain:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    thinking_level = hp.select(
        [None, "low", "medium", "high"],
        name="thinking_level",
        default=None,
        allow_none=True,
    )
    return {"max_depth": max_depth, "thinking_level": thinking_level}
```

{% endcode %}

Nullable elements are supported for `multi_select(..., allow_none=True)`. They are not supported for `multi_int`, `multi_float`, `multi_text`, or `multi_bool`; those calls raise with guidance if `allow_none=True` is passed.

Nullable choices are captured and replayed like any other selected parameter:

{% code overflow="wrap" %}

```python
from hypster import instantiate_with_params

run = instantiate_with_params(config, values={"thinking_level": None})

assert run.value == {"max_depth": None, "thinking_level": None}
assert run.params["thinking_level"] is None
assert instantiate(config, values=run.params) == run.value
```

{% endcode %}

## Passing Execution Arguments To Nested Configs

When using `hp.nest`, pass child execution arguments directly as keyword arguments.

{% code overflow="wrap" %}

```python
def child(hp: HP, multiplier: int, offset: int = 0) -> int:
    base = hp.int(5, name="base")
    return base * multiplier + offset

def parent(hp: HP):
    calc1 = hp.nest(child, name="calc1", multiplier=2)
    calc2 = hp.nest(child, name="calc2", multiplier=3, offset=10)
    return {"calc1": calc1, "calc2": calc2}

result = instantiate(parent)
assert result == {"calc1": 10, "calc2": 25}
```

{% endcode %}


# Select Return Values

Hypster config functions return exactly what you decide to return. There is no framework-specific config object to build. In most application code, the cleanest return value is the initialized object that the caller will actually use.

## Return Initialized Runtime Objects

The common Hypster style is to make the config a typed factory for a runtime object.

{% code overflow="wrap" %}

```python
from sklearn.ensemble import RandomForestClassifier
from hypster import HP, instantiate

def classifier_config(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    min_samples_leaf = hp.int(2, name="min_samples_leaf", min=1, max=50)
    random_state = hp.int(42, name="random_state", min=0)

    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=random_state,
    )

model = instantiate(classifier_config, values={"n_estimators": 500})
assert isinstance(model, RandomForestClassifier)
```

{% endcode %}

The return type annotation helps readers, IDEs, and downstream code understand what instantiation produces.

## Return A Dict When The Runtime Shape Is A Dict

Returning a dict is fine when the object your application needs is actually a mapping. Avoid using dicts as a generic bag of settings that must be assembled somewhere else.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def metric_tags_config(hp: HP) -> dict[str, str]:
    environment = hp.select(["dev", "staging", "prod"], name="environment", default="dev", options_only=True)
    owner = hp.text("ml-platform", name="owner")
    return {"environment": environment, "owner": owner}

tags = instantiate(metric_tags_config, values={"environment": "prod"})
assert tags == {"environment": "prod", "owner": "ml-platform"}
```

{% endcode %}

## Use Dict-Backed Selects For Swappable Objects

Use dict-backed `select` when a parameter should log a simple key but return a more complex runtime value. For swappable configs, map keys to config functions and nest the selected function.

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP

def logistic_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    min_samples_leaf = hp.int(2, name="min_samples_leaf", min=1, max=50)
    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=42,
    )

model_options = {
    "logistic": logistic_model,
    "forest": forest_model,
}

def classifier_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="model_family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")
```

{% endcode %}

The named `model_options` dict is part of the pattern. If the option set is long, keeping it separate from the parent config is usually clearer than embedding a large dictionary inside `hp.select(...)` or growing an `if`/`elif` chain. Nesting also gives UI renderers a natural group for the selected child's controls.

## Use hp.collect

`hp.collect()` helps gather local variables while excluding `hp`, private names, and anything you explicitly exclude.

{% code overflow="wrap" %}

```python
from hypster import HP

def config(hp: HP):
    batch_size = hp.int(32, name="batch_size")
    learning_rate = hp.float(0.001, name="learning_rate")
    helper = "not returned"
    return hp.collect(locals(), exclude=["helper"])
```

{% endcode %}

Use `include=[...]` when you want to whitelist outputs:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    batch_size = hp.int(32, name="batch_size")
    learning_rate = hp.float(0.001, name="learning_rate")
    debug_label = "local"
    return hp.collect(locals(), include=["batch_size", "learning_rate"])
```

{% endcode %}
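
The filtering described above can be approximated in plain Python. `collect_locals` is a hypothetical function that mirrors the description only; treating "private" as a leading underscore is an assumption, and the real `hp.collect()` may differ in details such as how `include` and `exclude` combine.

{% code overflow="wrap" %}

```python
def collect_locals(namespace, include=None, exclude=None):
    """Filter a locals()-style mapping the way the text above describes."""
    excluded = set(exclude or [])
    result = {}
    for name, value in namespace.items():
        # Drop `hp`, underscore-prefixed names, and explicit exclusions.
        if name == "hp" or name.startswith("_") or name in excluded:
            continue
        # With a whitelist, keep only the listed names.
        if include is not None and name not in include:
            continue
        result[name] = value
    return result


namespace = {
    "hp": object(),
    "batch_size": 32,
    "learning_rate": 0.001,
    "helper": "not returned",
    "_scratch": 1,
}

print(collect_locals(namespace, exclude=["helper"]))
# {'batch_size': 32, 'learning_rate': 0.001}
print(collect_locals(namespace, include=["batch_size"]))
# {'batch_size': 32}
```

{% endcode %}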

## Keep Expensive Side Effects Outside Configs

Initializing in-memory objects is a good fit for Hypster configs. Keep expensive effects such as training, network calls, or database writes outside the config function when possible. That keeps `explore()`, HPO, UI generation, and replay fast and predictable.
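
A common way to follow this advice is to have the config produce a small, immutable settings object and to build the expensive object from it afterwards. The sketch below shows that split in plain Python; `RetrieverSettings`, `Retriever`, and `retriever_settings` are hypothetical names, and the index loading is simulated.

{% code overflow="wrap" %}

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrieverSettings:
    """Cheap, comparable record of the choices made for a run."""

    index_path: str
    top_k: int


class Retriever:
    """Hypothetical expensive object; loading is simulated."""

    def __init__(self, settings):
        self.settings = settings
        self.loaded = True  # stands in for reading a large index from disk


def retriever_settings(index_path="indexes/docs", top_k=5):
    # In a Hypster config these values would come from hp.text(...) and hp.int(...).
    return RetrieverSettings(index_path=index_path, top_k=top_k)


settings = retriever_settings(top_k=10)  # cheap: safe to run repeatedly
retriever = Retriever(settings)          # expensive: run once, after the choice is final
```

{% endcode %}

Exploration, search, and UI updates then only ever repeat the cheap step.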


# Interactive Instantiation UI

Use `interact()` in a notebook when you want to instantiate a configuration through a live widget UI.

`interact()` uses the same pure-Python define-by-run model as `explore()`: it runs the config to discover the currently reachable controls. In auto-apply mode it can rerun the config on every valid widget change, and in manual mode it still explores draft changes so dependent controls stay current. Keep configs used with `interact()` fast and side-effect-free.

Install the notebook renderer with the visualization extra:

{% code overflow="wrap" %}

```bash
uv add "hypster[viz]"
```

{% endcode %}

or:

{% code overflow="wrap" %}

```bash
pip install "hypster[viz]"
```

{% endcode %}

The `viz` extra installs the widget runtime needed by Jupyter Notebook, JupyterLab, and VS Code notebooks. In VS Code, the Jupyter extension may ask to enable downloads for `anywidget` support files the first time a widget is displayed. Accept that prompt, then rerun the cell.

## Start An Interaction

{% code overflow="wrap" %}

```python
from hypster import HP, interact
from my_app.llms import AnthropicClient, OpenAIClient


def openai_config(hp: HP) -> OpenAIClient:
    model_name = hp.select(
        ["gpt-5.5-mini", "gpt-5.5"],
        name="model_name",
        default="gpt-5.5-mini",
        options_only=True,
    )
    temperature = hp.float(0.2, name="temperature", min=0.0, max=1.0)
    cache = hp.bool(True, name="cache")
    return OpenAIClient(model=model_name, temperature=temperature, cache=cache)


def anthropic_config(hp: HP) -> AnthropicClient:
    model_name = hp.select(
        ["claude-sonnet", "claude-opus"],
        name="model_name",
        default="claude-sonnet",
        options_only=True,
    )
    temperature = hp.float(0.2, name="temperature", min=0.0, max=1.0)
    cache = hp.bool(True, name="cache")
    return AnthropicClient(model=model_name, temperature=temperature, cache=cache)


model_options = {
    "openai": openai_config,
    "anthropic": anthropic_config,
}


def model_config(hp: HP):
    selected_config = hp.select(
        model_options,
        name="provider",
        default="openai",
        options_only=True,
        description="Chooses which provider branch is active.",
    )
    return hp.nest(selected_config, name="model")


result = interact(model_config)
```

{% endcode %}

`interact()` returns an interactive result handle, not the raw configured object. After changing the widget, read the currently applied object and its replayable selected params from Python:

{% code overflow="wrap" %}

```python
client = result.value
params = result.params
```

{% endcode %}

`result.value` has the same type as the config function return value. In this example it is an initialized `OpenAIClient` or `AnthropicClient`.

`result.params` is a flat dotted-path dictionary that can be replayed through `instantiate(..., values=result.params)` or logged to experiment-tracking tools.

## Applying Changes

By default, widget changes apply immediately. Valid changes update `result.value` and `result.params` in the running kernel.

Use manual apply mode when you want to stage widget edits before updating the applied result:

{% code overflow="wrap" %}

```python
result = interact(model_config, auto_apply=False)
```

{% endcode %}

In manual mode, the UI continues to explore draft values so dependent controls stay current, but `result.value` and `result.params` keep returning the last applied state until Apply succeeds.

If a widget selection is invalid, the UI shows the current error. In auto-apply mode, `result.value` and `result.params` raise that error until the widget state is fixed; snapshots report `selected_params=None` so renderer code does not confuse stale params with the current invalid state.

| State                                         | What the widget shows                                | `result.value` / `result.params`                                       |
| --------------------------------------------- | ---------------------------------------------------- | ---------------------------------------------------------------------- |
| Auto-apply, valid edit                        | The edit is applied immediately.                     | Updated to the new applied value and params.                           |
| Auto-apply, invalid edit                      | The UI shows the validation error.                   | Raise `RuntimeError` until the state is fixed.                         |
| Manual mode, valid draft edit                 | Controls update and dependent branches are explored. | Keep returning the last applied value and params until Apply succeeds. |
| Manual mode, invalid draft edit               | The UI shows a draft error and disables Apply.       | Keep returning the last successfully applied value and params.         |
| Manual mode, Apply succeeds                   | The draft becomes the applied state.                 | Updated to the new applied value and params.                           |
| Manual mode, Apply fails during instantiation | The UI shows the apply error.                        | Raise `RuntimeError` until a valid state is applied.                   |
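
The table can be read as a small state machine. The toy class below models only the read behavior of `value` and `params`. `ToyInteraction` and `build` are hypothetical names written for this page, not Hypster's result handle, and the sketch does not model the separate draft validation that disables Apply.

{% code overflow="wrap" %}

```python
from typing import Any, Callable, Optional


class ToyInteraction:
    """Toy model of the state table; not Hypster's result handle."""

    def __init__(self, build: Callable[[dict], Any], initial: dict, auto_apply: bool = True) -> None:
        self._build = build
        self.auto_apply = auto_apply
        self.draft = dict(initial)  # what the widget currently shows
        self._params: Optional[dict] = None
        self._value: Any = None
        self._error: Optional[Exception] = None
        self.apply()  # start from an applied state

    def edit(self, path: str, new_value: Any) -> None:
        self.draft[path] = new_value
        if self.auto_apply:  # auto-apply: every edit is applied at once
            self.apply()

    def apply(self) -> None:
        try:
            value = self._build(self.draft)
        except Exception as exc:  # failed apply: reads raise until fixed
            self._error = exc
            return
        self._value, self._params, self._error = value, dict(self.draft), None

    def _check(self) -> None:
        if self._error is not None:
            raise RuntimeError(f"invalid state: {self._error}")

    @property
    def value(self) -> Any:
        self._check()
        return self._value

    @property
    def params(self) -> dict:
        self._check()
        return dict(self._params)


def build(values: dict) -> str:
    # Stand-in for instantiating a client; rejects out-of-range temperatures.
    if not 0.0 <= values["temperature"] <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    return f"client(temperature={values['temperature']})"


manual = ToyInteraction(build, {"temperature": 0.2}, auto_apply=False)
manual.edit("temperature", 0.7)  # staged only
staged_params = manual.params    # still the last applied state
manual.apply()
applied_params = manual.params   # now reflects the draft

auto = ToyInteraction(build, {"temperature": 0.2})
auto.edit("temperature", 5.0)    # invalid edit in auto-apply mode
try:
    auto.params
except RuntimeError as exc:
    auto_error = str(exc)
auto.edit("temperature", 0.5)    # fixing the widget state clears the error
recovered_params = auto.params
```

{% endcode %}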

## Continuing An Interaction

Call `result.interact()` to render another live view of the same interaction:

{% code overflow="wrap" %}

```python
result.interact()
```

{% endcode %}

To start a fresh session from a previous selection, pass selected params explicitly:

{% code overflow="wrap" %}

```python
result2 = interact(model_config, values=result.params)
```

{% endcode %}

For framework-specific UIs outside notebooks, use the schema returned by `explore(..., return_schema=True)`. See [Build an Interactive UI](/hypster/how-to-guides/build-an-interactive-ui).


# Examples Overview

Use these examples as copyable shapes for real projects. Each page focuses on one workflow and uses only public Hypster APIs.

Hypster config functions are plain Python functions:

{% code overflow="wrap" %}

```python
from pathlib import Path

from hypster import HP

def config(hp: HP) -> Path:
    mode = hp.select(["fast", "accurate"], name="mode", default="fast")
    return Path("runs") / mode
```

{% endcode %}

The same function can serve several jobs:

* `explore(config)` prints the active parameter tree.
* `explore(config, return_schema=True)` returns a schema object for tools and UIs.
* `instantiate(config, values={...})` returns the runtime object.
* `instantiate_with_params(config, values={...})` returns the runtime object plus replayable selected params.
* `suggest_values(trial, config)` lets Optuna choose values for the reachable parameters.

## Example Map

| Page                                                           | Use it when you need                                                                           |
| -------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| [Machine Learning](/hypster/examples/machine-learning)         | Model-family branches, numeric bounds, nullable values, and HPO-ready configs.                 |
| [Data Processing](/hypster/examples/data-processing)           | Ingestion, cleaning, feature flags, and export options in one pipeline config.                 |
| [AI Workflows](/hypster/examples/ai-workflows)                 | Provider selection, RAG knobs, prompt settings, and dict-backed complex choices.               |
| [Nested Workflows](/hypster/examples/nested-workflows)         | Reusable child configs, conditional nesting, deep value paths, and branch exploration.         |
| [Interactive UI From Schema](/hypster/examples/interactive-ui) | Generate form state from `explore(..., return_schema=True)` and feed it back to `instantiate`. |
| [Experiment Tracking](/hypster/examples/experiment-tracking)   | Capture selected params for logs, cards, and replay.                                           |

## A Small End-to-End Pattern

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate_with_params
from my_app.processing import PipelineRun

def pipeline_config(hp: HP) -> PipelineRun:
    stage = hp.select(["debug", "full"], name="stage", default="debug")

    if stage == "full":
        sample_rows = hp.int(1_000_000, name="sample_rows", min=100, max=10_000_000)
    else:
        sample_rows = hp.int(1000, name="sample_rows", min=100, max=1_000_000)

    return PipelineRun(stage=stage, sample_rows=sample_rows)

explore(pipeline_config)

run = instantiate_with_params(
    pipeline_config,
    values={"stage": "full", "sample_rows": 250_000},
)

assert run.value.stage == "full"
assert run.value.sample_rows == 250_000
assert run.params == {"stage": "full", "sample_rows": 250_000}
```

{% endcode %}

{% hint style="info" %}
Hypster raises an error when `values=` includes unknown parameters or parameters from a branch that was not reached. That default protects replayability. Use `explore(config, values=...)` to inspect the branch you are about to run.
{% endhint %}
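
The rule in the hint can be pictured as a set comparison between the supplied keys and the reachable paths. The helper `check_values_reachable` below is a hypothetical pre-flight check, not a Hypster function; the exception type and message Hypster actually uses are not shown here.

{% code overflow="wrap" %}

```python
def check_values_reachable(values: dict, reachable_paths: set[str]) -> None:
    """Hypothetical check: reject keys that the active branch never defines."""
    unknown = sorted(set(values) - reachable_paths)
    if unknown:
        raise ValueError(f"Unknown or unreachable parameters: {unknown}")


# In pipeline_config, both branches expose the same two parameter names.
reachable = {"stage", "sample_rows"}

# A correct override passes silently.
check_values_reachable({"stage": "full", "sample_rows": 250_000}, reachable)

# A typo in a key is caught before anything is built.
try:
    check_values_reachable({"stage": "full", "sample_row": 250_000}, reachable)
except ValueError as exc:
    message = str(exc)
```

{% endcode %}

Failing on the misspelled key is what keeps a logged parameter set meaningful: a silently ignored override would make the record disagree with the object that was actually built.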


# Machine Learning

This example shows the recommended ML shape: each Hypster config returns an initialized model object with a precise return type annotation. The snippets assume your project has scikit-learn installed.

## Define Model Factories

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP, explore, instantiate_with_params

def logistic_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    solver = hp.select(["lbfgs", "liblinear"], name="solver", default="lbfgs", options_only=True)

    return LogisticRegression(
        C=C,
        solver=solver,
        max_iter=1000,
    )

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    min_samples_leaf = hp.int(2, name="min_samples_leaf", min=1, max=50)
    random_state = hp.int(42, name="random_state", min=0)

    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=random_state,
    )
```

{% endcode %}

## Choose The Active Model

For model families, keep the selectable options in a dict from replayable keys to child config functions. This scales better than a long conditional, keeps each model's hyperparameters close to the object it initializes, and lets UIs show the selected model parameters as one nested group.

{% code overflow="wrap" %}

```python
model_options = {
    "logistic": logistic_model,
    "forest": forest_model,
}

def classifier_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="model_family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")
```

{% endcode %}

The config returns a ready-to-use classifier, not a dictionary that must be assembled later. The selected params record `"forest"` or `"logistic"`, while the application receives the initialized estimator.
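
A short plain-Python illustration of why the key, rather than the selected function, is the thing worth recording. The factories `make_forest` and `make_logistic` are hypothetical stand-ins, and no Hypster calls are involved.

{% code overflow="wrap" %}

```python
import json


def make_forest() -> dict:
    return {"kind": "forest"}  # stand-in for an initialized estimator


def make_logistic() -> dict:
    return {"kind": "logistic"}


options = {"forest": make_forest, "logistic": make_logistic}

selected_key = "forest"
params = {"model_family": selected_key}  # what gets logged and replayed
model = options[selected_key]()          # what the application uses

serialized = json.dumps(params)  # plain keys serialize cleanly

# The mapped function itself cannot be written to a JSON log.
try:
    json.dumps({"model_family": options[selected_key]})
    function_serializable = True
except TypeError:
    function_serializable = False
```

{% endcode %}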

## Explore And Instantiate

{% code overflow="wrap" %}

```python
explore(classifier_config)

run = instantiate_with_params(
    classifier_config,
    values={
        "model_family": "forest",
        "model.n_estimators": 500,
        "model.max_depth": 12,
    },
)

model = run.value
params = run.params
```

{% endcode %}

Expected selected params:

{% code overflow="wrap" %}

```python
{
    "model_family": "forest",
    "model.n_estimators": 500,
    "model.max_depth": 12,
    "model.min_samples_leaf": 2,
    "model.random_state": 42,
}
```

{% endcode %}

Use `model.fit(X_train, y_train)` in your training code after instantiation. Keep training itself outside the config function so `explore()`, HPO, and UI generation stay fast.

## Build A Full Pipeline Config

For larger experiments, nest preprocessing and model configs under one parent. Each nested config owns one part of the workflow, while the selected params remain a flat replayable dictionary.

{% code overflow="wrap" %}

```python
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler, StandardScaler


def preprocessing_config(hp: HP) -> Pipeline:
    scaler = hp.select(
        [None, "standard", "robust"],
        name="scaler",
        default="standard",
        allow_none=True,
        options_only=True,
    )
    impute_strategy = hp.select(["median", "mean"], name="impute_strategy", default="median", options_only=True)

    steps = [("imputer", SimpleImputer(strategy=impute_strategy))]
    if scaler == "standard":
        steps.append(("scaler", StandardScaler()))
    elif scaler == "robust":
        steps.append(("scaler", RobustScaler()))

    return Pipeline(steps)


def experiment_config(hp: HP) -> Pipeline:
    preprocessing = hp.nest(preprocessing_config, name="preprocessing")
    classifier = hp.nest(classifier_config, name="classifier")
    return Pipeline([("preprocessing", preprocessing), ("classifier", classifier)])


run = instantiate_with_params(
    experiment_config,
    values={
        "preprocessing.scaler": "robust",
        "classifier.model_family": "forest",
        "classifier.model.n_estimators": 500,
    },
)

assert isinstance(run.value.named_steps["classifier"], RandomForestClassifier)
assert run.params["preprocessing.scaler"] == "robust"
assert run.params["classifier.model.n_estimators"] == 500
```

{% endcode %}

Use `run.value.fit(X_train, y_train)` in your training code, and log `run.params` as the exact experiment record.

## Make It HPO-Ready

Add `hpo_spec=` where search semantics matter. Normal instantiation ignores these specs; the Optuna adapter consumes them.

{% code overflow="wrap" %}

```python
from hypster.hpo.types import HpoFloat, HpoInt

def tunable_forest(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(
        200,
        name="n_estimators",
        min=50,
        max=1000,
        hpo_spec=HpoInt(step=50),
    )
    max_depth = hp.int(
        12,
        name="max_depth",
        min=2,
        max=64,
        hpo_spec=HpoInt(scale="log"),
    )

    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42,
    )

def regularized_logistic(hp: HP) -> LogisticRegression:
    C = hp.float(
        1.0,
        name="C",
        min=1e-4,
        max=100.0,
        hpo_spec=HpoFloat(scale="log"),
    )

    return LogisticRegression(C=C, max_iter=1000)
```

{% endcode %}

See [Perform Hyperparameter Optimization](/hypster/how-to-guides/perform-hyperparameter-optimization) for the Optuna objective pattern.
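
To make the two search semantics concrete, here is a plain-Python sketch of what a stepped integer range and a log-scaled range mean. It uses only the standard library, and the helper names are hypothetical. How the Optuna adapter translates `hpo_spec` internally is not shown here, so treat this as an illustration of the concepts only.

{% code overflow="wrap" %}

```python
import math
import random


def stepped_int_grid(low: int, high: int, step: int) -> list[int]:
    """All candidates a stepped integer search can land on."""
    return list(range(low, high + 1, step))


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw uniformly in log space, then map back to the original scale."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


# n_estimators with min=50, max=1000, step=50
grid = stepped_int_grid(50, 1000, 50)

# C with min=1e-4, max=100.0 on a log scale
rng = random.Random(0)
samples = [sample_log_uniform(1e-4, 100.0, rng) for _ in range(1000)]

# Four of the six decades lie below 1.0, so about two thirds of draws do too.
share_below_one = sum(s < 1.0 for s in samples) / len(samples)
```

{% endcode %}

A log scale spends equal effort on each order of magnitude, which suits parameters such as regularization strength. A step shrinks the search space to a coarse grid.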


# Data Processing

Data pipelines often mix environment choices, schema decisions, cleaning rules, and export settings. Hypster keeps those choices explicit and replayable.

## Configure The Pipeline

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate
from my_app.data import Cleaner, CsvReader, DataPipeline
from my_app.exporters import CsvExporter, JsonLinesExporter, ParquetExporter

def input_config(hp: HP) -> CsvReader:
    path = hp.text("data/raw/events.csv", name="path")
    delimiter = hp.select([",", "\t", "|"], name="delimiter", default=",", options_only=True)
    encoding = hp.select(["utf-8", "latin-1"], name="encoding", default="utf-8", options_only=True)

    return CsvReader(
        path=path,
        delimiter=delimiter,
        encoding=encoding,
    )

def cleaning_config(hp: HP) -> Cleaner:
    drop_empty_rows = hp.bool(True, name="drop_empty_rows")
    normalize_columns = hp.bool(True, name="normalize_columns")
    fill_missing_numeric = hp.float(None, name="fill_missing_numeric", allow_none=True)
    date_columns = hp.multi_text(["created_at"], name="date_columns")

    return Cleaner(
        drop_empty_rows=drop_empty_rows,
        normalize_columns=normalize_columns,
        fill_missing_numeric=fill_missing_numeric,
        date_columns=date_columns,
    )

def parquet_export_config(hp: HP) -> ParquetExporter:
    path = hp.text("data/processed/events.parquet", name="path")
    return ParquetExporter(path=path)

def csv_export_config(hp: HP) -> CsvExporter:
    path = hp.text("data/processed/events.csv", name="path")
    include_header = hp.bool(True, name="include_header")

    return CsvExporter(
        path=path,
        include_header=include_header,
    )

def jsonl_export_config(hp: HP) -> JsonLinesExporter:
    path = hp.text("data/processed/events.jsonl", name="path")
    return JsonLinesExporter(path=path)

export_options = {
    "parquet": parquet_export_config,
    "csv": csv_export_config,
    "jsonl": jsonl_export_config,
}

def export_config(hp: HP):
    selected_config = hp.select(export_options, name="format", default="parquet", options_only=True)
    return hp.nest(selected_config, name="settings")

def data_pipeline_config(hp: HP) -> DataPipeline:
    mode = hp.select(["sample", "full"], name="mode", default="sample", options_only=True)

    if mode == "full":
        row_limit = hp.int(10_000_000, name="row_limit", min=1)
    else:
        row_limit = hp.int(10_000, name="row_limit", min=1)

    reader = hp.nest(input_config, name="input")
    cleaner = hp.nest(cleaning_config, name="cleaning")
    exporter = hp.nest(export_config, name="export")

    return DataPipeline(
        mode=mode,
        row_limit=row_limit,
        reader=reader,
        cleaner=cleaner,
        exporter=exporter,
    )
```

{% endcode %}

This shape assumes the reader, cleaner, and exporter constructors are lightweight and do not read or write data. Run the actual pipeline after `instantiate()` returns.

## Explore A Production Branch

{% code overflow="wrap" %}

```python
explore(
    data_pipeline_config,
    values={
        "mode": "full",
        "input.path": "s3://warehouse/events/2026-05-24.csv",
        "export.format": "jsonl",
        "export.settings.path": "s3://warehouse/processed/events.jsonl",
    },
)
```

{% endcode %}

## Instantiate A Run

{% code overflow="wrap" %}

```python
pipeline = instantiate(
    data_pipeline_config,
    values={
        "mode": "full",
        "input.path": "s3://warehouse/events/2026-05-24.csv",
        "input.delimiter": ",",
        "cleaning.fill_missing_numeric": 0.0,
        "export.format": "jsonl",
        "export.settings.path": "s3://warehouse/processed/events.jsonl",
    },
)

assert pipeline.mode == "full"
assert pipeline.reader.path.startswith("s3://")
assert pipeline.cleaner.fill_missing_numeric == 0.0
```

{% endcode %}

## Why This Shape Works

* The `mode` branch changes the default row limit while keeping the same parameter path.
* Dotted keys keep nested runtime objects replayable without returning a settings dictionary.
* `options_only=True` prevents typos in finite choices such as export formats.
* Nullable numeric values use `allow_none=True`, which makes `None` an explicit, replayable value.


# AI Workflows

Hypster is useful when an AI workflow needs to switch providers, retrieval modes, prompts, safety settings, or output formats while keeping a replayable record of the selected path.

## Provider Configs

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate_with_params
from my_app.llms import GeminiClient, OpenAIClient

def openai_config(hp: HP) -> OpenAIClient:
    model_name = hp.select(
        ["gpt-5.5-mini", "gpt-5.5"],
        name="model_name",
        default="gpt-5.5-mini",
        options_only=True,
    )
    temperature = hp.float(0.2, name="temperature", min=0.0, max=2.0)
    max_tokens = hp.int(1024, name="max_tokens", min=1, max=16_384)
    reasoning_effort = hp.select(
        [None, "low", "medium", "high"],
        name="reasoning_effort",
        default=None,
        allow_none=True,
        options_only=True,
    )
    return OpenAIClient(
        model=model_name,
        temperature=temperature,
        max_tokens=max_tokens,
        reasoning_effort=reasoning_effort,
    )

def gemini_config(hp: HP) -> GeminiClient:
    model_name = hp.select(
        ["gemini-3.5-flash", "gemini-3.1-pro-preview"],
        name="model_name",
        default="gemini-3.5-flash",
        options_only=True,
    )
    temperature = hp.float(0.3, name="temperature", min=0.0, max=2.0)
    max_tokens = hp.int(2048, name="max_tokens", min=1, max=16_384)
    return GeminiClient(model=model_name, temperature=temperature, max_tokens=max_tokens)

provider_options = {
    "openai": openai_config,
    "gemini": gemini_config,
}
```

{% endcode %}

These examples assume client and retriever constructors are local, cheap, and lazy. If construction opens network connections, loads indexes, or calls paid APIs, keep that work outside the config and build it after `instantiate()`.

## RAG And Output Settings

Use dict-backed `select` when the runtime value is complex. The parameter records the simple key, while your app receives the mapped object or selected config function.

{% code overflow="wrap" %}

```python
from my_app.rag import BM25Retriever, DenseRetriever, HybridRetriever
from my_app.rendering import JsonRenderer, MarkdownRenderer, TextRenderer

def keyword_retriever(hp: HP) -> BM25Retriever:
    index_name = hp.text("docs-v1", name="index_name")
    top_k = hp.int(8, name="top_k", min=1, max=50)
    return BM25Retriever(index=index_name, top_k=top_k)

def vector_retriever(hp: HP) -> DenseRetriever:
    index_name = hp.text("docs-embeddings-v3", name="index_name")
    top_k = hp.int(8, name="top_k", min=1, max=50)
    return DenseRetriever(index=index_name, top_k=top_k)

def hybrid_retriever(hp: HP) -> HybridRetriever:
    index_name = hp.text("docs-hybrid-v2", name="index_name")
    keyword_weight = hp.float(0.35, name="keyword_weight", min=0.0, max=1.0)
    top_k = hp.int(8, name="top_k", min=1, max=50)
    return HybridRetriever(index=index_name, keyword_weight=keyword_weight, top_k=top_k)

retriever_options = {
    "keyword": keyword_retriever,
    "vector": vector_retriever,
    "hybrid": hybrid_retriever,
}

def retrieval_config(hp: HP):
    selected_config = hp.select(retriever_options, name="retriever_kind", default="hybrid", options_only=True)
    return hp.nest(selected_config, name="retriever")

def output_config(hp: HP):
    renderer_cls = hp.select(
        {
            "text": TextRenderer,
            "json": JsonRenderer,
            "markdown": MarkdownRenderer,
        },
        name="format",
        default="text",
        options_only=True,
    )
    include_citations = hp.bool(True, name="include_citations")
    system_prompt = hp.text("Answer with concise, sourced reasoning.", name="system_prompt")
    return renderer_cls(
        include_citations=include_citations,
        system_prompt=system_prompt,
    )
```

{% endcode %}

## Compose The Workflow

{% code overflow="wrap" %}

```python
from my_app.workflows import QAWorkflow

def qa_workflow_config(hp: HP) -> QAWorkflow:
    selected_provider = hp.select(provider_options, name="provider", default="openai", options_only=True)
    llm = hp.nest(selected_provider, name="llm")
    retriever = hp.nest(retrieval_config, name="retrieval")
    output = hp.nest(output_config, name="output")

    return QAWorkflow(
        llm=llm,
        retriever=retriever,
        output=output,
    )
```

{% endcode %}

## Explore And Replay

{% code overflow="wrap" %}

```python
explore(
    qa_workflow_config,
    values={"provider": "gemini", "llm.temperature": 0.1, "retrieval.retriever.top_k": 12},
)

run = instantiate_with_params(
    qa_workflow_config,
    values={
        "provider": "gemini",
        "llm.model_name": "gemini-3.1-pro-preview",
        "llm.temperature": 0.1,
        "retrieval.retriever_kind": "vector",
        "output.format": "markdown",
    },
)

assert run.params["provider"] == "gemini"
assert run.params["llm.model_name"] == "gemini-3.1-pro-preview"
assert run.params["retrieval.retriever_kind"] == "vector"
assert run.value.retriever.index == "docs-embeddings-v3"
```

{% endcode %}

## Branch Safety

The following call raises an error by default because `llm.reasoning_effort` is only reachable when the `openai` branch is selected:

{% code overflow="wrap" %}

```python
from hypster import instantiate

instantiate(
    qa_workflow_config,
    values={"provider": "gemini", "llm.reasoning_effort": "high"},
)
```

{% endcode %}

Run `explore(qa_workflow_config, values={"provider": "gemini"})` before instantiation to inspect the reachable parameter paths for that branch.


# Nested Workflows

Use `hp.nest()` to split a large workflow into smaller config functions. Nested config values receive dotted parameter paths such as `trainer.optimizer.settings.learning_rate`.

## A Deeply Nested Training Workflow

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate
from my_app.training import Adam, Optimizer, SGD, Trainer, TrainingExperiment

def adam_config(hp: HP) -> Adam:
    learning_rate = hp.float(0.001, name="learning_rate", min=1e-6, max=1.0)
    beta1 = hp.float(0.9, name="beta1", min=0.0, max=0.999)
    return Adam(learning_rate=learning_rate, beta1=beta1)

def sgd_config(hp: HP) -> SGD:
    learning_rate = hp.float(0.01, name="learning_rate", min=1e-6, max=1.0)
    momentum = hp.float(0.9, name="momentum", min=0.0, max=1.0)
    return SGD(learning_rate=learning_rate, momentum=momentum)

optimizer_options = {
    "adam": adam_config,
    "sgd": sgd_config,
}

def optimizer_config(hp: HP) -> Optimizer:
    selected_config = hp.select(optimizer_options, name="algorithm", default="adam", options_only=True)
    return hp.nest(selected_config, name="settings")

def trainer_config(hp: HP, epochs_default: int = 10) -> Trainer:
    epochs = hp.int(epochs_default, name="epochs", min=1, max=1000)
    batch_size = hp.int(64, name="batch_size", min=1, max=2048)
    optimizer = hp.nest(optimizer_config, name="optimizer")
    return Trainer(epochs=epochs, batch_size=batch_size, optimizer=optimizer)

def experiment_config(hp: HP) -> TrainingExperiment:
    dataset = hp.select(["mnist", "cifar10"], name="dataset", default="mnist", options_only=True)
    trainer = hp.nest(trainer_config, name="trainer", epochs_default=20)
    return TrainingExperiment(dataset=dataset, trainer=trainer)
```

{% endcode %}

The `optimizer_options` dict is deliberately separate from `optimizer_config()`. For nested workflows, that keeps the parent function small while making the set of selectable components obvious.

## Override Nested Values

{% code overflow="wrap" %}

```python
cfg = instantiate(
    experiment_config,
    values={
        "dataset": "cifar10",
        "trainer.epochs": 50,
        "trainer.optimizer.algorithm": "sgd",
        "trainer.optimizer.settings.momentum": 0.95,
    },
)

assert isinstance(cfg.trainer.optimizer, SGD)
assert cfg.trainer.optimizer.momentum == 0.95
```

{% endcode %}

The same values can be expressed as nested dictionaries:

{% code overflow="wrap" %}

```python
cfg = instantiate(
    experiment_config,
    values={
        "dataset": "cifar10",
        "trainer": {
            "epochs": 50,
            "optimizer": {
                "algorithm": "sgd",
                "settings": {"momentum": 0.95},
            },
        },
    },
)
```

{% endcode %}

Do not provide both spellings for the same final parameter path. Hypster raises duplicate-path errors so logs and replays stay unambiguous.
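
The equivalence between the two spellings, and the reason duplicates are ambiguous, can be sketched with a small flattening helper. `flatten_values` is hypothetical and written for illustration; it treats every nested dict as a namespace, so it would not suit parameters whose values are themselves dicts.

{% code overflow="wrap" %}

```python
from typing import Any


def flatten_values(values: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Hypothetical helper: normalize nested dicts to dotted paths."""
    flat: dict[str, Any] = {}
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        items = flatten_values(value, path) if isinstance(value, dict) else {path: value}
        for item_path, item_value in items.items():
            if item_path in flat:  # the same final path was given twice
                raise ValueError(f"Duplicate parameter path: {item_path}")
            flat[item_path] = item_value
    return flat


nested = {
    "dataset": "cifar10",
    "trainer": {
        "epochs": 50,
        "optimizer": {"algorithm": "sgd", "settings": {"momentum": 0.95}},
    },
}
flat = flatten_values(nested)

# Mixing both spellings for one path is ambiguous: which epochs value wins?
try:
    flatten_values({"trainer.epochs": 50, "trainer": {"epochs": 60}})
except ValueError as exc:
    duplicate_message = str(exc)
```

{% endcode %}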

## Explore A Nested Branch

{% code overflow="wrap" %}

```python
explore(
    experiment_config,
    values={"trainer.optimizer.algorithm": "sgd"},
)
```

{% endcode %}

The printed tree includes only the active optimizer branch, so you can see that `momentum` is reachable and `beta1` is not.


# Interactive UI From Schema

Hypster ships a notebook UI through `interact()`. This example shows the lower-level schema path for custom Streamlit, Gradio, Panel, web app, or internal dashboard UIs.

The public schema hook is `explore(config, return_schema=True)`: it returns metadata that you can map to controls and then replay through `instantiate()`.

## Turn A Config Into Field Metadata

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate
from my_app.backends import AppRuntime, LocalBackend, RemoteBackend

def app_config(hp: HP) -> AppRuntime:
    provider = hp.select(["local", "remote"], name="provider", default="local", options_only=True)
    batch_size = hp.int(32, name="batch_size", min=1, max=512)

    if provider == "remote":
        endpoint = hp.text("https://api.example.com", name="endpoint")
        timeout = hp.float(10.0, name="timeout", min=0.1, max=120.0)
        backend = RemoteBackend(endpoint=endpoint, timeout=timeout)
        return AppRuntime(provider=provider, batch_size=batch_size, backend=backend)

    threads = hp.int(4, name="threads", min=1, max=64)
    backend = LocalBackend(threads=threads)
    return AppRuntime(provider=provider, batch_size=batch_size, backend=backend)

def flatten_parameters(parameters):
    fields = []
    for parameter in parameters:
        if parameter["kind"] == "group":
            fields.extend(flatten_parameters(parameter["children"]))
        else:
            fields.append(parameter)
    return fields

schema = explore(app_config, return_schema=True)
fields = flatten_parameters(schema.to_dict()["parameters"])

for field in fields:
    print(field["path"], field["kind"], field["selected_value"])
```

{% endcode %}

## Feed UI State Back Into Hypster

Your UI state should be a dictionary whose keys are Hypster parameter paths:

{% code overflow="wrap" %}

```python
ui_values = {
    "provider": "remote",
    "batch_size": 64,
    "endpoint": "https://staging.example.com",
    "timeout": 30.0,
}

schema = explore(app_config, values=ui_values, return_schema=True)
cfg = instantiate(app_config, values=ui_values)

assert schema.defaults()["provider"] == "local"
assert cfg.provider == "remote"
assert cfg.backend.timeout == 30.0
```

{% endcode %}

## Control Mapping

| Hypster kind   | Typical UI control                            |
| -------------- | --------------------------------------------- |
| `select`       | dropdown, segmented control, radio group      |
| `multi_select` | multiselect checklist                         |
| `int`          | integer input or slider                       |
| `float`        | number input or slider                        |
| `bool`         | checkbox or switch                            |
| `text`         | text input or textarea                        |
| `group`        | fieldset, section, accordion, or nested panel |

## Streamlit Shape

The function below sketches the adapter shape. It assumes you have already installed Streamlit and are running inside a Streamlit app.

{% code overflow="wrap" %}

```python
def render_field(st, field):
    path = field["path"]
    kind = field["kind"]
    value = field["selected_value"]

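    if kind == "select":
        options = field["options"] or []
        # Fall back to the first option if the selected value is not listed.
        index = options.index(value) if value in options else 0
        return st.selectbox(path, options, index=index)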
    if kind == "bool":
        return st.checkbox(path, value=value)
    if kind == "int":
        return st.number_input(path, value=value, step=1)
    if kind == "float":
        return st.number_input(path, value=value)
    if kind == "text":
        return st.text_input(path, value=value)

    raise ValueError(f"Unsupported field kind: {kind}")
```

{% endcode %}

For conditional UIs, rerun `explore(config, values=current_ui_values, return_schema=True)` whenever a branch-selecting value changes. That keeps the rendered fields aligned with the active branch.
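
Because values for unreached branches are rejected, state carried over from a previous branch needs pruning before it is replayed. A minimal sketch, assuming a hypothetical helper `sync_ui_state` and a hand-written field list that uses the `path` and `selected_value` keys shown earlier:

{% code overflow="wrap" %}

```python
from typing import Any


def sync_ui_state(ui_values: dict[str, Any], fields: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: rebuild UI state from freshly explored fields."""
    return {
        field["path"]: ui_values.get(field["path"], field["selected_value"])
        for field in fields
    }


# State left over from the "remote" branch, after the user switched to "local".
stale_ui_values = {
    "provider": "local",
    "batch_size": 64,
    "endpoint": "https://staging.example.com",
    "timeout": 30.0,
}

# Hand-written stand-in for the flattened schema fields of the "local" branch.
local_fields = [
    {"path": "provider", "kind": "select", "selected_value": "local"},
    {"path": "batch_size", "kind": "int", "selected_value": 64},
    {"path": "threads", "kind": "int", "selected_value": 4},
]

ui_values = sync_ui_state(stale_ui_values, local_fields)
```

{% endcode %}

Paths that are still reachable keep the user's value, new paths start from the schema's selected value, and paths from the old branch are dropped.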


# Experiment Tracking

Use `instantiate_with_params()` when you need both the runtime object and the exact selected parameters to log to MLflow, Weights & Biases, a database, or a JSON file.

## Capture The Value And Params

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate, instantiate_with_params
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from my_app.experiments import TrainingJob
from my_app.features import FeatureSelector

def feature_config(hp: HP) -> FeatureSelector:
    numeric = hp.multi_text(["age", "income"], name="numeric")
    categorical = hp.multi_text(["country"], name="categorical")
    scale = hp.bool(True, name="scale")
    return FeatureSelector(numeric=numeric, categorical=categorical, scale=scale)

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def tree_model(hp: HP) -> RandomForestClassifier:
    max_depth = hp.int(6, name="max_depth", min=1, max=64)
    return RandomForestClassifier(max_depth=max_depth, random_state=42)

model_options = {
    "linear": linear_model,
    "tree": tree_model,
}

def training_job_config(hp: HP) -> TrainingJob:
    selected_model = hp.select(model_options, name="model_family", default="linear", options_only=True)
    features = hp.nest(feature_config, name="features")
    model = hp.nest(selected_model, name="model")
    return TrainingJob(features=features, model=model)

run = instantiate_with_params(
    training_job_config,
    values={
        "model_family": "tree",
        "model.max_depth": 12,
        "features.numeric": ["age", "income", "days_active"],
    },
)

assert run.value.model.max_depth == 12
assert run.params == {
    "model_family": "tree",
    "features.numeric": ["age", "income", "days_active"],
    "features.categorical": ["country"],
    "features.scale": True,
    "model.max_depth": 12,
}
replayed = instantiate(training_job_config, values=run.params)
assert replayed.model.max_depth == run.value.model.max_depth
```

{% endcode %}

## Log Params In Your Tracker

{% code overflow="wrap" %}

```python
def log_to_tracker(tracker, run):
    for path, value in run.params.items():
        tracker.log_param(path, value)
```

{% endcode %}

The important detail is that `run.params` contains defaulted values as well as user overrides. That makes it suitable for exact replay.
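
As a minimal sketch of the "JSON file" case, the snippet below uses a hypothetical `JsonFileTracker` class (not part of Hypster or any tracking library) that exposes the same `log_param()` method used above. The `params` dictionary is a hand-written stand-in for `run.params`, and `log_to_tracker` takes that dictionary directly so the example runs without a config.

{% code overflow="wrap" %}

```python
import json
import tempfile
from pathlib import Path


class JsonFileTracker:
    """Hypothetical tracker that collects params and writes them to one JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.params: dict[str, object] = {}

    def log_param(self, key: str, value: object) -> None:
        self.params[key] = value

    def flush(self) -> None:
        # sort_keys keeps the file stable across runs that share the same params.
        self.path.write_text(json.dumps(self.params, indent=2, sort_keys=True))


def log_to_tracker(tracker, params: dict) -> None:
    for path, value in params.items():
        tracker.log_param(path, value)


# Stand-in for `run.params`; in real code pass `run.params` instead.
params = {
    "model_family": "tree",
    "features.numeric": ["age", "income", "days_active"],
    "features.categorical": ["country"],
    "features.scale": True,
    "model.max_depth": 12,
}

with tempfile.TemporaryDirectory() as tmp:
    tracker = JsonFileTracker(Path(tmp) / "params.json")
    log_to_tracker(tracker, params)
    tracker.flush()
    restored = json.loads(tracker.path.read_text())
```

{% endcode %}

The dotted paths are stored as flat keys, so the restored dictionary can be passed back as `values=` without reshaping.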


# Create a Replayable Config

Use this recipe when you want a small config that can be inspected, instantiated, logged, and replayed later.

The config is a normal Python function. Keep it cheap and side-effect-free so `explore()` can run it to inspect parameters and replay tools can execute it safely.

## 1. Define A Typed Config

{% code overflow="wrap" %}

```python
from hypster import HP
from my_app.data import CsvDataset


def data_config(hp: HP) -> CsvDataset:
    path = hp.text("data/train.csv", name="path")
    batch_size = hp.int(64, name="batch_size", min=1)
    shuffle = hp.bool(True, name="shuffle")
    return CsvDataset(path=path, batch_size=batch_size, shuffle=shuffle)
```

{% endcode %}

## 2. Inspect The Parameters

Print the active tree when you want a quick human check:

{% code overflow="wrap" %}

```python
from hypster import explore

explore(data_config)
```

{% endcode %}

Use structured metadata when another tool needs to render fields:

{% code overflow="wrap" %}

```python
schema = explore(data_config, return_schema=True)
fields = schema.to_dict()["parameters"]
```

{% endcode %}

## 3. Instantiate With Overrides

{% code overflow="wrap" %}

```python
from hypster import instantiate

dataset = instantiate(data_config, values={"batch_size": 128})

assert dataset.path == "data/train.csv"
assert dataset.batch_size == 128
assert dataset.shuffle is True
```

{% endcode %}

## 4. Capture Params For Replay

{% code overflow="wrap" %}

```python
from hypster import instantiate_with_params

run = instantiate_with_params(data_config, values={"batch_size": 128})

assert run.value.batch_size == dataset.batch_size
assert run.params == {
    "path": "data/train.csv",
    "batch_size": 128,
    "shuffle": True,
}
```

{% endcode %}

## 5. Store And Replay

{% code overflow="wrap" %}

```python
import json

payload = json.dumps(run.params, sort_keys=True)
restored_params = json.loads(payload)

replayed = instantiate(data_config, values=restored_params)
assert replayed.batch_size == run.value.batch_size
```

{% endcode %}

Because `run.params` includes defaulted values, replay does not silently change if the config's defaults are edited later.
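
Since the stored payload fully determines the run, one optional extension is to derive a short identifier from it. The sketch below is an illustration, not a Hypster feature: `params_fingerprint` is a hypothetical helper, and the dictionaries stand in for `run.params`.

{% code overflow="wrap" %}

```python
import hashlib
import json


def params_fingerprint(params: dict) -> str:
    """Return a short hash that ignores key order in the params dictionary."""
    # sort_keys and fixed separators give one canonical string per payload.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


a = params_fingerprint({"path": "data/train.csv", "batch_size": 128, "shuffle": True})
b = params_fingerprint({"shuffle": True, "batch_size": 128, "path": "data/train.csv"})
c = params_fingerprint({"path": "data/train.csv", "batch_size": 64, "shuffle": True})
```

{% endcode %}

Here `a` and `b` match because only key order differs, while `c` differs because `batch_size` changed. A fingerprint like this can serve as a cache key or a run label.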


# Compose Nested Configs

Use this guide when one workflow should be assembled from reusable smaller config functions.

## Start With Child Configs

{% code overflow="wrap" %}

```python
from hypster import HP, explore, instantiate
from my_app.embeddings import BpeTokenizer, EmbeddingPipeline, Encoder, Tokenizer, WordPieceTokenizer

tokenizer_options = {
    "wordpiece": WordPieceTokenizer,
    "bpe": BpeTokenizer,
}

def tokenizer_config(hp: HP) -> Tokenizer:
    tokenizer_cls = hp.select(tokenizer_options, name="kind", default="bpe", options_only=True)
    lowercase = hp.bool(True, name="lowercase")
    return tokenizer_cls(lowercase=lowercase)

def encoder_config(hp: HP) -> Encoder:
    size = hp.select(["small", "base"], name="size", default="small", options_only=True)
    hidden_size = 384 if size == "small" else 768
    return Encoder(size=size, hidden_size=hidden_size)
```

{% endcode %}

Use this named-options shape when a parent config chooses between swappable children. The dictionary provides the stable keys Hypster logs, while the values can be classes, callables, or full child config functions.

## Nest Them In A Parent

{% code overflow="wrap" %}

```python
def embedding_pipeline(hp: HP) -> EmbeddingPipeline:
    tokenizer = hp.nest(tokenizer_config, name="tokenizer")
    encoder = hp.nest(encoder_config, name="encoder")
    normalize = hp.bool(True, name="normalize")
    return EmbeddingPipeline(tokenizer=tokenizer, encoder=encoder, normalize=normalize)
```

{% endcode %}

## Override Nested Values

{% code overflow="wrap" %}

```python
cfg = instantiate(
    embedding_pipeline,
    values={
        "tokenizer.kind": "wordpiece",
        "encoder.size": "base",
        "normalize": False,
    },
)

assert cfg.encoder.hidden_size == 768
```

{% endcode %}

Nested dictionaries are equivalent:

{% code overflow="wrap" %}

```python
cfg = instantiate(
    embedding_pipeline,
    values={
        "tokenizer": {"kind": "wordpiece"},
        "encoder": {"size": "base"},
        "normalize": False,
    },
)
```

{% endcode %}

The nested scope name is a prefix, not a leaf value. `values={"tokenizer": "wordpiece"}` raises as an unknown parameter because it targets the `tokenizer` scope itself rather than `tokenizer.kind`.

When you pass child-local values through `hp.nest(child, name="child", values=...)`, Hypster validates those explicit child values after the child runs. Typos and inactive child-branch keys raise instead of being ignored.

## Pass Execution Arguments To Children

{% code overflow="wrap" %}

```python
from my_app.training import BatchSampler, TrainingInputs

def sampler_config(hp: HP, default_batch_size: int) -> BatchSampler:
    batch_size = hp.int(default_batch_size, name="batch_size", min=1)
    shuffle = hp.bool(True, name="shuffle")
    return BatchSampler(batch_size=batch_size, shuffle=shuffle)

def training_config(hp: HP) -> TrainingInputs:
    train = hp.nest(sampler_config, name="train", default_batch_size=128)
    # Named eval_sampler so the local variable does not shadow the built-in eval().
    eval_sampler = hp.nest(sampler_config, name="eval", default_batch_size=256)
    return TrainingInputs(train=train, eval=eval_sampler)
```

{% endcode %}

## Inspect Branches Before Running

{% code overflow="wrap" %}

```python
explore(embedding_pipeline, values={"encoder.size": "base"})
```

{% endcode %}

If an override points at a parameter that is not reached on the active branch, Hypster raises by default. That keeps `values=` safe to log and replay.


# Build an Interactive UI

Use [`interact()`](/hypster/getting-started/interactive-instantiation-ui) when you want Hypster's built-in notebook widget. This guide is for custom Streamlit, Gradio, Panel, web app, or internal dashboard UIs.

Use `explore(config, return_schema=True)` to discover fields, render controls in your UI framework, then pass the collected values to `instantiate()`.

`explore()` executes the config function to discover the active branch. A custom UI may call it on every branch-changing edit, so keep config functions cheap and side-effect-free for interactive use; avoid paid API calls, database calls, file writes, training loops, and costly resource initialization in paths that the UI will explore.

## 1. Get A Schema

{% code overflow="wrap" %}

```python
from hypster import HP, explore
from my_app.search import KeywordRetriever, SearchRuntime, VectorRetriever


def keyword_retrieval(hp: HP) -> KeywordRetriever:
    index = hp.text("documents-v1", name="index", description="Keyword index name.")
    top_k = hp.int(20, name="top_k", min=1, max=100)
    return KeywordRetriever(index=index, top_k=top_k)


def vector_retrieval(hp: HP) -> VectorRetriever:
    index = hp.text("embeddings-v1", name="index", description="Vector index name.")
    top_k = hp.int(10, name="top_k", min=1, max=100)
    score_threshold = hp.float(0.2, name="score_threshold", min=0.0, max=1.0)
    return VectorRetriever(index=index, top_k=top_k, score_threshold=score_threshold)


retrieval_options = {
    "keyword": keyword_retrieval,
    "vector": vector_retrieval,
}


def search_config(hp: HP) -> SearchRuntime:
    selected_config = hp.select(
        retrieval_options,
        name="backend",
        default="keyword",
        options_only=True,
        description="Chooses the retrieval branch.",
    )
    retrieval = hp.nest(selected_config, name="retrieval")

    features = hp.multi_select(
        [None, "cache", "trace"],
        name="features",
        default=["cache"],
        allow_none=True,
    )

    return SearchRuntime(retrieval=retrieval, features=features)


schema = explore(search_config, return_schema=True)
metadata = schema.to_dict()
```

{% endcode %}

## 2. Flatten Field Metadata

{% code overflow="wrap" %}

```python
def flatten_fields(parameters):
    for parameter in parameters:
        if parameter["kind"] == "group":
            yield from flatten_fields(parameter["children"])
        else:
            yield parameter

fields = list(flatten_fields(metadata["parameters"]))
```

{% endcode %}

Each field has `name`, `path`, `kind`, `default_value`, `selected_value`, `description`, `display_label`, and `children`, plus `options`, `minimum`, and `maximum`, which are `None` when they do not apply.

Schema metadata is JSON-serializable. After exploring the vector branch with values such as `{"backend": "vector", "features": ["cache", None], "retrieval.index": "embeddings-v3", "retrieval.top_k": 12, "retrieval.score_threshold": 0.35}`, the payload has this shape:

{% code overflow="wrap" %}

```python
{
    "name": "search_config",
    "display_label": "Search Config",
    "parameters": [
        {
            "name": "backend",
            "path": "backend",
            "kind": "select",
            "default_value": "keyword",
            "selected_value": "vector",
            "options": ["keyword", "vector"],
            "minimum": None,
            "maximum": None,
            "description": "Chooses the retrieval branch.",
            "display_label": "Backend",
            "children": [],
        },
        {
            "name": "retrieval",
            "path": "retrieval",
            "kind": "group",
            "default_value": None,
            "selected_value": None,
            "options": None,
            "minimum": None,
            "maximum": None,
            "description": None,
            "display_label": "Retrieval",
            "children": [
                {
                    "name": "index",
                    "path": "retrieval.index",
                    "kind": "text",
                    "default_value": "embeddings-v1",
                    "selected_value": "embeddings-v3",
                    "options": None,
                    "minimum": None,
                    "maximum": None,
                    "description": "Vector index name.",
                    "display_label": "Index",
                    "children": [],
                },
                {
                    "name": "top_k",
                    "path": "retrieval.top_k",
                    "kind": "int",
                    "default_value": 10,
                    "selected_value": 12,
                    "options": None,
                    "minimum": 1,
                    "maximum": 100,
                    "description": None,
                    "display_label": "Top K",
                    "children": [],
                },
                {
                    "name": "score_threshold",
                    "path": "retrieval.score_threshold",
                    "kind": "float",
                    "default_value": 0.2,
                    "selected_value": 0.35,
                    "options": None,
                    "minimum": 0.0,
                    "maximum": 1.0,
                    "description": None,
                    "display_label": "Score Threshold",
                    "children": [],
                },
            ],
        },
        {
            "name": "features",
            "path": "features",
            "kind": "multi_select",
            "default_value": ["cache"],
            "selected_value": ["cache", None],
            "options": [None, "cache", "trace"],
            "minimum": None,
            "maximum": None,
            "description": None,
            "display_label": "Features",
            "children": [],
        },
    ],
}
```

{% endcode %}

For dict-backed selects, `options` contains the replayable keys, not the mapped runtime objects.

This is another reason to prefer named option dictionaries for swappable components: UIs can render simple stable keys such as `"keyword"` and `"vector"`, while the config still receives the mapped class or child config function.
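
A custom UI usually needs an initial state before the user edits anything. The sketch below derives one from schema metadata by mapping each leaf `path` to its `selected_value`. `initial_ui_values` is a hypothetical helper, and the `metadata` dictionary is a trimmed, hand-written copy of the payload above so the example runs without Hypster installed.

{% code overflow="wrap" %}

```python
def flatten_fields(parameters):
    # Same traversal as above: descend into groups, yield leaf fields.
    for parameter in parameters:
        if parameter["kind"] == "group":
            yield from flatten_fields(parameter["children"])
        else:
            yield parameter


def initial_ui_values(metadata: dict) -> dict:
    """Map each leaf path to the value the schema currently selects."""
    return {
        field["path"]: field["selected_value"]
        for field in flatten_fields(metadata["parameters"])
    }


# Trimmed stand-in for schema.to_dict(); only the keys used here are kept.
metadata = {
    "parameters": [
        {"path": "backend", "kind": "select", "selected_value": "vector", "children": []},
        {
            "path": "retrieval",
            "kind": "group",
            "selected_value": None,
            "children": [
                {"path": "retrieval.top_k", "kind": "int", "selected_value": 12, "children": []},
            ],
        },
        {"path": "features", "kind": "multi_select", "selected_value": ["cache", None], "children": []},
    ]
}

ui_values = initial_ui_values(metadata)
```

{% endcode %}

Group entries such as `retrieval` never appear in the result, which matches the rule that a scope name is a prefix rather than a value.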

## 3. Render Controls

Map field kinds to controls in your UI framework:

{% code overflow="wrap" %}

```python
def control_spec(field):
    if field["kind"] == "select":
        return {"widget": "dropdown", "options": field["options"], "value": field["selected_value"]}
    if field["kind"] == "bool":
        return {"widget": "checkbox", "value": field["selected_value"]}
    if field["kind"] in {"int", "float"}:
        return {
            "widget": "number",
            "value": field["selected_value"],
            "min": field["minimum"],
            "max": field["maximum"],
        }
    if field["kind"] == "text":
        return {"widget": "text", "value": field["selected_value"]}
    if field["kind"] in {"multi_int", "multi_float", "multi_text", "multi_bool", "multi_select"}:
        return {"widget": "list", "value": field["selected_value"], "options": field["options"]}
    raise ValueError(f"Unsupported kind: {field['kind']}")

controls = {field["path"]: control_spec(field) for field in fields}
```

{% endcode %}
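
The `minimum` and `maximum` metadata can also drive form-side checks before a value is submitted. The following is a minimal sketch with a hypothetical `validate_number` helper. It assumes both bounds are inclusive, which this page does not state, so treat Hypster's own validation at instantiate time as the authority.

{% code overflow="wrap" %}

```python
def validate_number(field: dict, value) -> list[str]:
    """Return human-readable errors for a numeric field; an empty list means valid."""
    path = field["path"]
    allowed = int if field["kind"] == "int" else (int, float)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, allowed):
        return [f"{path}: expected {field['kind']}, got {type(value).__name__}"]

    errors = []
    # Assumption: bounds are inclusive.
    if field["minimum"] is not None and value < field["minimum"]:
        errors.append(f"{path}: {value} is below the minimum {field['minimum']}")
    if field["maximum"] is not None and value > field["maximum"]:
        errors.append(f"{path}: {value} is above the maximum {field['maximum']}")
    return errors


# Hand-written field with the same keys as the schema metadata.
top_k = {"path": "retrieval.top_k", "kind": "int", "minimum": 1, "maximum": 100}

ok = validate_number(top_k, 12)
too_small = validate_number(top_k, 0)
wrong_type = validate_number(top_k, "12")
```

{% endcode %}

Showing these messages next to the control gives faster feedback than waiting for the final submit to fail.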

## 4. Recompute Conditional Branches

When a branch-selecting field changes, call `explore()` again with current UI values:

{% code overflow="wrap" %}

```python
ui_values = {"backend": "vector"}
schema = explore(search_config, values=ui_values, return_schema=True)
fields = list(flatten_fields(schema.to_dict()["parameters"]))

assert [field["path"] for field in fields] == [
    "backend",
    "retrieval.index",
    "retrieval.top_k",
    "retrieval.score_threshold",
    "features",
]
```

{% endcode %}

Custom UIs should submit only paths present in the latest schema. A robust branch-change loop is:

{% code overflow="wrap" %}

```python
def reachable_paths(schema):
    return {field["path"] for field in flatten_fields(schema.to_dict()["parameters"])}

def refresh_schema(config, current_values):
    schema = explore(config, values=current_values, on_unknown="ignore", return_schema=True)
    reachable = reachable_paths(schema)
    pruned_values = {path: value for path, value in current_values.items() if path in reachable}
    schema = explore(config, values=pruned_values, return_schema=True)
    return schema, pruned_values
```

{% endcode %}

If your UI remembers draft values per branch, keep that memory outside the submitted `values=` dictionary. Before calling `instantiate()`, remove stale paths from inactive branches.

Use `on_unknown="ignore"` only for this schema-refresh pruning pass. For the final submit, use the default `on_unknown="raise"` so typos and stale inactive paths are surfaced.
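
One way to keep per-branch drafts separate from the submitted payload is sketched below. `BranchDrafts` is a hypothetical helper, not a Hypster API, and the path sets are written by hand where a real UI would take them from `reachable_paths(schema)`.

{% code overflow="wrap" %}

```python
class BranchDrafts:
    """Remember the last edited values for each branch, keyed by the branch selector."""

    def __init__(self, branch_key: str) -> None:
        self.branch_key = branch_key
        self._by_branch: dict[str, dict] = {}

    def remember(self, values: dict) -> None:
        # Store a copy under the branch that was active when the values were edited.
        branch = values[self.branch_key]
        self._by_branch[branch] = dict(values)

    def values_for(self, branch: str, reachable: set[str]) -> dict:
        # Restore only paths that the latest schema still reports as reachable.
        draft = self._by_branch.get(branch, {})
        values = {path: value for path, value in draft.items() if path in reachable}
        values[self.branch_key] = branch
        return values


drafts = BranchDrafts(branch_key="backend")
drafts.remember({"backend": "vector", "retrieval.top_k": 12, "retrieval.score_threshold": 0.35})
drafts.remember({"backend": "keyword", "retrieval.top_k": 20})

# Hand-written stand-ins for reachable_paths(schema) on each branch.
keyword_paths = {"backend", "retrieval.index", "retrieval.top_k", "features"}
vector_paths = keyword_paths | {"retrieval.score_threshold"}

keyword_values = drafts.values_for("keyword", keyword_paths)
vector_values = drafts.values_for("vector", vector_paths)
```

{% endcode %}

Switching back to `keyword` yields a payload without `retrieval.score_threshold`, while the vector draft is still available if the user switches again. This simple version also stores shared fields such as `features` per branch.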

## 5. Instantiate From UI State

{% code overflow="wrap" %}

```python
from hypster import instantiate

ui_values = {
    "backend": "vector",
    "features": ["cache", "trace"],
    "retrieval.index": "embeddings-v3",
    "retrieval.top_k": 12,
    "retrieval.score_threshold": 0.35,
}

cfg = instantiate(search_config, values=ui_values)
assert isinstance(cfg.retrieval, VectorRetriever)
assert cfg.retrieval.top_k == 12
```

{% endcode %}

## 6. Submit And Show Errors

{% code overflow="wrap" %}

```python
from hypster import instantiate

try:
    cfg = instantiate(search_config, values=ui_values)
except ValueError as exc:
    show_form_error(str(exc))
else:
    run_search(cfg)
```

{% endcode %}

If you intentionally allow old UI payloads while users are editing, use `on_unknown="warn"` and capture warnings near the form. Keep final run submissions strict unless you have a migration path for ignored fields.

{% hint style="warning" %}
Do not send stale fields from inactive branches. If the user switches from `vector` back to `keyword`, remove `retrieval.score_threshold` from the submitted values. Call `instantiate(..., on_unknown="ignore")` only when you intentionally want softer handling.
{% endhint %}
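
Capturing warnings near the form could look like the sketch below. It assumes that `on_unknown="warn"` emits through Python's standard `warnings` module, which this page does not confirm. `fake_instantiate` and its message text are invented stand-ins so the example runs without Hypster; in real code you would pass `instantiate` with `on_unknown="warn"`.

{% code overflow="wrap" %}

```python
import warnings


def call_and_collect_warnings(fn, *args, **kwargs):
    """Call fn and return (result, list of warning messages raised during the call)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" prevents Python from suppressing repeated warnings.
        warnings.simplefilter("always")
        result = fn(*args, **kwargs)
    return result, [str(item.message) for item in caught]


def fake_instantiate(values: dict) -> dict:
    # Hypothetical stand-in for instantiate(search_config, values=..., on_unknown="warn").
    if "retrieval.score_threshold" in values:
        warnings.warn("Unknown or unreachable parameters: retrieval.score_threshold")
    return {"backend": values["backend"]}


result, messages = call_and_collect_warnings(
    fake_instantiate,
    {"backend": "keyword", "retrieval.score_threshold": 0.35},
)
```

{% endcode %}

The collected `messages` can then be rendered next to the form instead of being lost in server logs.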


# Capture Replayable Params

Use `instantiate_with_params()` when a run needs a durable parameter record.

## Capture Params

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate, instantiate_with_params
from my_app.reporting import ReportRequest

def report_config(hp: HP) -> ReportRequest:
    audience = hp.select(["exec", "technical"], name="audience", default="technical", options_only=True)
    include_appendix = hp.bool(True, name="include_appendix")
    max_pages = hp.int(12, name="max_pages", min=1, max=100)
    return ReportRequest(audience=audience, include_appendix=include_appendix, max_pages=max_pages)

run = instantiate_with_params(
    report_config,
    values={"audience": "exec", "max_pages": 6},
)

assert run.value.audience == "exec"
assert run.value.include_appendix is True
assert run.value.max_pages == 6
assert run.params == {
    "audience": "exec",
    "include_appendix": True,
    "max_pages": 6,
}
```

{% endcode %}

## Replay Later

{% code overflow="wrap" %}

```python
replayed = instantiate(report_config, values=run.params)
assert replayed.max_pages == run.value.max_pages
```

{% endcode %}

Captured params include defaults, so replay does not silently pick up later default changes:

{% code overflow="wrap" %}

```python
def old_config(hp: HP) -> int:
    return hp.int(64, name="batch_size")


old_run = instantiate_with_params(old_config)
assert old_run.params == {"batch_size": 64}


def new_config(hp: HP) -> int:
    return hp.int(128, name="batch_size")


assert instantiate(new_config, values=old_run.params) == 64
```

{% endcode %}

## Store Params As JSON

`run.params` contains only the values selected by `hp.*` calls. It is intended to be JSON-friendly as long as your parameter values are JSON-friendly.

{% code overflow="wrap" %}

```python
import json

payload = json.dumps(run.params, sort_keys=True)
restored_params = json.loads(payload)

assert instantiate(report_config, values=restored_params) == run.value
```

{% endcode %}
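
Before storing a payload, it can help to check which values would not survive a JSON round trip. The sketch below uses a hypothetical `json_unfriendly_paths` helper and a hand-written dictionary in place of `run.params`.

{% code overflow="wrap" %}

```python
import json


def json_unfriendly_paths(params: dict) -> list[str]:
    """Return the paths whose values change or fail when round-tripped through JSON."""
    bad = []
    for path, value in params.items():
        try:
            restored = json.loads(json.dumps(value))
        except (TypeError, ValueError):
            # Not serializable at all, for example a set or a custom object.
            bad.append(path)
            continue
        if restored != value:
            # Serializable but altered, for example a tuple that comes back as a list.
            bad.append(path)
    return sorted(bad)


# Stand-in for run.params with two problematic values added for illustration.
candidate = {
    "audience": "exec",
    "max_pages": 6,
    "window": (1, 2),
    "tags": {"draft"},
}

problems = json_unfriendly_paths(candidate)
```

{% endcode %}

In this example `window` and `tags` are flagged: the tuple is restored as a list and the set cannot be serialized. Plain strings, numbers, booleans, `None`, and lists of those pass unchanged.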

## Complex Runtime Objects

Use dict-backed `select` when a runtime choice is a complex object. The params record the simple key, not the complex mapped value.

{% code overflow="wrap" %}

```python
from my_app.models import LargeMLP, SmallMLP

def small_model(hp: HP) -> SmallMLP:
    return SmallMLP(layers=2, units=[64, 32])

def large_model(hp: HP) -> LargeMLP:
    return LargeMLP(layers=4, units=[256, 128])

model_options = {
    "small": small_model,
    "large": large_model,
}

def model_config(hp: HP):
    selected_config = hp.select(model_options, name="model", default="small", options_only=True)
    return hp.nest(selected_config, name="settings")

run = instantiate_with_params(model_config, values={"model": "large"})

assert isinstance(run.value, LargeMLP)
assert run.params == {"model": "large"}
```

{% endcode %}


# Perform Hyperparameter Optimization

Use this guide when you want Optuna to sample values for the active branch of a Hypster config.

## Install Optuna Support

{% code overflow="wrap" %}

```bash
uv add 'hypster[optuna]'
```

{% endcode %}

or:

{% code overflow="wrap" %}

```bash
pip install 'hypster[optuna]'
```

{% endcode %}

## Define A Searchable Config

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP, instantiate_with_params
from hypster.hpo.types import HpoFloat, HpoInt

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(
        1.0,
        name="C",
        min=1e-4,
        max=10.0,
        hpo_spec=HpoFloat(scale="log"),
    )
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(
        200,
        name="n_estimators",
        min=50,
        max=1000,
        hpo_spec=HpoInt(step=50),
    )
    max_depth = hp.int(12, name="max_depth", min=2, max=64, hpo_spec=HpoInt(scale="log"))
    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42,
    )

model_options = {
    "linear": linear_model,
    "forest": forest_model,
}

def model_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="model_family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")
```

{% endcode %}

## Use It In An Objective

{% code overflow="wrap" %}

```python
import optuna
from sklearn.model_selection import cross_val_score

from hypster.hpo.optuna import suggest_values

def train_and_score(model: ClassifierMixin) -> float:
    # Replace X_train and y_train with your dataset.
    scores = cross_val_score(model, X_train, y_train, cv=3, scoring="accuracy")
    return float(scores.mean())

def objective(trial: optuna.Trial) -> float:
    values = suggest_values(trial, model_config)
    run = instantiate_with_params(model_config, values=values)
    trial.set_user_attr("hypster_params", run.params)
    return train_and_score(run.value)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
```

{% endcode %}

## Fix Part Of The Search

Write a config that contains only the branch you want to search when part of the search should stay fixed. This keeps Optuna from sampling parameters that will later become unreachable.

{% code overflow="wrap" %}

```python
def forest_only_config(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(
        200,
        name="n_estimators",
        min=50,
        max=1000,
        hpo_spec=HpoInt(step=50),
    )
    max_depth = hp.int(12, name="max_depth", min=2, max=64, hpo_spec=HpoInt(scale="log"))
    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42,
    )

def objective(trial: optuna.Trial) -> float:
    values = suggest_values(trial, forest_only_config)
    run = instantiate_with_params(forest_only_config, values=values)
    return train_and_score(run.value)
```

{% endcode %}

Prefer encoding fixed branches in the config itself when possible. It keeps the search space smaller and the replay payload cleaner.

## Supported HPO Calls

| Surface          | Supported                                                               | Unsupported                                                                                                       | Workaround                                                               |
| ---------------- | ----------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| `hp.int(...)`    | `HpoInt(step=..., scale="linear"\|"log", include_max=...)`              | custom `base=...`, nullable numeric suggestions                                                                   | Use default `base=10.0`; model nullable choices as categorical branches. |
| `hp.float(...)`  | `HpoFloat(step=..., scale=...)`, `distribution="uniform"\|"loguniform"` | custom `base=...`, `distribution="normal"\|"lognormal"`, `center=...`, `spread=...`, nullable numeric suggestions | Use Optuna-compatible float ranges or write a custom objective branch.   |
| `hp.select(...)` | `HpoCategorical(ordered=False, weights=None)`                           | `ordered=True`, `weights=...`                                                                                     | Encode ordering/weights in your objective or sampler setup.              |
| `hp.nest(...)`   | Nested paths are prefixed and branch-aware.                             | Unknown child-local overrides                                                                                     | Keep child-local `values=` reachable for the selected branch.            |
| `multi_*` calls  | Not expanded by the adapter.                                            | `multi_int`, `multi_float`, `multi_text`, `multi_bool`, `multi_select` search spaces                              | Model each optimized choice as scalar or categorical parameters.         |

If an HPO numeric call omits `min` or `max`, the adapter uses the parameter default for the missing bound. Omitting both bounds collapses that parameter to the default value, so set explicit `min` and `max` for real search ranges.
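
The bound-filling rule can be modeled in a few lines. This is an illustration of the behavior described above, not the adapter's actual code, and `effective_bounds` is a hypothetical name.

{% code overflow="wrap" %}

```python
from typing import Optional


def effective_bounds(
    default: float,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
) -> tuple[float, float]:
    """Fill a missing bound with the parameter default, as described above."""
    # Compare against None so that a legitimate bound of 0 is kept.
    low = default if minimum is None else minimum
    high = default if maximum is None else maximum
    return low, high


both = effective_bounds(1.0, minimum=1e-4, maximum=10.0)
only_min = effective_bounds(200, minimum=50)
neither = effective_bounds(12)
```

{% endcode %}

With only `min=50` and a default of `200`, the search range becomes 50 to 200. With neither bound, the range collapses to the single value `12`, so nothing is actually searched.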

## Multi-Value Search Choices

The Optuna adapter does not expand `multi_*` calls into a search space. Keep `multi_*` for runtime lists you want to log but not optimize, and model optimized list-like choices with scalar or categorical parameters.

Use categorical booleans for per-feature include/exclude decisions:

{% code overflow="wrap" %}

```python
def feature_config(hp: HP) -> list[str]:
    features = []
    if hp.select([False, True], name="include_age", default=True, options_only=True):
        features.append("age")
    if hp.select([False, True], name="include_income", default=True, options_only=True):
        features.append("income")
    if hp.select([False, True], name="include_days_active", default=False, options_only=True):
        features.append("days_active")
    return features
```

{% endcode %}

Use `hp.select([False, True], ...)` here rather than `hp.bool(...)` because the Optuna adapter samples categorical `select` calls, not boolean HP calls.

Use fixed slots when position matters:

{% code overflow="wrap" %}

```python
def top_features_config(hp: HP) -> list[str]:
    first = hp.select(["age", "income", "days_active"], name="feature_1", options_only=True)
    second = hp.select(["age", "income", "days_active"], name="feature_2", options_only=True)
    return [first, second]
```

{% endcode %}

Use dict-backed categorical presets for finite feature subsets:

{% code overflow="wrap" %}

```python
def preset_features_config(hp: HP) -> list[str]:
    return hp.select(
        {
            "core": ["age", "income"],
            "engagement": ["days_active", "sessions"],
            "full": ["age", "income", "days_active", "sessions"],
        },
        name="feature_preset",
        default="core",
        options_only=True,
    )
```

{% endcode %}


# Values & Overrides

`values=` is the dictionary you pass to `instantiate()`, `instantiate_with_params()`, or `explore()` to select concrete parameter values.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def child(hp: HP):
    return {
        "x": hp.int(10, name="x"),
        "y": hp.int(20, name="y"),
    }

def parent(hp: HP):
    return {"child": hp.nest(child, name="child")}
```

{% endcode %}

## Top-Level Values

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return {"batch_size": hp.int(32, name="batch_size")}

instantiate(config, values={"batch_size": 64})
# => {"batch_size": 64}
```

{% endcode %}

## Dotted Keys

Use dotted keys for nested parameters:

{% code overflow="wrap" %}

```python
instantiate(parent, values={"child.x": 15})
# => {"child": {"x": 15, "y": 20}}
```

{% endcode %}

## Nested Dictionaries

Nested dictionaries are normalized to the same dotted paths:

{% code overflow="wrap" %}

```python
instantiate(parent, values={"child": {"x": 25}})
# => {"child": {"x": 25, "y": 20}}
```

{% endcode %}

You can mix dotted keys and nested dictionaries as long as each final parameter path appears once.

## Nested Scope Names Are Not Leaves

A nested scope name is a prefix for child parameters, not a parameter leaf by itself. These forms are valid because they target `child.x`:

{% code overflow="wrap" %}

```python
instantiate(parent, values={"child.x": 15})
instantiate(parent, values={"child": {"x": 15}})
```

{% endcode %}

This raises because `child` is a scope, not a selectable parameter:

{% code overflow="wrap" %}

```python
instantiate(parent, values={"child": 123})
# ValueError: Unknown or unreachable parameters
```

{% endcode %}

## Duplicate Paths

This raises because both entries target `child.x`:

{% code overflow="wrap" %}

```python
instantiate(
    parent,
    values={
        "child.x": 100,
        "child": {"x": 100},
    },
)
# ValueError: Duplicate value for 'child.x'
```

{% endcode %}

Hypster raises even when the duplicate values are identical. A single canonical path keeps experiment logs and replay payloads unambiguous.
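
To make the normalization and duplicate rules concrete, here is a minimal sketch of flattening a mixed `values=` dictionary into dotted paths. `flatten_values` is a hypothetical helper written for illustration; it mirrors the rules on this page but is not Hypster's implementation.

{% code overflow="wrap" %}

```python
def flatten_values(values: dict, prefix: str = "") -> dict:
    """Flatten nested dictionaries into dotted paths and reject duplicate paths."""
    flat = {}
    for key, value in values.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # A nested dictionary contributes child paths under this prefix.
            items = flatten_values(value, prefix=f"{path}.")
        else:
            items = {path: value}
        for item_path, item_value in items.items():
            if item_path in flat:
                # Raise even when both entries carry the same value.
                raise ValueError(f"Duplicate value for {item_path!r}")
            flat[item_path] = item_value
    return flat


nested = flatten_values({"child": {"x": 25}})
mixed = flatten_values({"child.x": 15, "child": {"y": 30}})
```

{% endcode %}

Both calls succeed because each final path appears once. Passing `{"child.x": 100, "child": {"x": 100}}` raises `ValueError`, since both entries resolve to `child.x`.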

## Conditional Reachability

Only parameters touched by the active branch may appear in `values=`.

{% code overflow="wrap" %}

```python
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10)
    return RandomForestClassifier(n_estimators=n_estimators, random_state=42)

model_options = {"linear": linear_model, "forest": forest_model}

def model_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="family", default="linear", options_only=True)
    return hp.nest(selected_config, name="model")

instantiate(model_config, values={"family": "linear", "model.n_estimators": 500})
# ValueError: Unknown or unreachable parameters
```

{% endcode %}

Use `explore(model_config, values={"family": "forest"})` to inspect the branch before instantiating it.

## Unknown Policies

`instantiate()`, `instantiate_with_params()`, and `explore()` accept the same `on_unknown` policy:

| Policy     | Behavior                                         |
| ---------- | ------------------------------------------------ |
| `"raise"`  | Default. Raise on unknown or unreachable values. |
| `"warn"`   | Emit a warning and continue.                     |
| `"ignore"` | Ignore unknown or unreachable values.            |

Prefer the default for experiments and production replay. Softer policies are useful when migrating old payloads or rendering exploratory UIs.
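
The three policies can be modeled with plain dictionaries. The sketch below is an illustration of the table, not Hypster's code: `apply_unknown_policy` is a hypothetical function, the `reachable` set is written by hand, and the use of `warnings.warn` for the `"warn"` policy is an assumption.

{% code overflow="wrap" %}

```python
import warnings


def apply_unknown_policy(values: dict, reachable: set, on_unknown: str = "raise") -> dict:
    """Return the reachable subset of values, handling the rest per policy."""
    if on_unknown not in {"raise", "warn", "ignore"}:
        raise ValueError(f"Unsupported policy: {on_unknown!r}")

    unknown = sorted(set(values) - reachable)
    if unknown:
        message = f"Unknown or unreachable parameters: {unknown}"
        if on_unknown == "raise":
            raise ValueError(message)
        if on_unknown == "warn":
            # Assumption: warnings go through the standard warnings module.
            warnings.warn(message)
    return {path: value for path, value in values.items() if path in reachable}


values = {"family": "linear", "model.C": 0.5, "model.n_estimators": 500}
reachable = {"family", "model.C"}  # paths touched by the linear branch

kept = apply_unknown_policy(values, reachable, on_unknown="ignore")
```

{% endcode %}

With `"ignore"`, `model.n_estimators` is silently dropped. With the default `"raise"`, the same call fails, which is why the default is the safer choice for replay.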

## Select Keys vs Complex Values

Nested dictionaries inside `values=` are interpreted as nested parameter paths. If you need a select option whose runtime value is a dictionary, use dict-backed `select`:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.select(
        {
            "small": {"layers": 2},
            "large": {"layers": 4},
        },
        name="model",
        default="small",
    )

instantiate(config, values={"model": "large"})
# => {"layers": 4}
```

{% endcode %}

Do not pass `values={"model": {"layers": 4}}`; Hypster will treat that as a nested parameter path.
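
To see why the nested form is read as a path, it can help to picture nested `values=` being flattened into dotted keys. The helper below is a sketch of that idea in plain Python; `flatten_values` is a hypothetical name, and Hypster's internal normalization may differ.

{% code overflow="wrap" %}

```python
def flatten_values(values, prefix=""):
    """Flatten nested dictionaries into dotted parameter paths (illustrative only)."""
    flat = {}
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict is treated as a deeper scope, not as a value.
            flat.update(flatten_values(value, path))
        else:
            flat[path] = value
    return flat


print(flatten_values({"model": {"layers": 4}}))  # {'model.layers': 4}
print(flatten_values({"model": "large"}))        # {'model': 'large'}
```

{% endcode %}

Under this reading, `{"model": {"layers": 4}}` targets a parameter at `model.layers`, which the config above does not define.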


# Nested Configurations

`hp.nest()` lets one config function call another config function under a named scope. This keeps large workflows readable and gives nested parameters stable dotted paths.

## Basic Nesting

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate
from my_app.training import AdamW, DataLoaders, BatchSampler, TrainingRun

def optimizer_config(hp: HP) -> AdamW:
    learning_rate = hp.float(0.001, name="learning_rate", min=1e-6, max=1.0)
    weight_decay = hp.float(0.0, name="weight_decay", min=0.0, max=1.0)
    return AdamW(learning_rate=learning_rate, weight_decay=weight_decay)

def training_config(hp: HP) -> TrainingRun:
    epochs = hp.int(10, name="epochs", min=1)
    optimizer = hp.nest(optimizer_config, name="optimizer")
    return TrainingRun(epochs=epochs, optimizer=optimizer)

cfg = instantiate(training_config, values={"optimizer.learning_rate": 0.01})
assert cfg.optimizer.learning_rate == 0.01
```

{% endcode %}

## Signature

{% code overflow="wrap" %}

```python
hp.nest(
    child,
    *,
    name,
    values=None,
    **kwargs,
)
```

{% endcode %}

| Argument   | Meaning                                         |
| ---------- | ----------------------------------------------- |
| `child`    | Config function whose first parameter is `hp`.  |
| `name`     | Scope name. Must be a valid Python identifier.  |
| `values`   | Child-local values merged into the nested call. |
| `**kwargs` | Execution arguments forwarded to the child.     |

## Child-Local Values

`values=` inside `hp.nest()` is local to the child and is merged after any caller-provided values for that child, so it takes precedence when both set the same key. Use it when the parent intentionally fixes a child value.

{% code overflow="wrap" %}

```python
def parent(hp: HP):
    return hp.nest(
        optimizer_config,
        name="optimizer",
        values={"learning_rate": 0.005},
    )

assert instantiate(parent).learning_rate == 0.005
assert instantiate(parent, values={"optimizer.learning_rate": 0.02}).learning_rate == 0.005
```

{% endcode %}

Think of this as a parent-fixed child value: the parent is choosing what the child sees. If you want callers to override the child value, leave `values=` off the `hp.nest()` call and put the default in the child parameter:

{% code overflow="wrap" %}

```python
def overridable_parent(hp: HP):
    return hp.nest(optimizer_config, name="optimizer")

cfg = instantiate(overridable_parent, values={"optimizer.learning_rate": 0.02})
assert cfg.learning_rate == 0.02
```

{% endcode %}

Explicit child values are validated after the child config runs. Unknown or unreachable child keys raise instead of being ignored:

{% code overflow="wrap" %}

```python
def parent_with_typo(hp: HP):
    return hp.nest(optimizer_config, name="optimizer", values={"learnig_rate": 0.005})

instantiate(parent_with_typo)
# ValueError: Unknown or unreachable parameters
```

{% endcode %}

Use child-local `values=` for parent-owned policy, test fixtures, or internal composition defaults that should win over caller-provided nested values.
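
The precedence described in this section can be sketched as a two-step dictionary merge. This is an illustration in plain Python under the assumption that child-local values are applied last; `resolve_child_values` is a hypothetical helper, not part of Hypster.

{% code overflow="wrap" %}

```python
def resolve_child_values(caller_values, scope, local_values=None):
    """Return the values a child under `scope` would see (illustrative only)."""
    prefix = scope + "."
    # Step 1: keep caller values addressed to this scope and strip the scope prefix.
    child_values = {
        path[len(prefix):]: value
        for path, value in caller_values.items()
        if path.startswith(prefix)
    }
    # Step 2: child-local values from the parent are merged last, so they win.
    child_values.update(local_values or {})
    return child_values


caller = {"optimizer.learning_rate": 0.02, "epochs": 5}
print(resolve_child_values(caller, "optimizer", {"learning_rate": 0.005}))  # {'learning_rate': 0.005}
print(resolve_child_values(caller, "optimizer"))                            # {'learning_rate': 0.02}
```

{% endcode %}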

## Execution Arguments

Extra keyword arguments passed to `hp.nest()` are forwarded to the child config function. Use them for values the parent decides, such as per-scope defaults, that are not themselves parameters:

{% code overflow="wrap" %}

```python
def sampler_config(hp: HP, default_batch_size: int) -> BatchSampler:
    batch_size = hp.int(default_batch_size, name="batch_size", min=1)
    shuffle = hp.bool(True, name="shuffle")
    return BatchSampler(batch_size=batch_size, shuffle=shuffle)

def data_config(hp: HP) -> DataLoaders:
    train = hp.nest(sampler_config, name="train", default_batch_size=128)
    eval = hp.nest(sampler_config, name="eval", default_batch_size=256)
    return DataLoaders(train=train, eval=eval)
```

{% endcode %}

## Conditional Nesting

You can choose which child config to run:

{% code overflow="wrap" %}

```python
from my_app.backends import AppRuntime, LocalBackend, RemoteBackend

def local_config(hp: HP) -> LocalBackend:
    threads = hp.int(4, name="threads", min=1)
    return LocalBackend(threads=threads)

def remote_config(hp: HP) -> RemoteBackend:
    endpoint = hp.text("https://api.example.com", name="endpoint")
    return RemoteBackend(endpoint=endpoint)

backend_options = {"local": local_config, "remote": remote_config}

def app_config(hp: HP) -> AppRuntime:
    selected_config = hp.select(backend_options, name="backend", default="local", options_only=True)
    backend = hp.nest(selected_config, name="settings")
    return AppRuntime(backend=backend)
```

{% endcode %}

This override is valid because `settings.endpoint` exists on the `remote` branch:

{% code overflow="wrap" %}

```python
instantiate(app_config, values={"backend": "remote", "settings.endpoint": "https://staging.example.com"})
```

{% endcode %}

This override raises by default because `settings.threads` is unreachable on the `remote` branch:

{% code overflow="wrap" %}

```python
instantiate(app_config, values={"backend": "remote", "settings.threads": 8})
```

{% endcode %}

## Name Collisions

Nested scopes share one parameter namespace. Hypster raises if the same full path is defined twice or if a parent parameter reserves a prefix needed by a nested child. Use unique scope names such as `encoder`, `decoder`, `train_loader`, and `eval_loader`.
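
The two collision cases can be sketched with a small registry of dotted paths. This is a plain-Python illustration of the rule as described above; `PathRegistry` is a hypothetical class, the error messages are made up, and Hypster's actual checks may be stricter or looser.

{% code overflow="wrap" %}

```python
class PathRegistry:
    """Track full parameter paths and reject duplicates and prefix clashes (illustrative only)."""

    def __init__(self):
        self._paths = set()

    def register(self, path):
        if path in self._paths:
            raise ValueError(f"Parameter path defined twice: {path!r}")
        for existing in self._paths:
            # A leaf such as "optimizer" blocks "optimizer.learning_rate", and vice versa.
            if path.startswith(existing + ".") or existing.startswith(path + "."):
                raise ValueError(f"{path!r} conflicts with existing path {existing!r}")
        self._paths.add(path)


registry = PathRegistry()
registry.register("encoder.layers")
registry.register("decoder.layers")  # fine: different scope, different full path
```

{% endcode %}

Registering `"encoder.layers"` a second time, or registering a plain `"encoder"` parameter afterwards, would raise in this sketch.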


# Best Practices

These practices keep Hypster configs easy to explore, optimize, log, and replay.

## Embrace Pure Python

Hypster configs are ordinary Python functions rather than a DSL. Use `if` statements, loops, local variables, lists, helpers, imports, and typed return values when they make the config clearer.

The implication is that Hypster discovers the available parameters by running your function. Design config functions so they can be run repeatedly by `explore()`, HPO, and interactive UIs without causing side effects or surprising costs.
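
To make discovery by running concrete, here is a toy recorder in plain Python. `RecordingHP` and `discover` are hypothetical stand-ins that support only two call types and are far simpler than Hypster's real `HP` object; the sketch only shows why the set of discovered parameters depends on which branch executes.

{% code overflow="wrap" %}

```python
class RecordingHP:
    """Toy stand-in for an HP object: returns overrides or defaults and records each name."""

    def __init__(self, values=None):
        self.values = dict(values or {})
        self.params = {}

    def _record(self, name, default):
        value = self.values.get(name, default)
        self.params[name] = value
        return value

    def select(self, options, *, name, default):
        return self._record(name, default)

    def int(self, default, *, name):
        return self._record(name, default)


def model_config(hp):
    family = hp.select(["linear", "forest"], name="family", default="forest")
    if family == "forest":
        return {"family": family, "n_estimators": hp.int(200, name="n_estimators")}
    return {"family": family}


def discover(config, values=None):
    hp = RecordingHP(values)
    config(hp)  # running the function is what reveals the parameters
    return hp.params


print(discover(model_config))                        # {'family': 'forest', 'n_estimators': 200}
print(discover(model_config, {"family": "linear"}))  # {'family': 'linear'}
```

{% endcode %}

Because the function body runs on every discovery pass, anything expensive inside it would run each time as well.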

## Return Typed Runtime Objects

A strong Hypster pattern is to make each config function a typed factory for the object the caller needs:

{% code overflow="wrap" %}

```python
from hypster import HP
from sklearn.ensemble import RandomForestClassifier

def classifier_config(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", allow_none=True)

    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42,
    )
```

{% endcode %}

Use a return type annotation for config functions whenever the output is a meaningful object. It makes the config easier to read, test, and compose with `hp.nest()`.

## Keep Config Functions Side-Effect-Light

`explore()`, HPO, and UI builders execute your config function to discover parameters. Interactive UIs may rerun it on every value change. Initializing cheap in-memory runtime objects is a good fit for config functions; effects and expensive work should stay outside the config body:

* train the model after `instantiate()`
* make paid API or network calls after `instantiate()`
* write files or database rows after `instantiate()`
* load indexes, large datasets, or heavyweight clients after `instantiate()`
* defer costly resource construction when exploratory safety matters

Use this boundary when deciding what a config should return:

| Return from the config                                                      | Usually safe during `explore()`? | Notes                                                                                |
| --------------------------------------------------------------------------- | -------------------------------- | ------------------------------------------------------------------------------------ |
| Enums, paths, mappings your runtime actually consumes, small Python objects | Yes                              | Good for UI generation, experiment tracking, and replay.                             |
| In-memory model estimators or pipeline objects                              | Usually                          | Good when construction is cheap and does not open files, sockets, or remote handles. |
| SDK clients, database handles, loaded indexes, network retrievers           | Usually no                       | Return lightweight settings or factories, then build these after `instantiate()`.    |
| Training jobs, writes, API calls, migrations                                | No                               | Run these outside the config function.                                               |
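
One way to respect this boundary is to have the config return a small settings object and to build the expensive resource in a separate step. The sketch below uses only the standard library; `RetrieverSettings`, `build_retriever`, and the fake index loading are hypothetical placeholders for your own code.

{% code overflow="wrap" %}

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrieverSettings:
    """Cheap, in-memory description of a retriever; safe to create repeatedly."""

    index_path: str
    top_k: int


def build_retriever(settings):
    """Expensive step: call this once, after the config has been instantiated."""
    # Placeholder for real work such as loading an index or opening a connection.
    loaded_index = {"path": settings.index_path, "loaded": True}
    return {"index": loaded_index, "top_k": settings.top_k}


# What a config function would return: no files or sockets are touched here.
settings = RetrieverSettings(index_path="indexes/docs.idx", top_k=5)

# Done outside the config, so exploration and UIs never trigger it.
retriever = build_retriever(settings)
```

{% endcode %}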

## Name Everything Explicitly

Every `hp.*` call needs a stable `name=`. Names become the keys in `values=`, `explore()` output, and `instantiate_with_params().params`.

{% code overflow="wrap" %}

```python
hp.float(0.001, name="learning_rate")
```

{% endcode %}

Use Python identifier-style names:

* Good: `learning_rate`, `max_depth`, `retriever_kind`
* Avoid: `learning-rate`, `model.lr`, `max depth`

Let `hp.nest()` create dotted paths.

## Use Branches For Real Runtime Decisions

Branch when downstream structure changes:

{% code overflow="wrap" %}

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP

def model_config(hp: HP):
    family = hp.select(["linear", "forest"], name="family", default="forest", options_only=True)

    if family == "linear":
        C = hp.float(1.0, name="C", min=1e-4, max=100.0)
        return LogisticRegression(C=C, max_iter=1000)

    n_estimators = hp.int(200, name="n_estimators", min=10, max=1000)
    max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
    return RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
```

{% endcode %}

Avoid carrying irrelevant parameters for inactive branches. Branch-aware configs make experiment logs cleaner and HPO search spaces smaller.

Use inline branches for small, local differences. When the branch chooses between reusable components, prefer composition with `hp.nest()` and a dict-backed `select` over a long `if`/`elif` chain. The key stays replayable, each component keeps its parameters local, and interactive UIs can render the selected child config as a contained group.

## Prefer Dict-Backed Selects For Swappable Components

Select keys should be simple and replayable. For swappable runtime components, map those keys to config functions and then nest the selected function:

{% code overflow="wrap" %}

```python
from my_app.tokenizers import SimpleTokenizer, Tokenizer, WordPieceTokenizer

def simple_tokenizer_config(hp: HP) -> SimpleTokenizer:
    lowercase = hp.bool(True, name="lowercase")
    return SimpleTokenizer(lowercase=lowercase)

def wordpiece_tokenizer_config(hp: HP) -> WordPieceTokenizer:
    vocab_path = hp.text("vocab.txt", name="vocab_path")
    return WordPieceTokenizer(vocab_path=vocab_path)

tokenizer_options = {
    "simple": simple_tokenizer_config,
    "wordpiece": wordpiece_tokenizer_config,
}

def tokenizer_config(hp: HP) -> Tokenizer:
    selected_config = hp.select(tokenizer_options, name="tokenizer", default="wordpiece", options_only=True)
    return hp.nest(selected_config, name="settings")
```

{% endcode %}

This keeps the logged params simple, for example `{"tokenizer": "wordpiece", "settings.vocab_path": "vocab.txt"}`, while your app receives the selected tokenizer object.

Keep the options mapping in a named variable such as `tokenizer_options`, `model_options`, or `retriever_options`. That keeps the parent config readable, especially when the mapping is long, and makes it easy to reuse the same option set in HPO, interactive UIs, tests, and nested configs.

For a tiny branch with one or two scalar differences, an `if` statement is fine. Once each branch has its own parameters or returns a different runtime type, split the branches into child config functions and choose between them with a dict-backed select.

## Turn On `options_only=True` For Enums

By default, `select` allows custom scalar values outside the listed options. Use `options_only=True` when the option list is closed:

{% code overflow="wrap" %}

```python
provider = hp.select(["openai", "gemini"], name="provider", default="openai", options_only=True)
```

{% endcode %}

## Use `allow_none=True` Deliberately

`None` is a real value, not an unspecified value. Mark it explicitly:

{% code overflow="wrap" %}

```python
max_depth = hp.int(None, name="max_depth", min=1, max=100, allow_none=True)
```

{% endcode %}

For nullable choices, you can put `None` directly in the options:

{% code overflow="wrap" %}

```python
tokenizer = hp.select([None, "basic"], name="tokenizer", default=None, allow_none=True)
```

{% endcode %}

## Use Numeric Coercion Deliberately

Hypster safely coerces common numeric inputs by default. Integral floats can be used for integer parameters, and integers can be used for float parameters:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return {
        "epochs": hp.int(10, name="epochs"),
        "lr": hp.float(0.1, name="lr"),
    }

instantiate(config, values={"epochs": 20.0, "lr": 1})
# => {"epochs": 20, "lr": 1.0}
```

{% endcode %}

Use `strict=True` when the input type itself matters:

{% code overflow="wrap" %}

```python
def strict_config(hp: HP):
    return {
        "epochs": hp.int(10, name="epochs", strict=True),
        "lr": hp.float(0.1, name="lr", strict=True),
    }
```

{% endcode %}

`True` and `False` are rejected by numeric parameters. Use `hp.bool()` for boolean choices.

## Capture Params For Anything You May Replay

Use `instantiate_with_params()` for experiments, UI submissions, scheduled jobs, and production runs:

{% code overflow="wrap" %}

```python
run = instantiate_with_params(config, values={"learning_rate": 0.01})
# tracker.log_params(run.params)
```

{% endcode %}

The params include defaults as well as explicit overrides, so a later replay does not depend on defaults that may have changed in the meantime.

## Explore Before Instantiating Conditional Values

When overriding a branch, inspect it first:

{% code overflow="wrap" %}

```python
explore(config, values={"provider": "gemini"})
```

{% endcode %}

This prevents stale values from inactive branches from leaking into logs.

## Keep Return Values Narrow

Return what the caller needs. A small return surface makes configs easier to test and less likely to couple unrelated workflow stages.

{% code overflow="wrap" %}

```python
def training_config(hp: HP) -> TrainingRunner:
    model = hp.nest(model_config, name="model")
    optimizer = hp.nest(optimizer_config, name="optimizer")
    return TrainingRunner(model=model, optimizer=optimizer)
```

{% endcode %}

Use `hp.collect(locals(), include=[...])` when the caller genuinely wants a mapping. Listing the included names keeps the returned keys explicit and the return statement concise.


# HP Call Types

`HP` methods define the public parameters in a config function. Each call records a parameter path, validates overrides, and can be explored or replayed.

The examples in this reference sometimes return small dictionaries to keep the call behavior visible. In application code, prefer returning the initialized runtime object unless the mapping itself is the object your caller needs.

## Scalar Calls

| Call        | Use for                   | Notes                                                                                                  |
| ----------- | ------------------------- | ------------------------------------------------------------------------------------------------------ |
| `hp.int`    | Integer parameters        | Accepts integral floats by default, optional bounds, optional strict mode, optional `allow_none=True`. |
| `hp.float`  | Floating-point parameters | Accepts integer values by default, optional bounds, optional strict mode, optional `allow_none=True`.  |
| `hp.text`   | Strings                   | Use for prompts, paths, IDs, and labels.                                                               |
| `hp.bool`   | Booleans                  | Requires actual `True` or `False`, not string values.                                                  |
| `hp.select` | One categorical choice    | Supports list options or dict-backed key-to-value mapping.                                             |

## Multi-Value Calls

| Call              | Use for                     | Notes                                                                       |
| ----------------- | --------------------------- | --------------------------------------------------------------------------- |
| `hp.multi_int`    | List of integers            | Elements use the same safe coercion and strict-mode behavior as `hp.int`.   |
| `hp.multi_float`  | List of floats              | Elements use the same safe coercion and strict-mode behavior as `hp.float`. |
| `hp.multi_text`   | List of strings             | Useful for columns, tags, stop sequences, and feature names.                |
| `hp.multi_bool`   | List of booleans            | Useful when each position has meaning.                                      |
| `hp.multi_select` | List of categorical choices | Supports nullable choices with `allow_none=True`.                           |

Nullable elements are not supported for `multi_int`, `multi_float`, `multi_text`, or `multi_bool`. Use `multi_select(..., allow_none=True)` for nullable categorical lists.

## Composition Calls

| Call         | Use for                                                      |
| ------------ | ------------------------------------------------------------ |
| `hp.nest`    | Run another config function under a named scope.             |
| `hp.collect` | Collect selected local variables into a returned dictionary. |

## Shared Rules

* `name=` is required for every `hp.*` parameter call.
* Names must be valid Python identifiers and cannot contain dots, spaces, or hyphens.
* `values=` may use dotted paths such as `optimizer.learning_rate`.
* Unknown or unreachable values raise by default.
* Dict-backed `select` is the right way to return complex objects while logging simple keys.
* Numeric parameters reject `True` and `False` even though Python treats `bool` as a subclass of `int`.

See [Public API](/hypster/reference/api) for exact signatures.


# Selectable Types

Use `hp.select()` and `hp.multi_select()` for categorical choices.

Selected choices are part of Hypster's reproducibility surface. They are what `instantiate_with_params(...).params` records and what you pass back through `values=...` to replay a run.

## Signatures

{% code overflow="wrap" %}

```python
hp.select(options, *, name, default=NO_DEFAULT, options_only=False, allow_none=False, hpo_spec=None)
hp.multi_select(options, *, name, default=None, options_only=False, allow_none=False)
```

{% endcode %}

## List Form

Use list form when the logged choice and returned value are the same simple value:

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def config(hp: HP):
    model = hp.select(
        ["claude-haiku-4-5", "claude-sonnet-4-6"],
        name="model",
        default="claude-haiku-4-5",
    )
    features = hp.multi_select(["cache", "trace"], name="features", default=["cache"])
    return {"model": model, "features": features}

instantiate(
    config,
    values={"model": "claude-sonnet-4-6", "features": ["cache", "trace"]},
)
# => {"model": "claude-sonnet-4-6", "features": ["cache", "trace"]}
```

{% endcode %}

List-form choices must be logging-safe scalar values: `None`, `bool`, `int`, `float`, or `str`. If you need a complex object, use dictionary form.
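
A quick pre-check for this rule can be written in a few lines. The sketch below is plain Python and not Hypster's validator; `is_logging_safe` and `LOGGING_SAFE_TYPES` are hypothetical names, and the use of `isinstance` (which also accepts subclasses of these types) is an assumption of the sketch.

{% code overflow="wrap" %}

```python
LOGGING_SAFE_TYPES = (type(None), bool, int, float, str)


def is_logging_safe(value):
    """Return True for the scalar kinds listed above (illustrative only)."""
    return isinstance(value, LOGGING_SAFE_TYPES)


options = ["small", 3, None, {"layers": 2}]
unsafe = [option for option in options if not is_logging_safe(option)]
print(unsafe)  # [{'layers': 2}]
```

{% endcode %}

Any option that lands in `unsafe` is a candidate for dictionary form, with a short key standing in for the complex value.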

## Dictionary Form

Use dictionary form when a simple logged key should return a different value. The key is logged and replayed; the mapped value is returned from the config.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params

def config(hp: HP):
    model = hp.select(
        {
            "small": {"layers": 2, "units": [64, 32]},
            "large": {"layers": 4, "units": [256, 128]},
        },
        name="model",
        default="small",
    )
    return {"model": model}

run = instantiate_with_params(config, values={"model": "large"})

assert run.value == {"model": {"layers": 4, "units": [256, 128]}}
assert run.params == {"model": "large"}
```

{% endcode %}

Use `options_only=True` with dictionary form when the logged keys are a closed enum:

{% code overflow="wrap" %}

```python
def strict_config(hp: HP):
    model = hp.select(
        {
            "small": {"layers": 2},
            "large": {"layers": 4},
        },
        name="model",
        default="small",
        options_only=True,
    )
    return {"model": model}

run = instantiate_with_params(strict_config, values={"model": "large"})

assert run.value == {"model": {"layers": 4}}
assert run.params == {"model": "large"}
```

{% endcode %}

Dictionary form is the recommended way to return:

* objects or callables
* dictionaries, lists, or tuples that your runtime actually consumes
* long provider/model IDs behind short aliases

{% code overflow="wrap" %}

```python
architecture = hp.select(
    {
        "small": {"layers": 2, "units": [64, 32]},
        "large": {"layers": 4, "units": [256, 128]},
    },
    name="architecture",
    default="small",
)
```

{% endcode %}

For nullable choices, you can use `None` directly in list-form options with `allow_none=True`.

## Explicit None

If `None` itself is a selectable choice or override, mark the parameter as nullable with `allow_none=True`:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    thinking_level = hp.select(
        [None, "low", "medium", "high"],
        name="thinking_level",
        default=None,
        allow_none=True,
    )
    features = hp.multi_select(
        [None, "cache", "trace"],
        name="features",
        default=[None],
        allow_none=True,
    )
    return {"thinking_level": thinking_level, "features": features}
```

{% endcode %}

Without `allow_none=True`, `None` defaults, choices, and overrides raise an error that includes guidance.

## Empty Nullable Selects

An empty option list can default to `None` when the parameter is explicitly nullable:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.select([], name="choice", allow_none=True)

assert instantiate(config) is None
```

{% endcode %}

Without `allow_none=True`, an empty option list with no explicit default raises because Hypster has no safe value to select.

## Custom Choices

By default, `options_only=False`, so callers may provide a custom choice outside the declared options:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.select(["claude-haiku-4-5", "claude-sonnet-4-6"], name="model")

assert instantiate(config, values={"model": "claude-opus-4-7"}) == "claude-opus-4-7"
```

{% endcode %}

Custom choices must still be logging-safe scalar values. Use `options_only=True` to reject anything outside the declared options:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.select(
        ["claude-haiku-4-5", "claude-sonnet-4-6"],
        name="model",
        options_only=True,
    )

instantiate(config, values={"model": "claude-opus-4-7"})
# ValueError: 'claude-opus-4-7' not in allowed options
```

{% endcode %}

## Names

`name=` must be a valid Python identifier and cannot be a Python keyword. Hypster composes dotted parameter paths from nested names, so literal dots, spaces, and hyphens are not allowed in individual names.
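
The naming rule can be approximated with two standard-library checks. This is a sketch rather than Hypster's own validation; `is_valid_param_name` is a hypothetical helper.

{% code overflow="wrap" %}

```python
import keyword


def is_valid_param_name(name):
    """Check that a name is a Python identifier and not a reserved keyword."""
    return isinstance(name, str) and name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["learning_rate", "learning-rate", "model.lr", "max depth", "class"]:
    print(candidate, is_valid_param_name(candidate))
```

{% endcode %}

Only `learning_rate` passes here: the hyphen, dot, and space make the next three non-identifiers, and `class` is a keyword.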


# Numeric Types

Use `hp.int`, `hp.float`, `hp.multi_int`, and `hp.multi_float` for numeric parameters. By default, Hypster uses safe numeric coercion. Use `strict=True` when callers must provide the exact numeric type.

## Signatures

{% code overflow="wrap" %}

```python
hp.int(default, *, name, min=None, max=None, strict=False, allow_none=False, hpo_spec=None)
hp.float(default, *, name, min=None, max=None, strict=False, allow_none=False, hpo_spec=None)
hp.multi_int(default, *, name, min=None, max=None, strict=False, allow_none=False)
hp.multi_float(default, *, name, min=None, max=None, strict=False, allow_none=False)
```

{% endcode %}

`hpo_spec` is ignored by normal instantiation and consumed by the Optuna adapter.

## Bounds

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def training_config(hp: HP):
    return {
        "learning_rate": hp.float(0.001, name="learning_rate", min=1e-6, max=1.0),
        "batch_size": hp.int(64, name="batch_size", min=1, max=2048),
        "layers": hp.multi_int([256, 128], name="layers", min=1, max=4096),
    }

cfg = instantiate(
    training_config,
    values={"learning_rate": 0.01, "batch_size": 128, "layers": [512, 256]},
)
```

{% endcode %}

Values outside the declared bounds raise an error.

## Safe Numeric Coercion

With `strict=False`, the default:

* `hp.int` and `hp.multi_int` accept real integers and integral floats such as `3.0`, which become `3`.
* `hp.float` and `hp.multi_float` accept real floats and integers such as `1`, which become `1.0`.
* `True` and `False` are never accepted as numeric values, even though `bool` is a subclass of `int` in Python.

This behavior is the same for top-level parameters and nested paths.

The policy is defined for Python numeric scalars. If a data library gives you NumPy, pandas, or framework-specific scalar objects, convert them to plain `int` or `float` values before passing them through `values=`.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def config(hp: HP):
    return {
        "epochs": hp.int(10, name="epochs"),
        "lr": hp.float(0.1, name="lr"),
    }

assert instantiate(config, values={"epochs": 20.0, "lr": 1}) == {
    "epochs": 20,
    "lr": 1.0,
}
```

{% endcode %}

Precision-losing integer conversions are rejected:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.int(10, name="epochs")

instantiate(config, values={"epochs": 20.5})
# ValueError: float 20.5 would lose precision when converted to int
```

{% endcode %}

Bool values are rejected for numeric parameters:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return {
        "epochs": hp.int(10, name="epochs"),
        "lr": hp.float(0.1, name="lr"),
    }

instantiate(config, values={"epochs": True})
# ValueError: expected int but got bool
```

{% endcode %}

## Strict Numeric Types

With `strict=True`:

* `hp.int` and `hp.multi_int` accept real integers only, excluding bool.
* `hp.float` and `hp.multi_float` accept real floats only, excluding integers and bool.

{% code overflow="wrap" %}

```python
def strict_config(hp: HP):
    return {
        "epochs": hp.int(10, name="epochs", strict=True),
        "lr": hp.float(0.1, name="lr", strict=True),
    }

instantiate(strict_config, values={"epochs": 20.0})
# ValueError: expected int but got float

instantiate(strict_config, values={"lr": 1})
# ValueError: expected float but got int
```

{% endcode %}
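
The default and strict rules can be summarized as two small functions. This is a plain-Python sketch of the policy as documented on this page, not Hypster's source; `coerce_int` and `coerce_float` are hypothetical names, and the error messages only mirror the comments in the examples above.

{% code overflow="wrap" %}

```python
def coerce_int(value, strict=False):
    """Accept ints, and integral floats unless strict; never accept bool."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        raise ValueError("expected int but got bool")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and not strict:
        if value.is_integer():
            return int(value)
        raise ValueError(f"float {value} would lose precision when converted to int")
    raise ValueError(f"expected int but got {type(value).__name__}")


def coerce_float(value, strict=False):
    """Accept floats, and ints unless strict; never accept bool."""
    if isinstance(value, bool):
        raise ValueError("expected float but got bool")
    if isinstance(value, float):
        return value
    if isinstance(value, int) and not strict:
        return float(value)
    raise ValueError(f"expected float but got {type(value).__name__}")


print(coerce_int(20.0))  # 20
print(coerce_float(1))   # 1.0
```

{% endcode %}

In this sketch, `coerce_int(20.5)`, `coerce_int(True)`, `coerce_int(20.0, strict=True)`, and `coerce_float(1, strict=True)` all raise `ValueError`.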

## Nullable Numeric Values

Use `allow_none=True` when `None` is an intentional scalar value:

{% code overflow="wrap" %}

```python
def tree_config(hp: HP):
    return {
        "max_depth": hp.int(None, name="max_depth", allow_none=True),
        "dropout": hp.float(0.1, name="dropout", min=0.0, max=1.0, allow_none=True),
    }

assert instantiate(tree_config, values={"dropout": None}) == {
    "max_depth": None,
    "dropout": None,
}
```

{% endcode %}

Nullable elements are not supported for `multi_int` or `multi_float`. Use `multi_select(..., allow_none=True)` for nullable categorical lists.

## HPO Specs

{% code overflow="wrap" %}

```python
from hypster.hpo.types import HpoFloat, HpoInt

def search_config(hp: HP):
    return {
        "learning_rate": hp.float(
            0.001,
            name="learning_rate",
            min=1e-6,
            max=1.0,
            hpo_spec=HpoFloat(scale="log"),
        ),
        "batch_size": hp.int(
            64,
            name="batch_size",
            min=16,
            max=512,
            hpo_spec=HpoInt(step=16),
        ),
    }
```

{% endcode %}


# Boolean Types

Use `hp.bool()` for one boolean and `hp.multi_bool()` for a list of booleans.

## Signatures

{% code overflow="wrap" %}

```python
hp.bool(default, *, name, allow_none=False)
hp.multi_bool(default, *, name, allow_none=False)
```

{% endcode %}

## Single Boolean

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def config(hp: HP):
    return {
        "stream": hp.bool(True, name="stream"),
        "use_cache": hp.bool(True, name="use_cache"),
    }

cfg = instantiate(config, values={"stream": False})
assert cfg == {"stream": False, "use_cache": True}
```

{% endcode %}

`hp.bool` requires actual booleans. Strings such as `"true"` are rejected.
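
If flags arrive as strings, for example from environment variables or a CLI, one option is to convert them before they reach `values=`. The helper below is a sketch; `parse_bool` and the accepted spellings are assumptions of this example, not something Hypster defines.

{% code overflow="wrap" %}

```python
TRUE_STRINGS = {"true", "1", "yes", "on"}
FALSE_STRINGS = {"false", "0", "no", "off"}


def parse_bool(text):
    """Convert a small, explicit set of spellings to a real bool."""
    normalized = text.strip().lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    raise ValueError(f"Cannot interpret {text!r} as a boolean")


values = {"stream": parse_bool("true"), "use_cache": parse_bool("OFF")}
print(values)  # {'stream': True, 'use_cache': False}
```

{% endcode %}

Failing loudly on unrecognized spellings keeps ambiguous inputs such as `"maybe"` from silently becoming `False`.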

## Nullable Boolean

Use `allow_none=True` for tri-state values:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.bool(None, name="stream", allow_none=True)

assert instantiate(config) is None
assert instantiate(config, values={"stream": True}) is True
```

{% endcode %}

## Multiple Booleans

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.multi_bool([True, False, True], name="feature_flags")

assert instantiate(config, values={"feature_flags": [False, False, True]}) == [False, False, True]
```

{% endcode %}

Nullable elements are not supported for `multi_bool`. Use `multi_select(..., allow_none=True)` for nullable categorical lists.


# Textual Types

Use `hp.text()` for one string and `hp.multi_text()` for a list of strings.

## Signatures

{% code overflow="wrap" %}

```python
hp.text(default, *, name, allow_none=False)
hp.multi_text(default, *, name, allow_none=False)
```

{% endcode %}

## Single Text Values

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def llm_config(hp: HP):
    return {
        "model_name": hp.text("gpt-5.5-mini", name="model_name"),
        "system_prompt": hp.text("Answer concisely.", name="system_prompt"),
    }

cfg = instantiate(
    llm_config,
    values={"system_prompt": "Answer with citations."},
)

assert cfg["system_prompt"] == "Answer with citations."
```

{% endcode %}

## Nullable Text

Use `allow_none=True` when `None` is an intentional text value:

{% code overflow="wrap" %}

```python
def config(hp: HP):
    return hp.text(None, name="system_prompt", allow_none=True)

assert instantiate(config) is None
```

{% endcode %}

## Multiple Text Values

{% code overflow="wrap" %}

```python
def generation_config(hp: HP):
    return {
        "stop_sequences": hp.multi_text(["###", "END"], name="stop_sequences"),
        "columns": hp.multi_text(["title", "body"], name="columns"),
    }

cfg = instantiate(
    generation_config,
    values={"stop_sequences": ["STOP", "DONE"]},
)

assert cfg["stop_sequences"] == ["STOP", "DONE"]
```

{% endcode %}

Nullable elements are not supported for `multi_text`; use `multi_select(..., allow_none=True)` for nullable categorical lists.


# Experiment Tracking

Use `instantiate_with_params()` to log the exact parameters selected by a run.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params
from my_app.training import TrainingJob

def training_config(hp: HP) -> TrainingJob:
    model_family = hp.select(["linear", "forest"], name="model_family", default="forest", options_only=True)
    seed = hp.int(42, name="seed", min=0)
    batch_size = hp.int(64, name="batch_size", min=1)
    return TrainingJob(model_family=model_family, seed=seed, batch_size=batch_size)

run = instantiate_with_params(
    training_config,
    values={"model_family": "linear", "batch_size": 128},
)

assert run.params == {
    "model_family": "linear",
    "seed": 42,
    "batch_size": 128,
}
```

{% endcode %}

## Log To A Tracker

{% code overflow="wrap" %}

```python
def log_hypster_run(tracker, run):
    for path, value in run.params.items():
        tracker.log_param(path, value)
```

{% endcode %}

The params include defaults as well as explicit overrides. That matters because defaults may change between versions.
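
The effect of a changed default can be shown with plain dictionaries. This is a sketch that models resolution as a simple merge; `resolve` and the two default sets are hypothetical and stand in for two versions of the same config.

{% code overflow="wrap" %}

```python
def resolve(defaults, overrides):
    """Model parameter resolution as defaults updated by overrides (illustrative only)."""
    return {**defaults, **overrides}


defaults_v1 = {"model_family": "forest", "seed": 42, "batch_size": 64}
defaults_v2 = {"model_family": "forest", "seed": 42, "batch_size": 32}  # default changed later

overrides = {"model_family": "linear"}
full_params = resolve(defaults_v1, overrides)  # what the original run logged

replay_from_overrides = resolve(defaults_v2, overrides)
replay_from_full_params = resolve(defaults_v2, full_params)

print(replay_from_overrides["batch_size"])    # 32, silently differs from the original run
print(replay_from_full_params["batch_size"])  # 64, matches the original run
```

{% endcode %}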

## Recommended Run Record

Use `run.params` for replay, then store adjacent metadata that explains the code, data, and outputs that produced the run:

{% code overflow="wrap" %}

```python
import hypster

record = {
    "params": run.params,
    "metrics": {"accuracy": 0.91, "loss": 0.24},
    "outputs": {"model_uri": "models:/churn/17"},
    "artifacts": {"confusion_matrix": "artifacts/confusion_matrix.png"},
    "hypster_version": hypster.__version__,
    "app_version": "2026.05.24",
    "git_commit": "abc1234",
    "dataset_id": "warehouse/churn/2026-05-01",
}
```

{% endcode %}

For trackers with separate concepts, log `params` as parameters, versions and dataset IDs as tags, metrics as metrics, and large files as artifacts.
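
One way to prepare that split is a small helper that groups the record before any tracker call. This is a sketch under assumptions: `split_record` is a hypothetical name, and the choice of tag keys simply follows the record shown above.

{% code overflow="wrap" %}

```python
def split_record(record):
    """Group a run record into the buckets most trackers distinguish."""
    tag_keys = ("hypster_version", "app_version", "git_commit", "dataset_id")
    return {
        "params": dict(record.get("params", {})),
        "tags": {key: record[key] for key in tag_keys if key in record},
        "metrics": dict(record.get("metrics", {})),
        "artifacts": dict(record.get("artifacts", {})),
    }


record = {
    "params": {"model_family": "linear", "seed": 42, "batch_size": 128},
    "metrics": {"accuracy": 0.91, "loss": 0.24},
    "artifacts": {"confusion_matrix": "artifacts/confusion_matrix.png"},
    "hypster_version": "0.0.0",  # placeholder; use hypster.__version__ in practice
    "app_version": "2026.05.24",
    "git_commit": "abc1234",
    "dataset_id": "warehouse/churn/2026-05-01",
}

buckets = split_record(record)
```

{% endcode %}

Each bucket can then be passed to whichever call your tracker provides for parameters, tags, metrics, or artifacts.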

## Record Package And Code Versions

At minimum, log:

* `hypster.__version__`
* your package or application version
* the git commit or build ID
* `run.params`
* the metric values produced by the run

{% code overflow="wrap" %}

```python
import hypster

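metadata = {
    "hypster_version": hypster.__version__,
    "app_version": "2026.05.24",  # your package or application version
    "git_commit": "abc1234",  # git commit or build ID
    "params": run.params,
    "metrics": {"accuracy": 0.91, "loss": 0.24},
}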
```

{% endcode %}

## Replay

{% code overflow="wrap" %}

```python
from hypster import instantiate

replayed = instantiate(training_config, values=run.params)
assert replayed.model_family == run.value.model_family
assert replayed.batch_size == run.value.batch_size
```

{% endcode %}

If replay fails because a parameter is now unknown, inspect the old payload with `explore(training_config, values=old_params, on_unknown="warn")` and migrate it deliberately.
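
A deliberate migration can be as simple as an explicit rename map plus a list of dropped keys. The sketch below is an assumption about how you might structure it, not a Hypster utility; `migrate_params`, `bs`, and `legacy_flag` are hypothetical names.

{% code overflow="wrap" %}

```python
def migrate_params(old_params, renames=None, dropped=()):
    """Return a new params dict with explicit renames and removals applied."""
    renames = renames or {}
    migrated = {}
    for path, value in old_params.items():
        if path in dropped:
            continue  # deliberately discarded parameter
        migrated[renames.get(path, path)] = value
    return migrated


old_params = {"model_family": "linear", "seed": 42, "bs": 128, "legacy_flag": True}
new_params = migrate_params(
    old_params,
    renames={"bs": "batch_size"},
    dropped={"legacy_flag"},
)
```

{% endcode %}

The original payload stays untouched, so the old record and the migrated one can be stored side by side.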


# Serialization

Hypster does not define a custom serialization format. The recommended reproducibility artifact is the `params` dictionary returned by `instantiate_with_params()`.

## JSON Params

{% code overflow="wrap" %}

```python
import json
from hypster import HP, explore, instantiate, instantiate_with_params
from my_app.llms import LLMClient

def config(hp: HP) -> LLMClient:
    provider = hp.select(["openai", "gemini"], name="provider", default="openai", options_only=True)
    temperature = hp.float(0.2, name="temperature", min=0.0, max=2.0)
    return LLMClient(provider=provider, temperature=temperature)

run = instantiate_with_params(config, values={"provider": "gemini"})
payload = json.dumps(run.params, sort_keys=True)
restored = json.loads(payload)

assert instantiate(config, values=restored).provider == run.value.provider
```

{% endcode %}

## Complex Runtime Values

Use dict-backed `select` so the serialized params contain simple keys:

{% code overflow="wrap" %}

```python
from my_app.models import LargeMLP, SmallMLP

model_options = {
    "small": SmallMLP,
    "large": LargeMLP,
}

def model_config(hp: HP):
    model_cls = hp.select(model_options, name="model", default="small", options_only=True)
    return model_cls()
```

{% endcode %}

`instantiate_with_params(model_config, values={"model": "large"}).params` records `{"model": "large"}`, not the mapped class.
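
The reason keys matter can be seen without Hypster at all. In this sketch the two classes are local stand-ins for `my_app.models`; it only shows that a key survives a JSON round trip while a class object does not.

{% code overflow="wrap" %}

```python
import json


class SmallMLP:  # stand-in for my_app.models.SmallMLP
    pass


class LargeMLP:  # stand-in for my_app.models.LargeMLP
    pass


model_options = {"small": SmallMLP, "large": LargeMLP}

# The recorded params hold the key, which survives a JSON round trip.
params = {"model": "large"}
restored = json.loads(json.dumps(params))

# The mapped class is looked up again from the key at replay time.
model_cls = model_options[restored["model"]]

# A class object, by contrast, is not JSON-serializable.
try:
    json.dumps({"model": LargeMLP})
    class_is_serializable = True
except TypeError:
    class_is_serializable = False
```

{% endcode %}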

## Schema Serialization

`explore(config, return_schema=True).to_dict()` returns JSON-serializable schema metadata for UIs, catalogs, and validation tools.

{% code overflow="wrap" %}

```python
schema = explore(config, return_schema=True).to_dict()
json.dumps(schema)
```

{% endcode %}

Schema metadata is not a replacement for selected params. Use schema for rendering and params for replay.

## Versioned Replay Artifacts

When params leave the current process, store them with enough identity to understand which code and data produced the original run:

{% code overflow="wrap" %}

```python
import json
import hypster

artifact = {
    "kind": "hypster-run-params",
    "hypster_version": hypster.__version__,
    "config_name": "config",
    "app_version": "2026.05.24",
    "git_commit": "abc1234",
    "dataset_id": "warehouse/churn/2026-05-01",
    "params": run.params,
}

payload = json.dumps(artifact, sort_keys=True)
restored = json.loads(payload)

replayed = instantiate(config, values=restored["params"])
```

{% endcode %}

If replay fails after the config evolves, inspect the old payload with `explore(config, values=restored["params"], on_unknown="warn")`, migrate the parameter names deliberately, and save the migrated artifact as a new record.
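
Before replaying an artifact that came from outside the current process, it can help to check its identity fields first. The following is a minimal sketch, assuming the artifact layout shown above; `load_artifact` and `REQUIRED_KEYS` are hypothetical names, and the version string is a placeholder.

{% code overflow="wrap" %}

```python
import json

REQUIRED_KEYS = {"kind", "hypster_version", "config_name", "params"}


def load_artifact(payload, expected_config):
    """Parse a stored artifact and return its params, or raise ValueError."""
    artifact = json.loads(payload)
    missing = REQUIRED_KEYS - set(artifact)
    if missing:
        raise ValueError(f"artifact is missing keys: {sorted(missing)}")
    if artifact["kind"] != "hypster-run-params":
        raise ValueError(f"unexpected artifact kind: {artifact['kind']!r}")
    if artifact["config_name"] != expected_config:
        raise ValueError(
            f"artifact was produced by {artifact['config_name']!r}, "
            f"not {expected_config!r}"
        )
    return artifact["params"]


payload = json.dumps(
    {
        "kind": "hypster-run-params",
        "hypster_version": "0.0.0",  # placeholder for hypster.__version__
        "config_name": "config",
        "params": {"provider": "gemini", "temperature": 0.2},
    }
)
params = load_artifact(payload, expected_config="config")
```

{% endcode %}

The returned `params` are what you would pass as `values=` when replaying.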

## Replay After Defaults Change

The versioned artifact still protects you when defaults change, because replay uses the stored params rather than the current defaults:

{% code overflow="wrap" %}

```python
import json
import hypster
from hypster import HP, instantiate, instantiate_with_params

def training_config(hp: HP) -> int:
    return hp.int(64, name="batch_size")


old_run = instantiate_with_params(training_config)
artifact = {
    "kind": "hypster-run-params",
    "hypster_version": hypster.__version__,
    "config_name": "training_config",
    "app_version": "2026.05.24",
    "git_commit": "abc1234",
    "params": old_run.params,
}

payload = json.dumps(artifact, sort_keys=True)
restored = json.loads(payload)


def training_config(hp: HP) -> int:
    return hp.int(128, name="batch_size")


assert instantiate(training_config, values=restored["params"]) == 64
```

{% endcode %}


# Observing Past Runs

When a past run stores Hypster params, you can inspect what those params mean against the current config function.

## Inspect The Active Branch

{% code overflow="wrap" %}

```python
from hypster import HP, explore
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(1.0, name="C", min=1e-4, max=100.0)
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(200, name="n_estimators", min=10)
    return RandomForestClassifier(n_estimators=n_estimators, random_state=42)

model_options = {"linear": linear_model, "forest": forest_model}

def config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="model_family", default="linear", options_only=True)
    return hp.nest(selected_config, name="model")

past_params = {"model_family": "forest", "model.n_estimators": 500}

explore(config, values=past_params)
```

{% endcode %}

The tree shows the reachable parameters for that recorded branch.

## Handle Old Or Partial Payloads

Use `on_unknown="warn"` when reviewing old payloads that may include stale fields:

{% code overflow="wrap" %}

```python
explore(config, values={"model_family": "linear", "model.n_estimators": 500}, on_unknown="warn")
```

{% endcode %}

Do not replay with `on_unknown="ignore"` until you have decided which old values should be dropped or migrated.

## Compare Defaults

{% code overflow="wrap" %}

```python
schema = explore(config, values=past_params, return_schema=True)
current_branch_defaults = schema.defaults()
```

{% endcode %}

This is useful when you want to know which values were explicit in a past run and which current defaults would apply if they were omitted.
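
Once you have both dictionaries, the comparison itself is plain Python. The sketch below assumes a partial past payload and uses a hand-written stand-in for `schema.defaults()`; `compare_with_defaults` is a hypothetical helper. Note that it can only compare values: a past value equal to the current default may still have been set explicitly.

{% code overflow="wrap" %}

```python
def compare_with_defaults(past_params, current_defaults):
    """Split paths by whether the past value differs from, matches, or lacks a current default."""
    differs = {}
    matches = {}
    for path, past_value in past_params.items():
        if path in current_defaults and current_defaults[path] == past_value:
            matches[path] = past_value
        else:
            differs[path] = past_value
    # Paths the past run never recorded would fall back to today's defaults.
    omitted = {
        path: default
        for path, default in current_defaults.items()
        if path not in past_params
    }
    return {"differs": differs, "matches": matches, "omitted": omitted}


past_params = {"model_family": "forest"}
# Stand-in for schema.defaults() on the forest branch.
current_branch_defaults = {"model_family": "linear", "model.n_estimators": 200}

report = compare_with_defaults(past_params, current_branch_defaults)
```

{% endcode %}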


# Deploying to Production

When you deploy Hypster, treat selected params as part of the release artifact.

## Recommended Pattern

1. Define config functions in version-controlled Python modules.
2. Validate production values in CI with `instantiate(config, values=prod_params)`.
3. Store the exact `params` produced by `instantiate_with_params()`.
4. Log `hypster.__version__`, your app version, and the git commit.
5. Re-run `explore(config, values=prod_params)` during rollout to confirm the active branch.

## Example Smoke Test

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params
from my_app.deploy import ServiceDeployment

def service_config(hp: HP) -> ServiceDeployment:
    replicas = hp.int(2, name="replicas", min=1, max=20)
    provider = hp.select(["local", "remote"], name="provider", default="remote", options_only=True)
    timeout = hp.float(10.0, name="timeout", min=0.1, max=120.0)
    return ServiceDeployment(replicas=replicas, provider=provider, timeout=timeout)

def test_production_config():
    run = instantiate_with_params(
        service_config,
        values={"replicas": 4, "provider": "remote", "timeout": 30.0},
    )
    assert run.params["replicas"] == 4
```

{% endcode %}

## Avoid Stale Overrides

Keep `on_unknown="raise"` in production. Unknown and unreachable values often indicate a typo, dead branch, or payload from an older config version.

## Keep Configs Portable

Avoid doing network calls, training, or file writes while defining the config. Return values or lightweight objects, then execute effects in application code after instantiation.
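
As an illustration of that split, the config below only declares parameters and returns a small description, while the effectful step lives in a separate function. This is a sketch: `IndexSpec`, `index_config`, and `build_index` are hypothetical names, the effect is simulated, and the `hp: HP` annotation is omitted so the snippet has no import dependency.

{% code overflow="wrap" %}

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexSpec:
    path: str
    shards: int


def index_config(hp):
    # Only declare parameters and return a lightweight description.
    path = hp.text("indexes/main", name="path")
    shards = hp.int(4, name="shards", min=1)
    return IndexSpec(path=path, shards=shards)


def build_index(spec):
    # Side effects (simulated here) belong in application code, after instantiation.
    return [f"{spec.path}/shard-{i}" for i in range(spec.shards)]
```

{% endcode %}

With this shape, `explore()` and repeated instantiation stay cheap, and `build_index()` runs only when the application decides to execute it.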


# Public API

This page lists the public API exposed by `hypster`.

{% code overflow="wrap" %}

```python
from hypster import HP, InteractiveResult, explore, instantiate, instantiate_with_params, interact
```

{% endcode %}

## Config Function Contract

A config function must be callable and its first positional parameter must be named `hp`. A keyword-only `hp` parameter is rejected before the config executes because Hypster passes the `HP` object positionally.

Config functions are pure Python, not a DSL. `instantiate()`, `explore()`, `interact()`, and the HPO adapter all execute the function to discover or select values, so public API calls inherit any side effects or expensive work in the function body.

{% code overflow="wrap" %}

```python
from hypster import HP

def config(hp: HP) -> int:
    return hp.int(32, name="batch_size")
```

{% endcode %}

The `hp: HP` annotation is recommended but not mandatory. If the first parameter has a type annotation, it must include `HP`. Callable objects are supported when `inspect.signature()` can read their `__call__` signature; signature validation errors use the class name.
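
The signature rule can be approximated with `inspect.signature()`. This sketch is not Hypster's validator; `first_param_is_positional_hp` is a hypothetical helper that only mirrors the documented checks on the parameter name and kind.

{% code overflow="wrap" %}

```python
import inspect


def first_param_is_positional_hp(func):
    """Check that the first parameter is named `hp` and can be passed positionally."""
    # For callable objects, inspect.signature() reads the bound __call__ signature.
    params = list(inspect.signature(func).parameters.values())
    if not params:
        return False
    first = params[0]
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return first.name == "hp" and first.kind in positional_kinds


def good(hp):
    return 1


def keyword_only(*, hp):
    return 1


def wrong_name(config):
    return 1


class CallableConfig:
    def __call__(self, hp):
        return 1


results = {
    "good": first_param_is_positional_hp(good),
    "keyword_only": first_param_is_positional_hp(keyword_only),
    "wrong_name": first_param_is_positional_hp(wrong_name),
    "callable_object": first_param_is_positional_hp(CallableConfig()),
}
```

{% endcode %}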

Reference examples use small return values for compactness. In application docs and production code, the usual pattern is to return the initialized object your application will use.

Config functions may accept extra keyword-only execution arguments. Pass those directly; Hypster-owned names such as `values`, `on_unknown`, `return_schema`, `auto_apply`, `name`, and `description` are reserved at their API boundaries.

## instantiate

{% code overflow="wrap" %}

```python
instantiate(
    func,
    *,
    values=None,
    on_unknown="raise",
    **kwargs,
)
```

{% endcode %}

Executes a config function and returns whatever the function returns.

| Parameter    | Meaning                                                                                                 |
| ------------ | ------------------------------------------------------------------------------------------------------- |
| `func`       | Config function whose first argument is `hp`.                                                           |
| `values`     | Optional dictionary of parameter paths to overrides. Nested dictionaries are flattened to dotted paths. |
| `on_unknown` | One of `"raise"`, `"warn"`, or `"ignore"`. Defaults to `"raise"`.                                       |
| `**kwargs`   | Extra execution arguments forwarded to `func`.                                                          |

`values=` keys must match parameters reached during the run. Unknown values and inactive-branch values raise by default.

## instantiate\_with\_params

{% code overflow="wrap" %}

```python
instantiate_with_params(
    func,
    *,
    values=None,
    on_unknown="raise",
    **kwargs,
)
```

{% endcode %}

Executes a config function and returns an `InstantiationOutput`.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params

def config(hp: HP) -> str:
    return hp.select(["small", "large"], name="model", default="small")

run = instantiate_with_params(config, values={"model": "large"})

assert run.value == "large"
assert run.params == {"model": "large"}
```

{% endcode %}

`run.params` includes every reachable `hp.*` parameter selected during the run, including defaults.

## InstantiationOutput

{% code overflow="wrap" %}

```python
InstantiationOutput(value, params)
```

{% endcode %}

| Attribute | Meaning                                                             |
| --------- | ------------------------------------------------------------------- |
| `value`   | The value returned by the config function.                          |
| `params`  | A copied dictionary of selected parameter paths to selected values. |

## explore

{% code overflow="wrap" %}

```python
explore(
    func,
    *,
    values=None,
    on_unknown="raise",
    return_schema=False,
    **kwargs,
)
```

{% endcode %}

Traces a config function with a schema-recording `HP`. This executes the function to discover the active branch.

* With `return_schema=False`, prints a tree and returns `None`.
* With `return_schema=True`, returns a `ConfigSchema`.

{% code overflow="wrap" %}

```python
schema = explore(config, return_schema=True)
print(schema.defaults())
print(schema.to_dict())
```

{% endcode %}

## ConfigSchema

Returned by `explore(..., return_schema=True)`.

| Method          | Meaning                                                            |
| --------------- | ------------------------------------------------------------------ |
| `to_dict()`     | Returns JSON-serializable schema metadata.                         |
| `defaults()`    | Returns a flat dictionary of default values for the active branch. |
| `format_tree()` | Returns the same tree string printed by `explore()`.               |

## interact

{% code overflow="wrap" %}

```python
interact(
    func,
    *,
    values=None,
    on_unknown="raise",
    auto_apply=True,
    **kwargs,
) -> InteractiveResult
```

{% endcode %}

Creates a notebook widget session for a config function and returns an `InteractiveResult` handle. Install the visualization extra before using it:

{% code overflow="wrap" %}

```bash
uv add "hypster[viz]"
```

{% endcode %}

`interact()` explores the config to render reachable controls, then instantiates the config from the current widget state. Widget changes can trigger repeated config execution to keep dependent controls current. The returned handle is not the configured object itself:

{% code overflow="wrap" %}

```python
result = interact(config)
current_value = result.value
current_params = result.params
```

{% endcode %}

| Parameter    | Meaning                                                                                                                              |
| ------------ | ------------------------------------------------------------------------------------------------------------------------------------ |
| `func`       | Config function whose first argument is `hp`.                                                                                        |
| `values`     | Optional starting values, using the same flat or nested path forms as `instantiate()`.                                               |
| `on_unknown` | Unknown-value policy used while exploring and applying widget state.                                                                 |
| `auto_apply` | When `True`, valid widget edits update `.value` and `.params` immediately. When `False`, edits stay draft-only until Apply succeeds. |
| `**kwargs`   | Extra execution arguments forwarded to the config function.                                                                          |

## InteractiveResult

{% code overflow="wrap" %}

```python
result.value
result.params
result.snapshot
result.interact()
```

{% endcode %}

| Attribute or method | Meaning                                                                                                               |
| ------------------- | --------------------------------------------------------------------------------------------------------------------- |
| `value`             | The currently applied config return value. Raises `RuntimeError` while the applied state is invalid.                  |
| `params`            | Replayable selected params for the currently applied state. Raises `RuntimeError` while the applied state is invalid. |
| `snapshot`          | Widget-facing state with schema, draft values, applied values, selected params, mode, status, and error.              |
| `interact()`        | Renders another live widget view backed by the same session.                                                          |

Replay an interactive selection the same way you replay any Hypster run:

{% code overflow="wrap" %}

```python
result = interact(config)
replayed = instantiate(config, values=result.params)
```

{% endcode %}

## ParameterInfo

Each schema parameter contains:

| Field                 | Meaning                                                             |
| --------------------- | ------------------------------------------------------------------- |
| `name`                | Local parameter name.                                               |
| `path`                | Dotted parameter path used in `values=`.                            |
| `display_label`       | Human-friendly label derived from `name`, useful for generated UIs. |
| `kind`                | `int`, `float`, `text`, `bool`, `select`, `multi_*`, or `group`.    |
| `default_value`       | Default value before overrides.                                     |
| `selected_value`      | Value selected in the traced run.                                   |
| `options`             | Select or multi-select options when available.                      |
| `minimum` / `maximum` | Numeric bounds when available.                                      |
| `description`         | Optional human-readable help text from `description=`.              |
| `children`            | Nested parameters for groups.                                       |

`ConfigSchema.to_dict()` is intended for rendering and inspection, not as a complete validation schema. It exposes the active branch, selected values, options, numeric bounds, descriptions, and nested groups. It does not currently expose every HP call option, such as `allow_none`, `options_only`, or `strict`.

UI builders should use schema metadata to render controls, then round-trip user input through `explore(..., values=..., return_schema=True)` and `instantiate(..., values=...)` for authoritative validation. For dict-backed selects, `options` contains replayable keys, not mapped runtime objects.

## HP Scalar Methods

All parameter names must be valid Python identifier-style strings: use letters, numbers, and underscores, and do not include dots, spaces, hyphens, or Python keywords. Hypster composes dotted paths from nesting.

{% code overflow="wrap" %}

```python
hp.int(default, *, name, min=None, max=None, strict=False, allow_none=False, hpo_spec=None, description=None)
hp.float(default, *, name, min=None, max=None, strict=False, allow_none=False, hpo_spec=None, description=None)
hp.text(default, *, name, allow_none=False, description=None)
hp.bool(default, *, name, allow_none=False, description=None)
```

{% endcode %}

| Method     | Selected value                                                                                   |
| ---------- | ------------------------------------------------------------------------------------------------ |
| `hp.int`   | `int`; accepts real integers and integral floats like `64.0` unless `strict=True`; rejects bool. |
| `hp.float` | `float`; accepts real floats and integer overrides unless `strict=True`; rejects bool.           |
| `hp.text`  | `str`.                                                                                           |
| `hp.bool`  | `bool`.                                                                                          |

Use `allow_none=True` when `None` is a real scalar value. Numeric coercion is consistent for top-level parameters and nested paths.

## HP Select Methods

{% code overflow="wrap" %}

```python
hp.select(options, *, name, default=NO_DEFAULT, options_only=False, allow_none=False, hpo_spec=None, description=None)
hp.multi_select(options, *, name, default=None, options_only=False, allow_none=False, description=None)
```

{% endcode %}

`options` may be a list of logging-safe scalar choices or a dictionary from logging-safe scalar keys to any runtime values.

Use `allow_none=True` when `None` is one of the list-form choices:

{% code overflow="wrap" %}

```python
hp.select([None, "basic"], name="tokenizer", default=None, allow_none=True)
```

{% endcode %}

`hp.select([], name="choice", allow_none=True)` is valid and defaults to `None`. Without `allow_none=True`, an empty option list with no explicit default raises.

{% code overflow="wrap" %}

```python
model = hp.select(
    {
        "small": {"layers": 2},
        "large": {"layers": 4},
    },
    name="model",
    default="small",
)
```

{% endcode %}

The selected params record `"small"` or `"large"`, while the config function receives the mapped dictionary value.

By default, `options_only=False`, so custom scalar values outside the listed options are allowed. Use `options_only=True` for finite enums.

## HP Multi-Value Methods

{% code overflow="wrap" %}

```python
hp.multi_int(default, *, name, min=None, max=None, strict=False, allow_none=False, description=None)
hp.multi_float(default, *, name, min=None, max=None, strict=False, allow_none=False, description=None)
hp.multi_text(default, *, name, allow_none=False, description=None)
hp.multi_bool(default, *, name, allow_none=False, description=None)
```

{% endcode %}

These methods select lists whose elements are validated like the corresponding scalar type. `multi_int` accepts integral floats by default, and `multi_float` accepts integers by default. Both reject bool values. Nullable elements are not supported for `multi_int`, `multi_float`, `multi_text`, or `multi_bool`. Use `multi_select(..., allow_none=True)` for nullable categorical lists.

## HP.nest

{% code overflow="wrap" %}

```python
hp.nest(
    child,
    *,
    name,
    values=None,
    description=None,
    **kwargs,
)
```

{% endcode %}

Executes another config function under a named path.

{% code overflow="wrap" %}

```python
def child(hp: HP):
    return {"x": hp.int(1, name="x")}

def parent(hp: HP):
    return hp.nest(child, name="child")
```

{% endcode %}

Override nested values with dotted paths:

{% code overflow="wrap" %}

```python
instantiate(parent, values={"child.x": 2})
```

{% endcode %}

Nested dictionaries are normalized to the same dotted paths, so `values={"child": {"x": 2}}` is also valid. The scope name itself is not a parameter leaf: `values={"child": 2}` raises as unknown or unreachable.

Explicit child-local `values=` passed to `hp.nest(child, name="child", values=...)` are validated after the child config runs. Unknown or unreachable child keys raise instead of being ignored.

## HP.collect

{% code overflow="wrap" %}

```python
hp.collect(locals_dict, include=None, exclude=None)
```

{% endcode %}

Collects local variables into a dictionary while excluding `hp`, `self`, dunder names, and private names.

{% code overflow="wrap" %}

```python
def config(hp: HP):
    batch_size = hp.int(32, name="batch_size")
    learning_rate = hp.float(0.001, name="learning_rate")
    helper = "not returned"
    return hp.collect(locals(), exclude=["helper"])
```

{% endcode %}
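
The exclusion rule can be pictured as a plain dictionary filter. The helper below is an approximation for illustration only, not the library's implementation; `collect_locals` is a hypothetical name and the `include` argument is left out.

{% code overflow="wrap" %}

```python
def collect_locals(locals_dict, exclude=()):
    """Keep public local names, mirroring the exclusions described above."""
    skipped = {"hp", "self", *exclude}
    return {
        name: value
        for name, value in locals_dict.items()
        # A leading underscore covers both private and dunder names.
        if name not in skipped and not name.startswith("_")
    }


def example():
    hp = object()  # placeholder for the HP instance
    batch_size = 32
    learning_rate = 0.001
    helper = "not returned"
    _scratch = "private"
    return collect_locals(locals(), exclude=["helper"])


collected = example()
```

{% endcode %}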


# Optuna HPO API

Install Optuna support:

{% code overflow="wrap" %}

```bash
uv add 'hypster[optuna]'
```

{% endcode %}

Public imports:

{% code overflow="wrap" %}

```python
from hypster.hpo.optuna import suggest_values
from hypster.hpo.types import HpoCategorical, HpoFloat, HpoInt
```

{% endcode %}

## suggest\_values

{% code overflow="wrap" %}

```python
suggest_values(trial, config, **kwargs) -> dict
```

{% endcode %}

Runs `config` with a trial-backed `HP` proxy and returns a `values` dictionary that can be passed to `instantiate()`.

{% code overflow="wrap" %}

```python
values = suggest_values(trial, model_config)
cfg = instantiate(model_config, values=values)
```

{% endcode %}

The adapter is branch-aware. It only suggests parameters reached by the sampled execution path.

For numeric suggestions, `min` and `max` on the `hp.int` or `hp.float` call define the Optuna search range. If either bound is omitted, the parameter default is used for that missing bound. If both are omitted, the Optuna suggestion collapses to the default value.
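
The bound-filling rule reads as follows in code. This is a sketch of the documented behavior, not the adapter's source; `resolve_search_range` and its argument names are hypothetical.

{% code overflow="wrap" %}

```python
def resolve_search_range(default, min_value=None, max_value=None):
    """Fill a missing bound with the parameter default, as described above."""
    low = default if min_value is None else min_value
    high = default if max_value is None else max_value
    return low, high


both_bounds = resolve_search_range(0.1, min_value=0.0, max_value=1.0)
only_min = resolve_search_range(64, min_value=1)
no_bounds = resolve_search_range(64)
```

{% endcode %}

A parameter with no bounds therefore has a single-point range, so set both `min` and `max` on anything you want the study to tune.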

## HpoInt

{% code overflow="wrap" %}

```python
HpoInt(
    step=None,
    scale="linear",
    base=10.0,
    include_max=True,
)
```

{% endcode %}

| Field         | Meaning                                                                                             |
| ------------- | --------------------------------------------------------------------------------------------------- |
| `step`        | Quantization step passed to `trial.suggest_int`.                                                    |
| `scale`       | `"linear"` or `"log"`, passed as Optuna's `log` flag.                                               |
| `base`        | Must remain `10.0`; custom bases are rejected because Optuna `suggest_int()` has no base parameter. |
| `include_max` | When `False`, the Optuna high bound is reduced by `step`, or by `1` when `step` is `None`.          |

## HpoFloat

{% code overflow="wrap" %}

```python
HpoFloat(
    step=None,
    scale="linear",
    base=10.0,
    distribution=None,
    center=None,
    spread=None,
)
```

{% endcode %}

| Field              | Meaning                                                                                                                                              |
| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `step`             | Quantization step passed to `trial.suggest_float`.                                                                                                   |
| `scale`            | `"linear"` or `"log"`, passed as Optuna's `log` flag when `distribution` is not set.                                                                 |
| `base`             | Must remain `10.0`; custom bases are rejected because Optuna `suggest_float()` has no base parameter.                                                |
| `distribution`     | `None` and `"uniform"` use non-log `suggest_float()`. `"loguniform"` uses `suggest_float(..., log=True)`. `"normal"` and `"lognormal"` are rejected. |
| `center`, `spread` | Rejected by the Optuna adapter because they only make sense for normal/lognormal distributions.                                                      |

If `distribution="loguniform"`, the adapter uses Optuna's log sampling even if `scale` is left at its default.

## HpoCategorical

{% code overflow="wrap" %}

```python
HpoCategorical(ordered=False, weights=None)
```

{% endcode %}

The current Optuna adapter uses `trial.suggest_categorical()` for `hp.select`. `ordered=True` and `weights=...` are rejected because Optuna categorical suggestions cannot express ordered or weighted categorical semantics.

## Supported HP Calls

| HP call     | Optuna behavior                                           |
| ----------- | --------------------------------------------------------- |
| `hp.int`    | `trial.suggest_int(path, low, high, step=..., log=...)`   |
| `hp.float`  | `trial.suggest_float(path, low, high, step=..., log=...)` |
| `hp.select` | `trial.suggest_categorical(path, keys)`                   |
| `hp.nest`   | Prefixes child parameter paths.                           |

`multi_int`, `multi_float`, `multi_text`, `multi_bool`, and `multi_select` are not expanded by the Optuna adapter.

Explicit child-local overrides passed with `hp.nest(child, name="child", values=...)` are validated before `suggest_values()` returns. Unknown or unreachable child keys raise instead of being silently ignored.

## Nullable Numeric Values

`allow_none=True` is not supported for HPO numeric suggestions. Model nullable search choices as categoricals, then branch to numeric parameters when needed.
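
A config following that advice might look like the sketch below. The parameter names `use_dropout` and `dropout` and their ranges are hypothetical, and the `hp: HP` annotation is omitted so the snippet has no import dependency.

{% code overflow="wrap" %}

```python
def dropout_config(hp):
    # The categorical choice carries the "no value" case.
    use_dropout = hp.select(["off", "on"], name="use_dropout", default="off", options_only=True)
    if use_dropout == "off":
        return None
    # The numeric parameter is only reached on the "on" branch.
    return hp.float(0.1, name="dropout", min=0.0, max=0.5)
```

{% endcode %}

Because the adapter is branch-aware, `dropout` would only be suggested in trials where `use_dropout` is sampled as `"on"`.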

## Invalid HPO Specs

`suggest_values()` raises `ValueError` when a backend-agnostic HPO spec asks for semantics that Optuna cannot represent.

{% code overflow="wrap" %}

```python
from hypster import HP
from hypster.hpo.optuna import suggest_values
from hypster.hpo.types import HpoCategorical, HpoFloat


def normal_float_config(hp: HP):
    return hp.float(
        0.1,
        name="dropout",
        min=0.0,
        max=1.0,
        hpo_spec=HpoFloat(distribution="normal", center=0.2, spread=0.05),
    )


def weighted_choice_config(hp: HP):
    return hp.select(
        ["small", "large"],
        name="model",
        hpo_spec=HpoCategorical(weights=[0.8, 0.2]),
    )
```

{% endcode %}

Both fail during `suggest_values(trial, ...)` before a values dictionary is returned. Show the error as configuration feedback:

{% code overflow="wrap" %}

```
This HPO spec cannot be represented by the Optuna adapter. Use uniform/loguniform float sampling, remove categorical weights or ordering, or implement custom sampling inside the objective.
```

{% endcode %}


# Error Handling

Hypster validates names, values, paths, and branch reachability before it treats a run as replayable.

## Public Exception Surface

Public validation failures currently raise `ValueError`. When `on_unknown="warn"`, unknown or unreachable values emit a `UserWarning` and the run continues. Hypster does not expose a structured exception hierarchy today, so application and UI code should catch `ValueError` at the boundary where it can show a user-facing message.

## Unknown Or Unreachable Values

`instantiate()`, `instantiate_with_params()`, and `explore()` default to `on_unknown="raise"`.

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def config(hp: HP):
    mode = hp.select(["a", "b"], name="mode", default="a")
    if mode == "a":
        return {"count": hp.int(1, name="count")}
    return {"mode": mode}

instantiate(config, values={"mode": "b", "count": 3})
```

{% endcode %}

`count` is unreachable on the `mode="b"` branch, so Hypster raises.

Example message:

{% code overflow="wrap" %}

```
Unknown or unreachable parameters:
  - 'count': Unknown parameter

Run explore(config, values=...) to inspect the active branch.
Nested dict values are interpreted as parameter paths; use dict-backed select keys for objects.
```

{% endcode %}

Unknown and unreachable values share the same public error family because both mean "this key was not consumed by the active run." To diagnose the difference, run `explore(config, values=...)` for the selected branch:

* If the path is absent from that branch but present in another branch, it is unreachable or stale.
* If the path is absent from all branches, it is likely a typo or removed parameter.

A typo can include a nearest-name suggestion:

{% code overflow="wrap" %}

```python
instantiate(config, values={"mode": "a", "coutn": 3})
```

{% endcode %}

{% code overflow="wrap" %}

```
Unknown or unreachable parameters:
  - 'coutn': Did you mean 'count'? (similarity: 80%)

Run explore(config, values=...) to inspect the active branch.
Nested dict values are interpreted as parameter paths; use dict-backed select keys for objects.
```

{% endcode %}
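
If you want similar suggestions in your own tooling, the standard library's `difflib` is one option. Whether Hypster computes its suggestion this way is an assumption; `suggest_name` and the `0.6` cutoff are illustrative choices.

{% code overflow="wrap" %}

```python
import difflib


def suggest_name(unknown, known_paths, cutoff=0.6):
    """Return the closest known path, or None when nothing is similar enough."""
    matches = difflib.get_close_matches(unknown, known_paths, n=1, cutoff=cutoff)
    return matches[0] if matches else None


suggestion = suggest_name("coutn", ["mode", "count"])
no_suggestion = suggest_name("zzz", ["mode", "count"])
similarity = difflib.SequenceMatcher(None, "coutn", "count").ratio()
```

{% endcode %}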

Use softer policies only when intentional:

{% code overflow="wrap" %}

```python
instantiate(config, values={"mode": "b", "count": 3}, on_unknown="warn")
instantiate(config, values={"mode": "b", "count": 3}, on_unknown="ignore")
```

{% endcode %}

With `on_unknown="warn"`, use Python warning controls if you need to capture the message in a UI or test:

{% code overflow="wrap" %}

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record the warning even if it already fired once
    instantiate(config, values={"mode": "b", "count": 3}, on_unknown="warn")

assert caught
```

{% endcode %}

For user-facing tools, a common strategy is:

{% code overflow="wrap" %}

```python
try:
    value = instantiate(config, values=form_values)
except ValueError as exc:
    show_validation_error(str(exc))
else:
    run_workflow(value)
```

{% endcode %}

Interactive widget handles expose the same validation through a different boundary. Direct `instantiate()` and `explore()` calls raise `ValueError` on failure, while reading `InteractiveResult.value` or `InteractiveResult.params` during an invalid applied state raises `RuntimeError` carrying the underlying validation message. In either case, show the message to the user and keep the last valid submitted params separate from draft UI state.

## Invalid Names

Parameter and nest names must be Python identifier-style strings.

{% code overflow="wrap" %}

```python
hp.int(32, name="batch_size")  # valid
hp.int(32, name="batch-size")  # invalid
```

{% endcode %}
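
If names come from user input or generated code, it can help to pre-check them before they reach a config. This is a minimal sketch that assumes Python's own `str.isidentifier()` is a close enough approximation; `is_valid_name` is a hypothetical helper, and Hypster's exact rule may be stricter or looser.

{% code overflow="wrap" %}

```python
def is_valid_name(name: str) -> bool:
    """Approximate check: the name must be a non-empty Python identifier."""
    return name.isidentifier()


# Hypothetical candidate names.
checks = {name: is_valid_name(name) for name in ["batch_size", "batch-size", "model.learning_rate"]}
```

{% endcode %}

A dotted string fails this check, which is consistent with dotted paths being produced by composition rather than written into a single name.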

Use nesting to create dotted paths:

{% code overflow="wrap" %}

```python
hp.nest(child_config, name="model")
# child_config's "learning_rate" parameter becomes "model.learning_rate"
```

{% endcode %}

## Duplicate Value Paths

These two entries spell the same final parameter path:

{% code overflow="wrap" %}

```python
values = {
    "model.learning_rate": 0.01,
    "model": {"learning_rate": 0.01},
}
```

{% endcode %}

Hypster raises even if both values are identical, because duplicate inputs make logs ambiguous.
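
To catch duplicates before submitting a payload, you can flatten nested dictionaries into dotted paths and look for repeats. This is a minimal sketch of that idea, not Hypster's internal check; `flatten_values` and `find_duplicate_paths` are hypothetical helpers.

{% code overflow="wrap" %}

```python
from typing import Any


def flatten_values(values: dict[str, Any], prefix: str = "") -> list[tuple[str, Any]]:
    """Expand nested dicts into (dotted_path, value) pairs, keeping duplicates."""
    pairs: list[tuple[str, Any]] = []
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.extend(flatten_values(value, prefix=path))
        else:
            pairs.append((path, value))
    return pairs


def find_duplicate_paths(values: dict[str, Any]) -> list[str]:
    """Return each dotted path that is spelled more than once, in first-seen order."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for path, _ in flatten_values(values):
        if path in seen and path not in duplicates:
            duplicates.append(path)
        seen.add(path)
    return duplicates


duplicates = find_duplicate_paths(
    {
        "model.learning_rate": 0.01,
        "model": {"learning_rate": 0.01},
    }
)
```

{% endcode %}

The check deliberately ignores whether the duplicated values are equal, mirroring the rule that a repeated path is malformed input either way.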

## Type And Bounds Errors

Each `hp.*` call validates runtime values:

* `hp.int` accepts integral floats by default, but rejects non-integers, bool values, and precision-losing floats such as `64.5`.
* `hp.float` accepts integer values by default, but rejects non-numeric values and bool values.
* `strict=True` makes `hp.int` require real integers and `hp.float` require real floats.
* `hp.bool` requires actual `bool` values, not strings such as `"true"`.
* Numeric `min` and `max` bounds are enforced.
* `select` choices must be logging-safe scalars unless you use dictionary-backed selects.
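
The integer rules above can be illustrated with a small standalone validator. This is a sketch of the described behavior under stated assumptions, not Hypster's code: `check_int` is a hypothetical helper, `minimum` and `maximum` stand in for `min` and `max`, and inclusive bounds are assumed.

{% code overflow="wrap" %}

```python
from typing import Any, Optional


def check_int(
    value: Any,
    *,
    strict: bool = False,
    minimum: Optional[int] = None,
    maximum: Optional[int] = None,
) -> int:
    """Validate an integer parameter following the rules described above."""
    # bool is a subclass of int in Python, so it has to be rejected explicitly.
    if isinstance(value, bool):
        raise ValueError(f"expected an integer, got bool: {value!r}")
    # Non-strict mode accepts floats only when they represent a whole number.
    if isinstance(value, float) and not strict:
        if not value.is_integer():
            raise ValueError(f"float {value!r} would lose precision as an integer")
        value = int(value)
    if not isinstance(value, int):
        raise ValueError(f"expected an integer, got {type(value).__name__}: {value!r}")
    # Bounds are treated as inclusive here (an assumption).
    if minimum is not None and value < minimum:
        raise ValueError(f"{value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value} is above the maximum {maximum}")
    return value


batch_size = check_int(64.0, minimum=1, maximum=512)
```

{% endcode %}

The explicit `bool` check comes first because `isinstance(True, int)` is true in Python, so a later integer check would otherwise let booleans through.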

## Complex Select Values

Do not put dictionaries or lists directly inside list-backed `select` choices:

{% code overflow="wrap" %}

```python
hp.select([{"layers": 2}, {"layers": 4}], name="model")
```

{% endcode %}

Use dictionary-backed select instead:

{% code overflow="wrap" %}

```python
hp.select(
    {
        "small": {"layers": 2},
        "large": {"layers": 4},
    },
    name="model",
)
```

{% endcode %}

The selected key is replayable, and the runtime value can still be complex.
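
The separation between a logged key and a runtime value can be shown without Hypster at all. In this sketch, which uses only the standard library, the `options` dictionary and the `"model"` record key are hypothetical; the values are factories to stand in for objects that cannot be written to a log.

{% code overflow="wrap" %}

```python
import json

# Simple string keys map to runtime values that are not JSON-friendly.
options = {
    "small": lambda: {"layers": 2},
    "large": lambda: {"layers": 4},
}

# Only the key is written to the experiment record.
selected_key = "large"
params_record = json.dumps({"model": selected_key})

# Replaying the record recovers the key, which resolves to the runtime value again.
restored_key = json.loads(params_record)["model"]
runtime_value = options[restored_key]()
```

{% endcode %}

Logging the factory itself would fail to serialize, while the key stays stable even if the object behind it changes shape.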


# Optuna

Optuna is the first supported HPO backend for Hypster. The integration lives in `hypster.hpo.optuna`.

## Install

{% code overflow="wrap" %}

```bash
uv add 'hypster[optuna]'
```

{% endcode %}

or:

{% code overflow="wrap" %}

```bash
pip install 'hypster[optuna]'
```

{% endcode %}

## Basic Pattern

{% code overflow="wrap" %}

```python
import optuna
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from hypster import HP, instantiate
from hypster.hpo.optuna import suggest_values
from hypster.hpo.types import HpoFloat, HpoInt

def linear_model(hp: HP) -> LogisticRegression:
    C = hp.float(
        1.0,
        name="C",
        min=1e-4,
        max=10.0,
        hpo_spec=HpoFloat(scale="log"),
    )
    return LogisticRegression(C=C, max_iter=1000)

def forest_model(hp: HP) -> RandomForestClassifier:
    n_estimators = hp.int(
        200,
        name="n_estimators",
        min=50,
        max=1000,
        hpo_spec=HpoInt(step=50),
    )
    return RandomForestClassifier(n_estimators=n_estimators, random_state=42)

model_options = {
    "linear": linear_model,
    "forest": forest_model,
}

def model_config(hp: HP) -> ClassifierMixin:
    selected_config = hp.select(model_options, name="model_family", default="forest", options_only=True)
    return hp.nest(selected_config, name="model")

def objective(trial: optuna.Trial) -> float:
    values = suggest_values(trial, model_config)
    model = instantiate(model_config, values=values)
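    # train_and_score is a placeholder for your own evaluation code;
    # it should return the score that the study maximizes.
    return train_and_score(model)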

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
```

{% endcode %}

## What Is Supported

* `hp.int`, backed by `trial.suggest_int`
* `hp.float`, backed by `trial.suggest_float`
* `hp.select`, backed by `trial.suggest_categorical`
* `hp.nest`, which prefixes nested parameter paths

Multi-value HP calls are not expanded by the current adapter.

The adapter only accepts HPO spec fields that Optuna can represent. Supported fields include:

* `HpoInt(step=..., scale=..., include_max=...)`
* `HpoFloat(step=..., scale=...)`
* `HpoFloat(distribution="uniform"|"loguniform")`
* `HpoCategorical(ordered=False, weights=None)`

Unsupported fields raise instead of being ignored, for example:

* custom `base=...`
* normal and lognormal float distributions
* `center=...` and `spread=...`
* ordered categoricals and categorical weights

Nested explicit overrides passed through `hp.nest(..., values=...)` are validated before `suggest_values()` returns.
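
For orientation, here is roughly what the search space from the Basic Pattern would look like if written by hand against the Optuna trial methods listed above. This is a sketch under stated assumptions, not the adapter's code: `suggest_by_hand` is hypothetical, the dotted names `model.C` and `model.n_estimators` are assumed from the nesting prefix, and `RecordingTrial` is a stand-in so the example runs without Optuna installed. With Optuna, you would pass a real `optuna.Trial` instead.

{% code overflow="wrap" %}

```python
import math
import random


class RecordingTrial:
    """Stand-in that mirrors the optuna.Trial suggest_* signatures; not Optuna itself."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.calls: list[str] = []

    def suggest_categorical(self, name, choices):
        self.calls.append(f"suggest_categorical({name!r})")
        return self._rng.choice(list(choices))

    def suggest_int(self, name, low, high, *, step=1, log=False):
        self.calls.append(f"suggest_int({name!r})")
        return low + step * self._rng.randrange((high - low) // step + 1)

    def suggest_float(self, name, low, high, *, step=None, log=False):
        self.calls.append(f"suggest_float({name!r})")
        if log:
            sampled = math.exp(self._rng.uniform(math.log(low), math.log(high)))
        else:
            sampled = self._rng.uniform(low, high)
        return min(max(sampled, low), high)  # guard against rounding past the bounds


def suggest_by_hand(trial) -> dict:
    """Hand-written counterpart of the model_config search space (names are assumed)."""
    values = {"model_family": trial.suggest_categorical("model_family", ["linear", "forest"])}
    if values["model_family"] == "linear":
        # Log scale, matching HpoFloat(scale="log") on C.
        values["model.C"] = trial.suggest_float("model.C", 1e-4, 10.0, log=True)
    else:
        # Step of 50, matching HpoInt(step=50) on n_estimators.
        values["model.n_estimators"] = trial.suggest_int("model.n_estimators", 50, 1000, step=50)
    return values


values = suggest_by_hand(RecordingTrial(seed=0))
```

{% endcode %}

Only the parameters on the sampled branch are suggested, which is the same conditional shape the config function already expresses.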

## More

* [Perform Hyperparameter Optimization](/hypster/how-to-guides/perform-hyperparameter-optimization)
* [Optuna HPO API](/hypster/reference/optuna-hpo)


# Hamilton

Hamilton and Hypster can be used together by keeping graph construction and parameter selection separate:

* Use Hypster to select a small, replayable set of workflow parameters.
* Pass the instantiated value into your Hamilton driver, module, or execution wrapper.
* Log `instantiate_with_params(...).params` next to Hamilton run metadata.

## Shape

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params

def hamilton_run_config(hp: HP):
    dataset = hp.select(["sample", "warehouse"], name="dataset", default="sample", options_only=True)
    feature_set = hp.select(["basic", "extended"], name="feature_set", default="basic", options_only=True)
    model_family = hp.select(["linear", "forest"], name="model_family", default="linear", options_only=True)

    return {
        "dataset": dataset,
        "feature_set": feature_set,
        "model_family": model_family,
    }

run = instantiate_with_params(
    hamilton_run_config,
    values={"dataset": "warehouse", "feature_set": "extended"},
)

# driver.execute(..., inputs=run.value)
# tracker.log_params(run.params)
```

{% endcode %}

## Execution Boundary

Use `run.value` at the Hamilton execution boundary, and log `run.params` beside Hamilton metadata:

{% code overflow="wrap" %}

```python
# from hamilton import driver
# import my_hamilton_nodes

run = instantiate_with_params(
    hamilton_run_config,
    values={"dataset": "warehouse", "feature_set": "extended"},
)

# dr = driver.Builder().with_modules(my_hamilton_nodes).build()
# result = dr.execute(
#     ["trained_model", "validation_metrics"],
#     inputs=run.value,
# )
# tracker.log_params(run.params)
# tracker.log_metrics(result["validation_metrics"])
```

{% endcode %}

Use Hamilton `inputs=` for runtime choices such as dataset, feature set, and model family. Use Hamilton `config=` only for static graph-construction choices in your Hamilton project. Hypster stays outside Hamilton's graph; it selects and records the values that you pass into the graph.

Hypster does not ship a Hamilton-specific adapter today. This page documents the integration pattern for projects that already use Hamilton.


# Haystack

Haystack pipelines often have swappable retrievers, rankers, prompts, and generation models. Hypster can select those pieces before you build or run the pipeline.

## Shape

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate_with_params

def retrieval_config(hp: HP):
    retrieval = hp.select(
        {
            "keyword": {"type": "bm25", "index": "docs"},
            "vector": {"type": "embedding", "index": "docs-embeddings"},
            "hybrid": {"type": "hybrid", "keyword_weight": 0.35},
        },
        name="kind",
        default="hybrid",
        options_only=True,
    )
    return retrieval

def haystack_pipeline_config(hp: HP):
    retrieval = hp.nest(retrieval_config, name="retrieval")
    top_k = hp.int(8, name="top_k", min=1, max=50)
    rerank = hp.bool(True, name="rerank")
    answer_style = hp.select(
        ["brief", "sourced"],
        name="answer_style",
        default="sourced",
        options_only=True,
    )

    return {
        "retrieval": retrieval,
        "top_k": top_k,
        "rerank": rerank,
        "answer_style": answer_style,
    }

run = instantiate_with_params(
    haystack_pipeline_config,
    values={"retrieval.kind": "vector", "top_k": 12},
)

# pipeline = build_haystack_pipeline(run.value)
# tracker.log_params(run.params)
```

{% endcode %}

## Build The Pipeline After Instantiation

Keep Hypster configs focused on replayable settings. Build Haystack components after `instantiate_with_params()` so exploration and UI generation do not open indexes, clients, or network connections.

{% code overflow="wrap" %}

```python
# build_*_retriever, make_pipeline, and tracker are placeholders for your own code.
def build_haystack_pipeline(settings):
    retrieval = settings["retrieval"]

    if retrieval["type"] == "bm25":
        retriever = build_bm25_retriever(index=retrieval["index"], top_k=settings["top_k"])
    elif retrieval["type"] == "embedding":
        retriever = build_embedding_retriever(index=retrieval["index"], top_k=settings["top_k"])
    else:
        retriever = build_hybrid_retriever(
            keyword_weight=retrieval["keyword_weight"],
            top_k=settings["top_k"],
        )

    pipeline = make_pipeline(
        retriever=retriever,
        rerank=settings["rerank"],
        answer_style=settings["answer_style"],
    )
    return pipeline


run = instantiate_with_params(
    haystack_pipeline_config,
    values={"retrieval.kind": "hybrid", "answer_style": "sourced"},
)

pipeline = build_haystack_pipeline(run.value)
tracker.log_params(run.params)
```

{% endcode %}

If your components are cheap pure-Python objects, a config can return factories or initialized components. For retrievers, indexes, remote LLM clients, and pipelines that allocate resources, prefer returning lightweight settings and building the Haystack pipeline outside the config function.
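
One way to keep construction out of the config is to return a factory and call it later. The sketch below uses `functools.partial` for that; `RemoteClient`, `client_settings`, and the endpoint are hypothetical, and the function is plain Python rather than a Hypster config so that it runs on its own.

{% code overflow="wrap" %}

```python
from functools import partial


class RemoteClient:
    """Hypothetical stand-in for a component that is expensive to construct."""

    connections_opened = 0

    def __init__(self, endpoint: str, timeout: float) -> None:
        RemoteClient.connections_opened += 1
        self.endpoint = endpoint
        self.timeout = timeout


def client_settings(timeout: float = 5.0):
    """Return a factory instead of a live client, so calling this stays cheap."""
    return partial(RemoteClient, endpoint="https://example.invalid", timeout=timeout)


factory = client_settings(timeout=2.0)  # nothing has been constructed yet
client = factory()  # construction happens here, outside the settings function
```

{% endcode %}

The same shape applies inside a config function: repeated exploration only builds `partial` objects, and the resource is allocated once by whoever calls the factory.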

Hypster does not ship a Haystack-specific adapter today. Use this pattern when you want replayable parameter selection around an existing Haystack builder.


# Use Cases

Hypster is a good fit when configuration is conditional, nested, or part of an experiment record.

## Machine Learning

* Choose model families such as `linear`, `forest`, `boosted`, or `neural`.
* Keep family-specific parameters on their active branch.
* Use `hpo_spec=` to make the same config searchable by Optuna.
* Log `instantiate_with_params().params` with metrics.

See [Machine Learning](/hypster/examples/machine-learning).

## Data Processing

* Configure ingestion paths, delimiters, schemas, cleaning rules, and export formats.
* Represent environment choices such as `sample` vs `full`.
* Replay a pipeline run from selected params.

See [Data Processing](/hypster/examples/data-processing).

## AI Workflows

* Switch providers, models, prompts, retrieval strategies, and output modes.
* Use dict-backed selects for complex provider or retriever objects.
* Explore the active branch before rendering a UI or submitting a job.

See [AI Workflows](/hypster/examples/ai-workflows).

## Internal Tools And UIs

* Generate form controls from `explore(..., return_schema=True)`.
* Submit UI state as `values=`.
* Recompute the schema when a branch-selecting value changes.

See [Interactive UI From Schema](/hypster/examples/interactive-ui).

## Production Replay

* Store selected params next to a versioned config function.
* Keep `on_unknown="raise"` so stale payloads fail visibly.
* Smoke-test production parameter payloads in CI.
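
A CI smoke test along these lines can be a loop over stored payloads that collects validation errors. This is a minimal sketch: `find_failing_payloads` is a hypothetical helper, the payload labels are made up, and `fake_instantiate` is a stand-in so the example runs without Hypster. In a real test you would pass `hypster.instantiate` and your own config function.

{% code overflow="wrap" %}

```python
from typing import Any, Callable


def find_failing_payloads(
    instantiate: Callable[..., Any],
    config: Callable[..., Any],
    payloads: dict[str, dict],
) -> dict[str, str]:
    """Run every stored payload through instantiate and collect validation errors."""
    failures: dict[str, str] = {}
    for label, values in payloads.items():
        try:
            instantiate(config, values=values)
        except ValueError as exc:
            failures[label] = str(exc)
    return failures


def fake_config(values: dict) -> dict:
    """Stand-in config that just echoes its values."""
    return dict(values)


def fake_instantiate(config, values):
    """Stand-in for hypster.instantiate with a strict unknown-value policy."""
    unknown = sorted(set(values) - {"mode", "count"})
    if unknown:
        raise ValueError(f"Unknown or unreachable parameters: {unknown}")
    return config(values)


payloads = {
    "nightly": {"mode": "a", "count": 3},
    "stale": {"mode": "a", "coutn": 3},
}
failures = find_failing_payloads(fake_instantiate, fake_config, payloads)
```

{% endcode %}

Asserting that `failures` is empty in CI turns a stale payload into a failed build instead of a silent runtime surprise.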


# Unique Features

## Define-By-Run Configuration

Hypster config functions are normal Python rather than a DSL. Branches, loops, helper functions, lists, imports, and object construction work the way Python developers expect.

The trade-off for that flexibility is that Hypster discovers parameters by actually running the config function. Keep discovery paths fast and side-effect-free so exploration, UI rendering, and HPO can rerun them safely.

## Branch-Aware Exploration

`explore(config, values=...)` traces the same branch that `instantiate(config, values=...)` would run. That makes it useful for UIs, HPO, schema exports, and debugging stale values.

## Replayable Params Sidecar

`instantiate_with_params()` returns both the runtime value and the selected parameter dictionary. Defaults are included, so replay does not depend on future default changes.
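
The value of logging defaults can be seen in a small standalone example. This sketch does not use Hypster; `resolve` and the two default dictionaries are hypothetical and only model what happens when a default changes between the original run and its replay.

{% code overflow="wrap" %}

```python
def resolve(defaults: dict, overrides: dict) -> dict:
    """Overlay explicit overrides on top of defaults."""
    return {**defaults, **overrides}


defaults_at_run_time = {"model": "small", "temperature": 0.0}
defaults_after_change = {"model": "small", "temperature": 0.3}  # a later release changes a default

overrides = {"model": "large"}

# A sidecar that includes defaults records the full resolved parameter set.
logged_params = resolve(defaults_at_run_time, overrides)

# Replaying the full sidecar reproduces the original run under the new defaults.
replay_from_sidecar = resolve(defaults_after_change, logged_params)

# Replaying only the explicit overrides silently picks up the changed default.
replay_from_overrides = resolve(defaults_after_change, overrides)
```

{% endcode %}

Here `replay_from_sidecar` matches the original run, while `replay_from_overrides` drifts to the new temperature without any error.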

## Nested Composition

`hp.nest()` gives reusable child configs their own dotted parameter paths without requiring a global registry or decorator.

## Dict-Backed Selects

Hypster separates the logged key from the runtime value. This lets you log `"large"` while returning an object, dictionary, tuple, or factory.

For swappable components, the idiomatic shape is a named options dictionary from simple keys to config functions, followed by `hp.nest()` on the selected function. That keeps logs stable, UI options simple, and the parent config readable even when the option set grows.

## Strict Unknown Values By Default

Unknown and unreachable values raise by default. This is stricter than many config systems, but it protects experiment logs and production replays from silently accepting stale parameters.


# Hypster vs. Alternatives

Hypster is not trying to replace every configuration tool. It is focused on Python workflows where configuration is executable, conditional, nested, and part of a reproducibility record.

## When Hypster Fits

Use Hypster when:

* parameter choices change which code path runs
* a config needs reusable child configs
* you want to explore or render the active parameter schema
* you need replayable selected params, including defaults
* the same config should support manual runs, UI runs, and HPO

## Compared With Static Config Files

YAML, TOML, and JSON are excellent for static settings. Hypster is better when the configuration space itself contains logic:

{% code overflow="wrap" %}

```python
if provider == "openai":
    llm = hp.nest(openai_config, name="openai")
else:
    llm = hp.nest(gemini_config, name="gemini")
```

{% endcode %}

The active branch determines which parameters exist for a run.

## Compared With Hydra

Hydra is powerful for hierarchical file-based composition and command-line overrides. Hypster is smaller and plain-Python-first:

* no config file format required
* no decorator required
* no global registry required
* config functions return normal Python values
* branch exploration and HPO use the same function

Hydra is likely a better fit if your project already depends on file-based config groups and large CLI-driven sweeps.

## Compared With Pydantic Settings

Pydantic is excellent for validating known fields. Hypster is focused on discovering which fields are active at runtime and replaying the selected parameter paths. You can still return Pydantic models from a Hypster config if that is useful in your application.

## Compared With Optuna Directly

Optuna's define-by-run API is excellent for optimization. Hypster lets you use a similar shape for normal configuration, schema exploration, experiment tracking, and UI generation, then hand the same config to Optuna when you want HPO.


# Origin Story

Hypster grew from a recurring problem in AI and ML projects: configuration is often more dynamic than a settings file, but experiment logs still need stable, replayable parameters.

In practice, a workflow might need to switch between model families, providers, retrieval modes, preprocessing steps, local and remote execution, or production and evaluation paths. Each branch has different parameters. Logging every possible value is noisy, but silently ignoring inactive values makes experiments hard to trust.

Hypster's answer is a small define-by-run API:

* write the config as plain Python
* use `hp.*` calls for the values that matter
* compose nested configs with `hp.nest`
* inspect active branches with `explore`
* run with `instantiate`
* log and replay with `instantiate_with_params`

The goal is not to hide Python behind a configuration language. The goal is to make the parts of Python configuration that matter for reproducibility explicit.


# Articles

These articles explain the ideas and use cases that led to Hypster.

{% embed url="<https://medium.com/@giladrubin/introducing-hypster-a-pythonic-framework-for-managing-configurations-to-build-highly-optimized-ai-5ee004dbd6a5>" %}

{% embed url="<https://towardsdatascience.com/implementing-modular-rag-with-haystack-and-hypster-d2f0ecc88b8f>" %}

{% embed url="<https://medium.com/@giladrubin/5-pillars-for-a-hyper-optimized-ai-workflow-21fcaefe48ca>" %}


# ADR: Strict Unknown Values

Hypster treats `values` as a reproducibility surface, so unknown or unreachable overrides should fail loudly by default instead of being silently logged or replayed incorrectly. We changed the default unknown-parameter policy for `instantiate`, `instantiate_with_params`, and `explore` to `raise`, while keeping `warn` and `ignore` available for callers who intentionally want softer behavior.

## Consequences

* Nested-dict `values` must participate in unknown/unreachable checking as their equivalent dotted parameter paths.
* A parameter path specified more than once through mixed dotted and nested forms always raises as malformed input, even when the duplicate values are equal.
* Unknown/unreachable errors should guide users to `explore(config, values=...)` to inspect the active branch, but `instantiate` must not execute extra branch exploration automatically.
* This is stricter than the previous default, but it better matches logging and replay workflows where ignored overrides are usually bugs.

## Example

{% code overflow="wrap" %}

```python
from hypster import HP, instantiate

def config(hp: HP):
    branch = hp.select(["a", "b"], name="branch", default="a")
    if branch == "a":
        return {"x": hp.int(1, name="x")}
    return {"y": hp.int(2, name="y")}

instantiate(config, values={"branch": "b", "x": 10})
# ValueError: Unknown or unreachable parameters
```

{% endcode %}

`x` is a real parameter, but it is not reachable on the selected branch. Raising by default prevents a run from looking as though it used `x=10` when that value had no effect.


