# Circuit Structure Search

Find better circuit designs by exploring the space of rewrite-equivalent structures.
## When to Use
Asgard offers two optimization paradigms. Choose based on what you want to change:
| | Gradient Optimization | Circuit Search |
|---|---|---|
| Changes | Parameter values (`scalar`, `const`) | Circuit topology (structure) |
| Method | JAX numerical gradients | Rewrite-rule exploration |
| Speed | Fast (20–100 evaluations) | Slower (hundreds–thousands) |
| Result | Same structure, better parameters | Different structure entirely |
| Use when | You know the right circuit shape | You want to discover a shape |
Use gradient optimization when the circuit structure is correct but
parameter values need tuning (e.g. finding the right coefficient in
`scalar(a)`). See the Gradient Optimization guide.

Use circuit search when you want to explore alternative structures — for
instance, discovering that `composition(add, split)` (double the input) fits
your data better than a single `scalar`.
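The doubling claim is easy to check in plain Python. The `split`, `add`, and `composition` functions below are toy stand-ins for illustration, not Asgard's circuit primitives:

```python
# split copies its input into a pair, add sums the pair,
# so their composite doubles the input.
def split(x):
    return (x, x)

def add(pair):
    return pair[0] + pair[1]

def composition(f, g):
    """Apply g first, then f (f after g)."""
    return lambda x: f(g(x))

double = composition(add, split)
print(double(3.0))  # → 6.0
```

Circuit search can discover such a structure from data, whereas gradient optimization can only tune coefficients inside a fixed shape.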
## How It Works
Circuit search has three components:
- `CircuitSearchSpace` generates neighbor circuits by applying rewrite rules at every node in the parse tree
- A **search strategy** decides which neighbors to explore next
- A **fitness evaluator** scores each candidate circuit against your data
```
initial circuit
       │
       ▼
┌─────────────┐     ┌───────────────┐
│ SearchSpace │────▶│  neighbors[]  │
│  (rewrite   │     └───────┬───────┘
│   rules)    │             │
└─────────────┘   ┌─────────▼─────────┐
                  │ Fitness Evaluator │
                  │   (score each)    │
                  └─────────┬─────────┘
                            │
                   ┌────────▼────────┐
                   │ Search Strategy │
                   │   (pick next)   │
                   └────────┬────────┘
                            │
                            ▼
                       best circuit
```
### Neighbor generation
`CircuitSearchSpace` walks the Lark parse tree of a circuit and tries every
applicable rewrite rule at every node. Each successful rewrite produces a new
candidate circuit. Invalid rewrites (e.g. degree mismatches) are silently
discarded. Duplicate circuits are removed based on their string representation.
Available rule groups:

- `circuit_category_rules` — composition associativity, identity laws
- `circuit_monoidal_rules` — tensor product associativity, interchange laws
- `circuit_atomic_rules` — commutativity of add/multiply, zero/one identities
- `circuit_trace_rules` — trace naturality, cyclic properties
- `circuit_sde_rules` — constant drift/diffusion specializations
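The generate-and-dedupe pattern can be sketched with literal string rewrites. This is a toy illustration only: the real `CircuitSearchSpace` rewrites Lark parse trees, and the `RULES` table and `neighbors` helper below are hypothetical stand-ins:

```python
# Toy neighbor generation via literal string rewrites. Each applicable rule
# yields one candidate; duplicates are dropped by string representation,
# mirroring how candidates are deduplicated in the real search space.
RULES = [
    ("add(a, b)", "add(b, a)"),                    # commutativity of add
    ("composition(id, add(a, b))", "add(a, b)"),   # left identity law
]

def neighbors(circuit: str) -> list[str]:
    seen, out = set(), []
    for lhs, rhs in RULES:
        if lhs in circuit:
            candidate = circuit.replace(lhs, rhs, 1)
            if candidate != circuit and candidate not in seen:
                seen.add(candidate)
                out.append(candidate)
    return out

print(neighbors("composition(id, add(a, b))"))
# → ['composition(id, add(b, a))', 'add(a, b)']
```

Both candidates are semantically equal to the original; the search keeps whichever scores best under the fitness evaluator.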
## Search Strategies
### Greedy (fast, local)
Hill-climbing: always moves to the best neighbor. Stops when no neighbor improves on the current circuit.
- Pros: Fast convergence, low resource usage
- Cons: Easily stuck in local optima
- Best for: Quick exploration, simple search spaces
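The strategy fits in a few lines of generic Python. This is an illustrative skeleton over toy integer "circuits", not the library's optimizer:

```python
def greedy_search(start, neighbors, fitness, max_iterations=50):
    """Hill-climbing: always move to the best neighbor; stop when none improves."""
    current, best = start, fitness(start)
    for _ in range(max_iterations):
        scored = sorted((fitness(c), c) for c in neighbors(current))
        if not scored or scored[0][0] >= best:
            break  # local optimum: no neighbor improves on the current candidate
        best, current = scored[0]
    return current, best

# Toy search space: integers, neighbors are +/-1, fitness minimum at 7.
circuit, loss = greedy_search(0, lambda n: [n - 1, n + 1], lambda n: abs(n - 7))
print(circuit, loss)  # → 7 0
```

On this convex toy landscape greedy reaches the optimum; on real circuit landscapes it stalls at the first candidate whose neighbors are all worse.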
### Beam search (balanced)
Maintains a "beam" of the top-k circuits. At each step, expands all neighbors of all circuits in the beam, then keeps the best k overall.
- Pros: Explores multiple paths simultaneously, good trade-off
- Cons: Memory scales with beam width × neighborhood size
- Best for: Most practical use cases
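The expand-then-truncate loop can be sketched as follows (again a generic toy illustration over integers, not the library's `BeamSearchOptimizer` internals):

```python
def beam_search(start, neighbors, fitness, beam_width=3, max_iterations=10):
    """Keep the top-k candidates; expand all their neighbors each step."""
    beam = [(fitness(start), start)]
    for _ in range(max_iterations):
        expanded = {c for _, parent in beam for c in neighbors(parent)}
        # Pool current beam with scored neighbors, dedupe, keep best k overall.
        pool = sorted(set(beam) | {(fitness(c), c) for c in expanded})
        new_beam = pool[:beam_width]
        if new_beam == beam:
            break  # beam has stabilized
        beam = new_beam
    return beam[0]

# Same toy space: integers with neighbors +/-1, fitness minimum at 7.
best_loss, best = beam_search(0, lambda n: [n - 1, n + 1], lambda n: abs(n - 7))
print(best_loss, best)  # → 0 7
```

Because several candidates survive each step, a single bad neighborhood does not end the search, at the cost of beam_width times the evaluations.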
### Genetic algorithm (thorough)
Population-based evolution with tournament selection, elite preservation, and random mutation via rewrite rules.
- Pros: Escapes local optima, explores broadly
- Cons: Slowest, needs careful tuning of population/mutation parameters
- Best for: Large search spaces, when greedy/beam get stuck
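A compact sketch of that loop, with toy stand-ins throughout (integers for circuits, a ±1 nudge for rewrite mutation; not the library's implementation):

```python
import random

def genetic_search(population, mutate, fitness, generations=30,
                   elite_size=2, tournament_k=3, seed=0):
    """Toy GA loop: elites carry over unchanged; the rest of each generation
    consists of mutated tournament winners."""
    rng = random.Random(seed)
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        next_gen = ranked[:elite_size]                 # elite preservation
        while len(next_gen) < len(population):
            contenders = rng.sample(population, tournament_k)
            winner = min(contenders, key=fitness)      # tournament selection
            next_gen.append(mutate(winner, rng))       # random mutation
        population = next_gen
    return min(population, key=fitness)

# Toy stand-ins: integer "circuits", mutation nudges by one, optimum at 7.
best = genetic_search(
    population=[0, 20, -5, 13],
    mutate=lambda n, rng: n + rng.choice([-1, 0, 1]),
    fitness=lambda n: abs(n - 7),
)
```

Elite preservation guarantees the best candidate never regresses between generations, while random mutation lets the population wander out of basins that would trap greedy search.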
### Comparison
| Strategy | Evaluations | Exploration | Tuning |
|---|---|---|---|
| Greedy | Fewest | Local only | None |
| Beam search | Moderate | Multiple paths | `beam_width` |
| Genetic | Most | Broad | `population_size`, `mutation_rate`, `elite_size` |
## YAML Example Walkthrough
The example `optimization/11_circuit_optimization_demo.yaml` demonstrates all
three strategies on the same problem:
```yaml
name: Circuit Optimization Demo
description: |
  Compare greedy, beam search, and genetic algorithms.
  Given data from y = x (identity), find the simplest circuit.

pipeline:
  dataset:
    generator:
      function: "y = x"
      x_range: [-3, 3]
      n_samples: 25
      noise_level: 0.1

  initial_circuit: "id"

  optimizer:
    method: circuit_rewrite
    compare_strategies: true
    strategies:
      greedy:
        max_iterations: 30
      beam_search:
        beam_width: 8
        max_iterations: 15
      genetic:
        population_size: 20
        mutation_rate: 0.3
        elite_size: 3
        max_iterations: 20
    rule_groups:
      - circuit_category_rules
      - circuit_monoidal_rules
      - circuit_atomic_rules
    max_rewrites_per_step: 30
    complexity_weight: 0.002

  output:
    type: CircuitOptimizationResult
```
Key fields:

- `initial_circuit` — starting point for the search. All strategies begin here.
- `rule_groups` — which rewrite rule groups to use for neighbor generation. More groups = larger neighborhood = slower but more thorough.
- `max_rewrites_per_step` — caps the number of rewrites considered per node. Useful for controlling runtime on deeply nested circuits.
- `compare_strategies: true` — runs all listed strategies and reports results side-by-side.
- `complexity_weight` — controls the multi-objective trade-off (see below).
Run it with:

```bash
uv run asgard example optimization/11_circuit_optimization_demo.yaml
```
## Multi-Objective Fitness
By default, fitness evaluators use prediction error alone (e.g. MSE via
`MSEFitnessEvaluator`). `MultiObjectiveFitnessEvaluator` wraps any primary
evaluator and adds a penalty for circuit size:

```
fitness = primary_loss + complexity_weight × circuit_length
```

where `circuit_length` is the string length of the circuit representation (a
proxy for structural complexity).
This prevents the search from producing circuits that overfit by growing
arbitrarily complex. A typical starting value is 0.001–0.01; increase it
if circuits grow too large, decrease it if accuracy is more important.
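A numeric illustration of the trade-off, using toy loss values and a helper that mirrors the formula above:

```python
# fitness = primary_loss + complexity_weight * circuit_length, with the
# string length of the circuit standing in for structural complexity.
def multi_objective_fitness(primary_loss, circuit, complexity_weight=0.005):
    return primary_loss + complexity_weight * len(circuit)

small = multi_objective_fitness(0.10, "scalar(2.0)")                 # shorter, slightly worse fit
big = multi_objective_fitness(0.09, "composition(scalar(2.0), id)")  # longer, slightly better fit
# With this weight, the shorter circuit wins: small < big.
```

Raising `complexity_weight` shifts the balance further toward shorter circuits; at `0` the penalty vanishes and only the primary loss matters.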
## Python API
```python
from gimle.asgard.circuit.circuit import Circuit
from gimle.asgard.circuit.circuit_search import CircuitSearchSpace
from gimle.asgard.circuit.circuit_optimizer import BeamSearchOptimizer
from gimle.asgard.circuit.circuit_fitness import (
    MSEFitnessEvaluator,
    MultiObjectiveFitnessEvaluator,
    DatasetSample,
)
from gimle.asgard.runtime.stream import Stream
import jax.numpy as jnp

# 1. Build a dataset (inputs x, targets 2x)
dataset = []
for x in [1.0, 2.0, 3.0, 4.0, 5.0]:
    inp = Stream(data=jnp.array([[x]]), dim_labels=(), chunk_size=1)
    out = Stream(data=jnp.array([[2.0 * x]]), dim_labels=(), chunk_size=1)
    dataset.append(DatasetSample([inp], [out]))

# 2. Define search space and fitness
search_space = CircuitSearchSpace(
    rule_groups=["circuit_category_rules", "circuit_atomic_rules"],
    max_rewrites_per_step=20,
)
mse = MSEFitnessEvaluator(dataset)
fitness = MultiObjectiveFitnessEvaluator(
    dataset,
    primary_evaluator=mse,
    complexity_weight=0.005,
)

# 3. Search
optimizer = BeamSearchOptimizer(
    search_space, fitness, beam_width=8, verbose=True
)
result = optimizer.optimize(
    Circuit.from_string("id"),
    max_iterations=20,
)

print(f"Best circuit: {result.best_circuit}")
print(f"Fitness: {result.best_fitness.loss:.4f}")
```
## Tips
- **Start simple.** Use `id`, `add`, or `scalar(1.0)` as your initial circuit. A complex starting point has a large neighborhood that slows every iteration.
- **Use candidate pre-evaluation.** The YAML field `candidate_circuits` lets you list several starting points; the runner picks the best one before invoking the search, giving you a head start without manual experimentation.
- **Tune complexity weight gradually.** Start at `0` to find the most accurate circuit, then increase it to find simpler alternatives with similar accuracy.
- **Limit rule groups.** Start with `circuit_category_rules` and `circuit_atomic_rules` — the smallest, most predictable rule sets. Add `circuit_monoidal_rules` only if the search needs tensor-product rewrites.
- **Beam width 5–10 is usually enough.** Beyond 10 the cost grows fast with little improvement.
- **Combine with gradient optimization.** Search finds the structure, then gradient optimization tunes any `scalar`/`const` parameters in the result.
## Next Steps
- Gradient Optimization — parameter tuning
- Jacobians — sensitivity analysis
- API Reference — complete API documentation