Module 03: Cage Generator
Goal: search for candidate (k,g)-cages using multiple generators and expose live progress via session polling.
Why This Module Is Important
This module shifts from prediction to constrained construction. Instead of estimating a property on a fixed graph, the system must iteratively build a graph that satisfies regularity and girth constraints. That makes it an end-to-end reasoning and search benchmark, not only a regression benchmark.
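The two constraints named above can be checked mechanically. The following is a minimal sketch, not the module's own validator: it assumes a simple undirected graph stored as a dict mapping each vertex to a set of neighbours, and the function names `girth` and `is_kg_graph` are hypothetical. It checks regularity and girth only; it does not test whether the graph has the minimum possible order, which a true cage also requires.

```python
from collections import deque


def girth(adj):
    """Return the length of the shortest cycle, or None if the graph is acyclic.

    Runs one BFS per vertex; any non-tree edge seen during a BFS closes a
    cycle of length dist[u] + dist[v] + 1, and the minimum over all roots
    is the girth.
    """
    best = None
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    cycle_len = dist[u] + dist[v] + 1
                    if best is None or cycle_len < best:
                        best = cycle_len
    return best


def is_kg_graph(adj, k, g):
    """True if every vertex has degree k and the shortest cycle has length g."""
    regular = all(len(neighbours) == k for neighbours in adj.values())
    return regular and girth(adj) == g
```

A generator could call a check like this on each finished candidate, while partial states only need the weaker condition that no cycle shorter than g exists yet.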
It also gives direct user-facing feedback about algorithm behavior: speed, quality of partial states, and convergence or failure modes are visible live through session polling.
Available Generation Methods
- Random Walk (fast stochastic search)
- Bruteforce (systematic/backtracking search)
- A* Search (best-first heuristic search)
- RL Agent (policy-guided action selection)
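To make the stochastic end of this list concrete, here is a hedged sketch of a randomized edge-adding search. It is an illustration under stated assumptions, not the repository's Random Walk generator: the vertex count `n` is supplied by the caller, and the names `distance`, `random_attempt`, and `random_search` are hypothetical. Each step adds a random edge between two vertices of degree below k whose current distance is at least g - 1, so no cycle shorter than g is ever created. A successful result is k-regular with girth at least g, which still needs a final exact-girth check.

```python
import random
from collections import deque


def distance(adj, src, dst, limit):
    """BFS distance from src to dst, or None if it is greater than limit."""
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] >= limit:
            continue  # do not explore beyond the distance limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == dst:
                    return dist[v]
                queue.append(v)
    return None


def random_attempt(n, k, g, rng):
    """One randomized construction; returns an adjacency dict or None on a dead end."""
    adj = {v: set() for v in range(n)}
    while True:
        open_vertices = [v for v in adj if len(adj[v]) < k]
        if not open_vertices:
            return adj  # every vertex reached degree k
        # An edge u-v closes a cycle of length distance(u, v) + 1,
        # so it is allowed only when the distance exceeds g - 2.
        candidates = [
            (u, v)
            for i, u in enumerate(open_vertices)
            for v in open_vertices[i + 1:]
            if distance(adj, u, v, g - 2) is None
        ]
        if not candidates:
            return None  # no legal edge left: this attempt failed
        u, v = rng.choice(candidates)
        adj[u].add(v)
        adj[v].add(u)


def random_search(n, k, g, attempts=100, seed=0):
    """Retry random_attempt up to `attempts` times with a seeded generator."""
    rng = random.Random(seed)
    for _ in range(attempts):
        adj = random_attempt(n, k, g, rng)
        if adj is not None:
            return adj
    return None
```

A backtracking search would undo the last edge at a dead end instead of restarting, which is where the completeness of the systematic methods comes from.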
Runtime Behavior Stats
| Parameter | Current Value | Meaning |
|---|---|---|
| MAX_PARALLEL_GENERATIONS | 3 | At most 3 active generation threads at once. |
| POLL_TIMEOUT | 5 seconds | A session is marked stopped if the client does not poll within this interval. |
| Queue-full response | HTTP 429 | Returned when generation limit is reached. |
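The interaction between these three parameters can be sketched as a small in-memory registry. This is a minimal sketch under assumptions, not the server's actual code: only the constant names and values come from the table, while `SessionRegistry`, its methods, and the returned dictionaries are hypothetical. The clock is injectable so the timeout logic can be exercised without sleeping.

```python
import threading
import time

MAX_PARALLEL_GENERATIONS = 3   # value from the table above
POLL_TIMEOUT = 5.0             # seconds, value from the table above


class SessionRegistry:
    """Hypothetical tracker for generation sessions and their last poll time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._sessions = {}  # session_id -> {"last_poll": float, "stopped": bool}
        self._next_id = 0

    def _expire(self, now):
        # Mark sessions stopped when the client has been silent too long.
        for session in self._sessions.values():
            if not session["stopped"] and now - session["last_poll"] > POLL_TIMEOUT:
                session["stopped"] = True

    def start(self):
        """Return (http_status, session_id); 429 when the limit is reached."""
        with self._lock:
            now = self._clock()
            self._expire(now)
            active = sum(1 for s in self._sessions.values() if not s["stopped"])
            if active >= MAX_PARALLEL_GENERATIONS:
                return 429, None
            self._next_id += 1
            self._sessions[self._next_id] = {"last_poll": now, "stopped": False}
            return 200, self._next_id

    def poll(self, session_id):
        """Record a poll and report whether the session is still running."""
        with self._lock:
            now = self._clock()
            self._expire(now)
            session = self._sessions[session_id]
            if not session["stopped"]:
                session["last_poll"] = now
            return {"session_id": session_id, "stopped": session["stopped"]}
```

In this sketch a stopped session frees its slot immediately; whether the real worker thread is also terminated at that point is not something the table states.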
RL Checkpoints Present
Source: ai/trained/cage_rl/.
| Checkpoint | Size | Notes |
|---|---|---|
| ppo_generalist_gin_final.pt | 474 KB | Generalist PPO checkpoint. |
| ppo_k3_g5_gin_final.pt | 128 KB | Specialized checkpoint; the target (k=3, g=5) is inferred from the filename. |
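Because the target is only inferred from the filename, that inference can be made explicit. The sketch below assumes a `_k<digits>_g<digits>_` naming pattern, which is a guess based on the two filenames in the table rather than a documented convention; `target_from_checkpoint` is a hypothetical helper.

```python
import re

# Assumed pattern: "_k<k>_g<g>_" somewhere in the checkpoint filename.
_TARGET_RE = re.compile(r"_k(\d+)_g(\d+)_")


def target_from_checkpoint(filename):
    """Return (k, g) parsed from the filename, or None if no target is encoded."""
    match = _TARGET_RE.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A caller could use a `None` result to fall back to the generalist checkpoint when no specialized one matches the requested (k, g).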
Why Methods Behave Differently Here
Random walk and bruteforce sit at opposite ends of the speed/completeness trade-off: random walk is fast but offers no guarantee of finding a cage, while bruteforce is systematic but slow. A* depends strongly on heuristic quality. The RL generator can be faster on some trajectories when its policy priors align with the target structure, but it can also stall when generalization is weak.
In the current repository, RL checkpoints are available but are not accompanied by standardized metrics metadata such as info.json. Quality comparison is therefore operational rather than a single offline score table: the most reliable measures are behavioral, namely success rate, time to valid cage, and stability under concurrent sessions.
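One way to collect such a behavioral comparison is to log each generation run and aggregate per method. This is a sketch under assumptions: the `RunRecord` fields and the `summarize` function are hypothetical and do not correspond to any logging format in the repository.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    method: str      # e.g. "random_walk", "bruteforce", "astar", "rl"
    success: bool    # True if the run ended with a valid cage
    seconds: float   # wall-clock time until the run ended


def summarize(records):
    """Aggregate success rate and mean time to a valid cage for each method."""
    by_method = {}
    for record in records:
        by_method.setdefault(record.method, []).append(record)

    summary = {}
    for method, runs in by_method.items():
        successful_times = [r.seconds for r in runs if r.success]
        summary[method] = {
            "runs": len(runs),
            "success_rate": len(successful_times) / len(runs),
            # Failed runs are excluded so timeouts do not distort the mean.
            "mean_time_to_valid": mean(successful_times) if successful_times else None,
        }
    return summary
```

Stability under concurrent sessions could be examined by repeating the same aggregation with one, two, and three sessions running at once.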