Module 03: Cage Generator

Goal: search for candidate (k,g)-cages, that is, k-regular graphs of girth g with as few vertices as possible, using multiple generators, and expose live progress via session polling.

Why This Module Is Important

This module shifts from prediction to constrained construction. Instead of estimating a property on a fixed graph, the system must iteratively build a graph that satisfies regularity and girth constraints. That makes it an end-to-end reasoning and search benchmark, not only a regression benchmark.
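The two constraints named above can be checked mechanically. The following is a minimal sketch, assuming the graph is a simple undirected graph stored as a dict mapping each vertex to a set of neighbours. The function names and the representation are illustrative and are not taken from the module. The check covers regularity and girth only; it does not verify that the graph is the smallest possible one.

```python
from collections import deque

def is_k_regular(adj, k):
    """True if every vertex has exactly k neighbours."""
    return all(len(nbrs) == k for nbrs in adj.values())

def girth(adj):
    """Length of the shortest cycle, or None if the graph is acyclic.

    Runs a BFS from every vertex. A non-tree edge (u, v) closes a cycle of
    length at most dist[u] + dist[v] + 1; the minimum over all roots is the girth.
    Assumes a simple graph (no loops, no parallel edges).
    """
    best = None
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    cycle = dist[u] + dist[v] + 1
                    if best is None or cycle < best:
                        best = cycle
    return best

def satisfies_constraints(adj, k, g):
    """Regularity and girth check only; minimality is not verified."""
    return is_k_regular(adj, k) and girth(adj) == g

def petersen():
    """Petersen graph: 3-regular with girth 5."""
    adj = {v: set() for v in range(10)}
    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)
    for i in range(5):
        add(i, (i + 1) % 5)          # outer 5-cycle
        add(5 + i, 5 + (i + 2) % 5)  # inner pentagram
        add(i, 5 + i)                # spokes
    return adj
```

Calling satisfies_constraints(petersen(), 3, 5) returns True, since the Petersen graph is 3-regular with girth 5.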

It also gives direct user-facing feedback about algorithm behavior: speed, quality of partial states, and convergence or failure modes are visible live through session polling.

Available Generation Methods

Random Walk (fast stochastic search)
Bruteforce (systematic/backtracking search)
A* Search (best-first heuristic search)
RL Agent (policy-guided action selection)
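To make "fast stochastic search" concrete, here is a minimal sketch of one plausible randomized construction. It is an assumption about the general idea, not the module's actual Random Walk generator: starting from an empty graph on n vertices, it repeatedly adds a random edge between two vertices that still have degree below k and are far enough apart that the new edge cannot close a cycle shorter than g. The names random_attempt and distance are hypothetical.

```python
import random
from collections import deque

def distance(adj, src, dst, limit):
    """BFS distance from src to dst, or None if it exceeds `limit`
    (which includes the case where dst is unreachable)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        if dist[u] == limit:
            continue  # do not explore beyond the limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def random_attempt(n, k, g, rng):
    """One randomized attempt. Returns a k-regular graph on n vertices with
    girth at least g, or None if the attempt gets stuck."""
    adj = {v: set() for v in range(n)}
    while True:
        open_vertices = [v for v in adj if len(adj[v]) < k]
        if not open_vertices:
            return adj
        # An edge (u, v) closes a cycle of length dist(u, v) + 1, so it is
        # allowed only when u and v are more than g - 2 steps apart.
        candidates = [
            (u, v)
            for i, u in enumerate(open_vertices)
            for v in open_vertices[i + 1:]
            if v not in adj[u] and distance(adj, u, v, g - 2) is None
        ]
        if not candidates:
            return None  # stuck: a caller would typically restart
        u, v = rng.choice(candidates)
        adj[u].add(v)
        adj[v].add(u)
```

A single attempt is cheap but can fail, so a caller would normally wrap it in a restart loop, for example with rng = random.Random(0). A successful result guarantees girth at least g, not exactly g, so it still needs a final girth check.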

Runtime Behavior Stats

Parameter	Current Value	Meaning
MAX_PARALLEL_GENERATIONS	3	At most 3 active generation threads at once.
POLL_TIMEOUT	5 seconds	A session is marked stopped if the client does not poll within this window.
Queue-full response	HTTP 429	Returned when generation limit is reached.
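The interaction of these three parameters can be sketched as a small admission and liveness tracker. This is a hedged illustration, not the module's implementation: the SessionRegistry class, its method names, and the injectable clock are hypothetical, and the sketch assumes that a session which times out frees its slot. It models bookkeeping only and does not start generation threads.

```python
import threading
import time

MAX_PARALLEL_GENERATIONS = 3  # value from the table above
POLL_TIMEOUT = 5.0            # seconds, value from the table above

class SessionRegistry:
    """Hypothetical in-memory registry of active generation sessions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._last_poll = {}  # session_id -> time of the most recent poll
        self._next_id = 0

    def _expire(self, now):
        # Drop sessions whose client has not polled within POLL_TIMEOUT.
        stale = [s for s, t in self._last_poll.items() if now - t > POLL_TIMEOUT]
        for s in stale:
            del self._last_poll[s]

    def start(self):
        """Return (status_code, session_id); 429 when the limit is reached."""
        with self._lock:
            now = self._clock()
            self._expire(now)
            if len(self._last_poll) >= MAX_PARALLEL_GENERATIONS:
                return 429, None
            self._next_id += 1
            self._last_poll[self._next_id] = now
            return 200, self._next_id

    def poll(self, session_id):
        """Refresh the session; return False if it has already been stopped."""
        with self._lock:
            now = self._clock()
            self._expire(now)
            if session_id not in self._last_poll:
                return False
            self._last_poll[session_id] = now
            return True
```

Passing the clock in as a parameter keeps the timeout logic testable without sleeping; a real server would use the default monotonic clock.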

RL Checkpoints Present

Source: ai/trained/cage_rl/.

Checkpoint	Size	Notes
ppo_generalist_gin_final.pt	474 KB	Generalist PPO checkpoint.
ppo_k3_g5_gin_final.pt	128 KB	Specialized checkpoint (target inferred from filename).
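Since the specialized checkpoint's target is inferred from its filename, that inference can be written down explicitly. The sketch below assumes a ppo_k<k>_g<g>_ prefix, which is a guess based on the single specialized filename listed above rather than a documented naming convention. The helper name is hypothetical.

```python
import re

# Assumed naming pattern, derived only from "ppo_k3_g5_gin_final.pt".
_TARGET_PATTERN = re.compile(r"^ppo_k(?P<k>\d+)_g(?P<g>\d+)_")

def target_from_checkpoint(filename):
    """Return (k, g) parsed from the filename, or None if no target is encoded."""
    match = _TARGET_PATTERN.match(filename)
    if match is None:
        return None
    return int(match.group("k")), int(match.group("g"))
```

Under this assumption, the specialized checkpoint maps to (3, 5), while the generalist checkpoint yields None because its name carries no target.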

Why Methods Behave Differently Here

Random walk and bruteforce sit at opposite ends of the speed/completeness trade-off: random walk is fast but offers no guarantee of finding a cage, while bruteforce is systematic but slow. A* depends strongly on heuristic quality. The RL generator can be faster on some trajectories if its policy priors align with the target structure, but it can also stall when generalization is weak.

In the current repository, RL checkpoints are available but, unlike the degree/min-cycle checkpoints, have no companion metrics metadata in info.json. Quality comparison is therefore operational rather than a single offline score table: success rate, time to valid cage, and stability under concurrent sessions.
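A behavioral comparison of this kind can be computed from per-run records. The following is a minimal sketch, assuming each run is logged with its method name and the time taken to reach a valid cage (or None on failure). The Run record and summarize function are hypothetical and not part of the repository, and no measured values are implied.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional

@dataclass
class Run:
    method: str                        # e.g. "random_walk", "astar", "rl"
    seconds_to_valid: Optional[float]  # None if no valid cage was found

def summarize(runs):
    """Group runs by method and report success rate and median time to a valid cage."""
    by_method = {}
    for run in runs:
        by_method.setdefault(run.method, []).append(run)
    summary = {}
    for method, method_runs in by_method.items():
        times = [r.seconds_to_valid for r in method_runs
                 if r.seconds_to_valid is not None]
        summary[method] = {
            "runs": len(method_runs),
            "success_rate": len(times) / len(method_runs),
            # The median is taken over successful runs only.
            "median_seconds": median(times) if times else None,
        }
    return summary
```

Stability under concurrent sessions could be assessed with the same summary by collecting one set of runs per concurrency level and comparing the results.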