# Module 01: Degree Prediction

**Goal:** for each node, predict its degree from only the graph structure and the node features generated at training time.
## Why This Module Is Important
Degree is a foundational structural property. If a GNN cannot predict degree reliably, it is unlikely to be trustworthy for more demanding graph reasoning tasks. This module is the baseline checkpoint for the rest of the project: it tests whether the model family and data generation pipeline are internally consistent before moving to harder objectives.
Practically, this module also exposes calibration behavior. Exact-match degree accuracy and absolute error directly show whether the model is learning stable local counting behavior or just approximate trends.
## How It Was Trained
- Models are trained on random graphs generated fresh each epoch.
- 4 input features per node, hidden size 64, 4 layers, dropout 0.2.
- 5000 epochs, learning rate 0.001, 50 graphs per epoch.
- Evaluation rounds predictions and computes exact-match node accuracy.
```shell
uv run python -m ai.degree.train --model gcn --name v1 --epochs 5000
uv run python -m ai.degree.train --model sage --name v1 --epochs 5000
uv run python -m ai.degree.train --model gin --name v1 --epochs 5000
uv run python -m ai.degree.train --model loopy --name r3_v1 --r 3 --epochs 5000
```
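The training and evaluation loop described above can be sketched in plain Python. This is a hedged illustration, not the actual `ai.degree.train` code: the graph generator, feature setup, and helper names (`random_graph`, `degree_targets`, `exact_match_accuracy`) are all assumptions made for the example. It shows the two load-bearing pieces: per-epoch random graph generation and the rounded exact-match metric from the evaluation step.

```python
import random


def random_graph(n, p, seed=None):
    # Erdős–Rényi G(n, p): include each undirected edge independently
    # with probability p. Stand-in for the per-epoch graph generator.
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]


def degree_targets(n, edges):
    # The regression target: each node's degree in the sampled graph.
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


def exact_match_accuracy(predictions, targets):
    # Evaluation step from the list above: round the continuous model
    # outputs to the nearest integer, then count exact matches.
    correct = sum(round(p) == t for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)
```

Because the target is an integer count, rounding makes the metric unforgiving: a model whose outputs hover within ±0.5 of the true degree scores 100%, and anything looser collapses quickly, which is what separates the saved runs below.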
## Saved Results
Source: `ai/trained/degree/*/info.json`.
| Model | Accuracy (%) | MAE | MSE | Best Epoch |
|---|---|---|---|---|
| gin_v1 | 100.00 | 0.0000 | 0.0000 | 450 |
| sage_v1 | 100.00 | 0.0000 | 0.0000 | 450 |
| gcn_v1 | 23.76 | 1.8810 | 3.5382 | 150 |
| loopy_r3_v1 | 25.89 | 6.7847 | 46.0323 | 250 |
## Why Models Behave Differently Here
Degree is a highly local property, so architectures with strong neighborhood aggregation can excel quickly. In the current runs, GIN and SAGE reach perfect rounded exact-match accuracy, while GCN and Loopy underperform. That pattern suggests this particular training distribution rewards faithful local aggregation and is less forgiving of architectural settings tuned for other structural objectives.
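The locality argument can be made concrete with a minimal sketch (the helper name `sum_aggregate` and the toy graph are assumptions, not project code): with a constant feature of 1 on every node, a single round of sum aggregation, the GIN-style aggregator, returns each node's degree exactly, whereas a mean aggregator would map the constant 1 back to 1 and erase the count.

```python
def sum_aggregate(n, edges, features):
    # One message-passing round with a sum aggregator (GIN-style):
    # each node's output is the sum of its neighbors' features.
    out = [0.0] * n
    for i, j in edges:
        out[i] += features[j]
        out[j] += features[i]
    return out


# Toy 4-node graph; with all-ones features, one sum-aggregation
# round recovers the degree sequence directly.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
degrees = sum_aggregate(4, edges, [1.0] * 4)  # [2.0, 2.0, 3.0, 1.0]
```

This is why a sum-based aggregator can represent degree in a single layer; normalization schemes like GCN's averaging must recover the count indirectly, which is consistent with the gap in the saved results.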
The large MAE on loopy_r3_v1 indicates a mismatch between its inductive bias and this exact target, not necessarily a universally weaker model. It can still dominate on cycle-sensitive tasks, which is why cross-task comparison matters more than single-task ranking.