If a proof depends on a fact, the answer follows.
stable
Releasing soon
Sophontic cultivates geometric density in the substrate to achieve reasoning performance exceeding that of models up to 60× its size. We have pioneered methods of directly training the internal geometry of the model.
Observed signal
The prototype is being evaluated through paired perturbations, calibrated anchors, and public transfer surfaces. The measurement standard remains stable as the rail set evolves.
60×: size of the larger models exceeded on reasoning evaluations.
Methodology
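We measure reasoning by flip rate, a paired-item test that rules out the surface-feature heuristics most models use to game conventional evaluations. The unit of measurement is the pair, not the item: each item is matched with a copy in which one load-bearing fact is changed, so the correct answer changes with it. The test rewards reasoning, not recall.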
Inspect the eval kit
If a proof depends on a fact, the answer follows.
stable
Change the load-bearing fact. The answer must change.
flip
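A minimal sketch of how a stable/flip pair might be scored. The page does not give an exact formula, so everything here is an assumption for illustration: the names `PairedItem` and `flip_rate` are hypothetical, and flip rate is taken to be the share of pairs on which a model answers both the original item and its perturbed copy correctly. The two toy models stand in for a surface-heuristic model and a fact-sensitive one; neither is part of the eval kit.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class PairedItem:
    """One measurement unit: an item plus a copy with one load-bearing fact changed."""
    base_prompt: str
    perturbed_prompt: str
    base_answer: str
    perturbed_answer: str  # expected to differ from base_answer


def flip_rate(pairs: Iterable[PairedItem], model: Callable[[str], str]) -> float:
    """Fraction of pairs where the model is right on BOTH halves.

    Assumed scoring rule (not confirmed by the source): a pair passes only if
    the answer is correct before the change and correct after it, so a model
    that ignores the changed fact gets no credit for the pair.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    passed = 0
    for pair in pairs:
        base_ok = model(pair.base_prompt) == pair.base_answer
        flip_ok = model(pair.perturbed_prompt) == pair.perturbed_answer
        passed += int(base_ok and flip_ok)
    return passed / len(pairs)


# Hypothetical pairs: only the first premise (the load-bearing fact) changes.
pairs = [
    PairedItem(
        "All birds in the aviary can fly. Tweety is in the aviary. Can Tweety fly?",
        "No birds in the aviary can fly. Tweety is in the aviary. Can Tweety fly?",
        "yes",
        "no",
    ),
    PairedItem(
        "Every box on the shelf is sealed. Box 7 is on the shelf. Is box 7 sealed?",
        "No box on the shelf is sealed. Box 7 is on the shelf. Is box 7 sealed?",
        "yes",
        "no",
    ),
]


def surface_model(prompt: str) -> str:
    """Toy model that ignores the premise and always answers 'yes'."""
    return "yes"


def fact_sensitive_model(prompt: str) -> str:
    """Toy model that reads the first premise and answers accordingly."""
    return "no" if prompt.startswith("No ") else "yes"


print(flip_rate(pairs, surface_model))
print(flip_rate(pairs, fact_sensitive_model))
```

Scored item by item, the always-"yes" model would get half of the four prompts right. Scored by pair, it gets no credit, which is the point of making the pair the unit of measurement.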
The model
fig. iv.a: Our first prototype demonstrates that reasoning need not scale with size.
The eval kit
fig. iv.b: Public and synthetic reasoning surfaces, extended with paired perturbation records and released open-source.