Releasing soon
We reject the paradigm of scale.
Sophontic cultivates geometric density in the substrate to achieve reasoning performance on public benchmarks exceeding that of models up to 60× its size. We have pioneered methods of directly training the internal geometry of the model.
Methodology
The perturbation paradigm.
We measure reasoning by flip rate — a paired-item test that precludes the surface-feature heuristics most models use to game conventional benchmarks. The pair is the unit of measurement, not the item. Reasoning, not recall.
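To make the pair-as-unit idea concrete, here is a minimal sketch of how a flip rate could be computed. The exact definition is not given here, so this assumes that a pair "flips" when the model is correct on exactly one of its two items. The names `PairResult`, `flip_rate`, and `pair_accuracy` are hypothetical and are not taken from the eval kit.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PairResult:
    """Outcome of one paired item (hypothetical structure, for illustration)."""
    original_correct: bool   # model was right on the original item
    perturbed_correct: bool  # model was right on the perturbed twin


def flip_rate(results: Sequence[PairResult]) -> float:
    """Fraction of pairs whose correctness differs between the two items.

    Assumption: a "flip" means the model is correct on exactly one item
    of the pair. The source does not state the precise definition.
    """
    if not results:
        raise ValueError("need at least one pair")
    flips = sum(r.original_correct != r.perturbed_correct for r in results)
    return flips / len(results)


def pair_accuracy(results: Sequence[PairResult]) -> float:
    """Fraction of pairs where both items are answered correctly."""
    if not results:
        raise ValueError("need at least one pair")
    both = sum(r.original_correct and r.perturbed_correct for r in results)
    return both / len(results)


# Four illustrative pairs (made-up outcomes, not real results).
results = [
    PairResult(True, True),
    PairResult(True, False),   # right on the original, wrong on the twin
    PairResult(False, False),
    PairResult(True, True),
]

print(flip_rate(results))      # one flip out of four pairs
print(pair_accuracy(results))  # two fully correct pairs out of four
```

Under this assumed definition, a model that relies on surface features tends to get one item of a pair right and its twin wrong, which raises the flip rate even when per-item accuracy looks strong.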
The model and the evaluation
The model
fig. iv.a A compact reasoning model.
Our first prototype demonstrates that reasoning need not scale with size.
The eval kit
fig. iv.b Five benchmarks, paired.
Five public reasoning benchmarks, extended with paired perturbation items and released open-source.