I’ve been reading a bit about the new S1 AI model and was wondering how it really compares to other popular AI models out there, like GPT or Claude. Has anyone tested it or seen any benchmarks?
S1 is a small model trained on very little data, and it reportedly beats OpenAI's o1-preview on math and science benchmarks. Rather than relying on a large training set, it uses test-time scaling: the model spends extra compute at inference time to reason for longer before it commits to an answer.
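To make "test-time scaling" a bit more concrete, here is a toy sketch of one way to control how long a model reasons: if it tries to stop too early, suppress the stop and append a nudge word such as "Wait"; if it runs past the budget, cut it off. As far as I understand, this is the general idea behind what the S1 authors call budget forcing, but the code below is not their implementation. The `budget_force` function, the `step` callable, the `END` marker, and the chunk-based budget are all hypothetical names and simplifications made up for illustration.

```python
from typing import Callable

# Hypothetical end-of-reasoning marker; a real model would use its own delimiter.
END = "<end_of_thinking>"


def budget_force(step: Callable[[str], str], prompt: str,
                 min_chunks: int, max_chunks: int, nudge: str = "Wait") -> str:
    """Stretch or truncate a reasoning trace so it fits a chunk budget.

    `step` stands in for a model call: it receives the text so far and
    returns the next chunk of reasoning, optionally containing END.
    """
    trace = prompt
    for used in range(1, max_chunks + 1):
        chunk = step(trace)
        wants_to_stop = END in chunk
        trace += chunk.replace(END, "")
        if wants_to_stop:
            if used >= min_chunks:
                break  # budget satisfied, let the model stop
            # Too early: drop the stop marker and nudge the model to keep going.
            trace += f" {nudge},"
    # Either the model stopped on its own or the budget ran out: close the trace.
    return trace + END


# Toy stand-ins for a model: one always tries to stop, one never does.
eager = lambda text: " thought" + END
rambling = lambda text: " more"

print(budget_force(eager, "Q:", min_chunks=3, max_chunks=5))
print(budget_force(rambling, "Q:", min_chunks=1, max_chunks=4))
```

Raising `min_chunks` forces more reasoning per question, which is the knob that trades extra inference compute for (hopefully) better answers.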