We’re currently using the llama-3.1-sonar-small-128k-online
model, which is set to be deprecated on 2025-02-22. We’re moving over to the new sonar
model but noticing it is significantly slower than the legacy small model — its speed seems comparable to the legacy large model.
For our use case, speed is rather critical. Are there any plans to release a smaller, faster variant of the sonar
model?
I’d also like to ask for a mini model. When I test answers in the current Sonar, the older model gave better answers for my language (Polish), and it also ran much faster. When I process a CSV file with 150 queries, the earlier model finished faster; with the current one it takes a little too long.
+1 for this feature request. There’s a real need for a 7/8B-parameter model for low-latency responses, or for significantly faster inference on the existing sonar model.