I keep trying to use the Sonar Reasoning model, but I sometimes end up with no usable result because the response is cut for length at 32k output tokens while the model is still reasoning. The documentation led me to expect something closer to 128k, since it advertises a 128K context length.
I tried setting "max_tokens": 100000, but the output still cuts off at 32k. Are the default and maximum values for each model type posted somewhere? I can get complete outputs consistently when using the web UI with the same prompt. Do the API endpoints have lower cutoffs than the web UI? Is there another way I should be configuring the max_tokens value?
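For reference, this is roughly the request I'm sending (a sketch; the endpoint path and model name are taken from the public docs as I understand them, and the API key is omitted):

```python
import json

# Sketch of my chat-completions request payload.
# "sonar-reasoning" and the max_tokens value are what I'm actually
# passing; the prompt is elided here.
payload = {
    "model": "sonar-reasoning",
    "messages": [
        {"role": "user", "content": "<my prompt>"},
    ],
    # Set well above 32k, but output still truncates around 32k tokens.
    "max_tokens": 100000,
}

# POSTed to https://api.perplexity.ai/chat/completions with an
# "Authorization: Bearer <API_KEY>" header.
print(json.dumps(payload, indent=2))
```

If there's a per-model cap that silently overrides `max_tokens`, it would be good to know whether the API returns any indication of that in the response (e.g., a finish reason of "length").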