When calling the sonar-deep-research model, the output report is always truncated, making it unusable.

When using sonar-deep-research for research, the output report is always truncated at around 10,000 tokens, preventing a complete research report from being obtained.

I have tried setting the max_tokens parameter to 8,000, 15,000, and 20,000, but the result is the same: the report is cut off partway through.

After checking, it seems that Perplexity's response output is capped at 10,000 tokens. If that is the case, sonar-deep-research is hard to use effectively, since a formal research report can easily exceed that limit.

If the 10,000-token limit cannot be raised, how can I configure sonar-deep-research so that it produces a complete report within that limit, so I at least get a shorter but self-contained report?
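For reference, this is roughly the shape of the request involved. A minimal sketch, assuming Perplexity's OpenAI-compatible chat-completions endpoint; the prompt text, token budget, and the idea of instructing the model to stay within a length budget via a system message are placeholders, not a confirmed workaround:

```python
import json
import urllib.request

def build_request(prompt: str, max_tokens: int = 10_000) -> dict:
    # Chat-completions payload; max_tokens caps the *output* length.
    # The system message asks the model to budget its own report length,
    # which may or may not help with truncation.
    return {
        "model": "sonar-deep-research",
        "messages": [
            {"role": "system",
             "content": "Keep the full report under the output limit; "
                        "prefer a shorter but complete report over a "
                        "truncated long one."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

def run(prompt: str, api_key: str, max_tokens: int = 10_000) -> str:
    req = urllib.request.Request(
        "https://api.perplexity.ai/chat/completions",
        data=json.dumps(build_request(prompt, max_tokens)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # If choices[0]["finish_reason"] == "length", the cap was hit.
    return body["choices"][0]["message"]["content"]
```

Checking `finish_reason` in the response would at least confirm whether the cutoff comes from the output-token cap rather than from the model stopping on its own.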

Hi @vim — could you share an example request (including model, parameters, and payload) so we can test this on our end? I’m not able to reproduce this issue so far.

System environment:
AI client: Cherry Studio
Model: sonar-deep-research
OS: Windows 10

The prompt is the same one I use for Deep Research on Gemini; it is rather long, about 2,500–3,000 Chinese characters (≈ 2,000–2,500 tokens), so I won't paste it here.

Gemini’s DR produces full reports of roughly 15,000–20,000 Chinese characters (≈ 12,000–16,000 tokens).

When calling Perplexity's API with sonar-deep-research, the output is almost always cut off at around 10,000–13,000 Chinese characters (≈ 10,000 tokens). Judging by the report structure the prompt asks for, that is only about 60–70% complete; the final length should be close to Gemini DR's.

At present, it looks like a token-limit issue with sonar-deep-research itself; the cap appears to be roughly 10,000 output tokens.

For simple DR tasks a 10,000-token ceiling may be fine, but complex ones whose reports need more output than that will be truncated.

Since many studies are fairly complex and API calls are billed, if the cutoff is indeed caused by an output-token limit, could the Perplexity team raise sonar-deep-research's output ceiling to 20,000 tokens?

Splitting a complex study into several simpler ones might mitigate the problem to some extent, but the quality of the results could suffer, and sources might be queried repeatedly or diverge across sub-reports, so it is hardly the best solution.
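For what it's worth, the splitting workaround could be scripted along these lines. A sketch only: the section names are placeholders, the per-section prompt wording is my own, and the naive merge does nothing about the duplicated or diverging sources mentioned above:

```python
def split_research(topic: str, sections: list[str]) -> list[str]:
    # One prompt per section, so no single response has to fit the
    # whole report under the ~10k-token output cap.
    return [
        f"Write only the '{name}' section of a research report on: "
        f"{topic}. Make this section self-contained and cite sources "
        f"inline."
        for name in sections
    ]

def merge_reports(parts: list[str]) -> str:
    # Naive concatenation of the per-section outputs; reconciling
    # repeated or conflicting sources across parts is still manual.
    return "\n\n".join(parts)
```

Each prompt returned by `split_research` would be sent as a separate sonar-deep-research call, and the responses merged afterwards, at the cost of extra billed calls and weaker cross-section coherence.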