Describe the Bug
We are seeing a recurring failure mode with Perplexity’s reasoning models where the model gets stuck in a loop, repeating the same paragraphs or literal strings tens to hundreds (or thousands) of times until it hits the token limit (finish_reason = length).
This is significantly impacting our integration: in my current dataset, 9 out of 34 calls (≈26.5%) were terminated due to length, and most of those tokens were consumed by redundant loops rather than new reasoning or usable JSON output.
Expected Behavior
The model converges and returns a complete, usable JSON output.
Actual Behavior
Model loops endlessly and hits token limit before generating useful output
Steps to Reproduce
The failure is not fully deterministic. Larger context sizes for search seem to trigger it more often, but I have also run into it with a low context size.
Environment
API: Perplexity API (server-side integration)
Models involved:
sonar-reasoning (7 runs with loops)
sonar-reasoning-pro (2 runs with loops)
sonar (standard model, 1 run with a numeric token loop)
Additional Context
Usage pattern:
We send a relatively long prompt asking the model to:
Analyze a set of search results (events in specific cities / dates).
Produce a structured JSON list of opportunities.
We do not set an explicit max_tokens; we rely on default limits.
We store full responseText in a Postgres/Supabase table, and also store finish_reason.
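For reference, the logging step looks roughly like the sketch below. The supabase-js calls are real, but the `perplexity_runs` table and its column names are our own, shown only for illustration:

```ts
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

// Called after each Perplexity API response; `run` is our own shape,
// and "perplexity_runs" is a hypothetical table name.
async function logRun(run: {
  id: string;
  model: string;
  responseText: string;
  finishReason: string;
}) {
  // Store the full response so loops can be analyzed after the fact.
  await supabase.from("perplexity_runs").insert({
    id: run.id,
    model: run.model,
    response_text: run.responseText,
    finish_reason: run.finishReason,
    created_at: new Date().toISOString(),
  });
}
```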
Observed failure modes
1) Reasoning paragraph loops (7 runs)
In the majority of problematic runs, the model gets stuck repeating the same reasoning paragraphs over and over. Some clear examples:
- Run 2ca2fc59 (sonar-reasoning-pro, 115k chars)
- The sentence:
Let me check if there are any other events that might be happening during this week:
appears 378 times in a single response.
- It is accompanied by four short paragraphs summarizing the same search results (“From Result 5…”, “From Result 6…”, “From Result 7…”, “From Result 8…”), each repeated 47 times.
- The response is effectively a five-paragraph block looped hundreds of times, with no stable final JSON output before length termination.
- Run b94f1d3a (sonar-reasoning, 145k chars)
- The paragraph:
Given the limitations of the information available, I’ll need to be transparent about what I can and cannot confirm from the search results.
is repeated 120 times in the same response.
- Additional paragraphs (e.g., describing Novi events, noting that other cities’ events are not in the target area, and warning not to invent events) each appear ~61 times.
- The bulk of the 145k-character output is just a handful of paragraphs looping.
- Run aab97946 (sonar-reasoning, 128k chars)
- Meta-reasoning sentences like:
Given the limited information, I’ll need to focus on the events I can confidently identify and make educated guesses for others based on typical event patterns in these areas.
- and
Let me check if there are any events mentioned that might be in nearby areas that could attract people from Farmington and Farmington Hills:
- both appear 64 times.
- Similar “Let me check…” / “From result [1]…” paragraphs are also repeated 34–64 times.
- Run 004a1ee7 (sonar-reasoning-pro, 128k chars)
- The sentence:
Let me try to construct the JSON response with the information available:
appears 72 times.
- Event summary paragraphs for “For Friday–Sunday…” and “For Monday–Thursday…” appear 70–71 times each.
- The response looks like the model continuously “restarts” its JSON construction and event-summary explanation loop.
- Run 3ccf59c7 (sonar-reasoning, 153k chars)
- For each day of the week (Tuesday–Friday), the model repeats a pair of paragraphs:
- “Let me check if there are any events at the Suburban Collection Showplace that might be happening on [weekday]…”
- Followed by “I don’t see any specific events listed for [weekday]…”
- Each of the weekday paragraphs appears 20 times, so the response cycles through the same “check day X / no additional events” loop repeatedly.
- Run ecc17d7f (sonar-reasoning, 107k chars)
- Paragraphs like:
I don’t see specific events listed for Farmington or Farmington Hills in the search results. However, I should note that Farmington Hills is adjacent to Novi, so events in Novi would be appropriate.
appear 32 times.
- Additional paragraphs (“Let me check if there are any events specifically in Farmington or Farmington Hills:”, and event descriptions like Power Connections Cannabis) appear ~29–30 times.
- Run cd47a78c (sonar-reasoning, 124k chars)
- A Monday schedule paragraph (e.g., Agile & Scrum training in Farmington Hills + succession planning workshop in Novi) appears 31 times.
- Several other day-specific schedules and justification paragraphs each appear 29 times.
Common characteristics:
- The repeated content is semantically coherent and non-gibberish (e.g., “Let me check if there are any other events…”, “Given the limited information…”, etc.).
- The paragraphs often talk about:
- Limited information,
- Checking search results,
- Not inventing events,
- Summaries of the same handful of events/venues.
- The model seems to be “stuck” in a loop of:
- Re-announcing that it will check for events.
- Re-scanning the same search results.
- Re-stating the same conclusions.
- The loop continues until the output hits the length limit (finish_reason = length), rather than converging to a final structured JSON.
This looks like a runaway reasoning loop, possibly where the reasoning stream is not being properly truncated or the model is repeatedly re-invoking the same internal “check events” behavior.
2) Literal token/string loops (2 runs)
In two runs, the problem is even lower-level: the model repeats a very short literal string thousands of times.
- Run 80a3eae0 (sonar, 32k chars)
- The literal string “1234567890” appears 3,180 times.
- That sequence alone accounts for ≈31,800 of the 32,232 characters in the response.
- In our logs it shows up inside a JSON field (an eventDetailsUrl string with thousands of trailing digits), suggesting a decoding/sampling issue where the same digit pattern is emitted in a loop until the length limit.
- Run d7a788f4 (sonar-reasoning, 55.9k chars)
- The literal pattern: “2025-11-17-2025-11-21-” (22 characters) appears 972 times.
- That single pattern contributes ≈21,384 characters (about 38% of the response).
- The string corresponds to the Monday–Friday date range for the query; instead of generating structured JSON or new content, the model degenerates into repeating the same date range string over and over.
These look less like high-level reasoning loops and more like token-level repetition (decoder or sampling instability), but they occur in the same overall context (long prompts, event/search-based tasks).
Quantification
For the length-terminated runs:
- All 9/9 show significant repetition.
- 7/9 are dominated by repeated paragraphs (exact paragraph text appears 20–378 times).
- 2/9 are dominated by repeated short strings (digits or date range strings) repeated hundreds to thousands of times.
- In several runs, the majority of the output characters are in the repeated segments rather than unique content.
We detected these patterns by:
- Splitting responseText on blank lines and grouping identical paragraphs with a COUNT(*).
- For literal loops, counting occurrences of specific substrings such as “1234567890” or “2025-11-17-2025-11-21-”.
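For reproducibility, here is a minimal TypeScript sketch of the same two checks done in application code instead of SQL (function names are illustrative):

```ts
// Mirror of the SQL-side grouping: split on blank lines, count identical paragraphs.
function countRepeatedParagraphs(responseText: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const raw of responseText.split(/\n\s*\n/)) {
    const para = raw.trim();
    if (para.length === 0) continue;
    counts.set(para, (counts.get(para) ?? 0) + 1);
  }
  return counts;
}

// Literal-loop check: non-overlapping occurrences of a short pattern,
// e.g. "1234567890" or "2025-11-17-2025-11-21-".
function countSubstring(text: string, pattern: string): number {
  return text.split(pattern).length - 1;
}
```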
Impact
- No usable output: Because the model hits finish_reason = length while looping, we often do not get a complete, consistent JSON payload.
- Operational impact: In our application (event discovery & structuring), this means:
- We can’t safely import these responses.
- We must implement extra guardrails / post-processing just to detect, retry, and discard obviously looped outputs (see the detection sketch after this list).
- It makes the reasoning models much less reliable for production use, even though the underlying search context and prompts are reasonable.
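For illustration, the guardrail we are currently forced to run looks roughly like the sketch below. The 10× and 50% thresholds are our own empirical choices, not recommendations:

```ts
// Flags a response as looped before import. Uses countRepeatedParagraphs()
// from the detection sketch above; thresholds are tuned by eye on our data.
function looksLooped(responseText: string, finishReason: string): boolean {
  if (finishReason !== "length") return false;
  const counts = countRepeatedParagraphs(responseText);
  let repeatedChars = 0;
  for (const [para, n] of counts) {
    // Treat any paragraph repeated 10+ times as loop content.
    if (n >= 10) repeatedChars += para.length * n;
  }
  // Flag when most of the output is repeated segments rather than unique content.
  return repeatedChars / responseText.length > 0.5;
}
```

When a response is flagged, we discard it and retry, which wastes a full token budget per failed attempt.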
Questions and requests for Perplexity
- Is this a known issue with sonar-reasoning / sonar-reasoning-pro (or the standard sonar decoder) where:
- The reasoning stream can get stuck repeating the same paragraphs, or
- The model can fall into token-level repetition loops for simple strings?
- Are there recommended mitigations on the client side for reasoning models (a hedged request sketch follows these questions), such as:
- Specific sampling parameters (e.g., temperature, presence_penalty, repetition_penalty)?
- A way to limit or suppress verbose reasoning sections while still getting high-quality final JSON?
- A recommended max length / early stop heuristic for reasoning streams?
- Would it be helpful if I shared full raw responses (including the reasoning sections) and prompts for these runs?
- I’ve already extracted and summarized each problematic run into a separate text file (IDs and counts as above), and I can provide those as attachments.
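For concreteness, this is the kind of request we could try as a mitigation, assuming the chat completions endpoint honors these OpenAI-style sampling parameters for the reasoning models; we have not confirmed that it does, and the values are guesses, not tested settings:

```ts
// Hedged mitigation sketch: lower temperature, a presence penalty, and an
// explicit max_tokens so loops fail fast instead of burning the full budget.
async function callWithPenalties(prompt: string): Promise<string> {
  const res = await fetch("https://api.perplexity.ai/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.PERPLEXITY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "sonar-reasoning",
      messages: [{ role: "user", content: prompt }],
      temperature: 0.2,      // unclear whether this helps against these loops
      presence_penalty: 0.5, // candidate anti-repetition knob (unverified)
      max_tokens: 8192,      // explicit cap instead of relying on defaults
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```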
Attachments
I can attach or provide links to one text file per problematic run, each containing:
- Run metadata (id, model, created_at, finish_reason, response length).
- The top repeated paragraphs or literal patterns.
- Exact repeat counts.
- Short narrative description of the loop behavior.
The run IDs covered are:
- 004a1ee7-0f91-4eeb-a0d9-e8353ac97775
- cd47a78c-070a-40a3-98b2-eb3a406185b8
- b94f1d3a-fc8b-4018-8220-66ac01ec67c9
- aab97946-c832-4848-a81b-526715c675fc
- 2ca2fc59-5351-42de-a8b1-67cefd588a2d
- 3ccf59c7-8e8c-4297-a080-1cf4e71ef223
- ecc17d7f-3f95-439a-841d-a910685288e5
- 80a3eae0-9f7d-4a9f-ad33-07b809556a95
- d7a788f4-ea1e-407b-b7cf-5d3453bcc105