Summary
I’m having trouble calling the “llama-3.1-sonar-huge-128k-online” model.
When I make the call, the request goes through, runs for about 60 seconds, and then returns an error.
Note: Using the “llama-3.1-sonar-large-128k-online” model works fine!
Is anyone experiencing something similar?
Output of the huge-online model:

```
INPUT TOKENS: 3428
Error querying Perplexity API: SyntaxError: Unexpected token '<', "<html clas"... is not valid JSON
    at JSON.parse (<anonymous>)
    at parseJSONFromBytes (node:internal/deps/undici/undici:5329:19)
    at successSteps (node:internal/deps/undici/undici:5300:27)
    at fullyReadBody (node:internal/deps/undici/undici:1447:9)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async specConsumeBody (node:internal/deps/undici/undici:5309:7)
SyntaxError: Unexpected token '<', "<html clas"... is not valid JSON
    at JSON.parse (<anonymous>)
    at parseJSONFromBytes (node:internal/deps/undici/undici:5329:19)
    at successSteps (node:internal/deps/undici/undici:5300:27)
    at fullyReadBody (node:internal/deps/undici/undici:1447:9)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async specConsumeBody (node:internal/deps/undici/undici:5309:7)
error Command failed with exit code 1.
```
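For what it's worth, that SyntaxError means the response body starts with `<html` — i.e. the server sent back an HTML error page (plausibly a gateway timeout, given the ~60-second wait) rather than JSON, and `response.json()` choked on it. A minimal reproduction of the same failure (the exact message wording varies by Node version):

```typescript
// Feeding an HTML error page to JSON.parse raises the same
// SyntaxError seen in the output above.
try {
  JSON.parse('<html class="error-page">...</html>');
} catch (e) {
  console.log((e as Error).message); // starts with "Unexpected token '<'"
}
```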
Output of the large-online model:

```
INPUT TOKENS: 3447
OUTPUT TOKENS: 1273
TOTAL TOKENS: 4720
citations:
[...]
usage:
{ prompt_tokens: 3576, completion_tokens: 1306, total_tokens: 4882 }
{...response...}
```
Code:

```ts
enum PerplexityModel {
  SONAR_SMALL_ONLINE = "llama-3.1-sonar-small-128k-online",
  SONAR_LARGE_ONLINE = "llama-3.1-sonar-large-128k-online",
  SONAR_HUGE_ONLINE = "llama-3.1-sonar-huge-128k-online",
}

export async function callPerplexity(
  primer: string,
  query: string
): Promise<string> {
  const messages = [
    { role: "system", content: "You are a helpful researcher." },
    { role: "user", content: primer },
    { role: "assistant", content: "Understood. I'm ready to research." },
    { role: "user", content: query },
  ];

  const inputTokens = getTokenCount(messages);
  console.log("INPUT TOKENS: ", inputTokens);

  const options = {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.PERPLEXITY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      // model: PerplexityModel.SONAR_LARGE_ONLINE, // works fine
      model: PerplexityModel.SONAR_HUGE_ONLINE,
      messages,
      temperature: 0.2,
      top_p: 0.9,
      top_k: 0,
      stream: false,
      presence_penalty: 0,
      frequency_penalty: 0.3,
    }),
  };

  try {
    const response = await fetch("https://api.perplexity.ai/chat/completions", options);
    const data = await response.json();

    const textOutput = data.choices[0].message.content;
    const citations = data.citations;
    const usage = data.usage;

    const outputTokens = getTokenCount(textOutput);
    console.log("OUTPUT TOKENS: ", outputTokens);
    console.log("TOTAL TOKENS: ", inputTokens + outputTokens);
    console.log("citations:\n", citations);
    console.log("usage:\n", usage);

    return textOutput;
  } catch (error) {
    console.error("Error querying Perplexity API:", error);
    throw error;
  }
}
```
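One way to see what the API is actually returning, instead of the opaque SyntaxError, is to read the body as text and only parse it when the response looks like JSON. This is a generic diagnostic sketch, not anything Perplexity-specific; `parseApiBody` is a hypothetical helper name:

```typescript
// Hypothetical helper: parse the body only when it is JSON; otherwise
// surface the status code, content type, and the start of the (likely
// HTML) body so the real server error is visible.
function parseApiBody(
  status: number,
  contentType: string | null,
  bodyText: string
): any {
  const looksJson = contentType?.includes("application/json") ?? false;
  if (!looksJson || status >= 400) {
    throw new Error(
      `API returned ${status} (${contentType}): ${bodyText.slice(0, 120)}`
    );
  }
  return JSON.parse(bodyText);
}

// Inside callPerplexity, `await response.json()` could then become:
//   const bodyText = await response.text();
//   const data = parseApiBody(
//     response.status,
//     response.headers.get("content-type"),
//     bodyText
//   );
```

With that in place, a gateway timeout or similar would log the status and the first bytes of the HTML page, which should make it clearer whether the huge model is timing out server-side.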