Multi-language search results

Hi. We are currently evaluating the use of perplexity API in our product for automating risk and incident retrieval for our customers. We are experimenting with the sonar model.

The search works well for well known, internationally covered topics. Usually our relevant results will be covered by negative news. We are retrieving the results in english which is aligned with big data providers in this area like the world compliance dataset by lexisnexis or the world check dataset by LSEG. That’s why our prompt is also english.

However, we noticed that the search results skew heavily towards english results because our prompt is also written in english. This is also the case when trying to prompt for more international sources, or for sources in a relevant country’s language. The user location did also not affect the results. For example when researching a company located in Japan, japanese news should be considered/searched. It seems we can only influence that by writing the entire prompt in japanese.

Is there a way to include more international search results? It seems the underlying search is always executed in the english language when using an english prompt, so we would be interested in solutions other than prompting in many different languages.

This is a very good point. Unfortunately, if the prompt is in English, the search results prioritized will be in English. If you would like to see search results in various languages, you would have to pass that requirement through the user prompt.

There is a couple of things that can be done here:

  1. Pass in the language requirement via a formatted string given that you already have the country code during the API call. E.g.,

query = f"Find me latest news about this country {country}, prioritize sources that come from that country and written in that country's language
  1. Have a mechanism to translate the prompt in the correct language, although I would discourage that just because of the amount of tokens it would consume.