Feature Request
Enable a mid-input system for AI Browser tasks, allowing users to provide additional data or guidance during multi-step automations (e.g., form filling, checkout flows), rather than only before or after the entire process.
Problem Statement
Currently, if Perplexity AI needs information (like age, email, or a selection) during a task, it exits and asks the user to restart with the missing data. This disrupts the workflow and makes complex automations (multi-page forms, registrations, stepwise tasks) cumbersome. Users can’t clarify or correct entries in real time.
Proposed Solution
Implement a live interaction feature where the AI:
- Pauses and prompts for input whenever it reaches a data field or choice it cannot complete from the information it already has.
- Allows users to reply instantly (“Enter my age: 15”), after which the agent continues the process without starting over.
Example Use Case:
- User instructs: “Fill out this registration form for me.”
- AI begins task, then asks: “What should I enter for your birthdate?”
- User replies: “August 4, 2009.”
- AI continues to next page, asks for address, user replies, process remains uninterrupted.
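The pause-and-resume flow in this use case can be sketched with a Python generator. This is only an illustration of the requested behavior, not Perplexity's implementation: `fill_form`, `run`, the field names, and the sample values other than the birthdate are all hypothetical.

```python
from typing import Dict, Generator, List, Tuple


def fill_form(fields: List[str], known: Dict[str, str]) -> Generator[str, str, Dict[str, str]]:
    """Hypothetical agent loop: yields a question whenever a field value is missing."""
    filled: Dict[str, str] = {}
    for name in fields:
        if name in known:
            filled[name] = known[name]
        else:
            # Pause here. The caller resumes the same task with send(answer),
            # so earlier fields are not filled in again.
            answer = yield f"What should I enter for your {name}?"
            filled[name] = answer
    return filled


def run(fields: List[str], known: Dict[str, str], answers: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Drive the agent, supplying one user answer per question asked."""
    task = fill_form(fields, known)
    questions: List[str] = []
    try:
        question = next(task)  # run until the first missing field
        while True:
            questions.append(question)
            question = task.send(answers[len(questions) - 1])
    except StopIteration as done:
        # The generator's return value carries the completed form.
        return questions, done.value


# Placeholder data for illustration only.
questions, result = run(
    ["name", "birthdate", "address"],
    {"name": "Alex"},
    ["August 4, 2009", "12 Example St"],
)
```

The point of the sketch is that the task's state (`filled`) survives each question, which is what the current exit-and-restart behavior loses.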
API Impact
- Component: Likely affects the chat completions and browser control APIs.
- Model: Impacts any model with browser-agent automation (e.g., Sonar Deep Research).
- Change: Would require new session management capabilities and possibly a streaming/mid-action messaging protocol to enable real-time user input.
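One possible shape for the session state and mid-action messages is sketched below. Everything here is an assumption made for illustration: the `AgentSession` class, the `input_request` / `input_response` event types, and the JSON field names do not correspond to any existing Perplexity API.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AgentSession:
    """Hypothetical server-side state for one in-progress browser task."""
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "running"  # "running" or "awaiting_input"
    pending_field: Optional[str] = None
    collected: Dict[str, str] = field(default_factory=dict)

    def request_input(self, field_name: str, prompt: str) -> str:
        """Pause the task and build an input_request event as a JSON string."""
        self.status = "awaiting_input"
        self.pending_field = field_name
        return json.dumps({
            "type": "input_request",
            "session_id": self.session_id,
            "field": field_name,
            "prompt": prompt,
        })

    def submit_input(self, message: str) -> None:
        """Validate an input_response event, store the value, and resume."""
        event = json.loads(message)
        if event.get("type") != "input_response":
            raise ValueError("expected an input_response event")
        if event.get("session_id") != self.session_id:
            raise ValueError("session mismatch")
        if self.status != "awaiting_input" or event.get("field") != self.pending_field:
            raise ValueError("no matching input request is pending")
        self.collected[event["field"]] = event["value"]
        self.pending_field = None
        self.status = "running"
```

A client would receive the `input_request` event over whatever streaming channel is chosen, show the prompt to the user, and send back an `input_response` carrying the same `session_id` and `field`. Tying each response to a pending request keeps a late or duplicate reply from overwriting a different field.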
Alternatives Considered
The current workaround is to answer the agent's question after it exits (e.g., “What should I enter for X?”) and then manually restart the task. This is inefficient for multi-field workflows and loses the context built up before the exit.
Additional Context
- Improves user experience for lengthy forms, registrations, and stepwise automation.
- Bridges gap between single-turn automation and true interactive collaboration.
- Would make Perplexity’s AI agent feel more like a live assistant.