Open Role
Member of Technical Staff (Data Scientist, Evals)
at Perplexity
San Francisco, CA · Posted 1 day ago
About the role
Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and our specialized data sources. We aim to use the latest models as they are released, but the intelligence frontier is a jagged one, and popular benchmarks do not effectively cover our use cases. In this role, you will build specialized evals to improve answer quality across Perplexity, covering search-based LLM answers and other scenarios popular with our users.
Responsibilities
• Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness
• Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality
• Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices
• Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements
• Operate within a small, high-impact team where your evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality
Qualifications
• PhD or MS in a technical field or equivalent experience
• 4+ years of experience in data science or machine learning
• Strong proficiency in Python and SQL (expected to write production-grade code)
• Experience building within a modern cloud data stack, specifically AWS and Databricks
• Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster
Preferred Qualifications
• 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups
• Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale
• A strong research background, with experience applying research methods to real-world ML problems
• Experience defining evaluation metrics (e.g., factual consistency, hallucination rate, retrieval precision) and building ground truth datasets
About Perplexity

Redefines AI-powered search.
- HQ: San Francisco, CA
- Stage: Series C+
- Total Raised: $2.2B
- Employees: 1,001-5,000
- Founded: 2022
More roles at Perplexity
- Member of Technical Staff (Software Engineer, API Platform): San Francisco, CA · New York City, NY
- Engineering Manager (API Platform): San Francisco, CA
- Member of Technical Staff (Software Engineer, Enterprise Platform): San Francisco, CA · New York City, NY
- Member of Technical Staff (Software Engineer, Cloud Infrastructure): San Francisco, CA · New York City, NY · Palo Alto, CA