You.com · 3 hours ago
Senior AI Scientist
You.com is an AI-powered search and productivity platform designed to empower users with personalized, efficient, and trustworthy search experiences. They are seeking a Senior AI Scientist to lead the development of novel evaluation methodologies and customer-facing evaluation research, focusing on improving AI systems and understanding quality requirements from customers.
Artificial Intelligence (AI) · Chatbot · Generative AI · Information Technology · Internet · Search Engine
Responsibilities
Define and own what “good” means for search-augmented and agentic AI systems by designing evaluation frameworks that measure real-world quality, reliability, and user-relevant behavior beyond standard benchmarks
Invent and validate novel evaluation methodologies for non-deterministic systems (LLMs, agents, RAG), including behavioral evals, long-tail and adversarial test sets, and task-specific metrics
Develop rigorous statistical frameworks for model comparison, regression detection, and uncertainty estimation, ensuring evaluation results are defensible and decision-ready
Build and maintain scalable evaluation systems—datasets, gold standards, eval harnesses, scoring pipelines, and analysis tooling—that can be reused across products and customers
Lead customer-facing evaluation research, working directly with enterprise customers to translate domain-specific quality requirements into credible, actionable evals that support product decisions and sales outcomes
Drive competitive evaluations and internal quality reviews, surfacing meaningful performance differences, trade-offs, and failure modes to inform product strategy and prioritization
Partner with engineering and product teams to integrate evals into development loops, release gating, and ongoing quality monitoring
Mentor and set standards for evaluation practice, reviewing eval designs, guiding other scientists, and shaping the long-term evals roadmap as systems become more agentic and complex
End-to-End Project Leadership: Lead the development of new AI-driven projects, encompassing ideation, prototyping, research, infrastructure design, scalability, monitoring, and evaluation
Rapid Iteration: Adapt quickly to user feedback and evolving requirements, ensuring continuous improvement in a fast-paced environment
Qualifications
Required
Strong grounding in applied ML and statistics, with experience evaluating non-deterministic AI systems (LLMs, agents, RAG, search)
Deep experience with AI evaluation, including metric design, gold dataset creation, head-to-head comparisons, slicing, and error analysis
Statistical rigor in model comparison, using methods such as paired tests, bootstrap confidence intervals, and robustness analyses
Proficiency in Python for evaluation and analysis, including building eval harnesses, data pipelines, scoring logic, and reproducible analysis workflows
Ability to translate vague product or customer goals into measurable evaluation criteria, and to challenge metrics or conclusions that don't reflect real quality
Comfort engaging directly with customers and cross-functional stakeholders, explaining evaluation results, trade-offs, and limitations clearly
Strong written and verbal communication, including documenting methodologies and contributing to external publications or talks
Benefits
Hubs in San Francisco and New York City offering regular in-person gatherings and co-working sessions
Flexible PTO with U.S. holidays observed and a one-week shutdown in December to rest and recharge*
A competitive health insurance plan covering 100% for the policyholder and 75% for dependents*
12 weeks of paid parental leave in the US*
401(k) program with a 3% match, vested immediately!*
$500 work-from-home stipend to be used within a year of your start date*
$1,200 per year Health & Wellness Allowance to support your personal goals*
The chance to collaborate with a team at the forefront of AI research
Company
You.com
You.com is a personalized AI search engine that delivers customized recommendations and allows natural conversation with its AI chatbot.
H1B Sponsorship
You.com has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (4)
2024 (6)
2023 (3)
2022 (4)
Funding
Current Stage: Growth Stage
Total Funding: $195M
Key Investors: Cox Enterprises, Radical Ventures, Marc Benioff
2025-09-03 · Series C · $100M
2024-06-17 · Series B · $50M
2022-07-14 · Series A · $25M
Company data provided by Crunchbase