Evaluation Lead jobs in United States

Archetype AI · 2 months ago

Evaluation Lead

Archetype AI is developing an innovative AI platform aimed at transforming real-world data into valuable insights. The Evaluation Lead will be responsible for designing and implementing evaluation methodologies to assess model performance and collaborating with research and engineering teams to enhance AI technologies.

Artificial Intelligence (AI) · Information Technology · Software

Responsibilities

Design and implement rigorous evaluation methodologies and benchmarks for measuring model effectiveness, reliability, alignment, and safety
Lead evaluation of model performance, ranging from offline experiments to full production model testing
Design and oversee the pipelines, dashboards, and tools that automate model evaluation
Design and oversee tools for A/B model testing, regression testing, and production model performance
Develop and implement strategies for evaluating physical AI models that can scale across a broad range of real-world use cases, sensor types, and edge cases
Plan, run, and oversee evaluations across internal teams and external customers
Drive edge case discovery, red-teaming, safety, privacy, and risk evaluation - feeding back knowledge to key stakeholders in research and engineering teams

Qualifications

Evaluating AI models · Designing evaluation metrics · Machine learning expertise · Python programming · Building data pipelines · Collaboration with stakeholders · Startup-ready mindset · Communication skills

Required

Extensive expertise in evaluating AI and machine learning models, ideally in physical AI or a related AI field
Experience in designing, implementing, and refining evaluation metrics
Deep understanding of machine learning, AI, and generative models
Excellent Python and software engineering skills
Experience designing and building scalable data pipelines and evaluation tools
Experience collaborating closely with key stakeholders from research, engineering, and product teams
Strong communication and documentation skills, with a bias for creating detailed evaluation reports that help drive model performance
Startup-ready mindset with the ability to thrive in high-velocity, high-ambiguity environments

Preferred

Experience evaluating real-world, real-time algorithms
Experience evaluating a broad range of sensor types, such as cameras, LIDAR, physical sensors, RF sensors, and beyond
A strong scientific approach to evaluation and understanding model performance
Experience in evaluating production algorithms
Experience building and curating data campaigns to create extensive test datasets
Experience managing internal teams and/or external vendors

Company

Archetype AI

Archetype AI develops Physical AI agents that harness real-world sensor data to enhance decision-making and automate processes.

Funding

Current Stage
Early Stage
Total Funding
$48M
Key Investors
Comcast NBCUniversal LIFT Labs, Venrock, 5G Open Innovation Lab
2025-11-20 · Series A · $35M
2024-10-20 · Non Equity Assistance
2024-04-05 · Seed · $13M

Leadership Team

Jaime Lien
Co-Founder & Chief Scientist
Nick Gillian
Founder, CTO
Company data provided by Crunchbase