Archetype AI · 2 months ago

Evaluation Lead

United States

Full-time

Remote

Senior Level

Archetype AI is developing an innovative AI platform aimed at transforming real-world data into valuable insights. The Evaluation Lead will be responsible for designing and implementing evaluation methodologies to assess model performance and collaborating with research and engineering teams to enhance AI technologies.

Artificial Intelligence (AI) · Information Technology · Software

Responsibilities

Design and implement rigorous evaluation methodologies and benchmarks for measuring model effectiveness, reliability, alignment, and safety

Lead evaluation of model performance, ranging from offline experiments to full production model testing

Design and oversee the pipelines, dashboards, and tools that automate model evaluation

Design and oversee tools for A/B model testing, regression testing, and production model performance

Develop and implement strategies for evaluating physical AI models that can scale across a broad range of real-world use cases, sensor types, and edge cases

Plan, run, and oversee evaluations across internal teams and external customers

Drive edge case discovery, red-teaming, and safety, privacy, and risk evaluation, feeding findings back to key stakeholders in research and engineering teams

Qualifications

Evaluating AI models · Designing evaluation metrics · Machine learning expertise · Python programming · Building data pipelines · Collaboration with stakeholders · Startup-ready mindset · Communication skills

Required

Extensive expertise in evaluating AI and machine learning models, ideally in physical AI or a related AI field

Experience in designing, implementing, and refining evaluation metrics

Deep understanding of machine learning, AI, and generative models

Excellent Python and software engineering skills

Experience designing and building scalable data pipelines and evaluation tools

Experience collaborating closely with key stakeholders from research, engineering, and product teams

Strong communication and documentation skills, with a bias for creating detailed evaluation reports that help drive model performance

Startup-ready mindset with the ability to thrive in high-velocity, high-ambiguity environments

Preferred

Experience evaluating real-world, real-time algorithms

Experience evaluating a broad range of sensor types, such as cameras, LIDAR, physical sensors, RF sensors, and beyond

A strong scientific approach to evaluating and understanding model performance

Experience in evaluating production algorithms

Experience building and curating data campaigns to create extensive test datasets

Experience managing internal teams and/or external vendors

Company

Archetype AI

Archetype AI develops Physical AI agents that harness real-world sensor data to enhance decision-making and automate processes.

Founded in 2023

Palo Alto, California, USA

11-50 employees

https://www.archetypeai.io

Funding

Current Stage

Early Stage

Total Funding

$48M

Key Investors

Comcast NBCUniversal LIFT Labs · Venrock · 5G Open Innovation Lab

2025-11-20 · Series A · $35M

2024-10-20 · Non-Equity Assistance

2024-04-05 · Seed · $13M

Leadership Team

Jaime Lien

Co-Founder & Chief Scientist

Nick Gillian

Founder, CTO

Recent News

CB Insights

Early-Stage Trends Report: Beyond LLMs, physical AI’s ChatGPT moment, and more in November

2025-12-20

SiliconANGLE

Archetype raises $35M to automate sensor data analysis with AI

2025-11-23

Pulse 2.0

Archetype AI: $35 Million Series A Raised To Advance Physical AI Platform And Deploy Next-Generation Physical Agents

2025-11-23

Company data provided by Crunchbase