LMArena · 3 weeks ago

Machine Learning Scientist - Open Source Lead

San Francisco Bay Area

Full-time

Hybrid

Senior Level

LMArena is the open platform for evaluating how AI models perform in the real world. They are seeking a Machine Learning Scientist to lead open-source research, design experiments, and develop methodologies that enhance the evaluation and understanding of AI models, while promoting transparency and collaboration within the research community.

Artificial Intelligence (AI) · Information Services · Machine Learning · Product Research

Responsibilities

Design and conduct experiments to evaluate AI model behavior across reasoning, style, robustness, and user preference dimensions

Develop new metrics, methodologies, and evaluation protocols that go beyond traditional benchmarks

Analyze large-scale human voting and interaction data to uncover insights into model performance and user preferences

Communicate results to the broader research community via academic papers, educational content, and conference talks

Collaborate with engineers to implement and scale research findings into production systems

Prototype and test research ideas rapidly, balancing rigor with iteration speed

Partner with model providers to shape evaluation questions and support responsible model testing

Contribute to the scientific integrity and transparency of the LMArena leaderboard and tools

Qualifications

Large-scale model training · Machine Learning expertise · Statistical analysis · Python proficiency · Open-source project experience · Natural Language Processing · Deep learning architectures · Public speaking · Community engagement · Scientific communication

Required

Hands-on experience training large-scale models, including reward models, preference models, and fine-tuning LLMs with methods like RLHF, DPO, and contrastive learning

Strong foundation in ML and statistics, with a track record of designing novel training objectives, evaluation schemes, or statistical frameworks to improve model reliability and alignment

Fluent in the full experimental stack, from dataset design and large-batch training to rigorous evaluation and ablation, with an eye for what scales to production

Deeply collaborative mindset, working closely with engineers to productionize research insights and iterating with product teams to align research with user needs

Comfortable being a visible representative of LMArena, engaging openly with the research community, and building a strong personal brand to help shape AI research culture

PhD or equivalent research experience in Machine Learning, Natural Language Processing, Statistics, or a related field

Uses personal and professional platforms to amplify open research initiatives and invite collaboration

Strong understanding of LLMs and modern deep learning architectures and methods (e.g., Transformers, diffusion models, reinforcement learning from human feedback)

Proficiency in Python and ML research libraries such as PyTorch, JAX, or TensorFlow

Demonstrated ability to design and analyze experiments with statistical rigor

Experience publishing research or working on open-source projects in ML, NLP, or AI evaluation

Comfortable working with real-world usage data and designing metrics beyond standard benchmarks

Ability to translate research questions into practical systems and collaborate across engineering and product teams

Passion for open science, reproducibility, and community-driven research

Preferred

Skilled at public speaking, writing, and presenting research work to diverse audiences

Actively participates in conferences, panels, and online forums to foster relationships and thought leadership

Builds trust through transparent communication and consistent community engagement

Serves as a go-to contact for external researchers, journalists, and partners

Benefits

Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs.

Company

LMArena

LMArena is a web-based platform that evaluates large language models (LLMs) through anonymous, crowd-sourced pairwise comparisons.

Founded in 2025

San Francisco, California, USA

11-50 employees

https://lmarena.ai

Funding

Current Stage

Early Stage

Total Funding

$250M

2026-01-06 · Series A · $150M

2025-05-21 · Seed · $100M

Leadership Team

Anastasios Angelopoulos

Chief Executive Officer

Recent News

Tech Startups - Tech News, Tech Trends & Startup Funding

Anthropic in talks to raise $10B at $350B valuation as AI frenzy accelerates

2026-01-09

Tech Startups - Tech News, Tech Trends & Startup Funding

Top Startup and Tech Funding News – January 8, 2025

2026-01-09

Crunchbase News

The Week’s 10 Biggest Funding Rounds: xAI Leads As 2026 Is Off To A Brisk Start

2026-01-09

Company data provided by Crunchbase