Senior Research Engineer, Post-training & Evaluation jobs in United States

Reddit, Inc. · 5 hours ago

Senior Research Engineer, Post-training & Evaluation

Reddit is a community of communities, and they are seeking a Senior Research Engineer for Post-training & Evaluation to join their AI Engineering team. This role involves architecting evaluation suites and fine-tuning pipelines for large language models to ensure they are safe and effective for Reddit's unique environment.
Publishing, Social Media, Digital Media, Content, News, Social Network
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Architect and maintain the "Reddit Benchmark" evaluation suite: A comprehensive harness that rigorously tests model capabilities across Safety, Reasoning, and Reddit-specific knowledge (slang, norms)
Build scalable SFT (Supervised Fine-Tuning) pipelines: Implement efficient, distributed training loops for instruction tuning, converting raw base models into helpful assistants
Develop Model-as-a-Judge systems: Engineer automated evaluation pipelines using strong models (e.g., GPT-5, Nova, Claude) to grade the outputs of our internal models, enabling rapid iteration cycles
Execute Synthetic Data generation strategies: Create and curate high-quality instruction sets to improve model generalization where human data is scarce
Collaborate with Safety Engineering: Translate high-level safety policies into concrete evaluation metrics and unit tests that run in our CI/CD pipelines
Debug post-training instability: Dive deep into loss curves and evaluation logs to identify when fine-tuning is causing alignment tax or capability degradation

Qualifications

LLM fine-tuning, Python, PyTorch, Evaluation Pipelines, Instruction Tuning, Distributed training, Data engineering, Synthetic Data generation, Experiment tracking tools

Required

4+ years of professional experience in machine learning engineering, with a focus on LLM fine-tuning or evaluation
Fluency in Python and PyTorch, with experience using libraries like Hugging Face Transformers, vLLM, or lm-eval-harness
Deep understanding of Instruction Tuning (SFT) and how data quality impacts model behavior
Experience building Evaluation Pipelines: You know the difference between MMLU and GSM8K, and you know how to build a custom domain-specific benchmark
Familiarity with distributed training (FSDP/DeepSpeed) for fine-tuning jobs
Strong data engineering skills for curating and cleaning instruction datasets

Preferred

Experience with MLflow, Weights & Biases, or other experiment tracking tools
Experience with Synthetic Data generation (e.g., Self-Instruct papers)

Benefits

Comprehensive Healthcare Benefits and Income Replacement Programs
401k with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits
Flexible Vacation & Paid Volunteer Time Off
Generous Paid Parental Leave

Company

Reddit, Inc.

Reddit is the heart of the internet, where millions of people get together to talk about any topic imaginable.

H1B Sponsorship

Reddit, Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data Powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (99)
2024 (63)
2023 (76)
2022 (70)
2021 (68)
2020 (39)

Funding

Current Stage
Public Company
Total Funding
$1.33B
Key Investors
Fidelity, Vy Capital, Tencent
2024-03-21 · IPO
2021-08-12 · Series F · $410M
2021-02-08 · Series E · $367.95M

Leadership Team

Steve Huffman
CEO
Chris Slowe
CTO
Company data provided by Crunchbase