odiggo · 4 hours ago

Applied Research Engineer

Mountain View, CA

Full-time

Onsite

Mid, Senior Level

$180K/yr - $220K/yr

Sully.ai is focused on building impactful healthcare solutions using AI technology. The Applied Research Engineer will design and implement automated evaluation pipelines to enhance the reliability and effectiveness of AI agents in clinical settings.

Computer Software

Responsibilities

Build and scale automated evaluation pipelines (LLM-as-judge + human review) with clinical-grade benchmarks

Audit existing evaluation approaches for clinical and agentic tasks

Define initial benchmarks and build early automated pipelines

Partner with engineering to land the first set of CI gates for accuracy, factuality, and safety

Deliver a repeatable evaluation framework with automated pipelines in production

Demonstrate measurable improvements in robustness, hallucination reduction, or safety

Publish or present internal research findings that directly shape product reliability

Qualifications

LLM evaluation frameworks, Python, Machine Learning, PyTorch, TensorFlow, Hugging Face, LangChain, Experiment design, Technical writing, Communication skills

Required

Proven experience designing agentic processes and LLM evaluation/benchmarking frameworks

Strong Python and ML background (PyTorch/TensorFlow, Hugging Face, LangChain/LlamaIndex)

Demonstrated ability to design rigorous experiments and translate findings into production

Track record of published research or deep applied work in LLMs and agent evaluation

Strong communication and technical writing skills to articulate complex findings clearly

Company

odiggo

Car Services in minutes

Founded in 2019

San Francisco, US

2-10 employees

https://odiggo.ai

Funding

Current Stage

Early Stage

Company data provided by Crunchbase