Senior AI Engineer - APM Experiences jobs in United States

Datadog · 2 days ago

Senior AI Engineer - APM Experiences

Datadog is a global SaaS business focused on delivering solutions for cloud monitoring and digital transformation. The Senior AI Engineer will lead the development of AI-powered capabilities for Application Performance Monitoring, focusing on debugging, performance optimization, and creating intelligent monitors.

Analytics, Cloud Computing, Cloud Data Services, Cloud Infrastructure, Data Management, DevOps, Productivity Tools, SaaS
H1B Sponsor Likely

Responsibilities

Shape AI experiences for APM. Design and ship LLM/agentic workflows that analyze traces, metrics, logs, and other telemetry to generate diagnoses, explanations, and guided fixes
Own the full loop. Prototype quickly, define success metrics and evals, run experiments, iterate, and ultimately productionize for scale and reliability
Build robust agent systems. Develop tools, retrieval and planning strategies, and guardrails; manage prompts/evals; design fallbacks and human‑in‑the‑loop paths
Integrate with Datadog’s platform. Leverage surfaces like Trace Explorer, Service Catalog, monitors, and workflows to deliver end‑to‑end value in the APM UI
Partner deeply. Collaborate with PM, Design, and partner teams to build cohesive experiences
Raise the bar on engineering. Write performant, maintainable backend code, own services in production, and improve reliability for high‑throughput, low‑latency data systems

Qualifications

LLM features development, Backend ML systems, Microservices performance, Go programming, Java programming, Python programming, ML lifecycle understanding, Prototyping, User journey ownership, Collaboration, Problem discovery

Required

4+ years building backend or real-time ML systems; you value simplicity, correctness, and performance
Proven experience delivering LLM/agent features to production (prompting, tooling, evals, safety/guardrails)
Comfortable owning user journeys, iterating from prototype → alpha → GA, and measuring impact with clear product metrics
Solid grasp of the ML lifecycle (task definition, dataset collection, modeling, evaluation, deployment, iteration) and statistics (experiment design, confidence intervals)
Experience choosing/modeling the right technique for the job (e.g., anomaly detection, ranking/recommendation, NLP), and knowing when a heuristic beats a model
Fluency with offline/online evals for AI systems; can build reliable golden sets and automatic regressions
Experience with microservices performance: tracing, latency breakdowns, concurrency, and resiliency patterns
Proficient in Go, Java, or Python; strong API/service design; production ops (monitoring, alerting, on‑call rotation)

Preferred

Hands‑on with distributed tracing stacks (OpenTelemetry/Datadog APM), profilers, and logs/metrics pipelines
Exposure to planning/agent frameworks, tool‑use orchestration, RAG, and retrieval/indexing for observability data
Familiarity with SLO/SLA practices and incident response

Benefits

Healthcare
Dental
Parental planning
Mental health benefits
A 401(k) plan and match
Paid time off
Fitness reimbursements
A discounted employee stock purchase plan

Company

Datadog is an observability and security platform that offers infrastructure, applications, software development, and monitoring services.

H1B Sponsorship

Datadog has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (123)
2024 (66)
2023 (45)
2022 (53)
2021 (31)
2020 (29)

Funding

Current Stage: Public Company
Total Funding: $1.02B
Key Investors: ICONIQ Growth, Index Ventures, OpenView
2024-12-09: Post-IPO Debt · $870M
2020-05-28: Post-IPO Debt
2019-09-19: IPO

Leadership Team

Olivier Pomel
Co-founder, CEO
Alexis Le-Quoc
Co-founder & CTO
Company data provided by Crunchbase