AI Infrastructure Engineer jobs in United States

kadence · 13 hours ago

AI Infrastructure Engineer

Kadence is a well-funded, early-stage technology company focused on building production-grade, agent-driven AI systems. They are seeking a Senior AI Platform / LLM Agent Infrastructure Engineer to own the deployment, scalability, and operational reliability of AI/LLM agents in production environments.

Staffing & Recruiting
Growth Opportunities
Hiring Manager
Aimee Sharpe

Responsibilities

Own the production deployment lifecycle of AI/LLM agents from launch through long-term operation
Design and implement cloud-native infrastructure to support agent-based systems at scale
Deploy, operate, and monitor AI agents handling real-world, customer-facing workloads
Build and maintain CI/CD pipelines, deployment workflows, and infrastructure-as-code
Implement monitoring, logging, alerting, and observability for agent behavior, failures, latency, and cost
Optimize systems for reliability, performance, and cost efficiency across LLM providers
Partner closely with AI engineers to harden agent architectures for production usage
Troubleshoot complex production issues across infrastructure, models, and system integrations
Establish best practices for secure, scalable, and maintainable AI systems

Qualifications

Cloud systems deployment, DevOps experience, AI/LLM applications, AWS infrastructure, Python programming, CI/CD pipelines, Infrastructure-as-code, Troubleshooting, System optimization, Observability, Soft skills

Required

5+ years of experience (strong production ownership required)
Strong experience deploying and operating production cloud systems
Proven background in DevOps, platform engineering, or infrastructure roles, ideally supporting ML or AI workloads
Hands-on experience deploying AI/LLM-powered applications or agents into production
Experience operating systems at scale in either a startup or large technology environment
Deep familiarity with AWS infrastructure (or equivalent cloud platforms)
Strong programming skills in Python and/or TypeScript
Experience with CI/CD, infrastructure-as-code (Terraform, CDK, Pulumi, etc.), and cloud automation
Comfortable owning ambiguous systems and making pragmatic tradeoffs in production

Preferred

Experience supporting LLM agents or ML systems in production
Familiarity with agent frameworks, orchestration, or distributed systems
Background in MLOps, SRE, or platform teams
Experience working in early-stage or fast-scaling environments

Company

kadence

Kadence - We place elite AI talent at the intersection of science & engineering, from PhD to Executive.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase