Data Center DevOps Engineer (AI Infrastructure) jobs in United States

Intelliswift - An LTTS Company · 19 hours ago

Data Center DevOps Engineer (AI Infrastructure)

Intelliswift, an LTTS Company, is redefining how AI infrastructure is built and operated. The Data Center DevOps Engineer will be responsible for the reliability, automation, and operational excellence of GPU-based systems supporting AI workloads, working closely with various teams to drive execution from concept to commercialization.

Big Data, Business Intelligence, Cloud Management, Enterprise Software, Information Technology
H1B Sponsor Likely
Hiring Manager
Basant Sharma

Responsibilities

Own pre-deployment operations, including rack staging, hardware health validation, monitoring, triage, and troubleshooting
Own post-deployment operations, ensuring ongoing system health through monitoring, incident response, and continuous automation improvements
Identify operational gaps and design automation to improve reliability, scalability, and efficiency
Serve as a bridge between Data Center Operations and Software Engineering teams to align infrastructure and software requirements
Contribute to product requirements documents (PRDs) and sprint planning from an operations and reliability perspective
Develop and maintain deployment pipelines and operational playbooks for large-scale AI infrastructure
Help attract, mentor, and grow engineering talent
Lead by example, fostering a culture of humility, ownership, and innovation

Qualifications

Kubernetes, GPU systems, Infrastructure automation, Linux administration, Networking, Ansible, Terraform, Python, Site Reliability Engineering, Monitoring systems, Cloud/DevOps certifications, High-performance computing

Required

Bachelor's degree in Computer Science, Electrical Engineering, or a related field
5+ years of experience in data center operations, site reliability engineering (SRE), or DevOps
Strong experience with Linux system administration, networking, and hardware troubleshooting
Hands-on experience automating infrastructure using tools such as Ansible, Terraform, and Python

Preferred

Master's degree or relevant Cloud/DevOps certifications
Deep hands-on experience with Kubernetes and container orchestration on bare-metal environments
Experience with GPU platforms (NVIDIA DGX/HGX), high-performance computing (HPC) clusters, and Ethernet-based fabric management
Expertise in building scalable monitoring and alerting systems (Prometheus, Grafana, ELK stack)
Experience implementing 'Day 0, Day 1, and Day 2' automation for large-scale infrastructure deployments

Company

Intelliswift - An LTTS Company

Intelliswift, an LTTS Company, delivers world-class Digital Product Engineering, Data Management, Analytics & AI, and Digital Ent Solutions

H1B Sponsorship

Intelliswift - An LTTS Company has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (29)
2024 (35)
2023 (44)
2022 (27)
2021 (38)
2020 (37)

Funding

Current Stage: Late Stage
Total Funding: unknown
2024-11-11: Acquired

Leadership Team

Pat Patel
Founder & Executive Chairman
Company data provided by Crunchbase