
DatologyAI · 1 day ago

Software Engineer, Cloud Infrastructure

DatologyAI is a pioneering company focused on optimizing data curation for machine learning models. They are seeking an experienced Cloud Infrastructure Engineer to design, build, and operate secure and scalable cloud infrastructure, collaborating with various teams to support training and inference pipelines.

Artificial Intelligence (AI) · Data Center · Data Integration · Database · Information Technology
H1B Sponsor Likely

Responsibilities

Architect and maintain our multi-cloud infrastructure (primarily AWS, potentially Azure/GCP), with a focus on reliability, security, and scalability
Define and implement infrastructure-as-code best practices using Terraform, CloudFormation, Pulumi, and similar technologies
Design and manage Kubernetes-based systems for model training, inference, and data processing workloads
Optimize our CI/CD pipelines and streamline deployment of services across environments
Build monitoring, alerting, and logging systems to ensure high system availability and observability
Collaborate with research and engineering teams to provide infrastructure support for training large-scale ML models
Ensure our infrastructure supports various deployment models (cloud, on-prem, hybrid) for enterprise use cases
Drive cost-efficiency strategies across compute and storage resources
Respond to and resolve infrastructure-related incidents with a sense of ownership and urgency

Qualifications

AWS · Kubernetes · Terraform · Multi-cloud infrastructure · CI/CD pipelines · Infrastructure-as-code · Bash · Python · Go · Systems-level debugging · Security · Scalability · Collaboration

Required

You've led or helped build robust infrastructure systems at a startup or fast-moving engineering organization
Deep experience working with cloud providers (especially AWS), and ideally exposure to multi-cloud or hybrid-cloud setups
Strong with Kubernetes, Terraform, and containerized architectures
Confident with systems-level debugging: networking issues, memory leaks, resource bottlenecks, etc.
Comfortable writing clean, maintainable scripts in Bash, Python, or Go
You care deeply about building secure and scalable systems and take pride in reliable infrastructure
You're collaborative, humble, and ready to own high-impact projects end-to-end

Preferred

Experience supporting infrastructure for ML workloads (training pipelines, inference clusters, GPU orchestration)
Built or scaled infrastructure for teams working with large-scale datasets
Exposure to cost monitoring and optimization tools in cloud environments
Background supporting compliance and security in enterprise deployments

Benefits

100% covered health benefits (medical, vision, and dental).
401(k) plan with a generous 4% company match.
Unlimited PTO policy.
Annual $2,000 wellness stipend.
Annual $1,000 learning and development stipend.
Daily lunches and snacks are provided in our office!
Relocation assistance for employees moving to the Bay Area.

Company

DatologyAI

DatologyAI is an AI data curation startup that develops deep learning tools for automatically selecting training data.

H1B Sponsorship

DatologyAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
Trends of Total Sponsorships
2025 (4)
2024 (2)

Funding

Current Stage: Early Stage
Total Funding: $57.65M
Key Investors: Felicis, Amplify Partners
2024-05-08 · Series A · $46M
2024-02-22 · Seed · $11.65M

Leadership Team

Ari Morcos, CEO and Co-Founder
Bogdan Gaza, Co-Founder & CTO

Company data provided by Crunchbase