Staff Machine Learning Engineering (Remote) jobs in United States

Cisco · 7 hours ago

Staff Machine Learning Engineering (Remote)

Cisco is revolutionizing how data and infrastructure connect and protect organizations in the AI era. They are seeking a Staff Machine Learning Engineer to lead the architecture for their AI Platform, design high-scale inference services, and provide mentorship to engineers.

Communications Infrastructure · Enterprise Software · Hardware · Software
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Lead the end-to-end architecture for key areas of the AI Platform: multi-tenant LLM serving (vLLM/Ray), routing and orchestration layers, VectorDB/RAG integration, and agentic/SDK surfaces used by product teams
Design and drive implementation of high-scale inference services, including parallelism strategies (TP/PP/EP/MoE), autoscaling policies, and cross-region capacity management for GPU/CPU workloads
Optimize latency, throughput, and cost for large-scale LLM and generative workloads using techniques such as batching, chunked prefills, caching, and mixed precision
Design and tune distributed inference configurations (TP/PP/EP/MoE) across multi-GPU and multi-node clusters and modern GPU architectures
Implement platform capabilities such as telemetry, metering & throttling, guardrails, and rollout/rollback to ensure AI services are safe, observable, and multi-tenant by default
Lead the design of GenAI application services (chat assistants and automation APIs) grounded in robust RAG pipelines, agentic workflows (LangChain/LangGraph or similar), and MCP-based tool ecosystems
Drive operational excellence with runbooks, readiness checklists, CI/CD safeguards, on-call rotations, and post-incident improvements
Provide technical mentorship and leadership for senior and mid-level engineers: review designs, guide trade-offs around quality/latency/COGS, and help grow the next generation of tech leads
Collaborate closely with applied scientists to productionize new models and techniques, ensuring that research prototypes become robust, observable, and cost-efficient services

Qualifications

Machine Learning Engineering · Distributed Systems · Cloud-native Architectures · Programming (Python/Go/Java) · LLM Inference Frameworks · GPU Performance Optimization · RAG Systems Design · Agentic Frameworks · AWS/Azure/GCP Knowledge · Technical Mentorship · Collaboration Skills · Communication Skills

Required

Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
8+ years of hands-on experience building and operating backend or distributed systems in production, or 5+ years with a Master's degree, or 3+ years with a PhD
Proven track record as a technical lead for complex systems: driving architecture, aligning stakeholders, and delivering high-impact projects end-to-end
Strong proficiency in at least one modern programming language (e.g., Python, Go, or Java) and deep experience with software design, debugging, and performance tuning
Significant experience with cloud-native architectures (containers, Kubernetes, service discovery, configuration management, CI/CD) and building reliable microservices (REST/gRPC)
Demonstrated ownership of production services at scale, including on-call participation, incident response, and post-incident reviews/RCAs that led to concrete improvements

Preferred

Hands-on experience running LLM or deep learning inference at scale using frameworks such as vLLM, TensorRT-LLM, Triton Inference Server, or similar
Deep understanding of GPU and distributed systems performance: latency/throughput trade-offs, pipelining, model parallelism (TP/PP/EP/MoE), mixed precision (BF16/FP8/nvFP4), and profiling tools
Experience designing and operating RAG systems and GenAI application layers: document ingestion, chunking/embedding strategies, metadata design, hybrid retrieval, context ranking, and evaluation of retrieval quality
Practical experience with agentic frameworks (LangChain, LangGraph, LlamaIndex, Semantic Kernel, or similar) and multi-agent coordination, including integration with MCP tools and internal/external APIs
Background building platform or developer-experience capabilities (shared services, SDKs, templates, micro-frontends) that are adopted by multiple product teams
Familiarity with LangSmith or similar evaluation platforms, including experiment design, offline/online evals, hallucination/groundedness metrics, and feedback loops
Strong knowledge of AWS, Azure, or GCP (EC2/VMs, IAM roles/ARNs/principals, VPC networking, security best practices) for AI workloads
Experience defining and monitoring dashboards and alerts for high-availability systems using Prometheus, Grafana, or cloud-native tooling
Excellent communication and collaboration skills: comfortable influencing cross-functional partners and other senior engineers, and explaining trade-offs between quality, latency, and cost to both technical and non-technical audiences

Benefits

Medical, dental and vision insurance
A 401(k) plan with a Cisco matching contribution
Paid parental leave
Short- and long-term disability coverage
Basic life insurance
10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
1 paid day off for employee’s birthday
Paid year-end holiday shutdown
4 paid days off for personal wellness determined by Cisco
16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
Cisco’s flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
Additional paid time away may be requested to deal with critical or emergency issues for family members
Optional 10 paid days per full calendar year to volunteer
Eligible to earn annual bonuses subject to Cisco’s policies
Eligible to earn performance-based incentive pay on top of base salary

Company

Cisco develops, manufactures, and sells networking hardware, telecommunications equipment, and other technology services and products.

H1B Sponsorship

Cisco has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (1238)
2024 (1231)
2023 (1273)
2022 (2127)
2021 (1991)
2020 (1173)

Funding

Current Stage
Public Company
Total Funding
unknown
1990-02-13 IPO

Leadership Team

Chuck Robbins
Chair and CEO
Carl Solder
Chief Technology Officer - Cisco Systems Australia/New Zealand
Company data provided by Crunchbase