Avacend Inc · 20 hours ago
Founding Engineer (AI Software Engineer)
Avacend Inc is seeking a Founding Engineer to deploy and manage AI operations for a new AI project. The role involves managing MLOps pipelines, deploying LLMs, and ensuring the performance and availability of AI inference services.
Responsibilities
Deploy, manage, operate, and troubleshoot containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Deploy, configure, and tune LLMs using TensorRT-LLM and Triton Inference Server
Manage, operate, and support MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
Set up and operate monitoring of AI inference services for performance and availability
Deploy and troubleshoot LLMs on a containerized platform, including monitoring and load balancing
Follow standard processes for operating a mission-critical system: incident management, change management, event management, etc.
Manage scalable infrastructure for deploying and running LLMs
Deploy models in production environments, including containerization, microservices, and API design
Work with Triton Inference Server, including its architecture, configuration, and deployment
Apply model optimization techniques using Triton with TensorRT-LLM (TRT-LLM), including pruning, quantization, and knowledge distillation
Qualifications
Required
Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
Experience managing, operating, and supporting MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
Experience setting up and operating monitoring of AI inference services for performance and availability
Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
Experience managing scalable infrastructure for deploying and managing LLMs
Experience deploying models in production environments, including containerization, microservices, and API design
Knowledge of Triton Inference Server, including its architecture, configuration, and deployment
Knowledge of model optimization techniques using Triton with TensorRT-LLM (TRT-LLM), including pruning, quantization, and knowledge distillation
Company
Avacend Inc
Avacend’s core service offerings provide a unique combination of Telecommunications and Information Technology expertise, leveraged to offer new, award-winning Converged Communications services.
H1B Sponsorship
Avacend Inc has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The additional information below is provided for reference. (Data powered by the US Department of Labor)
[Chart: distribution of job fields receiving sponsorship, highlighting fields similar to this job]
Total sponsorships by year: 2020 (2)
Funding
Current Stage: Growth Stage

Recent News
EIN Presswire (2025-12-24, 2025-06-27)

Company data provided by Crunchbase