Novisync, Inc · 1 month ago
Sr. Consultant
Novisync, Inc, a company focused on AI operations, is seeking a Sr. Consultant to manage and optimize LLMs and Kubernetes services. The role involves deploying, managing, and troubleshooting containerized services and MLOps pipelines for mission-critical applications.
Information Services · Information Technology
Responsibilities
Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
Managing MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
Setup and operation of AI inference service monitoring for performance and availability
Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
Operation and support of MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
Experience with standard processes for operating a mission-critical system, such as incident management, change management, and event management
Managing scalable infrastructure for deploying and managing LLMs
Deploying models in production environments, including containerization, microservices, and API design
Triton Inference Server, including its architecture, configuration, and deployment
Model optimization techniques using Triton with TensorRT-LLM
Model optimization techniques, including pruning, quantization, and knowledge distillation
Qualifications
Required
Must-have skills: LLMs and Kubernetes
Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
Managing MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
Setup and operation of AI inference service monitoring for performance and availability
Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
Operation and support of MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
Experience with standard processes for operating a mission-critical system, such as incident management, change management, and event management
Managing scalable infrastructure for deploying and managing LLMs
Deploying models in production environments, including containerization, microservices, and API design
Triton Inference Server, including its architecture, configuration, and deployment
Model optimization techniques using Triton with TensorRT-LLM
Model optimization techniques, including pruning, quantization, and knowledge distillation
Company
Novisync, Inc
Founded in 2007, Novisync is a global management consulting and technology services company committed to helping businesses navigate the digital world with confidence.
H1B Sponsorship
Novisync, Inc has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted field is the one similar to this job]
Trends of Total Sponsorships
2025 (3)
2024 (16)
2023 (15)
2022 (12)
2021 (10)
2020 (23)
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase