Avacend Inc · 20 hours ago
Founding Engineer (AI Software Engineer)
Avacend Inc is seeking a Founding Engineer to deploy and manage AI operations for a new AI project. The role involves managing MLOps pipelines, deploying LLMs, and ensuring the performance and availability of AI inference services.
Responsibilities
Deploy, manage, operate, and troubleshoot containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Deploy, configure, and tune LLMs using TensorRT-LLM and Triton Inference Server
Manage, operate, and support MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
Set up and operate monitoring of AI inference services for performance and availability
Deploy and troubleshoot LLMs on a containerized platform, including monitoring and load balancing
Follow standard processes for operating a mission-critical system: incident management, change management, event management, etc.
Manage scalable infrastructure for deploying and running LLMs
Deploy models in production environments, including containerization, microservices, and API design
Work with Triton Inference Server, including its architecture, configuration, and deployment
Apply model optimization techniques using Triton with TensorRT-LLM (TRT-LLM), including pruning, quantization, and knowledge distillation
Qualifications
Required
Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
Experience managing, operating, and supporting MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
Experience setting up and operating monitoring of AI inference services for performance and availability
Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
Experience managing scalable infrastructure for deploying and managing LLMs
Experience deploying models in production environments, including containerization, microservices, and API design
Knowledge of Triton Inference Server, including its architecture, configuration, and deployment
Knowledge of model optimization techniques using Triton with TensorRT-LLM (TRT-LLM), including pruning, quantization, and knowledge distillation
Company
Avacend Inc
Avacend’s core service offerings provide a unique combination of Telecommunications and Information Technology expertise, leveraged to offer new, award-winning Converged Communications services.
H1B Sponsorship
Avacend Inc has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The additional information below is provided for reference. (Data powered by the US Department of Labor)
[Chart: distribution of job fields receiving sponsorship, highlighting fields similar to this job]
Total sponsorships by year: 2020 (2)
Funding
Current Stage: Growth Stage

Recent News
EIN Presswire (2025-12-24, 2025-06-27)

Company data provided by Crunchbase