Trend Micro · 14 hours ago

Applied AI Architect - Austin, TX

Austin

Full-time

Onsite

Mid, Senior Level

3+ years exp

Trend Micro, a global cybersecurity leader, helps make the world safe for exchanging digital information across enterprises, governments, and consumers. They are seeking an Applied AI Architect to lead the technical direction for model architecture selection, fine-tuning, and optimization, translating research into scalable solutions for cybersecurity.

Cloud Security, Cyber Security, Security, Virtualization

Culture & Values

No H-1B sponsorship

U.S. Citizen Only

Responsibilities

Drive research-to-production of LLM/SLM systems — from design and fine-tuning to evaluation, deployment, and continual adaptation in enterprise agent workflows

Lead technical choices — determine when to apply context engineering, prompt tuning, continued pretraining, supervised fine-tuning, reasoning fine-tuning, LoRA, or RL

Architect high-performance inference and serving using vLLM, NVIDIA NIM, Triton, CUDA, or other optimized frameworks

Integrate reinforcement learning frameworks (veRL, SkyRL, PyTorch, Ray RLlib) to enhance reasoning, adaptability, and agent feedback loops

Develop and operationalize AI Ops pipelines — build benchmarks and metrics for model evaluation, observability, drift detection, and lifecycle automation

Advance agent interoperability using A2A (Agent-to-Agent) or MCP (Model Context Protocol) for large-scale coordination

Collaborate with cybersecurity researchers to embed threat reasoning, anomaly detection, and defensive logic directly into model behavior

Publish, document, and codify reusable AI blueprints for hybrid (cloud + on-prem) deployments and future research acceleration

Qualifications

LLM/SLM production experience, GPU-accelerated inference, Python proficiency, AI Ops toolchains, Containerized AI microservices, Research-driven mindset, Data-oriented approach, Ownership mentality

Required

Proven end-to-end experience bringing LLM/SLM research into production — from fine-tuning and inference optimization to evaluation and AI Ops integration

Excellent knowledge of at least one of the following: Deep understanding of data-model-infrastructure trade-offs and optimization under real business constraints

Hands-on experience fine-tuning LLMs using frameworks such as LLaMA Factory, NeMo, and PEFT (e.g., LoRA)

Strong knowledge of GPU-accelerated inference (e.g., vLLM, NIM, Triton, CUDA, NCCL, PyTorch/XLA)

Familiarity with AI Ops toolchains (e.g., Weights & Biases, MLflow, Ray Serve)

Proficiency in Python and experience building containerized AI microservices (e.g., Docker, Kubernetes, Ray)

3+ years of applied AI/ML research or engineering, including 2+ years in production-scale deployment