NationMind LLC · 15 hours ago
AI QA ENGINEER (AGENTIC & GENERATIVE) at Dallas, TX
NationMind LLC is a technology consulting firm focused on software development and QA testing services. They are currently hiring a skilled AI QA Engineer to define and own the QA strategy for agentic/multi-agent AI systems, mentor a team of QA engineers, and partner with various teams to embed QA in the SDLC.
Information Technology & Services
Responsibilities
Define and own the QA strategy for agentic/multi-agent AI systems across dev, staging, and prod
Mentor a team of QA engineers; establish testing standards, coding guidelines for test harnesses, and review practices
Partner with Agentic Operations, Data Science, MLOps, and Platform teams to embed QA in the SDLC and incident response
Design tests for agent orchestration, tool calling, planner-executor loops, and inter-agent coordination (e.g., task decomposition, handoff integrity, and convergence to goals)
Validate state management, context windows, memory/knowledge stores, and prompt/graph correctness under varying conditions
Implement scenario fuzzing (e.g., adversarial inputs, prompt perturbations, tool latency spikes, degraded APIs)
Create resilience testing suites: chaos experiments, failover, retries/backoff, circuit-breaking, and degraded mode behavior
Establish latency SLOs and measure end-to-end response times across orchestration layers (LLM calls, tool invocations, queues)
Ensure reliability through soak tests, canary verifications, and automated rollbacks
Define ground-truth and reference pipelines for task accuracy (exact match, semantic similarity, factuality checks)
Build macro validation frameworks that verify task outcomes across multi-step agent workflows (e.g., complex data pipelines, content generation + verification agent loops)
Instrument guardrail validations (toxicity, PII, hallucination, policy compliance)
Design load/stress tests for multi-agent graphs under scale (concurrency, throughput, queue depth, backpressure)
Validate orchestrator correctness (DAG execution, retries, branching, timeouts, compensation paths)
Engineer reusable test artifacts (scenario configs, synthetic datasets, prompt libraries, agent graph fixtures, simulators)
Integrate tests into CI/CD (pre-merge gates, nightly, canary) and production monitoring with alerting tied to KPIs
Define release criteria and run operational readiness reviews (performance, security, compliance, cost/latency budgets)
Build post-deployment validation playbooks and incident triage runbooks
Qualifications
Required
7+ years in Software QA/Testing, with 2+ years in AI/ML or LLM-based systems; hands-on experience testing agentic/multi-agent architectures
Strong programming skills in Python or TypeScript/JavaScript; experience building test harnesses, simulators, and fixtures
Experience with LLM evaluation (exact/soft match, BLEU/ROUGE, BERTScore, semantic similarity via embeddings), guardrails, and prompt testing
Expertise in distributed systems testing: latency profiling, resiliency patterns (circuit breakers, retries), chaos engineering, and message queues
Familiarity with orchestration frameworks (LangChain, LangGraph, LlamaIndex, DSPy, OpenAI Assistants/Actions, Azure OpenAI orchestration, or similar)
Proficiency with CI/CD (GitHub Actions/Azure DevOps), observability (OpenTelemetry, Prometheus/Grafana, Datadog), and feature flags/canaries
Solid understanding of privacy/security/compliance in AI systems (PII handling, content policies, model safety)
Excellent communication and leadership skills; proven ability to work cross-functionally with Ops, Data, and Engineering
Preferred
Experience with multi-agent simulators, agent graph testing, and tooling latency emulation
Knowledge of MLOps (model versioning, datasets, evaluation pipelines) and A/B experimentation for LLMs
Background in cloud (AWS), serverless, containerization, and event-driven architectures
Prior ownership of cost/latency/SLAs for AI workloads in production
Company
NationMind LLC
NationMind LLC was established with the aim of Empowering Talent.
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase