Mindrift · 5 hours ago
MCP & Tools Python Developer - Agent Evaluation Infrastructure
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. The role involves developing and maintaining MCP-compatible evaluation servers, implementing logic that checks agent actions against scenario definitions, and creating tools that writers and QAs use to test agents.
Computer Software
Responsibilities
Develop and maintain MCP-compatible evaluation servers
Implement logic to check agent actions against scenario definitions
Create or extend tools that writers and QAs use to test agents
Work closely with infrastructure engineers to ensure compatibility
Occasionally help with test writing or debug sessions when needed
Apply professional judgment to assess AI responses
Qualifications
Required
Degree in Computer Science, Software Engineering, or a related field
4+ years of Python development experience, ideally in backend or tooling work
Solid experience building APIs, testing frameworks, or protocol-based interfaces
Understanding of Docker, Linux CLI, and HTTP-based communication
Familiarity with how LLM agents are prompted, executed, and evaluated
Clear documentation and communication skills - you'll work with QA and writers
English proficiency - B2
Benefits
Fixed project rate or individual rates, depending on the project
Some projects include incentive payments
Company
Mindrift
Welcome to Mindrift — a space where innovation meets opportunity.
Funding
Current Stage
Late Stage
Company data provided by Crunchbase