DevOps Engineer (Full Time - Remote)
Role Overview
This contract will begin with a 1-month paid trial, which can then extend to 6+ months.
Job description:
We're hiring experienced DevOps engineers to author and validate Infrastructure-as-Code (IaC) tasks. You'll design realistic infrastructure scenarios, build the ground-truth solutions, and define the automated checks that grade whether an AI agent solved them correctly and safely. This is hands-on engineering and judgment work: you're encoding what "good" looks like for real infrastructure operations into verifiable tasks. If you've spent years working with Terraform, Pulumi, CloudFormation, or cloud CLIs on AWS or GCP, we want to work with you.
What you'll do:
- Author IaC tasks grounded in real-world AWS and GCP scenarios.
- Build ground-truth solutions for each task: correct, idempotent IaC that converges to the desired end state.
- Design verifiable graders - automated checks that confirm an agent reached the correct end state.
- Review and QA tasks authored by other engineers for correctness, difficulty calibration, and robustness.
- Harden tasks against reward hacking.
- Document task intent, assumptions, edge cases, and scoring rationale clearly.
Must-have qualifications:
- 4+ years in DevOps / cloud engineering.
- Deep, hands-on Infrastructure-as-Code expertise: Terraform (required); Pulumi, CloudFormation, or CDK a plus.
- Strong AWS and GCP depth.
- Comfortable scripting in Python (and/or Bash) to build and automate validation.
- A clear sense of correctness and verification - you can articulate, in code, what makes an infrastructure outcome right and safe.
- Prior work in AI evaluation, benchmarking, or expert data annotation preferred.
- Prior experience with LocalStack preferred.