Evaluation Scenario Writer - AI Agent Testing Specialist (M/F), Mindrift

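  • Paris - 75
  • Permanent contract (CDI)
  • Part-time
  • 2-year post-secondary degree (Bac +2)
  • 3- or 4-year degree (Bac +3, Bac +4)
  • Master's level (Bac +5)
  • IT sector • IT services company (ESN)
  • Min. 3 years of experience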

Job details

Please submit your CV in English and indicate your level of English proficiency.

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What this opportunity involves

While each project involves unique tasks, contributors may:

- Create structured test cases that simulate complex human workflows
- Define gold-standard behavior and scoring logic to evaluate agent actions
- Analyze agent logs, failure modes, and decision paths
- Work with code repositories and test frameworks to validate your scenarios
- Iterate on prompts, instructions, and test cases to improve clarity and difficulty
- Ensure that scenarios are production-ready, easy to run, and reusable
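
As a rough illustration of what a structured test case with gold-standard behavior and scoring logic might look like, here is a minimal Python sketch. The posting does not specify any schema or scoring method, so everything here is an assumption made for illustration: the `Scenario` fields, the JSON layout, and the ordered-match scoring rule are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical test case: a task prompt plus the gold-standard action sequence."""
    name: str
    prompt: str
    expected_actions: list[str] = field(default_factory=list)


def load_scenario(raw: str) -> Scenario:
    """Parse a scenario from a JSON string (the schema is an assumption)."""
    data = json.loads(raw)
    return Scenario(data["name"], data["prompt"], data["expected_actions"])


def score(scenario: Scenario, agent_actions: list[str]) -> float:
    """Return the fraction of gold-standard actions found, in order, in the agent log."""
    if not scenario.expected_actions:
        return 1.0
    matched = 0
    position = 0  # search resumes after the last matched action
    for expected in scenario.expected_actions:
        try:
            position = agent_actions.index(expected, position) + 1
            matched += 1
        except ValueError:
            # Expected action is missing (or out of order); it earns no credit.
            continue
    return matched / len(scenario.expected_actions)
```

With a scenario expecting `search_flights`, `select_flight`, `confirm_booking`, an agent log that skips `select_flight` would score 2/3 under this rule. A real project would likely define its own schema and a richer scoring rubric.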

What we look for

This opportunity is a good fit for software engineers open to part-time, non-permanent projects. Ideally, contributors will have:

- 3+ years of software development experience with a strong Python focus
- Experience with Git and code repositories
- Comfort with structured formats such as JSON/YAML for scenario descriptions
- Understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
- Familiarity with Docker
- English proficiency at B2 level

How it works

Apply -> Pass qualification(s) -> Join a project -> Complete tasks -> Get paid

Project time expectations

Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Payment

- Paid contributions, with rates up to $50/hour*
- Fixed project rate or individual rates, depending on the project
- Some projects include incentive payments

*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.

Published on 24/01/2026 - Ref: f8df43cb-6174-47bd-859f-8fce1529d038
