Freelance Agent Evaluation Engineer (M/F) - Mindrift

Lyon (69)
Freelance
Part-time
Master's degree level (Bac +5)
IT sector • IT services company (ESN)
Minimum 5 years of experience

Job details

Please submit your CV in English and indicate your level of English proficiency.

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What this opportunity involves

We're building a dataset to evaluate AI coding agents - how well a model handles real-world developer tasks.

You'll create challenging tasks and evaluation criteria within realistic simulated environments:

Build realistic developer environments - a virtual company with codebase, infrastructure, and context (tickets, docs, conversations) that forms a believable development history
Design tasks from intermediate states of these environments - craft the prompt, define what "solved" means, and ensure the task is solvable by an AI agent
Write tests that verify agent solutions - accept all valid approaches and reject incorrect ones, neither too strict nor too lenient
Iterate on tasks and tests based on QA feedback - review agent solutions, analyze failures, and refine until the evaluation is fair and robust

What this is NOT

Not data labeling
Not prompt engineering
Not writing code from scratch - the agent writes most of the code; you guide and evaluate

What we look for

5+ years in software development
Core stack: Python (FastAPI), JavaScript/TypeScript (React), Docker, Postgres, Kafka, Redis
Experience writing tests (functional, integration)
English proficiency - B2+

Why this is hard

Frontier models are already good at coding. Creating a task that genuinely challenges the best models is non-trivial. You need to deeply understand where models fail and what scenarios reveal the difference between a good and a bad solution. Tasks have many valid solutions - writing tests that accept all correct solutions and reject incorrect ones is harder than it sounds.
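To make the last point concrete, here is a minimal sketch of a test helper that checks behavior rather than one exact output. The task ("return the k most active users, in any order"), the function name `check_top_k`, and the sample data are hypothetical illustrations, not part of the actual project.

```python
from collections import Counter


def check_top_k(events, k, result):
    """Return True if `result` is a valid top-k answer for a hypothetical task:
    'return the k users with the most events, in any order'.

    Ties can be broken in several ways, so more than one answer is valid.
    """
    counts = Counter(events)
    expected_len = min(k, len(counts))

    # Reject wrong sizes and duplicated users.
    if len(result) != expected_len or len(set(result)) != len(result):
        return False
    # Reject users that never appear in the input.
    if any(user not in counts for user in result):
        return False
    if expected_len == 0:
        return True

    # Every chosen user must be at least as active as every user left out.
    chosen_min = min(counts[user] for user in result)
    left_out_max = max(
        (c for user, c in counts.items() if user not in result), default=0
    )
    return chosen_min >= left_out_max


events = ["a", "b", "a", "c", "b", "d"]
print(check_top_k(events, 3, ["a", "b", "c"]))  # one valid tie-break
print(check_top_k(events, 3, ["a", "b", "d"]))  # another valid tie-break
print(check_top_k(events, 2, ["a", "c"]))       # leaves out a more active user
```

A test that compared the result to a single hard-coded list would be too strict here, since it would reject one of the two valid tie-breaks; a test that only checked the length would be too lenient, since it would accept the third call.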

How it works

Apply
Pass qualification(s)
Join a project
Complete tasks
Get paid

Effort estimate

Each task for this project is estimated to take about 20 hours to complete, depending on complexity. This is an estimate, not a schedule requirement; you choose when and how to work. To be accepted, a task must be submitted by the deadline and meet the listed acceptance criteria.

Compensation

Up to $50/hr equivalent, depending on level and pace. Tasks are estimated at ~20 hours each; you set your own schedule.

Published on 2 June 2026 - Ref: 9195e553-fc7e-4bed-a721-4ea5b9716bba
