Hellowork has estimated the salary for this offer
This salary estimate for the position of PhD Thesis: Digital Twin for Immersive Training in Cardiac Surgery Based on Generative AI (M/F) in Paris is computed from similar offers and from INSEE data. The range varies with experience.
- Minimum gross salary: €41,200 / year · €3,433 / month · €22.64 / hour
- Estimated gross salary: €51,200 / year · €4,267 / month · €28.13 / hour
- Maximum gross salary: €67,500 / year · €5,625 / month · €37.09 / hour
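The monthly and hourly figures follow from the annual gross salary, assuming the standard French full-time basis of 151.67 worked hours per month (35 h/week); a minimal sketch of that conversion, under this assumption:

```python
# Convert annual gross salary to monthly and hourly figures, assuming the
# French full-time basis of 151.67 worked hours per month (35 h/week).
MONTHLY_HOURS = 151.67

def salary_breakdown(annual_gross: float) -> tuple:
    """Return (rounded monthly gross, hourly gross rounded to cents)."""
    monthly = annual_gross / 12
    hourly = monthly / MONTHLY_HOURS
    return round(monthly), round(hourly, 2)

for annual in (41_200, 51_200, 67_500):
    m, h = salary_breakdown(annual)
    print(f"{annual} EUR/yr -> {m} EUR/mo, {h} EUR/h")
```

Applied to the three annual figures, this reproduces the monthly and hourly amounts quoted above.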
PhD Thesis: Digital Twin for Immersive Training in Cardiac Surgery Based on Generative AI (M/F)
Doctorat.Gouv.Fr
- Paris - 75
- Fixed-term contract (CDD)
- Partial remote work
- Bac +3, Bac +4
- Bac +5
- State civil service
- Experience: under 1 year
- Experience: 1 to 7 years
- Experience: over 7 years
Job details
Institution: Institut Polytechnique de Paris - Télécom SudParis
Doctoral school: École Doctorale de l'Institut Polytechnique de Paris
Research laboratory: SAMOVAR - Services répartis, Architectures, Modélisation, Validation, Administration des Réseaux
Thesis supervision: Catalin FETITA (ORCID 0000-0002-6134-2990)
Thesis start date: 2026-10-01
Application deadline: 2026-04-13 23:59:59
Cardiovascular diseases are the leading cause of mortality, and coronary artery bypass surgery demands advanced technical skills acquired over a long period. Traditional training methods, relying on experimental laboratories, simulators, and direct observation in the operating room, lack availability, reproducibility, and interactivity. This project proposes to combine immersive XR and generative AI to create flexible, realistic digital twins of cardiac surgery, able to reconstruct, simulate, and augment real procedures for immersive, adaptive training. Multimodal data (video, depth, motion, sound), combined with text-to-3D generative models, will be exploited. Recent advances in neural scene representations, differentiable rendering, and diffusion-based 3D generation will make it possible to design anatomically faithful, controllable virtual environments. Challenges include the scarcity of open surgical datasets and the difficulty of modeling deformable, occluded, and complex scenes.
The doctoral student will extend the capture setups, develop preprocessing and annotation pipelines, fine-tune generative models, and design a 3D generation engine driven by multimodal prompts. The expected outcome is an AI-driven cardiac training simulator, integrated into INSERM's existing XR infrastructure, paving the way toward a cognitive and perceptive surgical assistant that could transform both medical training and the future of surgery itself.
Digital twins have profoundly transformed industry [7], where they are used for predictive maintenance [8], process optimization [9], quality control [10], and simulation-based validation [11]. In such environments, systems are engineered, uncertainties are bounded, and physical processes are reproducible. Virtual twins can model machines, infrastructures, or workflows with controlled variability. Examples include XR-based industrial maintenance platforms, gesture capture for craftsmanship replication [12], and AI-driven robotic training environments, such as NVIDIA's work on AI-generated data for robot training in healthcare and industrial applications.
In healthcare, digital twins are emerging but remain fragmented. Current developments primarily focus on organ-level [13] or patient-specific modeling, such as computational cardiac simulations for preoperative planning, anatomical reconstructions, or intraoperative registration in bone and liver surgery [14, 15]. At INSERM, several initiatives are already advancing the field: realistic modeling of the heart, operating room capture and manual reconstruction, projects on teleportation, remote surgery, and the use of serious games to train nurses in instrument handling. These efforts demonstrate the feasibility of digital twins in medical contexts but remain largely limited to anatomical or functional replication. Furthermore, living systems introduce unbounded uncertainties (patient-specific anatomy, tissue properties, surgical decision-making, and intraoperative events) that differ fundamentally from the bounded variability of industrial systems.
In generative AI, large-scale diffusion and transformer-based foundation models such as Sora, Veo, and Imagen 4 Ultra Preview demonstrate high visual fidelity, instruction following, and short-term temporal coherence. Despite these advances, limitations remain: long-term temporal drift, weak causal and procedural modeling, insufficient semantic grounding, and poor adaptation to patient-specific or surgeon-specific variability. This is largely due to the non-medical datasets used for training. Specialized computer vision and AI methods have been explored for surgical sub-tasks: hand detection using CNN-based models [16], phase recognition with RNNs [17] or gaze-informed Vision Transformers [18], skill assessment via Cascade Mask-RCNN, and visual question answering for surgical understanding [20]. However, there is currently no formal framework to capture, structure, and generate expertise-driven surgical activity across hierarchical levels. A relatively close contribution is [21], yet it does not provide a generative, expert-controllable model of surgical dynamics.
Several international initiatives explore components of surgical digital twins closely related to our subject. In France, CAMMA focuses on multimodal operating room understanding [22, 23], while IMT develops capture setups and reconstruction pipelines. In China, [24] investigates generative approaches for clinical content synthesis. Switzerland hosts the Digital Twin of Spinal Surgery [25], and in the USA, TwinOR develops multimodal scene reconstruction for surgical environments [26]. Despite these efforts, no framework integrates all the levels necessary for a complete surgical twin: patient, surgeon, operating team, procedural environment, and multimodal expertise.
In this context, immersive training constitutes a natural downstream application of surgical digital twins. Based on immersive technologies, it has emerged as an effective experiential learning approach. Literature reviews indicate that immersive environments enhance learner engagement, skill acquisition, and knowledge retention by enabling realistic, repeatable, and risk-free practice scenarios. They are particularly effective for procedural learning and decision-making under pressure. However, effectiveness depends strongly on instructional design quality and alignment with pedagogical objectives [27-30].
The ambition of this project is to develop a new class of expertise-driven surgical digital twins. Rather than treating generation as unconstrained probabilistic synthesis, the proposed framework will embed procedural logic, causal structure, and medical expertise within the learning architecture, enabling controllable, temporally coherent, and patient-specific simulations. Crucially, this project introduces a paradigm shift in surgical simulation: the proposed approach places the domain expert at the center of the generation process. The long-term vision is to provide cardiac surgeons with an interactive XR simulation tool in which structured natural language prompts, expressed in their own professional vocabulary, allow them to specify scenes, procedural steps, anatomical variations, and constraints. This transition from code-based specification to expert-driven, prompt-based control fundamentally redefines how surgical digital twins are designed and operated.
The proposed models will rely on domain-aware generative architectures integrating multimodal surgical data, including video, audio, and textual annotations. By positioning itself at the intersection of foundation models, immersive technologies, and surgical simulation research, the project aims to establish a scientifically rigorous and clinically relevant foundation for trustworthy surgical digital twins.
The research will be structured around three work packages:
WP1: Parameterized 3D reconstruction of the operating room
Summary: The objective is to design a lightweight and clinically compatible capture setup capable of producing a high-fidelity, semantically structured 3D reconstruction of the operating room from sparse sensors. The challenge lies in reconciling the strong constraints of real surgical environments with the requirements of modern neural rendering methods.
(i) First, we will design an optimized sparse capture setup inspired by the modular capture studio developed at SAMOVAR and complemented by hardware available at INSERM. The goal is to define a minimal sensor configuration that preserves sterility and ergonomic compatibility while remaining sufficient for robust reconstruction. Unlike prior systems relying on dense RGB-D cameras [36], Kinects [25], or wall-mounted stereo cameras [26], the proposed setup will be explicitly conceived for sparsity and deployability in real surgical scenarios. (ii) Second, we will adapt 3DGS to sparse-view conditions by integrating recent advances in diffusion models [34, 37, 38] and visual foundation models (CLIP, SAM, DINO) to strengthen the semantic understanding of the scene [39]. (iii) Finally, structured knowledge about operating room organization will be encoded into a scene-graph representation [40] derived from documented layout standards and workflow guidelines [41]. This scene graph will act as a structural prior, embedding functional relationships between devices, sterile areas, and spatial organization directly into the reconstruction process.
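To make the scene-graph prior concrete, here is a minimal sketch of how such a representation might look and how a layout constraint could be checked against it; the node attributes, relation names, and sterility rule are illustrative assumptions, not taken from the cited layout standards:

```python
# Minimal operating-room scene graph used as a structural prior.
# Nodes carry a sterility zone; edges encode spatial/functional relations.
# All names below are illustrative assumptions.
SCENE_GRAPH = {
    "nodes": {
        "operating_table": {"zone": "sterile"},
        "instrument_tray": {"zone": "sterile"},
        "anesthesia_cart": {"zone": "non_sterile"},
    },
    "edges": [
        ("instrument_tray", "adjacent_to", "operating_table"),
        ("anesthesia_cart", "head_of", "operating_table"),
    ],
}

def violates_sterility(graph: dict) -> list:
    """Flag adjacency edges that mix sterile and non-sterile equipment."""
    flagged = []
    for src, rel, dst in graph["edges"]:
        if rel == "adjacent_to":
            zones = {graph["nodes"][src]["zone"], graph["nodes"][dst]["zone"]}
            if zones == {"sterile", "non_sterile"}:
                flagged.append((src, dst))
    return flagged

print(violates_sterility(SCENE_GRAPH))  # the reference layout is consistent

# A layout that parks the non-sterile cart next to the sterile tray is flagged:
bad_graph = {
    "nodes": SCENE_GRAPH["nodes"],
    "edges": SCENE_GRAPH["edges"]
    + [("anesthesia_cart", "adjacent_to", "instrument_tray")],
}
print(violates_sterility(bad_graph))
```

In a reconstruction pipeline, such checks would act as soft constraints guiding the placement of reconstructed objects rather than hard validation.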
WP2: Dynamic reconstruction of surgical expertise and multimodal database construction
Summary: This work package focuses on capturing, structuring, and modeling surgical expertise in a dynamic and multimodal manner. Its goal is to create a robust dataset and a hierarchical representation of surgical behavior that captures gestures, actions, and procedural activities, reflecting the adaptive and context-dependent nature of surgery. By extending the capture setup developed in WP1, we aim to acquire high-quality, multimodal recordings of real surgical procedures, together with their associated dynamic reconstructions, that form the foundation for learning.
(i) We will extend the capture setup developed in WP1 and integrate first-person surgical recordings in collaboration with Pr. Patrick NATAF. The capture protocol will combine egocentric and exocentric video, audio streams, and structured textual annotations to build a structured database in the spirit of [42, 43]. We will design a rigorous acquisition and curation protocol that includes data filtering, segmentation, summarization, and anonymization [42, 43]. These tasks will rely on fine-tuned foundation models such as SAM, DINO, YOLO, and LLaVA. Unlike industrial workflows, surgery is adaptive and context-sensitive; surgical practice will therefore be structured at three levels: gesture (motor movement), action (a set of gestures), and activity (a procedural phase, i.e., a set of actions). Our methodology will capture strategy changes, medical-team interactions, and intraoperative decisions. (ii) The 3DGS reconstruction obtained in WP1 will be extended to a dynamic representation. (iii) The resulting dataset will be multimodal, structured, and suitable for conditioning the generative framework in WP3.
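The three-level structuring of surgical practice (gesture / action / activity) naturally maps onto nested, timestamped records; a minimal sketch, in which the class and field names are illustrative assumptions rather than a fixed annotation schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Gesture:
    """Elementary motor movement, timestamped in seconds."""
    label: str
    t_start: float
    t_end: float

@dataclass
class Action:
    """Coherent set of gestures (e.g., one suture throw)."""
    label: str
    gestures: List[Gesture] = field(default_factory=list)

@dataclass
class Activity:
    """Procedural phase: an ordered set of actions."""
    label: str
    actions: List[Action] = field(default_factory=list)

    def duration(self) -> float:
        """Span from the first gesture start to the last gesture end."""
        starts = [g.t_start for a in self.actions for g in a.gestures]
        ends = [g.t_end for a in self.actions for g in a.gestures]
        return max(ends) - min(starts) if starts else 0.0

# Toy annotation: one anastomosis phase containing a single suturing action.
anastomosis = Activity("anastomosis", [
    Action("suturing", [
        Gesture("needle_pass", 0.0, 2.5),
        Gesture("knot_tying", 2.5, 6.0),
    ]),
])
print(anastomosis.duration())  # 6.0
```

Hierarchical records of this kind can later be flattened into the temporally aligned annotation streams needed to condition the generative models of WP3.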
WP3: Expert-driven generative scenario synthesis
Summary: This work package aims to develop a knowledge-augmented generative framework capable of producing controllable, temporally coherent, and semantically grounded 3D surgical scenarios. The objective is to transform recent advances in diffusion and foundation models into clinically meaningful simulation tools driven directly by expert-level structured prompts. The developed model will be deployed on XR devices for user testing.
Our contribution focuses on developing a knowledge- and constraint-augmented generative framework for open cardiac surgery, unifying recent advances in diffusion-based 3D and video generation with structured expert knowledge. The approach combines latent diffusion video and 3D generators [24, 44, 45, Genie3] with constraint-aware conditioning mechanisms derived from WP2, allowing the generation of temporally coherent, anatomically faithful, and procedurally consistent surgical scenarios. Surgical ontologies and documented procedural guidelines are embedded into the architecture to enforce semantic and causal consistency, while Visual Foundation Models (SAM, MedSAM, CLIP) provide anatomical grounding and large language models (GPT-4, Med-PaLM 2, Claude 3) interpret structured expert prompts to enable interactive, controllable scenario generation. By generating accurate 3D and video simulations, we enrich our dataset for model training and evaluation, and close collaboration with cardiac surgeons ensures the clinical validity and realism of the results. This framework addresses the limitations of existing generative models, which are typically probabilistic, not anatomically grounded, and lacking in expert-controllable mechanisms. It represents a first step toward interactive, XR-based surgical training and planning that integrates expert knowledge, patient-specific constraints, and multimodal generative AI into a unified system.
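One way structured expert prompts could be turned into a conditioning signal for the generator is to validate them against a small procedural vocabulary and serialize them deterministically; the schema, field names, and allowed phases below are hypothetical assumptions, not an existing API:

```python
import json

# Hypothetical vocabulary of procedural phases for a bypass scenario.
ALLOWED_PHASES = {"sternotomy", "graft_harvest", "anastomosis", "closure"}

def build_scenario_prompt(phase: str, anatomy: dict, constraints: list) -> str:
    """Validate an expert request and serialize it as a conditioning string."""
    if phase not in ALLOWED_PHASES:
        raise ValueError(f"unknown procedural phase: {phase}")
    prompt = {
        "procedure": "coronary_artery_bypass",
        "phase": phase,
        "anatomy": anatomy,          # e.g., patient-specific variations
        "constraints": constraints,  # e.g., ordering or sterility rules
    }
    # Deterministic serialization so identical requests condition identically.
    return json.dumps(prompt, sort_keys=True)

cond = build_scenario_prompt(
    "anastomosis",
    {"coronary_variant": "left_dominant"},
    ["maintain_sterile_field"],
)
print(cond)
```

In the envisioned system, the structured fields would be produced by a large language model from the surgeon's free-form request, then checked against the surgical ontology before conditioning the diffusion generator.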
The candidate profile
- Dual bachelor's degree (L3) in biology and computer science
- Strong motivation for interdisciplinary research at the interface between the life sciences and advanced computational methods
- Solid expertise in computer vision, AI, and immersive technologies
- Ability to conduct applied research in complex technological environments
Published on 03/04/2026 - Ref: 2d9cb1c1be161042ec571db4b4d4fd55