
PhD Position F/M Trustworthy AI-driven interpretation of malware attack behaviours - INRIA

  • Rennes - 35
  • Fixed-term contract (CDD)
  • Partial remote work
  • 36 months

Job details

PhD Position F/M Trustworthy AI-driven interpretation of malware attack behaviours
The offer description below is in English.
Contract type: Fixed-term contract (CDD)

Required degree level: Master's degree (Bac+5) or equivalent

Position: PhD student

Desired experience level: Recent graduate

About the centre or functional department

The Inria Centre at Rennes University is one of Inria's nine centres and has more than thirty research teams. The Inria Centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, a technological research institute, etc.

Assigned mission

Background

The rapid evolution of cyber threats, particularly sophisticated malware attacks, has increasingly outpaced traditional detection and analysis methods, necessitating the adoption of advanced artificial intelligence (AI) techniques in cybersecurity. While AI-driven models, especially deep learning and explainable AI (XAI), have shown promise in automating malware detection and interpretation, their deployment in high-stakes environments raises critical concerns regarding reliability, transparency, and trustworthiness. Many state-of-the-art models operate as "black boxes," making it difficult for security analysts to understand, validate, or act upon their predictions, which undermines operational confidence and hinders adoption in real-world defense systems. Furthermore, these models themselves are vulnerable to adversarial manipulations, data poisoning, and concept drift, potentially leading to erroneous or misleading interpretations. As a result, there is a growing imperative to develop AI-based malware analysis frameworks that are not only accurate and adaptive but also interpretable, robust, and aligned with human expertise. This research addresses the foundational challenge of building trustworthy AI systems for malware attack interpretation, aiming to bridge the gap between automated intelligence and human-in-the-loop cybersecurity operations.

The objective of this thesis

This PhD thesis will be funded by the ANR PEPR project DefMal. The thesis will focus on three missions, described below.

Mission 1: Self-Supervised Learning for Unsupervised Grouping of Malware Behaviors

Traditional malware clustering and family attribution heavily rely on labeled datasets, which are costly to produce, quickly become outdated, and often fail to capture the full spectrum of evolving threats. This mission focuses on leveraging self-supervised learning (SSL) to automatically discover meaningful behavioral patterns in unlabeled malware data, such as network traffic flows from botnets or dynamic execution traces from sandboxed malware, without requiring human-annotated labels.

From a Trustworthy AI perspective, this approach enhances reliability and scalability by reducing dependence on potentially biased or incomplete ground truth. By learning representations directly from raw or minimally processed behavioral data (e.g., system call sequences, API logs, or packet timing features), SSL models can uncover latent structures that reflect real-world attack tactics, such as lateral movement, persistence mechanisms, or command-and-control (C2) communication patterns. Crucially, the learned clusters must be interpretable, not just statistically coherent, so that analysts can understand why certain samples are grouped together. To achieve this, we will integrate contrastive learning frameworks with behavioral feature engineering and post-hoc explanation techniques, enabling human analysts to validate and refine the groupings. This mission thus lays the foundation for trustworthy, label-free malware intelligence that supports proactive threat hunting and early detection of emerging campaigns.
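As an illustration of the kind of contrastive SSL setup this mission envisions, the sketch below trains a small PyTorch encoder on syscall-ID sequences with a SimCLR-style NT-Xent loss, treating two random crops of the same trace as positive views. The encoder architecture, vocabulary size, and augmentation are illustrative assumptions, not the project's actual pipeline.

    # A minimal sketch, assuming syscall-ID sequences as input: a small
    # encoder trained with a SimCLR-style NT-Xent contrastive loss, where
    # two random crops of the same trace are treated as positive views.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, EMB, HID, PROJ = 512, 64, 128, 32   # hypothetical sizes

    class TraceEncoder(nn.Module):
        """Embeds a syscall-ID sequence and mean-pools it into one vector."""
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.gru = nn.GRU(EMB, HID, batch_first=True)
            self.head = nn.Sequential(nn.Linear(HID, HID), nn.ReLU(),
                                      nn.Linear(HID, PROJ))
        def forward(self, x):                   # x: (batch, seq_len) int64
            h, _ = self.gru(self.emb(x))        # (batch, seq_len, HID)
            return self.head(h.mean(dim=1))     # (batch, PROJ)

    def nt_xent(z1, z2, tau=0.5):
        """NT-Xent: matching views attract, every other sample repels."""
        b = z1.size(0)
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)
        sim = (z @ z.t()) / tau
        sim = sim.masked_fill(torch.eye(2 * b, dtype=torch.bool), float("-inf"))
        target = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)])
        return F.cross_entropy(sim, target)

    def random_crop(x, keep=0.8):
        """Toy augmentation: keep a random contiguous window of each trace."""
        w = int(x.size(1) * keep)
        start = torch.randint(0, x.size(1) - w + 1, (1,)).item()
        return x[:, start:start + w]

    encoder = TraceEncoder()
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    traces = torch.randint(0, VOCAB, (16, 100))  # stand-in unlabeled batch
    loss = nt_xent(encoder(random_crop(traces)), encoder(random_crop(traces)))
    loss.backward()
    opt.step()
    print(f"contrastive loss: {loss.item():.3f}")

After training, the pooled embeddings could be clustered (e.g., with k-means or HDBSCAN) and each cluster inspected via post-hoc explanations, matching the analyst validation loop described above.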

Mission 2: Interpreting and Classifying Malware Sandbox Traces Using AI Models (e.g., Large Language Models)

Sandbox-generated execution traces, comprising sequences of system calls, file operations, registry changes, and network activity, are rich sources of behavioral insight. However, their high dimensionality, noise, and variability make automated analysis challenging. This mission explores the adaptation of Large Language Models (LLMs) and other sequence-based AI architectures (e.g., Transformers, LSTMs) to interpret, summarize, and classify these traces, with a focus on detecting zero-day or novel attack behaviors.

Rather than treating traces as mere input sequences, we will frame malware behavior interpretation as a semantic understanding task, where LLMs are fine-tuned to recognize patterns analogous to "attack narratives" (e.g., privilege escalation → credential dumping → C2 beaconing). By pre-training on vast corpora of benign and malicious execution logs and incorporating domain-specific knowledge (e.g., MITRE ATT&CK tactics), these models can generate natural language explanations of observed behaviors, significantly improving transparency and analyst trust.
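As a minimal sketch of this kind of fine-tuning, assuming the Hugging Face transformers library, the snippet below adapts a small pretrained model to map a serialized sandbox trace to an ATT&CK-style tactic label. The model name, label set, and toy trace are assumptions for illustration only.

    # A minimal sketch, assuming the Hugging Face transformers library:
    # fine-tune a small pretrained model to map a serialized sandbox trace
    # to an ATT&CK-style tactic label. Model, labels, and trace are toys.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    TACTICS = ["privilege-escalation", "credential-access", "command-and-control"]
    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=len(TACTICS))

    # One trace rendered as text: one token group per observed event.
    trace = ("OpenProcess lsass.exe ; ReadProcessMemory ; "
             "CreateFile creds.dump ; connect 203.0.113.7:443")
    batch = tok(trace, return_tensors="pt", truncation=True)
    label = torch.tensor([1])                    # "credential-access" (toy)

    model.train()
    out = model(**batch, labels=label)           # returns loss and logits
    out.loss.backward()                          # one gradient step's worth
    print(TACTICS[out.logits.argmax(dim=-1).item()])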

A key challenge is ensuring that the model's predictions are faithful and actionable. We will therefore integrate explainability-by-design mechanisms, such as attention visualization, salient path extraction, and counterfactual reasoning, to allow analysts to interrogate the model's logic. For instance, if the model flags a trace as "likely ransomware," it should highlight the specific file encryption loops and mutex creation patterns that led to this conclusion. This mission advances Trustworthy AI by bridging the gap between complex model outputs and human-understandable cybersecurity decision-making, enabling faster response to previously unseen threats.
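One simple instance of the attention visualization mentioned above is sketched below: it runs a sequence classifier (untrained here, so the ranking is meaningless until fine-tuned) with attentions exposed, and ranks trace tokens by the attention the [CLS] position pays them in the last layer. Averaging heads this way is a common heuristic, not a faithfulness guarantee; all names are illustrative.

    # A minimal sketch of attention-based token highlighting, assuming
    # the same distilbert setup as the previous snippet.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=3)
    batch = tok("OpenProcess lsass.exe ; ReadProcessMemory ; "
                "connect 203.0.113.7:443", return_tensors="pt")

    model.eval()
    with torch.no_grad():
        out = model(**batch, output_attentions=True)
    attn = out.attentions[-1].mean(dim=1)[0, 0]  # heads averaged; [CLS] row
    tokens = tok.convert_ids_to_tokens(batch["input_ids"][0])
    for t, a in sorted(zip(tokens, attn.tolist()), key=lambda p: -p[1])[:5]:
        print(f"{t:>15s}  attention={a:.3f}")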

Mission 3: Enhancing AI Model Robustness Against Intentionally Modified Malware Behaviors

Malware authors routinely employ evasion techniques, such as code obfuscation, control-flow manipulation, and polymorphism, to bypass signature-based and even AI-driven detection systems. This mission addresses the robustness pillar of Trustworthy AI by developing defense mechanisms that maintain high detection accuracy even when adversaries attempt to perturb static features (e.g., packed binaries) or dynamic behaviors (e.g., delayed payload execution, randomized API calls).

We will investigate adversarial training, behavioral invariant learning, and anomaly detection under distribution shift to build models that focus on core malicious intent rather than superficial, easily manipulated features. For example, while a malware sample may alter its entry point or API call order, its underlying objective, such as injecting code into a remote process, may leave consistent semantic footprints in sandbox traces. By modeling these high-level attack semantics, we aim to create AI systems that are resilient to both known and unknown evasion strategies.
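A minimal sketch of one such defense follows, assuming an FGSM-style perturbation applied in embedding space (discrete syscall IDs cannot be perturbed directly): the model is trained on both the clean batch and an adversarially nudged copy. The toy classifier and random data are placeholders, not the project's method.

    # A minimal adversarial-training sketch: FGSM-style perturbation in
    # embedding space, then training on clean + perturbed views.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Clf(nn.Module):
        def __init__(self, vocab=512, emb=64, classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.head = nn.Linear(emb, classes)
        def from_emb(self, e):                  # e: (batch, seq, emb)
            return self.head(e.mean(dim=1))
        def forward(self, x):
            return self.from_emb(self.emb(x))

    model, eps = Clf(), 0.1
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randint(0, 512, (8, 50))          # stand-in trace batch
    y = torch.randint(0, 2, (8,))               # stand-in labels

    # 1) Find the loss-increasing direction in embedding space.
    e = model.emb(x).detach().requires_grad_(True)
    F.cross_entropy(model.from_emb(e), y).backward()
    delta = eps * e.grad.sign()                 # fixed worst-case nudge

    # 2) Train on both the clean and the perturbed view.
    opt.zero_grad()
    clean = F.cross_entropy(model(x), y)
    robust = F.cross_entropy(model.from_emb(model.emb(x) + delta), y)
    (clean + robust).backward()
    opt.step()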

Additionally, we will explore uncertainty quantification and confidence-aware prediction, so that the system can flag inputs where its decision is uncertain, potentially indicating adversarial tampering, thereby supporting human-in-the-loop validation. This mission ensures that the AI models are not only accurate under normal conditions but also reliable and defensible in adversarial settings, a critical requirement for deploying AI in real-world cybersecurity operations.
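As one concrete form of confidence-aware prediction, the sketch below uses Monte-Carlo dropout: dropout stays active at inference, and inputs whose predictions vary strongly across stochastic passes are flagged for analyst review. The network, features, and the 0.15 threshold are illustrative assumptions only.

    # A minimal sketch of uncertainty quantification via MC dropout.
    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                        nn.Dropout(0.3), nn.Linear(64, 2))

    def mc_predict(x, n=30):
        net.train()                 # keep dropout stochastic at test time
        with torch.no_grad():
            probs = torch.stack([net(x).softmax(dim=-1) for _ in range(n)])
        return probs.mean(dim=0), probs.std(dim=0)

    x = torch.randn(4, 32)          # stand-in behavioral feature vectors
    mean, std = mc_predict(x)
    for i in range(x.size(0)):
        flag = "FLAG FOR ANALYST" if std[i].max() > 0.15 else "ok"
        print(f"sample {i}: class={mean[i].argmax().item()} "
              f"p={mean[i].max():.2f} [{flag}]")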

Expectations

The candidate is expected to have completed courses on machine learning and/or to have experience implementing machine learning algorithms in Python for practical data mining problems. In particular, proficiency with PyTorch is required for the project. Theoretical contributions are also expected, grounded in statistics and the theory of machine learning and approximation. Knowledge of intrusion detection systems and/or malware analysis is preferred.

Main activities

This thesis will be conducted at INRIA Rennes and co-supervised with the security researchers of Eurecom at Sophia-Antipolis and the University of Genova in Italy. Applicants should submit their resume, cover letter, and letters of recommendation online.

Skills

Technical skills and level required: machine learning (beginner/intermediate), PyTorch programming (proficient), and intrusion detection and malware classification (beginner/intermediate)

Languages: English (fluent in speaking, and in reading and writing scientific papers)

Benefits

- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage

Remuneration

Monthly gross salary: 2200 euros

About Inria

Inria is the French national research institute for digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally joint with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent across more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society, and the economy.

Published on 09/10/2025 - Ref: 9df9ffae121dce821ecf242518b71dab
