No salary specified
PhD Thesis: Image Compression with Controlled-Overfitting Neural Networks for Embedded FPGA and ASIC Hardware Platforms (M/F)
Institut Polytechnique de Paris Télécom Paris
- Paris (75)
- Fixed-term contract (CDD)
- Master's degree (Bac+5)
- State public service
Job description
Learned Image Compression (LIC) replaces the traditional hand-designed transform and prediction stages with neural codecs trained end to end. In these systems, an encoder projects the image into a compact latent representation, which is then quantized and entropy-coded into a bitstream, while a decoder reconstructs the image from the decoded latent features [1]. LIC now achieves rate-distortion performance competitive with modern compression standards such as H.266/VVC and its AI-enhanced hybrid successor NNVC [2], motivating major international standardization efforts, notably JPEG AI [3], which explicitly targets deployable learned compression systems with controlled decoder complexity and predictable hardware cost [3], [7].
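As a rough illustration of the end-to-end objective behind such codecs (a generic sketch, not code from the cited systems), training minimizes a rate-distortion Lagrangian L = R + λD, where the rate is estimated from an entropy model over the quantized latent. The toy version below uses uniform scalar quantization and a single factorized Gaussian entropy model; all names and shapes are illustrative assumptions.

```python
import numpy as np
from math import erf

def rd_loss(latent, target, recon, lam=0.01):
    """Toy rate-distortion objective L = R + lambda * D.

    Rate R (bits per latent element) is estimated from a factorized
    Gaussian entropy model over the uniformly quantized latent;
    distortion D is the MSE between target and reconstruction.
    Purely illustrative, not the loss of any cited codec.
    """
    y_hat = np.round(latent)                       # uniform scalar quantization
    mu, sigma = latent.mean(), latent.std() + 1e-9

    def cdf(x):                                    # CDF of N(mu, sigma)
        return 0.5 * (1 + erf((x - mu) / (sigma * np.sqrt(2))))

    cdf_v = np.vectorize(cdf)
    # Probability mass of each quantized symbol under the Gaussian model:
    p = np.clip(cdf_v(y_hat + 0.5) - cdf_v(y_hat - 0.5), 1e-12, 1.0)
    rate = -np.log2(p).sum() / latent.size         # estimated bits/element
    dist = np.mean((target - recon) ** 2)          # MSE distortion
    return rate + lam * dist
```

In a real LIC training loop, quantization is replaced by a differentiable proxy (e.g. additive uniform noise) so the rate term can be backpropagated.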
Despite these advances, deployability remains a major limitation. Decoder complexity in learned codecs typically ranges from roughly 80 to 380 kMAC per pixel [4], [5], with further increases in recent architectures due to more expressive transforms and higher memory traffic [6]. These constraints are especially critical for embedded platforms, where decoder latency, memory bandwidth, and energy consumption directly determine deployment feasibility. Closing the gap between compression performance and deployability is therefore a central challenge.
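To make the kMAC/pixel figure concrete, one can count decoder multiply-accumulates per full-resolution output pixel. The sketch below assumes a hypothetical 4-stage upsampling decoder (channel counts, kernel sizes, and downsampling factors are illustrative, not taken from any cited architecture); a layer whose feature map is downsampled by factor s per dimension contributes 1/s² of its per-pixel cost.

```python
def conv_mac_per_out_pixel(c_in, c_out, k, s):
    """MACs per full-resolution output pixel for a k x k convolution
    whose output feature map is downsampled by factor s per dimension."""
    return c_in * c_out * k * k / (s * s)

# Hypothetical 4-stage upsampling decoder (illustrative shapes):
# (c_in, c_out, kernel, output downsampling factor)
layers = [(192, 192, 5, 8), (192, 192, 5, 4), (192, 96, 5, 2), (96, 3, 5, 1)]
total = sum(conv_mac_per_out_pixel(*l) for l in layers)
print(f"{total / 1e3:.1f} kMAC/pixel")  # → 194.4 kMAC/pixel
```

The result lands squarely in the 80-380 kMAC/pixel range quoted above, and shows why the late, full-resolution layers dominate the budget.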
This thesis proposes to address this challenge by developing learned compression models that explicitly integrate hardware constraints from the design stage (a hardware-aware approach). The work will build on recent advances in parameter-efficient learned compression and per-image overfitted compression [8]-[11], and will be validated through an FPGA implementation, with ASIC deployment as a longer-term goal. This research directly extends our recent state-of-the-art contributions to the deployment and optimization of learned compression, including two recent papers published in (or under revision for) IEEE Transactions on Multimedia (TMM), demonstrating efficient FPGA integration and hardware-constraint-aware codec design [16], [17].
Since the introduction of end-to-end learned compression [1], research has focused primarily on improving rate-distortion performance through increasingly expressive models. While this has led to compression performance comparable to modern standards [2], it has also resulted in architectures whose computational and memory requirements significantly complicate deployment. JPEG AI explicitly emphasizes decoder simplicity, architectural regularity, and hardware compatibility as core requirements for practical adoption [3], [7].
Existing hardware implementations demonstrate that learned compression can be deployed efficiently under carefully optimized conditions [12]-[15]. However, these implementations largely rely on adapting architectures originally developed without hardware constraints. As a result, achieving deployability requires extensive post hoc optimization, including pruning, quantization, and architectural simplification, which improve efficiency but do not fundamentally address the relationship between codec structure and hardware cost.
Our prior work has systematically explored this optimization space and demonstrated that substantial hardware efficiency gains can be achieved through targeted hardware-aware training and model adaptation [16], [17]. Specifically, we developed a knowledge distillation framework tailored to learned compression, in which knowledge is transferred not only at the reconstruction level but also at the entropy modeling level. This includes distillation of latent distributions to preserve entropy coding efficiency, combined with adaptive reconstruction transfer to maintain distortion performance. This approach enabled the training of significantly more efficient student models while preserving rate-distortion performance.
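One hedged way to picture the two-level transfer described above (a generic KL-plus-MSE sketch, not the published loss from [16], [17]): the student's latent symbol distribution is pulled toward the teacher's to preserve entropy coding efficiency, alongside a reconstruction-transfer term.

```python
import numpy as np

def distill_loss(p_teacher, p_student, x_teacher, x_student, alpha=0.5):
    """Illustrative two-level distillation objective: KL divergence between
    teacher and student latent symbol distributions (entropy-model transfer)
    plus MSE between their reconstructions (reconstruction transfer).
    The form and the alpha weighting are assumptions, not the loss of
    [16], [17]."""
    p_t = np.clip(p_teacher, 1e-12, 1.0)
    p_s = np.clip(p_student, 1e-12, 1.0)
    kl = np.sum(p_t * np.log(p_t / p_s))           # entropy-level distillation
    mse = np.mean((x_teacher - x_student) ** 2)    # reconstruction transfer
    return alpha * kl + (1 - alpha) * mse
```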
In addition, we introduced mixed-precision quantization strategies that selectively assign higher precision to entropy-sensitive and reconstruction-critical layers, while aggressively quantizing less sensitive components. This allows efficient hardware implementation without compromising compression performance. Combined with structured pruning, depthwise convolution substitution, and selective freezing of encoder components to stabilize latent representations, these methods enabled substantial reductions in computational complexity and memory footprint.
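A minimal illustration of the per-layer precision assignment idea (the bit-widths and the sensitivity flagging rule are assumptions for the sketch, not the published scheme): sensitive layers keep higher precision while the rest are quantized aggressively.

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Uniform symmetric quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    wmax = np.abs(w).max()
    scale = wmax / qmax if wmax > 0 else 1.0
    return np.round(w / scale).clip(-qmax, qmax) * scale

def mixed_precision(layers, sensitive, hi=8, lo=4):
    """Assign higher precision to layers flagged as entropy- or
    reconstruction-sensitive; quantize the rest aggressively.
    The hi/lo bit-widths are illustrative choices."""
    return {name: quantize_symmetric(w, hi if name in sensitive else lo)
            for name, w in layers.items()}
```

In practice the sensitivity set would be derived from measured rate-distortion degradation per layer rather than fixed by hand.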
At the hardware level, we further demonstrated efficient FPGA deployment through hardware-aware pipelining, optimized memory access scheduling, and quantization-aware design, enabling real-time execution of learned compression models under realistic hardware constraints [16], [17]. These results establish state-of-the-art deployment efficiency and confirm that learned compression can be successfully integrated into hardware-constrained environments.
However, these results also highlight a fundamental limitation: such optimizations operate within the structural constraints of existing architectures. While pruning, quantization, knowledge distillation, and architectural simplification can significantly improve hardware efficiency, they do not fundamentally alter the underlying codec structure, which was originally designed without hardware constraints. As compression architectures evolve and increase in complexity [6], the effectiveness of incremental optimization alone becomes increasingly limited.
Recent advances in parameter-efficient and overfitted learned compression introduce a fundamentally different perspective [8]-[11]. These approaches demonstrate that compression performance can be achieved using compact parameterizations optimized per image, enabling structurally simpler decoding and introducing new degrees of freedom in codec design. From a hardware perspective, this suggests that compression models themselves can be designed to align with hardware constraints, rather than adapting hardware to existing architectures.
This motivates a new research direction: learned compression design from first principles, in which hardware constraints are treated as fundamental design parameters rather than implementation afterthoughts.
The overall objective is to develop a hardware-aware framework for overfitted learned image compression that reconciles compression efficiency, implementation simplicity, and industrial deployment constraints.
More specifically, the PhD aims to:
O1. Hardware-constrained codec formulation
Develop learned compression formulations in which decoder complexity, memory access, and parameter count are explicitly bounded during model design and training. This includes defining model parameterizations and training objectives that allow direct control over decoder MAC/pixel, memory bandwidth, and numerical precision requirements.
O2. Hardware-aware parameterization and optimization strategies
Design and evaluate parameter-efficient codec representations, including structured parameterizations and constrained optimization schemes inspired by recent overfitted and parameter-efficient learned compression approaches, with the goal of minimizing decoder hardware cost while preserving rate-distortion performance.
O3. Predictive hardware cost modeling and codec dimensioning
Develop analytical and empirical models linking compression model properties (parameter count, layer structure, latent dimensionality, precision) to hardware performance metrics including throughput, latency, memory bandwidth, and energy consumption. These models will enable principled dimensioning of learned codecs to meet explicit deployment targets.
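As a starting point for the kind of analytical model O3 calls for, a roofline-style estimate predicts latency as the maximum of the compute-bound and bandwidth-bound times. The platform constants below are placeholder assumptions, not measurements of any specific FPGA.

```python
def predict_latency_ms(macs, bytes_moved, peak_gmacs=500.0, bw_gbs=12.8):
    """Roofline-style latency estimate for a layer or a whole decoder:
    execution is either compute-bound or memory-bound, whichever is worse.
    peak_gmacs (GMAC/s) and bw_gbs (GB/s) are placeholder platform
    constants, not measured values."""
    t_compute = macs / (peak_gmacs * 1e9)        # seconds if compute-bound
    t_memory = bytes_moved / (bw_gbs * 1e9)      # seconds if memory-bound
    return max(t_compute, t_memory) * 1e3        # milliseconds

# Hypothetical decoder on a 1080p frame: 200 kMAC/pixel, 150 MB of traffic.
pixels = 1920 * 1080
lat = predict_latency_ms(200e3 * pixels, 150e6)
print(f"{lat:.1f} ms/frame")  # → 829.4 ms/frame
```

Even this crude model makes the dimensioning problem visible: at these assumed constants the decoder is compute-bound and far from real time, so either the MAC budget or the platform must change.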
O4. FPGA-based hardware validation and co-design
Implement and evaluate hardware-aware learned compression codecs on FPGA platforms to validate real-time feasibility and quantify system-level performance. This includes integration of quantization-aware training, structured model optimization, and hardware-efficient execution strategies building on prior work.
M1. Hardware-constrained learned compression formulation (O1)
We will reformulate learned compression under explicit hardware constraints by controlling latent dimensionality, entropy model complexity, operator structure, and numerical precision. Unlike conventional LIC, which optimizes only rate-distortion performance [1], training will incorporate constraints on decoder MAC/pixel, parameter count, and memory footprint. Building on our prior work [16], [17], quantization-aware training and entropy-preserving knowledge distillation will be used to maintain compression efficiency while enforcing hardware-compatible model structures.
M2. Hardware-aware parameterization and training (O2)
We will investigate parameter-efficient and structured codec parameterizations inspired by recent learned compression advances [8]-[11], prioritizing hardware-efficient execution. This includes structured latent representations, operator substitutions such as depthwise and grouped convolution, and structured parameter layouts to reduce computation and memory traffic. Training strategies developed in our prior work [16], [17], including entropy-level knowledge distillation, adaptive reconstruction transfer, selective encoder freezing, and mixed-precision quantization, will be extended to explicitly optimize hardware-relevant metrics while preserving rate-distortion performance.
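The MAC savings from the depthwise substitution mentioned above can be checked with a quick count (the layer shape is illustrative): a depthwise k×k convolution followed by a 1×1 pointwise mixing layer replaces the full c_in·c_out·k² cost.

```python
def standard_conv_macs(c_in, c_out, k):
    """MACs per output pixel of a standard k x k convolution."""
    return c_in * c_out * k * k

def depthwise_separable_macs(c_in, c_out, k):
    """MACs per output pixel: depthwise k x k per channel,
    then 1 x 1 pointwise channel mixing."""
    return c_in * k * k + c_in * c_out

std = standard_conv_macs(192, 192, 5)        # 921600 MACs/pixel
dws = depthwise_separable_macs(192, 192, 5)  # 4800 + 36864 = 41664
print(f"reduction: {std / dws:.1f}x")        # → reduction: 22.1x
```

The pointwise term dominates, which is why grouped 1×1 convolutions are a natural next substitution when channel counts are large.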
M3. Predictive hardware cost modeling and dimensioning (O3)
Analytical and empirical models will be developed to relate codec structure to hardware execution cost, including MAC/pixel, memory bandwidth, latency, and resource utilization. These models will enable principled dimensioning of learned compression models to meet deployment constraints such as real-time throughput and energy budgets, extending established complexity analysis methodologies [5].
M4. FPGA implementation and hardware-algorithm co-design (O4)
Hardware-aware compression models will be implemented on FPGA platforms to validate deployability and quantify system-level performance. Implementations will integrate fixed-point and mixed-precision quantization, pipelined execution, and optimized memory access scheduling, building on prior FPGA deployment work [16], [17] and established learned compression accelerators [12]-[15]. Experimental measurements will guide iterative refinement of codec structure and training, enabling systematic co-design of compression models and hardware execution.
Candidate profile
- Experience in embedded systems, FPGA design, and performance optimization and analysis
- Expertise in deep learning and learned image compression, with a solid grounding in information theory
About Institut Polytechnique de Paris Télécom Paris
Doctoral school: Ecole Doctorale de l'Institut Polytechnique de Paris
Research laboratory: Laboratoire de Traitement et Communication de l'Information
Thesis supervisor: Lirida NAVINER (ORCID 0000-0002-6320-4153)
Thesis start date: 01/10/2026
Application deadline: 14/04/2026, 23:59
Published 17/03/2026 - Ref: 4d6c632ffddf175f65e0756d748fb4c4