Privacy-friendly Image Preparation for TinyML

What is it?

We develop an image processing workflow that anonymizes faces in order to enable a privacy-friendly AI model generation process. The method is intended for TinyML-based pose estimation use cases, where the target hardware is a microcontroller unit with limited computational resources.

Face anonymization is a computer vision technique that removes identifying facial information from digital images. Our goal is an anonymization method that keeps the data private without corrupting the machine learning model trained on it.
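For reference, the most basic form of face anonymization simply detects a face region and blurs it. The minimal OpenCV sketch below (the file names and the Haar cascade detector are illustrative choices, not part of our pipeline) shows this baseline, which removes identity but also destroys the facial structure and context that a generative replacement approach tries to preserve.

```python
# Baseline anonymization sketch: detect faces and blur them.
# Assumes OpenCV (cv2) and an input file "input.jpg"; both are
# illustrative choices, not part of the project's actual pipeline.
import cv2

img = cv2.imread("input.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    roi = img[y:y + h, x:x + w]
    # A heavy Gaussian blur destroys identity, but also most facial
    # structure, which is exactly what a generative replacement avoids.
    img[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)

cv2.imwrite("anonymized.jpg", img)
```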

Why is it necessary?

With the rising importance of data protection and individual privacy, face anonymization has drawn increased attention from industry. Since the introduction of the General Data Protection Regulation (GDPR) by the European Union in May 2018, privacy protection has become an indispensable task for research fields, institutions, and companies that work with personal data. The GDPR requires explicit consent from individuals before their privacy-sensitive data can be used, which makes it difficult to work with these types of resources. However, the regulation leaves room for non-consensual use of such images if the individual is unrecognizable.

How does it work?

Our method uses a denoising diffusion probabilistic model (DDPM) to generate images of faces, as this type of neural network has been shown to achieve remarkable results in image generation. The model is based on the public Huggingface implementation and features a U-Net architecture. Our framework of choice is PyTorch.
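As an illustration, a single DDPM training step with a U-Net noise predictor can be sketched as follows using the Hugging Face diffusers library. The architecture size and hyperparameters are placeholders under these assumptions, not our actual configuration.

```python
# Minimal DDPM training step, sketched with Hugging Face diffusers and PyTorch.
# Model size and hyperparameters are illustrative placeholders.
import torch
import torch.nn.functional as F
from diffusers import UNet2DModel, DDPMScheduler

model = UNet2DModel(
    sample_size=256,                          # FDF256 images are 256x256
    in_channels=3,
    out_channels=3,
    layers_per_block=2,
    block_out_channels=(64, 128, 256, 256),   # placeholder channel widths
)
scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def training_step(images: torch.Tensor) -> torch.Tensor:
    """One DDPM step: add noise at a random timestep, predict that noise."""
    noise = torch.randn_like(images)
    timesteps = torch.randint(
        0, scheduler.config.num_train_timesteps, (images.shape[0],),
        device=images.device,
    )
    noisy_images = scheduler.add_noise(images, noise, timesteps)
    noise_pred = model(noisy_images, timesteps).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss
```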

The goal of this model is not only to generate facial images, but to replace existing ones so that the generated face fits the context of the original image. For this, we condition the model on facial keypoints, i.e. facial landmarks such as the ears, eyes, and nose. For model training, we chose the Flickr Diverse Faces 256 (FDF256) dataset, as it contains 248,564 high-resolution (256x256) face images with 7 annotated keypoints each. For more controllable image generation, we add classifier-free guidance to the model, and for better context fitting, we add reconstruction guidance.
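To illustrate how classifier-free guidance works at sampling time, the sketch below combines the keypoint-conditioned and unconditional noise predictions of the same network. The keypoint-conditioned model interface and the guidance scale shown here are hypothetical, not our exact API.

```python
# Sketch of classifier-free guidance for a keypoint-conditioned denoiser.
# `model` is assumed to accept a keypoint tensor (or None for the
# unconditional branch); this interface is hypothetical.
import torch

def cfg_noise_prediction(model, x_t, t, keypoints, guidance_scale=3.0):
    # Unconditional prediction: keypoint conditioning dropped.
    # (During training, conditioning is randomly dropped, e.g. 10% of the
    # time, so a single network learns both branches.)
    eps_uncond = model(x_t, t, keypoints=None)
    # Conditional prediction: guided by the facial keypoints of the original face.
    eps_cond = model(x_t, t, keypoints=keypoints)
    # Classifier-free guidance: push the prediction toward the conditional branch.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```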

We aim to construct a model that is as computationally efficient as possible, so that it can run in a TinyML configuration. Optionally, we apply model distillation to reduce the memory and computation footprint of the model at inference time.
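As a sketch of what such a distillation step could look like, a smaller student network can be trained to match the noise predictions of the full teacher model on the same noisy inputs. The function below is illustrative only and assumes diffusers-style models; it is not our actual distillation procedure.

```python
# Distillation sketch: a smaller student U-Net learns to reproduce the
# teacher's noise predictions. Model names and sizes are illustrative.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, images, scheduler, optimizer):
    noise = torch.randn_like(images)
    timesteps = torch.randint(
        0, scheduler.config.num_train_timesteps, (images.shape[0],),
        device=images.device,
    )
    noisy_images = scheduler.add_noise(images, noise, timesteps)

    with torch.no_grad():
        target = teacher(noisy_images, timesteps).sample   # teacher prediction
    pred = student(noisy_images, timesteps).sample          # smaller student

    # The student mimics the teacher's output, which allows a reduced
    # memory and compute footprint at inference time.
    loss = F.mse_loss(pred, target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss
```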