
This project aims to develop approaches for the detection of fake images, with a particular focus on those generated by models such as Stable Diffusion. Several libraries were used in this project, including TensorFlow, Keras and PyTorch.


Malekbennabi3/DeepFake_Detection_TER


Objectives of the project:

The main objective of this work is to propose deep learning-based methods for binary image classification (True/Fake) and then to evaluate the performance of these implemented approaches.

The implemented approaches

In this project we implemented three distinct approaches, each based on a specific technique:

  • Direct approach (CNN)
  • Frequency approach (Fourier spectrum)
  • Dimensionality reduction approach (Auto-encoder)

Datasets

We used 5 main datasets to perform this work:

  • For the real images

  • For the fake datasets

    • Deep Fake Face (DFF), available on Hugging Face:

      • Inpainting: Inpainting stable diffusion
      • Text2image: Stable diffusion v1.5
      • Insight: Toolbox InsightFace
    • 140k Real and Fake faces (Available on Kaggle)

Datasets distribution with PCA

PCA data distribution (figure)

The data were displayed to verify that the different datasets are interlaced and not linearly separable, so that the model must differentiate them by image characteristics.
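A check of this kind can be sketched with a plain PCA projection of flattened image features onto the first two principal components (numpy only; the feature extraction and plotting of the actual project are omitted, and the random features here are just placeholders):

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project feature vectors onto their first principal components."""
    centered = features - features.mean(axis=0)           # centre the data
    # SVD of the centred matrix: rows of vt are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T                 # low-dimensional coordinates

# Toy example: 100 random "image feature" vectors of dimension 64
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 64))
coords = pca_project(features)
print(coords.shape)  # (100, 2)
```

Plotting `coords` with one colour per dataset then shows whether the clouds overlap or separate.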

Approaches

  • Direct approach: This approach is based on training a convolutional neural network (CNN) on images from the inpainting and wiki datasets. The goal is to build a classifier capable of distinguishing between real and fake images. To achieve this, we opted for fine-tuning CNN models based on pre-trained architectures.
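Fine-tuning of this kind typically freezes a pre-trained backbone and trains only a new classification head. The following PyTorch sketch illustrates the pattern; the tiny backbone, layer sizes and hyper-parameters are stand-in assumptions, not the project's actual configuration (in practice the backbone would be a pre-trained torchvision model):

```python
import torch
import torch.nn as nn

# Stand-in "pre-trained" convolutional backbone (placeholder for a real one)
backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

# Freeze the backbone so only the new head is trained
for p in backbone.parameters():
    p.requires_grad = False

head = nn.Linear(32, 1)              # single logit: real vs fake
model = nn.Sequential(backbone, head)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# One toy training step on random tensors standing in for a real batch
images = torch.randn(4, 3, 64, 64)
labels = torch.randint(0, 2, (4, 1)).float()
logits = model(images)
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
print(logits.shape)  # torch.Size([4, 1])
```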

  • Fourier approach: For this approach, we selected DenseNet_V2 because of its ability to converge quickly. The calculation of the Fourier transform enables us to capture specific features that are often present in deepfakes but invisible in the spatial domain.

    To carry out this method we generated a dataset by following the steps below:

    • Conversion of each image to greyscale
    • Application of the 2D Fourier transform to the greyscale image
    • Centring of the zero frequency (for easier viewing)
    • Calculation of the magnitude spectrum (logarithm of the absolute value of the Fourier transform)
    • Recording of the magnitude images: the resulting magnitude-spectrum images were saved and used as input data for the DenseNet_V2 model.
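The transform steps above can be sketched with numpy alone (the greyscale conversion, resizing and file I/O of the real pipeline are omitted):

```python
import numpy as np

def fourier_magnitude(image):
    """Log-magnitude Fourier spectrum of a greyscale image."""
    spectrum = np.fft.fft2(image)           # 2D Fourier transform
    spectrum = np.fft.fftshift(spectrum)    # centre the zero frequency
    return np.log(np.abs(spectrum) + 1e-8)  # log magnitude spectrum

# Toy greyscale image standing in for a converted face image
rng = np.random.default_rng(0)
grey = rng.random((128, 128))
magnitude = fourier_magnitude(grey)
print(magnitude.shape)  # (128, 128)
```

The small epsilon keeps the logarithm finite where the spectrum is zero; the resulting arrays are what would be saved as images and fed to the classifier.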
  • Autoencoder approach: For this approach, we used an autoencoder trained on 60,000 images from the CelebA dataset. The aim was to develop a model capable of producing compact and informative embeddings of face images that can be used for binary classification to distinguish real images from deepfakes. The autoencoder consists of an encoder and a decoder: the encoder transforms the images into a latent vector, while the decoder reconstructs the images from this latent vector.
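A minimal convolutional autoencoder of this shape can be sketched in PyTorch as follows; the layer sizes, the 64x64 input resolution and the latent dimension are illustrative assumptions, not the architecture used in the report:

```python
import torch
import torch.nn as nn

LATENT_DIM = 64  # assumed latent size

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: image -> compact latent vector
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, LATENT_DIM),
        )
        # Decoder: latent vector -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)           # embedding used for the classifier
        return self.decoder(z), z

model = Autoencoder()
images = torch.randn(2, 3, 64, 64)
reconstruction, embedding = model(images)
print(reconstruction.shape, embedding.shape)
```

The autoencoder is trained with a reconstruction loss (e.g. MSE between `images` and `reconstruction`); the embeddings `z` then serve as features for the real/fake classifier.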

Experiments

  • Training (Inpainting dataset):

    • train set: 80%
    • validation set: 10%
    • test set: 10%
  • Evaluation (30k images from each dataset):

    • Dataset Insight
    • Dataset Text2image
    • Dataset 140k images (GAN)
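An 80/10/10 split of this kind can be produced, for example, by shuffling indices (numpy only; the project's actual split code is not shown here):

```python
import numpy as np

def split_indices(n, train=0.8, val=0.1, seed=0):
    """Shuffle indices 0..n-1 and split them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)                 # random order, no duplicates
    n_train = int(n * train)
    n_val = int(n * val)
    return (idx[:n_train],                   # 80% training
            idx[n_train:n_train + n_val],    # 10% validation
            idx[n_train + n_val:])           # remaining 10% test

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

Fixing the seed keeps the split reproducible across runs.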

Results

  • Direct approach (see results figure)

  • Fourier approach (see results figure)

  • Autoencoder approach (see results figure)

Conclusion

The direct approach (CNN) showed a good ability to detect deepfakes generated by Stable Diffusion, particularly for Text2Image images. However, it encountered difficulties in generalising this detection to other deepfake generators such as InsightFace and GANs.

The approach based on the Fourier spectrum confirmed the hypothesis regarding the presence of scattering artefacts in the generated images, demonstrating its effectiveness for the detection of deepfakes. On the other hand, the approach based on dimensionality reduction with autoencoders was less conclusive, probably due to the difficulty of extracting significant features from real faces.

For more details and information, please consult the final Report [French].
