Welcome to the Reference Implementations for the MIDST Challenge (Membership Inference over Diffusion-models-based Synthetic Tabular Data) - SaTML 2025!
This repository provides code implementations and resources to support participants in the MIDST challenge, focusing on membership inference attacks over synthetic tabular data generated by diffusion models. Our implementations are based on the Diffusion Model Bootcamp provided by the Vector Institute.
The midst_models directory contains implementations and resources for the MIDST Challenge. It is organized into the following subdirectories:
- multi_table_ClavaDDPM/: Contains code and resources for multi-table data synthesis using the ClavaDDPM model. This implementation is designed for generating synthetic data across multiple relational tables using diffusion-based generative models.
- single_table_TabDDPM/: Includes implementations for single-table data synthesis using the TabDDPM algorithm. This directory provides code, notebooks, and resources focused on applying TabDDPM to synthesize data from single-table datasets.
- single_table_TabSyn/: Hosts the code and resources related to the TabSyn algorithm for single-table data synthesis. It contains notebooks and examples demonstrating how to use TabSyn for generating synthetic tabular data and evaluating its performance.
The repository also contains a starter_kits directory that provides an overview of the MIDST competitions and outlines how to package a submission for CodaBench.
To get started with this repository, follow these steps:
- Clone this repository to your machine.
- Create and activate a Python environment using the following commands:
pip install --upgrade pip poetry
poetry env use [name of your python]  # e.g. python3.9
source $(poetry env info --path)/bin/activate
poetry install --with "tabsyn, clavaddpm"
# If your system is not compatible with pykeops, you can uninstall it using the following command
pip uninstall pykeops
# Install the Jupyter kernel (only needs to be done once)
ipython kernel install --user --name=midst_models
- Begin with each model in the midst_models/ directory, as guided by the README files.
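After the setup steps above, it can be useful to confirm that the activated environment looks as expected before opening the notebooks. The following is a minimal sketch and not part of this repository: the function name check_environment is hypothetical, the minimum version of 3.9 is an assumption taken from the python3.9 example above, and the check only reports whether pykeops is installed, not whether it actually works on your system.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 9), optional_modules=("pykeops",)):
    """Report whether the interpreter version and optional modules look usable.

    Hypothetical helper: min_version is an assumed lower bound, not a
    requirement stated by the repository.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_version)}
    for name in optional_modules:
        # find_spec only tells us the package is installed; it does not
        # prove that the package compiles or runs on this machine.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

If pykeops is reported as missing after you uninstalled it deliberately, no further action is needed.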
This project is licensed under the terms of the [LICENSE] file located in the root directory of this repository.