Reference implementations for the MIDST challenge (Membership Inference over Diffusion-models-based Synthetic Tabular data) - SaTML 2025!

VectorInstitute/MIDSTModels

MIDST Models

Welcome to the Reference Implementations for the MIDST Challenge (Membership Inference over Diffusion-models-based Synthetic Tabular Data) - SaTML 2025!

This repository provides code implementations and resources to support participants in the MIDST challenge, focusing on membership inference attacks over synthetic tabular data generated by diffusion models. Our implementations are based on the Diffusion Model Bootcamp provided by the Vector Institute.

Repository Structure

The midst_models directory contains implementations and resources for the MIDST Challenge. It is organized into the following subdirectories:

  • multi_table_ClavaDDPM/: Contains code and resources for multi-table data synthesis using the ClavaDDPM model. This implementation is designed for generating synthetic data across multiple relational tables using diffusion-based generative models.
  • single_table_TabDDPM/: Includes implementations for single-table data synthesis using the TabDDPM algorithm. This directory provides code, notebooks, and resources focused on applying TabDDPM to synthesize data from single-table datasets.
  • single_table_TabSyn/: Hosts the code and resources related to the TabSyn algorithm for single-table data synthesis. It contains notebooks and examples demonstrating how to use TabSyn for generating synthetic tabular data and evaluating its performance.

The repository also contains a starter_kits directory that provides an overview of the MIDST competitions and outlines how to package a submission for CodaBench.

Getting Started

To get started with this repository, follow these steps:

  1. Clone this repository to your machine.
  2. Create and activate a Python environment. The following commands set one up with Poetry and install the dependencies:
pip install --upgrade pip poetry
poetry env use [name of your python]  # e.g. python3.9
source $(poetry env info --path)/bin/activate
poetry install --with "tabsyn, clavaddpm"
# If your system is not compatible with pykeops, you can uninstall it using the following command
pip uninstall pykeops
# Install the kernel for jupyter (only need to do it once)
ipython kernel install --user --name=midst_models
  3. Begin with each model in the midst_models/ directory, as guided by the README files.
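
After installation, it can help to confirm whether the optional pykeops package is still present in the active environment, since the setup notes suggest uninstalling it on incompatible systems. The snippet below is a minimal sketch, not part of this repository: the helper name is_installed is hypothetical, and it only checks that a module can be located, not that it works on your hardware.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    # find_spec returns None when no top-level module with this name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # pykeops is treated as optional here, following the note in the setup steps.
    status = "installed" if is_installed("pykeops") else "not installed"
    print(f"pykeops: {status}")
```

Run it from the activated environment so that it inspects the same interpreter that Poetry configured.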

License

This project is licensed under the terms of the LICENSE file located in the root directory of this repository.
