Deeploy

Deeploy is an ONNX-to-C compiler that generates low-level, optimized C code for multi-cluster, heterogeneous SoCs. Its goal is to enable configurable deployment flows from a bottom-up compiler perspective, modeling target hardware in a fine-grained and modular manner.

Deeploy is developed as part of the PULP project, a joint effort between ETH Zurich and the University of Bologna.

License

Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All software sources and tool scripts are licensed under Apache 2.0, except for files contained in the scripts directory, which are licensed under the MIT license, and files contained in the DeeployTest/Tests directory, which are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International license (CC BY-ND 4.0).

Getting started

Installing Deeploy is as simple as running:

pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com

However, to run the code generated by Deeploy on a given target, you need the toolchains and simulators associated with that platform.

We provide a Docker container in which Deeploy works out of the box (i.e., with all dependencies pre-installed). To pull the Docker image, run:

docker pull ghcr.io/pulp-platform/deeploy:main

Then you can start the container in interactive mode with:

docker run -it ghcr.io/pulp-platform/deeploy:main

Inside the container, clone Deeploy, initialize its submodules, and install the package with:

git clone https://github.com/pulp-platform/Deeploy.git && cd Deeploy
git submodule update --init --recursive
pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com

Congratulations, you have installed Deeploy and its dependencies! To test your installation, run one simple test on each platform with the following commands:

cd DeeployTest && source /app/install/pulp-sdk/configs/siracusa.sh
python testRunner_generic.py -t Tests/Adder
python testRunner_cortexm.py -t Tests/Adder
python testRunner_mempool.py -t Tests/Adder
python testRunner_siracusa.py -t Tests/Adder --cores=8
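
The four test commands above can also be wrapped in a small loop. The following is a minimal sketch, not part of Deeploy: the function name `run_adder_tests` and the `PYTHON` override variable are hypothetical, and it assumes it is called from the DeeployTest directory with the Siracusa configuration already sourced. It uses only the runner scripts and flags shown above.

```shell
# Hypothetical helper: run the Adder test on each platform listed above.
# Assumes the current directory is DeeployTest and the PULP SDK config is sourced.
run_adder_tests() {
  failed=""
  for platform in generic cortexm mempool siracusa; do
    # Only the Siracusa runner is shown with --cores=8 in the commands above.
    if [ "$platform" = "siracusa" ]; then
      extra="--cores=8"
    else
      extra=""
    fi
    echo "== $platform =="
    # $extra is intentionally unquoted so that an empty value adds no argument.
    if ! "${PYTHON:-python}" "testRunner_${platform}.py" -t Tests/Adder $extra; then
      failed="$failed $platform"
    fi
  done
  if [ -n "$failed" ]; then
    echo "Failed platforms:$failed"
    return 1
  fi
  echo "All platforms passed"
}
```

Calling `run_adder_tests` after sourcing the block runs the platforms in order and, instead of stopping at the first failure, reports which ones failed.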

The ONNX file is located in DeeployTest/Tests/Adder; you can visualize it with Netron. The generated code for platform X is in the TEST_X directory inside DeeployTest. The generated code for the Adder test is very simple, but it becomes more complex once tiling is involved. Let's generate the code for a single layer, this time with tiling:

python testRunner_tiled_siracusa.py -t Tests/testMatMul --cores=8 --l1=16000

Now you can open the generated code in DeeployTest/TEST_SIRACUSA/Tests/testMatMul/Network.c and see how the tiled layer is executed.
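
To locate generated sources without browsing the tree by hand, a small helper can list every Network.c below the TEST_X directories. This is a sketch under assumptions: the function name `list_generated_networks` is hypothetical, and it assumes each test writes a Network.c somewhere below a TEST_* directory, as the Siracusa path above suggests.

```shell
# Hypothetical helper: list generated Network.c files below TEST_* directories.
# The first argument is the DeeployTest directory (defaults to the current one).
list_generated_networks() {
  root="${1:-.}"
  # -path restricts matches to files that sit somewhere under a TEST_* directory.
  find "$root" -type f -name 'Network.c' -path '*/TEST_*/*' | sort
}
```

For example, `list_generated_networks DeeployTest` run from the repository root prints one path per generated network, sorted by name.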

Supported Platforms

Documentation

To build the documentation, simply run:

make docs

Then open docs/_build/html/index.html for more extensive documentation and getting-started guides.

Publications

ESWEEK 2024: Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers

@article{schererDeeployEnablingEnergyEfficient2024,
  title = {Deeploy: {{Enabling Energy-Efficient Deployment}} of {{Small Language Models}} on {{Heterogeneous Microcontrollers}}},
  shorttitle = {Deeploy},
  author = {Scherer, Moritz and Macan, Luka and Jung, Victor J. B. and Wiese, Philip and Bompani, Luca and Burrello, Alessio and Conti, Francesco and Benini, Luca},
  year = {2024},
  month = nov,
  journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
  volume = {43},
  number = {11},
  pages = {4009--4020},
  issn = {1937-4151},
  doi = {10.1109/TCAD.2024.3443718},
}

The preprint version is also available on arXiv: arXiv:2408.04413.