CS/ECE/ME/EP 759 Spring 2021 Final Project

This repository contains the code base for Rui Pan's final project report, "Cautiously Aggressive GPU Space Sharing to Improve Resource Utilization and Job Efficiency."

Prerequisites for replicating the results include:

  • An NVIDIA GPU with Volta architecture
  • Python 3.8 nightly build
  • CUDA-compatible PyTorch & TorchVision
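
As a quick sanity check before running anything, the prerequisites above can be verified with a small script. This is only a sketch and is not part of the repo: the helper names `is_volta` and `check_environment` are hypothetical, and it assumes that "Volta" corresponds to CUDA compute capability 7.0 or 7.2 and that GPU 0 is the device of interest.

```python
import sys

# Assumption: Volta GPUs report CUDA compute capability 7.0 (e.g. V100)
# or 7.2 (Xavier). Turing, by contrast, reports 7.5.
VOLTA_CAPABILITIES = {(7, 0), (7, 2)}


def is_volta(major, minor):
    """Return True if a (major, minor) compute capability is Volta."""
    return (major, minor) in VOLTA_CAPABILITIES


def check_environment():
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []

    # The README asks for Python 3.8.
    if sys.version_info[:2] != (3, 8):
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"expected Python 3.8, found {found}")

    # Import lazily so the script still reports something useful
    # when PyTorch or TorchVision is missing.
    try:
        import torch
        import torchvision  # noqa: F401 (imported only to confirm it is installed)
    except ImportError as exc:
        problems.append(f"missing package: {exc.name}")
        return problems

    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA device")
    elif not is_volta(*torch.cuda.get_device_capability(0)):
        problems.append("GPU 0 does not report a Volta compute capability")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result does not guarantee the reported numbers can be reproduced; it only rules out the most obvious setup mismatches.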

This repo contains:

  • /data: Source data for running the workloads. It should be set up as follows:
  • /latex: LaTeX files for editing the report on Overleaf
  • /output: Core-specific utilizations of workloads produced using an earlier version of the profiler
  • /tables: Shell scripts for replicating the profiling results in various tables
  • /workloads: Common DL/HPC workloads used in the evaluations. Many of these are copied from Gavel.
  • plotting.ipynb: Jupyter Notebook that produces all figures in the report
  • profiler.py: Profiler parser wrapped around nvprof
  • pymps.py: Provides Python access to NVIDIA CUDA Multi-Process Service (MPS)
  • README.md: Well, of course I know him. He's me.
  • report.pdf: PDF version of the final report
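
For readers unfamiliar with MPS, the sketch below shows roughly what controlling the MPS daemon from Python can look like. It is not the interface of `pymps.py`: the function names, the default directories under `/tmp`, and the overall structure are assumptions made for illustration. It relies only on the documented `nvidia-cuda-mps-control` CLI and the `CUDA_MPS_PIPE_DIRECTORY`, `CUDA_MPS_LOG_DIRECTORY`, and `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` environment variables.

```python
import os
import subprocess

MPS_CONTROL = "nvidia-cuda-mps-control"


def mps_env(pipe_dir="/tmp/nvidia-mps", log_dir="/tmp/nvidia-mps-log",
            thread_percentage=None):
    """Build the environment shared by the MPS daemon and its clients.

    The directory defaults are placeholders chosen for this sketch.
    thread_percentage, if given, caps the fraction of SMs that a client
    process started with this environment may use (Volta and newer).
    """
    env = dict(os.environ)
    env["CUDA_MPS_PIPE_DIRECTORY"] = pipe_dir
    env["CUDA_MPS_LOG_DIRECTORY"] = log_dir
    if thread_percentage is not None:
        if not 0 < thread_percentage <= 100:
            raise ValueError("thread_percentage must be in (0, 100]")
        env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(thread_percentage)
    return env


def start_mps(env):
    """Start the MPS control daemon in the background (-d)."""
    os.makedirs(env["CUDA_MPS_PIPE_DIRECTORY"], exist_ok=True)
    os.makedirs(env["CUDA_MPS_LOG_DIRECTORY"], exist_ok=True)
    subprocess.run([MPS_CONTROL, "-d"], env=env, check=True)


def stop_mps(env):
    """Ask the running control daemon to shut down by sending 'quit'."""
    subprocess.run([MPS_CONTROL], input="quit\n", text=True, env=env, check=True)


def run_client(command, env):
    """Launch a workload as an MPS client using the same pipe directory."""
    return subprocess.Popen(command, env=env)
```

Under these assumptions, two workloads could share one GPU by calling `start_mps(mps_env())`, launching each job through `run_client` with its own `thread_percentage`, waiting for both to finish, and then calling `stop_mps`.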