@mit-han-lab

MIT HAN Lab

Efficient AI Computing. PI: Song Han

Pinned

  1. streaming-llm Public

    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks

    Python, 6.7k stars, 368 forks

  2. smoothquant Public

    [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

    Python, 1.3k stars, 150 forks

  3. llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python, 2.6k stars, 208 forks

  4. bevfusion Public archive

    [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

    Python, 2.4k stars, 427 forks

  5. once-for-all Public

    [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

    Python, 1.9k stars, 333 forks

  6. temporal-shift-module Public

    [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

    Python, 2.1k stars, 417 forks

Repositories

Showing 10 of 56 repositories
  • nunchaku Public

    SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

    Cuda, 369 stars, Apache-2.0, 17 forks, 18 issues, 1 pull request, updated Nov 28, 2024
  • tinyengine Public

    [NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory

    C, 811 stars, MIT, 132 forks, 33 issues, 1 pull request, updated Nov 27, 2024
  • Quest Public

    [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

    Cuda, 210 stars, 21 forks, 3 issues, 0 pull requests, updated Nov 22, 2024
  • torchquantum Public

    A PyTorch-based framework for Quantum Classical Simulation, Quantum Machine Learning, Quantum Neural Networks, Parameterized Quantum Circuits with support for easy deployments on real quantum computers.

    Jupyter Notebook, 1,346 stars, MIT, 204 forks, 59 issues (4 need help), 8 pull requests, updated Nov 19, 2024
  • torchsparse Public

    [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

    Cuda, 1,231 stars, MIT, 143 forks, 27 issues, 2 pull requests, updated Nov 12, 2024
  • efficientvit Public

    Efficient vision foundation models for high-resolution generation and perception.

    Python, 2,399 stars, Apache-2.0, 194 forks, 93 issues, 0 pull requests, updated Nov 12, 2024
  • deepcompressor Public

    Model Compression Toolbox for Large Language Models and Diffusion Models

    Python, 239 stars, Apache-2.0, 18 forks, 20 issues, 1 pull request, updated Nov 10, 2024
  • qserve Public

    QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

    Python, 456 stars, Apache-2.0, 25 forks, 27 issues, 3 pull requests, updated Nov 9, 2024
  • tinychat-tutorial

    C++, 53 stars, 20 forks, 4 issues, 2 pull requests, updated Nov 5, 2024
  • distrifuser Public

    [CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

    Python, 601 stars, MIT, 24 forks, 8 issues, 1 pull request, updated Nov 4, 2024