MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training, by Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu, ACL 2021, is a large-scale pre-trained model for symbolic music understanding. It uses several mechanisms designed specifically for symbolic music data, including OctupleMIDI encoding and a bar-level masking strategy, and achieves state-of-the-art accuracy on several music understanding tasks, including melody completion, accompaniment suggestion, genre classification, and style classification.
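To make the two mechanisms named above more concrete, here is a minimal sketch, not the repository's implementation. It assumes each note is one token with eight attributes (time signature, tempo, bar, position, instrument, pitch, duration, velocity, as described in the paper) and that bar-level masking hides one attribute type for every note in a chosen bar. The `Octuple` class, its field order, the `MASK` value, and the `mask_bar` helper are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple

MASK = -1  # hypothetical placeholder id for a masked element


class Octuple(NamedTuple):
    """One note as an 8-element token (field order is illustrative)."""
    time_sig: int
    tempo: int
    bar: int
    position: int
    instrument: int
    pitch: int
    duration: int
    velocity: int


def mask_bar(notes: List[Octuple], bar: int, field: str) -> List[Octuple]:
    """Mask `field` for every note in `bar`, leaving other notes untouched."""
    return [n._replace(**{field: MASK}) if n.bar == bar else n for n in notes]


notes = [
    Octuple(0, 10, 0, 0, 0, 60, 4, 20),
    Octuple(0, 10, 0, 8, 0, 64, 4, 20),
    Octuple(0, 10, 1, 0, 0, 67, 8, 20),
]
masked = mask_bar(notes, bar=0, field="pitch")
print([n.pitch for n in masked])  # [-1, -1, 67]
```

Masking a whole attribute type across a bar, rather than single elements at random, keeps the model from simply copying the value from a neighbouring note in the same bar.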
Projects using MusicBERT:
- midiformers: a customized MIDI music remixing tool with an easy-to-use interface. (notebook)
- Please use the provided segmented MIDI data (`total.csv`, `segment_midi.zip`), since there is a file name error in the original Google Drive file.
- Other data (e.g., annotator metadata, original files, ...) are in the drive.
- Convert the `total.csv` file to a JSON file: `python map_midi_to_label.py`
  - The file `midi_label_map_apex_reg_cls.json` is generated.
  - Currently, the peak value from kernel density estimation is used as the label.
  - You can also try other choices: use all data, the mean, the median, etc.
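The labeling choice above (peak of a kernel density estimate versus mean or median) can be illustrated with a small sketch. This is not the code in `map_midi_to_label.py`. It assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and a fixed evaluation grid. The `kde_peak` helper and the `ratings` values are made up for illustration.

```python
import math
import statistics
from typing import List


def kde_peak(values: List[float], grid_size: int = 201) -> float:
    """Return the grid point where a Gaussian KDE of `values` is highest.

    Assumes at least two distinct values, so the bandwidth is non-zero.
    """
    n = len(values)
    std = statistics.stdev(values)
    # Normal-reference rule of thumb; an assumed bandwidth choice.
    h = 1.06 * std * n ** (-1 / 5)
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]

    def density(x: float) -> float:
        # Constant factors are omitted; they do not change the argmax.
        return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)

    return max(grid, key=density)


ratings = [0.2, 0.25, 0.3, 0.28, 0.9]  # hypothetical annotator values
print("peak  :", round(kde_peak(ratings), 2))
print("mean  :", round(statistics.mean(ratings), 2))
print("median:", statistics.median(ratings))
```

With one outlying annotation, the density peak stays near the cluster of agreeing values, while the mean is pulled toward the outlier. That may be one reason to prefer the peak as a label.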
- Generate the XAI for music dataset in OctupleMIDI format from the MIDI-to-label mapping file with `gen_xai.py`: `python -u gen_xai.py`
  - Currently, the train / test set is split by segments.
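As a rough illustration of a segment-level split, the sketch below shuffles segment identifiers and cuts them into train and test lists. It is an assumption about what "split by segments" means, not the logic in `gen_xai.py`. The `split_segments` helper, the segment names, the 0.2 test ratio, and the seed are all hypothetical.

```python
import random
from typing import List, Tuple


def split_segments(segments: List[str], test_ratio: float = 0.2,
                   seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle segment ids reproducibly and split them into train/test."""
    shuffled = sorted(segments)  # fixed starting order before shuffling
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names: 3 pieces, 5 segments each.
segments = [f"song{s}_seg{i}.mid" for s in range(3) for i in range(5)]
train, test = split_segments(segments)
print(len(train), len(test))  # 12 3
```

With this kind of split, segments cut from the same piece can land on both sides, which is worth keeping in mind when reading test scores.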
- Binarize the raw text format dataset (this script reads the `xai_data_raw_apex_reg_cls` folder and outputs `xai_data_bin_apex_reg_cls`): `bash scripts/binarize_xai.sh xai`
- Download our pre-trained checkpoints here: small and base, and save them in the `checkpoints` folder. (A newer version of fairseq is needed to use the provided checkpoints: see issue-37 or issue-45.)
- You should modify the hyperparameters, checkpoint path, etc. in the sh file.
- Train using the pre-trained model: `bash scripts/train_xai_base_small.sh # checkpoints/checkpoint_last_musicbert_base.pt, checkpoints/checkpoint_last_musicbert_base.pt`
- If you get a file path error, try ``export PYTHONPATH=`pwd` ``
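For context, setting `PYTHONPATH` to the current directory adds it to Python's module search path, so that a package in the repository root (such as `musicbert`) can be found. The sketch below shows an in-process equivalent and a quick check. It assumes it is run from the repository root, and it only looks for a `musicbert` directory rather than importing the package.

```python
import os
import sys

repo_root = os.getcwd()  # assumed to be the repository root

# In-process equivalent of adding the current directory to PYTHONPATH.
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

# Check that the package directory is where Python will look for it.
package_dir = os.path.join(repo_root, "musicbert")
print("musicbert directory found:", os.path.isdir(package_dir))
```

If the check prints `False`, the command was probably run from a different directory than the one containing `musicbert/`.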
- To customize the model, check `musicbert/__init__.py`.
  - Some custom arguments are provided.
  - Check fairseq for more information.
- Sample script for the regression task using an LSTM: `bash scripts/train_lstm.sh`