Trained on Python 3 sources from https://github.com/Jur1cek/gcj-dataset (year 2020).
The method is language-agnostic, so it is easy to train it on another programming language; you can even use it on natural-language text.
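
To illustrate what "language-agnostic" can mean in practice, here is a minimal sketch. It assumes features are taken from raw characters rather than from a language-specific parser. This README does not state which features the method actually uses, so `char_ngrams` is a hypothetical helper for illustration only, not the project's code.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in any string (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# The same function accepts source code and natural text alike,
# because it never looks at syntax, only at characters.
python_snippet = "for i in range(3):\n    print(i)\n"
english_snippet = "the quick brown fox jumps over the lazy dog"

print(char_ngrams(python_snippet).most_common(3))
print(char_ngrams(english_snippet).most_common(3))
```

Nothing here depends on Python grammar, which is why swapping in another programming language or plain text needs no code changes under this assumption.
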
I did not bother with a requirements.txt - you should be fine with numpy, pandas and scikit-learn (imported as sklearn). Also, if you are going to train, do not forget to download the input file.
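
Since there is no requirements.txt, a quick way to see whether the three packages are available is a small check like the one below. This is only a sketch: the `REQUIRED` mapping and `missing_packages` are names made up for this example, and no particular versions are assumed.

```python
from importlib.util import find_spec

# Import name -> PyPI name. Note that "sklearn" is installed as "scikit-learn".
REQUIRED = {"numpy": "numpy", "pandas": "pandas", "sklearn": "scikit-learn"}


def missing_packages(required=REQUIRED):
    """Return the PyPI names of packages whose import name cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only confirms that the packages can be found; it does not import them and does not verify the input file.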