Example baseline source code authorship attribution classifier

Jur1cek/stylometry-baseline-python

Trained on Python 3 sources from https://github.com/Jur1cek/gcj-dataset (year 2020).

The method is language-agnostic, so it is easy to train it on another programming language. It also works on natural-language text.
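To illustrate what "language-agnostic" can mean in practice, here is a minimal sketch of one common baseline: character n-gram TF-IDF features fed to a linear classifier. This is an assumption for illustration, not necessarily the method this repository uses. The function name `build_classifier`, the n-gram range, the choice of `LogisticRegression`, and the toy samples and author labels are all made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_classifier():
    """Hypothetical baseline: char n-gram TF-IDF + logistic regression.

    Character n-grams need no tokenizer or parser, so the same pipeline
    accepts Python, C++, or plain prose without modification.
    """
    return make_pipeline(
        # lowercase=False keeps casing, which can be a stylistic signal.
        TfidfVectorizer(analyzer="char", ngram_range=(1, 3), lowercase=False),
        LogisticRegression(max_iter=1000),
    )


# Toy, made-up samples (NOT taken from the gcj-dataset).
sources = [
    "for i in range(n):\n    print(i)\n",
    "for j in range(m):\n    print(j)\n",
    "i=0\nwhile i<n :\n\tprint( i )\n\ti+=1\n",
    "k=0\nwhile k<m :\n\tprint( k )\n\tk+=1\n",
]
authors = ["alice", "alice", "bob", "bob"]

clf = build_classifier()
clf.fit(sources, authors)

# Predict the author of an unseen snippet.
prediction = clf.predict(["for x in range(t):\n    print(x)\n"])[0]
print(prediction)
```

Because the features are raw character sequences, switching to another language or to natural text should only require swapping the training strings and labels.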

I was too lazy to write a requirements.txt, but you should be fine with numpy, pandas and sklearn. Also, if you are going to train, do not forget to download the input file.
