# Pipeline for experimenting with and evaluating LLM ability to audit smart contracts
- Set up two environment variables (recommended to do this inside your shell config):

  ```shell
  export ICL_USERNAME={imperial_username}
  export ICL_PASSWORD={imperial_password}
  ```

- Run `./start-job.sh` to reserve a job.
- Run `./run-server.sh -p {PORT} -e {ENV}` to start a Jupyter server on port `{PORT}` running your `{ENV}` Python environment.
This repo contains a web application that steps through model outputs and allows manual labeling of the results. The accuracy of the evaluating model is then approximated using a Sequential Massart algorithm (see `formal-verification/sequential_massart_smc.ipynb`).
- Run `make install`.
- Run `make run` to start the app.
In total, 213 rows were verified, where each row contains 3 criteria, giving 639 samples. Below is the resulting accuracy of `gpt-4o-mini` for evaluating audits:
    633: Interval: [0.9223155036427156, 0.9780027129062199] - Samples: 634.0 - Estimate: 0.9557661927330173
The model achieves an estimated evaluation accuracy of roughly 95.6%. More precisely, the accuracy lies between 92.2% and 97.8% at the confidence level used by the algorithm.
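
The sequential estimation loop behind this result can be sketched as follows. This is a minimal illustration that substitutes the looser Hoeffding bound for the sharper Massart bound used in the notebook, and the `delta` and `width` parameters are hypothetical, not the repo's actual settings:

```python
import math

def hoeffding_interval(successes: int, n: int, delta: float):
    """Two-sided confidence interval of level 1 - delta via Hoeffding's inequality."""
    p_hat = successes / n
    eps = math.sqrt(math.log(2 / delta) / (2 * n))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)

def sequential_estimate(labels, delta=0.05, width=0.06):
    """Consume 0/1 correctness labels one at a time and stop as soon as
    the confidence interval is narrower than `width`."""
    successes = 0
    n = 0
    for label in labels:
        n += 1
        successes += label
        lo, hi = hoeffding_interval(successes, n, delta)
        if hi - lo <= width:
            break  # interval is tight enough; no more samples needed
    return n, successes / n, (lo, hi)
```

A Massart-style bound replaces the fixed `eps` with one that depends on the running estimate, which lets the loop stop earlier when the true accuracy is far from 0.5 — hence the fairly tight interval obtained above from a few hundred samples.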