Releases: sacdallago/biotrainer
v0.9.5
09.12.2024 - Version 0.9.5
Features
- Added integration for huggingface datasets by @heispv in #124
- Added per-sequence dimension reduction methods by @nadyadevani3112 in #123
- Improving one_hot_encoding embedder with numpy functions by @SebieF
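As a rough illustration of one-hot encoding with numpy functions, the following minimal sketch encodes a protein sequence via an index lookup into an identity matrix and then reduces it to a per-sequence vector by averaging. The alphabet, its ordering and the function name one_hot_encode are assumptions made for this example; biotrainer's actual one_hot_encoding embedder and its reduction methods may differ.

```python
import numpy as np

# Assumed alphabet: the 20 standard amino acids plus X for unknown residues.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"
AA_TO_INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return an array of shape (len(sequence), len(ALPHABET))."""
    # Map every residue to its column index; unknown letters fall back to X.
    indices = np.array(
        [AA_TO_INDEX.get(aa, AA_TO_INDEX["X"]) for aa in sequence], dtype=int
    )
    # Row i of the identity matrix is the one-hot vector for index i.
    return np.eye(len(ALPHABET), dtype=np.float32)[indices]


per_residue = one_hot_encode("SEQWENCE")
# One simple per-sequence reduction: average over the residue axis.
per_sequence = per_residue.mean(axis=0)
```

Indexing the identity matrix avoids an explicit Python loop over the output array, which is the usual motivation for moving such an encoder to numpy.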
Maintenance
- Fixing "precission" typo in classification_solver.py
- Updating dependencies
- Improving documentation of the config module by @heispv in #121
- Improving compute_embeddings function to handle Dict, str and Path as input_data
- Reducing log level of onnx and dynamo to ERROR to decrease logging output
- Fixing first_steps documentation
- Adding links to biocentral app, repository and biotrainer documentation
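The Dict, str and Path handling of input_data mentioned in the compute_embeddings item above could look roughly like the following sketch. The helper name normalize_input_data, the minimal FASTA parsing and the interpretation of a str as a file path are assumptions for illustration, not biotrainer's actual implementation.

```python
from pathlib import Path
from typing import Dict, Union


def normalize_input_data(input_data: Union[Dict[str, str], str, Path]) -> Dict[str, str]:
    """Hypothetical helper: return a mapping of sequence id -> sequence."""
    if isinstance(input_data, dict):
        # Already a mapping of ids to sequences; return a shallow copy.
        return dict(input_data)
    if isinstance(input_data, (str, Path)):
        # Assumption: a str or Path points to a FASTA file on disk.
        sequences: Dict[str, str] = {}
        current_id = None
        for line in Path(input_data).read_text().splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The whole header line (without ">") is used as the id.
                current_id = line[1:].strip()
                sequences[current_id] = ""
            elif current_id is not None:
                # Sequence lines may be wrapped, so concatenate them.
                sequences[current_id] += line
        return sequences
    raise TypeError(f"Unsupported input_data type: {type(input_data).__name__}")
```

Normalizing all accepted types to one internal representation up front keeps the rest of such a function free of type checks.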
v0.9.4
29.10.2024 - Version 0.9.4
Bug fixes
Maintenance
- Updating dependencies: removing python3.9 support
- Updating CI workflow to be compatible with Windows
Known problems
- Currently, there are compatibility problems with ONNX on some machines, please refer to the following issue: #111
v0.9.3
v0.9.2
v0.9.1
10.07.2024 - Version 0.9.1
Maintenance
- Fixing error in type checking for device
- Updating dependencies
- Updating inference examples
- Adding hint for version mismatch in inferencer
- Adding class weights to out.yml if they are calculated
- Adding contributors file
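For context on the class weights item above: weights for imbalanced classification are commonly derived from inverse class frequencies. The sketch below shows one such scheme. The formula, the function name and the example labels are assumptions for illustration and not necessarily what biotrainer computes or writes to out.yml.

```python
from collections import Counter
from typing import Dict, Iterable


def inverse_frequency_class_weights(labels: Iterable[str]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    # Rare classes receive larger weights than frequent ones.
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example with three samples of one class and one of another.
weights = inverse_frequency_class_weights(["Glob", "Glob", "Glob", "TM"])
```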
Features
- Improving fallback mechanism of embedder models. CPU mode is now exited once there is enough RAM again for shorter sequences
- Changing model storage format from .pt to .safetensors. Safetensors is safer for model sharing. The legacy .pt format is still supported and can be converted via:
from biotrainer.inference import Inferencer
inferencer, out_file = Inferencer.create_from_out_file(out_file_path="out.yml", allow_torch_pt_loading=True)
inferencer.convert_all_checkpoints_to_safetensors()
v0.9.0
16.06.2024 - Version 0.9.0
Maintenance
- Adding more extensive code documentation
- Optimizing imports
- Applying consistent file naming
- Updating dependencies. Note that jupyter was removed as a direct optional dependency. You can always add it via poetry add jupyter.
- Adding simple differentiation between t5 and esm tokenizer and models in embedders module
Features
- Adding new residues_to_value protocol. Similar to the residues_to_class protocol, this protocol predicts a value for each sequence, using per-residue embeddings. It might, in some situations, outperform the sequence_to_value protocol.
Bug fixes
- For huggingface_transformer_embedder.py, all special tokens are now always deleted from the final embedding (e.g. first/last for esm1b, last for t5)
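To illustrate the special-token removal described above, here is a minimal sketch that uses plain Python lists in place of tensors. The function name, the model_family argument and the layout table are assumptions based on the two examples in the note (first and last token for esm1b, last token for t5); how the real embedder detects special tokens is not shown here.

```python
from typing import List, Sequence

# Assumed layout of special tokens per model family, following the note:
# (number of leading special tokens, number of trailing special tokens).
SPECIAL_TOKEN_LAYOUT = {
    "esm1b": (1, 1),  # first and last token are special
    "t5": (0, 1),     # only the last token is special
}


def strip_special_tokens(embedding: Sequence[List[float]], model_family: str) -> List[List[float]]:
    """Drop the rows that belong to special tokens from a per-token embedding."""
    leading, trailing = SPECIAL_TOKEN_LAYOUT[model_family]
    # Compute the end index explicitly so that trailing == 0 also works.
    end = len(embedding) - trailing
    return list(embedding[leading:end])


# Toy per-token embeddings for a 3-residue sequence plus special tokens.
esm_tokens = [[0.0], [1.0], [2.0], [3.0], [9.0]]  # special, A, B, C, special
t5_tokens = [[1.0], [2.0], [3.0], [9.0]]          # A, B, C, special

esm_residues = strip_special_tokens(esm_tokens, "esm1b")
t5_residues = strip_special_tokens(t5_tokens, "t5")
```

After stripping, the number of rows equals the sequence length, which is what per-residue protocols rely on.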
v0.8.4
v0.8.3
04.05.2024 - Version 0.8.3
Maintenance
- Updating dependencies
Features
- Adding mps device for macOS. Use it by setting the following configuration option: device: mps. Note that MPS is still under development, use it at your own responsibility.
- Adding flags to the compute_embedding method of EmbeddingService:
  - force_output_dir: Do not change the given output directory within the method
  - force_recomputing: Always re-compute the embeddings, even if an existing file is found
These changes make the embedders module of biotrainer easier to use outside the biotrainer pipeline itself.
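The effect of the two flags can be pictured with the following sketch. The function resolve_embeddings_file, the compute callback, the file name embeddings.h5 and the embeddings subdirectory are hypothetical stand-ins; only the flag names and their described meaning come from the note above.

```python
from pathlib import Path
from typing import Callable


def resolve_embeddings_file(output_dir: Path,
                            compute: Callable[[Path], None],
                            force_output_dir: bool = False,
                            force_recomputing: bool = False) -> Path:
    """Hypothetical sketch of how the two flags could steer caching."""
    # Assumption: without force_output_dir, the method picks its own subdirectory.
    target_dir = output_dir if force_output_dir else output_dir / "embeddings"
    target_dir.mkdir(parents=True, exist_ok=True)
    embeddings_file = target_dir / "embeddings.h5"
    # Reuse an existing file unless re-computation is forced.
    if force_recomputing or not embeddings_file.exists():
        compute(embeddings_file)
    return embeddings_file
```

A caller outside the pipeline would typically set force_output_dir=True to keep full control over where files are written.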
v0.8.2
Maintenance
- Updating dependencies
Features
- Adding option to ignore verification of files in configurator.py. This makes it possible to verify a biotrainer configuration independently of the provided files.
- Adding new compute_embeddings_from_list function to embedding_service.py. This allows computing embeddings directly from sequence strings.
v0.8.1
12.01.2024 - Version 0.8.1
Maintenance
- Updating dependencies after removing bio_embeddings, notably upgrading torch and adding accelerate
- Updating examples, documentation, config and test files for inferencer tests to match the new compile mode
- Replacing the exception with a warning if dropout_rate is set for a model that does not support it (e.g. LogReg)
Features
- Enabling pytorch compile mode. The feature has existed since torch 2.0 and is now available in biotrainer. It can be enabled via disable_pytorch_compile: False