0.4.7
## Main changes
### Support for MiniLM Model (#464)
An interesting model from Microsoft that is up to 2.7x faster than BERT while showing similar or better performance on many tasks (Paper). We found it particularly interesting for QA and also published a model fine-tuned on SQuAD 2.0: deepset/minilm-uncased-squad2
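The published model can be queried like any other FARM QA model. A minimal sketch (the question and context below are made up for illustration; the download and inference calls are commented out so the snippet stays self-contained):

```python
# Input dicts for extractive QA with the released MiniLM model.
# The example question/context are illustrative only.
QA_input = [
    {
        "questions": ["What is MiniLM fine-tuned on?"],
        "text": "deepset published a MiniLM model fine-tuned on SQuAD 2.0.",
    }
]

# Uncomment to run the actual inference (downloads the model):
# from farm.infer import Inferencer
# model = Inferencer.load("deepset/minilm-uncased-squad2",
#                         task_type="question_answering")
# result = model.inference_from_dicts(dicts=QA_input)
```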
### Benchmarks per component (#491)
We now measure the speed of individual components in the pipeline while respecting CUDA's async behaviour. We were especially interested in how much time QA spends in preprocessing, the language model, and the prediction head. It turns out to be about 20% : 50% : 30% on average. Interestingly, there's high variance in the prediction head depending on the relevance of the question. We will use this information to further optimize performance in the prediction head and will share more detailed benchmarks soon.
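The per-component timing idea can be sketched in plain Python. The three stage functions below are stand-ins, not FARM's actual components; with real models on a GPU you would call `torch.cuda.synchronize()` before each timestamp so CUDA's asynchronous execution does not distort the measurements:

```python
import time

# Stand-in stage functions (hypothetical, for illustration only).
def preprocess(batch):
    return [x.lower() for x in batch]

def language_model(batch):
    return [hash(x) for x in batch]

def prediction_head(batch):
    return [h % 2 for h in batch]

def benchmark(batch):
    """Time each pipeline stage and return its share of total runtime."""
    timings = {}
    for name, fn in [("preprocessing", preprocess),
                     ("language_model", language_model),
                     ("prediction_head", prediction_head)]:
        # On GPU: torch.cuda.synchronize() here, before and after the call.
        start = time.perf_counter()
        batch = fn(batch)
        timings[name] = time.perf_counter() - start
    total = sum(timings.values())
    return {name: t / total for name, t in timings.items()}

shares = benchmark(["Some question?"] * 1000)
```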
### Support for PyTorch 1.6 (#502)
We now support PyTorch 1.6 and 1.5.1.
## Details
### Question Answering
- Pass max_answers param to processor #503
- Deprecate QA input dicts with [context, qas] as keys #472
- Squad processor verbose feature #470
- Propagate QA ground truth in Inferencer #469
- Ensure QAInferencer always has task_type "question_answering" #460
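Regarding the deprecation in #472, migrating old-style input dicts is a one-line mapping. A sketch, assuming the replacement keys are "text" and "questions" (the old keys "context" and "qas" are the ones named in the deprecation; verify the new key names against your FARM version):

```python
def convert_qa_input(old: dict) -> dict:
    """Convert a deprecated {"context": ..., "qas": [...]} QA input dict
    to the newer {"text": ..., "questions": [...]} layout."""
    return {"text": old["context"], "questions": old["qas"]}

old_input = {"context": "FARM is built on PyTorch.",
             "qas": ["What is FARM built on?"]}
new_input = convert_qa_input(old_input)
# → {'text': 'FARM is built on PyTorch.', 'questions': ['What is FARM built on?']}
```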
### Other
- Download models from (private) S3 #500
- Fix _initialize_data_loaders in data_silo #476
- Remove torch version wildcard in requirements #489
- Make num processes parameter consistent across inferencer and data silo #480
- Remove rest_api_schema argument in inference_from_dicts() #474
- farm.data_handler.utils: Add encoding to open write in split_file method #466
- Fix and document Inferencer usage and pool handling #429
- Remove assertions or replace with logging error #468
- Remove baskets without features in _create_dataset #471
- Fix bugs with regression label standardization #456
Big thanks to all contributors!
@PhilipMay @Timoeller @tanaysoni @brandenchan @bogdankostic @kolk @rohanag @lingsond @ftesser