Frequently Asked Questions (FAQ)
In our experiments, few-shot learning achieved relatively low accuracy for language models that can be easily deployed on consumer-grade GPUs, such as GPT-2. However, the beam search approach implemented in hashformers enabled us to achieve state-of-the-art results for this task.
Nevertheless, it remains an open research question whether few-shot learning is more effective than our current beam search approach for large language models at the scale of GPT-J and Dolly. I plan to eventually benchmark and compare both approaches on our hashtag segmentation datasets.
Since the release of version 2.0, hashformers supports any valid device as `segmenter_device` or `reranker_device`:
```python
from hashformers import TransformerWordSegmenter as WordSegmenter

ws = WordSegmenter(
    segmenter_model_name_or_path="gpt2",
    segmenter_model_type="incremental",
    segmenter_device="cpu",
    reranker_device="cpu",
    segmenter_gpu_batch_size=100,
    reranker_gpu_batch_size=100,
    reranker_model_name_or_path="bert-base-cased",
    reranker_model_type="masked"
)
```
Note: the `segmenter_gpu_batch_size` and `reranker_gpu_batch_size` parameters set the batch size on the CPU when `segmenter_device` and `reranker_device` are set to `cpu` instead of `cuda`. These arguments keep their original names for backward compatibility with previous versions of the library.