
Frequently Asked Questions (FAQ)


Isn't it possible to solve this problem simply with few-shot learning?

In our experiments, few-shot learning achieved relatively low accuracy with language models that can easily be deployed on consumer-grade GPUs, such as GPT-2. The beam search approach implemented in hashformers, on the other hand, enabled us to achieve state-of-the-art results on this task.

Nevertheless, it remains an open research question whether few-shot learning is more effective than our current beam search approach for large language models at the scale of GPT-J and Dolly. I plan to eventually benchmark and compare both approaches on our hashtag segmentation datasets.
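For reference, a few-shot baseline amounts to prompting a causal language model with a handful of solved examples and letting it complete the segmentation of a new hashtag. The sketch below illustrates this idea with the Hugging Face transformers pipeline; the prompt format and the example hashtags are assumptions made for illustration and are not part of the hashformers API.

from transformers import pipeline

# Greedy decoding with GPT-2 on an illustrative few-shot prompt.
generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Segment each hashtag into words.\n"
    "Hashtag: #iloveny -> i love ny\n"
    "Hashtag: #worldbookday -> world book day\n"
    "Hashtag: #springbreak -> "
)

completion = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
# Keep only the first line of the model's continuation after the prompt.
continuation = completion[len(prompt):]
print(continuation.split("\n")[0].strip())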

Is it possible to use hashformers on CPU, without CUDA enabled for torch?

As of version 2.0, hashformers supports any valid torch device string, such as "cpu" or "cuda", as segmenter_device or reranker_device:

from hashformers import TransformerWordSegmenter as WordSegmenter

ws = WordSegmenter(
    segmenter_model_name_or_path="gpt2",
    segmenter_model_type="incremental",
    segmenter_device="cpu",  # run the segmenter on CPU
    reranker_device="cpu",  # run the reranker on CPU
    segmenter_gpu_batch_size=100,
    reranker_gpu_batch_size=100,
    reranker_model_name_or_path="bert-base-cased",
    reranker_model_type="masked"
)

Note: Despite their names, the segmenter_gpu_batch_size and reranker_gpu_batch_size parameters also set the batch size on the CPU when segmenter_device and reranker_device are set to cpu instead of cuda. The arguments keep their original names for backward compatibility with previous versions of the library.
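
Once the segmenter has been instantiated as above, hashtags are segmented on CPU exactly as they would be on GPU. The snippet below is a minimal usage sketch following the pattern in the project README; the hashtags and the printed output are illustrative, and the exact return value of segment may vary between library versions.

# Segment a batch of hashtags with the CPU-only segmenter created above.
segmentations = ws.segment([
    "#weneedanationalpark",
    "#icecold"
])
print(segmentations)
# Illustrative output: ['we need a national park', 'ice cold']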
