sts_bert_distilroberta-base Yahoo Answers small embeddings

dataset

Sentence embeddings computed on a 10% subset of the Yahoo Answers dataset.

Yahoo Answers dataset: https://www.kaggle.com/datasets/yacharki/yahoo-answers-10-categories-for-nlp-csv

The indexes used to select this subset can be found in the code repository: https://github.com/bemigini/hubness-reduction-sentence-bert
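As a rough illustration of how a list of indexes can be used to recover the matching rows of the original dataset, here is a minimal Python sketch. The function name `select_rows`, the toy CSV content, the absence of a header row, and the 0-based index convention are all assumptions made for this example. The actual index file format and loading code are defined in the code repository and may differ.

```python
import csv
import io


def select_rows(csv_text, indexes):
    """Return the CSV rows (as lists of strings) at the given 0-based positions.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [rows[i] for i in indexes]


# Toy stand-in for the dataset: class label, question text (no header row assumed).
toy_csv = (
    "1,How do magnets work?\n"
    "2,What is a prime number?\n"
    "3,Who wrote Hamlet?\n"
    "4,Why is the sky blue?\n"
)

# Hypothetical index list; the real indexes are in the code repository.
toy_indexes = [0, 3]

subset = select_rows(toy_csv, toy_indexes)
for label, question in subset:
    print(label, question)
```

Keeping the rows in the same order as the index list means that row i of the subset would line up with embedding i, assuming the embeddings were stored in that order.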

The embeddings were produced with Sentence-BERT models that use distilroberta-base (https://huggingface.co/distilroberta-base) as the base model.

For more details on the models, see the model item (DOI: 10.11583/DTU.20708785).

Funding