We are currently experiencing problems with downloads of the datasets.
Files (HDF5):
all-distilroberta-v1_yahoo_answers_small_test.h5 (16.38 MB)
all-distilroberta-v1_yahoo_answers_small_train.h5 (381.25 MB)
all-MiniLM-L12-v2_yahoo_answers_small_test.h5 (8.17 MB)
all-MiniLM-L12-v2_yahoo_answers_small_train.h5 (190.17 MB)
all-mpnet-base-v2_yahoo_answers_small_test.h5 (16.39 MB)
all-mpnet-base-v2_yahoo_answers_small_train.h5 (381.4 MB)
multi-qa-distilbert-cos-v1_yahoo_answers_small_test.h5 (16.37 MB)
multi-qa-distilbert-cos-v1_yahoo_answers_small_train.h5 (381.1 MB)
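The files above are HDF5 containers. A minimal sketch of reading one with h5py is shown below; the dataset key names ("embeddings", "labels") are my assumption, not confirmed by this record — inspect f.keys() on the real files to find the actual layout. The sketch writes a small demo file first so it is self-contained.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.gettempdir(), "demo_embeddings.h5")

# Assumed layout: one row per embedded document plus integer class labels.
# The key names "embeddings" and "labels" are a guess; check f.keys() on
# the downloaded files to see the real dataset names.
rng = np.random.default_rng(0)
demo = rng.normal(size=(10, 768)).astype(np.float32)  # 768-dim demo vectors

with h5py.File(path, "w") as f:
    f.create_dataset("embeddings", data=demo)
    f.create_dataset("labels", data=np.arange(10))

with h5py.File(path, "r") as f:
    keys = list(f.keys())       # inspect which datasets the file contains
    emb = f["embeddings"][:]    # load the full matrix into memory
    labels = f["labels"][:]

print(keys, emb.shape, labels.shape)
```

h5py also supports slicing (f["embeddings"][:1000]) so the large train files need not be loaded whole.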
Pretrained Sentence BERT models Yahoo Answers small embeddings
Dataset posted on 2023-05-03, 10:57, authored by Beatrix Miranda Ginn Nielsen.
Embeddings on 10% of the Yahoo Answers dataset using pretrained Sentence BERT models.
Yahoo Answers dataset: https://www.kaggle.com/datasets/yacharki/yahoo-answers-10-categories-for-nlp-csv
Indexes used can be found in the code repository.
Pretrained models used:
all-distilroberta-v1: https://huggingface.co/sentence-transformers/all-distilroberta-v1
all-MiniLM-L12-v2: https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2
all-mpnet-base-v2: https://huggingface.co/sentence-transformers/all-mpnet-base-v2
multi-qa-distilbert-cos-v1: https://huggingface.co/sentence-transformers/multi-qa-distilbert-cos-v1
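Embeddings from these Sentence BERT models are typically compared with cosine similarity (multi-qa-distilbert-cos-v1 in particular is tuned for cosine-scored retrieval). A minimal sketch, assuming 768-dimensional vectors as produced by the mpnet/distilroberta/distilbert models (the MiniLM-L12 model uses a smaller dimension, which is consistent with its files above being roughly half the size):

```python
import numpy as np

def cosine_sim(u: np.ndarray, v: np.ndarray) -> float:
    # cos(u, v) = (u . v) / (|u| * |v|)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embedding vectors standing in for two embedded documents.
rng = np.random.default_rng(1)
a = rng.normal(size=768)
b = rng.normal(size=768)

s = cosine_sim(a, b)
print(s)
```

If the stored embeddings are already L2-normalized (as the sentence-transformers model cards suggest for these models), cosine similarity reduces to a plain dot product, which is cheaper over the full train set.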
Funding
Danish Pioneer Centre for AI, DNRF grant number P1