Technical University of Denmark
8 files

Pretrained Sentence BERT models Yahoo Answers small embeddings

dataset
posted on 2023-05-03, 10:57, authored by Beatrix Miranda Ginn Nielsen

Embeddings of a 10% subset of the Yahoo Answers dataset, computed with pretrained Sentence BERT models.

Yahoo Answers dataset: https://www.kaggle.com/datasets/yacharki/yahoo-answers-10-categories-for-nlp-csv

The indexes of the samples included in the 10% subset can be found in the code repository.


Pretrained models used:

 all-distilroberta-v1: https://huggingface.co/sentence-transformers/all-distilroberta-v1

 all-MiniLM-L12-v2: https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2

 all-mpnet-base-v2: https://huggingface.co/sentence-transformers/all-mpnet-base-v2

 multi-qa-distilbert-cos-v1: https://huggingface.co/sentence-transformers/multi-qa-distilbert-cos-v1
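
As a rough illustration of how embeddings like these could be produced, the sketch below encodes a few texts with each listed model through the sentence-transformers library. It is not the author's pipeline: the helper names (hub_id, embed_texts, demo), the batch size, the default encode() settings and the example texts are all assumptions made for illustration. The preprocessing and settings behind the published files are not stated on this page.

```python
from typing import Sequence

# Model names as listed above (hosted under the sentence-transformers
# organisation on the Hugging Face hub).
MODEL_NAMES = [
    "all-distilroberta-v1",
    "all-MiniLM-L12-v2",
    "all-mpnet-base-v2",
    "multi-qa-distilbert-cos-v1",
]


def hub_id(model_name: str) -> str:
    """Return the Hugging Face hub identifier for a listed model (hypothetical helper)."""
    return f"sentence-transformers/{model_name}"


def embed_texts(texts: Sequence[str], model_name: str, batch_size: int = 32):
    """Encode texts with one pretrained model and return a 2-D NumPy array.

    Assumption: default encode() settings are used; the settings behind
    the published embeddings are not documented on this page.
    """
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(hub_id(model_name))  # downloads weights on first use
    return model.encode(list(texts), batch_size=batch_size, show_progress_bar=False)


def demo() -> None:
    """Embed two made-up questions with every listed model and print the shapes."""
    # Hypothetical example texts, not taken from the dataset.
    examples = ["How do I reset my router?", "What is the capital of Denmark?"]
    for name in MODEL_NAMES:
        vectors = embed_texts(examples, name)
        print(name, vectors.shape)
```

Calling demo() requires the sentence-transformers package and network access for the first model download. It prints one line per model with the shape of the resulting embedding matrix.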







Funding

Danish Pioneer Centre for AI, DNRF grant number P1
