Sentence BERT models trained on STS benchmark
See Sentence Transformers (https://www.sbert.net/) for general information on training and using sentence BERT models.
The STS benchmark dataset was fetched from https://sbert.net/datasets/stsbenchmark.tsv.gz.
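Below is a minimal sketch of how one of these models could be loaded and used on the STS benchmark data. It assumes the models are stored locally as saved sentence-transformers model directories and that the TSV has the usual split/score/sentence1/sentence2 columns; the file paths and column names are assumptions, not part of this repository.

    # Minimal sketch: read the STS benchmark TSV and embed a sentence pair
    # with one of the trained models (paths and column names are assumed).
    import csv
    import gzip

    from sentence_transformers import SentenceTransformer

    # Read the STS benchmark pairs (columns assumed: split, score, sentence1, sentence2).
    pairs = []
    with gzip.open("stsbenchmark.tsv.gz", "rt", encoding="utf8") as f_in:
        reader = csv.DictReader(f_in, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row["split"] == "test":
                pairs.append((row["sentence1"], row["sentence2"], float(row["score"])))

    # Load one of the trained models and embed the first test pair.
    model = SentenceTransformer("sts_bert_distilroberta-base_cos_dist_ORTHOGONAL_z_False_n_False_c_False_seed1")
    emb1, emb2 = model.encode([pairs[0][0], pairs[0][1]])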
Models follow the naming scheme:
sts_bert_[base model name]_[distance function]_[placement of most dissimilar]_z_[z score normalisation]_n_[normalisation of embeddings]_c_[centering of embeddings]_seed[random seed]
Example: sts_bert_distilroberta-base_cos_dist_ORTHOGONAL_z_False_n_False_c_False_seed1
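For illustration only, a small helper that splits a model name back into the fields of the naming scheme; the regular expression and field names below are hypothetical and not part of the trained models.

    # Illustrative only: split a model name into the fields of the naming scheme above.
    import re

    NAME_PATTERN = re.compile(
        r"sts_bert_(?P<base_model>.+)_(?P<distance>cos|cos_dist|euclidean)"
        r"_(?P<placement>ORTHOGONAL|OPPOSITE)"
        r"_z_(?P<z_score>True|False)_n_(?P<normalise>True|False)_c_(?P<centre>True|False)"
        r"_seed(?P<seed>\d+)"
    )

    fields = NAME_PATTERN.match(
        "sts_bert_distilroberta-base_cos_dist_ORTHOGONAL_z_False_n_False_c_False_seed1"
    ).groupdict()
    # fields -> {'base_model': 'distilroberta-base', 'distance': 'cos_dist', ...}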
Base model name
Three different base models were used:
- distilroberta-base (https://huggingface.co/distilroberta-base)
- microsoft-MiniLM-L12-H384-uncased (https://huggingface.co/microsoft/MiniLM-L12-H384-uncased)
- microsoft-mpnet-base (https://huggingface.co/microsoft/mpnet-base)
Distance function
Three different distance (or similarity) functions were used; their standard definitions are sketched after this list.
- Cosine similarity (cos)
- Cosine distance (cos_dist)
- Euclidean distance (euclidean)
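The sketch below shows the standard definitions of these three measures using numpy; it illustrates the measures themselves, not the exact loss used during training.

    # Standard definitions of the three measures, computed with numpy.
    import numpy as np

    def cosine_similarity(a, b):
        # 1.0 for identical directions, 0.0 for orthogonal vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def cosine_distance(a, b):
        # 1 - cosine similarity, so 0.0 for identical directions.
        return 1.0 - cosine_similarity(a, b)

    def euclidean_distance(a, b):
        # Straight-line distance between the two embedding vectors.
        return float(np.linalg.norm(a - b))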
Placement of most dissimilar
This describes how the embeddings of two completely dissimilar texts should ideally be placed relative to each other.
All of these models use ORTHOGONAL, meaning the embeddings of two completely dissimilar texts should ideally be 90 degrees apart.
An alternative would have been OPPOSITE, where the embeddings of completely dissimilar texts should be 180 degrees apart.
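One plausible way to express the two placements is as target cosine similarities for an STS label in [0, 5]. The exact mapping used during training is not documented here, so treat the following purely as an interpretation sketch.

    # Interpretation sketch: map an STS label in [0, 5] to a target cosine similarity.
    def target_cosine(sts_score, placement="ORTHOGONAL"):
        similarity = sts_score / 5.0           # rescale label to [0, 1]
        if placement == "ORTHOGONAL":
            return similarity                  # dissimilar pairs target cos 0 (90 degrees)
        if placement == "OPPOSITE":
            return 2.0 * similarity - 1.0      # dissimilar pairs target cos -1 (180 degrees)
        raise ValueError(f"Unknown placement: {placement}")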
Z-score normalisation
Whether z-score normalisation of the embeddings was performed during training. True/False.
Normalisation of embeddings
Whether normalisation of embeddings was performed during training. True/False.
Centering of embeddings
Whether centering of the embeddings was performed during training. True/False.
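The sketch below shows these three transformations as they are commonly defined, applied to a batch of embeddings. The assumption that "normalisation" means scaling each embedding to unit L2 length, and the exact point in the pipeline where the transformations are applied, are not specified here.

    # Common definitions of the three embedding transformations, on a batch of shape (n, d).
    import numpy as np

    def z_score_normalise(embeddings):
        # Subtract the per-dimension mean and divide by the per-dimension standard deviation.
        mean = embeddings.mean(axis=0)
        std = embeddings.std(axis=0) + 1e-12
        return (embeddings - mean) / std

    def normalise(embeddings):
        # Scale each embedding to unit (L2) length.
        norms = np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12
        return embeddings / norms

    def centre(embeddings):
        # Subtract the per-dimension mean so the batch is centred at the origin.
        return embeddings - embeddings.mean(axis=0)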
Random seed
The random seed used for training.