Technical University of Denmark
540 files

Sentence BERT models trained on STS benchmark

posted on 2023-09-28, 06:46 authored by Beatrix Miranda Ginn Nielsen


See sentence transformers for training and use of sentence BERT models in general.

The STS benchmark dataset was fetched from

Models follow the naming scheme: 

sts_bert_[base model name]_[distance function]_[placement of most dissimilar]_z_[z score normalisation]_n_[normalisation of embeddings]_c_[centering of embeddings]_seed[random seed] 

Example: sts_bert_distilroberta-base_cos_dist_ORTHOGONAL_z_False_n_False_c_False_seed1
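The naming scheme above can be split back into its components programmatically. The following is a minimal sketch, not part of the dataset: the helper name parse_model_name and the dictionary keys are hypothetical, and the pattern assumes that the base model name contains no underscore and that the fields only take the values listed in this description.

```python
import re

# Pattern for the naming scheme described above. The accepted field values are
# taken from this description; the assumption that the base model name contains
# no underscore is ours.
MODEL_NAME_RE = re.compile(
    r"^sts_bert_(?P<base_model>[^_]+)"
    r"_(?P<distance>cos_dist|cos|euclidean)"
    r"_(?P<placement>ORTHOGONAL|OPPOSITE)"
    r"_z_(?P<z_score>True|False)"
    r"_n_(?P<normalise>True|False)"
    r"_c_(?P<center>True|False)"
    r"_seed(?P<seed>\d+)$"
)


def parse_model_name(name):
    """Hypothetical helper: split a model name into its components."""
    match = MODEL_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"Name does not follow the naming scheme: {name}")
    parts = match.groupdict()
    # Convert the True/False flags and the seed to native Python types.
    for flag in ("z_score", "normalise", "center"):
        parts[flag] = parts[flag] == "True"
    parts["seed"] = int(parts["seed"])
    return parts


example = "sts_bert_distilroberta-base_cos_dist_ORTHOGONAL_z_False_n_False_c_False_seed1"
print(parse_model_name(example))
```

A helper like this may be convenient for filtering the files by base model, distance function, or seed.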

Base model name

Three different base models were used:

- distilroberta-base

- microsoft-MiniLM-L12-H384-uncased

- microsoft-mpnet-base

Distance function

Three different distance functions were used. 

- Cosine similarity (cos)

- Cosine distance (cos_dist)

- Euclidean distance (euclidean)
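For reference, the three functions listed above can be sketched in plain Python. This uses the standard textbook definitions and assumes that cosine distance means one minus cosine similarity; the exact implementation used when training the models is not stated here and may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    # Assumption: cosine distance is defined as 1 - cosine similarity.
    return 1.0 - cosine_similarity(u, v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


u = [1.0, 0.0]
v = [0.0, 1.0]
print(cosine_similarity(u, v), cosine_distance(u, v), euclidean_distance(u, v))
```

Note that cosine similarity grows as texts become more alike, while the two distances shrink.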

Placement of most dissimilar

This describes where the embeddings of two completely dissimilar texts should ideally be placed relative to each other. 

In all these models we have chosen ORTHOGONAL, so the embeddings of two completely dissimilar texts should have an angle of 90 degrees between them. 

Another choice could have been OPPOSITE, where the embeddings of two completely dissimilar texts should have an angle of 180 degrees between them. 
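One way to picture the difference between the two placements is as a mapping from a similarity label to a target cosine. The sketch below is purely illustrative: it assumes the similarity has been rescaled to the range 0 to 1 and that the mapping is linear, neither of which is stated in this description, and the function names are hypothetical.

```python
import math


def target_cosine(similarity, placement):
    """Hypothetical mapping from a similarity in [0, 1] to a target cosine."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    if placement == "ORTHOGONAL":
        # Completely dissimilar (0) maps to cosine 0, i.e. 90 degrees.
        return similarity
    if placement == "OPPOSITE":
        # Completely dissimilar (0) maps to cosine -1, i.e. 180 degrees.
        return 2.0 * similarity - 1.0
    raise ValueError(f"Unknown placement: {placement}")


def target_angle_degrees(similarity, placement):
    """Angle in degrees that corresponds to the target cosine."""
    return math.degrees(math.acos(target_cosine(similarity, placement)))


for placement in ("ORTHOGONAL", "OPPOSITE"):
    print(placement, target_angle_degrees(0.0, placement))
```

Under both placements, identical texts (similarity 1) map to an angle of 0 degrees; only the treatment of dissimilar texts differs.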

Z-score normalisation

Whether z-score normalisation of the embeddings was performed during training. True/False.

Normalisation of embeddings

Whether normalisation of embeddings was performed during training. True/False.

Centering of embeddings

Whether centering of the embeddings was performed during training. True/False.
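The three options above (z-score normalisation, normalisation, and centering) can be illustrated on a small batch of embeddings. This is a sketch under assumptions that are not confirmed by this description: z-score normalisation and centering are taken to operate per dimension across the batch, and normalisation is taken to mean scaling each embedding to unit L2 length. The function names are hypothetical.

```python
import math


def column_means(embeddings):
    """Mean of each embedding dimension across the batch."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]


def center(embeddings):
    # Assumption: centering subtracts the per-dimension batch mean.
    means = column_means(embeddings)
    return [[x - m for x, m in zip(row, means)] for row in embeddings]


def z_score(embeddings, eps=1e-12):
    # Assumption: subtract the per-dimension mean, then divide by the
    # per-dimension (population) standard deviation.
    n = len(embeddings)
    means = column_means(embeddings)
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(zip(*embeddings), means)
    ]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(row, means, stds)]
        for row in embeddings
    ]


def l2_normalise(embeddings, eps=1e-12):
    # Assumption: normalisation scales each embedding to unit L2 length.
    result = []
    for row in embeddings:
        norm = math.sqrt(sum(x * x for x in row))
        result.append([x / (norm + eps) for x in row])
    return result


batch = [[1.0, 2.0], [3.0, 6.0]]
print(center(batch))
print(z_score(batch))
print(l2_normalise(batch))
```

The small eps term only guards against division by zero for constant dimensions or zero vectors.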

Random seed

The random seed used for training. 


Danish Pioneer Centre for AI, DNRF grant number P1



    DTU Compute