Microsoft Open-Sources Harrier Text Embedding Models in Three Sizes; 27B Version Tops Multilingual MTEB v2 Leaderboard

According to 1M AI News monitoring, Microsoft has released harrier-oss-v1, an open-source multilingual text embedding model family, on Hugging Face in three sizes: 270M, 0.6B, and 27B parameters. The model card states that the series uses a decoder-only architecture with last-token pooling and L2 normalization, supports context lengths of up to 32,768 tokens, and can be used for retrieval, clustering, semantic similarity, classification, bitext mining, and re-ranking.
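The two operations named in the model card, last-token pooling and L2 normalization, are simple to illustrate. The NumPy sketch below is a generic illustration of that pattern, not code from the harrier release: it selects the hidden state of each sequence's final non-padding token from decoder outputs, then scales each vector to unit length.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Pool decoder outputs by taking each sequence's last real token, then L2-normalize.

    hidden_states:  (batch, seq_len, dim) final-layer decoder outputs
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Index of the last non-padding token per sequence (assumes right-side padding)
    last_idx = attention_mask.sum(axis=1) - 1                       # (batch,)
    pooled = hidden_states[np.arange(hidden_states.shape[0]), last_idx]  # (batch, dim)
    # L2-normalize so cosine similarity reduces to a plain dot product
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Toy example: 2 sequences, length 3, hidden size 4
rng = np.random.default_rng(0)
h = rng.standard_normal((2, 3, 4))
mask = np.array([[1, 1, 0], [1, 1, 1]])
emb = last_token_pool(h, mask)
print(np.linalg.norm(emb, axis=1))  # each row has unit norm
```

For the real models, a library such as transformers or sentence-transformers would produce `hidden_states` and `attention_mask`; the pooling step itself is the same.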

Multilingual MTEB v2 is a widely used industry benchmark for multilingual text embeddings, covering tasks such as retrieval, classification, clustering, and semantic similarity. According to Microsoft's model card, the three sizes score 66.5, 69.0, and 74.3 on this benchmark, with the 27B version ranking first on the day of its release. The 270M and 0.6B versions were additionally trained with knowledge distillation from larger embedding models. All three models are released under the MIT license.
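Because the embeddings are L2-normalized, the retrieval use case mentioned above reduces to a dot product: scoring a query against a corpus is one matrix-vector multiply. A minimal sketch with hypothetical 4-dimensional vectors standing in for real model outputs:

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale vectors to unit L2 norm along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical embeddings: one query and three candidate documents
query = normalize(np.array([1.0, 0.0, 1.0, 0.0]))
docs = normalize(np.array([
    [1.0, 0.1, 0.9, 0.0],   # close to the query
    [0.0, 1.0, 0.0, 1.0],   # unrelated topic
    [0.5, 0.5, 0.5, 0.5],   # partial overlap
]))

# With unit-norm vectors, dot product equals cosine similarity
scores = docs @ query
order = np.argsort(-scores)       # indices sorted best-first
print(order)  # → [0 2 1]: document 0 is the closest match
```

The same ranking step applies unchanged to any embedding model that normalizes its outputs; only the vectors' source differs.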
