Octen Series: Optimizing Embedding Models to #1 on RTEB Leaderboard
1. Background and Motivation

1.1 RTEB: The True Litmus Test for Retrieval Capability

RTEB (Retrieval Embedding Benchmark) is a new-generation retrieval evaluation benchmark launched by MTEB. Compared with the traditional MTEB, RTEB focuses on real-world industry application scenarios and aims to address the problem of models “overfitting” to public benchmarks.

Core Features of RTEB:

- Industry-Oriented: Covers key enterprise domains
  - Legal: Legal document retrieval in German, English, Japanese, and French
  - Finance: Financial reports, Q&A, and personal finance content
  - Healthcare: Medical Q&A, clinical dialogues, and health consultations
  - Code: Programming problems, code search, and SQL queries
- Hybrid Evaluation Strategy: Open datasets + Private datasets ...