🚀 Open-Retrievals Examples

Welcome to Open-Retrievals, a cutting-edge repository designed to empower your retrieval-augmented generation (RAG) pipelines with state-of-the-art techniques in embedding, reranking, and RAG integration.

🔍 1. Embedding Models

Model	Original	Finetuned
m3e	0.654	0.693
bge-base-zh-v1.5	0.657	0.703
Qwen2-1.5B-Instruct	-	0.695
e5-mistral-7b-instruct	0.651	0.699

Data Format

Text pair: use in-batch negative fine-tuning

{'query': TEXT_TYPE, 'positive': List[TEXT_TYPE]}
...

Text triplet: use hard-negative fine-tuning (optionally mixed with in-batch negatives)

{'query': TEXT_TYPE, 'positive': List[TEXT_TYPE], 'negative': List[TEXT_TYPE]}
...

Text scored pair:

{(query, positive, label), (query, negative, label), ...}
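The three formats above can be sketched as plain Python records. This is a minimal illustration only: the example texts, the 1/0 label values, the `record_kind` helper, and the choice of one JSON object per line are assumptions, not something this README specifies.

```python
import json

# Text pair: a query with one or more positive passages.
pair = {
    "query": "what is dense retrieval?",
    "positive": ["Dense retrieval encodes queries and documents as vectors."],
}

# Text triplet: the same record plus explicit hard negatives.
triplet = {
    "query": "what is dense retrieval?",
    "positive": ["Dense retrieval encodes queries and documents as vectors."],
    "negative": ["BM25 is a lexical ranking function."],
}

# Text scored pairs: (query, passage, label) tuples.
# The 1/0 labels are an assumed convention for illustration.
scored_pairs = [
    (triplet["query"], triplet["positive"][0], 1),
    (triplet["query"], triplet["negative"][0], 0),
]


def record_kind(record: dict) -> str:
    """Classify a record by its keys (hypothetical helper)."""
    keys = set(record)
    if keys == {"query", "positive"}:
        return "pair"
    if keys == {"query", "positive", "negative"}:
        return "triplet"
    raise ValueError(f"unexpected keys: {sorted(keys)}")


# Serialize one record per line (JSON Lines is an assumed storage format).
lines = [json.dumps(r, ensure_ascii=False) for r in (pair, triplet)]
```

Checking `record_kind` on each record before training is a cheap way to catch a file that mixes pair and triplet rows.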

📊 2. Reranking

Model	Original	Finetuned
bge-reranker-base	0.666	0.706
bge-m3	0.657	0.695
Qwen2-1.5B-Instruct	-	0.699
bge-reranker-v2-gemma	0.637	0.706
chinese-roberta-wwm-ext (ColBERT)	-	0.687

📚 3. RAG

For a basic RAG application, refer to rag_langchain_demo.py

🚀 4. Deployment

Inference speed, fastest to slowest: NVIDIA TensorRT + NVIDIA Triton Inference Server > Microsoft ONNX Runtime + NVIDIA Triton Inference Server > PyTorch + FastAPI

4.1 Convert to ONNX

Prerequisites

pip install optimum
pip install onnxruntime

python embed2onnx.py --model_name BAAI/bge-small-en-v1.5 --output_path ./onnx_model

❓ 5. FAQ

Why is grad_norm always zero during training?

Consider switching between fp16 and bf16.
During training, set bf16 or fp16 in TrainingArguments; during inference, set use_fp16=True in AutoModelForEmbedding or LLMRanker.
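One possible reason the choice of half-precision format matters is numeric range: fp16 has far fewer exponent bits than bf16, so very small values can round to zero. The sketch below illustrates that with the standard library only. It is an assumption about a likely cause, not a diagnosis of any particular run, and `to_fp16` and `to_bf16` are hypothetical helper names.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision (fp16)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of a float32.

    This truncates the mantissa; real hardware usually rounds to nearest,
    but the representable range is the same.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


tiny = 1e-8  # stand-in for a very small gradient value

fp16_value = to_fp16(tiny)  # below the smallest fp16 subnormal, becomes 0.0
bf16_value = to_bf16(tiny)  # still non-zero, with reduced precision
```

If the zero grad_norm disappears after switching formats, range underflow of this kind is a plausible explanation.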

Why does the fine-tuned embedding model perform worse at inference than the original?

Check whether the pooling_method is correct.
For LLM-based models, check whether the prompt or instruction is exactly the same as the one used in training.
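To see why a pooling mismatch hurts, the sketch below applies three common pooling strategies to the same token embeddings and gets three different sentence vectors. The `pool` function and the method names "cls", "mean", and "last" are illustrative assumptions; the values that pooling_method actually accepts in open-retrievals may differ.

```python
from typing import List


def pool(
    token_embeddings: List[List[float]],
    attention_mask: List[int],
    method: str,
) -> List[float]:
    """Reduce per-token vectors to one vector, ignoring padded positions."""
    valid = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if method == "cls":
        return list(valid[0])  # first non-padded token
    if method == "last":
        return list(valid[-1])  # last non-padded token
    if method == "mean":
        dim = len(valid[0])
        return [sum(v[i] for v in valid) / len(valid) for i in range(dim)]
    raise ValueError(f"unknown pooling method: {method}")


tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the third position is padding

cls_vec = pool(tokens, mask, "cls")
mean_vec = pool(tokens, mask, "mean")
last_vec = pool(tokens, mask, "last")
```

A model fine-tuned with one of these and served with another produces vectors it was never trained to compare, which can look like a regression.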

How can we fine-tune the BAAI/bge-m3 ColBERT model?

open-retrievals supports fine-tuning the BAAI/bge-m3 ColBERT model directly. Do not set use_fp16=True while fine-tuning, and use a smaller learning_rate.

Why is the performance worse after fine-tuning?

The collator and the loss must be aligned, especially for triplet training with negative embeddings. The collator provided by open-retrievals yields {query: value, positive: value, negative: value}. Another common collator yields {query: value, document: positive+negative}, so the loss function must treat each layout accordingly.
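The two batch layouts can be sketched with toy embeddings to show why each needs its own loss. Everything here is an illustrative assumption rather than the open-retrievals implementation: the function names, the softmax-style losses, and the ordering of `document` (all positives first, then all negatives).

```python
import math
from typing import Dict, List

Vector = List[float]
Batch = Dict[str, List[Vector]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def triplet_softmax_loss(batch: Batch) -> float:
    """Layout {query, positive, negative}: each query sees only its own pair."""
    losses = []
    for q, p, n in zip(batch["query"], batch["positive"], batch["negative"]):
        pos, neg = dot(q, p), dot(q, n)
        losses.append(math.log(math.exp(pos) + math.exp(neg)) - pos)
    return sum(losses) / len(losses)


def in_batch_softmax_loss(batch: Batch) -> float:
    """Layout {query, document}: each query is scored against every document.

    Assumes document i is the positive for query i (positives come first).
    """
    losses = []
    for i, q in enumerate(batch["query"]):
        scores = [dot(q, d) for d in batch["document"]]
        log_z = math.log(sum(math.exp(s) for s in scores))
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


def to_document_layout(batch: Batch) -> Batch:
    """Convert the triplet layout into {query, document: positive+negative}."""
    return {
        "query": batch["query"],
        "document": batch["positive"] + batch["negative"],
    }


batch_a = {
    "query": [[1.0, 0.0], [0.0, 1.0]],
    "positive": [[1.0, 0.0], [0.0, 1.0]],
    "negative": [[0.0, 1.0], [1.0, 0.0]],
}
batch_b = to_document_layout(batch_a)

loss_a = triplet_softmax_loss(batch_a)
loss_b = in_batch_softmax_loss(batch_b)
```

The same embeddings give different loss values under the two layouts, and passing a batch to the loss written for the other layout either fails on a missing key or silently uses the wrong targets.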
