update docs
ZiyiXia committed Dec 18, 2024
1 parent bc09ef5 commit aeafc7e
Showing 13 changed files with 187 additions and 21 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -53,6 +53,7 @@ BGE (BAAI General Embedding) focuses on retrieval-augmented LLMs, consisting of

## News

- 05/12/2024: :book: We built the BGE documentation to centralize BGE information and materials.
- 10/29/2024: :earth_asia: We created a WeChat group for BGE. Scan the [QR code](./imgs/BGE_WeChat_Group.png) to join the group chat! To get first-hand news about our updates and new releases, or to share any questions or ideas, join us now!
- <img src="./imgs/BGE_WeChat_Group.png" alt="bge_wechat_group" class="center" width="200">

@@ -109,16 +110,16 @@ Clone the repository and install
```
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
# pip install .[finetune]
```
For development in editable mode:
```
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install -e .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
# pip install -e .[finetune]
```

2 changes: 1 addition & 1 deletion docs/source/API/evaluation/beir/data_loader.rst
@@ -1,4 +1,4 @@
data loader
===========

.. autoclass:: FlagEmbedding.abc.evaluation.BEIREvalDataLoader
.. autoclass:: FlagEmbedding.evaluation.beir.BEIREvalDataLoader
23 changes: 21 additions & 2 deletions docs/source/FAQ/index.rst
@@ -5,39 +5,58 @@ Below are some commonly asked questions.

.. tip::

For more questions, search issues on GitHub or join our community!
For more questions, search the issues on GitHub or join our community!

.. dropdown:: Having network issues when connecting to Hugging Face?
:animate: fade-in-slide-down

Try setting :code:`HF_ENDPOINT` to the `HF mirror <https://hf-mirror.com/>`_ instead.

.. code:: bash
export HF_ENDPOINT=https://hf-mirror.com
.. dropdown:: When does the query instruction need to be used?
:animate: fade-in-slide-down

For a retrieval task that uses short queries to find long related documents, it is recommended to add instructions to these short queries.
The best way to decide whether to add instructions to queries is to choose whichever setting achieves better performance on your task.
In all cases, the documents/passages do not need an instruction.
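A minimal sketch of applying the instruction (assuming the :code:`FlagModel` API shown in the quick start): :code:`encode_queries` prepends the instruction to each query, while :code:`encode_corpus` leaves the passages untouched.

.. code:: python
from FlagEmbedding import FlagModel
# The instruction is prepended to queries only, never to passages.
model = FlagModel(
    'BAAI/bge-base-en-v1.5',
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
)
q_embs = model.encode_queries(["what is a reranker?"])  # instruction applied
p_embs = model.encode_corpus(["A reranker scores query-passage pairs."])  # no instruction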

.. dropdown:: Why does it take quite long to encode just 1 sentence?
:animate: fade-in-slide-down

Note that if you have multiple CUDA GPUs, FlagEmbedding will automatically use all of them.
Starting the multi-process pool then takes far longer than the actual encoding.
For simple tasks, use the CPU or a single GPU, as in the sketch below.
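A sketch restricting encoding to one device (the :code:`devices` argument is shown in the BGE v1 & v1.5 usage section below):

.. code:: python
from FlagEmbedding import FlagModel
# A single GPU (devices=0) avoids the multi-process startup cost;
# devices='cpu' for CPU-only runs (assumed from the same devices API).
model = FlagModel('BAAI/bge-base-en-v1.5', devices=0)
emb = model.encode(["one short sentence"])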

.. dropdown:: The embedding results are different for CPU and GPU?
:animate: fade-in-slide-down

The encode function uses FP16 by default when a GPU is available, which leads to a different precision.
Set :code:`use_fp16=False` to get full precision.
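For example (a sketch using the :code:`use_fp16` flag that appears throughout these docs):

.. code:: python
from FlagEmbedding import FlagModel
# Full precision on GPU, matching CPU results much more closely.
model = FlagModel('BAAI/bge-base-en-v1.5', use_fp16=False)
emb = model.encode(["a test sentence"])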

.. dropdown:: How many languages do the multi-lingual models support?
:animate: fade-in-slide-down

The training datasets cover 170+ languages.
However, due to the unbalanced distribution of languages, performance will vary across them.
Please test further on your real application scenario.

.. dropdown:: How do the different retrieval methods work in bge-m3?
:animate: fade-in-slide-down

- Dense retrieval: map the text into a single embedding, e.g., `DPR <https://arxiv.org/abs/2004.04906>`_, `BGE-v1.5 <../bge/bge_v1_v1.5>`_
- Sparse retrieval (lexical matching): a vector of size equal to the vocabulary, with most positions set to zero and a weight computed only for tokens present in the text, e.g., BM25, `unicoil <https://arxiv.org/pdf/2106.14807>`_, and `splade <https://arxiv.org/abs/2107.05720>`_
- Multi-vector retrieval: use multiple vectors to represent a text, e.g., `ColBERT <https://arxiv.org/abs/2004.12832>`_. A sketch of all three follows below.
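A minimal sketch of producing all three representations with bge-m3 (assuming the :code:`BGEM3FlagModel` API from the BGE-M3 page):

.. code:: python
from FlagEmbedding import BGEM3FlagModel
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)
out = model.encode(
    ["BGE-M3 supports dense, sparse, and multi-vector retrieval."],
    return_dense=True,         # one embedding per text
    return_sparse=True,        # per-token lexical weights
    return_colbert_vecs=True,  # multiple vectors per text
)
print(out['dense_vecs'].shape)       # dense embedding
print(out['lexical_weights'][0])     # sparse token weights
print(out['colbert_vecs'][0].shape)  # multi-vector representation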

.. dropdown:: Recommended vector database?
:animate: fade-in-slide-down

Generally you can use any vector database (open-source or commercial). We use `Faiss <https://github.com/facebookresearch/faiss>`_ by default in our evaluation pipeline and tutorials.
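A minimal Faiss sketch (an illustration of typical usage, not our exact evaluation pipeline); with L2-normalized embeddings, inner product equals cosine similarity:

.. code:: python
import faiss
import numpy as np
# Stand-in embeddings; in practice these come from a BGE model.
corpus_embs = np.random.randn(1000, 768).astype('float32')
corpus_embs /= np.linalg.norm(corpus_embs, axis=1, keepdims=True)
query_embs = corpus_embs[:2].copy()
index = faiss.IndexFlatIP(corpus_embs.shape[1])  # inner-product index
index.add(corpus_embs)
scores, ids = index.search(query_embs, k=5)  # top-5 passages per query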

.. dropdown:: Not enough VRAM or an OOM error during evaluation?
:animate: fade-in-slide-down

The default values of :code:`embedder_batch_size` and :code:`reranker_batch_size` are both 3000. Try a smaller value.
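Smaller batches can also be set per call; a sketch (assuming :code:`encode` and :code:`compute_score` accept a :code:`batch_size` argument, as in the inference APIs):

.. code:: python
# Hypothetical values: smaller batches trade throughput for lower peak VRAM.
embeddings = model.encode(sentences, batch_size=256)
scores = reranker.compute_score(pairs, batch_size=64)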
12 changes: 10 additions & 2 deletions docs/source/Introduction/index.rst
@@ -7,14 +7,22 @@ BGE builds one-stop retrieval toolkit for search and RAG. We provide inference,
:width: 700
:align: center

BGE embedder and reranker in an RAG pipelin. `Source <https://safjan.com/images/retrieval_augmented_generation/RAG.png>`_
BGE embedder and reranker in an RAG pipeline. `Source <https://safjan.com/images/retrieval_augmented_generation/RAG.png>`_

Quickly get started with:

.. toctree::
:maxdepth: 1
:caption: Start

overview
installation
quick_start
concept


.. toctree::
:maxdepth: 1
:caption: Concept

model
retrieval_demo
10 changes: 5 additions & 5 deletions docs/source/Introduction/installation.rst
@@ -6,7 +6,7 @@ Installation
Using pip:
----------

If you do not want to finetune the models, you can install the package without the finetune dependency:
If you do not need to finetune the models, you can install the package without the finetune dependency:

.. code:: bash
@@ -28,18 +28,18 @@ Clone the repository and install
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
pip install .[finetune]
For development in editable mode:

.. code:: bash
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install -e .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
pip install -e .[finetune]
PyTorch-CUDA
@@ -1,10 +1,12 @@
Concept
=======
Model
=====

If you are already familiar with the concepts, take a look at the :doc:`BGE models <../bge/index>`!

Embedder
--------

Embedder, or embedding model, is a model designed to convert data, usually text, codes, or images, into sparse or dense numerical vectors (embeddings) in a high dimensional vector space.
An embedder, also known as an embedding model or bi-encoder, is a model designed to convert data, usually text, code, or images, into sparse or dense numerical vectors (embeddings) in a high-dimensional vector space.
These embeddings capture the semantic meaning or key features of the input, enabling efficient comparison and analysis.

A very famous demonstration is the example from `word2vec <https://arxiv.org/abs/1301.3781>`_. It shows how word embeddings capture semantic relationships through vector arithmetic:
13 changes: 13 additions & 0 deletions docs/source/Introduction/overview.rst
@@ -0,0 +1,13 @@
Overview
========

Our repository provides well-structured `APIs <https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding>`_ for the inference, evaluation, and fine-tuning of BGE series models.
In addition, there are abundant `tutorials <https://github.com/FlagOpen/FlagEmbedding/tree/master/Tutorials>`_ and `examples <https://github.com/FlagOpen/FlagEmbedding/tree/master/examples>`_ to help users quickly gain hands-on experience.

.. figure:: https://raw.githubusercontent.com/FlagOpen/FlagEmbedding/refs/heads/master/imgs/projects.png
:width: 700
:align: center

Structure of contents in our `repo <https://github.com/FlagOpen/FlagEmbedding>`_

5 changes: 2 additions & 3 deletions docs/source/Introduction/quick_start.rst
@@ -7,9 +7,7 @@ First, load one of the BGE embedding models:
from FlagEmbedding import FlagAutoModel
model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5',
query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
use_fp16=True)
model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5')
.. tip::

@@ -22,6 +20,7 @@ First, load one of the BGE embedding models:
Then, feed some sentences to the model and get their embeddings:

.. code:: python
sentences_1 = ["I love NLP", "I love machine learning"]
sentences_2 = ["I love BGE", "I love text retrieval"]
embeddings_1 = model.encode(sentences_1)
2 changes: 1 addition & 1 deletion docs/source/bge/bge_m3.rst
@@ -119,5 +119,5 @@ Usage
Useful Links:

`API <../API/inference/embedder/encoder_only/M3Embedder>`_
`Tutorial <>`_
`Tutorial <https://github.com/FlagOpen/FlagEmbedding/blob/master/Tutorials/1_Embedding/1.2.4_BGE-M3.ipynb>`_
`Example <https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/embedder/encoder_only>`_
35 changes: 35 additions & 0 deletions docs/source/bge/bge_reranker.rst
@@ -1,2 +1,37 @@
BGE-Reranker
============

Different from an embedding model, a reranker (or cross-encoder) takes a question and a document as input and directly outputs a similarity score instead of an embedding.
To balance accuracy and time cost, cross-encoders are widely used to re-rank the top-k documents retrieved by simpler models.
For example, use a BGE embedding model to first retrieve the top 100 relevant documents, then use a BGE reranker to re-rank them and get the final top-3 results.
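As an illustration, a retrieve-then-rerank sketch (assuming the :code:`FlagModel` and :code:`FlagReranker` APIs shown in these docs; the corpus and candidate counts are made up):

.. code:: python
from FlagEmbedding import FlagModel, FlagReranker
corpus = ["Doc about rerankers.", "Doc about embeddings.", "Unrelated doc."]
query = "what does a reranker do?"
# Stage 1: fast dense retrieval over the whole corpus.
embedder = FlagModel('BAAI/bge-base-en-v1.5')
q_emb = embedder.encode_queries([query])
d_embs = embedder.encode_corpus(corpus)
top_ids = (q_emb @ d_embs.T)[0].argsort()[::-1][:2]  # keep top-2 candidates
# Stage 2: slower but more accurate cross-encoder scoring of the candidates.
reranker = FlagReranker('BAAI/bge-reranker-base')
scores = reranker.compute_score([[query, corpus[i]] for i in top_ids])
ranked = sorted(zip(scores, top_ids), reverse=True)
print(corpus[ranked[0][1]])  # best passage after reranking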

The first series of BGE-Reranker contains two models, large and base.

+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+
| Model | Language | Parameters | Model Size | Description |
+===============================================================================+=======================+============+==============+=======================================================================+
| `BAAI/bge-reranker-large <https://huggingface.co/BAAI/bge-reranker-large>`_ | English & Chinese | 560M | 2.24 GB | Larger reranker model, easy to deploy with better inference |
+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+
| `BAAI/bge-reranker-base <https://huggingface.co/BAAI/bge-reranker-base>`_ | English & Chinese | 278M | 1.11 GB | Lightweight reranker model, easy to deploy with fast inference |
+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+

bge-reranker-large and bge-reranker-base use `XLM-RoBERTa-Large <https://huggingface.co/FacebookAI/xlm-roberta-large>`_ and `XLM-RoBERTa-Base <https://huggingface.co/FacebookAI/xlm-roberta-base>`_, respectively, as the base model.
They were trained on high-quality English and Chinese data, and achieved state-of-the-art performance among models of the same size at the time of release.

Usage
-----


.. code:: python
from FlagEmbedding import FlagReranker
reranker = FlagReranker(
'BAAI/bge-reranker-base',
query_max_length=256,
use_fp16=True,
devices=['cuda:1'],
)
score = reranker.compute_score(['I am happy to help', 'Assisting you is my pleasure'])
print(score)
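The output is a raw relevance score (higher means more relevant), not bounded to 0-1; passing :code:`normalize=True` to :code:`compute_score` applies a sigmoid, as shown for bge-reranker-v2 below.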
82 changes: 82 additions & 0 deletions docs/source/bge/bge_reranker_v2.rst
@@ -0,0 +1,82 @@
BGE-Reranker-v2
===============

+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Language | Parameters | Model Size | Description |
+==================================================================================================================+=======================+=============+==============+=========================================================================================================================================================+
| `BAAI/bge-reranker-v2-m3 <https://huggingface.co/BAAI/bge-reranker-v2-m3>`_ | Multilingual | 568M | 2.27 GB | Lightweight reranker model, possesses strong multilingual capabilities, easy to deploy, with fast inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2-gemma <https://huggingface.co/BAAI/bge-reranker-v2-gemma>`_ | Multilingual | 2.51B | 10 GB | Suitable for multilingual contexts, performs well in both English proficiency and multilingual capabilities. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2-minicpm-layerwise <https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise>`_ | Multilingual | 2.72B | 10.9 GB | Suitable for multilingual contexts, allows freedom to select layers for output, facilitating accelerated inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2.5-gemma2-lightweight <https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight>`_ | Multilingual | 2.72B | 10.9 GB | Suitable for multilingual contexts, allows freedom to select layers, compress ratio and compress layers for output, facilitating accelerated inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+


.. tip:: Suggestions on model selection

You can select the model according to your scenario and resources:

- For multilingual tasks, use :code:`BAAI/bge-reranker-v2-m3`, :code:`BAAI/bge-reranker-v2-gemma`, or :code:`BAAI/bge-reranker-v2.5-gemma2-lightweight`.
- For Chinese or English, use :code:`BAAI/bge-reranker-v2-m3` or :code:`BAAI/bge-reranker-v2-minicpm-layerwise`.
- For efficiency, use :code:`BAAI/bge-reranker-v2-m3` or the lower layers of :code:`BAAI/bge-reranker-v2-minicpm-layerwise`.
- For better performance, we recommend :code:`BAAI/bge-reranker-v2-minicpm-layerwise` or :code:`BAAI/bge-reranker-v2-gemma`.

Always test on your real use case and choose the model with the best speed-quality balance!

Usage
-----

Use bge-reranker-v2-m3 in the same way as bge-reranker-base and bge-reranker-large.

.. code:: python
from FlagEmbedding import FlagReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True)
score = reranker.compute_score(['query', 'passage'])
# or set "normalize=True" to apply a sigmoid function and map the score to the 0-1 range
score = reranker.compute_score(['query', 'passage'], normalize=True)
print(score)
Use the :code:`FlagLLMReranker` class for bge-reranker-v2-gemma.

.. code:: python
from FlagEmbedding import FlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = FlagLLMReranker('BAAI/bge-reranker-v2-gemma', use_fp16=True)
score = reranker.compute_score(['query', 'passage'])
print(score)
Use the :code:`LayerWiseFlagLLMReranker` class for bge-reranker-v2-minicpm-layerwise.

.. code:: python
from FlagEmbedding import LayerWiseFlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', use_fp16=True)
# Adjust 'cutoff_layers' to pick which layers are used for computing the score.
score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28])
print(score)
Use the :code:`LightWeightFlagLLMReranker` class for bge-reranker-v2.5-gemma2-lightweight.

.. code:: python
from FlagEmbedding import LightWeightFlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = LightWeightFlagLLMReranker('BAAI/bge-reranker-v2.5-gemma2-lightweight', use_fp16=True)
# Adjust 'cutoff_layers' to pick which layers are used for computing the score.
score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28], compress_ratio=2, compress_layer=[24, 40])
print(score)
1 change: 1 addition & 0 deletions docs/source/bge/bge_v1_v1.5.rst
@@ -89,6 +89,7 @@ To use a single GPU:
model = FlagModel('BAAI/bge-base-en-v1.5', devices=0)
|
Useful Links:

`API <../API/inference/embedder/encoder_only/BaseEmbedder>`_
6 changes: 6 additions & 0 deletions docs/source/bge/index.rst
@@ -11,3 +11,9 @@ BGE
bge_m3
bge_icl

.. toctree::
:maxdepth: 1
:caption: Reranker

bge_reranker
bge_reranker_v2
