update docs
ZiyiXia committed Dec 18, 2024
1 parent bc09ef5 commit aeafc7e
Showing 13 changed files with 187 additions and 21 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -53,6 +53,7 @@ BGE (BAAI General Embedding) focuses on retrieval-augmented LLMs, consisting of

## News

- 05/12/2024: :book: We built the BGE documentation to centralize BGE information and materials.
- 10/29/2024: :earth_asia: We created a WeChat group for BGE. Scan the [QR code](./imgs/BGE_WeChat_Group.png) to join the group chat! To get first-hand news about our updates and new releases, or to share any questions or ideas, join us now!
- <img src="./imgs/BGE_WeChat_Group.png" alt="bge_wechat_group" class="center" width="200">

@@ -109,16 +110,16 @@ Clone the repository and install
```
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
# pip install .[finetune]
```
For development in editable mode:
```
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install -e .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
# pip install -e .[finetune]
```

2 changes: 1 addition & 1 deletion docs/source/API/evaluation/beir/data_loader.rst
@@ -1,4 +1,4 @@
data loader
===========

.. autoclass:: FlagEmbedding.abc.evaluation.BEIREvalDataLoader
.. autoclass:: FlagEmbedding.evaluation.beir.BEIREvalDataLoader
23 changes: 21 additions & 2 deletions docs/source/FAQ/index.rst
@@ -5,39 +5,58 @@ Below are some commonly asked questions.

.. tip::

For more questions, search issues on GitHub or join our community!
For more questions, search the issues on GitHub or join our community!

.. dropdown:: Having network issues when connecting to Hugging Face?
:animate: fade-in-slide-down

Try setting :code:`HF_ENDPOINT` to the `HF mirror <https://hf-mirror.com/>`_ instead.

.. code:: bash
export HF_ENDPOINT=https://hf-mirror.com
.. dropdown:: When does the query instruction need to be used?
:animate: fade-in-slide-down

For a retrieval task that uses short queries to find long related documents, it is recommended to add instructions to these short queries.
The best way to decide whether to add instructions to queries is to choose whichever setting achieves better performance on your task.
In all cases, the documents/passages do not need an instruction.
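A minimal sketch of applying the instruction (assuming the :code:`FlagModel` API shown in the quick start): :code:`encode_queries` prepends the instruction to each query, while :code:`encode_corpus` leaves the passages untouched.

.. code:: python
from FlagEmbedding import FlagModel
# The instruction is prepended to queries only, never to passages.
model = FlagModel(
    'BAAI/bge-base-en-v1.5',
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
)
q_embs = model.encode_queries(["what is a reranker?"])  # instruction applied
p_embs = model.encode_corpus(["A reranker scores query-passage pairs."])  # no instruction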

.. dropdown:: Why does it take quite long to encode just 1 sentence?
:animate: fade-in-slide-down

Note that if you have multiple CUDA GPUs, FlagEmbedding will automatically use all of them.
Starting the multi-process pool then takes far longer than the actual encoding.
For simple tasks, use the CPU or a single GPU, as in the sketch below.
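A sketch restricting encoding to one device (the :code:`devices` argument is shown in the BGE v1 & v1.5 usage section below):

.. code:: python
from FlagEmbedding import FlagModel
# A single GPU (devices=0) avoids the multi-process startup cost;
# devices='cpu' for CPU-only runs (assumed from the same devices API).
model = FlagModel('BAAI/bge-base-en-v1.5', devices=0)
emb = model.encode(["one short sentence"])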

.. dropdown:: The embedding results are different for CPU and GPU?
:animate: fade-in-slide-down

The encode function uses FP16 by default when a GPU is available, which leads to a different precision.
Set :code:`use_fp16=False` to get full precision.
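For example (a sketch using the :code:`use_fp16` flag that appears throughout these docs):

.. code:: python
from FlagEmbedding import FlagModel
# Full precision on GPU, matching CPU results much more closely.
model = FlagModel('BAAI/bge-base-en-v1.5', use_fp16=False)
emb = model.encode(["a test sentence"])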

.. dropdown:: How many languages do the multi-lingual models support?
:animate: fade-in-slide-down

The training datasets cover 170+ languages.
However, due to the unbalanced distribution of languages, performance will vary across them.
Please test further on your real application scenario.

.. dropdown:: How do the different retrieval methods work in bge-m3?
:animate: fade-in-slide-down

- Dense retrieval: map the text into a single embedding, e.g., `DPR <https://arxiv.org/abs/2004.04906>`_, `BGE-v1.5 <../bge/bge_v1_v1.5>`_
- Sparse retrieval (lexical matching): a vector of size equal to the vocabulary, with most positions set to zero and a weight computed only for tokens present in the text, e.g., BM25, `unicoil <https://arxiv.org/pdf/2106.14807>`_, and `splade <https://arxiv.org/abs/2107.05720>`_
- Multi-vector retrieval: use multiple vectors to represent a text, e.g., `ColBERT <https://arxiv.org/abs/2004.12832>`_. A sketch of all three follows below.
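A minimal sketch of producing all three representations with bge-m3 (assuming the :code:`BGEM3FlagModel` API from the BGE-M3 page):

.. code:: python
from FlagEmbedding import BGEM3FlagModel
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)
out = model.encode(
    ["BGE-M3 supports dense, sparse, and multi-vector retrieval."],
    return_dense=True,         # one embedding per text
    return_sparse=True,        # per-token lexical weights
    return_colbert_vecs=True,  # multiple vectors per text
)
print(out['dense_vecs'].shape)       # dense embedding
print(out['lexical_weights'][0])     # sparse token weights
print(out['colbert_vecs'][0].shape)  # multi-vector representation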

.. dropdown:: Recommended vector database?
:animate: fade-in-slide-down

Generally you can use any vector database (open-source or commercial). We use `Faiss <https://github.com/facebookresearch/faiss>`_ by default in our evaluation pipeline and tutorials.
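A minimal Faiss sketch (an illustration of typical usage, not our exact evaluation pipeline); with L2-normalized embeddings, inner product equals cosine similarity:

.. code:: python
import faiss
import numpy as np
# Stand-in embeddings; in practice these come from a BGE model.
corpus_embs = np.random.randn(1000, 768).astype('float32')
corpus_embs /= np.linalg.norm(corpus_embs, axis=1, keepdims=True)
query_embs = corpus_embs[:2].copy()
index = faiss.IndexFlatIP(corpus_embs.shape[1])  # inner-product index
index.add(corpus_embs)
scores, ids = index.search(query_embs, k=5)  # top-5 passages per query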

.. dropdown:: Not enough VRAM or an OOM error during evaluation?
:animate: fade-in-slide-down

The default values of :code:`embedder_batch_size` and :code:`reranker_batch_size` are both 3000. Try a smaller value.
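Smaller batches can also be set per call; a sketch (assuming :code:`encode` and :code:`compute_score` accept a :code:`batch_size` argument, as in the inference APIs):

.. code:: python
# Hypothetical values: smaller batches trade throughput for lower peak VRAM.
embeddings = model.encode(sentences, batch_size=256)
scores = reranker.compute_score(pairs, batch_size=64)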
12 changes: 10 additions & 2 deletions docs/source/Introduction/index.rst
@@ -7,14 +7,22 @@ BGE builds one-stop retrieval toolkit for search and RAG. We provide inference,
:width: 700
:align: center

BGE embedder and reranker in an RAG pipelin. `Source <https://safjan.com/images/retrieval_augmented_generation/RAG.png>`_
BGE embedder and reranker in an RAG pipeline. `Source <https://safjan.com/images/retrieval_augmented_generation/RAG.png>`_

Quickly get started with:

.. toctree::
:maxdepth: 1
:caption: Start

overview
installation
quick_start
concept


.. toctree::
:maxdepth: 1
:caption: Concept

model
retrieval_demo
10 changes: 5 additions & 5 deletions docs/source/Introduction/installation.rst
@@ -6,7 +6,7 @@ Installation
Using pip:
----------

If you do not want to finetune the models, you can install the package without the finetune dependency:
If you do not need to finetune the models, you can install the package without the finetune dependency:

.. code:: bash
@@ -28,18 +28,18 @@ Clone the repository and install
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
pip install .[finetune]
For development in editable mode:

.. code:: bash
# If you do not want to finetune the models, you can install the package without the finetune dependency:
# If you do not need to finetune the models, you can install the package without the finetune dependency:
pip install -e .
# If you want to finetune the models, you can install the package with the finetune dependency:
# If you want to finetune the models, install the package with the finetune dependency:
pip install -e .[finetune]
PyTorch-CUDA
@@ -1,10 +1,12 @@
Concept
=======
Model
=====

If you are already familiar with the concepts, take a look at the :doc:`BGE models <../bge/index>`!

Embedder
--------

Embedder, or embedding model, is a model designed to convert data, usually text, codes, or images, into sparse or dense numerical vectors (embeddings) in a high dimensional vector space.
An embedder, also known as an embedding model or bi-encoder, is a model designed to convert data, usually text, code, or images, into sparse or dense numerical vectors (embeddings) in a high-dimensional vector space.
These embeddings capture the semantic meaning or key features of the input, enabling efficient comparison and analysis.

A very famous demonstration is the example from `word2vec <https://arxiv.org/abs/1301.3781>`_. It shows how word embeddings capture semantic relationships through vector arithmetic:
13 changes: 13 additions & 0 deletions docs/source/Introduction/overview.rst
@@ -0,0 +1,13 @@
Overview
========

Our repository provides well-structured `APIs <https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding>`_ for the inference, evaluation, and fine-tuning of BGE series models.
In addition, there are abundant `tutorials <https://github.com/FlagOpen/FlagEmbedding/tree/master/Tutorials>`_ and `examples <https://github.com/FlagOpen/FlagEmbedding/tree/master/examples>`_ to help users quickly gain hands-on experience.

.. figure:: https://raw.githubusercontent.com/FlagOpen/FlagEmbedding/refs/heads/master/imgs/projects.png
:width: 700
:align: center

Structure of contents in our `repo <https://github.com/FlagOpen/FlagEmbedding>`_

5 changes: 2 additions & 3 deletions docs/source/Introduction/quick_start.rst
@@ -7,9 +7,7 @@ First, load one of the BGE embedding models:
from FlagEmbedding import FlagAutoModel
model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5',
query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
use_fp16=True)
model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5')
.. tip::

@@ -22,6 +20,7 @@ First, load one of the BGE embedding models:
Then, feed some sentences to the model and get their embeddings:

.. code:: python
sentences_1 = ["I love NLP", "I love machine learning"]
sentences_2 = ["I love BGE", "I love text retrieval"]
embeddings_1 = model.encode(sentences_1)
2 changes: 1 addition & 1 deletion docs/source/bge/bge_m3.rst
@@ -119,5 +119,5 @@ Usage
Useful Links:

`API <../API/inference/embedder/encoder_only/M3Embedder>`_
`Tutorial <>`_
`Tutorial <https://github.com/FlagOpen/FlagEmbedding/blob/master/Tutorials/1_Embedding/1.2.4_BGE-M3.ipynb>`_
`Example <https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/embedder/encoder_only>`_
35 changes: 35 additions & 0 deletions docs/source/bge/bge_reranker.rst
@@ -1,2 +1,37 @@
BGE-Reranker
============

Different from an embedding model, a reranker (or cross-encoder) takes a question and a document as input and directly outputs a similarity score instead of an embedding.
To balance accuracy and time cost, cross-encoders are widely used to re-rank the top-k documents retrieved by simpler models.
For example, use a BGE embedding model to first retrieve the top 100 relevant documents, then use a BGE reranker to re-rank them and get the final top-3 results.
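As an illustration, a retrieve-then-rerank sketch (assuming the :code:`FlagModel` and :code:`FlagReranker` APIs shown in these docs; the corpus and candidate counts are made up):

.. code:: python
from FlagEmbedding import FlagModel, FlagReranker
corpus = ["Doc about rerankers.", "Doc about embeddings.", "Unrelated doc."]
query = "what does a reranker do?"
# Stage 1: fast dense retrieval over the whole corpus.
embedder = FlagModel('BAAI/bge-base-en-v1.5')
q_emb = embedder.encode_queries([query])
d_embs = embedder.encode_corpus(corpus)
top_ids = (q_emb @ d_embs.T)[0].argsort()[::-1][:2]  # keep top-2 candidates
# Stage 2: slower but more accurate cross-encoder scoring of the candidates.
reranker = FlagReranker('BAAI/bge-reranker-base')
scores = reranker.compute_score([[query, corpus[i]] for i in top_ids])
ranked = sorted(zip(scores, top_ids), reverse=True)
print(corpus[ranked[0][1]])  # best passage after reranking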

The first series of BGE-Reranker contains two models, large and base.

+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+
| Model | Language | Parameters | Model Size | Description |
+===============================================================================+=======================+============+==============+=======================================================================+
| `BAAI/bge-reranker-large <https://huggingface.co/BAAI/bge-reranker-large>`_ | English & Chinese | 560M | 2.24 GB | Larger reranker model, easy to deploy with better inference |
+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+
| `BAAI/bge-reranker-base <https://huggingface.co/BAAI/bge-reranker-base>`_ | English & Chinese | 278M | 1.11 GB | Lightweight reranker model, easy to deploy with fast inference |
+-------------------------------------------------------------------------------+-----------------------+------------+--------------+-----------------------------------------------------------------------+

bge-reranker-large and bge-reranker-base use `XLM-RoBERTa-Large <https://huggingface.co/FacebookAI/xlm-roberta-large>`_ and `XLM-RoBERTa-Base <https://huggingface.co/FacebookAI/xlm-roberta-base>`_, respectively, as the base model.
They were trained on high-quality English and Chinese data, and achieved state-of-the-art performance among models of the same size at the time of release.

Usage
-----


.. code:: python
from FlagEmbedding import FlagReranker
reranker = FlagReranker(
'BAAI/bge-reranker-base',
query_max_length=256,
use_fp16=True,
devices=['cuda:1'],
)
score = reranker.compute_score(['I am happy to help', 'Assisting you is my pleasure'])
print(score)
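The output is a raw relevance score (higher means more relevant), not bounded to 0-1; passing :code:`normalize=True` to :code:`compute_score` applies a sigmoid, as shown for bge-reranker-v2 below.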
82 changes: 82 additions & 0 deletions docs/source/bge/bge_reranker_v2.rst
@@ -0,0 +1,82 @@
BGE-Reranker-v2
===============

+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Language | Parameters | Model Size | Description |
+==================================================================================================================+=======================+=============+==============+=========================================================================================================================================================+
| `BAAI/bge-reranker-v2-m3 <https://huggingface.co/BAAI/bge-reranker-v2-m3>`_ | Multilingual | 568M | 2.27 GB | Lightweight reranker model, possesses strong multilingual capabilities, easy to deploy, with fast inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2-gemma <https://huggingface.co/BAAI/bge-reranker-v2-gemma>`_ | Multilingual | 2.51B | 10 GB | Suitable for multilingual contexts, performs well in both English proficiency and multilingual capabilities. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2-minicpm-layerwise <https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise>`_ | Multilingual | 2.72B | 10.9 GB | Suitable for multilingual contexts, allows freedom to select layers for output, facilitating accelerated inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| `BAAI/bge-reranker-v2.5-gemma2-lightweight <https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight>`_ | Multilingual | 2.72B | 10.9 GB | Suitable for multilingual contexts, allows freedom to select layers, compress ratio and compress layers for output, facilitating accelerated inference. |
+------------------------------------------------------------------------------------------------------------------+-----------------------+-------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+


.. tip:: Suggestions on model selection

You can select the model according to your scenario and resources:

- For multilingual tasks, use :code:`BAAI/bge-reranker-v2-m3`, :code:`BAAI/bge-reranker-v2-gemma`, or :code:`BAAI/bge-reranker-v2.5-gemma2-lightweight`.
- For Chinese or English, use :code:`BAAI/bge-reranker-v2-m3` or :code:`BAAI/bge-reranker-v2-minicpm-layerwise`.
- For efficiency, use :code:`BAAI/bge-reranker-v2-m3` or the lower layers of :code:`BAAI/bge-reranker-v2-minicpm-layerwise`.
- For better performance, we recommend :code:`BAAI/bge-reranker-v2-minicpm-layerwise` or :code:`BAAI/bge-reranker-v2-gemma`.

Always test on your real use case and choose the model with the best speed-quality balance!

Usage
-----

Use bge-reranker-v2-m3 in the same way as bge-reranker-base and bge-reranker-large.

.. code:: python
from FlagEmbedding import FlagReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True)
score = reranker.compute_score(['query', 'passage'])
# or set "normalize=True" to apply a sigmoid function and map the score to the 0-1 range
score = reranker.compute_score(['query', 'passage'], normalize=True)
print(score)
Use the :code:`FlagLLMReranker` class for bge-reranker-v2-gemma.

.. code:: python
from FlagEmbedding import FlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = FlagLLMReranker('BAAI/bge-reranker-v2-gemma', use_fp16=True)
score = reranker.compute_score(['query', 'passage'])
print(score)
Use the :code:`LayerWiseFlagLLMReranker` class for bge-reranker-v2-minicpm-layerwise.

.. code:: python
from FlagEmbedding import LayerWiseFlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', use_fp16=True)
# Adjust 'cutoff_layers' to pick which layers are used for computing the score.
score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28])
print(score)
Use the :code:`LightWeightFlagLLMReranker` class for bge-reranker-v2.5-gemma2-lightweight.

.. code:: python
from FlagEmbedding import LightWeightFlagLLMReranker
# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = LightWeightFlagLLMReranker('BAAI/bge-reranker-v2.5-gemma2-lightweight', use_fp16=True)
# Adjust 'cutoff_layers' to pick which layers are used for computing the score.
score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28], compress_ratio=2, compress_layer=[24, 40])
print(score)
1 change: 1 addition & 0 deletions docs/source/bge/bge_v1_v1.5.rst
@@ -89,6 +89,7 @@ To use a single GPU:
model = FlagModel('BAAI/bge-base-en-v1.5', devices=0)
|
Useful Links:

`API <../API/inference/embedder/encoder_only/BaseEmbedder>`_
6 changes: 6 additions & 0 deletions docs/source/bge/index.rst
@@ -11,3 +11,9 @@ BGE
bge_m3
bge_icl

.. toctree::
:maxdepth: 1
:caption: Reranker

bge_reranker
bge_reranker_v2
