Reuse embeddings
Closes #663
edeandrea committed Jun 12, 2024
1 parent 3c9bf71 commit 82f6d29
Showing 17 changed files with 570 additions and 14 deletions.
28 changes: 27 additions & 1 deletion docs/modules/ROOT/pages/easy-rag.adoc
@@ -41,6 +41,26 @@ NOTE: If you add two or more artifacts that provide embedding models,
Quarkus will ask you to choose one of them using the
`quarkus.langchain4j.embedding-model.provider` property.

== Reusing embeddings

During local development, computing embeddings can take some time. This is most noticeable in dev mode, where each restart may take several minutes.

That's where reusing embeddings comes in! If you don't configure a persistent embedding store and set

[source,properties,subs=attributes+]
----
quarkus.langchain4j.easy-rag.reuse-embeddings.enabled=true
----

then the `easy-rag` extension looks for a local embeddings file. If the file exists, its embeddings are loaded into the embedding store. If it doesn't exist, the embeddings are computed as normal and then written to the file `easy-rag-embeddings.json` in the current directory.

NOTE: You can change the location of the embeddings file by setting the `quarkus.langchain4j.easy-rag.reuse-embeddings.file` property.

The following diagram describes the flow:

.Reusable ingestion flow
image::easy-rag-reuse-embeddings.png[align="center"]
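
The load-or-compute flow described above can be sketched in plain Java. This is only an illustration of the idea, not the extension's implementation: the class `ReuseEmbeddingsSketch`, the method `loadOrCompute`, and the use of plain text lines in place of real embeddings are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: strings stand in for embeddings, and the file format is plain lines.
public class ReuseEmbeddingsSketch {

    /**
     * Returns the contents of the file if it exists; otherwise runs the
     * (expensive) computation, writes the result to the file, and returns it.
     */
    public static List<String> loadOrCompute(Path file, Supplier<List<String>> compute) {
        try {
            if (Files.exists(file)) {
                // Reuse path: skip the computation entirely.
                return Files.readAllLines(file);
            }
            // First run: compute as normal, then persist for the next restart.
            List<String> computed = compute.get();
            Files.write(file, computed);
            return computed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Path.of(System.getProperty("java.io.tmpdir"),
                "easy-rag-sketch-" + System.nanoTime() + ".json");
        file.toFile().deleteOnExit();

        int[] computeCalls = {0};
        Supplier<List<String>> expensive = () -> {
            computeCalls[0]++;
            return List.of("[0.1, 0.2]", "[0.3, 0.4]");
        };

        loadOrCompute(file, expensive); // file is missing: computes and writes
        loadOrCompute(file, expensive); // file exists: reads it back
        System.out.println("compute calls: " + computeCalls[0]); // prints "compute calls: 1"
    }
}
```

The second call never invokes the supplier, which is the saving the feature aims for between dev-mode restarts.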

== Getting started with a ready-to-use example

To see Easy RAG in action, use the project `samples/chatbot-easy-rag` in the
@@ -83,4 +103,10 @@ meaning all files recursively.
For finer-grained control of the Apache Tika parsers (for example, to turn
off OCR capabilities), you can use a regular XML config file recognized by
Tika (see https://tika.apache.org/2.9.2/configuring.html[Tika
documentation]), and specify `-Dtika.config` to point at the file.
documentation]), and specify `-Dtika.config` to point at the file.

== Configuration

Several configuration properties are available:

include::includes/quarkus-langchain4j-easy-rag.adoc[leveloffset=+1,opts=optional]
168 changes: 168 additions & 0 deletions docs/modules/ROOT/pages/includes/quarkus-langchain4j-easy-rag.adoc
@@ -0,0 +1,168 @@

:summaryTableId: quarkus-langchain4j-easy-rag
[.configuration-legend]
icon:lock[title=Fixed at build time] Configuration property fixed at build time - All other configuration properties are overridable at runtime
[.configuration-reference.searchable, cols="80,.^10,.^10"]
|===

h|[[quarkus-langchain4j-easy-rag_configuration]]link:#quarkus-langchain4j-easy-rag_configuration[Configuration property]

h|Type
h|Default

a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-path]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-path[quarkus.langchain4j.easy-rag.path]`


[.description]
--
Path to the directory containing the documents to be ingested.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_PATH+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_PATH+++`
endif::add-copy-button-to-env-var[]
--|string
|required icon:exclamation-circle[title=Configuration property is required]


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-path-matcher]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-path-matcher[quarkus.langchain4j.easy-rag.path-matcher]`


[.description]
--
Matcher used for filtering which files from the directory should be ingested. This uses the `java.nio.file.FileSystem` path matcher syntax. Example: `glob:++**++.txt` to recursively match all files with the `.txt` extension. The default is `glob:++**++`, recursively matching all files.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_PATH_MATCHER+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_PATH_MATCHER+++`
endif::add-copy-button-to-env-var[]
--|string
|`glob:**`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-recursive]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-recursive[quarkus.langchain4j.easy-rag.recursive]`


[.description]
--
Whether to recursively ingest documents from subdirectories.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_RECURSIVE+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_RECURSIVE+++`
endif::add-copy-button-to-env-var[]
--|boolean
|`true`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-segment-size]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-segment-size[quarkus.langchain4j.easy-rag.max-segment-size]`


[.description]
--
Maximum segment size when splitting documents, in tokens.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_SEGMENT_SIZE+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_SEGMENT_SIZE+++`
endif::add-copy-button-to-env-var[]
--|int
|`300`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-overlap-size]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-overlap-size[quarkus.langchain4j.easy-rag.max-overlap-size]`


[.description]
--
Maximum overlap (in tokens) when splitting documents.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_OVERLAP_SIZE+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_OVERLAP_SIZE+++`
endif::add-copy-button-to-env-var[]
--|int
|`30`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-results]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-max-results[quarkus.langchain4j.easy-rag.max-results]`


[.description]
--
Maximum number of results to return when querying the retrieval augmentor.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_RESULTS+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_RESULTS+++`
endif::add-copy-button-to-env-var[]
--|int
|`5`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-ingestion-strategy]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-ingestion-strategy[quarkus.langchain4j.easy-rag.ingestion-strategy]`


[.description]
--
The strategy to decide whether document ingestion into the store should happen at startup or not. The default is ON. Changing to OFF generally only makes sense if running against a persistent embedding store that was already populated.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_INGESTION_STRATEGY+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_INGESTION_STRATEGY+++`
endif::add-copy-button-to-env-var[]
-- a|
`on`, `off`
|`on`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-reuse-embeddings-enabled]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-reuse-embeddings-enabled[quarkus.langchain4j.easy-rag.reuse-embeddings.enabled]`


[.description]
--
Whether or not to reuse embeddings. Defaults to `false`.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_ENABLED+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_ENABLED+++`
endif::add-copy-button-to-env-var[]
--|boolean
|`false`


a| [[quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-reuse-embeddings-file]]`link:#quarkus-langchain4j-easy-rag_quarkus-langchain4j-easy-rag-reuse-embeddings-file[quarkus.langchain4j.easy-rag.reuse-embeddings.file]`


[.description]
--
The file path to load/save embeddings, assuming `quarkus.langchain4j.easy-rag.reuse-embeddings.enabled == true`.

Defaults to `easy-rag-embeddings.json` in the current directory.

ifdef::add-copy-button-to-env-var[]
Environment variable: env_var_with_copy_button:+++QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_FILE+++[]
endif::add-copy-button-to-env-var[]
ifndef::add-copy-button-to-env-var[]
Environment variable: `+++QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_FILE+++`
endif::add-copy-button-to-env-var[]
--|string
|`easy-rag-embeddings.json`

|===
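
The `path-matcher` property above uses the standard `java.nio.file.FileSystem` glob syntax, in which `*` stays within one directory level and `**` crosses directory boundaries. The following sketch shows the difference using only the JDK. The class `PathMatcherSketch` and the sample paths are hypothetical, and matching against relative paths is an assumption made for illustration, not a statement about how the extension applies the matcher.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

// Hypothetical helper for experimenting with glob patterns.
public class PathMatcherSketch {

    /** Returns true if the given path matches the pattern, e.g. "glob:**.txt". */
    public static boolean matches(String syntaxAndPattern, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        return matcher.matches(Path.of(relativePath));
    }

    public static void main(String[] args) {
        // "**" crosses directory boundaries, so nested files match.
        System.out.println(matches("glob:**.txt", "notes.txt"));          // true
        System.out.println(matches("glob:**.txt", "docs/sub/notes.txt")); // true

        // "*" does not cross directory boundaries.
        System.out.println(matches("glob:*.pdf", "report.pdf"));          // true
        System.out.println(matches("glob:*.pdf", "docs/report.pdf"));     // false
    }
}
```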
19 changes: 19 additions & 0 deletions docs/pom.xml
@@ -76,6 +76,11 @@
<artifactId>quarkus-langchain4j-watsonx</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkiverse.langchain4j</groupId>
<artifactId>quarkus-langchain4j-easy-rag</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkiverse.antora</groupId>
<artifactId>quarkus-antora</artifactId>
@@ -100,6 +105,19 @@
<artifactId>quarkus-langchain4j-anthropic-deployment</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkiverse.langchain4j</groupId>
<artifactId>quarkus-langchain4j-easy-rag-deployment</artifactId>
<version>${project.version}</version>
<type>pom</type>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.quarkiverse.langchain4j</groupId>
<artifactId>quarkus-langchain4j-openai-deployment</artifactId>
@@ -310,6 +328,7 @@
<directory>${project.basedir}/../target/asciidoc/generated/config/</directory>
<include>quarkus-langchain4j.adoc</include>
<include>quarkus-langchain4j-anthropic.adoc</include>
<include>quarkus-langchain4j-easy-rag.adoc</include>
<include>quarkus-langchain4j-openai.adoc</include>
<include>quarkus-langchain4j-huggingface.adoc</include>
<include>quarkus-langchain4j-ollama.adoc</include>
2 changes: 2 additions & 0 deletions docs/src/main/resources/application.properties
@@ -10,6 +10,8 @@ quarkus.langchain4j.pinecone.index-name=abc
quarkus.langchain4j.pinecone.project-id=abc
quarkus.langchain4j.pinecone.api-key=abc
quarkus.langchain4j.redis.dimension=180
quarkus.langchain4j.easy-rag.path=abc
quarkus.langchain4j.easy-rag.ingestion-strategy=off
quarkus.redis.hosts=redis://localhost:6379


@@ -55,6 +55,7 @@ public void createInMemoryEmbeddingStoreIfNoOtherExists(
BuildProducer<SyntheticBeanBuildItem> beanProducer,
List<EmbeddingStoreBuildItem> embeddingStores,
EasyRagRecorder recorder,
EasyRagConfig config,
BuildProducer<InMemoryEmbeddingStoreBuildItem> inMemoryEmbeddingStoreBuildItemBuildProducer) {
if (embeddingStores.isEmpty()) {
beanProducer.produce(SyntheticBeanBuildItem
@@ -68,7 +69,7 @@ public void createInMemoryEmbeddingStoreIfNoOtherExists(
.defaultBean()
.unremovable()
.scope(ApplicationScoped.class)
.supplier(recorder.inMemoryEmbeddingStoreSupplier())
.supplier(recorder.inMemoryEmbeddingStoreSupplier(config))
.done());
inMemoryEmbeddingStoreBuildItemBuildProducer.produce(new InMemoryEmbeddingStoreBuildItem());
}
@@ -1,9 +1,11 @@
package io.quarkiverse.langchain4j.test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import java.util.logging.LogRecord;

import jakarta.inject.Inject;

@@ -29,13 +31,24 @@ public class EasyRagNotRecursiveTest {
.setArchiveProducer(() -> ShrinkWrap.create(JavaArchive.class)
.addAsResource(new StringAsset("quarkus.langchain4j.easy-rag.path=src/test/resources/ragdocuments\n" +
"quarkus.langchain4j.easy-rag.recursive=false\n"),
"application.properties"));
"application.properties"))
.setLogRecordPredicate(record -> true)
.assertLogRecords(EasyRagNotRecursiveTest::verifyLogRecords);

@Inject
InMemoryEmbeddingStore<TextSegment> embeddingStore;

Embedding DUMMY_EMBEDDING = new Embedding(new float[384]);

private static void verifyLogRecords(List<LogRecord> logRecords) {
assertThat(logRecords.stream().map(LogRecord::getMessage))
.contains(
"Ingesting documents from path: src/test/resources/ragdocuments, path matcher = glob:**, recursive = false")
.contains("Ingested 1 files as 1 documents")
.doesNotContain("Writing embeddings to %s")
.doesNotContain("Reading embeddings from %s");
}

@Test
public void verifyOnlyTheRootDirectoryIsIngested() {
List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.findRelevant(DUMMY_EMBEDDING, 3);
@@ -1,8 +1,10 @@
package io.quarkiverse.langchain4j.test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.junit.jupiter.api.Assertions.*;

import java.util.List;
import java.util.logging.LogRecord;

import jakarta.inject.Inject;

@@ -28,13 +30,24 @@ public class EasyRagPathMatcherTest {
.setArchiveProducer(() -> ShrinkWrap.create(JavaArchive.class)
.addAsResource(new StringAsset("quarkus.langchain4j.easy-rag.path=src/test/resources/ragdocuments\n" +
"quarkus.langchain4j.easy-rag.path-matcher=glob:*.pdf\n"),
"application.properties"));
"application.properties"))
.setLogRecordPredicate(record -> true)
.assertLogRecords(EasyRagPathMatcherTest::verifyLogRecords);

@Inject
InMemoryEmbeddingStore<TextSegment> embeddingStore;

Embedding DUMMY_EMBEDDING = new Embedding(new float[384]);

private static void verifyLogRecords(List<LogRecord> logRecords) {
assertThat(logRecords.stream().map(LogRecord::getMessage))
.contains(
"Ingesting documents from path: src/test/resources/ragdocuments, path matcher = glob:*.pdf, recursive = true")
.contains("Ingested 1 files as 1 documents")
.doesNotContain("Writing embeddings to %s")
.doesNotContain("Reading embeddings from %s");
}

@Test
public void verifyPathMatchingOnlyPdf() {
List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.findRelevant(DUMMY_EMBEDDING, 3);