Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DAT-74] feat: Do not rebuild search index from scratch #2729

Merged
merged 4 commits into from
Feb 6, 2025

Conversation

paultranvan
Copy link
Contributor

Each time a searchEngine is instanciated, we used to rebuild the search
indexes from scratch, i.e. query the local database and compute the
indexes. This can be quite time consuming, especially in slow mobile
devices. Therefore, we now export the search indexes on disk, and import
them at init time. We save the lastSeq for each index, to be the
starting point of the next changes batch updates, fetched from the local
database after a remote replication.

Some measures, performed on a 8K files database:

Redmi Note 10

  • Query all io.cozy.files : 3700ms

  • Build files index : 6500ms

  • Total: 10200 ms

  • Import persisted index: 800 ms

  • Export index: 1057ms

OnePlus Nord 4

  • Query all io.cozy.files : 2000ms

  • Build files index : 2900ms

  • Total: 4900 ms

  • Import persisted index: 332 ms

  • Export index: 363ms

If we compare the query+build index time w.r.t the import time, we have
a ~13x time improvement before the index is ready to use.

@paultranvan paultranvan force-pushed the import-export branch 2 times, most recently from 96fa493 to a445274 Compare February 6, 2025 14:45
for (let attempt = 0; attempt < maxRetries; attempt++) {
const searchResults = await dataProxy.search(searchValue)

if (searchResults && Array.isArray(searchResults)) {
Copy link
Contributor

@zatteo zatteo Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the search is unsuccesful, what can it return ? null ? Or something else ?

I find it strange to need to check if is an array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the index are not initialized, null will be returned
If the index are initialized, and the search is working it will be an array

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So theoretically no need to check Array.isArray(searchResults) ? But it's a nit, no problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess we could just do if (searchResults) and avoid an additional check 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed within 344c30c

const searchWithRetry = async (
dataProxy,
searchValue,
{ maxRetries = 5, delay = 500 } = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500 is maybe an high delay for the first attempt ? But I don't know what time can this kind of thing take.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The very first attempt will be right away. It is actually the second attempt that will make 500ms. The performances measures in the PR description give some insights of the time to have initialized index

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try with this and we'll see.

@zatteo
Copy link
Contributor

zatteo commented Feb 6, 2025

Nice! I like the backoff.

if (searchIndex) {
this.searchIndexes[doctype] = searchIndex

if (this.isLocalSearch) {
Copy link
Member

@Ldoppea Ldoppea Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there seems to be no code branch for the remote search anymore, is that expected?

Also those are 2 consecutive if(this.isLocalSearch), should we merge theme?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely not expected, good catch!
Fixed within 0443ed6

Each time a searchEngine is instanciated, we used to rebuild the search
indexes from scratch, i.e. query the local database and compute the
indexes.  This can be quite time consuming, especially in slow mobile
devices. Therefore, we now export the search indexes on disk, and import
them at init time.  We save the lastSeq for each index, to be the
starting point of the next changes batch updates, fetched from the local
database after a remote replication.

Some measures, performed on a 8K files database:

**Redmi Note 10**

- Create index: without peristance - old way
  - Query all io.cozy.files : 3700ms
  - Build files index : 6500ms
  - **Total: 10200 ms**
- Create index: with perstance - new way
  - **Import persisted index: 800 ms**
  - Export index: 1057ms

**OnePlus Nord 4**

- Create index: without peristance - old way
  - Query all io.cozy.files : 2000ms
  - Build files index : 2900ms
  - **Total: 4900 ms**
- Create index: with perstance - new way
  - **Import persisted index: 332 ms**
  - Export index: 363ms

If we compare the query+build index time w.r.t the import time, we have
a ~13x time improvement before the index is ready to use.
@paultranvan paultranvan force-pushed the import-export branch 2 times, most recently from fc59d65 to 51abf16 Compare February 6, 2025 16:46
Sometimes the search might return nothing because the indexes are not
ready yet. In such case, the user will get no search result and will
have to retry.
To avoid this, we now wait for the index to be ready by making several
retries, with exponential backoff
@paultranvan paultranvan merged commit 5b00b12 into master Feb 6, 2025
2 checks passed
@paultranvan paultranvan deleted the import-export branch February 6, 2025 16:59
@paultranvan paultranvan changed the title feat: Do not rebuild search index from scratch [DAT-74] feat: Do not rebuild search index from scratch Feb 7, 2025
Ldoppea added a commit to cozy/cozy-flagship-app that referenced this pull request Feb 7, 2025
Since cozy/cozy-libs/pull/2729 the SearchEngine requires the consuming
app to provide a storage provider in order to persist its index

Related PR: cozy/cozy-libs#2729
Ldoppea added a commit to cozy/cozy-flagship-app that referenced this pull request Feb 7, 2025
Since cozy/cozy-libs/pull/2729 the SearchEngine requires the consuming
app to provide a storage provider in order to persist its index

Related PR: cozy/cozy-libs#2729
Ldoppea added a commit to cozy/cozy-flagship-app that referenced this pull request Feb 12, 2025
Since cozy/cozy-libs/pull/2729 the SearchEngine requires the consuming
app to provide a storage provider in order to persist its index

Related PR: cozy/cozy-libs#2729
Ldoppea added a commit to cozy/cozy-flagship-app that referenced this pull request Feb 12, 2025
Since cozy/cozy-libs/pull/2729 the SearchEngine requires the consuming
app to provide a storage provider in order to persist its index

Related PR: cozy/cozy-libs#2729
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants