Implement searchv2 #3058

Open
roman-khimov opened this issue Dec 18, 2024 · 10 comments
Labels
feature Completely new functionality I1 High impact S1 Highly significant U2 Seriously planned

@roman-khimov
Member

Is your feature request related to a problem? Please describe.

I'm always frustrated when we don't have an implementation for nspcc-dev/neofs-api#314.

Describe the solution you'd like

The per-container DB should be structured like:

Given
DELIM = 0x00 // Not a valid UTF-8, can't be attribute name or value
KEY // Attribute key or special attribute that can be searched for
VALUE // Attribute value which is either a string or number, strings are used as is, numbers are converted to fixed-width (256?) BE
PREFIXA // And B/C/D, some byte prefixes
OID // Object ID

For each object the following keys are created (with no values):
PREFIXA_OID // OIDs only to list objects without filters
PREFIXB_KEY_DELIM_VALUE_DELIM_OID // The main search workhorse, created for all keys
PREFIXC_OID_KEY_DELIM_VALUE // Auxiliary for secondary filters and additional data returned
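
A minimal sketch of how the three key families above could be built for one object. The concrete prefix bytes, the `index_keys` name and the 32-byte placeholder OID are assumptions made for illustration only; numeric values are ignored here and every value is treated as a string.

```python
DELIM = b"\x00"
# Hypothetical prefix bytes: the proposal only says "some byte prefixes".
PREFIX_A, PREFIX_B, PREFIX_C = b"\x01", b"\x02", b"\x03"


def index_keys(oid: bytes, attrs: dict[str, str]) -> list[bytes]:
    """Return the value-less DB keys created for one object, sorted as a DB would."""
    keys = [PREFIX_A + oid]  # plain OID listing
    for name, value in attrs.items():
        k, v = name.encode(), value.encode()
        if DELIM in k or DELIM in v:
            raise ValueError("attribute must not contain the delimiter byte")
        keys.append(PREFIX_B + k + DELIM + v + DELIM + oid)  # attribute -> OID
        keys.append(PREFIX_C + oid + k + DELIM + v)          # OID -> attribute
    return sorted(keys)


oid = bytes(32)  # placeholder object ID, assumed to be 32 bytes
keys = index_keys(oid, {"FilePath": "/a.txt"})
```

Since the keys carry no values, everything needed for a search (attribute, value and OID) has to be recoverable from the key bytes alone, which is why the delimiter must never occur inside an attribute name or value.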

The mechanics are:

  • the node accepting the request forwards it to other nodes, as in the case of the original search
  • it accepts the answers, merges them and limits the result to the number requested (or max)
  • it generates the cursor corresponding to the result returned to the client (more on the cursor below)
  • the reply is ready and sent
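
The merge step on the accepting node could look roughly like the sketch below. It assumes every node replies with a list of `(value, oid)` tuples already sorted in the same order; the `merge_replies` name and the tuple shape are hypothetical, and the cursor is returned as the raw last item rather than in its encoded form.

```python
import heapq

MAX_RESULTS = 1000  # the "max" mentioned in the proposal


def merge_replies(replies, count):
    """Merge sorted per-node replies, drop duplicates and cut to the limit.

    Returns (items, cursor); cursor is None when nothing is left to fetch.
    """
    limit = min(count, MAX_RESULTS)
    merged = []
    last = None
    more = False
    for item in heapq.merge(*replies):
        if item == last:  # the same object reported by several nodes
            continue
        if len(merged) == limit:
            more = True  # at least one more unique item exists
            break
        merged.append(item)
        last = item
    return merged, (merged[-1] if more else None)


replies = [
    [(b"10", b"A"), (b"30", b"C")],
    [(b"10", b"A"), (b"20", b"B")],
]
page, cursor = merge_replies(replies, 2)
```

Because every input is sorted, duplicates coming from different nodes end up adjacent in the merged stream, so comparing with the previous item is enough to drop them.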

Each node does the following:

  • if there are no filters we're just looping over PREFIXA in DB, the cursor is OID as is
  • if there are filters, the first one is magic: its key/action/value is used to position the DB cursor at a PREFIXB key if the match is EQUAL/NOT_EQUAL/PREFIX/NUM* (don't forget about negative numbers)
  • other matches on the same key can be checked as well (key > N && key < M); for numerics this can cut the search short
  • iterating over the DB, we check all other matches via PREFIXC for as long as the first one holds (once it doesn't, we're done)
  • the search is limited by the number of elements requested or max (1000), so we return results early once we have enough elements
  • we add the requested fields of matching elements using PREFIXC
  • a cursor is returned if we're not yet at the end; it's the KEY_DELIM_VALUE_DELIM_OID of the last element encoded as base64 or base58, which makes continuation easy
  • if the first filter is NOT_PRESENT, we also loop over PREFIXA and check PREFIXC
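
The per-node steps can be sketched for the simplest case, a single PREFIX filter on a string attribute. A sorted Python list stands in for the ordered key-value store, `bisect` stands in for the DB cursor seek, and the prefix byte, the `search_prefix` name and the short OIDs are assumptions for illustration; secondary filters and extra attributes via PREFIXC are left out.

```python
import base64
from bisect import bisect_left, bisect_right

DELIM = b"\x00"
PREFIX_B = b"\x02"  # hypothetical prefix byte


def search_prefix(db_keys, attr, value_prefix, count, cursor=None):
    """Scan sorted PREFIXB keys for `attr` values starting with `value_prefix`.

    Returns (list of (value, oid), base64 cursor or None).
    """
    if count < 1:
        raise ValueError("count must be positive")
    start = PREFIX_B + attr + DELIM + value_prefix
    if cursor is None:
        pos = bisect_left(db_keys, start)  # seek to the first candidate
    else:
        # Resume right after the last key returned by the previous call.
        pos = bisect_right(db_keys, PREFIX_B + base64.b64decode(cursor))
    found, last_key = [], None
    while pos < len(db_keys) and db_keys[pos].startswith(start):
        if len(found) == count:
            # Enough elements and more remain: return a cursor for the last one.
            return found, base64.b64encode(last_key[len(PREFIX_B):])
        key = db_keys[pos]
        # maxsplit=2 keeps the OID intact even if it contains zero bytes.
        _, value, oid = key[len(PREFIX_B):].split(DELIM, 2)
        found.append((value, oid))
        last_key = key
        pos += 1
    return found, None  # the primary filter stopped matching, we're done


db = sorted(
    PREFIX_B + b"FilePath" + DELIM + v + DELIM + o
    for v, o in [(b"/a/1", b"o1"), (b"/a/2", b"o2"), (b"/b/1", b"o3")]
)
first, cur = search_prefix(db, b"FilePath", b"/a/", 1)
second, end = search_prefix(db, b"FilePath", b"/a/", 1, cur)
```

The cursor here is exactly KEY_DELIM_VALUE_DELIM_OID of the last returned element, so continuing is a single seek followed by the same loop.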

Describe alternatives you've considered

SQL, various other types of DBs. But the scheme above should be sufficient for our primary cases now.

Additional context

#2990, #2757, #2989, nspcc-dev/neofs-api#306

@roman-khimov roman-khimov added this to the v0.45.0 milestone Dec 18, 2024
@roman-khimov roman-khimov added U2 Seriously planned S1 Highly significant I1 High impact feature Completely new functionality labels Dec 18, 2024
@roman-khimov
Member Author

Caveat: creating a cursor from merged values can be non-trivial if the attribute is not included in the requested list. It can be degraded to a simple OID then (complicating continuation somewhat), and in general most use cases do need attribute values, but still.

@roman-khimov
Member Author

Caveat 2: numeric values might require an additional prefix anyway, since we can have Index=100500 in one object and Index=abcd in another; using the same prefix they'd be mixed and we can end up treating strings as numbers.

@cthulhu-rider
Contributor

note: per-container DB describes virtual structure, physically it is split within existing metabases

@roman-khimov
Member Author

Yes, we need to limit changes to this specific feature (expose API as early as possible) and deal with associated meta code (GC and alike) in future. Search is still possible with multiple DBs since results can be merged similar to the way results from different nodes are merged.

@cthulhu-rider
Contributor

cthulhu-rider commented Jan 10, 2025

Attribute value which is either a string or number, strings are used as is, numbers are converted to fixed-width (256?) BE

choice is obvious for system fields. For example, owner ID is a string while payload size is an integer

for user-defined attributes it is not so obvious. Like here #3058 (comment). In current protocol, there is no way to determine whether user attribute is numeric or not. So, I rly doubt storing them in various formats is legit. But we can resolve this on search query processing. In original search, any non-integer attribute mismatches any numeric query. Do we wanna change this behaviour for SearchV2 somehow?

@roman-khimov u also mentioned some special prefix, could u pls elaborate on this thought?

@roman-khimov
Member Author

You can only do this content-based, just like you do this now for old search. The only difference is that the choice is made when processing the object instead of when processing the search request.

Special prefix means splitting PREFIXB into B1 and B2 for numeric and string data.
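
A rough sketch of such a content-based choice made at object processing time. The prefix bytes and function names are hypothetical, and the rule for what counts as a number (an optional minus sign followed by ASCII digits) is an assumption, not something settled in this thread.

```python
PREFIX_B_NUM = b"\x02"  # hypothetical "B1": numeric values
PREFIX_B_STR = b"\x03"  # hypothetical "B2": string values


def parse_int(value: str):
    """Return the integer if `value` is a plain base-10 integer, else None."""
    body = value[1:] if value.startswith("-") else value
    if body.isascii() and body.isdigit():
        return int(value)
    return None


def choose_prefix(value: str) -> bytes:
    """Pick the index prefix from the attribute value's content."""
    return PREFIX_B_NUM if parse_int(value) is not None else PREFIX_B_STR
```

With this split, Index=100500 and Index=abcd land in different key ranges, so a numeric scan never has to interpret "abcd" as a number.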

@cthulhu-rider
Contributor

cthulhu-rider commented Jan 13, 2025

if there are no filters we're just looping over PREFIXA in DB, the cursor is OID as is

shouldnt cursor be OID + values of requested attributes to sort/continue in PREFIXC in this case?

UPD: seems like no, missed this requirement https://github.com/nspcc-dev/neofs-api/blob/9f1f12866a4742adb7778c51bd632cd240f81262/object/service.proto#L554-L555

@cthulhu-rider
Contributor

cthulhu-rider commented Feb 3, 2025

i'd like to clarify primary seek in proposed algo. Consider objects:

ID1 Height:10 Weight:20
ID2 Height:10 Weight:10
ID3 Height:10 Weight:30

where ID1 < ID2 < ID3

request: FILTER Height>0 Count:1 Attributes:{Weight}

first resp: ID2 Weight:10 cursor:Height_10_ID2

on 2nd request, we position to ID2 in PREFIXB bucket. Then the cursor will go to ID3 and skip ID1, which is wrong: next resp item should be ID1 Weight:20 cursor:Height_10_ID1

this example shows that primary Seek() and Next() can go wrong. Instead, we should iterate over all KEY_DELIM_VALUE_DELIM* items. Or am i missing smth?


one more nuance

if last resp was ID1 Weight:20 cursor:Height_10_ID1, then node should skip ID2 and respond with ID3. If node stores all objects, it can restore Weight attribute of ID1 from the DB to compare other items against it. But if node does not store ID1, it'll respond with ID2 although its Weight is less. For this purpose it would help to have a cursor with all requested attributes' values, not just the primary one
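
The example can be replayed with a small sketch. A sorted list of `(Height, OID)` tuples stands in for the PREFIXB bucket, and both function names are made up for illustration. The group scan is simplified: it looks only at items sharing the cursor's Height and takes the last item's Weight from local data, which is exactly what a node not storing that object could not do.

```python
from bisect import bisect_right

# (Height, OID) pairs in the order they would have in the PREFIXB bucket.
primary_index = sorted([(10, "ID1"), (10, "ID2"), (10, "ID3")])
weight = {"ID1": 20, "ID2": 10, "ID3": 30}


def next_by_seek(cursor):
    """Naive continuation: seek to the cursor and take the next key."""
    pos = bisect_right(primary_index, cursor)
    return primary_index[pos][1] if pos < len(primary_index) else None


def next_by_group_scan(cursor):
    """Scan the whole Height group, pick the next item in (Weight, OID) order."""
    height, last_oid = cursor
    last = (weight[last_oid], last_oid)
    candidates = [
        (weight[oid], oid)
        for h, oid in primary_index
        if h == height and (weight[oid], oid) > last
    ]
    return min(candidates)[1] if candidates else None


print(next_by_seek((10, "ID2")))        # ID3: ID1 is skipped
print(next_by_group_scan((10, "ID2")))  # ID1
print(next_by_group_scan((10, "ID1")))  # ID3: ID2 is correctly left behind
```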

@roman-khimov
Member Author

Correct. We have two options here:

  • iterate whole primary attribute prefix again
  • relax ordering requirements for secondary attributes

Our primary use cases for now:

  • OID enumeration, isn't affected
  • block index a < x < b search, isn't affected
  • REST FilePath=smth and we need a timestamp
  • S3 FilePath=smth AND Type=smth AND maybe more AND please give me a lot of additional attributes

Secondary attribute order does have some advantages for the REST/S3 cases. But to be fair, both would benefit a bit more from the reverse order: when we're talking about timestamps we usually need the latest one, and it's going to be the last. Implementing reverse result order is certainly not something we want now. We still need this to be simple and fast. Neither the REST nor the S3 case is very likely to produce a lot of results at the same time (they're very likely to fit into the 1000 limit). So I'd opt for relaxing ordering requirements to be "primary attribute only". Easier to implement, will work well enough for current users. If we find other use cases, we can think about (even more advanced) ordering again.

@cthulhu-rider
Contributor

So I'd opt for relaxing ordering requirements to be "primary attribute only". Easier to implement

full agree, lets start with this

cthulhu-rider added a commit that referenced this issue Feb 6, 2025
There is a need to serve `ObjectService.SearchV2` RPC by the SN. In
order not to expand the structure and configuration of the node, the
best place to store metadata is metabase.

Metabases are extended with per-container object metadata buckets. For
each object, following indexes are created:
 - OID;
 - attribute->OID;
 - OID->attribute.

Integers are stored in a special encoding so that they can be compared
lexicographically without decoding.

New `Search` method is provided: it allows filtering a container's
objects and receiving specified attributes. Count is also limited, op is
paged via cursor. In other words, the method follows SearchV2 behavior
within single metabase.

Refs #3058.
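
One possible way to get byte-wise comparable integers, shown only as a sketch: shift the signed value by a fixed bias and store it as fixed-width big-endian. The 32-byte width follows the "256?" guess from the issue description; the bias trick and the function names are assumptions and not necessarily what the commit does.

```python
WIDTH = 32  # bytes; assumes the 256-bit width floated in the discussion
BIAS = 1 << (8 * WIDTH - 1)


def encode_int(n: int) -> bytes:
    """Encode a signed integer so that byte order equals numeric order."""
    if not -BIAS <= n < BIAS:
        raise OverflowError("integer does not fit into the fixed width")
    return (n + BIAS).to_bytes(WIDTH, "big")


def decode_int(b: bytes) -> int:
    """Inverse of encode_int."""
    return int.from_bytes(b, "big") - BIAS


nums = [-100500, -1, 0, 1, 255, 256, 100500]
encoded = sorted(encode_int(n) for n in nums)
```

Adding the bias maps the most negative value to all zero bytes and the largest to all `0xff` bytes, so negative numbers sort before positive ones without any special casing during a range scan.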
cthulhu-rider added a commit that referenced this issue Feb 7, 2025
Future use-cases:
 - merge results from several shard's metabases;
 - merge results from several SNs.

Refs #3058.

Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
cthulhu-rider added a commit that referenced this issue Feb 7, 2025
WIP

Refs #3058.

Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
cthulhu-rider added a commit that referenced this issue Feb 11, 2025
Refs #3058.

Signed-off-by: Leonard Lyubich <leonard@morphbits.io>