Update docs/vectorizer/api-reference.md
Co-authored-by: James Guthrie <JamesGuthrie@users.noreply.github.com>
Signed-off-by: Jascha Beste <bestejascha@gmail.com>
Askir and JamesGuthrie authored Feb 28, 2025
1 parent 5001d04 commit bc0a634
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions docs/vectorizer/api-reference.md
@@ -256,9 +256,8 @@ The parsing functions are:

### ai.parsing_auto

-You use `ai.parsing_auto` to automatically parse the data with a fitting available parser.
-This special parser will decide on the fly which parser fits with each of the documents by guessing the file type.
-If the type can not be guessed, the document will not be processed and an error will be appended to the vectorizer errors.
+You use `ai.parsing_auto` to automatically select an appropriate parser based on detected file types.
+Documents with unrecognizable formats won't be processed and will generate an error (in the `ai.vectorizer_errors` table).

The parser selection works by examining file extensions and content types:
- For PDF files, images, Office documents (DOCX, XLSX, etc.): Uses docling