Replies: 4 comments
-
Hello @doncat99! Yes, sounds great! There are two ways to do this. The first is to write your own DocumentLoader implementation that simply returns this content. The only catch is that the returned structure has to be defined in a specific way; returning the raw string as-is will not work. The second is to pass the raw content to the extract call itself, so that it enriches the extraction. We could also add a DocumentLoader just for injection that checks everything is correct. I can do that.
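A rough sketch of the first option, in case it helps. It is deliberately standalone: `PreprocessedTextLoader`, its `load` method, the form-feed page separator, and the `{"content": ...}` page shape are all assumptions made for illustration, not ExtractThinker's actual interface. A real loader would subclass the library's DocumentLoader and return whatever structure that class expects.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PreprocessedTextLoader:
    """Hypothetical loader for text that was already extracted and cleaned.

    It does not subclass ExtractThinker's DocumentLoader. It only mirrors the
    idea of returning structured pages instead of a bare string.
    """

    # Assumption: the caller separates pages with form feeds.
    page_separator: str = "\f"

    def load(self, source: str) -> List[Dict[str, Any]]:
        # Reject anything that is not usable text before building pages.
        if not isinstance(source, str) or not source.strip():
            raise ValueError("expected a non-empty, pre-processed string")
        pages = [page.strip() for page in source.split(self.page_separator)]
        # One dict per non-empty page; the "content" key is an assumed shape.
        return [{"content": page} for page in pages if page]


pdf_string = "Invoice 0001\nTotal: 42.00\fTerms: 30 days"
pages = PreprocessedTextLoader().load(pdf_string)
```

The validation in `load` is the "checks that everything is correct" part: a dedicated injection loader can fail early on empty or non-text input rather than passing it on to the model.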
-
@enoch3712, thanks for the hint! The code works like below.
-
Hello @doncat99! Yes, sounds good!
-
Gonna take a look at this for the next release.
-
Above is the standard usage of ExtractThinker.
What if I already have custom processing for the PDF document, such as removing headers and footers and filtering out the target string from the PDF document, and I want the extractor to continue based on my pdf_string?
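For context, the kind of custom processing meant here might look like the sketch below. It assumes the PDF text has already been extracted page by page, and that headers and footers are the first and last lines repeated across pages. `strip_repeated_lines` and the sample pages are made up for illustration.

```python
from collections import Counter
from typing import List


def strip_repeated_lines(pages: List[str], min_pages: int = 2) -> str:
    """Drop first/last lines that repeat across pages, then join the pages."""
    split_pages = [page.splitlines() for page in pages]
    # Count how often each first and last line occurs across all pages.
    firsts = Counter(lines[0].strip() for lines in split_pages if lines)
    lasts = Counter(lines[-1].strip() for lines in split_pages if lines)

    cleaned = []
    for lines in split_pages:
        # A first line seen on several pages is treated as a header.
        if lines and firsts[lines[0].strip()] >= min_pages:
            lines = lines[1:]
        # A last line seen on several pages is treated as a footer.
        if lines and lasts[lines[-1].strip()] >= min_pages:
            lines = lines[:-1]
        cleaned.append("\n".join(lines))
    return "\n".join(text for text in cleaned if text)


sample_pages = [
    "ACME Corp\nInvoice 0001\nConfidential",
    "ACME Corp\nTotal: 42.00\nConfidential",
]
pdf_string = strip_repeated_lines(sample_pages)
```

The resulting `pdf_string` is what I would like to hand to the extractor instead of the original file.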