Ingestion Operator Overview
The ingestion pipeline consists of ingestion operators. This page gives an overview of them.
For each operator, we list the important information, such as its factory class and properties.
An `ENUMERATOR`-typed operator is the start of the pipeline and emits the retrievables.
Operator Properties:
Property | Description |
---|---|
`mediaTypes` | A list of the media types to emit retrievables for. Each entry is one of `IMAGE`, `VIDEO`, `AUDIO`, `MESH`. |
Factory Class: `FileSystemEnumerator`
The `FileSystemEnumerator` emits retrievables based on the file system, specifically based on a location.
Local Ingestion Context Properties:
Property | Description |
---|---|
`path` | The path (relative to the working directory) from which to start the file tree walk. |
`depth` | The depth of the tree walk, e.g. 1 means one level deeper than the current working directory, 2 means two, etc. |
`skip` | How many items of the walk should be skipped from the start. |
`limit` | How many items the walk should emit, after skipping. |
`regex` | Optional regex pattern. The enumerator emits only files for which `fullpath.matches(Regex("pattern-string"))` returns true. |
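
To make the interplay of `path`, `depth`, `skip`, `limit` and `regex` concrete, the walk can be sketched as follows. This is a minimal illustration, not the actual `FileSystemEnumerator` code: the helper name `enumerateFiles`, the sorting step (added only to make the result deterministic) and the order "filter by regex, then skip, then limit" are assumptions.

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import kotlin.streams.asSequence

// Hypothetical helper (not part of vitrivr-engine): lists the regular files
// that a walk with the documented properties would visit.
fun enumerateFiles(
    path: Path,
    depth: Int = Int.MAX_VALUE,
    skip: Int = 0,
    limit: Int = Int.MAX_VALUE,
    regex: String? = null
): List<Path> {
    val pattern = regex?.let { Regex(it) }
    // Files.walk keeps directory handles open, so the stream is closed via use.
    return Files.walk(path, depth).use { stream ->
        stream.asSequence()
            .filter { Files.isRegularFile(it) }
            // matches() requires the whole path string to match the pattern.
            .filter { pattern == null || it.toString().matches(pattern) }
            .sorted()      // assumption: sorted only for a deterministic order
            .drop(skip)    // assumption: skip is applied after the regex filter
            .take(limit)
            .toList()
    }
}
```

Because `matches` tests the full path, a pattern such as `.*\.jpg` is needed to select JPEG files; a bare `\.jpg` would match nothing.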
Factory Class: `MemoryControlledFileSystemEnumerator`
The `MemoryControlledFileSystemEnumerator` emits retrievables based on the file system, specifically based on a location.
In contrast to the `FileSystemEnumerator`, this enumerator is memory-aware: it paces the emission based on the available memory and a heuristic, so as not to overload memory.
Local Ingestion Context Properties:
Property | Description |
---|---|
`path` | The path (relative to the working directory) from which to start the file tree walk. |
`depth` | The depth of the tree walk, e.g. 1 means one level deeper than the current working directory, 2 means two, etc. |
`skip` | How many items of the walk should be skipped from the start. |
`limit` | How many items the walk should emit, after skipping. |
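
The idea of memory-aware pacing can be illustrated with a small sketch. The heuristic actually used by the `MemoryControlledFileSystemEnumerator` is not described on this page, so everything below is an assumption for illustration: the names `usedHeapFraction` and `pacedSequence`, the 80% threshold, and the simple sleep-based back-off.

```kotlin
// Hypothetical heuristic: the fraction of the maximum JVM heap currently in use.
fun usedHeapFraction(): Double {
    val runtime = Runtime.getRuntime()
    val used = runtime.totalMemory() - runtime.freeMemory()
    return used.toDouble() / runtime.maxMemory()
}

// Emits items lazily, pausing before each one while memory usage is above the
// threshold. After maxWaits pauses the item is emitted anyway to avoid stalling.
fun <T> pacedSequence(
    items: Iterable<T>,
    maxUsedFraction: Double = 0.8,
    backoffMs: Long = 100,
    maxWaits: Int = 50,
    usedFraction: () -> Double = { usedHeapFraction() }
): Sequence<T> = sequence {
    for (item in items) {
        var waits = 0
        while (usedFraction() > maxUsedFraction && waits < maxWaits) {
            Thread.sleep(backoffMs)
            waits++
        }
        yield(item)
    }
}
```

The `usedFraction` parameter is injectable so the pacing logic can be exercised without actually filling the heap.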
A `DECODER`-typed operator decodes the media file to `Content`, ready for further processing.
Factory Class: VideoDecoder
A decoder for videos, which emits video and audio.
Local Ingestion Context Properties:
Property | Description |
---|---|
`timeWindowMs` | The duration of each segment (time window), in milliseconds. |
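
As an illustration of what a `timeWindowMs` setting implies, the sketch below splits a media duration into consecutive windows. It assumes fixed-length windows with a shorter final window when the duration is not an exact multiple; the helper `timeWindows` is hypothetical and not taken from the `VideoDecoder`.

```kotlin
// Hypothetical helper: splits [0, durationMs) into consecutive windows of
// timeWindowMs milliseconds; the last window may be shorter.
fun timeWindows(durationMs: Long, timeWindowMs: Long): List<LongRange> {
    require(durationMs >= 0) { "durationMs must not be negative" }
    require(timeWindowMs > 0) { "timeWindowMs must be positive" }
    return (0 until durationMs step timeWindowMs).map { start ->
        // Each range is inclusive, so it ends one millisecond before the next start.
        start until minOf(start + timeWindowMs, durationMs)
    }
}
```

For example, a 2500 ms video with `timeWindowMs = 1000` yields three windows under these assumptions: two full ones and a final one of 500 ms.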
An `EXTRACTOR`-typed operator extracts and analyses the content and performs the actual ingestion.
See Analyser Overview for more information on the extractors.
An `EXPORTER`-typed operator exports derivative artifacts, e.g. a thumbnail exporter produces thumbnails.
Exporters are defined in the schema; however, their properties can be overridden from the ingestion context.
Factory Class: ThumbnailExporter
(Defined in the schema and referenced by name.)
Produces thumbnails.
Local Ingestion Context Properties:
Property | Description |
---|---|
`maxSideResolution` | The size of the longer side, in pixels. |
`mimeType` | The mime type to use. One of `JPG`, `PNG`. |
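
The effect of `maxSideResolution` on the output dimensions can be worked through with a short sketch. It assumes the aspect ratio is preserved and the image is always scaled so that its longer side equals `maxSideResolution` (whether the `ThumbnailExporter` also upscales smaller images is not stated here); `thumbnailSize` is a hypothetical helper.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper: computes thumbnail dimensions so that the longer side
// equals maxSideResolution while the aspect ratio is preserved.
fun thumbnailSize(width: Int, height: Int, maxSideResolution: Int): Pair<Int, Int> {
    require(width > 0 && height > 0) { "image dimensions must be positive" }
    require(maxSideResolution > 0) { "maxSideResolution must be positive" }
    val scale = maxSideResolution.toDouble() / maxOf(width, height)
    // Clamp to at least one pixel so extreme aspect ratios stay valid.
    return Pair(
        maxOf(1, (width * scale).roundToInt()),
        maxOf(1, (height * scale).roundToInt())
    )
}
```

Under these assumptions, a 1920x1080 frame with `maxSideResolution = 400` becomes 400x225, and a 600x800 portrait image becomes 300x400.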
A `TRANSFORMER`-typed operator transforms incoming retrievables into outgoing ones and may aggregate or filter them.
Factory Class: TypeFilterTransformer
Filters incoming retrievables based on their type.
Local Ingestion Context Properties:
Property | Description |
---|---|
`type` | The type to allow through. One of `SOURCE:IMAGE`, `SOURCE:VIDEO`, `SOURCE:AUDIO`, `SOURCE:MESH` (custom filters could be defined). |
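
Conceptually, the filter lets a retrievable pass only if its type equals the configured `type`. A minimal sketch follows, assuming an exact string comparison; `SketchRetrievable` and `filterByType` are hypothetical stand-ins and not the vitrivr-engine types.

```kotlin
// Hypothetical stand-in for a retrievable; only the fields needed here.
data class SketchRetrievable(val id: String, val type: String)

// Keeps only the retrievables whose type equals the configured one.
fun filterByType(incoming: List<SketchRetrievable>, type: String): List<SketchRetrievable> =
    incoming.filter { it.type == type }
```

With `type` set to `SOURCE:VIDEO`, a mixed input of image and video retrievables would be reduced to the video ones only.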
Factory Class: TemplateTextTransformer
Takes a string template with placeholders and fills each placeholder with the retrievable's content from the corresponding field.
Local Ingestion Context Properties:
Property | Description |
---|---|
`template` | A string containing placeholders of the form `$fieldName`, where each placeholder name must match the name of the content field whose string content should be included. |
`defaultValue` | An optional text to include if a retrievable does not have content associated with a given `fieldName`. |
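
The substitution described by `template` and `defaultValue` can be sketched as below. This is an illustration under assumptions, not the `TemplateTextTransformer` implementation: field names are assumed to consist of word characters, the content is modelled as a plain map, and `fillTemplate` is a hypothetical helper.

```kotlin
// Matches placeholders of the form $fieldName (assumption: word characters only).
val PLACEHOLDER = Regex("\\$(\\w+)")

// Replaces each placeholder with the content of the field of the same name,
// falling back to defaultValue when the field has no content.
fun fillTemplate(
    template: String,
    contents: Map<String, String>,
    defaultValue: String = ""
): String =
    PLACEHOLDER.replace(template) { match ->
        contents[match.groupValues[1]] ?: defaultValue
    }
```

The lambda form of `replace` inserts the content literally, so a `$` inside a field's content is not interpreted as another placeholder.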
Factory Class: LastContentAggregator
Aggregates content based on the 'last' strategy.
Local Ingestion Context Properties:
none
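
The page does not spell out the 'last' strategy. One plausible reading, sketched here purely as an assumption, is that the aggregator keeps only the last content element of each content type; `SketchContent` and `aggregateLast` are hypothetical names.

```kotlin
// Hypothetical stand-in for a content element.
data class SketchContent(val type: String, val value: String)

// Assumed 'last' strategy: for every content type, keep only its last element.
fun aggregateLast(contents: List<SketchContent>): List<SketchContent> =
    contents.groupBy { it.type }.map { (_, sameType) -> sameType.last() }
```

Under this reading, a segment with several image frames and one audio element would be reduced to its final frame plus the audio element.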
Disclaimer: Please keep in mind, vitrivr and vitrivr-engine are predominantly research prototypes.