Releases: marqo-ai/marqo
Release 2.15.0
New Features
- Global Score Modifiers for Hybrid Search (#1082). Introduce global score modifiers for hybrid search, allowing fine-tuned adjustments to RRF scores in combined result lists. This enhancement provides better control over returned results in hybrid search scenarios. Use the rerankDepth parameter to control the number of hits to rerank. For detailed usage, check here.
- Custom LanguageBind Model (#1072). Marqo now supports loading custom LanguageBind models from S3 buckets, URLs, or HuggingFace model cards. Fine-tune your own LanguageBind model and integrate it with Marqo to achieve better in-domain results. For more details, see here.
- Add Marqtune models to the model registry (#1063). Marqtune models have been added to the model registry, with some models renamed to align with the Marqtune naming convention. This update improves consistency and makes it easier to identify models across Marqo and Marqtune. The changes are fully backwards compatible.
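As a rough illustration of the global score modifiers feature, the sketch below assembles a hybrid search request body in Python. Only rerankDepth is named in these notes; the other keys (searchMethod, hybridParameters, scoreModifiers, add_to_score) and the popularity field are assumptions modelled on Marqo's REST search API, so verify them against the API reference before relying on them.

```python
from typing import Any, Dict, List, Optional


def build_hybrid_search_body(
    query: str,
    add_to_score: List[Dict[str, Any]],
    rerank_depth: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body for a hybrid search request.

    Key names other than "rerankDepth" are assumptions, not confirmed
    by the release notes.
    """
    if rerank_depth is not None and rerank_depth < 0:
        raise ValueError("rerank_depth must be non-negative")

    body: Dict[str, Any] = {
        "q": query,
        "searchMethod": "HYBRID",
        # Assumed hybrid settings: merge lexical and tensor hits with RRF.
        "hybridParameters": {
            "retrievalMethod": "disjunction",
            "rankingMethod": "rrf",
        },
        # Global modifiers, applied to the fused RRF scores.
        "scoreModifiers": {"add_to_score": list(add_to_score)},
    }
    if rerank_depth is not None:
        # Number of hits whose scores are adjusted by the modifiers.
        body["rerankDepth"] = rerank_depth
    return body


# "popularity" is a hypothetical numeric field in the index.
example_body = build_hybrid_search_body(
    "red running shoes",
    add_to_score=[{"field_name": "popularity", "weight": 0.5}],
    rerank_depth=50,
)
```

The resulting dictionary could be posted to the search endpoint of an index. Leaving rerank_depth unset omits the key, so the server-side default applies.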
Bug Fixes and Minor Changes
- Fix a bug where Marqo incorrectly inferred the modality of a field, even when the field is not a tensor field in an unstructured index (#1086).
- Resolve an issue where LanguageBind models could only handle a single video or audio file in a weighted search query. You can now provide multiple video and audio files in a weighted query (#1072).
- Improve memory usage when indexing image documents with LanguageBind models, enabling more efficient handling of image data (#1072).
Contributor Shout-Outs
Release 2.14.1
New features
- Add support for hf_transfer to accelerate the downloading speed of HuggingFace models by 10 to 30 times. See here for details about how to enable it (#1066).
- Add /healthz endpoint for Marqo container liveness checks, which performs a status check for CUDA devices and returns a 500 error if any existing CUDA devices become unavailable or run out of memory (#1068).
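A liveness probe against the /healthz endpoint might look like the following minimal sketch. It assumes the container is reachable at http://localhost:8882 and that a healthy container answers with HTTP 200; the notes only state that a 500 is returned when CUDA devices fail. The function name marqo_is_live is hypothetical.

```python
import urllib.error
import urllib.request


def marqo_is_live(base_url: str = "http://localhost:8882", timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/healthz answers with HTTP 200.

    The base URL and the 200-on-success behaviour are assumptions.
    """
    url = base_url.rstrip("/") + "/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # Raised for 4xx/5xx responses, e.g. the 500 returned when a
        # CUDA device becomes unavailable or runs out of memory.
        return False
    except OSError:
        # Connection refused, DNS failure, or timeout.
        return False
```

An orchestrator's liveness check could call this periodically and restart the container after repeated False results.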
Bug fixes and minor changes
- Fix a bug where numeric map fields are not returned when searching with the attributes_to_retrieve parameter for unstructured indexes created prior to Marqo 2.13 (#1062).
- Fix a bug where numeric fields, numeric map fields, boolean fields and string array fields are not returned when searching with the attributes_to_retrieve parameter for unstructured indexes created with Marqo 2.13 or later (#1062).
- Fix a bug where the document-processing element is removed from the services.xml config file when bootstrapping the vector store (#1075).
Release 2.13.4
Release 2.13.3
Bug fixes and minor changes
- Fix a bug where numeric map fields are not returned when searching with the attributes_to_retrieve parameter for unstructured indexes created prior to Marqo 2.13 (#1062).
- Fix a bug where numeric fields, numeric map fields, boolean fields and string array fields are not returned when searching with the attributes_to_retrieve parameter for unstructured indexes created with Marqo 2.13 or later (#1062).
Release 2.14.0
New features
- FFmpeg-CUDA Support (#1030). Add GPU acceleration for video decoding by integrating FFmpeg with CUDA support. This feature significantly improves video processing performance, making video handling up to 5 times faster. Check here for guidance and requirements.
- Video and audio file size limits (#1012). Introduce configurable size limits for video and audio files in the add_documents, search, and embed endpoints. This enhancement allows users to manage and optimize resource usage effectively, ensuring smoother processing of multimedia content. Check here for more details.
- Upgrade to Python 3.9 (#1006). Upgrade the Marqo Docker image to use Python 3.9. With Python 3.8 reaching its End of Life (EOL), we have upgraded our platform to Python 3.9 to maintain security, compatibility, and access to ongoing support.
Bug fixes and minor changes
- Move NLTK resource downloads to Marqo's startup process and remove the unsafe punkt package. This avoids potential cold start issues and enhances security (#1040).
- Fix model serialization in OpenAPI specifications. This community-contributed PR resolves issues with OpenAPI spec generation and SwaggerUI by fixing incorrect type hints in the API definition, ensuring accurate model serialization and improving API documentation accessibility (#986).
- Add brief description for each endpoint in the OpenAPI specifications. This improves API documentation clarity for users (#1042).
Contributor shout-out
- Shoutouts to our valuable 4.7k stargazers!
- Thanks a lot for the heated discussion and suggestions in our community. We love to hear your thoughts and requests. Join our Slack channel and forum now.
- Special thanks to community contributor @gabauer for their impactful PR, helping improve Marqo for everyone!
Release 2.13.2
Bug fixes and minor changes
- Fix a bug where adding documents with numeric lists to an unstructured index results in a 500 error. Marqo now processes the document batch successfully and returns a 400 error only for the individual documents that contain numeric lists (#1034).
- Fix validation of custom vector fields. Custom vector fields were silently ignored when not specified as tensor fields for an unstructured index. This now triggers a 400 error, which guides users to define the field as a tensor field (#1034).
- Improve the bootstrapping process to prevent Marqo from crashing during startup when the vector store takes longer to converge, especially with multiple indexes (#1036).
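Because invalid documents are now rejected individually instead of failing the whole batch, callers may want to inspect the per-document results. The sketch below assumes an add_documents response shaped like {"errors": ..., "items": [{"_id": ..., "status": ...}, ...]}; that shape, the message key, and the helper name failed_documents are assumptions, not taken from these notes.

```python
from typing import Any, Dict, List


def failed_documents(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the per-document entries whose status is 400 or above.

    Assumes each entry in response["items"] carries an integer "status".
    """
    return [
        item
        for item in response.get("items", [])
        if item.get("status", 200) >= 400
    ]


# Hypothetical response: the second document contained a numeric list.
sample_response = {
    "errors": True,
    "items": [
        {"_id": "doc1", "status": 200},
        {"_id": "doc2", "status": 400, "message": "numeric lists are not supported"},
    ],
}

rejected_ids = [item["_id"] for item in failed_documents(sample_response)]
```

Here rejected_ids contains only "doc2", so the rest of the batch can be treated as indexed.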
Release 2.13.1
Bug fixes and minor changes
- Fix a bug where Marqo returns a 500 error if an inaccessible private image is encountered in the query or embed endpoint. Marqo now correctly returns a 400 error with a helpful error message (#1027).
- Fix a bug preventing Marqo from warming up LanguageBind models. Marqo now successfully warms up LanguageBind models as expected (#1031).
- Fix a bug where LanguageBind models always generate normalized embeddings for non-text content. These models now correctly produce unnormalized embeddings for video, audio, and image content (#1032).
Release 2.13.0
New features
- Searchable attributes for unstructured indexes (#968). This new feature allows you to specify which lexical or tensor fields to include in your search queries, providing greater control over the search process. By customizing your search parameters, you can enhance the precision of your results across all search types: tensor, lexical, and hybrid. This feature is available for unstructured indexes created with Marqo 2.13 or later. For detailed guidance, please refer to the API reference and comparison of unstructured and structured indexes.
- Support for stella_en_400M_v5 embedding models (#1021). This feature adds compatibility for the Stella 400M text embedding models, enhancing the versatility of Marqo in handling diverse model types. Users can now use the hf_stella model type in their custom models. Please refer to the stella model guide for details.
- Allow specifying pooling method for Hugging Face models (#954). Marqo can now infer the pooling method and accept a user-provided pooling method in model properties. For detailed examples, please refer to this document about bringing your own Hugging Face model.
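To make the searchable attributes feature concrete, here is a minimal sketch that builds a search request body restricted to chosen fields. The key names searchableAttributes and searchMethod are assumptions based on Marqo's REST API naming, the field names are hypothetical, and hybrid search is deliberately left out because its parameters may be structured differently.

```python
from typing import Any, Dict, List

# This sketch only covers the two single-method search types.
SUPPORTED_METHODS = {"TENSOR", "LEXICAL"}


def build_search_body(
    query: str,
    searchable_attributes: List[str],
    search_method: str = "TENSOR",
) -> Dict[str, Any]:
    """Build a search body limited to the given fields (assumed key names)."""
    method = search_method.upper()
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported search method: {search_method}")
    if not searchable_attributes:
        raise ValueError("provide at least one searchable attribute")
    return {
        "q": query,
        "searchMethod": method,
        "searchableAttributes": list(searchable_attributes),
    }


# "title" and "description" are hypothetical fields of an unstructured index.
lexical_body = build_search_body(
    "waterproof jacket", ["title", "description"], search_method="lexical"
)
```

Per the note above, such a request would only take effect on unstructured indexes created with Marqo 2.13 or later.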
Bug fixes and minor changes
- Normalize custom vectors during indexing when normalizeEmbeddings is set to True for indexes created with Marqo 2.13 or later (#970). This fix ensures that custom vector fields align with other tensor fields in terms of normalization, resulting in more accurate search results and improved overall performance.
- Enhanced query parser for double quotes (#979). This feature introduces improved parsing logic for handling double quotes in search queries, allowing for greater flexibility and resilience against syntax errors. Badly formatted and escaped quotes no longer lead to 500 status errors. Please refer to the lexical search guide for more details and examples.
- Bug fix for score modifiers handling (#1008). This update resolves an issue related to the handling of score modifiers in queries, specifically those involving the period (.) character. Users will now experience smoother query operations without encountering internal errors, ensuring that score modifiers are correctly applied.
- Bug fixes for media download and query handling (#1022). Users can now successfully download private media files by using the new mediaDownloadHeaders parameter, which will replace the deprecated imageDownloadHeaders. Additionally, the fix resolves issues preventing the inclusion of more than two modalities in weighted queries, along with support for indexing .png images in LanguageBind models.
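The custom-vector normalization fix (#970) concerns scaling vectors to unit length so they are comparable with other normalized tensor fields. The sketch below shows plain L2 normalization in Python as an illustration of that idea. It is not Marqo's implementation, and returning a zero vector unchanged is a choice made for this sketch only.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length.

    A zero vector has no direction, so this sketch returns it unchanged.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
# The norm of [3, 4] is 5, so the result is approximately [0.6, 0.8].
```

With unit-length vectors, the dot product of two vectors equals their cosine similarity, which is why mixing normalized and unnormalized vectors in one index can skew ranking.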
Contributor shout-outs
- Shoutouts to our valuable 4.6k stargazers!
- Thanks a lot for the discussion and suggestions in our community. We love to hear your thoughts and requests. Join our Slack channel and forum now.