This interface lets kdb+ users read and create HDF5 datasets. It is a thin kdb+ wrapper around the HDF Group's C API for HDF5, outlined in full here.
This is part of the Fusion for kdb+ interface collection.
Kdb+ is the world's fastest time-series database, optimized for ingesting, analyzing and storing massive amounts of structured data. To get started with kdb+, please visit https://code.kx.com/q/learn/ for downloads and developer information. For general information, visit https://kx.com/
Hierarchical Data Format 5 (HDF5) is a file format designed specifically for the storage and organization of large amounts of data.
In many ways, HDF5 acts like a hierarchical file system similar to that used by Linux or Windows. This structure contains two major objects:
- Datasets - Multidimensional arrays of homogeneously-typed data, or compound data containing a mixture of types. Datasets are similar to files within a traditional POSIX file system.
- Groups - Container structures holding datasets or other groups. They function similarly to directories within a traditional POSIX file system.
There are also a number of secondary objects and structures. These add complexity to the format, but they allow it to be used for a wider range of use cases:
- Attributes: These allow metadata information to be associated with a dataset or group e.g. associate the date of data collection with a group, or the temperature when a set of results was collected.
- Linking functionality: Like a traditional POSIX file system it is possible to create links between objects (hard/soft/external). These allow datasets or groups relevant to multiple experiments to be accessed via more intuitive routes.
If you have any HDF5 related questions, you can raise them on the HDF Forum.
- kdb+ ≥ 3.5 64-bit
- HDF5 C API ≥ 1.10.9
We recommend installing this interface from a release. This takes the following steps:
- Ensure you have downloaded and installed the HDF Group's C API for HDF5, following the instructions here
- Download a release from here
- Install by executing the following from the Release directory. NOTE: by default, the q executable script q/hdf5.q and binary file lib/hdf5kdb.(so|dll) will be copied to $QHOME and $QHOME/[mlw](64) respectively, but you can override this by setting the environment variables Q_SCRIPT_DIR and/or Q_SHARED_LIB_DIR.
## Linux/MacOS
$ chmod +x install.sh && ./install.sh
## Windows
>install.bat
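As a rough sketch of the override mentioned in the note above, the following sets both variables before running the Linux/MacOS installer. The target directories are hypothetical and chosen only for illustration. Whether the installer creates missing directories is not stated here, so the sketch creates them first as a precaution.

```shell
# Hypothetical target directories (assumptions for illustration); use your own.
Q_SCRIPT_DIR="${HOME}/kdb/scripts"
Q_SHARED_LIB_DIR="${HOME}/kdb/lib"
export Q_SCRIPT_DIR Q_SHARED_LIB_DIR

# Create the directories up front in case the installer does not (assumption).
mkdir -p "$Q_SCRIPT_DIR" "$Q_SHARED_LIB_DIR"

# Run the installer only if we are in the Release directory.
if [ -f ./install.sh ]; then
    chmod +x install.sh && ./install.sh
else
    echo "install.sh not found; run this from the Release directory" >&2
fi
```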
In order to successfully build and install this interface, the following environment variables must be set:
- HDF5_INSTALL_DIR = Location of the HDF5 C API installation (directory containing /include and /lib subdirectories).
- QHOME = q installation directory (directory containing q.k).
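Before building, it can help to confirm that both variables point where the build expects. The function below is a minimal sketch of such a check and is not part of the interface; the name check_build_env is hypothetical, and it tests only the layout described above (include and lib under HDF5_INSTALL_DIR, q.k under QHOME).

```shell
# check_build_env: hypothetical helper, not shipped with the interface.
# Returns 0 if HDF5_INSTALL_DIR and QHOME look usable, 1 otherwise.
check_build_env() {
    status=0
    if [ -z "${HDF5_INSTALL_DIR:-}" ]; then
        echo "HDF5_INSTALL_DIR is not set" >&2
        status=1
    else
        # The HDF5 install directory should contain include/ and lib/.
        for sub in include lib; do
            if [ ! -d "$HDF5_INSTALL_DIR/$sub" ]; then
                echo "missing directory: $HDF5_INSTALL_DIR/$sub" >&2
                status=1
            fi
        done
    fi
    if [ -z "${QHOME:-}" ]; then
        echo "QHOME is not set" >&2
        status=1
    elif [ ! -f "$QHOME/q.k" ]; then
        # QHOME should be the directory that contains q.k.
        echo "q.k not found in $QHOME" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_build_env before the build commands prints each problem it finds to standard error and returns a non-zero status if any check fails.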
Install the HDF5 C API according to your platform.
Linux
Download a supported release of HDF5 and install it; instructions are provided in the README of the HDF5 packages. Then set HDF5_INSTALL_DIR to your install directory and add the path to LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=${HDF5_INSTALL_DIR}/lib:$LD_LIBRARY_PATH
MacOS
Run:
$ brew install hdf5
Then set HDF5_INSTALL_DIR to your install directory and add the path to DYLD_LIBRARY_PATH.
$ export DYLD_LIBRARY_PATH=${HDF5_INSTALL_DIR}/lib:$DYLD_LIBRARY_PATH
Windows
Install the Windows C API for HDF5 following the instructions provided here. Then set HDF5_INSTALL_DIR to your install directory and create links to the DLLs in the %QHOME%\w64 directory, e.g.
:: Download zip for Windows and unzip it. ex.) hdf5-1.12.0.zip from https://confluence.hdfgroup.org/display/support/HDF5%201.12.0
> cd hdf5[some version]
> mkdir build
> mkdir install
> set HDF5_INSTALL_DIR=%cd%\install
> cd build
build> cmake --config Release -DCMAKE_INSTALL_PREFIX=%HDF5_INSTALL_DIR% .. -DBUILD_TESTING:BOOL=OFF
build> cmake --build . --config Release --target install
build> cd %QHOME%\w64
w64> MKLINK libhdf5.dll %HDF5_INSTALL_DIR%\bin\libhdf5.dll
w64> MKLINK hdf5.dll %HDF5_INSTALL_DIR%\bin\hdf5.dll
]$ mkdir build && cd build
build]$ cmake ..
build]$ cmake --build . --target install
From a Visual Studio command prompt:
> mkdir build && cd build
build> cmake --config Release ..
build> cmake --build . --config Release --target install
This interface is in active development, and as such a number of use cases are currently not supported:
- Creation of compressed datasets
- Access to unlimited datasets
- Interaction with HDF5 images
If your use case requires the above functionality to be available, please open an issue here. If you are capable, please consider contributing to the project.
Please view this HDF5 FAQ article that explains when you can use threads (e.g. peach) or concurrently process files from multiple instances.
By default, our releases are not built with multi-thread support. If you need that functionality, download the source from our release page and build it against an HDF5 library with threading enabled. If you do not intend to use multiple threads, we recommend the standard non-thread-safe version of HDF5.
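To see whether a given HDF5 installation was built with threading enabled, one option is to inspect its build settings. The sketch below assumes the settings text contains a "Threadsafety:" line whose value is yes, on or true when enabled, which is how HDF5 settings dumps typically look; the exact wording may differ between HDF5 versions and build systems. The function name hdf5_threadsafe is hypothetical.

```shell
# hdf5_threadsafe: hypothetical helper that reads an HDF5 settings dump on
# standard input and succeeds only if it reports thread safety as enabled.
# Assumes a line such as "Threadsafety: yes" (or ON / true).
hdf5_threadsafe() {
    grep -i 'threadsafety' | grep -Eiq ':[[:space:]]*(yes|on|true)'
}

# Example usage (assumes the h5cc wrapper is installed and on PATH):
#   h5cc -showconfig | hdf5_threadsafe && echo "thread-safe build"
# Or, if your install ships a settings file (location is an assumption):
#   hdf5_threadsafe < "$HDF5_INSTALL_DIR/lib/libhdf5.settings"
```

The helper only reports what the settings text says; it does not confirm that the interface itself was built against that library.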
Documentation outlining the functionality available for this interface can be found here.
The HDF5 interface is provided here under an Apache 2.0 license.
If you find issues with the interface or have feature requests, please consider raising an issue here.
If you wish to contribute to this project, please follow the contributing guide here.