Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate mlkem-native into libOQS #653

Open
hanno-becker opened this issue Jan 13, 2025 · 19 comments
Open

Integrate mlkem-native into libOQS #653

hanno-becker opened this issue Jan 13, 2025 · 19 comments

Comments

@hanno-becker
Copy link
Contributor

hanno-becker commented Jan 13, 2025

This issue is to track the ongoing integration of mlkem-native into libOQS.

Status:
PR to libOQS open here open-quantum-safe/liboqs#2041

@bhess
Copy link
Contributor

bhess commented Jan 13, 2025

Thanks for opening this issue here, @hanno-becker, to track the integration. I’m planning to open a draft PR in OQS shortly.

One thing I noticed with a recent change in mlkem-native is that I had to modify quite a few files due to the introduction of relative import paths, which are different in the liboqs import. Specifically, I had to revert some of the import path changes made in this commit.

Would it be possible to reconsider this approach? While I understand that it simplifies the build system, it reduces portability and makes integration with different build systems more difficult due to the rigid folder structure. Defining include directories in the build system might be a more flexible solution in the long run.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 13, 2025

@bhess Thank you for flagging this!

You are right that the include-path changes force you to keep mlkem-native's internal folder structure -- I wasn't expecting that one would want to tweak that. Can you say more about why/how this needs to change for OQS, so we can look for suitable alternatives?

Is it so you don't have to recursively crawl the source files and can store them flattened?

@bhess
Copy link
Contributor

bhess commented Jan 13, 2025

Is it so you don't have to recursively crawl the source files and can store them flattened?

Yes, that's correct. We typically maintain a flat directory structure for each implementation. Prior to commit 73c901b, this approach worked without requiring any modifications.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 13, 2025

@bhess Ok, understood.

Do you do this to keep the build specification simple, or are there other reasons why you don't want subdirectories?

I'm asking because as far as C files are concerned, there is the autogenerated 'monobuild' file examples/monolithic_build/mlkem_native_monobuild.c which bundles all *.c files into one. You can compile it just with gcc -c mlkem_native_monobuild.c, no includes needed.

I am working on the same for ASM files, but it's not yet there.

Can you say whether, in principle, copying the whole directory structure but then merely building a single monobuild file (one *.c, one *.S), would work for you?

Can you also clarify if this is blocking you, or whether you will still be able to open a draft PR as-is (albeit with patches)?

@bhess
Copy link
Contributor

bhess commented Jan 13, 2025

The main reason is how the automated import of upstream sources in liboqs works. We explicitly specify the source files to be imported, which are then flattened into a single directory. The .c and .S files are added to a list of files for compilation. While it's technically possible to import an entire folder structure, doing so would include files that aren't needed for compilation.

We don't have a requirement for a 'monobuild', and I'm unsure if it would be compatible with our current import mechanism, as each source file is treated as an independent compilation unit in our case.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 13, 2025

Thank you for the explanation, @bhess.

While it's technically possible to import an entire folder structure, doing so would include files that aren't needed for compilation.

And it's not possible to explicitly list the files to be imported and require that their folder structure be preserved?

We don't have a requirement for a 'monobuild', and I'm unsure if it would be compatible with our current import mechanism, as each source file is treated as an independent compilation unit in our case.

Good to know.

@bhess
Copy link
Contributor

bhess commented Jan 13, 2025

And it's not possible to explicitly list the files to be imported and require that their folder structure be preserved?

There hasn’t been a use case for this feature yet. While adding it could certainly be an option, I believe not being to rigid in the folder structure could benefit other consumers of mlkem-native. Different projects often have unique requirements, such as specific folder structures or the need for a single compilation unit (which appears to be addressed already by your monobuild approach).

@bhess
Copy link
Contributor

bhess commented Jan 13, 2025

See also the draft PR in liboqs: open-quantum-safe/liboqs#2041

@hanno-becker
Copy link
Contributor Author

@bhess Amazing, thank you very much for your work here, and for adjusting to our currently fast-moving code-base. Does OQS have a patch mechanism so that, in principle, you can use a patched version of mlkem-native for now?

@mkannwischer and I will think about how we could solve this so that both the monobuild and the 'flattened' build are convenient. Right now, I'm a bit out of ideas. If you have some -- shoot!

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 20, 2025

@bhess Question on namespacing: Is it of relevance to you to have a separate namespacing macro for FIPS202? I am considering the consolidation of MLKEM_NAMESPACE_PREFIX and FIPS202_NAMESPACE_PREFIX. It probably makes no difference since you are using a custom FIPS202 anyway?

@bhess
Copy link
Contributor

bhess commented Jan 20, 2025

That's right, we won't use the fips202 code of mlkem-native and therefore no need for namespacing it.

@bhess Amazing, thank you very much for your work here, and for adjusting to our currently fast-moving code-base. Does OQS have a patch mechanism so that, in principle, you can use a patched version of mlkem-native for now?

Correct, we do have a patch mechanism, which is the way I'm adjusting the imports at the moment,

@hanno-becker
Copy link
Contributor Author

That's right, we won't use the fips202 code of mlkem-native and therefore no need for namespacing it.

Thanks for the fast reply. I opened #672

@bhess
Copy link
Contributor

bhess commented Jan 21, 2025

I have opened the pull request in liboqs for review.

During the review, a concern was raised about the substantial size of the patch. This poses a challenge for maintaining the upstream code on our end since mlkem-native is (luckily) actively developed and the patch will therefore need to be frequently changed.

These are the two main reasons for the patch, where a solution on the upstream side would be helpful:

  • Including yaml files for the import mechanism

In liboqs we use yaml files to manage the import mechanism for all upstream sources integrated into liboqs. Would you consider adding these yaml files to your repository? If preferred, they could be organized within a subfolder to maintain structure.

  • Modifying hardcoded import paths in mlkem-native

The hardcoded import paths in mlkem-native required the modifications we discussed before. Would you be open to reconsidering this approach? I guess a potential solution could involve making the include paths configurable via a macro or compile definitions.

Your feedback on these suggestions would be appreciated.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 21, 2025

In liboqs we use yaml files to manage the import mechanism for all upstream sources integrated into liboqs. Would you consider adding these yaml files to your repository? If preferred, they could be organized within a subfolder to maintain structure.

Can you say more about the envisioned content of the YAML files? I'm fine with a META.yml matching the contents of META.json, but some of the fields in the YAML in the PR patch seem specific to how OQS manages imports, rather than a commonly used standard, but please correct me if I am wrong. cc @mkannwischer

See #679

The hardcoded import paths in mlkem-native required the modifications we discussed before. Would you be open to reconsidering this approach? I guess a potential solution could involve making the include paths configurable via a macro or compile definitions.

The current approach simplifies a potential integration into AWS-LC; I will take a closer look what an integration there might look like without relative imports.

@bhess I am sorry for being slow, but I'm still a bit confused about the import constraints in libOQS. In open-quantum-safe/liboqs#2041 it seems that you do import the library with the same directory structure as used in this repository, so it's not clear why the include-path patch is needed. Can you explain again what your constraints are?

@bhess
Copy link
Contributor

bhess commented Jan 22, 2025

See #679

Thank you!

The current approach simplifies a potential integration into AWS-LC; I will take a closer look what an integration there might look like without relative imports.

@bhess I am sorry for being slow, but I'm still a bit confused about the import constraints in libOQS. In open-quantum-safe/liboqs#2041 it seems that you do import the library with the same directory structure as used in this repository, so it's not clear why the include-path patch is needed. Can you explain again what your constraints are?

The imported folder structure isn't exactly the same. For instance, instead of importing the entire "native" folder, I only want to include "native/aarch64" for the ARM version. As a result, we don't have a "native" folder, which breaks the relative imports. A solution is introducing a flag on our end to preserve the full folder structure. I'll investigate this further.. And of course, if you find a way to avoid the relative paths that would be still beneficial :)

@hanno-becker
Copy link
Contributor Author

The imported folder structure isn't exactly the same. For instance, instead of importing the entire "native" folder, I only want to include "native/aarch64" for the ARM version. As a result, we don't have a "native" folder, which breaks the relative imports. A solution is introducing a flag on our end to preserve the full folder structure. I'll investigate this further.. And of course, if you find a way to avoid the relative paths that would be still beneficial :)

Ah ok, I understand. When the import YAML says foo/bar, then bar is imported at the root, regardless of whether it's a file or a directory.

Thank you for investigating -- yes, it would seem that a flag causing foo/bar to be imported as foo/bar with preceding mkdir -p foo would do the trick, and hopefully not be disruptive to the rest of the importer?

@bhess
Copy link
Contributor

bhess commented Jan 22, 2025

One more point while reviewing the patch I currently apply: In liboqs, we utilize our own FIPS202 implementations instead of the FIPS202 backends provided by mlkem-native. This is easily handled by not importing the fips202 folder.

However, the macro MLKEM_USE_NATIVE currently seems to serve a two purposes: Activating the native mlkem-native optimizations (which we use), and enabling the mlkem-native FIPS202 backend (which we do not use).

To address this, I’ve modified the guards around the relevant section of the configuration file:

mlkem-native/mlkem/config.h

Lines 162 to 164 in e219d47

#if defined(MLKEM_USE_NATIVE) && !defined(MLKEM_NATIVE_FIPS202_BACKEND)
#define MLKEM_NATIVE_FIPS202_BACKEND "fips202/native/default.h"
#endif /* MLKEM_NATIVE_FIPS202_BACKEND */

with

#if defined(MLKEM_USE_NATIVE_FIPS202) && !defined(MLKEM_NATIVE_FIPS202_BACKEND)

This change ensures that the FIPS202 backend is only activated when explicitly requested by defining MLKEM_USE_NATIVE_FIPS202, making it independent of the native optimization activation.

Would it be possible to integrate this change directly into mlkem-native? This adjustment would help avoid the need for the custom patch and provide better modularity for projects like liboqs that selectively use the mlkem-native features.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 22, 2025

Good spot, this needs fixing. Ideally, we wouldn't need a boolean flag and a filename -- I'll have a think if we can consolidate that at the same time.

@hanno-becker
Copy link
Contributor Author

hanno-becker commented Jan 22, 2025

@bhess See #684 I changed it along the lines you suggested.

It was actually possible to have just an arithmetic backend, and no FIPS202 backend, but it would require a custom config since the default config auto-defines the backend file if MLKEM_USE_NATIVE is set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants