ML-DSA: Optimize implementation #205
I'm planning to implement the following optimizations:
As a first indicator, we are seeing the speedups below. Overall, they range from roughly 4x to 20x.
In the sample runtime you pasted, the encoding functions take only one parameter, but they should take more than that. Calling
Edit: I think this is a one-off bug in
Indeed, thanks for pointing this out! I've updated the comment with the correct results. I see at least a 9x speedup on all encode functions now.
The version of ML-DSA currently being developed adheres very closely to the spec. However, the spec is not written in a way that lends itself to fast Cryptol code. There are two particular types of slow code that I've noticed:
`join`, `reverse`, etc.)

In general, we don't want to delete the spec-adherent code, because spec adherence is a high-level goal of this repository. We can either add separate functions inline (e.g. `BitsToBytes_fast`) or make a separate module with the fast versions; this might depend a little on the architecture decision we make in #198. In either case we need to prove equivalence between the spec version and the fast version.

Here are a few notes about things I've seen that could probably be faster:
- `IntegerToBits`: we could use the built-in function `fromInteger`, with a `reverse` call to get the endianness right. Similarly with `BitsToInteger` and probably `IntegerToBytes`.
- `BitsToBytes` and `BytesToBits`: these should just be `split` and `join` calls.
- `BitPack` functions (Alg 16 - 19) index into an array, but they could iterate directly over it instead.
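
To make the "spec version vs. fast version" idea concrete, here is a rough Python model (not Cryptol, and not code from this repository) of two of the conversions mentioned above. All function names are hypothetical, and the little-endian bit order is my reading of the spec's conversion algorithms. The loop-style functions mimic the spec's step-by-step shape, while the direct ones mimic the "reverse the binary representation" and "split into groups of 8" formulations. The final check only tests equivalence exhaustively for small widths; in Cryptol this would presumably be stated as a property and proved instead.

```python
def integer_to_bits_spec(x: int, alpha: int) -> list[int]:
    """Loop-style conversion: peel off one bit per step, least significant first."""
    bits = []
    for _ in range(alpha):
        bits.append(x % 2)
        x //= 2
    return bits


def integer_to_bits_fast(x: int, alpha: int) -> list[int]:
    """Direct conversion: take the big-endian binary string and reverse it."""
    # format() with width 0 would still print "0", so handle alpha == 0 separately.
    big_endian = format(x % (1 << alpha), f"0{alpha}b") if alpha else ""
    return [int(c) for c in reversed(big_endian)]


def bits_to_bytes_spec(bits: list[int]) -> list[int]:
    """Loop-style packing: add each bit into its byte by index arithmetic."""
    out = [0] * ((len(bits) + 7) // 8)
    for i, b in enumerate(bits):
        out[i // 8] += b * 2 ** (i % 8)
    return out


def bits_to_bytes_fast(bits: list[int]) -> list[int]:
    """Direct packing: split into groups of 8 and read each group little-endian."""
    groups = [bits[k:k + 8] for k in range(0, len(bits), 8)]
    return [sum(b << j for j, b in enumerate(g)) for g in groups]


def check_equivalence(max_alpha: int = 10) -> bool:
    """Exhaustively compare the loop-style and direct versions for small widths."""
    for alpha in range(max_alpha + 1):
        for x in range(1 << alpha):
            bits = integer_to_bits_spec(x, alpha)
            if bits != integer_to_bits_fast(x, alpha):
                return False
            if bits_to_bytes_spec(bits) != bits_to_bytes_fast(bits):
                return False
    return True
```

Calling `check_equivalence()` compares both pairs of functions on every input up to 10 bits wide. That is only a sanity test of the idea, not a substitute for the equivalence proofs discussed above.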