Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add microarchitectural events for Arm, bump HOL Light version #195

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

aqjune-aws
Copy link
Collaborator

@aqjune-aws aqjune-aws commented Feb 21, 2025

We define that an instruction raises an observable microarchitectural
event if its cycles/power consumption/anything that can be observed by
a side-channel attacker can vary depending on the inputs of
the instruction. For example, instructions taking a constant number of
cycles like ADD do not raise an observable event, whereas cond branch does.
Its kinds (EventLoad/Store/...) describe the events distinguishable from
each other by the attacker, and their parameters describe the values
that are inputs and/or outputs of the instructions that will affect the
observed cycles/etc.
An opcode of instruction is not a parameter of the event, even if the
number of taken cycles may depend on opcode. This relies on an assumption
that a program is public information.
One instruction can raise multiple events (e.g., one that reads PC from
the memory and jumps to the address, even though this case will not exist
in Arm).

This largely imports from the codebase of Abdal in his last summer internship.
Also, the branch event has all source and target PCs.

--

Bump HOL Light commit hash, use the upstreamed check_axioms.
Also check that there is no redefinition of check_axiom in s2n-bignum.
Resolve possible slowdowns in ARM decoder.

--

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jimgrundy
Copy link

There is one important difference that I added: the event on conditional
jump isn't taking its boolean condition anymore, but now the destination PC
of the jumps. This is because Arm has a jump instruction to the PC stored in
a register (which is RET but we also have arm_BL_ABSOLUTE).

I worry that this may not be enough for what we want to achieve, which is to record the events and their inputs that might influence micro-architectural structures in ways that could be (indirectly) observable. For example, different kinds of branches train branch predictors in different ways. It might not be sufficient to simply record that there was a branch and where it went. We might need to record more, like was it a conditional branch, was the condition true or false, was it a return (those can be treated specially), was it an indirect branch (though I expect s2n-bignum doesn't use those) ... all of which can influence branch prediction data-structures.

@aqjune-aws
Copy link
Collaborator Author

I think you are right. I am planning to make it even record the actual opcode that it executed (e.g., b, cbnz, ret...) to be able to avoid any confusion.

@aqjune-aws
Copy link
Collaborator Author

Cleaned up the commit history & PR description. I can see there is a slowdown in CI (3hr 40 min -> 4hr 10min). This was somewhat expected, but interestingly this is causing a significant slowdown to Arm program decoder as well; decoding 600 instruction takes 1 min. :/

One unresolved question on my side was what should we do with a branch predictor that may take the history of taken/untaken branches as input? In the formal semantics, EventJump event does not explicitly take a list of branch history; the constructor only has the info of the 'current' branch.
Luckily, from the perspective of what are going to be eventually proven for the constant-time property, s2n-bignum still compare the whole trace of events after function returned, so the history of taken/untaken branches can be naturally compared.

@jimgrundy
Copy link

what should we do with a branch predictor that may take the history of taken/untaken branches as input

I'm not sure you need to do anything until you want to start establishing the absense of side channels even while considering the effects of speculation. For now having the sequence of events might be enough.

@aqjune-aws aqjune-aws closed this Feb 24, 2025
@aqjune-aws aqjune-aws reopened this Feb 24, 2025
@aqjune-aws aqjune-aws marked this pull request as ready for review February 28, 2025 17:02
@aqjune-aws
Copy link
Collaborator Author

As discussed in a separate thread, to soundly assume that a program is public information, we will need to separately prove that (1) each assembly subroutine is never updated, (2) the program does not interpret input buffers as a sequence of assembly commands & execute them.

We define that an instruction raises an observable microarchitectural
event if its cycles/power consumption/anything that can be observed by
a side-channel attacker can vary depending on the inputs of
the instruction. For example, instructions taking a constant number of
cycles like ADD do not raise an observable event, whereas cond branch does.
Its kinds (EventLoad/Store/...) describe the events distinguishable from
each other by the attacker, and their parameters describe the values
that are inputs and/or outputs of the instructions that will affect the
observed cycles/etc.
An opcode of instruction is not a parameter of the event, even if the
number of taken cycles may depend on opcode. This relies on an assumption
that a program is public information.
One instruction can raise multiple events (e.g., one that reads PC from
the memory and jumps to the address, even though this case will not exist
in Arm).

This largely imports from the codebase of Abdal in his last summer internship.
Also, the branch event has all source and target PCs.
…ME_FLAGS

In the arm/proofs dir:
```
sed -i 's/MAYCHANGE SOME_FLAGS/MAYCHANGE SOME_FLAGS ,, MAYCHANGE [events]/g' *.ml
```
Also check that there is no redefinition of check_axiom in s2n-bignum.
Also use HOL Light's new find_index
The BITMATCH_MEMO_CONV (for_hollight.ml) memoizes bitmatch body expressions with their corresponding decision trees,
but if the bitmatch body uses opcode, it could not hit the memo cache because the bitmatch body changed for every opcode.
Concretely speaking,  decode w = bitmatch w with | [pattern] -> ...(w)... | [pattern] -> ...(w) | ... , but if w was concretized to say word 0x12345678, the body of bitmatch was instantiated to bitmatch (word 0x12345678) with | [pattern] -> ...(word 0x12345678) | ...  .
Therefore, BITMATCH_MEMO_CONV could not reuse the cached decision tree unless exactly same opcode was decoded again.

To resolve it, I added `conceal_bitmatch` that hides a bitmatch in a term under a fresh constant & a new conversion that evaluates comb(the fresh constant , args) using BITMATCH tactics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants