You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, the LLVM memory model in crucible-llvm comes with the restriction that all strings must be symbolic. Here, when I say "string", I really mean "C string", i.e., a fixed-length array of bytes that is followed by a null terminator. This restriction impacts several parts of the crucible-llvm codebase, including our printf and strlen overrides and the various parts of crucible-symio that interface with strings.
This issue tracks possible ways to this lift this restriction. The restriction manifests itself in this part of loadString:
v <- doLoad bak mem p (bitvectorType 1) (LLVMPointerRepr (knownNat ::NatRepr8)) noAlignment
x <-Partial.projectLLVM_bv bak v
caseBV.asUnsigned <$> asBV x of
Just0->return$ f []
Just c ->do
let c' ::Word8=toEnum$fromInteger c
p' <- doPtrAddOffset bak mem p =<< bvLit sym PtrWidth (BV.one PtrWidth)
go (f . (c':)) p' (fmap (\n -> n -1) maxChars)
Nothing->
addFailedAssertion bak
$UnsupportedGHC.callStack "Symbolic value encountered when loading a string"
Here, the only place where we actually make use of the string being concrete is in the null terminator check. However, this is not an all-or-nothing proposition: it is possible for a string to contain some symbolic characters but still end with a concrete null terminator character. To help distinguish between these types of strings, I will refer to strings that entirely symbolic characters (including the null terminator) as fully symbolic strings, while I will refer to strings that some symbolic characters followed by a concrete null terminator as mostly symbolic strings.
Handling fully symbolic strings is definitely not possible for Crucible. That would be tantamount to asking Crucible to handle strings of symbolic length, which is something that Crucible is fundamentally unequipped to handle. Handling mostly symbolic strings, however, is definitely possible, as that would allow Crucible to say with certainty where strings start and end. It doesn't matter so much whether the characters preceding the null terminator are symbolic.
One proposal for handling mostly symbolic strings is to remove the Nothing case in loadString.go above, and generalize the Just c case to handle all characters that are not concretely equal to the null terminator. (This would require removing the use of bvLit.) I've successfully employed this technique on a downstream project, and it should work nicely with Crux's crucible_string function, which allocates mostly symbolic strings.
Another possibility is to back C strings with SMT strings, which are supported in recent versons of various SMT solvers. This would largely accomplish the same goals while perhaps being more efficient due to not needing to have separate memory log entries for each individual character. On the other hand, we would need to be able to convert from SMT strings to individual characters and vice versa, and what4 does not yet offer a way to do this. Given the amount of work required for this option, I'm really only mentioning it for the sake of completeness.
The text was updated successfully, but these errors were encountered:
Currently, the LLVM memory model in
crucible-llvm
comes with the restriction that all strings must be symbolic. Here, when I say "string", I really mean "C string", i.e., a fixed-length array of bytes that is followed by a null terminator. This restriction impacts several parts of thecrucible-llvm
codebase, including ourprintf
andstrlen
overrides and the various parts ofcrucible-symio
that interface with strings.This issue tracks possible ways to this lift this restriction. The restriction manifests itself in this part of
loadString
:crucible/crucible-llvm/src/Lang/Crucible/LLVM/MemModel.hs
Lines 1219 to 1230 in 994b4f9
Here, the only place where we actually make use of the string being concrete is in the null terminator check. However, this is not an all-or-nothing proposition: it is possible for a string to contain some symbolic characters but still end with a concrete null terminator character. To help distinguish between these types of strings, I will refer to strings that entirely symbolic characters (including the null terminator) as fully symbolic strings, while I will refer to strings that some symbolic characters followed by a concrete null terminator as mostly symbolic strings.
Handling fully symbolic strings is definitely not possible for Crucible. That would be tantamount to asking Crucible to handle strings of symbolic length, which is something that Crucible is fundamentally unequipped to handle. Handling mostly symbolic strings, however, is definitely possible, as that would allow Crucible to say with certainty where strings start and end. It doesn't matter so much whether the characters preceding the null terminator are symbolic.
One proposal for handling mostly symbolic strings is to remove the
Nothing
case inloadString.go
above, and generalize theJust c
case to handle all characters that are not concretely equal to the null terminator. (This would require removing the use ofbvLit
.) I've successfully employed this technique on a downstream project, and it should work nicely with Crux'scrucible_string
function, which allocates mostly symbolic strings.Another possibility is to back C strings with SMT strings, which are supported in recent versons of various SMT solvers. This would largely accomplish the same goals while perhaps being more efficient due to not needing to have separate memory log entries for each individual character. On the other hand, we would need to be able to convert from SMT strings to individual characters and vice versa, and
what4
does not yet offer a way to do this. Given the amount of work required for this option, I'm really only mentioning it for the sake of completeness.The text was updated successfully, but these errors were encountered: