Introduction
hax is a tool for high assurance translations that translates a large subset of Rust into formal languages such as F* or Coq. This extends the scope of the hacspec project, which was previously a DSL embedded in Rust, to a usable tool for verifying Rust programs.
So what is hacspec now?
hacspec is the functional subset of Rust that can be used, together with a hacspec standard library, to write succinct, executable, and verifiable specifications in Rust. These specifications can be translated into formal languages with hax.
Quick start with hax and F*
Do you want to try hax out on a Rust crate of yours? This chapter is what you are looking for!
Setup the tools
- Install the hax toolchain.
đŞ Runningcargo hax --version
should print some version info. - Install F* (optional: only if want to run F*)
Setup the crate you want to verify
Note: the instructions below assume you are in the folder of the specific crate (not workspace!) you want to extract.
Note: this part is useful only if you want to run F*.
- Create the folder
proofs/fstar/extraction
folder, right next to theCargo.toml
of the crate you want to verify.
đŞmkdir -p proofs/fstar/extraction
- Copy this makefile to
proofs/fstar/extraction/Makefile
.
đŞcurl -o proofs/fstar/extraction/Makefile https://gist.githubusercontent.com/W95Psp/4c304132a1f85c5af4e4959dd6b356c3/raw/Makefile
- Add
hax-lib
as a dependency to your crate, enabled only when using hax.
đŞcargo add --target 'cfg(hax)' --git https://github.com/hacspec/hax hax-lib
đŞ (hax-lib
is not mandatory, but this guide assumes it is present)
Partial extraction
Note: the instructions below assume you are in the folder of the specific crate you want to extract.
Run the command cargo hax into fstar
to extract every item of your
crate as F* modules in the subfolder proofs/fstar/extraction
.
What is critical? What is worth verifying?
Probably, your Rust crate contains mixed kinds of code: some parts are
critical (e.g. the library functions at the core of your crate) while
some others are not (e.g. the binary driver that wraps the
library). In this case, you likely want to extract only partially your
crate, so that you can focus on the important part.
Partial extraction.
If you want to extract a function
your_crate::some_module::my_function
, you need to tell hax
to
extract nothing but my_function
:
cargo hax into -i '-** +your_crate::some_module::my_function' fstar
Note this command will extract my_function
but also any item
(function, type, etc.) from your crate which is used directly or
indirectly by my_function
. If you don't want the dependency, use
+!
instead of +
in the -i
flag.
Unsupported Rust code.
hax doesn't support every Rust
constructs,
unsafe
code, or complicated mutation scheme. That is another reason
for extracting only a part of your crate. When running hax, if an item
of your crate, say a function my_crate::f
, is not handled by hax,
you can append -my_crate::f
to the -i
flag. You can learn more
about the -i
flag in the FAQ.
Start F* verification
After running the hax toolchain on your Rust code, you will end up
with various F* modules in the proofs/fstar/extraction
folder. The
Makefile
in proofs/fstar/extraction
will run F*.
- Lax check: the first step is to run
OTHERFLAGS="--lax" make
, which will run F* in "lax" mode. The lax mode just makes sure basic typechecking works: it is not proving anything. This first step is important because there might be missing libraries. If F* is not able to find a definition, it is probably alibcore
issue: you probably need to edit the F* library, which lives in theproofs-libs
directory in the root of the hax repo. - Typecheck: the second step is to run
make
. This will ask F* to typecheck fully your crate. This is very likely that you need to add preconditions and postconditions at this stage. Indeed, this second step is about panic freedom: if F* can typecheck your crate, it means your code never panics, which already is an important property.
To go further, please read the next chapter.
Tutorial
This tutorial is a guide for formally verifying properties about Rust programs using the hax toolchain. hax is a tool that translates Rust programs to various formal programming languages.
The formal programming languages we target are called backends. Some of them, e.g. F* or Coq, are general purpose formal programming languages. Others are specialized tools: ProVerif is dedicated to proving properties about protocols.
This tutorial focuses on proving properties with the F* programming language.
Panic freedom
Let's start with a simple example: a function that squares a u8
integer. To extract this function to F* using hax, we simply need to
run the command cargo hax into fstar
in the directory of the crate
in which the function square
is defined.
Note: throughout this tutorial, you can edit the snippets of code and extract to F* by clicking the play button ( ), or even typecheck it with the button ( ).
fn square(x: u8) -> u8 { x * x }
Though, if we try to verify this function, F* is complaining about a
subtyping issue: F* tells us that it is not able to prove that the
result of the multiplication x * x
fits the range of u8
. The
multiplication x * x
might indeed be overflowing!
For instance, running square(16)
panics: 16 * 16
is 256
, which
is just over 255
, the largest integer that fits u8
. Rust does not
ensure that functions are total: a function might panic at any
point, or might never terminate.
Rust and panicking code
Quoting the chapter To panic!
or Not to
panic!
from the Rust book:
The
panic!
macro signals that your program is in a state it can't handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.
A Rust program should panics only in a situation where an assumption or an invariant is broken: a panics models an invalid state. Formal verification is about proving such invalid state cannot occur, at all.
From this observation emerges the urge of proving Rust programs to be panic-free!
Fixing our squaring function
Let's come back to our example. There is an informal assumption to the multiplication operator in Rust: the inputs should be small enough so that the addition doesn't overflow.
Note that Rust also provides wrapping_mul
, a non-panicking variant
of the multiplication on u8
that wraps when the result is bigger
than 255
. Replacing the common multiplication with wrapping_mul
in
square
would fix the panic, but then, square(256)
returns zero.
Semantically, this is not what one would expect from square
.
Our problem is that our function square
is well-defined only when
its input is within 0
and 15
.
Solution A: reflect the partialness of the function in Rust
A first solution is to make square
return an Option<u8>
instead of a u8
:
fn square_option(x: u8) -> Option<u8> { if x >= 16 { None } else { Some(x * x) } }
Here, F* is able to prove panic-freedom: calling square
with any
input is safe. Though, one may argue that square
's input being small
enough should really be an assumption. Having to deal with the
possible integer overflowing whenever squaring is a huge burden. Can
we do better?
Solution B: add a precondition
The type system of Rust doesn't allow the programmer to formalize the
assumption that square
expects a small u8
. This becomes
possible using hax: one can annotate a function with a pre-condition
on its inputs.
The pre-conditions and post-conditions on a function form a contract: "if you give me some inputs that satisfies a given formula (the precondition), I will produce a return value that satisfy another formula (the postcondition)". Outside this contracts, anything might happen: the function might panic, might run forever, erase your disk, or anything.
The helper crate
hax-lib
provdes the requires
proc-macro
which lets user writting pre-conditions directly in Rust.
#[hax_lib::requires(x < 16)] fn square_requires(x: u8) -> u8 { x * x }
With this precondition, F* is able to prove panic freedom. From now
on, it is the responsibility of the clients of square
to respect the
contact. The next step is thus be to verify, through hax extraction,
that square
is used correctly at every call site.
Common panicking situations
Multiplication is not the only panicking function provided by the Rust library: most of the other integer arithmetic operation have such informal assumptions.
Another source of panics is indexing. Indexing in an array, a slice or a vector is a partial operation: the index might be out of range.
In the example folder of hax, you can find the chacha20
example
that makes use of pre-conditions to prove panic freedom.
Another solution for safe indexing is to use the newtype index pattern, which is also supported by hax. The data invariants chapter gives more details about this.
Proving properties
In the last chapter, we proved one property on the square
function:
panic freedom. After adding a precondition, the signature of the
square
function was x:u8 -> Pure u8 (requires x <. 16uy) (ensures fun _ -> True)
.
This contract stipulates that, given a small input, the function will
return a value: it will not panic or diverge. We could enrich the
contract of square
with a post-condition about the fact it is a
increasing function:
#[hax_lib::requires(x < 16)] #[hax_lib::ensures(|result| result >= x)] fn square_ensures(x: u8) -> u8 { x * x }
Such a simple post-condition is automatically proven by F*. The
properties of our square
function are not fascinating. Let's study a
more interesting example: Barrett reduction.
A concrete example of contract: Barrett reduction
While the correctness of square
is obvious, the Barrett reduction is
not.
Given value
a field element (a i32
whose absolute value is at most
BARRET_R
), the function barrett_reduce
defined below computes
result
such that:
result ⥠value (mod FIELD_MODULUS)
;- the absolute value of
result
is bound as follows:|result| < FIELD_MODULUS
.
It is easy to write this contract directly as hax::requires
and
hax::ensures
annotations, as shown in the snippet below.
type FieldElement = i32; const FIELD_MODULUS: i32 = 3329; const BARRETT_SHIFT: i64 = 26; const BARRETT_R: i64 = 0x4000000; // 2^26 const BARRETT_MULTIPLIER: i64 = 20159; // â(BARRETT_R / FIELD_MODULUS) + 1/2â #[hax_lib::requires((i64::from(value) >= -BARRETT_R && i64::from(value) <= BARRETT_R))] #[hax_lib::ensures(|result| result > -FIELD_MODULUS && result < FIELD_MODULUS && result % FIELD_MODULUS == value % FIELD_MODULUS)] fn barrett_reduce(value: i32) -> i32 { let t = i64::from(value) * BARRETT_MULTIPLIER; let t = t + (BARRETT_R >> 1); let quotient = t >> BARRETT_SHIFT; let quotient = quotient as i32; let sub = quotient * FIELD_MODULUS; // Here a lemma to prove that `(quotient * 3329) % 3329 = 0` // may have to be called in F*. value - sub }
Before returning, a lemma may have to be called in F* to prove the correctness
of the reduction.
The lemma is Math.Lemmas.cancel_mul_mod (v quotient) 3329;
, which establishes
that (quotient * 3329) % 3329
is zero.
With the help of that one lemma, F* is able to prove the
reduction computes the expected result.
(We may expose lemmas like this to call from Rust directly in future.)
This Barrett reduction examples is taken from libcrux's proof of Kyber which is using hax and F*.
This example showcases an intrinsic proof: the function
barrett_reduce
not only computes a value, but it also ship a proof
that the post-condition holds. The pre-condition and post-condition
gives the function a formal specification, which is useful both for
further formal verification and for documentation purposes.
Extrinsic properties with lemmas
Consider the encrypt
and decrypt
functions below. Those functions
have no precondition, don't have particularly interesting properties
individually. However, the composition of the two yields an useful
property: encrypting a ciphertext and decrypting the result with a
same key produces the ciphertext again. |c| decrypt(c, key)
is the
inverse of |p| encrypt(p, key)
.
fn encrypt(plaintext: u32, key: u32) -> u32 { plaintext ^ key } fn decrypt(ciphertext: u32, key: u32) -> u32 { ciphertext ^ key }
In this situation, adding a pre- or a post-condition to either
encrypt
or decrypt
is not useful: we want to state our inverse
property about both of them. Better, we want this property to be
stated directly in Rust: just as with pre and post-conditions, the
Rust souces should clearly state what is to be proven.
To this end, Hax provides a macro lemma
. Below, the Rust function
encrypt_decrypt_identity
takes a key and a plaintext, and then
states the inverse property. The body is empty: the details of the
proof itself are not relevant, at this stage, we only care about the
statement. The proof will be completed manually in the proof
assistant.
fn encrypt(plaintext: u32, key: u32) -> u32 { plaintext ^ key } fn decrypt(ciphertext: u32, key: u32) -> u32 { ciphertext ^ key } #[hax_lib::lemma] #[hax_lib::requires(true)] fn encrypt_decrypt_identity( key: u32, plaintext: u32, ) -> Proof<{ decrypt(encrypt(plaintext, key), key) == plaintext }> { }
Data invariants
In the two previous chapters we saw how to write specifications on functions, be it with pre and post-condition or with lemmas. In this chapter, we will see how to maintain invariants with precise types.
Making illegal states unpresentable
With the Barrett example, we were working on a certain field, whose
elements were represented as i32
integers. To simplify, let's
consider Fâ
, the finite field with 3 elements (say 0
, 1
and
2
). Every element of F3
can be represented as a i32
integers,
but the converse doesn't hold: the vast majority of i32
integers are
not in of Fâ
.
Representing Fâ
as i32
s, every time we define a function consuming
Fâ
elements, we face the risk to consume illegal elements. We are
thus back to chapter 4.1: we should panic on
illegal elements, and add hax pre-conditions on every single
function. That's not ideal: the property of being either 0
, 1
or
2
should be encoded directly on the type representing Fâ
elements.
enum
s to then rescue
Rust alone already can solve our representation issues with
enums! Below, we
define the enum
type F3
which has only three constructor: F3
represent exactly the elements of Fâ
, not more, not less.
enum F3 { E1, E2, E3, }
With F3
, there doesn't exist illegal values at all: we can now
define total
functions on Fâ
elements. We dropped altogether a source of panic!
Soon you want to work with a bigger finite field: say
Fââââ
. Representing this many q
different elements with an Rust
enum would be very painful... The enum
approach falls apart.
Newtype and refinements
Since we don't want an enum
with 2347 elements, we have to revert to
a type that can hold this many elements. The smallest integer type
large enough provided by Rust is u16
.
Let's define F
a
"newtype":
a struct with
one u16
field v
. Notice the refinment annotation on v
: the
extraction of this type F
via hax will result in a type enforcing
v
small enough.
pub const Q: u16 = 2347; #[hax_lib::attributes] pub struct F { #[hax_lib::refine(v < Q)] pub v: u16, }
In Rust, we can now define functions that operates on type F
,
assuming they are in bounds with respect to Fââââ
: every such
assumption will be checked and enforced by the proof assistant. As an
example, below is the implementation of the addition for type F
.
pub const Q: u16 = 2347; #[hax_lib::attributes] pub struct F { #[hax_lib::refine(v < Q)] pub v: u16, } use core::ops::Add; impl Add for F { type Output = Self; fn add(self, rhs: Self) -> Self { Self { v: (self.v + rhs.v) % Q, } } }
Here, F* is able to prove automatically that (1) the addition doesn't
overflow and (2) that the invariant of F
is preserved. The
definition of type F
in F* (named t_F
) very explicitly requires
the invariant as a refinement on v
.
Examples
This chapter contains various examples that demonstrates how hax can be used to prove properties about programs. Each example is self-contained. hax being a tool that can extract Rust to various backends, this section provides examples for each backend.
The first subsection takes some examples from Rust by Example, and shows how to prove properties on them.
The other sections present backend-specific examples.
Rust By Example
Using the F* backend
Using the ProVerif backend
Using the ProVerif backend
Troubleshooting/FAQ
This chapter captures a list of common questions or issues and how to resolve them. If you happen to run into an issue that is not documented here, please consider submitting a pull request!
Rust Item Extraction Using cargo hax
Overview
When extracting Rust items with hax, it is often necessary to include only a specific subset of items from a crate. The cargo hax into
subcommand provides the -i
flag to control which items are included or excluded, and how their dependencies are handled. This allows precise tailoring of the extraction process.
The -i
Flag
The -i
flag accepts a list of patterns with modifiers to define inclusion or exclusion rules for Rust items. Patterns are processed sequentially from left to right, determining which items are extracted.
Basic Concepts
- Patterns: Rust paths with support for
*
and**
globs.*
matches any single segment (e.g.,mycrate::*::myfn
).**
matches any subpath, including empty segments (e.g.,**::myfn
).
- Modifiers:
+
: Includes items and their dependencies (transitively).+~
: Includes items and their direct dependencies only.+!
: Includes only the item itself (no dependencies).+:
: Includes only the item's type signature (no body or dependencies).-
: Excludes items.
By default, all items are included, unless explicitly modified.
Practical Examples of the -i
Flag Usage
Consider the following crate (mycrate
) with the lib.rs
module:
#![allow(unused)] fn main() { fn interesting_function() { aux() } fn aux() { foo::f() } fn something_else() { /* ... */ } mod foo { fn f() { /* ... */ } fn g() { /* ... */ } fn h() { /* ... */ } fn interesting_function() { something() } fn something() { /* ... */ } mod bar { fn interesting_function() { /* ... */ } } } fn not_that_one() { not_that_one_dependency() } fn not_that_one_dependency() { /* ... */ } fn not_extracting_function(_: u8) -> u8 { unsafe { /* ... */ } 0 } }
1. Selectively Including Items with Dependencies
cargo hax into -i '-** +mycrate::**::interesting_function' <BACKEND>
- Explanation:
-**
: Excludes all items by default.+mycrate::**::interesting_function
: Includes all items matchingmycrate::**::interesting_function
and their dependencies.
- Extracted Items:
mycrate::interesting_function
(direct match).mycrate::foo::interesting_function
(direct match).mycrate::foo::bar::interesting_function
(direct match).mycrate::aux
(dependency ofmycrate::interesting_function
).mycrate::foo::f
(dependency ofmycrate::aux
).mycrate::foo::something
(dependency ofmycrate::foo::interesting_function
).
2. Excluding Specific Items
cargo hax into -i '+** -*::not_that_one' <BACKEND>
- Explanation:
+**
: Includes all items by default.-*::not_that_one
: Excludes any item namednot_that_one
, but keeps all other items, includingnot_that_one_dependency
.
- Extracted Items: All except
mycrate::not_that_one
.
3. Including Items Without Dependencies
cargo hax into -i '-** +!mycrate::interesting_function' <BACKEND>
- Explanation:
-**
: Excludes all items by default.+!mycrate::interesting_function
: Includes onlymycrate::interesting_function
, without dependencies.
- Extracted Items: Only
mycrate::interesting_function
.
4. Including Items with Direct Dependencies Only
cargo hax into -i '-** +~mycrate::interesting_function' <BACKEND>
- Explanation:
-**
: Excludes all items by default.+~mycrate::interesting_function
: Includesmycrate::interesting_function
and its direct dependencies (but not their transitive dependencies).
- Extracted Items:
mycrate::interesting_function
.mycrate::aux
(direct dependency).
- Excluded Items:
mycrate::foo::f
(transitive dependency ofmycrate::aux
).
5. Including Items in Signature-Only Mode
cargo hax into -i '+:mycrate::not_extracting_function' <BACKEND>
- Explanation:
+:mycrate::not_extracting_function
: Includes only the type signature ofmycrate::not_extracting_function
(e.g., as an assumed or axiomatized symbol).
- Extracted Items:
- The type signature of
mycrate::not_extracting_function
, without its body or dependencies.
- The type signature of
Summary
The -i
flag offers powerful control over extraction, allowing fine-grained inclusion and exclusion of items with various dependency handling strategies. Use it to:
- Extract specific items and their dependencies (
+
or+~
). - Exclude certain items (
-
). - Include items without dependencies (
+!
). - Extract type signatures only (
+:
).
For complex crates, this flexibility ensures only the necessary parts are extracted, optimizing analysis or transformation workflows.
Contributing
This chapter contains information about internals of hax.
Please read the CONTRIBUTING.md
before opening a pull request.
Architecture
Hax is a software pipeline designed to transform Rust code into various formal verification backends such as F*, Coq, ProVerif, and EasyCrypt. It comprises two main components:
- The Frontend (written in Rust)
- The Engine (written in OCaml)
The frontend hooks into the Rust compiler, producing a abstract syntax tree for a given crate. The engine then takes this AST in input, applies various transformation, to reach in the end the language of the backend: F*, Coq...
The Frontend (Rust)
The frontend is responsible for extracting and exporting Rust code's abstract syntax trees (ASTs) in a format suitable for processing by the engine (or by other tools).
hax-frontend-exporter
Library
This library mirrors the internal types of the Rust compiler (rustc
) that constitute the HIR (High-Level Intermediate Representation), THIR (Typed High-Level Intermediate Representation), and MIR (Mid-Level Intermediate Representation) ASTs. It extends them with additional information such as attributes, trait implementations, and removes ID indirections.
SInto
Trait: The library defines an entry point for translating a given rustc
value to its mirrored hax version using the SInto
trait (stateful into
). For a value x
of type T
from rustc
, if T
is mirrored by hax, then x.sinto(s)
produces an augmented and simplified "hax-ified" AST for x
. Here, s
represents the state holding information about the translation process.
hax-driver
Binary
hax-driver
is a custom Rust compiler driver that behaves like rustc
but performs additional tasks:
- Item Enumeration: Lists all items in a crate.
- AST Transformation: Applies
sinto
on each item to generate the hax-ified AST. - Output Generation: Outputs the mirrored items into a
haxmeta
file within thetarget
directory.
cargo-hax
Binary
cargo-hax
provides a hax
subcommand for Cargo, accessible via cargo hax --help
. It serves as the command-line interface for hax, orchestrating both the frontend and the engine.
Workflow:
- Custom Build Execution: Runs
cargo build
, instructing Cargo to usehax-driver
instead ofrustc
. - Multiple Compiler Invocations:
cargo build
invokeshax-driver
multiple times with various options. - Inter-Process Communication:
hax-driver
communicates withcargo-hax
viastderr
using JSON lines. - Metadata Generation: Produces
haxmeta
files containing the transformed ASTs. - Engine Invocation (Optional): If requested, runs the engine, passing options and
haxmeta
information viastdin
serialized as JSON. - Interactive Communication: Engages in interactive communication with the engine.
- User Reporting: Outputs results and diagnostics to the user.
The Engine (OCaml - documentation)
The engine processes the transformed ASTs and options provided via JSON input from stdin
. It performs several key functions to convert the hax-ified Rust code into the target backend language.
Importing and Simplifying ASTs
- AST Importation: Imports the hax-ified Rust THIR AST. This is module
Import_thir
. - Internal AST Conversion: Converts the imported AST into a simplified and opinionated internal AST designed for ease of transformation and analysis. This is mostly the functor
Ast.Make
.
Internal AST and Features
The internal AST is defined using a functor that takes a list of type-level booleans, referred to as features, and produces the AST types accordingly.
Features are for instances, mutation, loops, unsafe code. The enumeration Features.Enumeration
lists all those features.
Feature Witnesses: On relevant AST nodes, feature witnesses are included to enforce constraints at the type level. For example, in the loop
expression constructor, a witness of type F.loop
is used, where F
represents the current feature set. If F.loop
is an empty type, constructing a loop
expression is prohibited, ensuring that loops are disallowed in contexts where they are not supported.
Transformation Phases
The engine executes a sequence of phases, which are determined based on the target backend. Each phase:
- Input: Takes a list of items from an AST with specific feature constraints.
- Output: Transforms these items into a new AST type, potentially enabling or disabling features through type-level changes.
The phases can be found in the Phases
module.
Backend Code Generation
After completing the transformation phases:
- Backend Printer Invocation: Calls the printer associated with the selected backend to generate the target code.
- File Map Creation: Produces a map from file names to their contents, representing the generated code.
- Output Serialization: Outputs the file map and additional information (e.g., errors) as JSON to
stderr
.
Communication Protocol
The engine communicates asynchronously with the frontend using a protocol defined in hax_types::engine_api::protocol
. This communication includes:
- Diagnostic Data: Sending error messages, warnings, and other diagnostics.
- Profiling Information: Providing performance metrics and profiling data.
- Pretty-Printing Requests: Requesting formatted versions of Rust source code or diagnostics for better readability.
Libraries
Macros and attributes
The hax engine understands only one attribute: #[_hax::json(PAYLOAD)]
,
where PAYLOAD
is a JSON serialization of the Rust enum
hax_lib_macros_types::AttrPayload
.
Note #[_hax::json(PAYLOAD)]
is a tool
attribute: an
attribute that is never expanded.
In the engine, the OCaml module Attr_payloads
offers an API to query
attributes easily. The types in crate hax_lib_macros_types
and
corresponding serializers/deserializers are automatically generated in
OCaml, thus there is no manual parsing involved.
User experience
Asking the user to type #[_hax::json(some_long_json)]
is not very
friendly. Thus, the crate hax-lib-macros
defines a bunch of proc
macros
that defines nice and simple-to-use macros. Those macro take care of
cooking some hax_lib_macros_types::AttrPayload
payload(s), then
serialize those payloads to JSON and produce one or more
#[_hax::json(serialized_payload)]
attributes.
Warning: this book is currently being rewritten!
hax is the successor of hacspec. You can find the previous book describing hacspec here, but keep in mind most of the information there is outdated.