Introduction

hax is a tool for high assurance translations that translates a large subset of Rust into formal languages such as F* or Coq. This extends the scope of the hacspec project, which was previously a DSL embedded in Rust, to a usable tool for verifying Rust programs.

So what is hacspec now?

hacspec is the functional subset of Rust that can be used, together with a hacspec standard library, to write succinct, executable, and verifiable specifications in Rust. These specifications can be translated into formal languages with hax.

Quick start with hax and F*

Do you want to try hax out on a Rust crate of yours? This chapter is what you are looking for!

Setup the tools

Setup the crate you want to verify

Note: the instructions below assume you are in the folder of the specific crate (not workspace!) you want to extract.

Note: this part is useful only if you want to run F*.

  • Create the folder proofs/fstar/extraction folder, right next to the Cargo.toml of the crate you want to verify.
    🪄 mkdir -p proofs/fstar/extraction
  • Copy this makefile to proofs/fstar/extraction/Makefile.
    🪄 curl -o proofs/fstar/extraction/Makefile https://gist.githubusercontent.com/W95Psp/4c304132a1f85c5af4e4959dd6b356c3/raw/Makefile
  • Add hax-lib as a dependency to your crate, enabled only when using hax.
    🪄 cargo add --target 'cfg(hax)' --git https://github.com/hacspec/hax hax-lib
    🪄 (hax-lib is not mandatory, but this guide assumes it is present)

Partial extraction

Note: the instructions below assume you are in the folder of the specific crate you want to extract.

Run the command cargo hax into fstar to extract every item of your crate as F* modules in the subfolder proofs/fstar/extraction.

What is critical? What is worth verifying?
Probably, your Rust crate contains mixed kinds of code: some parts are critical (e.g. the library functions at the core of your crate) while some others are not (e.g. the binary driver that wraps the library). In this case, you likely want to extract only partially your crate, so that you can focus on the important part.

Partial extraction.
If you want to extract a function your_crate::some_module::my_function, you need to tell hax to extract nothing but my_function:

cargo hax into -i '-** +your_crate::some_module::my_function' fstar

Note this command will extract my_function but also any item (function, type, etc.) from your crate which is used directly or indirectly by my_function. If you don't want the dependency, use +! instead of + in the -i flag.

Unsupported Rust code.
hax doesn't support every Rust constructs, unsafe code, or complicated mutation scheme. That is another reason for extracting only a part of your crate. When running hax, if an item of your crate, say a function my_crate::f, is not handled by hax, you can append -my_crate::f to the -i flag. You can learn more about the -i flag in the FAQ.

Start F* verification

After running the hax toolchain on your Rust code, you will end up with various F* modules in the proofs/fstar/extraction folder. The Makefile in proofs/fstar/extraction will run F*.

  1. Lax check: the first step is to run OTHERFLAGS="--lax" make, which will run F* in "lax" mode. The lax mode just makes sure basic typechecking works: it is not proving anything. This first step is important because there might be missing libraries. If F* is not able to find a definition, it is probably a libcore issue: you probably need to edit the F* library, which lives in the proofs-libs directory in the root of the hax repo.
  2. Typecheck: the second step is to run make. This will ask F* to typecheck fully your crate. This is very likely that you need to add preconditions and postconditions at this stage. Indeed, this second step is about panic freedom: if F* can typecheck your crate, it means your code never panics, which already is an important property.

To go further, please read the next chapter.

Tutorial

This tutorial is a guide for formally verifying properties about Rust programs using the hax toolchain. hax is a tool that translates Rust programs to various formal programming languages.

The formal programming languages we target are called backends. Some of them, e.g. F* or Coq, are general purpose formal programming languages. Others are specialized tools: ProVerif is dedicated to proving properties about protocols.

This tutorial focuses on proving properties with the F* programming language.

Panic freedom

Let's start with a simple example: a function that squares a u8 integer. To extract this function to F* using hax, we simply need to run the command cargo hax into fstar in the directory of the crate in which the function square is defined.

Note: throughout this tutorial, you can edit the snippets of code and extract to F* by clicking the play button ( ), or even typecheck it with the button ( ).

fn square(x: u8) -> u8 {
    x * x
}

Though, if we try to verify this function, F* is complaining about a subtyping issue: F* tells us that it is not able to prove that the result of the multiplication x * x fits the range of u8. The multiplication x * x might indeed be overflowing!

For instance, running square(16) panics: 16 * 16 is 256, which is just over 255, the largest integer that fits u8. Rust does not ensure that functions are total: a function might panic at any point, or might never terminate.

Rust and panicking code

Quoting the chapter To panic! or Not to panic! from the Rust book:

The panic! macro signals that your program is in a state it can't handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.

A Rust program should panics only in a situation where an assumption or an invariant is broken: a panics models an invalid state. Formal verification is about proving such invalid state cannot occur, at all.

From this observation emerges the urge of proving Rust programs to be panic-free!

Fixing our squaring function

Let's come back to our example. There is an informal assumption to the multiplication operator in Rust: the inputs should be small enough so that the addition doesn't overflow.

Note that Rust also provides wrapping_mul, a non-panicking variant of the multiplication on u8 that wraps when the result is bigger than 255. Replacing the common multiplication with wrapping_mul in square would fix the panic, but then, square(256) returns zero. Semantically, this is not what one would expect from square.

Our problem is that our function square is well-defined only when its input is within 0 and 15.

Solution A: reflect the partialness of the function in Rust

A first solution is to make square return an Option<u8> instead of a u8:

fn square_option(x: u8) -> Option<u8> {
    if x >= 16 {
        None
    } else {
        Some(x * x)
    }
}

Here, F* is able to prove panic-freedom: calling square with any input is safe. Though, one may argue that square's input being small enough should really be an assumption. Having to deal with the possible integer overflowing whenever squaring is a huge burden. Can we do better?

Solution B: add a precondition

The type system of Rust doesn't allow the programmer to formalize the assumption that square expects a small u8. This becomes possible using hax: one can annotate a function with a pre-condition on its inputs.

The pre-conditions and post-conditions on a function form a contract: "if you give me some inputs that satisfies a given formula (the precondition), I will produce a return value that satisfy another formula (the postcondition)". Outside this contracts, anything might happen: the function might panic, might run forever, erase your disk, or anything.

The helper crate hax-lib provdes the requires proc-macro which lets user writting pre-conditions directly in Rust.

#[hax_lib::requires(x < 16)]
fn square_requires(x: u8) -> u8 {
    x * x
}

With this precondition, F* is able to prove panic freedom. From now on, it is the responsibility of the clients of square to respect the contact. The next step is thus be to verify, through hax extraction, that square is used correctly at every call site.

Common panicking situations

Multiplication is not the only panicking function provided by the Rust library: most of the other integer arithmetic operation have such informal assumptions.

Another source of panics is indexing. Indexing in an array, a slice or a vector is a partial operation: the index might be out of range.

In the example folder of hax, you can find the chacha20 example that makes use of pre-conditions to prove panic freedom.

Another solution for safe indexing is to use the newtype index pattern, which is also supported by hax. The data invariants chapter gives more details about this.

Proving properties

In the last chapter, we proved one property on the square function: panic freedom. After adding a precondition, the signature of the square function was x:u8 -> Pure u8 (requires x <. 16uy) (ensures fun _ -> True).

This contract stipulates that, given a small input, the function will return a value: it will not panic or diverge. We could enrich the contract of square with a post-condition about the fact it is a increasing function:

#[hax_lib::requires(x < 16)]
#[hax_lib::ensures(|result| result >= x)]
fn square_ensures(x: u8) -> u8 {
    x * x
}

Such a simple post-condition is automatically proven by F*. The properties of our square function are not fascinating. Let's study a more interesting example: Barrett reduction.

A concrete example of contract: Barrett reduction

While the correctness of square is obvious, the Barrett reduction is not.

Given value a field element (a i32 whose absolute value is at most BARRET_R), the function barrett_reduce defined below computes result such that:

  • result ≡ value (mod FIELD_MODULUS);
  • the absolute value of result is bound as follows: |result| < FIELD_MODULUS.

It is easy to write this contract directly as hax::requires and hax::ensures annotations, as shown in the snippet below.

type FieldElement = i32;
const FIELD_MODULUS: i32 = 3329;
const BARRETT_SHIFT: i64 = 26;
const BARRETT_R: i64 = 0x4000000; // 2^26
const BARRETT_MULTIPLIER: i64 = 20159; // ⌊(BARRETT_R / FIELD_MODULUS) + 1/2⌋

#[hax_lib::requires((i64::from(value) >= -BARRETT_R && i64::from(value) <= BARRETT_R))]
#[hax_lib::ensures(|result| result > -FIELD_MODULUS && result < FIELD_MODULUS
                     && result %  FIELD_MODULUS ==  value % FIELD_MODULUS)]
fn barrett_reduce(value: i32) -> i32 {
    let t = i64::from(value) * BARRETT_MULTIPLIER;
    let t = t + (BARRETT_R >> 1);

    let quotient = t >> BARRETT_SHIFT;
    let quotient = quotient as i32;

    let sub = quotient * FIELD_MODULUS;

    // Here a lemma to prove that `(quotient * 3329) % 3329 = 0`
    // may have to be called in F*.

    value - sub
}

Before returning, a lemma may have to be called in F* to prove the correctness of the reduction. The lemma is Math.Lemmas.cancel_mul_mod (v quotient) 3329;, which establishes that (quotient * 3329) % 3329 is zero. With the help of that one lemma, F* is able to prove the reduction computes the expected result. (We may expose lemmas like this to call from Rust directly in future.)

This Barrett reduction examples is taken from libcrux's proof of Kyber which is using hax and F*.

This example showcases an intrinsic proof: the function barrett_reduce not only computes a value, but it also ship a proof that the post-condition holds. The pre-condition and post-condition gives the function a formal specification, which is useful both for further formal verification and for documentation purposes.

Extrinsic properties with lemmas

Consider the encrypt and decrypt functions below. Those functions have no precondition, don't have particularly interesting properties individually. However, the composition of the two yields an useful property: encrypting a ciphertext and decrypting the result with a same key produces the ciphertext again. |c| decrypt(c, key) is the inverse of |p| encrypt(p, key).

fn encrypt(plaintext: u32, key: u32) -> u32 {
    plaintext ^ key
}

fn decrypt(ciphertext: u32, key: u32) -> u32 {
    ciphertext ^ key
}

In this situation, adding a pre- or a post-condition to either encrypt or decrypt is not useful: we want to state our inverse property about both of them. Better, we want this property to be stated directly in Rust: just as with pre and post-conditions, the Rust souces should clearly state what is to be proven.

To this end, Hax provides a macro lemma. Below, the Rust function encrypt_decrypt_identity takes a key and a plaintext, and then states the inverse property. The body is empty: the details of the proof itself are not relevant, at this stage, we only care about the statement. The proof will be completed manually in the proof assistant.

fn encrypt(plaintext: u32, key: u32) -> u32 {
    plaintext ^ key
}

fn decrypt(ciphertext: u32, key: u32) -> u32 {
    ciphertext ^ key
}

#[hax_lib::lemma]
#[hax_lib::requires(true)]
fn encrypt_decrypt_identity(
    key: u32,
    plaintext: u32,
) -> Proof<{ decrypt(encrypt(plaintext, key), key) == plaintext }> {
}

Data invariants

In the two previous chapters we saw how to write specifications on functions, be it with pre and post-condition or with lemmas. In this chapter, we will see how to maintain invariants with precise types.

Making illegal states unpresentable

With the Barrett example, we were working on a certain field, whose elements were represented as i32 integers. To simplify, let's consider F₃, the finite field with 3 elements (say 0, 1 and 2). Every element of F3 can be represented as a i32 integers, but the converse doesn't hold: the vast majority of i32 integers are not in of F₃.

Representing F₃ as i32s, every time we define a function consuming F₃ elements, we face the risk to consume illegal elements. We are thus back to chapter 4.1: we should panic on illegal elements, and add hax pre-conditions on every single function. That's not ideal: the property of being either 0, 1 or 2 should be encoded directly on the type representing F₃ elements.

enums to then rescue

Rust alone already can solve our representation issues with enums! Below, we define the enum type F3 which has only three constructor: F3 represent exactly the elements of F₃, not more, not less.

enum F3 {
    E1,
    E2,
    E3,
}

With F3, there doesn't exist illegal values at all: we can now define total functions on F₃ elements. We dropped altogether a source of panic!

Soon you want to work with a bigger finite field: say F₂₃₄₇. Representing this many q different elements with an Rust enum would be very painful... The enum approach falls apart.

Newtype and refinements

Since we don't want an enum with 2347 elements, we have to revert to a type that can hold this many elements. The smallest integer type large enough provided by Rust is u16.

Let's define F a "newtype": a struct with one u16 field v. Notice the refinment annotation on v: the extraction of this type F via hax will result in a type enforcing v small enough.

pub const Q: u16 = 2347;

#[hax_lib::attributes]
pub struct F {
    #[hax_lib::refine(v < Q)]
    pub v: u16,
}

In Rust, we can now define functions that operates on type F, assuming they are in bounds with respect to F₂₃₄₇: every such assumption will be checked and enforced by the proof assistant. As an example, below is the implementation of the addition for type F.

pub const Q: u16 = 2347;

#[hax_lib::attributes]
pub struct F {
    #[hax_lib::refine(v < Q)]
    pub v: u16,
}

use core::ops::Add;

impl Add for F {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Self {
            v: (self.v + rhs.v) % Q,
        }
    }
}

Here, F* is able to prove automatically that (1) the addition doesn't overflow and (2) that the invariant of F is preserved. The definition of type F in F* (named t_F) very explicitly requires the invariant as a refinement on v.

Examples

This chapter contains various examples that demonstrates how hax can be used to prove properties about programs. Each example is self-contained. hax being a tool that can extract Rust to various backends, this section provides examples for each backend.

The first subsection takes some examples from Rust by Example, and shows how to prove properties on them.

The other sections present backend-specific examples.

Rust By Example

Using the F* backend

Using the ProVerif backend

Using the ProVerif backend

Troubleshooting/FAQ

This chapter captures a list of common questions or issues and how to resolve them. If you happen to run into an issue that is not documented here, please consider submitting a pull request!

Rust Item Extraction Using cargo hax

Overview

When extracting Rust items with hax, it is often necessary to include only a specific subset of items from a crate. The cargo hax into subcommand provides the -i flag to control which items are included or excluded, and how their dependencies are handled. This allows precise tailoring of the extraction process.

The -i Flag

The -i flag accepts a list of patterns with modifiers to define inclusion or exclusion rules for Rust items. Patterns are processed sequentially from left to right, determining which items are extracted.

Basic Concepts

  • Patterns: Rust paths with support for * and ** globs.
    • * matches any single segment (e.g., mycrate::*::myfn).
    • ** matches any subpath, including empty segments (e.g., **::myfn).
  • Modifiers:
    • +: Includes items and their dependencies (transitively).
    • +~: Includes items and their direct dependencies only.
    • +!: Includes only the item itself (no dependencies).
    • +:: Includes only the item's type signature (no body or dependencies).
    • -: Excludes items.

By default, all items are included, unless explicitly modified.

Practical Examples of the -i Flag Usage

Consider the following crate (mycrate) with the lib.rs module:

#![allow(unused)]
fn main() {
fn interesting_function() { aux() }
fn aux() { foo::f() }
fn something_else() { /* ... */ }

mod foo {
    fn f() { /* ... */ }
    fn g() { /* ... */ }
    fn h() { /* ... */ }
    fn interesting_function() { something() }
    fn something() { /* ... */ }

    mod bar {
        fn interesting_function() { /* ... */ }
    }
}

fn not_that_one() { not_that_one_dependency() }
fn not_that_one_dependency() { /* ... */ }

fn not_extracting_function(_: u8) -> u8 {
    unsafe { /* ... */ }
    0
}
}

1. Selectively Including Items with Dependencies

cargo hax into -i '-** +mycrate::**::interesting_function' <BACKEND>
  • Explanation:
    • -**: Excludes all items by default.
    • +mycrate::**::interesting_function: Includes all items matching mycrate::**::interesting_function and their dependencies.
  • Extracted Items:
    1. mycrate::interesting_function (direct match).
    2. mycrate::foo::interesting_function (direct match).
    3. mycrate::foo::bar::interesting_function (direct match).
    4. mycrate::aux (dependency of mycrate::interesting_function).
    5. mycrate::foo::f (dependency of mycrate::aux).
    6. mycrate::foo::something (dependency of mycrate::foo::interesting_function).

2. Excluding Specific Items

cargo hax into -i '+** -*::not_that_one' <BACKEND>
  • Explanation:
    • +**: Includes all items by default.
    • -*::not_that_one: Excludes any item named not_that_one, but keeps all other items, including not_that_one_dependency.
  • Extracted Items: All except mycrate::not_that_one.

3. Including Items Without Dependencies

cargo hax into -i '-** +!mycrate::interesting_function' <BACKEND>
  • Explanation:
    • -**: Excludes all items by default.
    • +!mycrate::interesting_function: Includes only mycrate::interesting_function, without dependencies.
  • Extracted Items: Only mycrate::interesting_function.

4. Including Items with Direct Dependencies Only

cargo hax into -i '-** +~mycrate::interesting_function' <BACKEND>
  • Explanation:
    • -**: Excludes all items by default.
    • +~mycrate::interesting_function: Includes mycrate::interesting_function and its direct dependencies (but not their transitive dependencies).
  • Extracted Items:
    1. mycrate::interesting_function.
    2. mycrate::aux (direct dependency).
  • Excluded Items:
    • mycrate::foo::f (transitive dependency of mycrate::aux).

5. Including Items in Signature-Only Mode

cargo hax into -i '+:mycrate::not_extracting_function' <BACKEND>
  • Explanation:
    • +:mycrate::not_extracting_function: Includes only the type signature of mycrate::not_extracting_function (e.g., as an assumed or axiomatized symbol).
  • Extracted Items:
    • The type signature of mycrate::not_extracting_function, without its body or dependencies.

Summary

The -i flag offers powerful control over extraction, allowing fine-grained inclusion and exclusion of items with various dependency handling strategies. Use it to:

  • Extract specific items and their dependencies (+ or +~).
  • Exclude certain items (-).
  • Include items without dependencies (+!).
  • Extract type signatures only (+:).

For complex crates, this flexibility ensures only the necessary parts are extracted, optimizing analysis or transformation workflows.

Contributing

This chapter contains information about internals of hax.

Please read the CONTRIBUTING.md before opening a pull request.

Architecture

Hax is a software pipeline designed to transform Rust code into various formal verification backends such as F*, Coq, ProVerif, and EasyCrypt. It comprises two main components:

  1. The Frontend (written in Rust)
  2. The Engine (written in OCaml)

The frontend hooks into the Rust compiler, producing a abstract syntax tree for a given crate. The engine then takes this AST in input, applies various transformation, to reach in the end the language of the backend: F*, Coq...

The Frontend (Rust)

The frontend is responsible for extracting and exporting Rust code's abstract syntax trees (ASTs) in a format suitable for processing by the engine (or by other tools).

hax-frontend-exporter Library

This library mirrors the internal types of the Rust compiler (rustc) that constitute the HIR (High-Level Intermediate Representation), THIR (Typed High-Level Intermediate Representation), and MIR (Mid-Level Intermediate Representation) ASTs. It extends them with additional information such as attributes, trait implementations, and removes ID indirections.

SInto Trait: The library defines an entry point for translating a given rustc value to its mirrored hax version using the SInto trait (stateful into). For a value x of type T from rustc, if T is mirrored by hax, then x.sinto(s) produces an augmented and simplified "hax-ified" AST for x. Here, s represents the state holding information about the translation process.

hax-driver Binary

hax-driver is a custom Rust compiler driver that behaves like rustc but performs additional tasks:

  1. Item Enumeration: Lists all items in a crate.
  2. AST Transformation: Applies sinto on each item to generate the hax-ified AST.
  3. Output Generation: Outputs the mirrored items into a haxmeta file within the target directory.

cargo-hax Binary

cargo-hax provides a hax subcommand for Cargo, accessible via cargo hax --help. It serves as the command-line interface for hax, orchestrating both the frontend and the engine.

Workflow:

  1. Custom Build Execution: Runs cargo build, instructing Cargo to use hax-driver instead of rustc.
  2. Multiple Compiler Invocations: cargo build invokes hax-driver multiple times with various options.
  3. Inter-Process Communication: hax-driver communicates with cargo-hax via stderr using JSON lines.
  4. Metadata Generation: Produces haxmeta files containing the transformed ASTs.
  5. Engine Invocation (Optional): If requested, runs the engine, passing options and haxmeta information via stdin serialized as JSON.
  6. Interactive Communication: Engages in interactive communication with the engine.
  7. User Reporting: Outputs results and diagnostics to the user.

The Engine (OCaml - documentation)

The engine processes the transformed ASTs and options provided via JSON input from stdin. It performs several key functions to convert the hax-ified Rust code into the target backend language.

Importing and Simplifying ASTs

  • AST Importation: Imports the hax-ified Rust THIR AST. This is module Import_thir.
  • Internal AST Conversion: Converts the imported AST into a simplified and opinionated internal AST designed for ease of transformation and analysis. This is mostly the functor Ast.Make.

Internal AST and Features

The internal AST is defined using a functor that takes a list of type-level booleans, referred to as features, and produces the AST types accordingly.

Features are for instances, mutation, loops, unsafe code. The enumeration Features.Enumeration lists all those features.

Feature Witnesses: On relevant AST nodes, feature witnesses are included to enforce constraints at the type level. For example, in the loop expression constructor, a witness of type F.loop is used, where F represents the current feature set. If F.loop is an empty type, constructing a loop expression is prohibited, ensuring that loops are disallowed in contexts where they are not supported.

Transformation Phases

The engine executes a sequence of phases, which are determined based on the target backend. Each phase:

  1. Input: Takes a list of items from an AST with specific feature constraints.
  2. Output: Transforms these items into a new AST type, potentially enabling or disabling features through type-level changes.

The phases can be found in the Phases module.

Backend Code Generation

After completing the transformation phases:

  1. Backend Printer Invocation: Calls the printer associated with the selected backend to generate the target code.
  2. File Map Creation: Produces a map from file names to their contents, representing the generated code.
  3. Output Serialization: Outputs the file map and additional information (e.g., errors) as JSON to stderr.

Communication Protocol

The engine communicates asynchronously with the frontend using a protocol defined in hax_types::engine_api::protocol. This communication includes:

  • Diagnostic Data: Sending error messages, warnings, and other diagnostics.
  • Profiling Information: Providing performance metrics and profiling data.
  • Pretty-Printing Requests: Requesting formatted versions of Rust source code or diagnostics for better readability.

Libraries

Macros and attributes

The hax engine understands only one attribute: #[_hax::json(PAYLOAD)], where PAYLOAD is a JSON serialization of the Rust enum hax_lib_macros_types::AttrPayload.

Note #[_hax::json(PAYLOAD)] is a tool attribute: an attribute that is never expanded.

In the engine, the OCaml module Attr_payloads offers an API to query attributes easily. The types in crate hax_lib_macros_types and corresponding serializers/deserializers are automatically generated in OCaml, thus there is no manual parsing involved.

User experience

Asking the user to type #[_hax::json(some_long_json)] is not very friendly. Thus, the crate hax-lib-macros defines a bunch of proc macros that defines nice and simple-to-use macros. Those macro take care of cooking some hax_lib_macros_types::AttrPayload payload(s), then serialize those payloads to JSON and produce one or more #[_hax::json(serialized_payload)] attributes.

Warning: this book is currently being rewritten!

hax is the successor of hacspec. You can find the previous book describing hacspec here, but keep in mind most of the information there is outdated.