Introduction

hax is a tool for high assurance translations that translates a large subset of Rust into formal languages such as F* or Coq. This extends the scope of the hacspec project, which was previously a DSL embedded in Rust, to a usable tool for verifying Rust programs.

So what is hacspec now?

hacspec is the functional subset of Rust that can be used, together with a hacspec standard library, to write succinct, executable, and verifiable specifications in Rust. These specifications can be translated into formal languages with hax.

Quick start with hax and F*

Do you want to try hax out on a Rust crate of yours? This chapter is what you are looking for!

Setup the tools

Setup the crate you want to verify

Note: the instructions below assume you are in the folder of the specific crate (not workspace!) you want to extract.

Note: this part is useful only if you want to run F*.

  • Create the folder proofs/fstar/extraction folder, right next to the Cargo.toml of the crate you want to verify.
    🪄 mkdir -p proofs/fstar/extraction
  • Copy this makefile to proofs/fstar/extraction/Makefile.
    🪄 curl -o proofs/fstar/extraction/Makefile https://gist.githubusercontent.com/W95Psp/4c304132a1f85c5af4e4959dd6b356c3/raw/Makefile
  • Add hax-lib as a dependency to your crate.
    🪄 cargo add --git https://github.com/hacspec/hax hax-lib
    🪄 (hax-lib is not mandatory, but this guide assumes it is present)

Partial extraction

Note: the instructions below assume you are in the folder of the specific crate you want to extract.

Run the command cargo hax into fstar to extract every item of your crate as F* modules in the subfolder proofs/fstar/extraction.

What is critical? What is worth verifying?
Probably, your Rust crate contains mixed kinds of code: some parts are critical (e.g. the library functions at the core of your crate) while some others are not (e.g. the binary driver that wraps the library). In this case, you likely want to extract only partially your crate, so that you can focus on the important part.

Partial extraction.
If you want to extract a function your_crate::some_module::my_function, you need to tell hax to extract nothing but my_function:

cargo hax into -i '-** +your_crate::some_module::my_function' fstar

Note this command will extract my_function but also any item (function, type, etc.) from your crate which is used directly or indirectly by my_function. If you don't want the dependency, use +! instead of + in the -i flag.

Unsupported Rust code.
hax doesn't support every Rust constructs, unsafe code, or complicated mutation scheme. That is another reason for extracting only a part of your crate. When running hax, if an item of your crate, say a function my_crate::f, is not handled by hax, you can append -my_crate::f to the -i flag. You can learn more about the -i flag in the FAQ.

Start F* verification

After running the hax toolchain on your Rust code, you will end up with various F* modules in the proofs/fstar/extraction folder. The Makefile in proofs/fstar/extraction will run F*.

  1. Lax check: the first step is to run OTHERFLAGS="--lax" make, which will run F* in "lax" mode. The lax mode just makes sure basic typechecking works: it is not proving anything. This first step is important because there might be missing libraries. If F* is not able to find a definition, it is probably a libcore issue: you probably need to edit the F* library, which lives in the proofs-libs directory in the root of the hax repo.
  2. Typecheck: the second step is to run make. This will ask F* to typecheck fully your crate. This is very likely that you need to add preconditions and postconditions at this stage. Indeed, this second step is about panic freedom: if F* can typecheck your crate, it means your code never panics, which already is an important property.

To go further, please read the next chapter.

Tutorial

This tutorial is a guide for formally verifying properties about Rust programs using the hax toolchain. hax is a tool that translates Rust programs to various formal programming languages.

The formal programming languages we target are called backends. Some of them, e.g. F* or Coq, are general purpose formal programming languages. Others are specialized tools: ProVerif is dedicated to proving properties about protocols.

This tutorial focuses on proving properties with the F* programming language.

Panic freedom

Let's start with a simple example: a function that squares a u8 integer. To extract this function to F* using hax, we simply need to run the command cargo hax into fstar in the directory of the crate in which the function square is defined.

Note: throughout this tutorial, you can inspect the hax extraction to F* for each code Rust snippets, by clicking on the "F* extraction" tab.

fn square(x: u8) -> u8 {
    x * x
}
let square (x: u8) : u8 = x *! x

Though, if we try to verify this function, F* is complaining about a subtyping issue: F* tells us that it is not able to prove that the result of the multiplication x * x fits the range of u8. The multiplication x * x might indeed be overflowing!

For instance, running square(16) panics: 16 * 16 is 256, which is just over 255, the largest integer that fits u8. Rust does not ensure that functions are total: a function might panic at any point, or might never terminate.

Rust and panicking code

Quoting the chapter To panic! or Not to panic! from the Rust book:

The panic! macro signals that your program is in a state it can't handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.

A Rust program should panics only in a situation where an assumption or an invariant is broken: a panics models an invalid state. Formal verification is about proving such invalid state cannot occur, at all.

From this observation emerges the urge of proving Rust programs to be panic-free!

Fixing our squaring function

Let's come back to our example. There is an informal assumption to the multiplication operator in Rust: the inputs should be small enough so that the addition doesn't overflow.

Note that Rust also provides wrapping_mul, a non-panicking variant of the multiplication on u8 that wraps when the result is bigger than 255. Replacing the common multiplication with wrapping_mul in square would fix the panic, but then, square(256) returns zero. Semantically, this is not what one would expect from square.

Our problem is that our function square is well-defined only when its input is within 0 and 15.

Solution A: reflect the partialness of the function in Rust

A first solution is to make square return an Option<u8> instead of a u8:

fn square_option(x: u8) -> Option<u8> {
    if x >= 16 {
        None
    } else {
        Some(x * x)
    }
}
let square_option (x: u8) : Core.Option.t_Option u8 =
  if x >=. 16uy
  then Core.Option.Option_None <: Core.Option.t_Option u8
  else Core.Option.Option_Some (x *! x) <: Core.Option.t_Option u8

Here, F* is able to prove panic-freedom: calling square with any input is safe. Though, one may argue that square's input being small enough should really be an assumption. Having to deal with the possible integer overflowing whenever squaring is a huge burden. Can we do better?

Solution B: add a precondition

The type system of Rust doesn't allow the programmer to formalize the assumption that square expects a small u8. This becomes possible using hax: one can annotate a function with a pre-condition on its inputs.

The pre-conditions and post-conditions on a function form a contract: "if you give me some inputs that satisfies a given formula (the precondition), I will produce a return value that satisfy another formula (the postcondition)". Outside this contracts, anything might happen: the function might panic, might run forever, erase your disk, or anything.

The helper crate hax-lib-macros provdes the requires proc-macro which lets user writting pre-conditions directly in Rust.

#[hax_lib::requires(x < 16)]
fn square_requires(x: u8) -> u8 {
    x * x
}
let square_requires (x: u8) 
  : Pure u8 (requires x <. 16uy) (requires fun _ -> True)
  = x *! x

With this precondition, F* is able to prove panic freedom. From now on, it is the responsability of the clients of square to respect the contact. The next step is thus be to verify, through hax extraction, that square is used correctly at every call site.

Common panicking situations

Mutipication is not the only panicking function provided by the Rust library: most of the other integer arithmetic operation have such informal assumptions.

Another source of panics is indexing. Indexing in an array, a slice or a vector is a partial operation: the index might be out of range.

In the example folder of hax, you can find the chacha20 example that makes use of pre-conditions to prove panic freedom.

Another solution for safe indexing is to use the newtype index pattern, which is also supported by hax. The data invariants chapter gives more details about this.

Proving properties

In the last chapter, we proved one property on the square function: panic freedom. After adding a precondition, the signature of the square function was x:u8 -> Pure u8 (requires x <. 16uy) (ensures fun _ -> True).

This contract stipulates that, given a small input, the function will return a value: it will not panic or diverge. We could enrich the contract of square with a post-condition about the fact it is a increasing function:

#[hax_lib::requires(x < 16)]
#[hax_lib::ensures(|result| result >= x)]
fn square_ensures(x: u8) -> u8 {
    x * x
}
let square_ensures (x: u8)
    : Pure u8
      (requires x <. 16uy)
      (ensures fun result -> result >=. x) 
    = x *! x

Such a simple post-condition is automatically proven by F*. The properties of our square function are not fasinating. Let's study a more interesting example: Barrett reduction.

A concrete example of contract: Barrett reduction

While the correctness of square is obvious, the Barrett reduction is not.

Given value a field element (a i32 whose absolute value is at most BARRET_R), the function barrett_reduce defined below computes result such that:

  • result ≡ value (mod FIELD_MODULUS);
  • the absolute value of result is bound as follows: |result| < FIELD_MODULUS.

It is easy to write this contract directly as hax::requires and hax::ensures annotations, as shown in the snippet below.

type FieldElement = i32;
const FIELD_MODULUS: i32 = 3329;
const BARRETT_SHIFT: i64 = 26;
const BARRETT_R: i64 = 0x4000000; // 2^26
const BARRETT_MULTIPLIER: i64 = 20159; // ⌊(BARRETT_R / FIELD_MODULUS) + 1/2⌋

#[hax_lib::requires((i64::from(value) >= -BARRETT_R && i64::from(value) <= BARRETT_R))]
#[hax_lib::ensures(|result| result > -FIELD_MODULUS && result < FIELD_MODULUS
                     && result %  FIELD_MODULUS ==  value % FIELD_MODULUS)]
fn barrett_reduce(value: i32) -> i32 {
    let t = i64::from(value) * BARRETT_MULTIPLIER;
    let t = t + (BARRETT_R >> 1);

    let quotient = t >> BARRETT_SHIFT;
    let quotient = quotient as i32;

    let sub = quotient * FIELD_MODULUS;

    // Here a lemma to prove that `(quotient * 3329) % 3329 = 0`
    // may have to be called in F*.

    value - sub
}
unfold
let t_FieldElement = i32

let v_BARRETT_MULTIPLIER: i64 = 20159L

let v_BARRETT_R: i64 = 67108864L

let v_BARRETT_SHIFT: i64 = 26L

let v_FIELD_MODULUS: i32 = 3329l

let barrett_reduce (value: i32)
    : Pure i32
      (requires
        (Core.Convert.f_from value <: i64) >=. (Core.Ops.Arith.Neg.neg v_BARRETT_R <: i64) &&
        (Core.Convert.f_from value <: i64) <=. v_BARRETT_R)
      (ensures
        fun result ->
          let result:i32 = result in
          result >. (Core.Ops.Arith.Neg.neg v_FIELD_MODULUS <: i32) && result <. v_FIELD_MODULUS &&
          (result %! v_FIELD_MODULUS <: i32) =. (value %! v_FIELD_MODULUS <: i32)) =
  let t:i64 = (Core.Convert.f_from value <: i64) *! v_BARRETT_MULTIPLIER in
  let t:i64 = t +! (v_BARRETT_R >>! 1l <: i64) in
  let quotient:i64 = t >>! v_BARRETT_SHIFT in
  let quotient:i32 = cast (quotient <: i64) <: i32 in
  let sub:i32 = quotient *! v_FIELD_MODULUS in
  let _:Prims.unit = Tutorial_src.Math.Lemmas.cancel_mul_mod quotient 3329l in
  value -! sub

Before returning, a lemma may have to be called in F* to prove the correctness of the reduction. The lemma is Math.Lemmas.cancel_mul_mod (v quotient) 3329;, which establishes that (quotient * 3329) % 3329 is zero. With the help of that one lemma, F* is able to prove the reduction computes the expected result. (We may expose lemmas like this to call from Rust directly in future.)

This Barrett reduction examples is taken from libcrux's proof of Kyber which is using hax and F*.

This example showcases an intrinsic proof: the function barrett_reduce not only computes a value, but it also ship a proof that the post-condition holds. The pre-condition and post-condition gives the function a formal specification, which is useful both for further formal verification and for documentation purposes.

Extrinsic properties with lemmas

Consider the encrypt and decrypt functions below. Those functions have no precondition, don't have particularly interesting properties individually. However, the compostion of the two yields an useful property: encrypting a ciphertext and decrypting the result with a same key produces the ciphertext again. |c| decrypt(c, key) is the inverse of |p| encrypt(p, key).

fn encrypt(plaintext: u32, key: u32) -> u32 {
    plaintext ^ key
}

fn decrypt(ciphertext: u32, key: u32) -> u32 {
    ciphertext ^ key
}
let decrypt (ciphertext key: u32) : u32 = ciphertext ^. key

let encrypt (plaintext key: u32) : u32 = plaintext ^. key

In this situation, adding a pre- or a post-condition to either encrypt or decrypt is not useful: we want to state our inverse property about both of them. Better, we want this property to be stated directly in Rust: just as with pre and post-conditions, the Rust souces should clearly state what is to be proven.

To this end, Hax provides a macro lemma. Below, the Rust function encrypt_decrypt_identity takes a key and a plaintext, and then states the inverse property. The body is empty: the details of the proof itself are not relevant, at this stage, we only care about the statement. The proof will be completed manually in the proof assistant.

#[hax_lib::lemma]
#[hax_lib::requires(true)]
fn encrypt_decrypt_identity(
    key: u32,
    plaintext: u32,
) -> Proof<{ decrypt(encrypt(plaintext, key), key) == plaintext }> {
}
let encrypt_decrypt_identity (key plaintext: u32)
    : Lemma (requires true)
      (ensures (decrypt (encrypt plaintext key <: u32) key <: u32) =. plaintext) = ()

Data invariants

In the two previous chapters we saw how to write specifications on functions, be it with pre and post-condition or with lemmas. In this chapter, we will see how to maintain invariants with precise types.

Making illegal states unpresentable

With the Barrett example, we were working on a certain field, whose elements were represented as i32 integers. To simplify, let's consider F₃, the finite field with 3 elements (say 0, 1 and 2). Every element of F3 can be represented as a i32 integers, but the converse doesn't hold: the vast majority of i32 integers are not in of F₃.

Representing F₃ as i32s, every time we define a function consuming F₃ elements, we face the risk to consume illegal elements. We are thus back to chapter 4.1: we should panic on illegal elements, and add hax pre-conditions on every single function. That's not ideal: the property of being either 0, 1 or 2 should be encoded directly on the type representing F₃ elements.

enums to then rescue

Rust alone already can solve our representation issues with enums! Below, we define the enum type F3 which has only three constructor: F3 represent exactly the elements of F₃, not more, not less.

enum F3 {
    E1,
    E2,
    E3,
}
type t_F3 =
  | F3_E1 : t_F3
  | F3_E2 : t_F3
  | F3_E3 : t_F3

let t_F3_cast_to_repr (x: t_F3) : isize =
  match x with
  | F3_E1  -> isz 0
  | F3_E2  -> isz 1
  | F3_E3  -> isz 3

With F3, there doesn't exist illegal values at all: we can now define total functions on F₃ elements. We dropped altogether a source of panic!

Soon you want to work with a bigger finite field: say F₂₃₄₇. Representing this many q different elements with an Rust enum would be very painful... The enum apporach falls appart.

Newtype and refinements

Since we don't want an enum with 2347 elements, we have to revert to a type that can hold this many elements. The smallest integer type large enough provided by Rust is u16.

Let's define F a "newtype": a struct with one u16 field v. Notice the refinment annotation on v: the extraction of this type F via hax will result in a type enforcing v small enough.

pub const Q: u16 = 2347;

#[hax_lib::attributes]
pub struct F {
    #[hax_lib::refine(v < Q)]
    pub v: u16,
}
let v_Q: u16 = 2347us

type t_F = { f_v:f_v: u16{f_v <. v_Q} }

In Rust, we can now define functions that operates on type F, assuming they are in bounds with respect to F₂₃₄₇: every such assumption will be checked and enforced by the proof assistant. As an example, below is the implementation of the addition for type F.

use core::ops::Add;

impl Add for F {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Self {
            v: (self.v + rhs.v) % Q,
        }
    }
}
[@@ FStar.Tactics.Typeclasses.tcinstance]
let impl: Core.Ops.Arith.t_Add t_F t_F =
  {
    f_Output = t_F;
    f_add_pre = (fun (self: t_F) (rhs: t_F) -> true);
    f_add_post = (fun (self: t_F) (rhs: t_F) (out: t_F) -> true);
    f_add = fun (self: t_F) (rhs: t_F) -> { 
          f_v = (self.f_v +! rhs.f_v <: u16) %! v_Q 
      } <: t_F
  }

Here, F* is able to prove automatically that (1) the addition doesn't overflow and (2) that the invariant of F is preserved. The definition of type F in F* (named t_F) very explicitely requires the invariant as a refinement on v.

Troubleshooting/FAQ

This chapter captures a list of common questions or issues and how to resolve them. If you happen to run into an issue that is not documented here, please consider submitting a pull request!

The include flag: which items should be extracted, and how?

Often, you need to extract only a portion of a crate. The subcommand cargo hax into accepts the -i flag which will help you to include or exclude items from the extraction.

Consider the following lib.rs module of a crate named mycrate.

#![allow(unused)]
fn main() {
fn interesting_function() { aux() }
fn aux() { foo::f() }
fn something_else() { ... }
mod foo {
  fn f() { ... }
  fn g() { ... }
  fn h() { ... }
  fn interesting_function() { something() }
  fn something() { ... }
  mod bar {
    fn interesting_function() { ... }
  }
}
fn not_that_one() { not_that_one_dependency() }
fn not_that_one_dependency() { ... }
}

Here are a few examples of the -i flag.

  • cargo hax into -i '-** +mycrate::**::interesting_function' <BACKEND>
    The first clause -** makes items not to be extracted by default.
    This extracts any item that matches the glob pattern mycrate::**::interesting_function, but also any (local) dependencies of such items. ** matches any sub-path.
    Thus, the following items will be extracted:
    • mycrate::interesting_function (direct match);
    • mycrate::foo::interesting_function (direct match);
    • mycrate::foo::bar::interesting_function (direct match);
    • mycrate::aux (dependency of mycrate::interesting_function);
    • mycrate::foo::f (dependency of mycrate::aux);
    • mycrate::foo::something (dependency of mycrate::foo::interesting_function).
  • cargo hax into -i '+** -*::not_that_one' <BACKEND>
    Extracts any item but the ones named <crate>::not_that_one. Here, everything is extracted but mycrate::not_that_one, including mycrate::not_that_one_dependency.
  • cargo hax into -i '-** +!mycrate::interesting_function' <BACKEND>
    Extracts only mycrate::interesting_function, not taking into account dependencies.
  • cargo hax into -i '-** +~mycrate::interesting_function' <BACKEND>
    Extracts mycrate::interesting_function and its direct dependencies (no transitivity): this includes mycrate::aux, but not mycrate::foo::f.

Now, suppose we add the function not_extracting_function below. not_extracting_function uses some unsafe code: this is not supported by hax. However, let's suppose we have a functional model for this function and that we still want to extract it as an axiomatized, assumed symbol.

#![allow(unused)]
fn main() {
fn not_extracting_function(u8) -> u8 {
    ... 
    unsafe { ... }
    ...
}
}
  • cargo hax into -i '+:mycrate::not_extracting_function' <BACKEND>
    Extracts mycrate::not_extracting_function in signature-only mode.

Libraries

Macros and attributes

The hax engine understands only one attribute: #[_hax::json(PAYLOAD)], where PAYLOAD is a JSON serialization of the Rust enum hax_lib_macros_types::AttrPayload.

Note #[_hax::json(PAYLOAD)] is a tool attribute: an attribute that is never expanded.

In the engine, the OCaml module Attr_payloads offers an API to query attributes easily. The types in crate hax_lib_macros_types and corresponding serializers/deserializers are automatically generated in OCaml, thus there is no manual parsing involved.

User experience

Asking the user to type #[_hax::json(some_long_json)] is not very friendly. Thus, the crate hax-lib-macros defines a bunch of proc macros that defines nice and simple-to-use macros. Those macro take care of cooking some hax_lib_macros_types::AttrPayload payload(s), then serialize those payloads to JSON and produce one or more #[_hax::json(serialized_payload)] attributes.

Warning: this book is currently being rewritten!

hax is the successor of hacspec. You can find the previous book describing hacspec here, but keep in mind most of the information there is outdated.