B3: An intermediate verification language
K. Rustan M. Leino
Manuscript KRML 301, 18 February 2025

Abstract. This document defines the intermediate verification language B3. With a design informed by the experience and needs of prescribing verification conditions for Dafny, B3 aims to support the verification foundation of any similar language.

0. Introduction

Authoring an automated program verifier is a complex task that can be broken up into two major components with separate concerns. One component is the front end, which prescribes the proof obligations for any given source program. The other component is the back end, which uses algorithms that try to discharge these proof obligations. The front end is concerned with what the proof obligations are (for the source language at hand), whereas the back end is concerned with how to discharge the proof obligations. The front end and back end are joined at an Intermediate Verification Language (IVL). For the front-end author, the IVL becomes a thinking tool that suggests ways to prescribe the kinds of verification conditions that commonly arise in a program verifier. For the back-end author, the IVL provides a primitive and precise foundation upon which to build its algorithms.

The back end makes use of a set of cooperating decision procedures or semi-decision procedures, each of which is equipped to handle a particular domain (e.g., linear integer arithmetic). Such decision procedures are provided in off-the-shelf Satisfiability Modulo Theories (SMT) solvers. The back end works by turning the proof obligations prescribed by a given IVL program into queries posed to one or several SMT solvers (henceforth referred to as just the solver).

In the last 25 years, IVLs have been used effectively in dozens of automated program verifiers. Widely used IVLs include Boogie, Why3, and Viper. For example, the verifier for the Dafny programming language has, since its creation 17 years ago, used the Boogie 2 IVL and tool. During this long experience, several patterns of prescribing proof obligations have emerged that are not well supported by Boogie 2.

This document describes a new IVL, named B3, which incorporates support for these common patterns. The semantics of B3 is defined by strongest postconditions and its associated proof obligations (which, equivalently, can in the opposite direction be computed as weakest conservative preconditions in the standard way). The semantics is presented as two operations on a solver state Ω: Ω.extend(e) records strongest postconditions and Ω.prove(e) indicates proof obligations. Through these two solver operations, B3 essentially defines a script for automatically interacting with the solver.

In contrast to the organization of a reference manual that introduces each construct with all its details and variations, this document aims to introduce ideas in a top-down fashion, starting with larger ideas and later going into further details and variations.

1. Overview

At the top level, B3 provides declarations to introduce types, functions, axioms, and procedures. It builds in some primitive types, including bool and int, and operations on these.

The following sections are ordered to first highlight the core features of the language. Later sections add variations and unsurprising features.

2. Statements

A prominent feature of any IVL is its support for control flow and data flow. This provides the front end a straightforward way to encode the statements of the source language, like if statements and assignment statements. Therefore, let's start with B3's statements.

Here is the grammar of (the core) statements. In it, capitalized words denote non-terminal grammar productions. As suggestive alternatives to the non-terminal Identifier, the grammar uses the letters x, y, z (for variables and parameters), ℓ (for labels), F (for functions), and M (for procedures). Keywords are syntax highlighted. Punctuation is shown as is, except for the punctuation that's part of the grammar definition: ::= (which defines its LHS grammar production), | (for choice), parentheses (for grouping), square brackets (to denote an optional component), a suffix * to denote 0 or more occurrences of what just precedes it, and *, as a variation of * that says the occurrences are comma-separated. Tokens that may be confused with such grammar components are enclosed in double quotes.

Stmt ::=
  | var x: Type
  | x := Expr
  | Assertion
  | [ ℓ: ] Block
  | exit ℓ
  | If
  | [ ℓ: ] Loop
  | return
  | Call
  | ...
Block ::=
  | "{" Stmt* "}"
If ::=
  | if Expr Block else Block
  | ...

These grammatical elements will be elaborated further later in the document.

Blocks contain a sequence of statements. The variable declarations among those statements must use distinct names. The language allows a block to introduce a variable that shadows a variable with the same name in an outer block. The scope of a variable ends at the end of the enclosing block. Variables can be updated with assignment statements.

The left-hand side of an assignment must be a mutable variable. Local variables, in/out-parameters, and out-parameters are mutable, but in-parameters are not.

Some statements can be labeled. Subcomponents of such a statement can use an exit statement to exit the statement. That is, a statement exit ℓ is allowed only inside the statement labeled with ℓ. This provides outward jumps, as needed, for example, for a source-language break statement out of a loop. (For a variation of this statement, see section 15.0.)
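
For example (a sketch, with hypothetical variables x and b), an exit statement inside a labeled block skips the remainder of that block:

ℓ: {
  x := 0
  if b {
    exit ℓ
  }
  x := 1
}

If b holds, control transfers to just after the block, so the assignment x := 1 is not executed.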

The grammar shows the basic form of an if statement. For variations of it, see section 16. There are also variations on the var statement and the assignment statement; see sections 15.2 and 15.1, respectively.

Assertions are defined in sections 4 and 17, calls are defined in section 6, and the loop statement is defined in section 10.

3. Symbolic execution

The statements are given a strongest-postcondition semantics, presented in terms of processing rules that strongly hint at a symbolic execution. However, note that the semantics does not depend on using symbolic execution. Indeed, the given extend and prove operations for a statement S are the duals of the weakest-precondition operations wlp.S and wp.S.true. Such a weakest-precondition semantics can be used to formulate a single verification-condition formula for each B3 procedure (and one can also obtain a single verification-condition formula straight from the extend and prove operations). From the given semantics, it is also possible to define an operational semantics for B3, for example a trace semantics (analogous to what's in the Boogie 2 language reference manual).

To describe the processing of statements, we first introduce its components.

3.0. Continuations

The symbolic execution is defined for a given continuation, which is a sequence of statements, with typical names Ss and Cc. We denote the empty sequence by $\varepsilon$ and use the operator $\cdot$ to denote sequence concatenation.

3.1. Incarnations

When symbolic execution processes an assignment to a variable, it generates a new name for the variable and binds the new name to the value being assigned. These names thus denote incarnations of the variable. The incarnation state is a map σ from each variable to the variable's current incarnation.

To a first order of approximation, the names used for incarnations are irrelevant, provided there are no name clashes. However, to make naming deterministic, we use names of the form x and x%n, where x is the name of the variable as declared in the B3 program and n is a sequence number. When a variable x is first declared, its incarnation is the name x (which can be used, but often isn't), and subsequent assignments will use the names x%0, x%1, and so on.

For a variable x, σ.new(x) denotes the next name to be used for x. For an expression e, σ[e] denotes e in which every variable has been renamed according to σ. For a map σ, we write σ[x := y] to denote a map that is like σ except that it maps x to y.
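
For example, if σ maps x to x%4 and y to y%1, then σ[x + 1 < y] is x%4 + 1 < y%1, σ.new(y) is y%2, and σ[y := y%2] maps y to y%2 while still mapping x to x%4.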

3.2. Solver state

Symbolic execution makes use of a solver Ω, whose state is essentially a logical formula (but with additional bookkeeping information that we'll see later). The two main operations on a solver state Ω are Ω.extend(e), which returns a solver state that additionally records the condition e, and Ω.prove(e), which attempts to prove e from the conditions recorded in Ω and reports an error if no proof is found.

As we represent it, the solver state is immutable and the definition of symbolic execution sometimes uses the same solver state more than once. In practice, a solver may be incremental, so the interaction with the solver may use SMT Lib PUSH and POP operations to efficiently accomplish what we describe here.

3.3. Statement processing

Symbolic execution is defined from a continuation Ss (the statements that remain to be processed), an incarnation state σ, a map B from labels (and return) to pairs of a variable set and a continuation, and a solver state Ω.

The translation function Process$\llbracket$ Ss, σ, B, Ω $\rrbracket$ gives a sequence of solver commands. These say how the state of the symbolic execution proceeds and what needs to be proved when. For example, if σ respectively maps x and y to the incarnations x%4 and y%1, then, as we'll soon see, Process$\llbracket$ y := x + 1 $\cdot$ check x < y, σ, B, Ω $\rrbracket$ gives the solver commands

Ω.extend(y%2 = x%4 + 1).prove(x%4 < y%2)

For now, we define Process only for a few statements, enough to give a representative idea of how the symbolic execution works. Later sections will define Process for other statements. We suppose that every block has been given a label.

Process$\llbracket$ $\varepsilon$, σ, B, Ω $\rrbracket$ =

Process$\llbracket$ var x: T $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  Process$\llbracket$ Cc, σ[x := x], B, Ω $\rrbracket$

Process$\llbracket$ x := e $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  let x' be σ.new(x)
  Process$\llbracket$ Cc, σ[x := x'], B, Ω.extend(x' = e') $\rrbracket$

Process$\llbracket$ ℓ: { Ss } $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let B' be B[ℓ := (domain of σ, Cc)]
  Process$\llbracket$ Ss $\cdot$ exit ℓ, σ, B', Ω $\rrbracket$

Process$\llbracket$ exit ℓ $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let (V, Cc') be the value of B[ℓ]
  let σ' be σ domain-restricted to V
  Process$\llbracket$ Cc', σ', B, Ω $\rrbracket$

Process$\llbracket$ return $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let (V, Cc') be the value of B[return]
  let σ' be σ domain-restricted to V
  Process$\llbracket$ Cc', σ', B, Ω $\rrbracket$

A general choice in symbolic execution is whether or not to join the two branches of an if before processing its continuation. Here is the version that does not do that join, but instead processes the branches through an if separately (quite possibly with some redundancy).

Process$\llbracket$ if e S0 else S1 $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  Process$\llbracket$ S0 $\cdot$ Cc, σ, B, Ω.extend(e') $\rrbracket$
  Process$\llbracket$ S1 $\cdot$ Cc, σ, B, Ω.extend(¬e') $\rrbracket$
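
For example, if σ maps x and y to x%0 and y%0, then processing if x < 0 { y := 0 - x } else { y := x } $\cdot$ check 0 <= y gives the two independent command sequences

Ω.extend(x%0 < 0).extend(y%1 = 0 - x%0).prove(0 <= y%1)
Ω.extend(¬(x%0 < 0)).extend(y%1 = x%0).prove(0 <= y%1)

one per branch (a sketch; since both branches are processed from the same σ, each happens to pick y%1 as the next incarnation of y).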

Remark: To support joins without letting branches influence each other, names of incarnations should be generated in such a way that the numberings in two branches have no effect on each other. For example, suppose the incarnation of a variable x is x%8 upon reaching an if. Then, the first new incarnation inside the then branch can be x%8.then.0 and the first new incarnation inside the else branch can be x%8.else.0. The incarnation at the join of the two branches will be x%9.

4. Assertions

A core function of an IVL is to generate proof obligations. These are prescribed using assertions, of which there are two basic forms. The check statement checks if a condition holds in the current state. Checks are forgetful, meaning that they have no effect on the solver state. Stated in terms of an operational semantics, program execution continues after a check even if its condition does not hold. The assume statement lets the solver state learn a condition. This means the condition becomes an antecedent in discharging further proof obligations. Stated in terms of an operational semantics, any execution that would encounter an assume with a condition that does not hold is not considered a possible trace of the program (that is, any analysis of the program ignores such a trace).

In simple settings, the property being checked is then learned, so the language offers an assert statement that combines the check and assume statements. Stated in terms of an operational semantics, program execution continues after an assert only if its condition holds.

These three statements, along with a block that encloses a sequence of assertions, are

Assertion ::=
  | check Expr
  | assume Expr
  | assert Expr
  | ...
ABlock ::=
  | "{" ABlockAssertion* "}"
ABlockAssertion ::=
  | Assertion
  | ...

The two basic assertions are processed as follows:

Process$\llbracket$ check e $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  Ω.prove(e')
  Process$\llbracket$ Cc, σ, B, Ω $\rrbracket$
Process$\llbracket$ assume e $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  let Ω' be Ω.extend(e')
  Process$\llbracket$ Cc, σ, B, Ω' $\rrbracket$

The statement assert e always means check e $\cdot$ assume e.
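
For example, if σ maps x to x%0, then assert 0 <= x gives rise to the solver commands

Ω.prove(0 <= x%0)
Ω' := Ω.extend(0 <= x%0)

so the condition is first checked and then available as an antecedent for subsequent proof obligations.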

As we go along, we will sometimes need to collect the learning that's prescribed by an assertion. This is done by the translation function Learn, which returns an expression.

Learn$\llbracket$ check e, σ $\rrbracket$ =
  true

Learn$\llbracket$ assume e, σ $\rrbracket$ =
  σ[e]

5. Procedures

A procedure is a name, a signature, a specification, and an optional implementation.

TopLevelDeclaration ::=
  | procedure M "(" CParam*, ")" Spec [ Block ]
  | ...
CParam ::=
  | [inout | out] x: Type
Spec ::=
  | PreSpecification* PostSpecification*
PreSpecification ::=
  | requires Expr
  | requires ABlock
PostSpecification ::=
  | ensures Expr
  | ensures ABlock

The comma-separated parameters of a procedure come in three kinds: in-parameters, in/out-parameters, and out-parameters. Although a procedure can have any number of parameters, subsequent declarations and uses of procedures will representatively show three parameters, one of each kind: x: X, inout y: Y, out z: Z.

The specification consists of any number of preconditions and any number of postconditions, each of which indicates either an expression or an assertion. Postconditions are allowed to refer to the initial value of an in/out-parameter y by the expression old(y). For the purpose of substitution in expressions, old(y) is treated as a variable; that is, we allow old(y) to be used as a left-hand side in the incarnation map σ.

The processing of a procedure consists of three tasks. The first task vets the precondition, the second task vets the postcondition, and the third task enforces that the body, if any, establishes the postcondition. The first of these tasks precedes the other two, but the second and third tasks do not depend on each other.

TopLevelProcess$\llbracket$ procedure M(x: X, inout y: Y, out z: Z) PreSpecs PostSpecs Body, Ω $\rrbracket$ =
  let σ be [x := x, y := y]
  let y' be σ.new(y)
  let z' be σ.new(z)
  let σ' be [x := x, old(y) := y, y := y', z := z']
  if there is a Body
    let Body' be assume old(y) = y $\cdot$ Body
    VetPre$\llbracket$ PreSpecs, PostSpecs, Body', σ, σ', Ω $\rrbracket$
  else
    VetPre$\llbracket$ PreSpecs, PostSpecs, $\varepsilon$, σ, σ', Ω $\rrbracket$

VetPre vets the precondition and then continues into VetPost and EnforcePost.

VetPre$\llbracket$ requires e $\cdot$ PreSpecs, PostSpecs, Body, σ, σ', Ω $\rrbracket$ =
  let e' be σ[e]
  VetPre$\llbracket$ PreSpecs, PostSpecs, Body, σ, σ', Ω.extend(e') $\rrbracket$

VetPre$\llbracket$ requires A $\cdot$ PreSpecs, PostSpecs, Body, σ, σ', Ω $\rrbracket$ =
  Process$\llbracket$ A, σ, $\varepsilon$, Ω $\rrbracket$
  let L be Learn$\llbracket$ A, σ $\rrbracket$
  VetPre$\llbracket$ PreSpecs, PostSpecs, Body, σ, σ', Ω.extend(L) $\rrbracket$

VetPre$\llbracket$ $\varepsilon$, PostSpecs, Body, σ, σ', Ω $\rrbracket$ =
  VetPost$\llbracket$ PostSpecs, σ', Ω $\rrbracket$
  EnforcePost$\llbracket$ PostSpecs, Body, σ', Ω $\rrbracket$
VetPost$\llbracket$ ensures e $\cdot$ PostSpecs, σ, Ω $\rrbracket$ =
  let e' be σ[e]
  VetPost$\llbracket$ PostSpecs, σ, Ω.extend(e') $\rrbracket$

VetPost$\llbracket$ ensures A $\cdot$ PostSpecs, σ, Ω $\rrbracket$ =
  Process$\llbracket$ A, σ, $\varepsilon$, Ω $\rrbracket$
  let L be Learn$\llbracket$ A, σ $\rrbracket$
  VetPost$\llbracket$ PostSpecs, σ, Ω.extend(L) $\rrbracket$

VetPost$\llbracket$ $\varepsilon$, σ, Ω $\rrbracket$ =

To enforce that the body establishes the postconditions, and also to enforce that a caller establishes the precondition (see section 6 on calls), we define two operations on specifications, ConvertToLearn and ConvertToCheck. Both learn from the assertion forms; for the expression forms, ConvertToLearn learns them and ConvertToCheck checks them. The operations process requires and ensures (and in section 10 also invariant for loops) in the same way; the following definitions use the fictitious keyword reqensinv to stand for any of requires, ensures, or invariant.

ConvertToLearn$\llbracket$ reqensinv e $\cdot$ Specs $\rrbracket$ =
  let Sp be ConvertToLearn$\llbracket$ Specs $\rrbracket$
  assume e $\cdot$ Sp

ConvertToLearn$\llbracket$ reqensinv A $\cdot$ Specs $\rrbracket$ =
  let L be Learn$\llbracket$ A $\rrbracket$
  let Sp be ConvertToLearn$\llbracket$ Specs $\rrbracket$
  assume L $\cdot$ Sp

ConvertToLearn$\llbracket$ $\varepsilon$ $\rrbracket$ =
  $\varepsilon$
ConvertToCheck$\llbracket$ reqensinv e $\cdot$ Specs $\rrbracket$ =
  let Sp be ConvertToCheck$\llbracket$ Specs $\rrbracket$
  check e $\cdot$ Sp

ConvertToCheck$\llbracket$ reqensinv A $\cdot$ Specs $\rrbracket$ =
  let L be Learn$\llbracket$ A $\rrbracket$
  let Sp be ConvertToCheck$\llbracket$ Specs $\rrbracket$
  assume L $\cdot$ Sp

ConvertToCheck$\llbracket$ $\varepsilon$ $\rrbracket$ =
  $\varepsilon$
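
As a small worked instance, consider the two preconditions requires { check P $\cdot$ assume C } and requires e, for hypothetical conditions P, C, and e. Ignoring the trivially true conjuncts produced by Learn, the conversions give

ConvertToCheck$\llbracket$ requires { check P $\cdot$ assume C } $\cdot$ requires e $\rrbracket$ = assume C $\cdot$ check e
ConvertToLearn$\llbracket$ requires { check P $\cdot$ assume C } $\cdot$ requires e $\rrbracket$ = assume C $\cdot$ assume e

Both conversions drop the check P inside the assertion block; only ConvertToCheck turns the expression form e into a proof obligation.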

Now, we define the third task of processing procedure declarations. It uses the procedure's precondition, as provided in the Ω parameter to EnforcePost, and it enforces the postcondition after processing the body.

EnforcePost$\llbracket$ PostSpecs, Body, σ, Ω $\rrbracket$ =
  let Ens be ConvertToCheck$\llbracket$ PostSpecs $\rrbracket$
  let B be [return := (domain of σ, Ens)]
  Process$\llbracket$ Body $\cdot$ return, σ, B, Ω $\rrbracket$

5.0. Example

Consider the following procedure declaration:

procedure M(x: int, inout y: int)
  requires { check FPre(x, y) $\cdot$ assume FCanCall(x, y) }
  requires F(x, y) < 10
  ensures { check GPre(x, y) $\cdot$ assume GCanCall(x, y) }
  ensures old(y) <= G(x, y)

Vetting this procedure declaration from a solver state Ω involves the following checks:

// vet precondition
Ω.prove(FPre(x, y))
Ω' := Ω.extend(FCanCall(x, y))
Ω" := Ω'.extend(F(x, y) < 10)
// vet postcondition
Ω".prove(GPre(x, y%0))
Ω3 := Ω".extend(GCanCall(x, y%0))
Ω4 := Ω3.extend(y <= G(x, y%0))

Note that vetting learns from, but does not check, the expression forms of the pre- and postconditions.

If the procedure has a body Body, then it will be processed by

Process$\llbracket$ Body $\cdot$ return, ..., ..., Ω" $\rrbracket$

When this processing reaches the return label, say in a solver state Ω5 and with the incarnation y%8 for y, then processing will enforce the postcondition by

Ω6 := Ω5.extend(GCanCall(x, y%8))
Ω6.prove(y <= G(x, y%8))

Note that enforcement does not make use of the check assertions in the procedure's pre- and postcondition.

6. Calls

A procedure can be called as a statement. This enforces that the caller has established the precondition and in return lets the caller use the procedure's postcondition. In a call, the in/out- and out-arguments must be distinct variables. Following a good idea from C#, the arguments are syntactically tagged with the same modifiers (inout, out) as the corresponding formal parameters.

Call ::=
  | call M "(" Argument*, ")"

Argument ::=
  | Expr
  | ( inout | out ) x

It would be possible to define a call as an elaboration of the body of the called procedure (provided the procedure has a body). However, here, we just define the processing of a call in terms of the procedure's specification.

Suppose M is a procedure

procedure M(x: X, inout y: Y, out z: Z) PreSpecs PostSpecs

Then,

Process$\llbracket$ call M(e, inout b, out c) $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  let x' be σ.new(x%in)
  let b' be σ[b]
  let b" be σ.new(b')
  let c' be σ.new(c)
  let σ' be [x := x', y := b']
  let Pre be ConvertToCheck$\llbracket$ PreSpecs $\rrbracket$
  let Ω' be Ω.extend(x' = e')
  Process$\llbracket$ Pre, σ', B, Ω' $\rrbracket$
  let σ" be σ[x := x', old(y) := b', y := b", z := c']
  let Post be ConvertToLearn$\llbracket$ PostSpecs $\rrbracket$
  Process$\llbracket$ Post $\cdot$ Cc, σ", B, Ω' $\rrbracket$

Note that the semantics of the call statement does not retain any part of the precondition in the continuation of the call. In other words, the call “checks and forgets” the precondition.

6.0. Example

Continuing the example from section 5.0, consider a call

call M(a, inout b)

From a solver state Ω and with incarnations a%3 and b%5 for a and b, the call is processed as follows:

Ω' := Ω.extend(x%in = a%3)
Ω" := Ω'.extend(FCanCall(x%in, b%5))
Ω".prove(F(x%in, b%5) < 10)
Ω3 := Ω'.extend(GCanCall(x%in, b%6))
Ω4 := Ω3.extend(b%5 <= G(x%in, b%6))

7. Types

B3 features some built-in primitive types as well as user-defined types.

Type ::=
  | bool
  | int
  | real
  | bv0 | bv1 | bv2 | ...
  | T

There's not much to say about type declarations. Simply, they introduce names of uninterpreted, user-defined types.

TopLevelDeclaration ::=
  | ...
  | type T

There's nothing to process about a type.

TopLevelProcess$\llbracket$ type T $\rrbracket$ =

For variations of type declarations, see section 18.

(TODO: Consider adding record types and/or collections of variables as a type.)

8. Axioms

An axiom declares a top-level assumption. It can be used to (fully or partially) describe properties of functions or relations between functions.

TopLevelDeclaration ::=
  | ...
  | axiom Explains* Expr

The Explains clauses are explained in section 12 on dependency management.

There's nothing to process about an axiom.

TopLevelProcess$\llbracket$ axiom e $\rrbracket$ =

9. Functions

A function is a name, a signature, and an optional definition.

TopLevelDeclaration ::=
  | ...
  | function F "(" FParam*, ")" ":" TypeOrPartition [ FDefinition ]
FParam ::=
  | FParamModifier* x: Type
FDefinition ::=
  | [ ForallHeader ] Trigger* FPre* "{" Expr "}"
ForallHeader ::=
  | forall x: Type
FPre ::=
  | requires Expr

In its simplest form, a function declaration just introduces an uninterpreted function symbol; for example:

function Fib(x: int): int

Functions live in a different name space from variables.

For details of partitions, see sections 18.1 and 19.1; for now, it suffices to think of TypeOrPartition as just Type. The FParamModifier production is defined in section 19.0 and the ForallHeader production is explained in section 19.2.

9.0. Function definition

The value of a function can be defined using an axiom. It can also be defined as part of the function declaration. There are two advantages of giving the definition as part of the function declaration. One advantage is convenience, since it eliminates the need for writing a universally quantified axiom. The other advantage is that, for a function that returns a boolean (aka, a predicate), B3 can trace into the body to give more detailed error information (see section 11 on markers).

Here is an example of a simple function declaration:

function Triple(x: int): int
{
  3 * x
}

The definition of this function effectively turns into the following axiom:

axiom forall x: int :: Triple(x) = 3 * x

It frequently occurs that a function is underspecified, that is, that it is defined only on certain parameters. This can be incorporated into the axiom by using requires clauses. For example, to give a definition of Triple only for non-negative integers, one can write

function Triple(x: int): int
  requires 0 <= x
{
  3 * x
}

which in effect gives the following axiom:

axiom forall x: int :: 0 <= x ==> Triple(x) = 3 * x

(TODO: The use of the requires keyword here is a bad idea, since requires suggests enforcing the condition at call sites, which is not the intention. That is, a function in B3 is total but possibly underspecified. The keyword should be replaced by something that suggests underspecification.)

10. Loops

Loop ::=
  | loop Invariant* Block
Invariant ::=
  | invariant Expr
  | invariant ABlock

A loop is used to denote a possibly infinite repetition of a block statement. exit and return statements can be used to end the repetition.

It would be possible to define a loop as an elaboration of its body. However, here, we just define the processing of a loop in terms of the loop's specification, its invariant clauses.

Process$\llbracket$ loop Invs S $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let CheckInv be ConvertToCheck$\llbracket$ Invs $\rrbracket$
  let LearnInv be ConvertToLearn$\llbracket$ Invs $\rrbracket$
  Process$\llbracket$ CheckInv, σ, B, Ω $\rrbracket$
  let x be the assignment targets (see section 10.0 below) of the loop body
  let x' be σ.new(x)
  let σ' be σ[x := x']
  VetInv$\llbracket$ Invs, σ', Ω $\rrbracket$
  Process$\llbracket$ LearnInv $\cdot$ S $\cdot$ CheckInv, σ', B, Ω $\rrbracket$

where the vetting of the invariant is defined by

VetInv$\llbracket$ $\varepsilon$, σ', Ω $\rrbracket$ =

VetInv$\llbracket$ invariant e $\cdot$ Ii, σ', Ω $\rrbracket$ =
  let e' be σ'[e]
  VetInv$\llbracket$ Ii, σ', Ω.extend(e') $\rrbracket$

VetInv$\llbracket$ invariant A $\cdot$ Ii, σ', Ω $\rrbracket$ =
  Process$\llbracket$ A, σ', $\varepsilon$, Ω $\rrbracket$
  let L be Learn$\llbracket$ A, Ω $\rrbracket$
  VetInv$\llbracket$ Ii, σ', Ω.extend(L) $\rrbracket$
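
As a small worked example (with a hypothetical variable i whose current incarnation is i%0), processing the loop

loop invariant 0 <= i { i := i + 1 }

gives rise to the following solver interactions, a sketch of the rules above:

// check the invariant on entry
Ω.prove(0 <= i%0)
// vet the invariant, with the assignment target i havoced to i%1
Ω.extend(0 <= i%1)
// learn the invariant, process the body, and re-check the invariant at the back-edge
Ω' := Ω.extend(0 <= i%1)
Ω" := Ω'.extend(i%2 = i%1 + 1)
Ω".prove(0 <= i%2)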

10.0. Assignment targets

The assignment targets of a loop body are the variables that are assigned (either directly by an assignment statement or as an in/out- or out-argument of a call statement) on some control path from the start of the loop body to the loop body's back-edge. For example, for the loop

ℓ: loop {
  v := e0
  if A {
    if B {
      w := e1
      exit ℓ
    } else {
      x := e2
      exit ℓ'
    }
  } else {
    if C {
      call M(e3, inout y)
    } else {
      z := e4
      return
    }
  }
}

the assignment targets are v and y, but not w, x, or z.

11. Error information

As presented so far, a call to Ω.prove(e) attempts to prove e in the solver state Ω and reports an error if no such proof is found. Each prove operation stems from a check statement, so an error reported by prove includes information that identifies the failing check. While check statements can be explicit in the program text, some check statements are generated as part of ConvertToCheck operations. In the latter case, the requires, ensures, or invariant clause that gave rise to the ConvertToCheck operation is included in the error report.

Beyond identifying the (explicit or implicit) check statement the proof of whose condition failed, prove does two more things to add more information to error reports.

11.0. Road post markers

One thing is to include road post markers. A road post marker is declared by the probe statement:

Stmt ::=
  | ...
  | probe [ Expr ]

Such a probe statement has no effect on the evaluation of the program. Instead, the expression is recorded into the solver state as an expression of interest. In the event that an error is reported, the solver evaluates all expressions of interest and reports their values paired with their respective probe statements.

For example, consider the following program snippet, started in a state where the value of x is not known:

if x < 10 {
  probe x + x
  check x < 100
}

This probe has no effect on the output (which is that no error is reported). However, for

if x < 10 {
  probe x + x
  check 100 <= x
}

the error reported will include a value for the expression x + x (for example, 14).

We introduce a new operation, Ω.record(e), which returns a solver state that additionally keeps track of e as an expression of interest.

Process$\llbracket$ probe e $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  Process$\llbracket$ Cc, σ, B, Ω.record(e') $\rrbracket$

One use of probes is to just record path information, as road post markers leading to the error. This can be done with probe true or probe 0, where the value of the expression is not interesting but the presence of information for the probe in the error report says that the error reported occurred on a control path that passed through the probe. Rather than forcing the user to pick an arbitrary expression, the language allows a probe statement without an argument.
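
For example, the second snippet above can record just the control path, without recording any interesting value:

if x < 10 {
  probe
  check 100 <= x
}

The report for the failing check then includes the argument-less probe, which shows that the error occurred on a path through the then branch.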

11.1. Labeled expressions

In its error reports, the prove operation also provides auxiliary information that comes from labeled conjuncts and inlined predicates. To that end, any expression can be labeled:

Expr ::=
  | ...
  | ℓ: Expr

It is now time to describe the Ω.prove(e) operation in more detail. Here is pseudo code, where each yield shows one of the return values. In the top-level prove operation, the values returned will be reported.

if e can be proved in Ω {
  yield Success
} else {
  let RoadPostMarkers be values for the expressions of interest in Ω

  let e0,e1,... be the top-level conjuncts e0 ∧ e1 ∧ ... of e
  for each such conjunct ei {
    if ei is a labeled expression ℓ: ee {
      for each (roadPosts, auxInfo) in Ω.prove(ee) {
        yield (RoadPostMarkers, ℓ $\cdot$ auxInfo)
      }
    }
  }

  if nothing is yielded above {
    if e is a call to a function P(args) where P has an FDefinition {
      let Pre be (the conjunction of expressions in) the FDefinition's requires clauses
      let Body be the body in the FDefinition
      let Pre',Body' be Pre,Body in which the formal parameters are replaced by args
      if Ω.prove(Pre') returns Success {
        for each (roadPosts, auxInfo) in Ω.prove(Body') {
          yield (RoadPostMarkers, "P" $\cdot$ auxInfo)
        }
      }
    }
  }

  if nothing is yielded above {
    yield (RoadPostMarkers, $\varepsilon$)
  }
}
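
For example (with hypothetical labels a and b), consider Ω.prove((a: 0 <= x) ∧ (b: x < 10)) in a solver state Ω that implies 0 <= x but not x < 10. The top-level proof fails; the recursive call for the conjunct labeled a succeeds and so contributes nothing, while the one for b fails, so the operation yields the road post markers paired with the label b, and the error report identifies b as the failing conjunct.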

(TODO: The above does nothing for “opening” a quantifier.)

12. Dependency management

It is important to know which function definitions and axioms get included in a verification condition. There are two reasons for this. One reason is that more antecedents make a verification condition logically weaker, that is, easier to discharge. In an extreme case, contradictory axioms (like the blatant axiom false) make everything logically provable. A second reason is that experience has shown that extraneous axioms, however irrelevant to the proof at hand, can severely slow down solver performance. The first reason implies we want to include enough axioms and the second reason implies that we don't want to include too many axioms, where “enough” and “too many” can be controlled by the program.

We want the included antecedents to be determined from the control path leading to a proof obligation. For example, we'd like the statements in a then branch of an if not to affect the set of antecedents included when proving the else branch. Therefore, the inclusion of such antecedents is done lazily. In a nutshell, for any solver operation Ω.extend(e) or Ω.prove(e), the solver state is first updated to include the axioms in the cone of influence of e.

In a little more detail, a solver operation that's given an expression e, say Ω.extend(e), starts by computing the cone of influence of e. This cone of influence, call it C, is a set of top-level declarations in the given B3 program. Then, the set of top-level declarations that have already been used to form the solver state Ω is subtracted, giving a set of declarations C - Ω. Finally, the declarations in C - Ω are added to the solver state, in the order in which these declarations occur in the program.

It remains to define the cone of influence of an expression e. It is the smallest set C of type, function, and axiom declarations such that every function and type mentioned in e is in C, every function and type mentioned in the signature or definition (if any) of a function in C is in C, and every axiom with an explains clause that names a function or type in C is in C.

Each axiom declaration (see section 8) can explicitly declare which functions and types depend on the axiom:

Explains ::=
  | explains Identifier

where Identifier denotes either a function or a type. An axiom without any explicit explains clauses is called a global axiom. Global axioms and their dependencies are always included in every verification condition. That is, every solver state is initialized so that it includes all global axioms and their cones of influence.

12.0. Example

Suppose a program contains the following top-level declarations:

function H(): bool

type A
function J(): A
function K(x: int): A
function L(x: int): A
function F(x: int): A
  requires H()
{
  if G(2) = G(x) then L(0) else L(1)
}

type B
function G(x: int): B

axiom explains H
  P
axiom explains B explains K
  Q
axiom S
axiom explains K
  forall x: int trigger K(x) trigger F(x) :: K(x) = F(x)
axiom explains A
  R

for some boolean expressions P, Q, R, and S whose form we ignore for the purposes of this example. Consider the processing of the statements

var y := J()
var z := F(5)

from a solver state Ω. Since every solver state contains all global axioms, we have that Ω contains S. The processing of y := J() performs the operation Ω.extend(y = J()), which will compute the cone of influence of J(), namely the declarations of function J, of its result type A, of the global axiom S, and of the axiom that explains A (that is, R).

Since axiom S is already accounted for in Ω, the operation Ω.extend(y = J()) will have the effect of creating a solver state Ω' that represents

Ω ∧ R ∧ y = J()

(shown here are only the constraints in the state, not the associated type and function declarations). Then, for Ω'.extend(z = F(5)), the cone of influence of F(5) consists of the declarations of F; of the functions H, G, and L and the types A and B mentioned in F's signature and definition; of the axioms P (which explains H), Q (which explains B), and R (which explains A); and of the global axiom S.

Since axioms S and R are already accounted for in Ω', the operation Ω'.extend(z = F(5)) will have the effect of creating a solver state that represents

Ω' ∧ P ∧ Q ∧ z = F(5)

Note the order in which P and Q are included, namely the order in which these two axioms are declared in the program.

13. Triggers

To facilitate goal-directed instantiations of universal quantifications (and negated existential quantifiers) in the solver, all quantified variables are given triggers (aka matching patterns). These are part of assertions (section 4), function definitions (section 9), forall/exists expressions (section 14), and closures (section 20).

Trigger ::=
  trigger Expr ( "," Expr)*
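
For example (a sketch, reusing the Triple function from section 9.0), the definitional axiom can be given a trigger so that the solver instantiates it only for terms Triple(t) that appear in the proof context:

axiom forall x: int trigger Triple(x) :: Triple(x) = 3 * x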

14. Expressions

Expressions are pretty standard. Noteworthy here is that

15. Variations on statements

15.0. Exit without argument

Common loops in source languages have predefined ways of exiting the loop. In B3, all loop exits use the exit statement (or the return statement). To reduce the number of labels a B3 user has to invent, B3 also allows exit to be used without an argument. It is allowed only in contexts that are enclosed by one or more loops. The meaning of exit is exit ℓ, where ℓ is the (given or B3-generated) label of the closest enclosing loop.

Stmt ::=
  | ...
  | exit [ ℓ ]

15.1. Havoc

To set a variable x of type T to an arbitrary value, one can write

var tmp: T
x := tmp

However, this puts the burden on the B3 user to come up with a new name tmp. Plus, since the assignment to x generates a new incarnation anyway, the resulting solver context will contain two extra names.

To improve on these issues, B3 also supports a special assignment statement, commonly known as the havoc statement, that assigns an arbitrary value.

Stmt ::=
  | ...
  | x := *

It is processed as follows:

Process$\llbracket$ x := * $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let x' be σ.new(x)
  Process$\llbracket$ Cc, σ[x := x'], B, Ω $\rrbracket$

Note that the havoc statement is nondeterministic.
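
A common pattern (a sketch, with a hypothetical variable x) pairs a havoc with an assume to set a variable to an arbitrary value that satisfies a condition:

x := *
assume 0 <= x

After these two statements, the new incarnation of x is constrained only by the assumed condition. This is also, in effect, how the loop rule in section 10 treats the loop's assignment targets.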

15.2. Initializing variable declaration

Stmt ::=
  | ...
  | var x [: Type] := Expr

The statement var x: T := e is a shorthand for

var x: T
x := e

However, the initializing form of the variable declaration is processed as a single statement:

Process$\llbracket$ var x: T := e $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  let x' be σ.new(x)
  Process$\llbracket$ Cc, σ[x := x'], B, Ω.extend(x' = e') $\rrbracket$

15.3. Auto-invariants

A source language typically has more fine-grained types than those provided in B3. For example, one way to translate a source language with classes and subclasses is to use a single type for all references in B3, say type Ref. This means that the more specific type information, say, the fact that a parameter or local variable c: C in the source language is declared to be of a specific class C, is instead encoded as a logical condition, like IsC(c) for some B3 function function IsC(r: Ref): bool.

B3 supports auto-invariants as a way to help maintain such conditions. This affects loops and procedures.

15.3.0. Auto-invariants for loops

Deductive reasoning about a loop considers one arbitrary iteration of the loop, starting from a state where the loop body's assignment targets have arbitrary values satisfying the loop invariant (see section 10). Thus, if c is a variable that is assigned in the loop body, then we would like IsC(c) to be part of the loop invariant. Let's consider how a translation from a source language can arrange to emit such loop invariants.

One way is for the translation to add a loop invariant IsC(c) for every mutable variable in scope. But that would mean a lot of invariants about variables that aren't ever changed by the loop.

Alternatively, the translation from a source language could work out which variables are potentially modified by the loop and emit an invariant IsC(c) only if c may be modified by the loop body. But this means the translator needs to work out the loop's assignment targets, a task that is already provided by B3 (section 10).

To this end, B3 allows variable declarations to include an auto-invariant. B3 adds a variable's auto-invariant to the loop invariant of any loop that assigns to the variable.

15.3.1. Auto-invariants for procedures

Consider an in/out-parameter c of a procedure. To obtain an invariant (like IsC(c)) for c in any loops that modify c, we declare c with an auto-invariant. In addition, we want this condition to be a pre- and postcondition of the procedure. But it would be annoying to have to declare conditions like IsC(c) in three places: in a requires clause, in an ensures clause, and as an auto-invariant. To avoid this redundancy, B3 incorporates parameter auto-invariants into procedure pre- and postconditions as well.

In other words, an auto-invariant plays a role in two places: on procedure boundaries (for parameters), and on loop boundaries (for parameters and local variables assigned in the loop body). In other places, a B3 program is free to temporarily violate the conditions stated in auto-invariants.

15.3.2. Auto-invariant declarations

To support auto-invariants, variable declarations are changed to

Stmt ::=
  | ...
  | var x: Type AutoInvariant*
  | var x [: Type] AutoInvariant* := Expr

and parameter declarations are changed to

CParam ::=
  | [inout | out] x: Type AutoInvariant*

where

AutoInvariant ::=
  | invariant Expr
  | invariant ABlock

The Expr or ABlock argument in an auto-invariant is allowed to mention the variable being declared. In addition, for a local variable, an auto-invariant is allowed to mention any of the variables in scope. For in/out-parameters, the auto-invariant is allowed to mention any in/out-parameter in both its ordinary form and in the form surrounded by old, and any in-parameter. For out-parameters, the auto-invariant is allowed to mention any of the parameters and also any in/out-parameter in its old form.
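
For example (a sketch, reusing the function IsC from the beginning of section 15.3), an in/out-parameter can carry an auto-invariant:

procedure N(inout c: Ref invariant IsC(c))

Per sections 15.3.0 and 15.3.1, IsC(c) then acts as a pre- and postcondition of N and is added to the invariant of any loop in N's body that assigns to c.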

(Note for those familiar with Boogie 2: B3's auto-invariants are similar to Boogie's where clauses, but not exactly.)

15.3.3. Example

The auto-invariant of a variable x is added to the invariant of a loop if x is an assignment target (see section 10.0) of the loop body. The auto-invariant itself may mention additional variables, but whether or not those variables are assignment targets does not play a role in determining whether or not to apply the auto-invariant of x.

For example, consider the following variable declarations:

var x$defined: bool
var x: Ref invariant x$defined ==> IsC(x)

If a loop body assigns to x or to both x and x$defined, then x$defined ==> IsC(x) is added as a loop invariant. But if neither of those two variables is an assignment target or if only x$defined is, then the auto-invariant of x is not added to the loop invariant.

16. Variations on conditional statements

16.0. Omitted else and cascading if

The if statement shown in section 2 deterministically chooses between a then branch and an else branch. B3 supports two familiar syntactic variations of it. Syntactically, the else branch can be omitted, which stands for else { }. Also, if the else branch is a block with just an if statement, then the braces around it can be omitted, which makes for a cascading if statement.

If ::=
  | if Expr Block [ IfTail ]
  | ...
IfTail ::=
  | else Block
  | else If

16.1. If-case statement

The if statement gives two mutually exclusive branches. If the source program wants to prescribe a conditional with more than two branches that it knows to be mutually exclusive, then the if statement introduces unnecessary repetition of conditions. For example, in the cascading if statement

if x = 0 {
  A
} else if x < 0 {
  B
} else {
  C
}

the conditions assumed at the beginning of the second and third branches are ¬(x = 0) ∧ x < 0 and ¬(x = 0) ∧ ¬(x < 0), respectively, whereas just x < 0 and 0 < x would have sufficed.

Moreover, the branches of the if statement are also exhaustive, meaning that execution always continues to some branch. To omit a continuation in an else branch, one needs to explicitly place an assume false in that branch.

To improve on these situations, B3 provides a nondeterministic if-case statement.

Stmt ::=
  | ...
  | IfCase
IfCase ::=
  | if Case Case*
Case ::=
  | case Expr Block

It is processed as follows:

Process$\llbracket$ if case e0 S0 case e1 S1 ... $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e0' be σ[e0]
  Process$\llbracket$ S0 $\cdot$ Cc, σ, B, Ω.extend(e0') $\rrbracket$
  let e1' be σ[e1]
  Process$\llbracket$ S1 $\cdot$ Cc, σ, B, Ω.extend(e1') $\rrbracket$
  ...

Note that the cases need not be mutually exclusive, nor do they need to be exhaustive.
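
For example, the cascading if at the beginning of this section, whose branch conditions the source program knows to be mutually exclusive, can instead be prescribed as

if
case x = 0 {
  A
}
case x < 0 {
  B
}
case 0 < x {
  C
}

where each branch starts by assuming just its own condition (x = 0, x < 0, or 0 < x).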

17. Variations on assertions

Going beyond the simple assertion statements in section 4, B3 offers a rich suite of ways to prescribe checks and learning. These assertions have been tailored around the patterns built up over the years in the Dafny verifier.

Assertion ::=
  | ...
  | ABlock
  | forall x: Type Trigger* ABlock
  | AIf
  | AIfCase
ABlockAssertion ::=
  | ...
  | var x [: Type] := Expr
AIf ::=
  | if Expr ABlock [ AIfTail ]
AIfTail ::=
  | else ABlock
  | else AIf
AIfCase ::=
  | if ACase ACase*
ACase ::=
  | case Expr ABlock

There's syntactic overlap between Assertion and other statements. For example, in the grammar presented,

if x < y {
  var z := x - y
  assert z ≤ 0
}

can be both an If and an AIf. The ambiguity is broken by accepting a statement as an Assertion if possible and as other statements otherwise.

We defined the processing of basic assertions in section 4 and defined the processing of var with an initializer in section 15.2. Here is the definition of Process for the other assertions:

Process$\llbracket$ { Aa } $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let ℓ be a new label
  let L be Learn$\llbracket$ { Aa }, σ $\rrbracket$
  let B' be B[ℓ := (domain of σ, assume L $\cdot$ Cc)]
  Process$\llbracket$ Aa $\cdot$ exit ℓ, σ, B', Ω $\rrbracket$

Process$\llbracket$ forall x: T A $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  Process$\llbracket$ var x: T $\cdot$ A, σ, B, Ω $\rrbracket$
  let L be Learn$\llbracket$ forall x: T A, σ $\rrbracket$
  Process$\llbracket$ Cc, σ, B, Ω.extend(L) $\rrbracket$

Process$\llbracket$ if e A0 else A1 $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e' be σ[e]
  Process$\llbracket$ A0, σ, B, Ω.extend(e') $\rrbracket$
  Process$\llbracket$ A1, σ, B, Ω.extend(¬e') $\rrbracket$
  let L be Learn$\llbracket$ if e A0 else A1, σ $\rrbracket$
  Process$\llbracket$ Cc, σ, B, Ω.extend(L) $\rrbracket$

Process$\llbracket$ if case e0 A0 case e1 A1 ... $\cdot$ Cc, σ, B, Ω $\rrbracket$ =
  let e0' be σ[e0]
  Process$\llbracket$ A0, σ, B, Ω.extend(e0') $\rrbracket$
  let e1' be σ[e1]
  Process$\llbracket$ A1, σ, B, Ω.extend(e1') $\rrbracket$
  ...
  let L be Learn$\llbracket$ if case e0 A0 case e1 A1 ..., σ $\rrbracket$
  Process$\llbracket$ Cc, σ, B, Ω.extend(L) $\rrbracket$

Here is the definition of Learn for the new assertions:

Learn$\llbracket$ { Aa }, σ $\rrbracket$ =
  LearnSeq$\llbracket$ Aa, σ $\rrbracket$

LearnSeq$\llbracket$ $\varepsilon$, σ $\rrbracket$ =
  true

LearnSeq$\llbracket$ var x: T := e $\cdot$ Aa, σ $\rrbracket$ =
  let e' be σ[e]
  let x' be σ.new(x)
  let L be LearnSeq$\llbracket$ Aa, σ[x := x'] $\rrbracket$
  x' = e' ∧ L

LearnSeq$\llbracket$ A $\cdot$ Aa, σ $\rrbracket$ =
  let L be Learn$\llbracket$ A, σ $\rrbracket$
  let L' be LearnSeq$\llbracket$ Aa, σ $\rrbracket$
  L ∧ L'

Learn$\llbracket$ forall x: T A, σ $\rrbracket$ =
  let x' be σ.new(x)
  let σ' be σ[x := x']
  let L be Learn$\llbracket$ A, σ' $\rrbracket$
  forall x': T :: L

Learn$\llbracket$ if e A0 else A1, σ $\rrbracket$ =
  let L0 be Learn$\llbracket$ A0, σ $\rrbracket$
  let L1 be Learn$\llbracket$ A1, σ $\rrbracket$
  let e' be σ[e]
  (e' ⇒ L0) ∧ (¬e' ⇒ L1)

Learn$\llbracket$ if case e0 A0 case e1 A1 ..., σ $\rrbracket$ =
  let e0' be σ[e0]
  let L0 be Learn$\llbracket$ A0, σ $\rrbracket$
  let e1' be σ[e1]
  let L1 be Learn$\llbracket$ A1, σ $\rrbracket$
  ...
  (e0' ⇒ L0) ∧ (e1' ⇒ L1) ∧ ...

17.0. Example

As an example, consider a source-language statement y := forall x: X :: P(x) && Q(x) where && is a short-circuit operator. If the source language has a notion of well-formedness that needs to be checked, then this statement can succinctly be translated into B3 as follows:

forall x: X {
  check PPre(x)
  assume PCanCall(x)
  if P(x) {
    check QPre(x)
    assume QCanCall(x)
  }
}
y := forall x: X :: P(x) ∧ Q(x)

This leads to the following interactions with the solver. (Using the definitions above, there will be three variables in the processed form and all of them will have the name x. This leads to no confusion for machine processing, since two of the variables are bound variables that shadow the outer variables. To more clearly show these as three different variables in the following, the example instead uses the names x%0, x%1, and x%2.)

// process the assertions
Ω.prove(PPre(x%0))
Ω' := Ω.extend(PCanCall(x%0))
Ω" := Ω.extend(P(x%0))
Ω".prove(QPre(x%0))
Ω3 := Ω".extend(QCanCall(x%0))

// learn from the assertions and continue processing
Ω4 := Ω.extend(forall x%1: X :: PCanCall(x%1) && (P(x%1) ==> QCanCall(x%1)))
Process$\llbracket$ y := forall x%2: X :: P(x%2) ∧ Q(x%2), ..., ..., Ω4 $\rrbracket$

Note that the well-formedness checking of the short-circuit boolean operator includes control flow (namely, the if statement that is part of the assertion block). However, what is learnt from this check (as assigned to Ω4) does not involve control flow. So, the well-formedness checking does not force an exponential explosion of control paths (unlike how Dafny is encoded into Boogie 2 today).

18. Variations on types

18.0. Type synonyms

It is sometimes convenient to have more than one name for a type. A type synonym provides that facility.

TopLevelDeclaration ::=
  | type T = Type

A type synonym is exactly that: a synonym. There is no difference in using the name of the synonym or using the name of the type it stands for.

Declarations of type synonyms are not allowed to be cyclic. For example, the declarations

type A = B
type B = A

are illegal, as is the declaration

type D = D

18.1. Partitions

A partition is a separation of the values of a type into non-overlapping subsets. A partition can be used in the declaration of a function, as explained in section 19.1.

TopLevelDeclaration ::=
  | ...
  | partition T [ = Type ]

If the right-hand side type of a partition is omitted, the declaration also declares the underlying type. For example,

partition Nationality

is like the declarations

type t$Nationality
partition p$Nationality = t$Nationality

except that the name Nationality is used for both. If the name is used in a place where a partition is allowed, then the name refers to the partition; otherwise, it refers to the type.

19. Variations on functions

This section defines some variations on the declaration of functions (see section 9).

19.0. Injective parameters

By using the modifier injective for a function parameter, one declares that the function is injective (that is, one-to-one) in that argument.

FParamModifier ::=
  | injective

This automatically generates an inverse function together with (effectively) an axiom that states the injectivity property. For example,

type Color
function MakeColor(injective colorId: int, injective isSaturated: bool): Color

declares three functions and two axioms. The two additional functions and axioms are

function colorId(c: Color): int
function isSaturated(c: Color): bool

axiom explains MakeColor explains colorId
  forall colorId: int, isSaturated: bool trigger MakeColor(colorId, isSaturated) ::
    colorId(MakeColor(colorId, isSaturated)) = colorId
axiom explains MakeColor explains isSaturated
  forall colorId: int, isSaturated: bool trigger MakeColor(colorId, isSaturated) ::
    isSaturated(MakeColor(colorId, isSaturated)) = isSaturated

These inverse functions are available for use in the same way that MakeColor is.

19.1. Using partition as function result

A function's result can be declared to be either a type or a partition (see TypeOrPartition in section 9, and see section 18.1 on partitions). If the function's result is declared to be a partition, then the declaration says that the values returned by the function uniquely determine the function. For example, the declarations

type Person
partition Nationality = Person
function German(x: int): Nationality
function Italian(y: int, z: int): Nationality

say that the functions German and Italian return values from disjoint subsets of type Person. That is, these two function declarations effectively give rise to the declaration

axiom forall x: int, y: int, z: int :: German(x) ≠ Italian(y, z)

The use of a partition as a function result is convenient since it saves the B3 user from having to declare quadratically many axioms like the one above or from having to declare some kind of tag function and arbitrarily assign distinct tag values.

Note, the use of a partition result does not imply anything about the exhaustiveness of values. For example, the declarations above allow the existence of a Person who is neither German nor Italian.

Also, the use of a partition result does not imply anything about injectivity. For example, the declarations above allow the possibility that German(17) and German(19) refer to the same Person.

By combining partition results and injective parameters, one can declare algebraic datatypes. For example,

partition List
function Nil(): List
function Cons(head: int, tail: List): List

sets up Nil and Cons as the common constructors of an algebraic List type. These declarations do not imply exhaustiveness; that is, they do not prevent the possibility that some List value is neither Nil() nor some Cons(h, t). The declarations also do not say which fixpoint is intended; for example, they allow the possibility of the usual inductive definition of List (where every list contains a finite number of elements) as well as a coinductive definition of List (where a list can be cyclic or infinite).

19.2. Additional bound variables for function definitions

For triggering purposes (see section 13 on triggers above), it can be desirable to obtain an axiom that quantifies over more variables than the function parameters. For example, one might want the definition of a function Elvis to be triggered only if it is used as an argument to another function JukeBox. The desired axiom is then

axiom forall age: int, coin: int trigger JukeBox(Elvis(age), coin) ::
  10 <= age && age <= 42 && 0 <= coin
  ==>
  Elvis(age) = FavoriteSong(1935 + age)

A first attempt at obtaining this axiom is

function Elvis(age: int)
  requires 10 <= age && age <= 42
  requires exists coin: int trigger JukeBox(Elvis(age), coin) :: 0 <= coin
{
  FavoriteSong(1935 + age)
}

This gives the same logical meaning as the desired axiom above. However, the triggers are different:

axiom forall age: int trigger Elvis(age) ::
  10 <= age && age <= 42 &&
  (exists coin: int trigger JukeBox(Elvis(age), coin) :: 0 <= coin)
  ==>
  Elvis(age) = FavoriteSong(1935 + age)

This axiom does not achieve the intended purpose of limiting its instantiations to uses of Elvis as an argument to JukeBox.

To provide a way to obtain the desired axiom, B3 allows a function to have a forall clause whose purpose it is to introduce further bound variables in the generated function-definition axiom. Using it, the Elvis function can be defined as follows:

function Elvis(age: int) forall coin: int
  trigger JukeBox(Elvis(age), coin)
  requires 10 <= age && age <= 42 && 0 <= coin
{
  FavoriteSong(1935 + age)
}

Any additional bound variables are allowed only in the preconditions, not in the body of the function.

Here is another example:

function FCanCall(x: int, fuel: Fuel): bool
function F(x: int): int forall fuel: Fuel
  trigger FCanCall(x, Succ(fuel))
  requires FCanCall(x, Succ(fuel))
{
  if x = 0 then 3 else 2 * F(x - 1) + 5
}

which in effect generates the axiom

axiom forall x: int, fuel: Fuel trigger FCanCall(x, Succ(fuel)) ::
  FCanCall(x, Succ(fuel))
  ==>
  F(x) = if x = 0 then 3 else 2 * F(x - 1) + 5

20. Closures

The translation of some source expressions into B3 follows a recipe like

translate this source expression into a B3 expression that we'll call s and lay down some axioms about the properties of s

Usually, the source expression has some free variables. In order to write an axiom about those, it is necessary to parameterize s, which involves some effort.

Here's a simple, concrete example. Suppose the source-language expressions E0, E1, and E2 of type int are translated into B3 as the expressions e0, e1, and e2. Then, a way to translate the sequence display expression [E0, E1, E2] is F(e0, e1, e2) where F is a new function

function F(x0: int, x1: int, x2: int): Sequence

with the properties defined via these axioms:

axiom forall x0: int, x1: int, x2: int trigger SeqLength(F(x0, x1, x2)) ::
  SeqLength(F(x0, x1, x2)) = 3
axiom forall x0: int, x1: int, x2: int trigger F(x0, x1, x2) ::
  SeqIndex(F(x0, x1, x2), 0) = Box(x0)
axiom forall x0: int, x1: int, x2: int trigger F(x0, x1, x2) ::
  SeqIndex(F(x0, x1, x2), 1) = Box(x1)
axiom forall x0: int, x1: int, x2: int trigger F(x0, x1, x2) ::
  SeqIndex(F(x0, x1, x2), 2) = Box(x2)

For functions, this process is known as lambda lifting. Using a closure, B3 performs the heavy lifting (pun intended) in creating such liftings.

Closure ::=
  | lift ClosureBinding*, into ClosureVar : Type by "{" ClosureProperty*, "}"
ClosureBinding ::=
  | ClosureVar [ "(" CParam*, ")" ] := Expr
CParam ::=
  | x: Type
ClosureProperty ::=
  | [ Trigger Trigger* :: ] Expr

where ClosureVar is a variable whose name starts with % to visually indicate that the variable is like a macro that denotes an expression.

Before giving the details of how to elaborate a closure into logic, it is instructive to look at some examples.

20.0. Example: display expression

Continuing the example above, to obtain the function F, the expression F(e0, e1, e2), and the associated axioms for the sequence expression above, one uses the B3 closure

lift
  %x0 := e0,
  %x1 := e1,
  %x2 := e2
into
  %s: Sequence
by {
  trigger SeqLength(%s) :: SeqLength(%s) = 3,
  SeqIndex(%s, 0) = Box(%x0),
  SeqIndex(%s, 1) = Box(%x1),
  SeqIndex(%s, 2) = Box(%x2)
}

20.1. Example: sequence constructor

Suppose, in the source language, that expression N denotes a desired sequence length, expression E denotes a function from indices to elements of type real, and that seq(N, E) denotes a sequence constructed to have length N and elements described by E. If N and E are translated into B3 expressions n and e (of types int and Func, say), then the sequence constructor can be translated into the following closure:

lift
  %len := n,
  %f := e
into
  %s: Sequence
by {
  trigger SeqLength(%s) :: 0 <= %len ==> SeqLength(%s) = %len,
  forall j: int trigger SeqIndex(%s, j) ::
    0 <= j && j < %len ==> SeqIndex(%s, j) = Apply(%f, Box(j))
}

B3 turns this closure into

function F(len: int, f: Func): Sequence
axiom forall len: int, f: Func trigger SeqLength(F(len, f)) ::
  0 <= len ==> SeqLength(F(len, f)) = len
axiom forall len: int, f: Func, j: int trigger SeqIndex(F(len, f), j) ::
  0 <= j && j < len ==> SeqIndex(F(len, f), j) = Apply(f, Box(j))

for some new function F. If further axiomatization says that SeqLength always returns a non-negative integer, then it is important to remember the antecedents 0 <= %len in the closure, or else the resulting axiomatization would be unsound.

20.2. Example: set comprehension

Consider a source-language set comprehension set x | Constraint (representing "the set of every x satisfying Constraint"), where x has type int and Constraint (which may mention x and other free variables) translates into B3 as constraint. The set comprehension can then be translated using the following closure:

lift
  %body(x: int) := constraint
into
  %s: Set
by {
  forall bx: Box trigger SetContains(%s, bx) ::
    SetContains(%s, bx) <==> %body(Unbox(bx))
}

Suppose constraint is the expression Abs(x - e0) < e1. Then, B3 turns the closure into

function F(body0: int, body1: int): Set
axiom forall body0: int, body1: int, bx: Box
  trigger SetContains(F(body0, body1), bx) ::
  SetContains(F(body0, body1), bx) <==> Abs(Unbox(bx) - body0) < body1

for some new function F. Note how the closure automatically decomposes Abs(x - e0) < e1 into pieces that do not mention x.
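Note also what happens at the use site: following the Lifting step described in section 20.4 below, the set comprehension itself is replaced by the application

F(e0, e1)

so the carved-out pieces become the arguments of the new function.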

20.3. Example: sequence constructor with an initializing lambda expression

As a variation of the sequence constructor above, a translation may want to optimize the case where the expression E is a lambda expression, like i => Body. If N and Body translate into n and body, then a translation of seq(N, i => Body) into a closure is

lift
  %len := n,
  %elements(i: int) := body
into
  %s: Sequence
by {
  trigger SeqLength(%s) :: 0 <= %len ==> SeqLength(%s) = %len,
  forall j: int trigger SeqIndex(%s, j) ::
    0 <= j && j < %len ==> SeqIndex(%s, j) = Box(%elements(j))
}

Suppose body is Abs(i - e0) + e1. Then, B3 turns this closure into:

function F(len: int, elements0: int, elements1: int): Sequence
axiom forall len: int, elements0: int, elements1: int
  trigger SeqLength(F(len, elements0, elements1)) ::
  0 <= len ==> SeqLength(F(len, elements0, elements1)) = len
axiom forall len: int, elements0: int, elements1: int, j: int
  trigger SeqIndex(F(len, elements0, elements1), j) ::
  0 <= j && j < len ==>
    SeqIndex(F(len, elements0, elements1), j) = Box(Abs(j - elements0) + elements1)
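At the use site, this closure is likewise replaced by the lifted application

F(n, e0, e1)

where n fills the hole of the parameterless binding %len, and e0 and e1 fill the holes carved out of body.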

20.4. Elaboration

It's time for the full details of elaboration. As a concrete representative of a general closure, consider

lift %b(x: X) := e into %s: S by { p }

Function name

Elaboration first creates a name for the resulting function, say F. This function's result type is the given type S. The parameters of F are going to be determined as part of processing the closure bindings.

Holes

Then, for each binding %b(x: X) := e, elaboration finds non-overlapping subexpressions e0, e1, ... of e such that x is not a free variable of any ei and every occurrence of a non-x variable in e is contained in one of the ei. In the literature, the places where these ei appear in e can be thought of as holes, often denoted by $\square$.

For example, suppose e is the expression 1 + y + Abs(x - G(u)). There are several ways to carve out holes for this expression. One way is

1 + $\square$ + Abs(x - G($\square$))

The subexpressions removed to make these holes are y and u, each one a non-x free variable of the expression. Another way to carve out holes is

$\square$ + Abs(x - $\square$)

Here, the subexpressions removed to make these holes are 1 + y and G(u), and the strategy used to define these holes is to make each hole as large as possible (but not so large as to include an occurrence of x). Note that if the closure binding doesn't have any parameters (in other words, it is just %b rather than %b(x: X)), then the as-large-as-possible strategy can always carve out the entire expression as a hole. (This is what happens for the binding %len := n in the sequence-constructor example: all of n becomes a single hole.)

Any way of defining the holes works. For the purpose of continuing the description of elaboration, we'll use the as-large-as-possible strategy.

Function parameters

For each hole, introduce a parameter to the closure function. One way to do this is to use the name of the closure variable (in the example, %b) if there's just one hole, and otherwise to append %i to this name, where i is the number of holes preceding it. (In the motivating examples above, the names used were these, but without the %'s.)

For $\square$ + Abs(x - $\square$), we thus introduce parameters %b%0 and %b%1. The type of each of these parameters is the type of the expression removed to make the hole (in the example, int and int). Note that these are just variable names, not functions (even though the closure declares %b(x: X) to look like a function).

Closure function

The “Holes” and “Function parameters” steps above are repeated for all closure bindings. Concatenating all the function parameters, the closure function is declared. In the example, it is

function F(%b%0: int, %b%1: int): S

Lifting

The closure is now replaced by the closure function applied to the subexpressions that were carved out for each hole. In our example, this application is

F(1 + y, G(u))

Lifting axioms

The final step in elaborating the closure is to generate axioms, one for each given closure property.

In the closure property, replace the closure variable %s by the closure function applied to its formal parameters (or, more precisely, to new bound variables that correspond to the function's parameters). In the example, %s is thus replaced by the function application F(%b%0, %b%1).

Next, replace each application %b(e0) of a closure binding by the corresponding expression with holes, where each hole is replaced by the corresponding function parameter (again, more precisely, by the corresponding bound variable) and each occurrence of the binding parameter (in the example, x) is replaced with e0. In the example, %b(e0) is thus replaced by the expression %b%0 + Abs(e0 - %b%1).

Finally, wrap a universal quantifier around the resulting closure property. The quantifier has the function parameters as its bound variables (these are the bound variables that were alluded to by “more precisely” in the previous two paragraphs). The trigger for the quantifier is the application of the closure function, that is, the expansion of %s.
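To make these replacement steps concrete, suppose, as a hypothetical example, that the closure property p is the expression P(%s, %b(7)), where P is some given function from S and int to bool. Elaboration then generates the axiom

axiom forall %b%0: int, %b%1: int trigger F(%b%0, %b%1) ::
  P(F(%b%0, %b%1), %b%0 + Abs(7 - %b%1))

in which %s has been replaced by F(%b%0, %b%1), the application %b(7) has been replaced by %b%0 + Abs(7 - %b%1), and the trigger is the expansion of %s.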

There are two plot twists involved in wrapping the universal quantifier as just described. One plot twist is that if the closure property has the form trigger tr ... :: p0, then the generated quantifier uses the specified triggers instead of the default trigger (the expansion of %s). The other plot twist is that if the closure property is a universal quantifier, say Q, by itself, then the generated quantifier is merged with Q and the triggers used are those given in Q.

Note that, since the bound variables of the universal quantifier are generated automatically during the elaboration of a closure, the only way to mention those variables in a trigger is to make use of %s.

21. Macros

Some expressions and statement lists recur in common patterns. To make B3 programs simpler to generate and to read, B3 provides macros. The macros come in three forms: one for expressions, one for statements, and one for the special case of the statements being assertions:

TopLevelDeclaration ::=
  | ...
  | define Identifier "(" Identifier*, ")" Expr
  | define Identifier "(" Identifier*, ")" Block
  | define Identifier "(" Identifier*, ")" ABlock

Following Viper, the parameters are given names, but no types. Typing is done after the macros have been elaborated. Macros are not allowed to be recursive.

Since the macro parameters have no types, it is not possible to type check the bodies of the macros independently of their usage sites. However, two things are done with the variable names used inside the macro body. One is that the free names in the macro body must be parameters of the macro. The other is that all names that are bound inside the macro body are uniquely renamed. This automatic renaming can be done by prefixing each bound variable name with m%, where m is the name of the macro.
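As an illustration (these particular macros are made up for the example, not part of any B3 prelude), here are an expression macro and a macro whose body binds a variable:

define within(lo, x, hi) lo <= x && x <= hi

define allZero(s, n)
  forall j: int trigger SeqIndex(s, j) ::
    0 <= j && j < n ==> SeqIndex(s, j) = Box(0)

A use such as within(0, i, 10) simply elaborates to 0 <= i && i <= 10. In each use of allZero, the bound variable j is renamed to allZero%j, so that arguments passed for s and n cannot be captured by it.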

22. Attributes

All top-level declarations and some statements can be marked with custom attributes. These are not given any semantics by the B3 language but are useful for third-party tools that operate on B3 programs.