Opaque Types: Refined Vectors

#![allow(unused)]
extern crate flux_rs;
use flux_rs::attrs::*;

#[flux_rs::spec(fn (bool[true]))]
fn assert(b: bool) {
    if !b {
        panic!("assertion failed");
    }
}

Online demo

While rustc has a keen eye for spotting nasty bugs at compile time, it is not omniscient. We've all groaned in dismay at seeing deployed code crash with messages like

panicked at 'index out of bounds: the len is ... but the index is ...'

Next, lets see how flux's refinement and ownership mechanisms let us write a refined vector API whose types track vector sizes and ensure --- at compile time --- that vector accesses cannot fail at runtime.

Refining Vectors ...

To track sizes, lets define a struct that is just a wrapper around the std::vec::Vec type, but with a refinement index that tracks the size of the vector.

#[opaque]
#[refined_by(len: int)]
pub struct RVec<T> {
    inner: Vec<T>,
}

... to Track their Size

As with other structs we're using refined_by to index the RVec with an int value (that will represent the vector's length.)

The idea is that

RVec<i32>[10] represents a vector of i32 size 10, and
RVec<bool>{v:0 < v} represents a non-empty vector of bool, and
RVec<RVec<f32>[n]>[m] represents a vector of vectors of f32 of size m and each of whose elements is a vector of size n.

... but Opaquely

The opaque attribute tells flux that we're not going to directly connect the len to any of the RVec's fields' values.

This is quite unlike Positivei32 example where the index held the actual value of the field, or the [Timer example)(./04-enums.md) where the index held the value of the countdown.

Instead, with an opaque struct the idea is that the value of the index will be tracked solely by the API for that struct.

Next, lets see how to build such an API for creating and manipulating RVec, where the length is precisely tracked in the index.

Creating Vectors

I suppose one must start with nothing: the empty vector.

#[trusted]
impl<T> RVec<T> {
    #[spec(fn() -> RVec<T>[0])]
    pub fn new() -> Self {
        Self { inner: Vec::new() }
    }
}

The above implements RVec::new as a wrapper around Vec::new. The #[trusted] attribute tells flux to not check this code, i.e. to simply trust that the specification is correct. Indeed, flux cannot check this code.

If you remove the #trusted (do it!) then flux will complain that you cannot access the inner field of the opaque struct!

So the only way to use an RVec is to define a "trusted" API, and then use that in client code, where for example, callers of RVec::new get back an RVec indexed with 0 : the empty vector.

Pushing Values

An empty vector is a rather desolate thing.

To be of any use, we need to be able to push values into it, like so

#[trusted]
impl<T> RVec<T> {
    #[spec(fn(self: &mut RVec<T>[@n], T) ensures self: RVec<T>[n+1])]
    pub fn push(&mut self, item: T) {
        self.inner.push(item);
    }
}

The refined type for push says that it takes a updatable reference to an RVec<T> of size n and, a value T and ensures that upon return, the size of self is increased by 1.

Creating a Vector with `push`

Lets test that the types are in fact tracking sizes.

#[spec(fn () -> RVec<i32>[3])]
fn test_push() -> RVec<i32> {
    let mut v = RVec::new(); // v: RVec<i32>[0]
    v.push(1);               // v: RVec<i32>[1]
    v.push(2);               // v: RVec<i32>[2]
    v.push(3);               // v: RVec<i32>[3]
    v
}

EXERCISE: Can you correctly implement the code for zeros so that it typechecks?

#[spec(fn(n: usize) -> RVec<i32>[n])]
fn zeros(n:usize) -> RVec<i32> {
    let mut v = RVec::new(); // v: RVec<i32>[0]
    let mut i = 0;
    while i <= n {
        v.push(0);           // v: RVec<i32>[i]
        i += 1;
    }
    v
}

Popping Values

Not much point stuffing things into a vector if we can't get them out again. For that, we might implement a pop method that returns the last element of the vector.

Aha, but what if the vector is empty? You could return an Option<T> or since we're tracking sizes, we could require that pop only be called with non-empty vectors.

#[trusted]
impl<T> RVec<T> {
    #[spec(fn(self: &mut {RVec<T>[@n] | 0 < n}) -> T
           ensures self: RVec<T>[n-1])]
    pub fn pop(&mut self) -> T {
      self.inner.pop().unwrap()
    }
}

Note that unlike push which works for any RVec<T>[@n], the pop method requires that 0 < n i.e. that the vector is not empty.

Using the `push/pop` API

Now already flux can start checking some code, for example if you push two elements, then you can pop twice, but flux will reject the third pop at compile-time

fn test_push_pop() {
    let mut vec = RVec::new();   // vec: RVec<i32>[0]
    vec.push(10);                // vec: RVec<i32>[1]
    vec.push(20);                // vec: RVec<i32>[2]
    vec.pop();                   // vec: RVec<i32>[1]
    vec.pop();                   // vec: RVec<i32>[0]
    vec.pop();                   // rejected!
}

In fact, the error message from flux will point to exact condition that does not hold

#![allow(unused)]
fn main() {
error[FLUX]: precondition might not hold
   |
24 |     v.pop();
   |     ^^^^^^^ call site
   |
   = note: a precondition cannot be proved at this call site
note: this is the condition that cannot be proved
   |
78 |     #[spec(fn(self: &mut{RVec<T>[@n] | 0 < n}) -> T
   |                                        ^^^^^
}

Querying the Size

Perhaps we should peek at the size of the vector to make sure its not empty before we pop it.

We can do that by writing a len method that returns a usize corresponding to (and hence, by indexed by) the size of the input vector

#[flux_rs::trusted]
impl<T> RVec<T> {
    #[spec(fn(&RVec<T>[@vec]) -> usize)]
    pub fn len(&self) -> usize {
        self.inner.len()
    }
}

EXERCISE Can you fix the spec for len so that the code below verifies, i.e. so that flux "knows" that

after two pushes, the value returned by .len() is exactly 2, and
after two pops the size is 0 again.

fn test_len() {
    let mut vec = RVec::new();
    vec.push(10);
    vec.push(20);
    assert(vec.len() == 2);
    vec.pop();
    vec.pop();
    assert(vec.len() == 0);
}

Random Access

Of course, vectors are not just stacks, they also allow random access to their elements which is where those pesky panics occur, and where the refined vector API gets rather useful.

Now that we're tracking sizes, we can require that the method to get an element only be called with a valid index less than the vector's size

impl<T> RVec<T> {
    #[spec(fn(&RVec<T>[@n], i: usize{i < n}) -> &T)]
    pub fn get(&self, i: usize) -> &T {
        &self.inner[i]
    }

    #[spec(fn(&mut RVec<T>[@n], i: usize{i < n}) -> &mut T)]
    pub fn get_mut(&mut self, i: usize) -> &mut T {
        &mut self.inner[i]
    }
}

Summing the Elements of an `RVec`

EXERCISE Can you spot and fix the off-by-one error in the code below which loops over the elements of an RVec and sums them up? ¹

fn sum_vec(vec: &RVec<i32>) -> i32 {
    let mut res = 0;
    let mut i = 0;
    while i <= vec.len() {
        res += vec.get(i);
        i += 1;
    }
    sum
}

Using the `Index` trait

Its a bit of an eyesore to to use get and get_mut directly.

Instead lets implement the Index and IndexMut traits for RVec which allows us to use the [..] operator to access elements

impl<T> std::ops::Index<usize> for RVec<T> {
    type Output = T;
    #[spec(fn(&RVec<T>[@n], i:usize{i < n}) -> &T)]
    fn index(&self, index: usize) -> &T {
        self.get(index)
    }
}

impl<T> std::ops::IndexMut<usize> for RVec<T> {
    #[spec(fn(&mut RVec<T>[@n], i:usize{i < n}) -> &mut T)]
    fn index_mut(&mut self, index: usize) -> &mut T {
        self.get_mut(index)
    }
}

Summing Nicely

Now the above vec_sum example looks a little nicer

fn sum_vec_fixed(vec: &RVec<i32>) -> i32 {
    let mut res = 0;
    let mut i = 0;
    while i < vec.len() {
        res += vec[i]
        i += 1;
    }
    sum
}

Memoization

Lets put the whole API to work in this "memoized" version of the fibonacci function which uses a vector to store the results of previous calls

pub fn fib(n: usize) -> i32 {
    let mut r = RVec::new();
    let mut i = 0;
    while i < n {
        if i == 0 {
            r.push(0);
        } else if i == 1 {
            r.push(1);
        } else {
            let a = r[i - 1];
            let b = r[i - 2];
            r.push(a + b);
        }
        i += 1;
    }
    r.pop()
}

EXERCISE Flux is unhappy with the pop at the end of the function which returns the last value as the result: it thinks the vector may be empty and so the pop call may fail ... Can you spot and fix the problem?

#![allow(unused)]
fn main() {
error[FLUX]: precondition might not hold
  --> src/vectors.rs:40:5
   |
40 |     r.pop()
   |     ^^^^^^^
}

Binary Search

As a last example, lets look at a simplified version of the binary_search method from std::vec, into which we've snuck a tiny little bug

pub fn binary_search(vec: &RVec<i32>, x: i32) -> Result<usize, usize> {
    let mut size = vec.len();
    let mut left = 0;
    let mut right = size;
    while left <= right {
        let mid = left + size / 2;
        let val = vec[mid];
        if val < x {
            left = mid + 1;
        } else if x < val {
            right = mid;
        } else {
            return Ok(mid);
        }
        size = right - left;
    }
    Err(left)
}

Flux complains in two places

#![allow(unused)]
fn main() {
error[FLUX]: precondition might not hold
   --> src/vectors.rs:152:19
    |
152 |         let val = vec[mid];
    |                   ^^^^^^^^ call site
    |
    = note: a precondition cannot be proved at this call site
note: this is the condition that cannot be proved
   --> src/rvec.rs:189:44
    |
189 |     #[spec(fn(&RVec<T>[@n], usize{v : v < n}) -> &T)]
    |                                       ^^^^^

error[FLUX]: arithmetic operation may overflow
   --> src/vectors.rs:160:9
    |
160 |         size = right - left;
    |         ^^^^^^^^^^^^^^^^^^^
}

The vector access may be unsafe as mid could be out of bounds!
The size variable may underflow as left may exceed right!

EXERCISE Can you the spot off-by-one and figure out a fix?

Summary

Well then. We just saw how Flux's index and constraint mechanisms combine with Rust's ownership to let us write a refined vector API that ensures the safety of all accesses at compile time.

These mechanisms are compositional : we can use standard type machinery to build up compound structures and APIs from simple ones, as we will see when we use RVec to implment sparse matrices and a small neural network library.

Why not use an iterator? We'll get there in due course! ↩

Keyboard shortcuts

The Flux Book