Unfolding the universe of possibilities..

Whispers from the digital wind, hang tight..

Nine Rules to Formally Validate Rust Algorithms with Dafny (Part 2)

Lessons from Verifying the range-set-blaze Crate

By Carl M. Kadie and Divyanshu Ranjan

This is Part 2 of an article formally verifying a Rust algorithm using Dafny. We look at rules 7 to 9:

7. Port your Real Algorithm to Dafny.8. Validate the Dafny Version of Your Algorithm.9. Rework Your Validation for Reliability.

See Part 1 for rules 1 to 6:

Don’t Learn Dafny.Learn Dafny.Define Your Algorithm’s Basic Concepts.Specify Your Algorithm.Get Help from the Dafny Community.Validate a Different, Easier, Algorithm.

The rules are informed by our experience validating an algorithm from range-set-blaze, a Rust crate for manipulating sets of “clumpy” integers.

Recall that Rule 6, from Part 1, shows that we can verify an algorithm for InternalAdd, but it is not the algorithm used in the Rust crate. We turn to that algorithm next.

Rule 7: Port your Real Algorithm to Dafny.

Here is the Rust function of interest with some code excluded for now:

// https://stackoverflow.com/questions/49599833/how-to-find-next-smaller-key-in-btreemap-btreeset
// https://stackoverflow.com/questions/35663342/how-to-modify-partially-remove-a-range-from-a-btreemap
fn internal_add(&mut self, range: RangeInclusive<T>) {
let (start, end) = range.clone().into_inner();
assert!(
end <= T::safe_max_value(),
“end must be <= T::safe_max_value()”
);
if end < start {
return;
}
//… code continues …
}

And here is the start of the Dafny port:

method InternalAdd(xs: seq<NeIntRange>, range: IntRange) returns (r: seq<NeIntRange>)
requires ValidSeq(xs)
ensures ValidSeq(r)
ensures SeqToSet(r) == SeqToSet(xs) + RangeToSet(range)
{
var (start, end) := range;
if end < start {
r := xs;
return;
}
//… code continues …
}

Some points of possible interest:

The Rust code uses self and object-oriented-like encapsulation. Dafny supports this coding style, but — for simplicity — I don’t use it here. Specifically, the Rust code mutates self. I chose to write the Dafny code more functionally — it takes an immutable sequence and returns a new immutable sequence.The Rust code manages memory with the borrow checker. This leads to expressions such as range.clone(). Dafny manages memory with a garbage collector. In either case, memory safety will be taken care. We, therefore, ignore it in this validation.The Rust code is generic over T which I elsewhere define to include all the standard Rust integer types, for example, u8, isize, i128. The Dafny code is defined on int, a single type that represents integers of arbitrary size. This means this Dafny port need not check for integer overflows. [See a previous article for formally proving overflow safety with the Kani Rust verifier.]The Rust code includes a run-time assert! that is needed in Rust to forbid one special case: inserting u128::max_value into a RangeSetBlaze<u128>. Because Dafny uses the arbitrary-sized int, it ignores this special case.Aside: What is the length of the Rust inclusive range 0..=u128::max_value? The answer is u128::max_value+1, a value too large to represent with any standard Rust integer type. The range-set-blaze crate limits ranges to 0..=u128::max_value-1, so that lengths can be represented with a u128.

Let’s next consider the rest of the internal_add algorithm. Recall that we have some list of sorted disjoint ranges and some non-empty new range that we want to insert. For example

Credit: This and following figures by author.

The algorithm has us find which (if any) existing range is before (or at) the start of the new range. Call this the “before” range. We then consider four cases:

Case 1: The new range doesn’t touch its before range, so we insert the new range while checking to see if it touches any other ranges.Case 2: The new range touches the before range and extends beyond it, so we extend the end of the before range while checking to see if it touches any other ranges. (When no other ranges are touched, this will be very fast.)Case 3: The new range touches the before range but does not extend beyond it, so do nothing. (This will always be very, very fast.)Case 4: The new range starts before any range, so add it while checking to see if it touches any other ranges.

Here is the algorithm in Rust:

// code continued …
// FUTURE: would be nice of BTreeMap to have a partition_point function that returns two iterators
let mut before = self.btree_map.range_mut(..=start).rev();
if let Some((start_before, end_before)) = before.next() {
// Must check this in two parts to avoid overflow
if match (*end_before).checked_add(&T::one()) {
Some(end_before_succ) => end_before_succ < start,
None => false,
} {
self.internal_add2(&range);
} else if *end_before < end {
self.len += T::safe_len(&(*end_before..=end – T::one()));
*end_before = end;
let start_before = *start_before;
self.delete_extra(&(start_before..=end));
} else {
// completely contained, so do nothing
}
} else {
self.internal_add2(&range);
}
}

And here it is in Dafny:

// code continued …
var beforeHi := IndexAtOrBeforePlusOne(xs, start);
if beforeHi > 0 { // does not go at front
var (startBefore, endBefore) := xs[beforeHi-1];
if endBefore+1 < start {
r := InternalAdd2(xs, range);
} else if endBefore < end {
r := xs[..beforeHi-1] + [(startBefore, end)] + xs[beforeHi..];
assume exists i: nat :: i < |r| && r[i] == (startBefore,end) && ValidSeq(r[..i+1]) && ValidSeq(r[i+1..]);
r := DeleteExtra(r, (startBefore,end));
} else{
r := xs;
}
}
else // goes at front
{
r := InternalAdd2(xs, range);
}
}

Some points of possible interest:

The Rust code manipulates a BTreeMap via keys and values. The Dafny code manipulates a sorted seq with (random-access) indexes. I made the Dafny operations mirror the Rust operations although this makes the Dafny code less natural.The Rust code also updates self.len, the number of individual integers in the RangeSetBlaze. The Dafny code ignores this. (Updating len is a feature that could be added to the Dafny code in the future.)As before, the Rust version includes code to avoid overflow that Dafny ignores.

I continued the port by writing a Dafny version of internal_add2 and delete_extra, the two functions that internal_add calls. I finished the port by writing the methods these two methods call, etc. The full port is about 185 lines of code. You can see it here.

It does not validate. We’ll work on validation next.

Rule 8: Validate the Dafny Version of Your Algorithm.

In this step, you will add validation hints to your code, for example, in the form of assert statements. Dafny uses these hints to try to validate your code. As a Dafny novice, I (Carl) found adding hints harder than coding. In part, because I didn’t know when (or if) Dafny would be satisfied and I could stop.

I did, however, learn how I should start. For example, the code above for InternalAdd produces two verification errors. First, The Dafny verifier reports that one of the ensures may not hold:

This postcondition might not hold: SeqToSet(r) == SeqToSet(xs) + RangeToSet(range)Aside: “Postcondition” corresponds to ensures. “Precondition” corresponds to requires.

Second, the Dafny verifier complains that a precondition (that is one of the requires) for DeleteExtra can’t be proved.

We will focus on the first problem first by adding an assert to the bottom of the method. We write it to mirror the ensures.

// … adding this as the last line in InternalAdd
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range);
}

We’ll explicitly ignore the DeleteExtra problem, for now, with an assume.

// …
assume exists i: nat :: i < |r| && r[i] == (startBefore,end) && ValidSeq(r[..i+1]) && ValidSeq(r[i+1..]);
r := DeleteExtra(r, (startBefore,end));
//…

The Dafny validator now only complains about our new final assert. It says “assertion might not hold.”

Recall that the InternalAdd code uses nested if statements to divide its work into five cases. We’ll next move our assert from the end of the method to the end of each case. Look for the lines ending with a // case comment in the result:

method InternalAdd(xs: seq<NeIntRange>, range: IntRange) returns (r: seq<NeIntRange>)
requires ValidSeq(xs)
ensures ValidSeq(r)
ensures SeqToSet(r) == SeqToSet(xs) + RangeToSet(range)
{
var (start, end) := range;
if end < start {
r := xs;
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range); // case 0 – validates
return;
}

var beforeHi := IndexAtOrBeforePlusOne(xs, start);
if beforeHi > 0 { // does not go at front
var (startBefore, endBefore) := xs[beforeHi-1];
if endBefore+1 < start {
r := InternalAdd2(xs, range);
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range); // case 1 – validates
} else if endBefore < end {
r := xs[..beforeHi-1] + [(startBefore, end)] + xs[beforeHi..];
assume exists i: nat :: i < |r| && r[i] == (startBefore,end) && ValidSeq(r[..i+1]) && ValidSeq(r[i+1..]);
r := DeleteExtra(r, (startBefore,end));
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range); // case 2 – fails
} else{
r := xs;
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range); // case 3 – fails
}
}
else // goes at front
{
r := InternalAdd2(xs, range);
assert SeqToSet(r) == SeqToSet(xs) + RangeToSet(range); // case 4 – validates
}
}

Dafny now tells us that cases 0, 1, and 4 validate. Case 2 fails (and contains that assume that we’ll need to eventually remove). For now, however, let’s work on case 3.

Recall from this article’s Rule 7, that case 3 is when we add a new range (red) that is completely covered by an existing range (the blue “before” range), so the code need do nothing.

So, thinking logically, what do we know? We know that the integers covered by the before range are a superset of the integers covered by the new range. We also know that the before range is part of our original sorted and disjoint list of ranges (the blue ranges). We’ll add these two hints to our code via assert statements:

Dafny agrees these two hints are true (green check marks), but it still doesn’t accept the assert of interest (red mark).

We seem to need one more hint. Specifically, we need to convince Dafny that the integers covered by the before range are a subset of the integers covered by the list of all sorted and disjoint ranges. Intuitively, this is true because the before range is one of the ranges in the list.

We write this hint as a lemma with no body. Dafny accepts it.

Aside: Why does Dafny accept this lemma with nothing in its body? I don’t know and don’t have a good intuition. This just worked. If it didn’t, I would have tried adding asserts to its body.

Using the lemma, case 3 now validates:

This means we have validated cases 0, 1, 3, and 4. We would next move on to case 2. In addition, some of the methods mentioned, for example, DeleteExtra, don’t yet validate and we would need to work on those. [You can see the code up to this point, here.]

For general advice on verification debugging, refer to this section of the Dafny User’s Guide. I also recommend this Stack Overflow answer and mini-tutorial by Prof. James Wilcox.

Overall, the idea is to divide the task of validating your algorithm into many smaller validation tasks. I found this harder than programming, but not too hard and still fun.

I ended up adding about 200 validation lines to the 185 regular code lines (full code here). When I finally validated the last method, I wrongly thought I was finished.

To my surprise (and disappointment) the work doesn’t end the first time that everything validates. You must also ensure that your project will validate again and validate for others. We’ll discuss this rule, next.

Rule 9: Rework Your Validation for Reliability.

I thought I was done. Then, I moved the six-line definition of the math Min function from the Dafny standard library to my code. This caused my validation to fail, for no logical reason (literally!). Later, after I thought I’d fixed that, I deleted a method that wasn’t used. Again, validation started to fail for no logical reason.

What’s going on? Dafny works heuristically via a random search. Changing code superficially (or changing random seeds) can change how much time the search needs. Sometimes, the amount of time changes drastically. If the new time goes beyond a user-set time limit, the validation will fail. [We’ll talk more about the time limit in tip #3, below.]

You should test the reliability of your validation by trying different random seeds. Here are the commands I used (on Windows) to validate a file with 10 random seeds.

@rem Find the location of Dafny and add it to your path
set path=C:Userscarlk.vscode-insidersextensionsdafny-lang.ide-vscode-3.1.2outresources4.2.0githubdafny;%path%
dafny verify seq_of_sets_example7.dfy –verification-time-limit:30 –cores:20 –log-format csv –boogie -randomSeedIterations:10

The result is a *.csv file that you can open as a spreadsheet and then look for failures:

Aside: For more ideas on measuring Dafny’s validation reliability, see this Stack Overflow answer about analyzing *.csv files and this GitHub discussion recommending the dafny-reportgenerator tool.

Having found the problem spots, I brought in co-author Divyanshu Ranjan to help. Divyanshu Ranjan used his experience with Dafny to fix the project’s validation problems.

Here are his tips, with examples from the project:

Tip #1: When possible, remove require statements involving “forall” and “exists”.

Recall from Rule 4 that ghost function SeqToSet returns the set of integers covered by a sorted and disjoint list of non-empty ranges. We define “sorted and disjoint” with function ValidSeq, which internally uses two forall expressions. We can remove the requirement that the list must be sorted and disjoint, like so:

ghost function SeqToSet(sequence: seq<NeIntRange>): set<int>
decreases |sequence|
// removed: requires ValidSeq(sequence)
{
if |sequence| == 0 then {}
else if |sequence| == 1 then RangeToSet(sequence[0])
else RangeToSet(sequence[0]) + SeqToSet(sequence[1..])
}

From our point of view, we have the same useful function. From Dafny’s point of view, the function avoids two forall expressions and is easier to apply.

Tip #2 Use calc to avoid guess work by Dafny.

With a Dafny calc statement, you list the exact steps needed to arrive at a conclusion. For example, here is a calc from the DeleteExtra method:

calc {
SeqToSet(xs[..indexAfter+1]) + SeqToSet(xs[indexAfter+1..]);
==
{ SeqToSetConcatLemma(xs[..indexAfter+1], xs[indexAfter+1..]); }
SeqToSet(xs[..indexAfter+1] + xs[indexAfter+1..]);
==
{ assert xs == xs[..indexAfter+1] + xs[indexAfter+1..]; }
SeqToSet(xs);
==
{ SetsEqualLemma(xs, r[indexAfter], r2, indexAfter, indexDel); }
SeqToSet(r2);
}

At this point in the code, xs is a sequence of ranges, but it may not be sorted and disjoint. The calc asserts that:

the integers covered by the two parts of xs, equalsthe integers covered by the concatenation of its two parts, equalsthe integers covered by xs, equalsrs.

For each step, we are allowed to include lemmas or asserts to help prove that step. For example, this assert helps prove the move from step 3 to 4:

{ assert xs == xs[..indexAfter+1] + xs[indexAfter+1..]; }

For efficiency and control, these lemmas and asserts will not be visible to the validator beyond their step. This keeps Dafny focused.

Tip #3: Use timeLimit to provide computation where needed.

Dafny stops trying to validate a method at a user-settable timeLimit. Limits of 10, 15, or 30 seconds are common because, as users, we generally want never-going-to-happen validations to fail fast. However, if we know that a validation will happen eventually, we can set a method-specific time limit. For example, Divyanshu Ranjan noticed that DeleteExtra usually does validate, but takes more time than the other methods, so he added a method-specific time limit:

method {:timeLimit 30} DeleteExtra(xs: seq<NeIntRange>, internalRange: IntRange) returns (r: seq<NeIntRange>)
// …Aside: timeLimit doesn’t account for the difference in speed between computers, so set it a bit generously.

Tip #4: Use split_here to divide a validation problem in two.

As the Dafny FAQs explain, sometimes validating a set of asserts together is faster and sometimes validating them one-at-a-time is faster.

Use an assert {:split_here} true; statement to split a sequence of asserts into two parts for the purpose of validation. For example, even with the timeLimit, DeleteExtra timed out until Divyanshu Ranjan added this:

// …
else
{
r := (s[..i] + [pair]) + s[i..];
assert r[..(i+1)] == s[..i] + [pair];
assert r[(i+1)..] == s[i..];
assert {:split_here} true; // split validation into two parts
calc {
SeqToSet(r[..(i+1)]) + SeqToSet(r[(i+1)..]);
// …

Tip #5: Keep lemmas small. If needed, split ensures across lemmas.

Sometimes lemmas try to do too much at once. Consider the SetsEqualLemma. It is related to deleting redundant ranges. For example, if we insert a into xs, the ranges marked with “X” become redundant.

The original version of SetsEqualLemma contained 12 requires and 3 ensures. Divyanshu Ranjan split it into two lemmas: RDoesntTouchLemma (11 requires and 2 ensures) and SetsEqualLemma (3 requires and 1 ensures). With this change, the project validated more reliably.

Applying these tips will improve the reliability of our proof. Can we make validation 100% reliable? Sadly, no. There is always a chance that with an unlucky seed Dafny will fail to validate. So, when do you stop trying to improve validation?

On this project, Divyanshu Ranjan and I improved the validation code until the chance of a validation error on any single run fell below 33%. So, over 10 random runs, we saw no more than 2 or 3 failures. We even tried 100 random runs. With 100 runs, we saw 30 failures.

Conclusion

So, there you have it: nine rules to prove a Rust algorithm’s correctness with Dafny. You may be discouraged that the process is not easier or more automatic. I, however, am encouraged that the process is possible at all.

Aside: Since geometry class — in high school — I’ve found math proofs fascinating and frustrating. “Fascinating” because a math theorem once proven is known true forever. (Euclid’s geometry is still considered true. Aristotle’s physics is not.) “Frustrating” because my math classes always seemed vague about which axioms I could assume and how big of steps my proof could take. Dafny and similar systems remove this vagueness with automatic proof checking. Even better, from my point of view, they help us create proofs about an area for which I care deeply: algorithms.

When is it worth doing a formal proof of an algorithm? Given the work involved, I will only do this again when the algorithm is some combination of tricky, important, or easy-to-prove.

How might the process improve in the future? I’d love to see:

Interchange between systems — A geometry theorem once proven need never be proven again. I’d love if the systems checking algorithmic proofs could use each other’s proofs.An all-Rust system as easy to use as Dafny — For work in this direction, see [1,2].Aside: Do you know of an easy-to-use Rust validation system? Please consider applying it to the validation of internal_add. This would let us compare the Rust system’s easy-of-use and power to Dafny’s.The proof analog of Rust’s Cargo.lock files — In Rust, we use the Cargo.lock to lock in a known-good combination of project dependencies. I wish when Dafny found a way to prove, for example, a method, that it would lock in the proof steps it found. This could make validation more reliable.Better AI for validation — My intuition is that ChatGPT, slightly improved, could be good at creating 90% of needed validation code. I find current ChatGPT 4 poor with Dafny, I assume for lack of Dafny training examples.Better validation for AI —When an AI generates code, we worry about the code’s correctness. Formal validation could help by proving correctness. (For a small example of this, see my article Check AI-Generated Code Perfectly and Automatically.)

Thank you for joining our journey into program correctness. We hope that if you have an algorithm for which you would like a proof, these steps will help you find that proof.

Please follow Carl on Medium. I write on scientific programming in Rust and Python, machine learning, and statistics. I tend to write about one article per month.

Read more of Divyanshu Ranjan’s work on his blog. In addition to formal methods, the blog touches on geometry, statistics, and more.

Nine Rules to Formally Validate Rust Algorithms with Dafny (Part 2) was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Leave a Comment