Modeling the train reservation kata

Modeling the train reservation kata - Part 1

30 November 2020

The train reservation kata is one of my favorite kata. It is close from a real case, yet simple enough to yield interesting results in a two hour session. The explicit rules are the following:

70% of maximum train capacity can be reserved in advance
All seats of one reservation must be in the same coach
Ideally, all coaches have 70% occupation

This last rule spices the system up. There is an ambiguous meaning around the ideal rule. We cover this aspect in another article. This article goal is to layout the basic TLA+ specification to understand the last rule. Most of my readers skip this kind of article, but you will regret not getting into this one.

Basic train reservations

Instead of getting in the code first, let's try to think of the problem at hand. What does it mean to reserve seats? We'll consider a simplified train of 2 coach of 10 seats for a total capacity of 20 seats. Here's an instance of a random reservation:

Each reservation have a unique number. They have the same color on the drawing. Numbers also orders reservation in time: reservation 1 is done first, then reservation 2, etc. As the representation suggests, without any rules, a reservation could span multiple coaches.

TLA+ is capable of exploring all the possible ways to reserve seats and pass the tests for every state.

TLA+ checking a lot of configuration for you

TLA+ syntax

TLA+ is based on set theory, and augments it with temporal logic. Our first task is to model how all seats of a train are laid out. We use this layout for reservation later.

A set is a collection of distinct objects. We describe coaches as a set of their identifier, a single letter like in the kata specification:

Coaches == { "A", "B" }

For seat numbers, you can easily create sets of integers with the range .. operator:

Coaches == { "A", "B" }
SeatNumbers == 1..10

Now if we want the set of all seats in the train. We need to combine coach identifier and seat numbers. We use the \X operator. It creates a list of each possible pairs between two sets. For our example:

Seats == Coaches \X SeatNumbers

Is equivalent to:

Seats == { <<"A", 1>>, <<"A", 2>>, <<"A", 3>>, <<"A", 4>>, <<"A", 5>>,
    <<"A", 6>>, <<"A", 7>>, <<"A", 8>>, <<"A", 9>>, <<"A", 10>>,
    <<"B", 1>>, <<"B", 2>>, <<"B", 3>>, <<"B", 4>>, <<"B", 5>>,
    <<"B", 6>>, <<"B", 7>>, <<"B", 8>>, <<"B", 9>>, <<"B", 10>> }

Note that the <<>> is the sequence symbol. A sequence is an ordered collection of objects. We use them as tuples here. The \X or \times operator avoid typing long and error prone collections. Also Seats is automatically updated when we change Coaches or SeatNumbers.

Now that we have modeled the seats in the train, we need to model the reservations that users make. We ignore the id of reservation and consider reservation as a set of seat reserved. The following could be legal reservations:

\* A reservation for A1 and another for A2 and B3
reservations == { {<<"A", 1>>}, {<<"A", 2>>, <<"B", 3>>} }
\* Reservations for the first picture in the article
reservations == { {<<"A", 1>>}, {<<"A", 5>>}, {<<"A", 8>>, <<"B", 7>>},
    {<<"A", 10>>, <<"B", 8>>}, {<<"A", 7>>}, {<<"A", 2>>, <<"A", 6>>,
    <<"A", 9>>, <<"B", 5>>, <<"B", 10>>}, {<<"A", 3>>, <<"A", 4>>},
    {<<"B", 1>>, <<"B", 2>>, <<"B", 3>>, <<"B", 4>>, <<"B", 6>>, <<"B", 9>>} }

TLA+ does not have any type checking mechanism. Since it's quite useful, we implement our own type checking. We use the SUBSET keyword that allow to generate the set of all subset. For instance, SUBSET Coaches = {{}, {"A"}, {"B"}, {"A", "B"}}. The variable reservations can be any set of subset of seats. So we write:

TypeCheck == reservations \in SUBSET SUBSET Seats

The operator \in check that the variable belongs to the right hand side set. TypeCheck is an invariant that we check for every states the model goes into. But first we need a specification to run a model.

The first specification

We just modeled reservations but aren't able to do anything with it. It's time to introduce the acronym TLA: Temporal Logic of Actions. Last paragraph is about Logic, let's see Temporal and Actions right now. Actions are operators that changes the state of the system. In our case, if we need to reserve from an empty train, we need a reserve action.

Reserve == reservations' = reservations \union {{<<"A", 1>>}}

You declare Actions like other constraints. We introduce here two new syntax changes: ' the prime operator and \union. The union operator acts on sets. It returns a set containing all elements of both sets. For instance, {1, 2} \union {3, -1} = {-1, 1, 2, 3}.

The prime operator allows to assign a new state for a variable. So in our case, Reserve is an Action that changes the state of reservations to add a reservation for seat "A1". This is a bit dull, but we enrich this action later.

Next we have to declare how the state initialize and possible actions that changes the state of the system.

Init == reservations = {}
Next == Reserve

Init is the conventionnal first state of the system. In our case, reservations are empty. Next is also a convention.

And 🎉 tada! We have our first specification. You can create a model in the TLA+ toolbox, run an "Initial predicate and next-state relation" behavior and add TypeCheck as an invariant. The specification runs and TLC (the TLA+ checker) finds 2 distinct states.

Why do so much work for 2 states. We could do everything by hand and spend less time on tooling! Let's challenge our model to surpass our brain power.

From 2 states to millions

We have a dull way of reserving seats, let's introduce a dumb way of doing it:

Reserve == \E singleSeat \in Seats: reservations' = reservations \union {{singleSeat}}

\E means "there is". We call this logical predicate existential quantification. The Reserve action now adds the reservation of a non deterministic seat of the train. TLC executes the Next predicate until it exhausts all possible execution paths. In this case, it executes all possible ways to book a single seat.

TLC checks for the 20 000 000 (millions!) ways to do it. It can take a while (27 seconds on my setup). If we had to check a property on such a case, we would have a very hard time to find bugs with unit testing.

Imagine having to come up with the 20 000 000 cases by hand! Notice that we have checked TypeCheck for every of these 20 000 000 states. This is how powerful TLA can be!

We have the base of a specification. Only problem: We haven't solve any rule of the kata yet. Let's start with the first one: Do not reserve more than 70% of the train.

Booking less than 70% train occupation

Before changing the implementation of the Reserve action, let's check that we break the first rule (TDD style!). It can be expressed as an invariant:

70PercentTrainOccupation == (70 * Cardinality(Seats)) \div 100
AtMost70PercentTrainOccupation == Cardinality(UNION reservations) <= 70PercentTrainOccupation

We use Cardinality to have the number of elemets in a set. The keyword UNION is the invert operation from SUBSET: it reduces a set of set of objects into a set of objects. For instance UNION {{1}, {2, 3}} is equal to {1, 2, 3}. Here UNION reservations yields all the seats reserved in a set.

When running TLC yields an expected error: Invariant AtMost70PercentTrainOccupation is violated. The counter example toolbox displays could be this one:

Of course we haven't implemented any restrictions to the Reserve action, so now it's time to stop reserving when the train is full:

Reserve == /\ Cardinality(UNION reservations) < 70PercentTrainOccupation
           /\ \E singleSeat \in Seats: reservations' = reservations \union {{singleSeat}}

This is where the fun begins! The ajunction operator /\ is a very familiar operator even if you don't know it yet. Predicates are boolean expressions, so /\ is the equivalent of && boolean operator in other languages. Why is it written /\ P /\ Q? It's a convention to ease reading of TLA+ code.

Our new specification is more satisfying although it isn't really perfect. Before being able to reserve in the same coach, it would be nice to reserve multiple seats instead of just one!

More state, more problems

To reserve multiple seats we need to introduce a parameter to the Reserve action. The syntax is quite familiar to everyone:

Reserve(seatCount) == /\ Cardinality(UNION reservations) < 70PercentTrainOccupation
                      /\ \E singleSeat \in Seats: reservations' = reservations \union {{singleSeat}}

We adapt Next to introduce non deterministic behavior. The model already accounts for the fact that seats can be reserved in any order. We now also need to introduce the fact that we can reserve any number of seats. We need non-determinism again:

Next == \E seatCount \in 1..Cardinality(Seats): Reserve(seatCount)

Now we do not reserve a single seat but the right amount of seats:

Reserve(seatCount) ==
    /\ Cardinality(UNION reservations) < 70PercentTrainOccupation
    /\ \E seats \in SUBSET Seats:
        /\ Cardinality(seats) = seatCount
        /\ reservations' = reservations \union {seats}

Remember our good friend SUBSET? It allows to generate set of seats for the reservation. The condition on the cardinality of seats allow to reserve the right amount of seats. Are we good? Are you sure? Try to run the specification and you find some counter example like this one:

Reserving 14 seats at once breaks our invariant

The error we introduced here is that we can reserve the whole train at once. The condition that worked for a single reservation does not stand anymore. Instead of Cardinality(UNION reservations) < 70PercentTrainOccupation, we need to account for the count of seats being reserved: Cardinality(UNION reservations) + seatCount <= 70PercentTrainOccupation.

If you try to run the latest specification you'll encounter the state space explosion problem. Instead of having a couple of millions of states, TLC has 400 millions of states to explore. We discuss this issue later. For now let's be patient and wait for TLC to check all states.

Same coach reservation

We can go on and solve the second constraint: reservations can only span a single coach. We first need to translate "same-coachness" of a reservation. One can express it this way:

SameCoach(seats) == \E someCoach \in Coaches: \A <<coach, _rr>> \in seats: coach = someCoach
AllReservationAreInTheSameCoach == \A reservation \in reservations: SameCoach(reservation)

\A is the universal quantification. It is interpreted as "for all". So the SameCoach function can be read: There's a coach "someCoach" such as every seat in the set "seats" belong to "someCoach". If any two seat are not in the same coach, the predicate return FALSE. Let's see how it fails with TLC:

Reserving in 2 different coaches breaks `AllReservationAreInTheSameCoach` invariant

Alright, you know what's going to happen, we're gonna add this contraint to our reservation routine. We now have a reservation that works for 2 rules right? Well not exactly. The resulting specification seems alright, but there is something wrong about it and it needs to be adressed before we get further.

Reservation conflict

Remember when I explained SUBSET, I took the example SUBSET {1, 2} = {{}, {1}, {2}, {1, 2}}. If we look carefully, number 1 belongs to both {1} and {1, 2}. See how this could be a problem in our case? If we replace numbers by seats, we get SUBSET {<<"A", 1>>, <<"A, 2>>} = {{}, {<<"A", 1>>}, {<<"A, 2>>}, {<<"A", 1>>, <<"A, 2>>}}. A1 is in both reservations! Since this is the only check we have on reservations, it means that the same seat can be reserved multiple times.

Let's proove this:

SeatsCanBeReservedOnce == \A seat \in Seats:
    Cardinality({reservation \in reservations: seat \in reservation}) <= 1

We introduce a new construct: set filtering. {x \in S: P(x)} is the set of x in S that satisfies P(x). In our case the set of reservations that includes the same seat. TLC finds yet another counter example.

This problem is quite symptomatic of using your tools blindly without thinking of the consequences. How would you fix this problem? There are many ways of doing it, but I think it boils down to an implicit predicate: we should only be reserving free seats. Free seats are the seats that are not yet in the reservations:

ReservedSeats == UNION reservations
FreeSeats == Seats \ ReservedSeats

Operator \ is the set substraction. For instance {1, 2} \ {2, 3} = {1}. Now that we can select only subsets of free seats in the reserve action:

Reserve(seatCount) ==
    /\ Cardinality(UNION reservations) + seatCount <= 70PercentTrainOccupation
    /\ \E seats \in SUBSET FreeSeats:
        /\ SameCoach(seats)
        /\ Cardinality(seats) = seatCount
        /\ reservations' = reservations \union {seats}

And this is it, we now have a great specification. Did you try to run it?

Because we're using the variable in the seat reservation, the TLC is much slower in running the specification. If you need to wait days before getting a positive result, the tool becomes useless. It is crucial to have a reasonnable state space to verify specification quickly.

Fortunately there's a saying in formal methodologies that can help us here: The small scope hypothesis. This hypothesis is explained by Daniel Jackson in his book Software abstractions, logic, language and analysis. It states that the majority of constraint infringement will appear in small configuration scope. You can verify that by the counter example that were found by TLC for our rules. TLC explores less than 200 states before it finds a counter example.

If small scopes are sufficient, we can check rules with small sized coach. For instance try to verify rules with coaches of 5 seats. Since state space grows exponentially, dividing the scope by two turns an untractable analysis into a reasonnable one.

Conclusion

I hope this post has been a better introduction for TLA+ than my previous article. We wrote an abstract specification for the problem at hand. This lead to a repetition in our design. We cover ways of implementing a more down to earth reservation specification in later episode of this series.

Also we still need to find a way to respect the last rule and this needs an article on its own. The word "ideally" is ambiguous and we have to discuss what it means. We explore the question in a following article.