How Amazon Web Services Uses Formal Methods
by Newcombe et al, appeared in Communications of the ACM in April 2015. It describes the use of Leslie Lamport's TLA+ (Temporal Logic of Actions) to refine the design of web services such as Dynamo DB and S3. (S3 stored 2 billion objects and handled 1.1 million transaction per second back in 2013.) Thanks to Jessica Kerr
for pointing me to this paper after an interview for the podcast Greater Than Code
We find a major benefit of having a precise, testable model of the core system is that we can quickly verify that even deep changes are safe or learn they are unsafe without doing harm. In several cases, we have prevented subtle but serious bugs from reaching production. In other cases we have been able to make innovative performance optimizations (such as removing or narrowing locks or weakening constraints on message ordering) we would not have dared to do without having model-checked those changes. A precise, testable description of a system becomes a what-if tool for designs, analogous to how spreadsheets are a what-if tool for financial models. We find that using such a tool to explore the behavior of the system can improve the designer’s understanding of the system.
Labels: Computing, Concurrency, Formal Methods, Programming Languages