## Talk Abstracts

SMT@Microsoft (invited talk), Leonardo de Moura, Microsoft Research

Program analysis and verification require decision procedures for satisfiability modulo theories (SMT) that decide the satisfiability, or, dually, the validity, of formulas in decidable fragments of specific theories or combinations thereof. Examples include the quantifier-free fragments of the theories of arithmetic, bit-vectors, lists, arrays and records. The challenge is to have decision procedures that are simultaneously sound, complete, expressive and efficient, so that problems of practical interest can be handled without incurring false negatives or false positives. SMT solvers have proven highly scalable, efficient and suitable for integrating theory reasoning. The success of SMT for program analysis and verification is largely due to the suitability of the supported background theories. In this talk, we describe how SMT solvers, with emphasis on Microsoft's Z3 solver, are used in program analysis and verification. Several concrete applications from program analysis at Microsoft will be presented.
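At the core of SMT solvers such as Z3 lies a propositional search engine extended with theory solvers (the DPLL(T) architecture). As a rough sketch of that propositional core only, here is a minimal DPLL procedure in Python; the function name and clause encoding are our own, and real solvers add clause learning, watched literals, and theory propagation on top of this skeleton.

```python
# Minimal DPLL satisfiability check (illustrative only; real SMT solvers
# such as Z3 add clause learning, theory solvers, and many optimizations).
# A formula is a list of clauses; a clause is a list of nonzero ints,
# where -v denotes the negation of variable v.

def dpll(clauses, assignment=None):
    if assignment is None:
        assignment = {}
    # Unit propagation: repeatedly assign literals forced by unit clauses.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = [l for l in clause if abs(l) not in assignment]
            satisfied = any(assignment.get(abs(l)) == (l > 0) for l in clause)
            if satisfied:
                continue
            if not unassigned:
                return None              # conflict: clause falsified
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    # Pick an unassigned variable and branch on both truth values.
    for clause in clauses:
        for l in clause:
            if abs(l) not in assignment:
                for value in (l > 0, l < 0):
                    trial = dict(assignment)
                    trial[abs(l)] = value
                    result = dpll(clauses, trial)
                    if result is not None:
                        return result
                return None
    return assignment                    # every clause satisfied

# (x1 or x2) and (not x1 or x2) and (not x2 or x1): satisfiable, x1 = x2 = True
print(dpll([[1, 2], [-1, 2], [-2, 1]]))
```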

Bio. Leonardo de Moura is a researcher in the Software Reliability group at Microsoft Research, Redmond WA, USA. He is currently working on the Z3 theorem prover. From 2001 to 2006, he was a Computer Scientist at SRI International, Menlo Park CA, USA. He obtained his Ph.D. in Computer Science from PUC, Rio de Janeiro RJ, Brazil in 2000. He is interested in theorem provers, decision procedures and software verification.

Formal Methods for Critical Systems (invited talk), Steven P. Miller, Rockwell Collins

Formal methods have traditionally been reserved for systems with requirements for extremely high assurance. However, the growing popularity of model-based development, in which models of system behavior are created early in the development process and used to auto-generate code, is making precise, mathematical specifications much more common in industry. At the same time, formal verification tools such as model checkers continue to grow more powerful. The convergence of these two trends opens the door for the practical application of formal verification techniques early in the life cycle for many systems. This talk will describe how Rockwell Collins has applied both theorem proving and model checking to commercial avionics to reduce costs and improve quality.

Bio. Dr. Steven Miller is a Senior Principal Software Engineer in the Advanced Technology Center of Rockwell Collins. He has over 30 years of experience in software development for both mainframe and embedded systems. He received his Ph.D. in computer science from the University of Iowa in 1991, and also holds a B.A. in physics and mathematics from the University of Iowa.
His current research interests include model-based development and formal methods. He was principal investigator on a 5-year project sponsored by NASA Langley's Aviation Safety Program and Rockwell Collins to investigate advanced methods and tools for the development of flight-critical systems. Prior to that he led several research efforts at Rockwell Collins, including a collaborative effort with SRI International and NASA to formally verify the microcode in the AAMP5 and AAMP-FV microprocessors using the PVS verification system.

Behavioral Automata Composition for Automatic Topology Independent Verification of Parameterized Systems, Youssef Hanna, Iowa State University

Verifying correctness properties of parameterized systems is a long-standing problem. The challenge lies in the lack of guarantee that the property is satisfied for all instances of the parameterized system. Existing work on addressing this challenge aims to reduce this problem to checking the properties on smaller systems with a bound on the parameter referred to as the cut-off. A property satisfied on the system with the cut-off ensures that it is satisfied for systems with any larger parameter. The major problem with these techniques is that they only work for certain classes of systems with a specific communication topology, such as a ring topology, thus leaving other interesting classes of systems unverified. We contribute an automated technique for finding the cut-off of a parameterized system that works for systems defined with any topology. Given the specification and the topology of the system, our technique is able to automatically compute the cut-off specific to this system. We prove the soundness of our technique and demonstrate its effectiveness and practicality by applying it to several canonical examples where, in some cases, our technique obtains smaller cut-off values than those presented in the existing literature.
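The cut-off idea can be made concrete with a toy example. The sketch below is our own illustration, not the authors' technique: it explicitly enumerates the reachable states of a simple lock-based parameterized system and checks mutual exclusion for several instance sizes; a cut-off theorem would justify stopping at some fixed bound instead of checking ever-larger instances.

```python
# Toy illustration of cut-off-based verification (not the authors' actual
# technique): a parameterized system of n processes sharing one lock, each
# with program counter in {"idle", "wait", "crit"}. We enumerate reachable
# states explicitly and check mutual exclusion for several instance sizes.

def replace(pcs, i, v):
    return pcs[:i] + (v,) + pcs[i + 1:]

def reachable(n):
    # State: (pcs, lock_holder); lock_holder is a process index or None.
    init = (("idle",) * n, None)
    seen, frontier = {init}, [init]
    while frontier:
        pcs, holder = frontier.pop()
        for i in range(n):
            succs = []
            if pcs[i] == "idle":
                succs.append((replace(pcs, i, "wait"), holder))
            elif pcs[i] == "wait" and holder is None:
                succs.append((replace(pcs, i, "crit"), i))      # acquire
            elif pcs[i] == "crit":
                succs.append((replace(pcs, i, "idle"), None))   # release
            for s in succs:
                if s not in seen:
                    seen.add(s)
                    frontier.append(s)
    return seen

def mutual_exclusion(n):
    return all(sum(pc == "crit" for pc in pcs) <= 1
               for pcs, _ in reachable(n))

# Check small instances; a cut-off result would tell us which n suffices.
for n in range(2, 5):
    print(n, mutual_exclusion(n))
```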

Tisa: A Language Design and Modular Verification Technique for Temporal Policies in Web Services, Hridesh Rajan, Iowa State University

Web services are distributed software components that are decoupled from each other using interfaces with specified functional behaviors. However, such behavioral specifications are insufficient to demonstrate compliance with certain temporal non-functional policies. An example is demonstrating that a patient's health-related query sent to a health care service is answered only by a doctor (and not by a secretary). Demonstrating compliance with such policies is important for satisfying governmental privacy regulations. It is often necessary to expose the internals of the web service implementation for demonstrating such compliance, which may compromise modularity.
In this work, we provide a language design that enables such demonstrations, while hiding the majority of the service's source code. The key idea is to use greybox specifications to allow service providers to selectively hide and expose parts of their implementation. The overall problem of showing compliance is then reduced to two subproblems: whether the desired properties are satisfied by the service's greybox specification, and whether this greybox specification is satisfied by the service's implementation. We specify policies using LTL and solve the first problem by model checking. We solve the second problem by refinement techniques.
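As a toy illustration of the kind of temporal policy involved (our own sketch, which merely checks a concrete finite trace rather than model checking LTL against a greybox specification, as Tisa does):

```python
# Toy finite-trace check of the policy "every reply to a health query is
# produced by a doctor" (illustrative; the Tisa work model checks LTL
# policies against greybox specifications, not concrete traces).

def policy_holds(trace):
    pending = False                    # a health query is awaiting an answer
    for event, role in trace:
        if event == "health_query":
            pending = True
        elif event == "answer":
            if pending and role != "doctor":
                return False           # answered by someone other than a doctor
            pending = False
    return True

ok  = [("health_query", "patient"), ("answer", "doctor")]
bad = [("health_query", "patient"), ("answer", "secretary")]
print(policy_holds(ok), policy_holds(bad))   # True False
```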

Verifying Correctness and Reliability of Cumulative Attestation Kernels on Flash MCUs, Michael LeMay, University of Illinois at Urbana-Champaign

We present the methodology for and results of formally verifying correctness and reliability properties of an embedded security component known as a Cumulative Attestation Kernel (CAK). There are increasing deployments of networked embedded systems and rising threats of malware intrusions on such systems. To mitigate this threat, it is desirable to enable commonly-used embedded processors known as flash MCUs to provide remote attestation assurances like the Trusted Platform Module (TPM) provides for PCs. However, flash MCUs have special limitations concerning cost, power efficiency, computation, and memory that influence how this goal can be achieved. Moreover, many types of applications require integrity guarantees for the system over an interval of time rather than just at a given instant. A CAK can address these concerns by providing cryptographically secure firmware auditing on networked embedded systems. To illustrate the value of CAKs, we demonstrate practical remote attestation for Advanced Metering Infrastructure (AMI), a core technology in emerging smart power grid systems that requires cumulative integrity guarantees. Since the CAK serves as a root of trust for the embedded system and must not be modified after the system is deployed, we must formally verify that it satisfies certain correctness and reliability properties. We verified that it correctly records firmware images that are installed on the system. Since many embedded systems operate with unreliable power supplies, we also verified that power failures do not result in firmware corruption, regardless of the number or timing of such failures, provided they eventually cease. We performed a similar verification for an internal flash filesystem used by the CAK as well. We used a Maude rewriting-logic model of the system as input to Maude's model checker during the verification. Each property was formalized in linear temporal logic.
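The flavor of the power-failure argument can be sketched with a toy model (our own, not the Maude model used in the actual verification): a two-slot record update in which a crash may strike between any two flash writes, and recovery must always find a valid record.

```python
# Toy model of power-failure-robust record update on flash (illustrative;
# the actual CAK verification used a Maude rewriting-logic model and LTL).
# Two slots each hold (seqno, payload, checksum). An update writes the
# inactive slot field-by-field; a power failure may strike between writes.
# Recovery picks the highest-seqno slot whose checksum is valid.

def checksum(seqno, payload):
    return (seqno + payload) % 251     # toy checksum; never equals 0xFF

def recover(slots):
    valid = [(s, p) for (s, p, c) in slots if c == checksum(s, p)]
    return max(valid)                  # highest sequence number wins

def update_steps(slots, new_payload):
    # Yield the flash state after each individual write (= crash points).
    target = 0 if slots[0][0] < slots[1][0] else 1
    seq = max(slots[0][0], slots[1][0]) + 1
    s = [list(x) for x in slots]
    s[target][2] = 0xFF                # invalidate the old checksum first
    yield tuple(tuple(x) for x in s)
    s[target][1] = new_payload
    yield tuple(tuple(x) for x in s)
    s[target][0] = seq
    yield tuple(tuple(x) for x in s)
    s[target][2] = checksum(seq, new_payload)
    yield tuple(tuple(x) for x in s)

old = ((1, 10, checksum(1, 10)), (2, 20, checksum(2, 20)))
# A crash after any prefix of writes must recover the old record (2, 20),
# or the new record (3, 30) once the final checksum write has landed.
for state in update_steps(old, 30):
    assert recover(state) in [(2, 20), (3, 30)]
print("all crash points recover to a valid record")
```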

Fast and Flexible Proof Checking for SMT, Duckki Oe and Andy Reynolds, The University of Iowa

Fast and flexible proof checking can be implemented for SMT using the Edinburgh Logical Framework with Side Conditions (LFSC). LFSC provides a declarative format for describing proof systems as signatures. We describe several optimizations for LFSC proof checking, and report experiments on QF_IDL benchmarks showing a proof-checking overhead of 32% of the solving time required by our clsat solver.
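For the conjunctive core of QF_IDL, satisfiability reduces to negative-cycle detection in the constraint graph; below is a hedged sketch of that standard reduction (Bellman-Ford style relaxation, not clsat's implementation, and ignoring the Boolean structure a full QF_IDL solver must handle).

```python
# Satisfiability of a conjunction of integer difference constraints
# x - y <= c, the conjunctive core of QF_IDL (illustrative; a full QF_IDL
# solver such as clsat also handles arbitrary Boolean structure).
# Each constraint is an edge y --c--> x; the conjunction is satisfiable
# iff the graph has no negative-weight cycle.

def idl_sat(constraints):
    vars_ = {v for (x, y, _) in constraints for v in (x, y)}
    dist = {v: 0 for v in vars_}        # virtual source at distance 0
    for _ in range(len(vars_)):
        changed = False
        for (x, y, c) in constraints:   # relax: require dist[x] <= dist[y] + c
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return dist                  # a satisfying assignment
    return None                          # negative cycle: unsatisfiable

# x - y <= -1 and y - x <= -1 force x < y < x: unsatisfiable.
print(idl_sat([("x", "y", -1), ("y", "x", -1)]))   # None
# x - y <= 2 and y - z <= 3: satisfiable.
print(idl_sat([("x", "y", 2), ("y", "z", 3)]))
```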

Register Transfer Level Test Generation as Meta-Reasoning about Simulation, Michael Katelman, University of Illinois at Urbana-Champaign

Modern verification methodology for register transfer level logic (RTL) is largely simulation based; about 95% of the overall verification effort, according to the International Technology Roadmap for Semiconductors. Even the best automated testing algorithms fail to cover all verification obligations, and when they fall short, creating directed tests by hand is extremely tedious with available tools. In this talk I will define a novel language that can be used to generate targeted stimuli for RTL designs and which mitigates the complexity of writing directed tests. The main idea is to treat directed testing as a meta-reasoning problem about simulation. This leads to a natural formalization of directed testing in terms of a proof-search strategy language; similar to the original idea of ML. I will demonstrate the utility of the language with several examples.

Rewriting Logic Semantics of a Plan Execution Language, Camilo Rocha, University of Illinois at Urbana-Champaign

Rewriting logic is a logic of concurrent change in which a wide range of models of computation and logics can be faithfully represented. Intuitively, a rewriting logic theory $\mathcal{R} = (\Sigma, E , R)$ specifies a {\em concurrent system} whose states are elements of an algebra specified by the equational theory $(\Sigma,E)$, and whose {\em concurrent transitions} are specified by the rules $R$. The conceptual distinction between equations and rules has important consequences when giving the rewriting logic semantics of a language (for instance, it has practical consequences for {\em program analysis} by model-checking). We present an overview of rewriting logic, and illustrate its use as a general semantic framework with a brief description of the rewriting logic semantics of PLEXIL, a synchronous language developed by NASA to support autonomous spacecraft operations.
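A toy rendering of the equations-versus-rules distinction (our own illustration; Maude works over full order-sorted terms): equations canonicalize states, rules generate one-step transitions, and reachability analysis explores the induced transition system.

```python
# Toy illustration of equations (E) vs rules (R) in rewriting logic
# (illustrative; Maude implements this for full order-sorted terms).
# States are multisets ("soups") of tokens; E identifies soups up to order
# (associative-commutative union), while R generates concurrent transitions.

def normalize(state):
    # E: canonical representative of a soup, modulo order of tokens.
    return tuple(sorted(state))

def successors(state, rules):
    # R: apply each rule lhs -> rhs wherever lhs occurs in the soup.
    out = set()
    for lhs, rhs in rules:
        s = list(state)
        if all(s.count(t) >= lhs.count(t) for t in set(lhs)):
            for t in lhs:
                s.remove(t)
            out.add(normalize(s + rhs))
    return out

# A tiny protocol: a "ready" process plus a free "lock" may enter the
# critical section; leaving the critical section releases the lock.
rules = [(["ready", "lock"], ["crit"]),
         (["crit"], ["ready", "lock"])]

init = normalize(["ready", "ready", "lock"])
states, frontier = {init}, [init]
while frontier:              # reachability: the state space a model checker explores
    s = frontier.pop()
    for t in successors(s, rules):
        if t not in states:
            states.add(t)
            frontier.append(t)
print(states)                # no reachable state has two "crit" tokens
```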

The Linear Temporal Logic of Rewriting Model Checker, Kyungmin Bae, University of Illinois at Urbana-Champaign

The linear temporal logic of rewriting (LTLR) extends linear temporal logic with spatial action patterns. LTLR generalizes and extends various state-based and event-based logics and aims to avoid certain types of mismatches between a system and its correctness specification. We have implemented an LTLR model checker by extending the existing LTL model checker available from Maude's formal analysis tools. In this talk, I will give a brief explanation about our LTLR model checking framework and demonstrate its use with several case studies.

List and Set Proofs in Guru, John Hughes and Greg Witt, The University of Iowa

Lists are universal in code, and sets are universal in mathematics. In the Guru verified-programming language, we implement finite sets using lists. In this presentation, we will be discussing lemmas and proofs about lists viewed as finite sets, which we developed during our summer NSF REU project in the Guru lab.
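A toy, untyped rendering of the lists-as-finite-sets view (our own sketch; in Guru the corresponding statements are machine-checked proofs, not runtime tests on example inputs):

```python
# Toy version of viewing lists as finite sets (illustrative; the Guru
# development proves such lemmas formally rather than testing them).
# A finite set is represented by a list; set equality ignores order
# and duplicates.

def member(x, xs):
    return any(x == y for y in xs)

def union(xs, ys):
    return xs + [y for y in ys if not member(y, xs)]

def set_eq(xs, ys):
    return all(member(x, ys) for x in xs) and all(member(y, xs) for y in ys)

a, b = [1, 2, 2, 3], [3, 4]
# Lemmas one would prove in Guru, here merely checked on examples:
assert set_eq(union(a, b), union(b, a))   # union is commutative (as sets)
assert set_eq(union(a, a), a)             # union is idempotent
assert member(4, union(a, b))
print("example lemmas hold on these inputs")
```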

Scheduling Test Using Atomicity Violations, Francesco Sorrentino, University of Illinois at Urbana-Champaign

To increase performance, concurrency is widely used in the development of large systems programs. Unfortunately, the interactions among communicating threads in a program can result in unexpected behaviors. These behaviors typically surface as bugs late in the software development cycle, or even after the software is released. The use of concurrent multi-threaded systems necessitates new methodologies for debugging programs. Traditional methods of testing, such as stress and random testing, often miss these bugs. Moreover, assuring the correctness of software is fundamentally harder for concurrent programs because of the "interleaving explosion problem". Formally, given a concurrent program $P$ and a "single" test input $i$ to it, there are a multitude of interleaved executions on $i$. These grow exponentially with the number of cores, making a systematic exploration of all executions on the test infeasible.
Our approach to tackling this problem is to choose (wisely) a subset of interleaved executions to test. This subset is the class of executions that violate "atomicity". A programmer writing a procedure often wants interference-free access to certain shared data, which enables him/her to reason about the procedure locally. The programmer then puts together concurrency control mechanisms to ensure atomicity, mainly by taking locks on the data accessed. This is, however, extremely error-prone: errors occur if not all the required locks for the data are acquired, non-uniform ordering of locking can cause deadlocks, and naive ways of locking can inhibit concurrency, which forces programmers to invent intricate ways to achieve concurrency and correctness at the same time. Recent studies of concurrency errors show that a majority of errors (69%) are atomicity violations. This motivates our choice of executions that violate atomicity as the criterion for selecting interleavings to execute.
We study the problem of determining, given a run of a concurrent program, whether there is any alternate execution of it that violates atomicity. The intuition is that if a concurrent program adopts "nested locking", atomicity violations can be predicted efficiently, without exploring all interleavings. In particular, for atomicity violations involving only two threads and a single variable, which cover many of the atomicity errors reported in bug databases, there are efficient algorithms that run in time linear in the length of the run and quadratic in the number of threads. A preliminary version of the tool that implements our algorithm has shown promising results. It scales well on benchmark concurrent programs and is effective in predicting a large number of atomicity violations even from a single run. To check whether a real error has been found, we are developing a scheduler that, given a program execution that violates atomicity, verifies whether that execution is feasible and whether the results it produces differ from the expected ones.
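The kind of two-thread, single-variable pattern involved can be sketched as follows (our own brute-force illustration based on the well-known unserializable read/write triples from the atomicity-violation literature; the talk's algorithm is predictive and linear-time, unlike this quadratic check, and the trace encoding here is hypothetical):

```python
# Toy detector for single-variable, two-thread atomicity-violation patterns
# (illustrative; not the talk's linear-time predictive algorithm).
# A trace is a list of (thread, op, var) with op in {"R", "W"}; each pair
# of same-thread accesses to a variable is treated as intended-atomic.
# The four classic unserializable triples (local1, remote, local2):

UNSERIALIZABLE = {("R", "W", "R"), ("W", "W", "R"),
                  ("W", "R", "W"), ("R", "W", "W")}

def violations(trace):
    found = []
    for i, (t1, op1, v1) in enumerate(trace):
        for j in range(i + 1, len(trace)):
            t2, op2, v2 = trace[j]
            if t2 != t1 or v2 != v1:
                continue
            # Two accesses by t1 to v1; look for an interposed remote access.
            for k in range(i + 1, j):
                tr, opr, vr = trace[k]
                if tr != t1 and vr == v1 and (op1, opr, op2) in UNSERIALIZABLE:
                    found.append((i, k, j))
    return found

# Thread 1 reads x twice, assuming atomicity; thread 2's interposed write
# makes the two reads inconsistent: pattern (R, W, R).
trace = [(1, "R", "x"), (2, "W", "x"), (1, "R", "x")]
print(violations(trace))   # [(0, 1, 2)]
```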

Verifying Tolerant Systems using Polynomial Approximations, Pavithra Prabhakar, University of Illinois at Urbana-Champaign

Hybrid Automata are a popular formalism for modelling systems with mixed discrete-continuous behavior. The verification problems for the general class of hybrid automata are undecidable even for simple properties like reachability. The typical approach to verify such systems is to construct a system with simpler dynamics that abstracts the original system. Analyzing the abstraction can give useful information about the original system. If the abstraction is safe, then one can conclude the safety of the original system. However, if the abstraction is unsafe, then usually no reliable information can be inferred about the original system.
Instead of using abstractions to simplify dynamics, we take a slightly different approach. We approximate a hybrid system with arbitrary flow functions by systems with polynomial flows; the verification of certain properties of systems with polynomial flows can be reduced to the first-order theory of reals, and is therefore decidable. The polynomial approximations that we construct epsilon-simulate (as opposed to simulate) the original system, and at the same time are tight. We show that for systems that we call "tolerant", safety verification of a system can be reduced to the safety verification of its polynomial approximation. Our main technical tool in proving this result is a logical characterization of epsilon-simulations.
This is joint work with Vladimeros Vladimerou, Mahesh Viswanathan and Geir Dullerud.

Verifying Imperative Abstractions with Dependent and Linear Types, Aaron Stump, The University of Iowa

The Guru verified-programming language combines a dependently typed pure functional programming language with a logic for machine-checked proofs about its programs. Imperative abstractions like mutable arrays or aliased linked data structures are supported by specifying a pure functional model, to use for proving properties of programs using the abstractions; and an imperative implementation in C, which the compiler inserts during compilation in place of the pure functional model. Linear types help ensure that the functional model's behavior matches the imperative implementation's. This talk will survey some imperative abstractions implemented using this approach.

Sireum/Topi LDP: A Lightweight Semi-Decision Procedure for Optimizing Symbolic Execution-based Analyses, Jason Belt, Kansas State University

Automated theorem proving techniques such as Satisfiability Modulo Theories (SMT) solvers have seen significant advances in the past several years. These advances, coupled with vast hardware improvements, have had a drastic impact on, for example, program verification techniques and tools. The general availability of robust general-purpose solvers has removed significant engineering overhead from designing and developing program verifiers. However, most solver implementations are designed to be used as black boxes, and because they aim to be general-purpose solvers, they often miss optimization opportunities that leveraging domain-specific knowledge would expose.
This paper presents our effort to leverage domain-specific knowledge to optimize symbolic execution (SymExe)-based analyses; we present optimization techniques incorporated into a lightweight semi-decision procedure (LDP) that yields up to an order of magnitude faster analysis when analyzing realistic programs and well-known algorithms. LDP sits between a SymExe-based analysis tool and an existing SMT solver; it aims to reduce the number of solver calls by intercepting them and attempting to solve the constraints with its own lightweight deductive engine.
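The interception idea can be sketched as follows (our own toy rules and interface; Sireum/Topi LDP's actual deductive engine and constraint language differ): cheap syntactic checks answer the easy queries, and only the residue reaches the expensive solver.

```python
# Sketch of a lightweight semi-decision procedure sitting in front of a
# full SMT solver (illustrative; the toy constraint format is ours).
# Constraints are triples (var, relop, int) with relop in {"==", "!=",
# "<", "<="}; the procedure answers "sat", "unsat", or "unknown".

def lightweight_check(constraints):
    eqs = {}
    for v, op, c in constraints:
        if op == "==":
            if v in eqs and eqs[v] != c:
                return "unsat"           # x == a and x == b with a != b
            eqs[v] = c
    for v, op, c in constraints:
        if v in eqs:
            k = eqs[v]
            if (op == "!=" and k == c) or (op == "<" and not k < c) \
               or (op == "<=" and not k <= c):
                return "unsat"           # contradicts a known equality
    if all(op == "==" for _, op, _ in constraints):
        return "sat"                     # consistent equalities only
    return "unknown"                     # residue: defer to the SMT solver

def check(constraints, full_solver):
    verdict = lightweight_check(constraints)
    if verdict != "unknown":
        return verdict                   # solver call intercepted
    return full_solver(constraints)      # fall back to the black box

# Stand-in for a real SMT solver call (hypothetical).
fallback = lambda cs: "sat"
print(check([("x", "==", 3), ("x", "!=", 3)], fallback))   # unsat, no call
print(check([("x", "<", 10)], fallback))                   # deferred: sat
```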

Decision Procedure for Partial Orders, Elena Sherman, University of Nebraska - Lincoln

Advanced automated decision procedures are gaining popularity for automating the analysis and verification of software. SMT frameworks come equipped with a collection of ready-made decision procedures, but what happens if a program analysis needs a new decision procedure? One approach is to use uninterpreted, or partially interpreted, functions, but this can lead to imprecision that might significantly degrade the quality of the software analysis results.
In this presentation, we describe how work on symbolic execution of Java programs has identified the need to reason about constraints related to the inheritance hierarchy. Those constraints can be expressed in a theory of partial orders and we describe our work on developing a decision procedure for partial orders (DP-PO). We summarize our findings to date and assess the benefits of using DP-PO in support of symbolic execution.
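A hedged sketch of deciding conjunctions of ordering literals, the flavor of constraint DP-PO targets (our own simplification: it treats the ordering as a preorder via reflexive-transitive closure and ignores antisymmetry):

```python
# Sketch of a decision procedure for conjunctions of ordering literals such
# as subtype constraints (illustrative; not the actual DP-PO procedure).
# Positive literals "a <= b" generate the least preorder containing them;
# a negative literal "not (a <= b)" contradicts the conjunction iff the
# closure already entails a <= b.

def po_sat(pos, neg):
    nodes = {v for c in pos + neg for v in c}
    below = {v: {v} for v in nodes}      # reflexivity
    for a, b in pos:
        below[a].add(b)
    changed = True                       # naive transitive closure
    while changed:
        changed = False
        for v in nodes:
            for w in set(below[v]):
                if not below[w] <= below[v]:
                    below[v] |= below[w]
                    changed = True
    return all(b not in below[a] for a, b in neg)

# Hypothetical inheritance facts: Student <= Person, Person <= Object.
pos = [("Student", "Person"), ("Person", "Object")]
print(po_sat(pos, [("Object", "Student")]))   # True: consistent
print(po_sat(pos, [("Student", "Object")]))   # False: entailed, contradiction
```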

SAT Solvers and Search-Based Client Applications, Brady Garvin, University of Nebraska - Lincoln

Combinatorial interaction testing (CIT) is a cost-effective sampling technique for discovering interaction faults in highly-configurable systems. But it assumes that feature choices are independent, which is not generally true of real software. Hence, testers often need an extension called constrained CIT where the legal feature combinations are encoded in a propositional formula.
During our recent work to improve a meta-heuristic constrained CIT sample generator we discovered that almost all of the challenges we faced stemmed from constraints. While we achieved acceptable performance using a SAT solver to determine if formulae were SAT or UNSAT, it became clear that by asking other sorts of questions we could do better. For instance, it would be very useful if, given a formula and a set of terms, we could request an assignment that maximizes the number of terms that are true.
Accordingly, we outline the mismatches between the needs of our application and the features that solvers offer. We also demonstrate that this disparity is not limited to the constrained CIT problem, but will arise in a variety of settings.
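The wished-for query, an assignment that satisfies the constraint formula while maximizing the number of true terms, is a partial-MaxSAT-style problem. A brute-force sketch (our own, exponential in the number of variables and purely illustrative of the query shape):

```python
# Brute-force version of the query the abstract wishes solvers offered:
# given a formula (conjunction of clauses over int-labeled variables,
# -v meaning "not v") and a set of terms (conjunctions of literals), find
# an assignment satisfying the formula that maximizes the number of true
# terms. Illustrative only; a real solver would not enumerate assignments.

from itertools import product

def holds(literal, assignment):
    return assignment[abs(literal)] == (literal > 0)

def best_assignment(variables, clauses, terms):
    best, best_count = None, -1
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if not all(any(holds(l, a) for l in c) for c in clauses):
            continue                     # must satisfy the constraints
        count = sum(all(holds(l, a) for l in t) for t in terms)
        if count > best_count:
            best, best_count = a, count
    return best, best_count

# Constraint: not (x1 and x2), i.e. clause (-1 or -2).
# Terms to maximize: {x1}, {x2}, {x1 and x2}.
a, n = best_assignment([1, 2], [[-1, -2]], [[1], [2], [1, 2]])
print(a, n)   # at most one of x1, x2 can be true: 1 term satisfied
```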