A logical way to test online software

Posted on April 19, 2021 by David Waern

This blog post was written some time ago and has since been edited so that it no longer describes an outdated version of Apilog. We have kept the post because it explains some of the motivation behind the project.

Experienced software developers know how much effort it takes to produce correct, bug-free software, even for relatively simple offline applications. That effort is multiplied for online software services, which often have complex, distributed internal architectures and continuously evolving functionality.

Developers of such services have many techniques at their disposal for improving quality: good implementation languages, static types, model checking, test frameworks, continuous integration, health checks and monitoring. One technique stands out as especially important: testing the service in question through its external interfaces, i.e. system testing. In fact, there is really no substitute for such testing, since it is the only practical way to verify the correctness of the implementation as a whole. For system testing to be truly effective, however, it needs to be done continuously and systematically, which has traditionally required a lot of manual effort.

Fortunately, automated testing is a long-standing research topic, and two techniques in particular have become popular in the software development community: property-based random testing and fuzz testing. Both techniques generate random input data to obtain large test coverage, saving developers the manual effort of writing concrete test cases.
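
To make the first of these concrete: in property-based testing, as popularized by the Haskell library QuickCheck, the developer states a general property and the framework checks it against randomly generated inputs. A minimal example:

```haskell
import Test.QuickCheck

-- A property that should hold for every input list: reversing
-- a list twice gives back the original list.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- QuickCheck generates 100 random lists by default and prints a
-- (shrunk) counterexample if the property ever fails.
main :: IO ()
main = quickCheck prop_reverseTwice
```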

Online software services are very commonly exposed through RPC protocols of various kinds, in particular HTTP-based APIs, which makes such protocols a natural medium through which to test them.

But although there are many interesting testing services and tools that target HTTP APIs, they are often not based on random testing, and those that are tend to be limited to specific patterns of API design.

On this blog, we will cover the development of a new service dedicated to automatic system testing through RPC APIs. It is being built from the ground up with a strong focus on random testing and with few assumptions about API structure.

The project, which has already been under development for several years, consists of two components: a language that we call Apilog and an upcoming online testing service that we call apilog.net.

Apilog is a language for specifying RPC APIs. It is based on logic programming and designed so that large and varied random data can be generated automatically from succinct API specifications.

The level of specification detail is decided by the author. It is easy to get going quickly by pinning down some simple health checks, and then to fill in more detail where it pays off. Such detail can consist of request and response body schemas, which can themselves vary in detail thanks to logic variables acting as “holes” where any value is accepted. It is possible to go further still and add pre- and post-conditions to operations, modelling parts of the state of the system under test.
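
To illustrate the idea of schemas with holes (a sketch of the concept only, not Apilog syntax), consider a schema type that mirrors the shape of JSON-like values, where a hole matches anything:

```haskell
-- JSON-like values and schemas over them. This is a conceptual
-- sketch in Haskell, not Apilog syntax: the point is that Hole
-- accepts any value, so a schema can be as loose or as detailed
-- as the author wants.
data Value  = VNull | VBool Bool | VNum Double | VStr String
            | VObj [(String, Value)]
  deriving (Eq, Show)

data Schema = Hole                     -- a "hole": accepts anything
            | SBool | SNum | SStr      -- accepts any value of one type
            | SObj [(String, Schema)]  -- constrains objects field by field
  deriving Show

matches :: Schema -> Value -> Bool
matches Hole      _          = True
matches SBool     (VBool _)  = True
matches SNum      (VNum _)   = True
matches SStr      (VStr _)   = True
matches (SObj fs) (VObj kvs) =
  all (\(k, sch) -> maybe False (matches sch) (lookup k kvs)) fs
matches _         _          = False
```

A response schema like `SObj [("id", SNum), ("profile", Hole)]` pins down the id field while leaving the profile entirely unspecified; more detail can be filled in later without rewriting the rest of the specification.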

Another idea behind the language is to allow API specifications to be written in a simple endpoint-by-endpoint fashion, a bit like Swagger or RAML descriptions, although with less indentation-based nesting.

We would like to close with some references to work that has inspired us.

The idea of a standalone language for expressing test-data generators was proposed by Lampropoulos et al. with their language Luck. Luck is a functional-logic language, and thus slightly different from Apilog, which is a pure logic programming language. Another difference is that the size and distribution of data must be explicitly programmed into Luck generators, whereas with Apilog we want to make this aspect as automatic as possible. To achieve this, we are experimenting with Boltzmann sampling.
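
For readers unfamiliar with the technique: a Boltzmann sampler draws objects from a combinatorial class with probability proportional to x^size, where the parameter x tunes the expected size of the output. Here is a minimal sketch for binary trees (Apilog's data is richer, but the principle is the same):

```haskell
import System.Random (randomRIO)

-- Binary trees; size = number of internal nodes.
data Tree = Leaf | Node Tree Tree deriving Show

-- The class satisfies T(x) = 1 + x * T(x)^2, which solves to
-- T(x) = (1 - sqrt(1 - 4x)) / (2x) for 0 < x <= 1/4.
treeGF :: Double -> Double
treeGF x = (1 - sqrt (1 - 4 * x)) / (2 * x)

-- Boltzmann sampler at parameter x: each tree t is drawn with
-- probability x^size(t) / T(x). A leaf is emitted with probability
-- 1 / T(x); otherwise we recurse into two subtrees. Pushing x toward
-- the singularity at 1/4 raises the expected size; in practice one
-- rejects samples that fall outside a target size window.
sample :: Double -> IO Tree
sample x = do
  u <- randomRIO (0 :: Double, 1)
  if u < 1 / treeGF x
    then pure Leaf
    else Node <$> sample x <*> sample x
```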

The idea of using logic programming for random testing of properties is not new. In fact, Roberto Blanco, Dale Miller and Alberto Momigliano published an article arguing for this idea while we were developing Apilog. Their article, Property-Based Testing via Proof Reconstruction, also describes an interesting technique for shrinking counterexamples, which we would like to add to our system.

One of the more interesting fuzz-testing tools for HTTP APIs is RESTler, by Microsoft Research (Patrice Godefroid, Marina Polishchuk). It is based on Swagger, which imposes a certain structure and type system on the API, and which cannot encode dependencies between operations or other logical properties. To find interesting sequences of operations to test, RESTler tries to automatically infer dependencies between operations. apilog.net is also able to find such interesting sequences, provided the dependencies have been explicitly specified. Since RESTler's focus is on fuzz testing, it mainly finds status 500 errors (internal crashes), whereas apilog.net can find any kind of error, given a detailed enough specification. That said, fuzzing is really good at finding 500 errors, since it generates a lot of “almost correct” input. We would like to explore a general way of supporting fuzz testing by adapting our logic programming search algorithm to find “almost correct” solutions to goals.
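
As a tiny illustration of why fuzzing excels here (a generic sketch, not RESTler's actual algorithm): start from a known-valid request body and corrupt a few characters. Most of the results are well-formed enough to get past the routing layer, yet wrong enough to exercise the server's error handling:

```haskell
import Control.Monad (foldM)
import System.Random (randomRIO)

-- Mutation-based fuzzing in miniature: corrupt one to three
-- characters of a valid input, e.g. a JSON request body, producing
-- "almost correct" variants of it.
mutate :: String -> IO String
mutate input = do
  n <- randomRIO (1, 3 :: Int)
  foldM (\s _ -> mutateOnce s) input [1 .. n]

-- Replace one randomly chosen character with a random printable one.
mutateOnce :: String -> IO String
mutateOnce "" = pure ""
mutateOnce s  = do
  i <- randomRIO (0, length s - 1)
  c <- randomRIO (' ', '~')
  let (before, _ : after) = splitAt i s
  pure (before ++ c : after)
```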

Finally, a promising project in the area of testing online services is Quickstrom by Oscar Wickström. Its focus is not primarily testing of HTTP APIs but testing of web application UIs. It does so rather beautifully with a specification language based on temporal logic, a good fit for abstractly capturing user interface flows. HTTP APIs, however, are typically less oriented around states and flows (state changes) and more oriented around resources and operations. We believe they are more naturally modelled by state-machine transition relations expressed in a simpler logic.
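
To make that last point concrete, here is a hedged sketch (again in Haskell, not Apilog) of a transition relation for a hypothetical key-value HTTP API: the model state is a map, and each operation relates the state before, the observed response, and the state after:

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

-- Model state for a hypothetical key-value HTTP API.
type State = Map String String

data Op       = Put String String  -- PUT /keys/:k with a body
              | Get String         -- GET /keys/:k
              | Delete String      -- DELETE /keys/:k

data Response = Ok String | NoContent | NotFound

-- The transition relation: given the model state before an operation,
-- which responses and successor states are acceptable?
step :: State -> Op -> Response -> State -> Bool
step s (Put k v)  NoContent s' = s' == Map.insert k v s
step s (Get k)    (Ok v)    s' = s' == s && Map.lookup k s == Just v
step s (Get k)    NotFound  s' = s' == s && Map.notMember k s
step s (Delete k) NoContent s' = s' == Map.delete k s
step _ _          _         _  = False
```

Testing then amounts to checking that every observed sequence of requests and responses is accepted by some run of this relation.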

That is all for now. Thank you for reading, and do not hesitate to contact us if you have any questions or comments, or if you just want to discuss testing!