Google Summer of Code

In this post I go over the work I did in this year’s edition of Google Summer of Code (GSoC) in detail. As the final deliverable for my project I had to hand in a slightly rushed technical report, in which I documented the work in only the barest detail. Now that I’ve had a couple more days to recall the whole experience, I can write a proper blog entry.
Preface
While adding support for Riak to the benchmark that I’m implementing, I needed to fork the Riak Erlang client so that it would work with modern Erlang versions. Since I was getting to know the database more and more, it made sense to reach out to Russell Brown about a project for this year’s GSoC. To make a long story short, he agreed to mentor me. We had some difficulty getting Riak accepted in time for GSoC, but in the end I got in. Sweet.
The project
I offered to work on areas where I knew work was needed, but Russell told me that a community of people experienced with Riak was slowly making contributions and that several pain points were already being ironed out, which led us to discuss other ideas. Russell proposed reworking Riak Test, since it was infamous for being tough to set up, to the point where people wouldn’t even try to use it. The idea seemed challenging, and I thought that looking at some Riak test suites would be a good way to figure out how some of the internal components work, so I said yes.
Opening the hood on Riak Test
The code in the Riak Test repository is made up of three main components: a custom test runner; a code intercept library, which is very useful for the kinds of tests that are actually performed; and some helper modules for starting up and tearing down Riak clusters, along with extra setup of the testing environment.
The test runner
The easiest way to understand how the old test runner worked is to look at some of the test suites. Fortunately there are some trivial examples which I’m going to show you. Here is a test suite that is supposed to fail:
%% @doc A test that always returns `fail'.
-module(always_fail_test).
-export([confirm/0]).
confirm() ->
    fail.
And here we see an example of how to build a passing test:
%% @doc A test that always returns `pass'.
-module(always_pass_test).
-behavior(riak_test).
-export([confirm/0]).
confirm() ->
    pass.
There are a couple of things to unpack here. The first is that confirm/0 is the callback that the test runner calls,
and its return value is an atom, pass or fail, that says whether the test succeeded. The second is that you are supposed to put the setup,
test and teardown code inside this callback, so each test module effectively contains the logic for a single test.
At least that might have been the idea, but…
In theory there is no difference between theory and practice. In practice there is.
There were some test suites with hundreds of lines of code that were all packed into the confirm/0 callback.
To make things worse, there was no detailed reporting of any kind, which made failing test cases extremely frustrating.
You’d have a test consisting of a bunch of code you might not be familiar with that was passing one day and failing the next,
and the only bit of information you’d get was the fail result: sometimes there wasn’t even a function name or line number to refer to, and you’d be on your own.
Today we have Common Test, which is excellent for building the sorts of intricate test cases that you can probably imagine Riak has, but I thought that things might have been different at the time Riak Test was made. While I was looking into the source code, Russell brought Gordon Guthrie and Bryan Hunt on board, and none of them was aware of any real reason why Basho would have reinvented the wheel in this way, especially given what looked like a botched end result. Bryan described it as “a classic example of technical debt”, a description I found intriguing. As a student I’m usually racing from one project to the next, and I generally don’t stop to think about code that I have already written; for me, technical debt was a buzzword that got thrown around in software engineering classes, and yet there it was. Experiencing it first hand[^1] gave me a clear impression of why it was such an important problem to fix: it was restricting contributions to the fairly small number of people who could successfully operate Riak Test.
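For comparison, here is a minimal sketch of how the same kind of checks could be laid out as a Common Test suite. It is not taken from Riak Test: the module name, the case names and the fake_* helpers are all made up for illustration, and the helpers only stand in for real cluster operations. The point is the shape: setup and teardown live in their own callbacks, and every case listed in all/0 is run and reported separately.

```erlang
%% Hypothetical suite: the module name, case names and fake_* helpers
%% are illustrative and are not part of Riak Test.
-module(command_line_SUITE).

-export([all/0, init_per_suite/1, end_per_suite/1]).
-export([ping_up/1, status_up/1]).

%% Common Test runs every case listed here and reports each one separately.
all() ->
    [ping_up, status_up].

%% Setup runs once; whatever it adds to Config is passed to every case.
init_per_suite(Config) ->
    Node = start_fake_node(),
    [{node, Node} | Config].

%% Teardown runs once after the last case, even if some cases failed.
end_per_suite(Config) ->
    stop_fake_node(proplists:get_value(node, Config)),
    ok.

%% A failed match crashes this case only; the other cases still run.
ping_up(Config) ->
    Node = proplists:get_value(node, Config),
    pong = fake_ping(Node).

status_up(Config) ->
    Node = proplists:get_value(node, Config),
    up = fake_status(Node).

%% Stand-ins for real cluster helpers (assumptions, not a real API).
start_fake_node() -> fake_node.
stop_fake_node(fake_node) -> ok.
fake_ping(fake_node) -> pong.
fake_status(fake_node) -> up.
```

Assuming the module is compiled and on the code path, a suite like this can be run with something like ct_run -suite command_line_SUITE, and a failing case shows up in the report under its own name instead of as a bare fail.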
Using Riak Test involves booking an entire morning
Getting Riak Test to run wasn’t exactly difficult once you knew how it worked, but it still took ages to run[^2] even the simplest of test cases: the project build script would download and generate releases for multiple Riak versions, which isn’t ideal if you want to get something running quickly. Although the problem was at its worst on the first run, it was obvious that we needed to improve in this area as well.
Test suites are just one big test with lots of nested cases
Let’s look at a slightly more complex example of a test suite and its confirm/0 callback:
-module(basic_command_line).
-behavior(riak_test).
-export([confirm/0]).
%% Provides the ?assertEqual macro used below
-include_lib("eunit/include/eunit.hrl").
confirm() ->
    %% Deploy a node to test against
    lager:info("Deploy node to test command line"),
    [Node] = rt:deploy_nodes(1),
    ?assertEqual(ok, rt:wait_until_nodes_ready([Node])),
    %% Verify node-up behavior
    ping_up_test(Node),
    attach_direct_up_test(Node),
    status_up_test(Node),
    console_up_test(Node),
    start_up_test(Node),
    getpid_up_test(Node),
    %% Stop the node, Verify node-down behavior
    stop_test(Node),
    ping_down_test(Node),
    attach_down_test(Node),
    attach_direct_down_test(Node),
    status_down_test(Node),
    console_test(Node),