What do code reviews at Microsoft and in Open Source Projects have in common?
The first talk I attended was given by Alberto Bacchelli, who did a very interesting study on code reviews. He started out with the observation that, regardless of the tool, code review today is informal, tool-based and asynchronous.
In the good ol' days we had code inspection, a process that took forever, and it evolved into code review. Alberto's main research question was "Why do we do code review?". He approached Microsoft Research and started out with observations, interviews, and surveys of management and developers. The top reason seems to be improving code quality rather than finding bugs, as one would expect.
When Alberto reviewed the submitted comments in MS Codeflow, the system Microsoft uses to do its code review, he used a clustering method. The interesting observation that came out of this analysis is that the comments focused on low-level defects and did not discuss design.
To make the comparison with the FOSS world Alberto chose Gromacs and conQAT, two projects I am not familiar with. Alberto's observation was that in both projects the majority of changes were non-functional, and the same pattern was observed at Microsoft. Personally I think that is very interesting.
Currently Alberto is busy with studies in the field of software analytics, which is data science applied to software code. He takes data sources like IDE logs, version control logs, issue tracker logs and review data. He classifies the data, clusters it and looks for patterns. The questions he is currently trying to answer are:
- Who is the optimal person to review my code?
- How many times do you have multiple changes in one iteration?
- Are there parts that are more likely to contain bugs, and how can we focus on the risky things?
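To make the first question a bit more concrete, here is a minimal sketch of how version control logs could be mined for reviewer suggestions. This is not Alberto's method, just one naive heuristic: suggest the people who most often touched the files in the change. The function name `suggest_reviewers`, the history format and the sample data are all assumptions made up for illustration.

```python
from collections import Counter

def suggest_reviewers(history, changed_files, author, top_n=2):
    """Rank candidate reviewers by how often they touched the changed files.

    history: list of (committer, [files]) tuples, e.g. parsed from a VCS log
             (hypothetical format, assumed for this sketch).
    """
    changed = set(changed_files)
    scores = Counter()
    for committer, files in history:
        if committer == author:
            continue  # the author should not review their own change
        overlap = len(changed & set(files))
        if overlap:
            scores[committer] += overlap
    return [name for name, _ in scores.most_common(top_n)]

# Made-up sample history
history = [
    ("alice", ["core/io.c", "core/net.c"]),
    ("bob", ["core/io.c"]),
    ("carol", ["docs/readme.txt"]),
    ("alice", ["core/io.c"]),
    ("dave", ["core/net.c", "core/io.c"]),
]

reviewers = suggest_reviewers(history, ["core/io.c", "core/net.c"], author="dave")
print(reviewers)
```

A real system would probably also weigh recency, review load and who actually left useful comments in the past, which is exactly the kind of data the review logs could provide.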
Rspamd is an open source spam filtering system developed by Vsevolod Stakhov. Rspamd grew out of Vsevolod's frustration with managing a big cluster of SpamAssassin machines that couldn't handle the load. He wrote rspamd in C, and it uses an event-driven model with the possibility of writing rules in Lua.
Vsevolod argues that there are basically two kinds of spam:
- fraud: Nigerian fraud, phishing, ...
- advertisement: classic Viagra, social networks, ...
The second part of the talk was rather technical, covering how Vsevolod implemented his ideas. I really liked the talk but was a bit disappointed when he said that it is hard to find package maintainers for the major Linux distros. It looks like a promising piece of software, but in a commercial environment you are often required to work with standard packages and can't tell management the spam filter is going down because you need to recompile the new version.
The next talk I attended was on systemtap. The presenter, Frank Ch. Eigler, is a rather funny character on stage. The idea of virtual patching has been around for a while but I haven't seen implementations in real life. Frank's idea finds its origins in dtrace and (scripted) gdb.
systemtap is also an event-driven system and is implemented with a kernel module. The logic is:
- study the vulnerability
- analyze the conditions of the vulnerability
- draft an algorithm to make the hostile data safe or reject it
- express the algorithm in a script
- run the script
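The steps above can be illustrated with a small sketch. This is plain Python, not a systemtap script, and it only mimics the idea: a probe sits in front of a vulnerable function, checks the conditions of the vulnerability, and either makes the data safe or rejects it. The vulnerable function `set_hostname`, the 16-character limit and the helper names are all hypothetical.

```python
from typing import Callable, Optional

MAX_NAME_LEN = 16  # assumed buffer size of the hypothetical vulnerable code

def set_hostname(name: str) -> str:
    """Stand-in for vulnerable code that we cannot rebuild right now."""
    return f"hostname set to {name}"

def virtual_patch(check: Callable[[str], Optional[str]]):
    """Wrap a function with a probe that inspects its argument first.

    `check` returns a safe replacement value, or None to reject the call.
    """
    def decorator(func):
        def probe(arg: str):
            safe = check(arg)
            if safe is None:
                raise ValueError("rejected by virtual patch")
            return func(safe)
        return probe
    return decorator

def sanitize_name(name: str) -> Optional[str]:
    # Condition 1: embedded NUL bytes are never legitimate, so reject
    if "\x00" in name:
        return None
    # Condition 2: over-long input would overflow the buffer, so truncate
    return name[:MAX_NAME_LEN]

# "Run the script": install the probe in front of the vulnerable function
patched_set_hostname = virtual_patch(sanitize_name)(set_hostname)

print(patched_set_hostname("a" * 40))
```

The point of the real thing is that the probe is attached to running code from the outside, so nothing has to be recompiled or restarted while you wait for a proper fix.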
The evil side in my head was wondering how we could use such a system as a rootkit.
How to run a telco on free software
This talk, given by Dave Neary, was eye-opening. Although my previous employer was an ISP, there is still a big difference between an ISP and a telco.
Dave started out with the history of telcos in the Western world. According to him there have been 2 major revolutions in the telco world. The first was the addition of data to what used to be a voice-only story. The second was the change in medium: where we used to have copper wire, we now have fibre, mobile, satellite, etc.
During his talk Dave explained OpenNFV. NFV stands for network function virtualization. Examples of network functions that can be virtualized are load balancers, firewalls and intrusion detection devices. The OpenNFV standard is based on devops principles and it was interesting to see the projects Dave mentioned.
Here is a list in random order: