I was at the Free and Open Source Software Developers' European Meeting 2016. Commonly shortened to FOSDEM, the event brings thousands of people in the free and open source software community from all over the world to ULB, Brussels, to exchange best practices, share new releases, and generally discuss the state of open source development.
The weekend was frighteningly dense (569 speakers! 618 events! 52 tracks!) so there’s very little hope of writing a full summary without the help of a small army of engineers. Based on the small sample of talks and panels I did get to attend, a few trends in the FOSS community became clear: code quality keeps getting better, developer experience as part of a community is growing in importance, and the steady move to microservice architecture is making its mark on best practices. Each section contains a link to the relevant talk page on fosdem.org, and in many cases slides and recordings are also available.
The Never-Ending Task of Improving Code Quality
On the Enterprise track, Alberto Bacchelli took an academic look at how engineering managers' expectations don't necessarily align with reality in code reviews. His team looked at comments from reviews at Microsoft and discovered that the vast majority were quality-oriented rather than bug-focused (which was the manager-mandated Reason Number One for reviews). Whether this misalignment is actually a bad thing is up for debate – bugs were, after all, found. He followed with a quick overview of what makes a good code review (familiarity with the code being reviewed unsurprisingly being most important), and a discussion of some prototyped data-driven tools that could improve code reviews:
- Automatic risk detection uses past review comments against similar code, identifying potential problem changes and flagging them for closer attention. The speaker flagged some simple cases like possible white-space inconsistencies and missing documentation comments, but I could see this also being extended to function calls that frequently throw exceptions, or areas of frequently changed code due to bugs. This sounds promising.
- Automated reviewer suggestion based on experience with particular code sections or types. Could be biased to prefer new developers if you want knowledge transfer.
- Change untangling separates one large commit into several self-contained changes, each able to be reviewed separately. While it may be technically feasible to split a tangled commit into functionally distinct reviewable chunks, I'm not convinced it's hugely valuable, since context is often important when reviewing a change. Instead, I would encourage developers to separate their commits into smaller, logically distinct parts as the norm. In many cases the problem is simply laziness – some developers don't want to split their commits this way – and in an environment where that's unavoidable, I can see the merit in a change untangler.
In the ‘Coding for Language Communities’ devroom, Dwayne Bailey gave a rundown of common failures in the internationalisation of open source projects, which frequently rely on external contributors to act as localisers. Chiefly, he focused on tooling: localisers use tools, just like devs. Instead of reinventing the wheel when it comes to translation file formats, he recommends using one of the formats already in existence, like PO, XLIFF, or TS. Making new tools and formats creates unnecessary barriers to contribution as localisers struggle with unfamiliar or poorly designed tools.
He also covered some other common things developers do that make life difficult for localisation contributors:
- splitting sentences – other languages don’t segment in the same way, so keep sentences whole.
- overusing cultural idioms
- assuming plurals act the same across languages
- not designing in right-to-left support from the start
- improperly formatting numbers and dates – Differing formats, while generally legible, will alienate your userbase. As a Briton I will buy you several drinks of your choosing if your app does DD/MM/YYYY for me.
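The plurals point in particular trips up projects that hard-code English rules. A minimal sketch of the right approach, using Python's stdlib gettext (`cart_summary` is a made-up example function; a real project would load compiled .mo catalogues for each target language instead of the English fallback used here):

```python
import gettext

# Fall back to the built-in English behaviour; a real app would load
# per-language .mo files here, whose catalogues define each language's
# plural rules (from one form in Japanese to six in Arabic).
translations = gettext.NullTranslations()
ngettext = translations.ngettext

def cart_summary(n):
    # Wrong: "You have %d item(s)" bakes English pluralisation into the code.
    # Right: pass both forms and the count, and let the translation
    # catalogue choose the correct plural form for the active language.
    return ngettext("You have %d item", "You have %d items", n) % n

print(cart_summary(1))  # You have 1 item
print(cart_summary(3))  # You have 3 items
```

The key design point is that the translator, not the developer, decides how plurals work – the source code only supplies the count and the message variants.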
In a lively session, Richard Stallman discussed the ins and outs of the GPLv3 license, focusing especially on the Affero clause and why it was left out of GPLv3.
An interesting point was raised about GitHub: it's too easy to make code available with no license, creating murky legal waters in this culture of code reuse. GitHub has recently started prompting for a license, and it also operates choosealicense.com, which discusses the implications of having no license – but perhaps it could do more to educate maintainers about those implications at repo init time. As an aside, I'd like to point out that all of MediaMath's open source projects specify a license.
Developers, Developers, Developers
In his keynote, Michael Kerrisk, the maintainer of man-pages, explored kernel API development with a view to long-term reliability while avoiding breaking the ABI and incurring the wrath of Linus. The ‘don't break user space’ mantra means that API changes are rendered almost impossible without painful workarounds. This is a big topic and difficult to cover in a summary, but the main thrust was on testing, specifications, and how to deal with regressions or changes in requirements.
Many kernel APIs have very poor test coverage, and often little to no formal specification. The result is an interface that grows organically, with no source of truth for expected behaviour. Unit tests give some assurance that regressions haven't been introduced, but undocumented idiosyncrasies may become dependencies in applications, leaving you supporting them going forward. Even with tests, problems arise when real-world applications start driving your API.
Versioning in a kernel API is hard because you can't turn anything off – so you end up with families of similarly named functions, each doing slightly different things, with all of that behaviour fixed forever.
Having gone through one round of deprecation on the TerminalOne Execution and Management API, it makes me wonder how expectations of backwards compatibility will change as more and more services expose a user-facing REST API over the web.
Before UX started to become important in defining the identity of software companies, the developer/designer relationship was somewhat antagonistic. Today, these two roles are becoming increasingly dependent on each other. The fact that there was a whole segment of the conference dedicated to improving the working relationships between developers and designers indicates that organizations are beginning to pay attention to how interpersonal friction slows down innovation.
In the Open Source Design devroom, UX designers Belen Barros Pera and Hollie Lubbock went over some of the pain points they've experienced. The biggest takeaway from these two talks is that communication is of paramount importance. Too many software teams treat design as a waterfall process, when it really must be considered part of the agile model of the software development itself. Domain-specific knowledge and language, tooling, and communication methods are all barriers to fluid communication here; Hollie said that most arguments she's been involved in were due to misunderstandings rather than actual differences of opinion. Belen suggested meeting the devs on their own terms (learn the product, git, IRC, Jira, …), though there's probably a flipside conversation to be had about how devs can better accommodate less technical project contributors. Both speakers stressed the importance of prototyping as an effective form of specification: not only does it foster collaboration (it's something devs can be directly involved in, and even use as a starting point for implementation), it's also far more effective than the verbose design documents that, surprisingly, designers prefer.
Jan Iversen spoke about how LibreOffice strives to maintain an open, welcoming environment for contributors new to its large codebase, thanks in large part to the care put into creating a personal relationship with individuals. In addition to being assigned mentors, new people on the mailing list get an email response with links to a getting-started guide and an ‘easy-hack’ list of issues and features. Part of what makes these easy to implement is that the tickets are straightforward, with comments pruned down to the essentials. The importance of a well-maintained wiki was stressed, as well as taking part in community events such as Google Summer of Code. Many companies – MediaMath included – are trying to model their contributor experience on good customer service, and it's great to see LibreOffice setting a great example here.
Microservices and Service Architecture
Adrian Cole hosted an introduction to Zipkin – a utility for generating trace graphs from events in a distributed system (think data flow through a microservice architecture). This was very similar to a product we developed at a previous employer to trace an activity through our system by generating data points in Splunk with an overarching ‘Activity ID’, but with the improvement of adding a head and tail ID to each event, which allows for the generation of pretty graphs. I really appreciated the light-touch approach to implementation here: add the Zipkin lib and some decorators to your code, and it'll send events to your existing datastore.
Towards the end of the weekend, the conference organisers talked about their setup in some detail and showed off a cool example of infrastructure as code using Ansible playbooks. There's a desire to make the conference infrastructure as portable as possible – partly so that the community can reuse their ‘conference in a box’, and partly so they can get some sleep the night before!
See you at FOSDEM ‘17!