Experiments with C and Go: Spec-ing out a new changelog
Here at MediaMath, we store and act on a lot of data – many terabytes a day. A small subset of that data – all of the marketing campaign, ad unit, client, spend, billing, and reporting data used by our digital marketing platform, TerminalOne (T1) – is stored in a PostgreSQL database. And while a changelog exists for this database, it was built early in MediaMath's development and is tightly coupled with – and therefore reflective of – only select core T1 API endpoints, while ignoring other endpoints entirely.
We needed to build a new service that could replace our existing changelog, and we had four main requirements:
- The new service must be scalable.
- The new service must be standalone, decoupled from other services.
- The new service must listen to all changes in the database, with the intent that changes can be filtered further down the pipeline.
- The new service must not interfere with the normal operation of any other services.
I was lucky enough to be given this proof of concept (POC) project to build out a new changelog. My experience building it has allowed me to experiment with lots of new tools and services, and it has turned out to be a great example of why I love working for MediaMath.
An Early Idea, Ultimately Scrapped
One early idea was to use triggers. We scrapped it because triggers would have added work for PostgreSQL on every commit and would have required creating a trigger for every new table. We would also have needed a mechanism for publishing the data, such as writing to a file or to RabbitMQ, and possibly additional Postgres modules. While doable, it seemed likely to slow down Postgres, and it definitely would have required modifications to the Postgres server, which we were trying to avoid.
WAL files & xlogdump
It was suggested that I look into reading the write-ahead log (WAL) files that Postgres uses for replication/crash recovery. I found a cool utility called “xlogdump” that can read and print the contents of WAL files. I considered running that in a loop and parsing the text, but found it wouldn’t be terribly efficient.
As a POC, I rewrote the xlogdump utility as a C library, and used its functions to build a small Go program that watched the files for changes and exposed a small API over TCP. Messages were published in a simple text-based protocol over a socket. When clients connect, they say either “start” or “start xxxxxx,” the latter specifying the last message id they saw to mitigate problems with slow/late joiners.
How a Gopher talks to C
Calling C libraries from Go code turned out to be ridiculously simple! To link with the library and include its headers, this is all you need:
/*
#cgo CFLAGS: -Imylibrary
#cgo LDFLAGS: mylibrary/mylibrary.a
*/
import "C"
And then to call code and marshal types you do something like this:
cfilename := C.CString(filename)
defer C.free(unsafe.Pointer(cfilename)) // C.CString allocates on the C heap; freeing avoids a leak
result := C.parseWalFile(cfilename, C.uint32_t(lastOffset))
Making bindings for native libraries in other languages usually requires a lot more boilerplate, with utilities like SWIG only simplifying it somewhat. I was really impressed with how easy it was to get this working in Go.
Finding out when things change
There is also an amazing Go library called "fsnotify" (https://github.com/go-fsnotify/fsnotify) that publishes events for changes in a directory. Integrating it into my POC allowed me to avoid polling and to read changes incrementally from the files with the most recent data, rather than parsing every file every time.
Ready for prod?
Feedback on the POC has been positive thus far, and our data infrastructure and API teams have come up with all kinds of potential use cases for it. One likely outcome will be to rebuild tasks that materialize artifacts from Postgres so that instead of running on bulk data dumps, they can run on incrementally updated datasets.
However, my POC wasn't quite production-ready. When it came time to automate the build, it was clear that the C library was going to be a hassle. Go has largely the same capabilities as C, but adds garbage collection, which simplifies things like reading on-disk data structures into memory. I scrapped the library used in the POC in favor of a pure Go solution, and in the process did a lot of cleanup. Around this time another developer came onto the project and drastically improved test coverage as well.
At the time of this writing, the service is deployed internally and teams are now starting to integrate with it. Having the freedom to explore new technologies and have a say in how we approach problems is what makes this company the best place to work in my opinion.
Postscript: Postgres 9.4 & Logical Decoding
Readers familiar with the Postgres 9.4 feature "Logical Decoding" may realize that it provides similar capabilities to my POC. In fact, if we could guarantee that all of our Postgres servers were on 9.4, we might have used that feature instead. Even when we start to deploy Postgres 9.4, my software will still be useful, as it provides a unified API for our legacy servers and the new 9.4 ones. Further, my WAL-based solution works even when Postgres is not running, unlike "Logical Decoding."