Some words about my relationship to and with code.
🛠️ 💻 ⌨️
Software engineering is a profession, but I also like to think of it as a craft and an art. I like the art of system design, of building, and of programming.
I respect complexity, I love to simplify. Taming complexity is key. Simplification is key.
Most of my passion for software engineering stems from my fascination with the magic behind the scenes: fault tolerance, network technology, scalable and distributed systems. I really like reading The Architecture of Open Source Applications and The Linux Programming Interface by Michael Kerrisk. Around 2010, I discovered How Complex Systems Fail by Richard Cook and it opened my eyes.
Getting feedback, seeing value: I love building products. I really like to get software out there, and I get excited by those really challenging debugging sessions when a lot is at stake.
It feels like I have built quite a lot of software of various kinds in my life so far. Here, I would like to highlight two products that I have poured a particularly large amount of passion and energy into, that have shaped me, and that influence how I act and think about software engineering and software engineering organizations today:
- DC/OS. At Mesosphere/D2iQ, we were at the forefront of large-scale containerization and enabled the industry to enter the new era of distributed application design before Kubernetes was ready. DC/OS reached end of life in 2021, but “For more than five years, DC/OS has enabled some of the largest, most sophisticated enterprises in the world to achieve unparalleled levels of efficiency, reliability, and scalability from their infrastructure”. For example, League of Legends ran on DC/OS :-).
- Opstrace. We built a novel observability product: cloud-native, secure, deeply tested. Opstrace was acquired by GitLab in 2021.
As a researcher by heart and education, I love data and have thoughts to share about data presentation, numerical data analysis, statistics, and machine learning.
At the heart of the software engineering day-to-day, I have learned to focus on defining what the desired behavior is, and on creating clear concepts. I care about specification and interfaces most. I love to refine the problem statement before entering solution space. In any engineering org, I might be the strongest advocate for the importance of deliberate error handling, and for providing great error messages. That saves three types of people a bunch of time and energy: yourself, your fellow engineers, and your users.
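To make the point about deliberate error handling a bit more concrete, here is a minimal sketch in Python. The names `ConfigError` and `load_config` are made up for illustration and are not taken from any of the projects mentioned on this page. The idea is to catch each expected failure separately and to say what failed, where, and why, instead of letting a bare traceback reach the user.

```python
import json


class ConfigError(Exception):
    """Hypothetical error type whose message is written for humans."""


def load_config(path):
    """Read a JSON config file and return it as a dict, or raise ConfigError."""
    # Failure mode 1: the file cannot be read at all.
    try:
        with open(path, encoding="utf-8") as f:
            raw = f.read()
    except OSError as exc:
        raise ConfigError(
            f"cannot read config file {path!r}: {exc.strerror}"
        ) from exc

    # Failure mode 2: the file is readable, but the content is not JSON.
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ConfigError(
            f"config file {path!r} is not valid JSON "
            f"(line {exc.lineno}, column {exc.colno}): {exc.msg}"
        ) from exc

    # Failure mode 3: valid JSON, but not the shape we expect.
    if not isinstance(config, dict):
        raise ConfigError(
            f"config file {path!r} must contain a JSON object at the top "
            f"level, got {type(config).__name__}"
        )
    return config
```

Each message names the file, the kind of problem, and, where available, the position in the file. Using `raise ... from exc` keeps the original exception attached for fellow engineers who need the full picture while debugging.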
Code readability, quality, testing, maintainability: sometimes I feel like my opinions about these are too strong. But it might be a good thing? I know there are no simple truths, but there’s quite a wide spectrum between “probably good” and “hurtfully bad”, and we always have a choice where in that spectrum we want to be.
Security is a special focus of mine: I took part in hacking competitions in my early programming days. Later, at Mesosphere, I led the security team. We dealt with the canonical bits of enterprise security (compliance/certification, X.509 hierarchy, single sign-on, LDAP, etc.), but also did innovative work in the field of multi-tenancy and fine-grained authorization in distributed systems (patent).
I got my fair share of experience in trying to build really robust software. For Enterprise DC/OS, my team built the identity and access management service (IAM). Downtime was intolerable. We delivered it as a highly available service distributed across five machines, backed by a distributed (consistent) relational database (well, that was a ride). It had to work reliably in air-gapped environments.
Two more things: I think that it’s the longer-term maintenance of software with a larger user base that teaches many important lessons, and we should always want to learn and like to discover new things! Paradigms change, and progress grows exponentially.
Open source software projects that I started and (maybe, more or less) maintain:
- gipc, providing gevent-cooperative child processes and inter-process communication. Used by Quantopian, Ajenti, Chronology, GDriveFS and others.
- WP-GeSHi-Highlight, a syntax highlighting plugin for WordPress, used by more than 1,000 websites.
- python-cmdline-bootstrap, a structure template for Python command line applications, simplifying release and distribution via setuptools/PyPI/pip for Python 2 and 3.
- goeffel, a tool for measuring the resource utilization of a specific process over time.
- github-repo-stats, a GitHub Action for advanced repository traffic analysis and reporting.
A selection of articles published through my blog:
- Reading files line by line in C++ using ifstream: dealing correctly with badbit, failbit, eofbit, and perror()
- Distributing a Python command line application
- Sharing state in AngularJS: be aware of $watch issues and race conditions during app initialization
- Discourse Docker container: send mail through Exim
- Building numpy and scipy with Intel compilers and Intel MKL on a 64 bit machine
- A command line argument is raw binary data. It comes with limitations and needs interpretation
- Data analysis with pandas: enjoy the awesome
- In-memory SQLite database and Flask: a threading trap
- Setting up Torque (PBS) for GPU job scheduling
- Recover a gzipped HTML response from the browser cache
More to be found in the technology category of this blog.
A web service that I once created (not maintained anymore):
- Songkick events for Google’s Knowledge Graph, extracts tour dates for a certain artist from Songkick and converts the data to a special JSON-LD format, applicable for showing events in Google’s Knowledge Graph. Used by Milky Chance, The Black Feathers and others.
Some open source software projects that I have contributed to:
- gevent, a coroutine-based network library
- pandas, a data analysis and manipulation library for Python
- Radicale, a CalDAV/CardDAV server
- ldap3, a Python LDAP library
- gunicorn, a Python WSGI (HTTP) Server
- Falcon, a Python web framework
- pysaml2, a Python SAML library
- python3-saml, another Python SAML library
- docker-openldap, a docker image to run OpenLDAP
- Twitter Bootstrap
- Open Babel
Stale/idle software projects that I once put into the world:
🪦 The graveyard? One day… :-)
- timegaps, a command line program for thinning out a collection of items such that the “time gaps” between accepted items become larger with increasing age of items. Useful for implementing backup retention policies.
- molecular-structure-comparison, a Python framework for analyzing molecular structures, featuring, for instance, a DBSCAN clustering method with automatic parameter optimization, and a simple PDB file format parser.
- beautiful-readme, creating a mobile-friendly static website from your README file.
- clobi, a job scheduling system supporting virtual machines in multiple Infrastructure-as-a-Service computing clouds, developed in the course of my Google Summer of Code 2009.
- AWSAC, Amazon Web Services for ATLAS Computing.
- Fotobatch, for renaming, resizing, and turning many digital photos with only one click.
- Keks, a highly decoupled micro job queuing system based on Python, Redis, and gevent.
- Schlonz, a Python program for building modern static image galleries using cutting-edge HTML5 techniques and some of the best Lightbox and image gallery techniques available.
- latexletter, a Python module that provides a simple abstraction for the creation of LaTeX letters based on template files.
- Galleria Classicmod, a free theme for Galleria, providing, among others, a fullscreen option.
Update: I put this XKCD here in early 2010, but in 2023 it has new relevance! :)