Category Archives: Python

pyenv has proven itself over the years

Back around 2010 I was manually managing production-critical CPython installations, compiled in special ways. I learned a lot about how to properly isolate Python installations from each other. I also learned how important it is to never confuse a “system Python” (as many Linux distributions have one) with a Python build that you can be in control of. Since then I have never touched a system Python anymore (it is the system’s Python after all, and none of my business).

For developing, testing, and running my software, I needed to manage custom Python installations. That involved building CPython from source, many times. At some point around 2015 I adopted pyenv for doing exactly that. Back then, I also adopted pyenv-virtualenv for managing so-called “virtual environments” derived from the individual Python builds managed by pyenv.

I come to say: this method has been a joy to use over the last years. It has never let me down; it has always provided me with simplicity, reliability, a predictable outcome based on which serious engineering work could be done. I have adopted it for continuous integration pipelines; I use it for building container images, towards more reproducible test environments. It always works.

I want to share the boring stuff I did today with pyenv/pyenv-virtualenv, as I have done it for years, always with a smile, thinking and appreciating “that just worked as expected”.

Today, for a program I am developing right now, I wanted to set up a CPython 3.6.9 build, with a virtual environment derived from it to install specific dependencies into it. I started with

cd ~/.pyenv && git pull

for updating my local pyenv installation to learn about current Python releases (I have been using the same pyenv installation for years, migrated it between development machines). I issued the following command for pyenv to download a CPython 3.6.9 source archive, and to build it:

pyenv install 3.6.9

I found that the resulting build didn’t provide the lzma module because I didn’t have the corresponding header files available on my Fedora installation. I installed the relevant header files with a

dnf install xz-devel

and simply repeated

pyenv install 3.6.9

This resulted in a CPython build including the lzma module. I then went ahead and derived a virtual environment from that build, for the specific purpose of developing my current pet project:

pyenv virtualenv 3.6.9 venv369-goeffel

I then activated said virtual environment with

pyenv activate venv369-goeffel

and started to use it for specific development work.

Not spectacular, right? That’s it, precisely. The power here comes from the simplicity and reliability. A big thank you goes out to the pyenv maintainers.

I think the following shows that I should start deleting old things, but I think it also shows the use case where pyenv/pyenv-virtualenv really shines:

$ pyenv versions
* system (set by /home/jp/.pyenv/version)
  2.7.15
  2.7.15/envs/gipcdev-2715
  3.5.2
  3.5.2/envs/venv352-analysis-2
  3.5.2/envs/venv352-beets
  3.5.2/envs/venv352-bouncer
  3.5.2/envs/venv352-bouncer19
  3.5.2/envs/venv352-bouncer19-2
  3.5.2/envs/venv352-bouncer-open
  3.5.2/envs/venv352-dcos-commons-simumation
  3.5.2/envs/venv352-jupyter
  3.5.2/envs/venv352-modern-crypto
  3.6.6
  3.6.6/envs/dcos-docker
  3.6.6/envs/dcos-e2e-zk35
  3.6.6/envs/gipc-py-366-gevent14
  3.6.6/envs/messer-hdf5
  3.6.6/envs/pandas-dev-env
  3.6.6/envs/test-gipc-100-release
  3.6.6/envs/test-gipc-101-release
  3.6.6/envs/venv366-bouncer-enterprise
  3.6.6/envs/venv366-bouncer-open
  3.6.6/envs/venv366-dataanalysis
  3.6.9
  bouncer-deptest
  cryptotest
  dataanalysis
  dcos-docker
  dcos-e2e-zk35
  gipcdev-2715
  gipc-py-366-gevent14
  messer-hdf5
  pandas-dev-env
  pypy2.7-5.9.0
  pypy2.7-5.9.0/envs/pypy27-gevent-gipc-dev
  pypy27-gevent-gipc-dev
  pypy3.5-5.9.0
  pypy3.5-5.9.0/envs/pypy35-gevent-gipc-dev
  pypy3.5-6.0.0
  pypy3.5-6.0.0/envs/venvpypy35600-gipc
  pypy35-gevent-gipc-dev
  pypy-5.6.0
  pypy-5.6.0/envs/venvpypy560-gevent120-gipc
  test-gipc-100-release
  test-gipc-101-release
  venv2710-kazoo
  venv2710-pysamltest
  venv2710-samldev
  venv342
  venv342-bouncer
  venv342-bouncer2
  venv342-bouncerdeps
  venv342-bouncerdeps2
  venv342-bouncerold
  venv342-bouncertmep
  venv342-pyoidc
  venv342-samldev
  venv342-test
  venv342-tmp
  venv343-bouncer
  venv343-saml
  venv344-bouncer
  venv344-consensus
  venv344-dev
  venv344-gunicorndev
  venv345-bouncer
  venv345-cryptography-patch
  venv345-kazam
  venv352-analysis-2
  venv352-beets
  venv352-bouncer
  venv352-bouncer19
  venv352-bouncer19-2
  venv352-bouncer-open
  venv352-dcos-commons-simumation
  venv352-jupyter
  venv352-modern-crypto
  venv366-bouncer-enterprise
  venv366-bouncer-open
  venv366-dataanalysis
  venv-dcos-342
  venv-graphtool
  venv-kazam
  venvpy3boto3
  venvpypy35600-gipc
  venvpypy560-gevent120-gipc
  venvtmp

gipc 1.0.0 release

More than three years after the 0.6.0 release I have published gipc version 1.0.0 today.

Quick links

Release highlights

This release focuses on reliability and platform compatibility. It brings along a number of changes relevant for running it on Windows and macOS, as well as for running it under PyPy.

Both, gevent 1.2 and 1.3 are now officially supported. On Linux, gipc now officially supports CPython 2.7, 3.4, 3.5, 3.6, PyPy2.7, and PyPy3. On Windows, gipc officially supports gevent 1.3 on CPython 2.7, 3.4, 3.5, 3.6, and 3.7. Support for gevent 1.1 and CPython 3.3 has been dropped.

The API did not change. In view of the stability of the API over the recent years I thought it is time to officially declare it as such, and to follow the semantic versioning spec‘s point 5: “Version 1.0.0 defines the public API” :-).

For this release most of the work went into

  • fixing a small number of platform-dependent bugs (this one was interesting, and this one was pretty insightful and ugly).
  • setting up a continuous integration (CI) pipeline for Linux and macOS (on Travis CI) as well as for Windows (via AppVeyor).
  • Re-writing and re-styling significant parts of the documentation: the new docs are online and can be found at https://gehrcke.de/gipc (for comparison: old docs).
  • moving the repository from Bitbucket to GitHub (I also migrated issues using this well-engineered helper).
  • making tests more stable.
  • running the example programs as part of CI, on all supported platforms (required a number of consolidations).

Acknowledgements

I would like to thank the following people who have helped with this release, be it by submitting bug reports, by asking great questions, with testing, or with a bit of code: Heungsub Lee, James Addison, Akhil Acharya, Oliver Margetts.

Changelog for this release

For the record, the complete changelog for this release copied from CHANGELOG.rst:

New platform support:

  • Add support for PyPy on Linux. Thanks to Oliver Margetts and to Heungsub Lee for patches.

Fixes:

  • Fix a bug as of which gipc crashed when passing “raw” pipe handles between processes on Windows (see issue #63).
  • Fix can't pickle gevent._semaphore.Semaphore error on Windows.
  • Fix ModuleNotFoundError in test_wsgi_scenario.
  • Fix signal handling in example infinite_send_to_child.py.
  • Work around segmentation fault after fork on Mac OS X (affected test_wsgi_scenario and example program wsgimultiprocessing.py).

Test / continuous integration changes:

  • Fix a rare instability in test_exitcode_previous_to_join.
  • Make test_time_sync more stable.
  • Run the example programs as part of CI (run all on Linux and Mac, run most on Windows).
  • Linux main test matrix (all combinations are covered):
    • gevent dimension: gevent 1.2.x, gevent 1.3.x.
    • Python implementation dimension: CPython 2.7, 3.4, 3.5, 3.6, PyPy2.7, PyPy3.
  • Also test on Linux: CPython 3.7, pyenv-based PyPy3 and PyPy2.7 (all with gevent 1.3.x only).
  • Mac OS X tests (all with gevent 1.3.x):
    • pyenv Python builds: CPython 2.7, 3.6, PyPy3
    • system CPython
  • On Windows, test with gevent 1.3.x and CPython 2.7, 3.4, 3.5, 3.6, 3.7.

Potentially breaking changes:

  • gevent 1.1 is not tested anymore.
  • CPython 3.3 is not tested anymore.

How to raise UnicodeDecodeError in Python 3

For debugging work, I needed to manually raise UnicodeDecodeError in CPython 3(.4). Its constructor requires 5 arguments:

>>> raise UnicodeDecodeError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: function takes exactly 5 arguments (0 given)

The docs specify which properties an already instantiated UnicodeError (the base class) has: https://docs.python.org/3/library/exceptions.html#UnicodeError

These are five properties. It is obvious that the constructor needs these five pieces of information. However, setting them requires knowing the expected order, since the constructor does not take keyword arguments. The signature also is not documented when invoking help(UnicodeDecodeError), which suggests that this interface is implemented in C (which makes sense, as it affects the bowels of CPython’s text processing).

So, the only true reference for finding out the expected order of arguments is the C code implementing the constructor. It is defined by the function UnicodeDecodeError_init in the file Objects/exceptions.c in the CPython repository. The essential part are these lines:

if (!PyArg_ParseTuple(args, "O!OnnO!",
     &PyUnicode_Type, &ude->encoding,
     &ude->object,
     &ude->start,
     &ude->end,
     &PyUnicode_Type, &ude->reason)) {
         ude->encoding = ude->object = ude->reason = NULL;
         return -1;
}

That is, the order is the following:

  1. encoding (unicode object, i.e. type str)
  2. object that was attempted to be decoded (here a bytes object makes sense, whereas in fact the requirement just is that this object must provide the Buffer interface)
  3. start (integer)
  4. end (integer)
  5. reason (type str)

Hence, now we know how to artificially raise a UnicodeDecodeError:

>>> o = b'\x00\x00'
>>> raise UnicodeDecodeError('funnycodec', o, 1, 2, 'This is just a fake reason!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'funnycodec' codec can't decode byte 0x00 in position 1: This is just a fake reason!

gipc 0.6.0 released

I have just released gipc 0.6.0 introducing support for Python 3. This release has been motivated by gevent 1.1 being just around the corner, which will also introduce Python 3 support.

Changes under the hood

The new gipc version has been verified to work on CPython 2.6.9, 2.7.10, 3.3.6, and 3.4.3. Python 3.4 support required significant changes under the hood: internally, gipc uses multiprocessing.Process, whose implementation drastically changed from Python 3.3 to 3.4. Most notably, on Windows, the arguments to the hard-coded CreateProcess() system call were changed, preventing automatic inheritance of file descriptors. Hence, implementation of a new approach for file descriptor transferral between processes was required for supporting Python 3.4 as well as future Python versions. Reassuringly, the unit test suite required close-to-zero adjustments.

The docs got a fresh look

I have used this opportunity to amend the documentation: it now has a fresh look based on a brutally modified RTD theme. This provides a much better reading experience as well as great support for narrow screens (i.e. mobile support). I hope you like it: https://gehrcke.de/gipc

Who’s using gipc?

I have searched the web a bit for finding interesting use cases. These great projects use gipc:

Are you successfully applying gipc in production? That is always great to hear, so please drop me a line!

Availability

As usual, the release is available via PyPI (https://pypi.python.org/pypi/gipc). Please visit https://gehrcke.de/gipc for finding API documentation, code examples, installation notes, and further in-depth information.

Git: list authors sorted by the time of their first contribution

I have created a micro Python script, git-authors, which parses the output of git log and outputs the authors in order of their first contribution to the repository.

By itself, this is a rather boring task. However, I think the resulting code is quite interesting, because it applies a couple of really important concepts and Python idioms within just a couple of lines. Hence, this small piece of code is not only useful in practice; it also serves an educational purpose.

The latter is what this article focuses on. What can you expect to learn? Less than ten lines of code are discussed. Center of this discussion is efficient stream processing and proper usage of data structures: with a very simple and yet efficient architecture, git-authors can analyze more than 500.000 commits of the Linux git repository with near-zero memory requirements and consumption of only 1 s of CPU time on a weak machine.

Usage: pipe formatted data from git log to git-authors

The recommended usage of git-authors is:

$ git log --encoding=utf-8 --full-history --reverse "--format=format:%at;%an;%ae" | \
    git-authors > authors-by-first-contrib.txt

A little background information might help for understanding this. git-authors expects to be fed with a data stream on standard input (stdin), composed of newline-separated chunks. Each chunk (line) is expected to represent one commit, and is expected to be of a certain format:

timestamp;authorname;authoremail

Furthermore, git-authors expects to retrieve these commits sorted by time, ascendingly (newest last).

git log can be configured to output the required format (via --format=format:%at;%an;%ae) and to output commits sorted by time (default) with the earliest commits first (using --reverse).

git log writes its output data to standard output (stdout). The canonical method for connecting stdout of one process to stdin of another process is a pipe, a transport mechanism provided by the operating system.

The command line options --encoding=utf-8 and --full-history should not be discussed here, for simplicity.

The input evaluation loop

Remember, the git-authors program expects to retrieve a data stream via stdin. An example snippet of such a data stream could look like this:

[...]
1113690343;Bert Wesarg;wesarg@informatik.uni-halle.de
1113690343;Ken Chen;kenneth.w.chen@intel.com
1113690344;Christoph Hellwig;hch@lst.de
1113690345;Bernard Blackham;bernard@blackham.com.au
1113690346;Jan Kara;jack@suse.cz
[...]

The core of git-authors is a loop construct built of seven lines of code. It processes named input stream in a line-by-line fashion. Let’s have a look at it:

  1. seen = set()
  2. for line in stdin:
  3.     timestamp, name, mail = line.strip().split(";")
  4.     if name not in seen:
  5.         seen.add(name)
  6.         day = time.strftime("%Y-%m-%d", time.gmtime(float(timestamp)))
  7.         stdout.write("%04d (%s): %s (%s)\n" % (len(seen), day, name, mail))

There are a couple of remarks to be made about this code:

  • This code processes the stream retrieved at standard input in a line-by-line fashion: in line 2, the script makes use of the fact that Python streams (implemented via IOBase) support the iterator protocol, meaning that they can be iterated over, whereas a single line is yielded from the stream upon each iteration (until the resource has been entirely consumed).

  • The data flow in the loop is constructed in a way that the majority of the payload data (the line’s content) is processed right away and not stored for later usage. This is a crucial concept, ensuring a small memory footprint and, even more important, a memory footprint that does (almost) not depend on the size of the input data (which also means that the memory consumption becomes largely time-independent). The minimum amount of data that this program requires to keep track of across loop iterations is a collection of unique author names already seen (repetitions are to be discarded). And that is exactly what is stored in the set called seen. How large might this set become? An estimation: how many unique author names will the largest git project ever accumulate? A couple of thousand maybe? As of summer 2015, the Linux git repository counts more than half a million commits and about 13.000 unique author names. Linux should have one of the largest if not the largest git history. It can safely be assumed that O(10^5) short strings is the maximum amount of data this program will ever need to store in memory. How much memory is required for storing a couple of thousand short strings in Python? You might want to measure this, but it is almost nothing, at least compared to how much memory is built into smartphones. When analyzing the Linux repository, the memory footprint of git-authors stayed well below 10 MB.

  • Line 3 demonstrates how powerful Python’s string methods are, especially when cascaded. It also shows how useful multiple assignment via sequence unpacking can be.

  • “Use the right data structure(s) for any given problem!” is an often-preached rule, for ensuring that the time complexity of the applied algorithm is not larger than the problem requires. Lines 1, 4, and 5 are a great example for this, I think. Here, we want to keep track of the authors that have already been observed in the stream (“seen”). This kind of problem naturally requires a data structure allowing for lookup (Have I seen you yet?) and insert (I’ve seen you!) operations. Generally, a hash table-like data structure fits these problems best, because it provides O(1) (constant) complexity for both, lookup and insertion. In Python, the dictionary implementation as well as the set implementation are both based on a hash table (in fact, these implementations share a lot of code). Both dicts and sets also provide a len() method of constant complexity, which I have made use of in line 7. Hence, the run time of this algorithm is proportional to the input size (to the number of lines in the input). It is impossible to scale better than that (every line needs to be looked at), but there are many sub-optimal ways to implement a solution that scales worse than linearly.

  • The reason why I chose to use a set instead of a dictionary is rather subtle: I think the add() semantics of set fit the given problem really well, and here we really just want to keep track of keys (the typical key-value association of the dictionary is not required here). Performance-wise, the choice shouldn’t make a significant difference.

  • Line 6 demonstrates the power of the time module. Take a Unix timestamp and generate a human-readable time string from it: easy. In my experience, strftime() is one of the most-used methods in the time module, and learning its format specifiers by heart can be really handy.

  • Line 7: old-school powerful Python string formatting with a very compact syntax. Yes, the “new” way for string formatting (PEP 3101) has been around for years, and deprecation of the old style was once planned. Truth is, however, that the old-style formatting is just too beloved and established, and will probably never even become deprecated, let alone removed. Its functionality was just extended in Python 3.5 via PEP 461.

Preparation of input and output streams

What is not shown in the snippet above is the preparation of the stdin and stdout objects. I have come up with the following method:

kwargs = {"errors": "replace", "encoding": "utf-8", "newline": "\n"}
stdin = io.open(sys.stdin.fileno(), **kwargs)
stdout = io.open(sys.stdout.fileno(), mode="w", **kwargs)

This is an extremely powerful recipe for obtaining the same behavior on Python 2 as well as on 3, but also on Windows as well as on POSIX-compliant platforms. There is a long story behind this which should not be the focus of this very article. In essence, Python 2 and Python 3 treat sys.stdin/sys.stdout very differently. Grabbing the underlying file descriptors by their balls via fileno() and creating TextIOWrapper stream objects on top of them is a powerful way to disable much of Python’s automagic and therefore to normalize behavior among platforms. The automagic I am referring to here especially includes Python 3’s platform-dependent automatic input decoding and output encoding, and universal newline support. Both really can add an annoying amount of complexity in certain situations, and this here is one such case.

Example run on CPython’s (inofficial) git repository

I applied git-authors to the current state of the inofficial CPython repository hosted at GitHub. As a side node, this required about 0.1 s of CPU time on my test machine. I am showing the output in full length below, because I find its content rather interesting. We have to appreciate that the commit history is not entirely broken, despite CPython having switched between different version control systems over the last 25 years. Did you know that Just van Rossum also was a committer? :-)

0001 (1990-08-09): Guido van Rossum (guido@python.org)
0002 (1992-08-04): Sjoerd Mullender (sjoerd@acm.org)
0003 (1992-08-13): Jack Jansen (jack.jansen@cwi.nl)
0004 (1993-01-10): cvs2svn (tools@python.org)
0005 (1994-07-25): Barry Warsaw (barry@python.org)
0006 (1996-07-23): Fred Drake (fdrake@acm.org)
0007 (1996-12-09): Roger E. Masse (rmasse@newcnri.cnri.reston.va.us)
0008 (1997-08-13): Jeremy Hylton (jeremy@alum.mit.edu)
0009 (1998-03-03): Ken Manheimer (klm@digicool.com)
0010 (1998-04-09): Andrew M. Kuchling (amk@amk.ca)
0011 (1998-12-18): Greg Ward (gward@python.net)
0012 (1999-01-22): Just van Rossum (just@lettererror.com)
0013 (1999-11-07): Greg Stein (gstein@lyra.org)
0014 (2000-05-12): Gregory P. Smith (greg@mad-scientist.com)
0015 (2000-06-06): Trent Mick (trentm@activestate.com)
0016 (2000-06-07): Marc-André Lemburg (mal@egenix.com)
0017 (2000-06-09): Mark Hammond (mhammond@skippinet.com.au)
0018 (2000-06-29): Fredrik Lundh (fredrik@pythonware.com)
0019 (2000-06-30): Skip Montanaro (skip@pobox.com)
0020 (2000-06-30): Tim Peters (tim.peters@gmail.com)
0021 (2000-07-01): Paul Prescod (prescod@prescod.net)
0022 (2000-07-10): Vladimir Marangozov (vladimir.marangozov@t-online.de)
0023 (2000-07-10): Peter Schneider-Kamp (nowonder@nowonder.de)
0024 (2000-07-10): Eric S. Raymond (esr@thyrsus.com)
0025 (2000-07-14): Thomas Wouters (thomas@python.org)
0026 (2000-07-29): Moshe Zadka (moshez@math.huji.ac.il)
0027 (2000-08-15): David Scherer (dscherer@cmu.edu)
0028 (2000-09-07): Thomas Heller (theller@ctypes.org)
0029 (2000-09-08): Martin v. Löwis (martin@v.loewis.de)
0030 (2000-09-15): Neil Schemenauer (nascheme@enme.ucalgary.ca)
0031 (2000-09-21): Lars Gustäbel (lars@gustaebel.de)
0032 (2000-09-24): Nicholas Riley (nriley@sabi.net)
0033 (2000-10-03): Ka-Ping Yee (ping@zesty.ca)
0034 (2000-10-06): Jim Fulton (jim@zope.com)
0035 (2001-01-10): Charles G. Waldman (cgw@alum.mit.edu)
0036 (2001-03-22): Steve Purcell (steve@pythonconsulting.com)
0037 (2001-06-25): Steven M. Gava (elguavas@python.net)
0038 (2001-07-04): Kurt B. Kaiser (kbk@shore.net)
0039 (2001-07-04): unknown (tools@python.org)
0040 (2001-07-20): Piers Lauder (piers@cs.su.oz.au)
0041 (2001-08-23): Finn Bock (bckfnn@worldonline.dk)
0042 (2001-08-27): Michael W. Hudson (mwh@python.net)
0043 (2001-10-31): Chui Tey (chui.tey@advdata.com.au)
0044 (2001-12-19): Neal Norwitz (nnorwitz@gmail.com)
0045 (2001-12-21): Anthony Baxter (anthonybaxter@gmail.com)
0046 (2002-02-17): Andrew MacIntyre (andymac@bullseye.apana.org.au)
0047 (2002-03-21): Walter Dörwald (walter@livinglogic.de)
0048 (2002-05-12): Raymond Hettinger (python@rcn.com)
0049 (2002-05-15): Jason Tishler (jason@tishler.net)
0050 (2002-05-28): Christian Tismer (tismer@stackless.com)
0051 (2002-06-14): Steve Holden (steve@holdenweb.com)
0052 (2002-09-23): Tony Lownds (tony@lownds.com)
0053 (2002-11-05): Gustavo Niemeyer (gustavo@niemeyer.net)
0054 (2003-01-03): David Goodger (goodger@python.org)
0055 (2003-04-19): Brett Cannon (bcannon@gmail.com)
0056 (2003-04-22): Alex Martelli (aleaxit@gmail.com)
0057 (2003-05-17): Samuele Pedroni (pedronis@openend.se)
0058 (2003-06-09): Andrew McNamara (andrewm@object-craft.com.au)
0059 (2003-10-24): Armin Rigo (arigo@tunes.org)
0060 (2003-12-10): Hye-Shik Chang (hyeshik@gmail.com)
0061 (2004-02-18): David Ascher (david.ascher@gmail.com)
0062 (2004-02-20): Vinay Sajip (vinay_sajip@yahoo.co.uk)
0063 (2004-03-21): Nicholas Bastin (nick.bastin@gmail.com)
0064 (2004-03-25): Phillip J. Eby (pje@telecommunity.com)
0065 (2004-08-04): Matthias Klose (doko@ubuntu.com)
0066 (2004-08-09): Edward Loper (edloper@gradient.cis.upenn.edu)
0067 (2004-08-09): Dave Cole (djc@object-craft.com.au)
0068 (2004-08-14): Johannes Gijsbers (jlg@dds.nl)
0069 (2004-09-17): Sean Reifschneider (jafo@tummy.com)
0070 (2004-10-16): Facundo Batista (facundobatista@gmail.com)
0071 (2004-10-21): Peter Astrand (astrand@lysator.liu.se)
0072 (2005-03-28): Bob Ippolito (bob@redivi.com)
0073 (2005-06-03): Georg Brandl (georg@python.org)
0074 (2005-11-16): Nick Coghlan (ncoghlan@gmail.com)
0075 (2006-03-30): Ronald Oussoren (ronaldoussoren@mac.com)
0076 (2006-04-17): George Yoshida (dynkin@gmail.com)
0077 (2006-04-23): Gerhard Häring (gh@ghaering.de)
0078 (2006-05-23): Richard Jones (richard@commonground.com.au)
0079 (2006-05-24): Andrew Dalke (dalke@dalkescientific.com)
0080 (2006-05-25): Kristján Valur Jónsson (kristjan@ccpgames.com)
0081 (2006-05-25): Jack Diederich (jackdied@gmail.com)
0082 (2006-05-26): Martin Blais (blais@furius.ca)
0083 (2006-07-28): Matt Fleming (mattjfleming@googlemail.com)
0084 (2006-09-05): Sean Reifscheider (jafo@tummy.com)
0085 (2007-03-08): Collin Winter (collinw@gmail.com)
0086 (2007-03-11): Žiga Seilnacht (ziga.seilnacht@gmail.com)
0087 (2007-06-07): Alexandre Vassalotti (alexandre@peadrop.com)
0088 (2007-08-16): Mark Summerfield (list@qtrac.plus.com)
0089 (2007-08-18): Travis E. Oliphant (oliphant@enthought.com)
0090 (2007-08-22): Jeffrey Yasskin (jyasskin@gmail.com)
0091 (2007-08-25): Eric Smith (eric@trueblade.com)
0092 (2007-08-29): Bill Janssen (janssen@parc.com)
0093 (2007-10-31): Christian Heimes (christian@cheimes.de)
0094 (2007-11-10): Amaury Forgeot d'Arc (amauryfa@gmail.com)
0095 (2008-01-08): Mark Dickinson (dickinsm@gmail.com)
0096 (2008-03-17): Steven Bethard (steven.bethard@gmail.com)
0097 (2008-03-18): Trent Nelson (trent.nelson@snakebite.org)
0098 (2008-03-18): David Wolever (david@wolever.net)
0099 (2008-03-25): Benjamin Peterson (benjamin@python.org)
0100 (2008-03-26): Jerry Seutter (jseutter@gmail.com)
0101 (2008-04-16): Jeroen Ruigrok van der Werven (asmodai@in-nomine.org)
0102 (2008-05-13): Jesus Cea (jcea@jcea.es)
0103 (2008-05-24): Guilherme Polo (ggpolo@gmail.com)
0104 (2008-06-01): Robert Schuppenies (okkotonushi@googlemail.com)
0105 (2008-06-10): Josiah Carlson (josiah.carlson@gmail.com)
0106 (2008-06-10): Armin Ronacher (armin.ronacher@active-4.com)
0107 (2008-06-18): Jesse Noller (jnoller@gmail.com)
0108 (2008-06-23): Senthil Kumaran (orsenthil@gmail.com)
0109 (2008-07-22): Antoine Pitrou (solipsis@pitrou.net)
0110 (2008-08-14): Hirokazu Yamamoto (ocean-city@m2.ccsnet.ne.jp)
0111 (2008-12-24): Tarek Ziadé (ziade.tarek@gmail.com)
0112 (2009-03-30): R. David Murray (rdmurray@bitdance.com)
0113 (2009-04-01): Michael Foord (fuzzyman@voidspace.org.uk)
0114 (2009-04-11): Chris Withers (chris@simplistix.co.uk)
0115 (2009-05-08): Philip Jenvey (pjenvey@underboss.org)
0116 (2009-06-25): Ezio Melotti (ezio.melotti@gmail.com)
0117 (2009-08-02): Frank Wierzbicki (fwierzbicki@gmail.com)
0118 (2009-09-20): Doug Hellmann (doug.hellmann@gmail.com)
0119 (2010-01-30): Victor Stinner (victor.stinner@haypocalc.com)
0120 (2010-02-23): Dirkjan Ochtman (dirkjan@ochtman.nl)
0121 (2010-02-24): Larry Hastings (larry@hastings.org)
0122 (2010-02-26): Florent Xicluna (florent.xicluna@gmail.com)
0123 (2010-03-25): Brian Curtin (brian.curtin@gmail.com)
0124 (2010-04-01): Stefan Krah (stefan@bytereef.org)
0125 (2010-04-10): Jean-Paul Calderone (exarkun@divmod.com)
0126 (2010-04-18): Giampaolo Rodolà (g.rodola@gmail.com)
0127 (2010-05-26): Alexander Belopolsky (alexander.belopolsky@gmail.com)
0128 (2010-08-06): Tim Golden (mail@timgolden.me.uk)
0129 (2010-08-14): Éric Araujo (merwok@netwok.org)
0130 (2010-08-22): Daniel Stutzbach (daniel@stutzbachenterprises.com)
0131 (2010-09-18): Brian Quinlan (brian@sweetapp.com)
0132 (2010-11-05): David Malcolm (dmalcolm@redhat.com)
0133 (2010-11-09): Ask Solem (askh@opera.com)
0134 (2010-11-10): Terry Reedy (tjreedy@udel.edu)
0135 (2010-11-10): Łukasz Langa (lukasz@langa.pl)
0136 (2012-06-24): Ned Deily (nad@acm.org)
0137 (2011-01-14): Eli Bendersky (eliben@gmail.com)
0138 (2011-03-10): Eric V. Smith (eric@trueblade.com)
0139 (2011-03-10): R David Murray (rdmurray@bitdance.com)
0140 (2011-03-12): orsenthil (orsenthil@gmail.com)
0141 (2011-03-14): Ross Lagerwall (rosslagerwall@gmail.com)
0142 (2011-03-14): Reid Kleckner (reid@kleckner.net)
0143 (2011-03-14): briancurtin (brian.curtin@gmail.com)
0144 (2011-03-24): guido (guido@google.com)
0145 (2011-03-30): Kristjan Valur Jonsson (sweskman@gmail.com)
0146 (2011-04-04): brian.curtin (brian@python.org)
0147 (2011-04-12): Nadeem Vawda (nadeem.vawda@gmail.com)
0148 (2011-04-19): Giampaolo Rodola' (g.rodola@gmail.com)
0149 (2011-05-04): Alexis Metaireau (alexis@notmyidea.org)
0150 (2011-05-09): Gerhard Haering (gh@ghaering.de)
0151 (2011-05-09): Petri Lehtinen (petri@digip.org)
0152 (2011-05-24): Charles-François Natali (neologix@free.fr)
0153 (2011-07-17): Alex Gaynor (alex.gaynor@gmail.com)
0154 (2011-07-27): Jason R. Coombs (jaraco@jaraco.com)
0155 (2011-08-02): Sandro Tosi (sandro.tosi@gmail.com)
0156 (2011-09-28): Meador Inge (meadori@gmail.com)
0157 (2012-01-09): Terry Jan Reedy (tjreedy@udel.edu)
0158 (2011-05-19): Tarek Ziade (tarek@ziade.org)
0159 (2011-05-22): Martin v. Loewis (martin@v.loewis.de)
0160 (2011-05-31): Ralf Schmitt (ralf@systemexit.de)
0161 (2011-09-12): Jeremy Kloth (jeremy.kloth@gmail.com)
0162 (2012-03-14): Andrew Svetlov (andrew.svetlov@gmail.com)
0163 (2012-03-21): krisvale (sweskman@gmail.com)
0164 (2012-04-24): Marc-Andre Lemburg (mal@egenix.com)
0165 (2012-04-30): Richard Oudkerk (shibturn@gmail.com)
0166 (2012-05-15): Hynek Schlawack (hs@ox.cx)
0167 (2012-06-20): doko (doko@ubuntu.com)
0168 (2012-07-16): Atsuo Ishimoto (ishimoto@gembook.org)
0169 (2012-09-02): Zbigniew Jędrzejewski-Szmek (zbyszek@in.waw.pl)
0170 (2012-09-06): Eric Snow (ericsnowcurrently@gmail.com)
0171 (2012-09-25): Chris Jerdonek (chris.jerdonek@gmail.com)
0172 (2012-12-27): Serhiy Storchaka (storchaka@gmail.com)
0173 (2013-03-31): Roger Serwy (roger.serwy@gmail.com)
0174 (2013-03-31): Charles-Francois Natali (cf.natali@gmail.com)
0175 (2013-05-10): Andrew Kuchling (amk@amk.ca)
0176 (2013-06-14): Ethan Furman (ethan@stoneleaf.us)
0177 (2013-08-12): Felix Crux (felixc@felixcrux.com)
0178 (2013-10-21): Peter Moody (python@hda3.com)
0179 (2013-10-25): bquinlan (brian@sweetapp.com)
0180 (2013-11-04): Zachary Ware (zachary.ware@gmail.com)
0181 (2013-12-02): Walter Doerwald (walter@livinglogic.de)
0182 (2013-12-21): Donald Stufft (donald@stufft.io)
0183 (2014-01-03): Daniel Holth (dholth@fastmail.fm)
0184 (2014-01-27): Yury Selivanov (yselivanov@sprymix.com)
0185 (2014-04-15): Kushal Das (kushaldas@gmail.com)
0186 (2014-06-29): Berker Peksag (berker.peksag@gmail.com)
0187 (2014-07-16): Tal Einat (taleinat@gmail.com)
0188 (2014-10-08): Steve Dower (steve.dower@microsoft.com)
0189 (2014-10-18): Robert Collins (rbtcollins@hp.com)
0190 (2015-03-22): Paul Moore (p.f.moore@gmail.com)

Resources

Similar functionality is provided by the more full-blown frameworks grunt-git-authors and gitstats.

Some resources that might be insightful for you: