Distributing a Python command line application

In this article I show how to create a minimal Python command line application, called ‘bootstrap’. I describe how to set it up for publication on PyPI, after which the user can conveniently install it via pip install cmdline-bootstrap (the name under which the example project is published on PyPI). The installation immediately makes the ‘bootstrap’ command available to the user, for convenient invocation on Unix as well as on Windows. I show how to make the example application live within a proper package structure and how to make it callable and testable in different convenient ways, on Python 2 and Python 3 (I actually tested this on CPython 2.7 and 3.3).

Update March 25, 2014: Thanks for all the feedback. I have updated the article in many places. The template structure is now using a __main__.py for convenience (see below). I have created a git repository from this template structure. Feel free to clone, fork, and star: python-cmdline-bootstrap on GitHub.

Background

There are many ways to achieve the same thing. In the paragraphs below, I try to give proper advice, including current official recommendations, and schemes well-established in the Python community. One thing you need to know, and probably already realized yourself, is that Python packaging and package distribution can be quite tedious. In the past years, the recommendations for doing things “the right way” have often changed. Finally, however, it looks like we have something definite which I can base my article on.

Besides using the right tools, such as setuptools (instead of distribute) and twine, there is a lot of tension hidden in the details of the file/directory structure, and in the way you organize your application in terms of packages and modules. When do you need absolute or relative imports? What would be a convenient entry point for your application? For which parts do you need a wrapper? What is the right way to test the application without installing it? I do not want to go deeply into all these details in this article, but rather present a working solution. Rest assured that I have consulted official docs and guidelines, and taken a deeper look into how various established applications (such as sphinx, coverage, pep8, pylint) are set up in this regard. I have also consulted several great answers on StackOverflow (e.g. this, this, this, this, and this), and finally implemented things myself (also here).

For this article, I try to break down all this valuable input to a minimal bare bones bootstrap project structure that should get you going. I try to reduce complexity, to avoid confusing constructs, and to not discuss difficulties anymore, from here on. The outcome is a very short and simple thing, really.

File structure

I recommend the following basic structure:

python-cmdline-bootstrap/
├── docs
├── test
├── bootstrap
│   ├── __init__.py
│   ├── __main__.py
│   ├── bootstrap.py
│   └── stuff.py
├── bootstrap-runner.py
├── LICENSE
├── MANIFEST.in
├── README.rst
└── setup.py

I have created a git repository from this structure template: python-cmdline-bootstrap on GitHub. Feel free to clone and fork.

Might look random in parts, but it is not. Clarification:

  • All relevant application code is stored within the bootstrap package (which is the bootstrap/ directory containing the __init__.py file).
  • bootstrap-runner.py is just a simple wrapper script that allows for direct execution of the command line application from the source directory, without the need to ‘install’ the application.
  • bootstrap/__main__.py makes the bootstrap directory executable as a script.
  • bootstrap/bootstrap.py is meant to be the main module of the application. This module contains a function main() which is the entry point of the application.
  • bootstrap/stuff.py is just an example for another module containing application logic, which can be imported from within bootstrap.py
  • README.rst and LICENSE should be clear.
  • MANIFEST.in makes sure that (among others) the LICENSE file is included in source distributions created with setuptools.
  • setup.py contains instructions for setuptools. It is executed when you, the creator, create a distribution file and when the user installs the application. Below, I describe how to configure it in a way so that setuptools creates an executable upon installation.
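If you prefer to generate this layout instead of creating it by hand, the following is a minimal sketch of how that could look. The function name create_skeleton and the two constants are my own invention for illustration, not part of the template repository, and all files are created empty.

```python
import os

# Relative paths of the files in the layout shown above.
# All names below are taken from the directory tree; contents stay empty.
TEMPLATE_FILES = [
    "bootstrap/__init__.py",
    "bootstrap/__main__.py",
    "bootstrap/bootstrap.py",
    "bootstrap/stuff.py",
    "bootstrap-runner.py",
    "LICENSE",
    "MANIFEST.in",
    "README.rst",
    "setup.py",
]
TEMPLATE_DIRS = ["docs", "test"]


def create_skeleton(root):
    """Create the empty layout below `root`; return the file paths created."""
    created = []
    for dirname in TEMPLATE_DIRS:
        path = os.path.join(root, dirname)
        if not os.path.isdir(path):
            os.makedirs(path)
    for relpath in TEMPLATE_FILES:
        path = os.path.join(root, *relpath.split("/"))
        parent = os.path.dirname(path)
        if not os.path.isdir(parent):
            os.makedirs(parent)
        if not os.path.exists(path):
            # Append mode creates the file and never truncates existing content.
            open(path, "a").close()
            created.append(path)
    return created
```

Calling create_skeleton("python-cmdline-bootstrap") a second time returns an empty list, because existing files are left untouched.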

File contents: bootstrap package

The contents of the files in the bootstrap package, i.e. the application logic. Remember, you can find all this on GitHub.

__init__.py:
This file makes the bootstrap directory a package. In simple cases, it can be left empty. We make use of that and leave it empty.

bootstrap.py:

# -*- coding: utf-8 -*-
 
 
"""bootstrap.bootstrap: provides entry point main()."""
 
 
__version__ = "0.2.0"
 
 
import sys
from .stuff import Stuff
 
 
def main():
    print("Executing bootstrap version %s." % __version__)
    print("List of argument strings: %s" % sys.argv[1:])
    print("Stuff and Boo():\n%s\n%s" % (Stuff, Boo()))
 
 
class Boo(Stuff):
    pass

As stated above, this module contains the function which is the main entry point to our application. We commonly call this function main(). Importing the module does not call main(); it only runs when another module calls it explicitly. This for instance happens when the bootstrap directory is executed as a script, which is the magic performed by __main__.py, described below.
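A common variation, which the template itself does not use, is to let main() accept an optional argument list. That makes the entry point easy to call from tests without touching sys.argv. The following is only a sketch under that assumption: the argv parameter, the build_parser helper, and the --version option are additions of mine and do not exist in the bootstrap.py shown above.

```python
import argparse

__version__ = "0.2.0"


def build_parser():
    """Return the argument parser (hypothetical helper, not in the template)."""
    parser = argparse.ArgumentParser(prog="bootstrap")
    parser.add_argument("args", nargs="*", help="arbitrary argument strings")
    parser.add_argument(
        "--version", action="version", version="%(prog)s " + __version__)
    return parser


def main(argv=None):
    """Entry point; argparse falls back to sys.argv[1:] when argv is None."""
    options = build_parser().parse_args(argv)
    print("Executing bootstrap version %s." % __version__)
    print("List of argument strings: %s" % options.args)
    return 0
```

A console script or __main__.py can still call main() without arguments, while a test can call main(["arg1", "arg2"]) directly.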

Some more things worth discussing in the bootstrap.py module:

  • The module imports from other modules in the package. Therefore it uses relative imports. Implicit relative imports are forbidden in Python 3. from .stuff import Stuff is an explicit relative import, which you should make use of whenever possible.
  • People often define __version__ in __init__.py. Here, we define it in bootstrap.py, because it is simpler to access from within bootstrap.py (;-)) and still accessible from within setup.py (where we also need it).

stuff.py:

# -*- coding: utf-8 -*-
 
 
"""bootstrap.stuff: stuff module within the bootstrap package."""
 
 
class Stuff(object):
    pass

As you can see, the bootstrap.stuff module defines a custom class. Once again, bootstrap.bootstrap contains an explicit relative import for importing this class.

__main__.py:

# -*- coding: utf-8 -*-
 
 
"""bootstrap.__main__: executed when bootstrap directory is called as script."""
 
 
from .bootstrap import main
main()

Certain workflows require the bootstrap directory to be treated as both a package and as the main script, via the $ python -m bootstrap invocation. This executes the package’s __main__.py file if it exists, and fails otherwise. From within this file, we simply import our main entry point function (relative import!) and invoke it.
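To see this mechanism in isolation, the standard library's runpy module can execute a package's __main__.py in-process, roughly like python -m does. The sketch below builds a throwaway package in a temporary directory; the names demo_pkg, core.py, make_demo_package and run_package_as_script are made up for this illustration.

```python
import os
import runpy
import shutil
import sys
import tempfile


def make_demo_package(root, name="demo_pkg"):
    """Write a tiny package with a core module and a __main__.py below root."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    files = {
        "__init__.py": "",
        "core.py": "def main():\n    print('main() called')\n",
        "__main__.py": "from .core import main\nmain()\n",
    }
    for filename, content in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as f:
            f.write(content)
    return pkg_dir


def run_package_as_script(root, name="demo_pkg"):
    """Execute <name>/__main__.py in-process and return its globals."""
    sys.path.insert(0, root)
    try:
        return runpy.run_module(name, run_name="__main__")
    finally:
        sys.path.remove(root)
        # Forget the demo package so that repeated runs start from scratch.
        stale = [m for m in sys.modules
                 if m == name or m.startswith(name + ".")]
        for mod in stale:
            del sys.modules[mod]


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    try:
        make_demo_package(tmp)
        run_package_as_script(tmp)
    finally:
        shutil.rmtree(tmp)
```

The relative import inside __main__.py works here because the file is executed as part of the package rather than as a standalone file.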

Executing the application: running the entry point function

You might be tempted to perform a $ python bootstrap.py, which would fail with ValueError: Attempted relative import in non-package (this is the Python 2.7 message; Python 3 raises a different exception type for the same reason). Is something wrong with the file structure or imports? No, it is not. The invocation is wrong.

The right thing is to cd into the project’s root directory, and then execute

 $ python -m bootstrap arg1

Output:

Executing bootstrap version 0.2.0.
List of argument strings: ['arg1']
Stuff and Boo():
<class 'bootstrap.stuff.Stuff'>
<bootstrap.bootstrap.Boo object at 0x7f6e975e0b10>

Does this look unusual to you? Well, this is not a 1-file-Python-script anymore. You are designing a package, and Python packages have special behavior. This is normal. The $ python -m package kind of invocation actually is quite established and your package should support it. As you can see in the output above, command line argument support is as expected.
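The difference between the two invocations can be reproduced with a throwaway package. The following sketch creates one in a temporary directory and runs it both ways in a child interpreter. All names (demo_pkg, core.py, helper.py, compare_invocations) are hypothetical stand-ins for the bootstrap package, chosen so that the example does not depend on the real project being present.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def make_demo_package(root, name="demo_pkg"):
    """Create a package whose core module uses an explicit relative import."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    files = {
        "__init__.py": "",
        "helper.py": "NAME = 'demo'\n",
        "core.py": (
            "import sys\n"
            "from .helper import NAME\n"
            "\n"
            "def main():\n"
            "    print('%s got %s' % (NAME, sys.argv[1:]))\n"
        ),
        "__main__.py": "from .core import main\nmain()\n",
    }
    for filename, content in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as f:
            f.write(content)
    return pkg_dir


def run(cmd, cwd):
    """Run cmd in cwd; return (exit code, stdout, stderr) as text."""
    proc = subprocess.Popen(
        cmd, cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out.decode("utf-8"), err.decode("utf-8")


def compare_invocations():
    """Return results for direct file execution and for `python -m`."""
    root = tempfile.mkdtemp()
    try:
        pkg_dir = make_demo_package(root)
        # Wrong: run the module file directly from inside the package.
        direct = run([sys.executable, "core.py", "arg1"], cwd=pkg_dir)
        # Right: run the package via -m from the project root.
        as_module = run([sys.executable, "-m", "demo_pkg", "arg1"], cwd=root)
        return direct, as_module
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    direct, as_module = compare_invocations()
    print("direct file execution exit code: %s" % direct[0])
    print("python -m exit code: %s" % as_module[0])
```

The direct execution exits with a non-zero code and a relative import error on stderr, while the -m invocation succeeds and sees the command line argument.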

There is a straightforward way to achieve the “normal” behavior that you are used to. That is what the convenience wrapper bootstrap-runner.py is made for. Its content:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
 
 
"""Convenience wrapper for running bootstrap directly from source tree."""
 
 
from bootstrap.bootstrap import main
 
 
if __name__ == '__main__':
    main()

Should be self-explanatory, still: it imports the entry point function main from module bootstrap.bootstrap and — if executed by itself as a script — invokes this function. Hence, you can use bootstrap-runner.py as a normal script, i.e. as the command line front end to your application. Set permissions via $ chmod u+x bootstrap-runner.py and execute it:

$ ./bootstrap-runner.py argtest
Executing bootstrap version 0.2.0.
List of argument strings: ['argtest']
Stuff and Boo():
<class 'bootstrap.stuff.Stuff'>
<bootstrap.bootstrap.Boo object at 0x7f5402343b50>

Straightforward, right? You can now use $ python -m bootstrap or bootstrap-runner.py for testing or production purposes, without the need to install the application.

Preparing setup.py

Code upfront:

# -*- coding: utf-8 -*-
 
 
"""setup.py: setuptools control."""
 
 
import re
from setuptools import setup
 
 
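# Read the version without importing the package. The raw string keeps
# \s from being treated as an (invalid) string escape sequence.
with open('bootstrap/bootstrap.py') as f:
    version = re.search(
        r'^__version__\s*=\s*"(.*)"',
        f.read(),
        re.M
        ).group(1)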
 
 
with open("README.rst", "rb") as f:
    long_descr = f.read().decode("utf-8")
 
 
setup(
    name = "cmdline-bootstrap",
    packages = ["bootstrap"],
    entry_points = {
        "console_scripts": ['bootstrap = bootstrap.bootstrap:main']
        },
    version = version,
    description = "Python command line application bare bones template.",
    long_description = long_descr,
    author = "Jan-Philip Gehrcke",
    author_email = "jgehrcke@googlemail.com",
    url = "http://gehrcke.de/2014/02/distributing-a-python-command-line-application",
    )

Some things to discuss:

  • Might appear trivial, but from setuptools import setup is the currently recommended way to go.
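  • Your setup.py should not import your package for reading the version number. This can fail for the end-user, for instance when your package imports dependencies that are not yet installed on the user’s system. Instead, always read the version directly from the file. In this case, I used regular expressions for extracting it. How you do this is up to you, but never import your own module/package.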
  • The setup function has many more useful arguments than shown here. For a serious project read the docs and make proper use of author, classifiers, platform, etc.
  • I have called the project cmdline-bootstrap here instead of just bootstrap, because I do really upload this to PyPI later on (see below). And “bootstrap”, although still free, is just too much of a popular name to use it for something that small.
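The regular expression approach from setup.py can be wrapped in a small function that fails with a clear message when no version is found (the bare re.search(...).group(1) would otherwise raise an AttributeError). This is just a sketch: read_version and VERSION_PATTERN are names I made up, and the pattern accepts single as well as double quotes, which goes slightly beyond the setup.py above.

```python
import re

# Matches a line such as: __version__ = "0.2.0"
VERSION_PATTERN = re.compile(
    r'^__version__\s*=\s*["\']([^"\']+)["\']', re.M)


def read_version(path):
    """Return the __version__ string assigned in the file at path."""
    with open(path) as f:
        match = VERSION_PATTERN.search(f.read())
    if match is None:
        raise RuntimeError("No __version__ assignment found in %s" % path)
    return match.group(1)
```

In setup.py this would be used as version = read_version("bootstrap/bootstrap.py").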

The essential arguments here are packages and entry_points. packages = ["bootstrap"] tells setuptools that we want to install our bootstrap package to the user’s site-packages directory. The console_scripts item 'bootstrap = bootstrap.bootstrap:main' instructs setuptools to generate a script called bootstrap. This script will invoke bootstrap.bootstrap:main, i.e. the main function of our bootstrap.bootstrap module, our application entry point. This is the same as realized within bootstrap-runner.py. The difference is that setuptools automatically creates a wrapper script in the user’s file system when they install the application via pip install cmdline-bootstrap. setuptools places this wrapper into a directory that is usually in the user’s PATH, i.e. it immediately makes the bootstrap command available to the user. This also works on Windows, where a small .exe file is created in something like C:\Python27\Scripts.
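To make the 'name = package.module:function' notation more tangible, here is a rough sketch of how such a string maps to a callable. It only illustrates the idea; the wrapper that setuptools actually generates looks different, and the helper names parse_console_script and load_callable are my own.

```python
import importlib


def parse_console_script(line):
    """Split 'name = package.module:function' into its three parts."""
    command, _, target = line.partition("=")
    module_name, _, func_name = target.partition(":")
    return command.strip(), module_name.strip(), func_name.strip()


def load_callable(module_name, func_name):
    """Import the module and return the named function from it."""
    module = importlib.import_module(module_name)
    return getattr(module, func_name)
```

For our entry point, parse_console_script('bootstrap = bootstrap.bootstrap:main') yields the command name bootstrap, the module bootstrap.bootstrap and the function main. With the package importable, load_callable("bootstrap.bootstrap", "main")() would then run the application.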

Testing the setup

We use virtualenv to reproduce what users see. Once, for CPython 2(.7), once for CPython 3(.3). Create both environments:

$ virtualenv --python=/path/to/python27 venvpy27
...
$ virtualenv --python=/path/to/python33 venvpy33
...

Activate the 2.7 environment, and install the bootstrap application:

$ source venvpy27/bin/activate
$ python setup.py install
running install
running bdist_egg
running egg_info
[...]
Installed /xxx/venvpy27/lib/python2.7/site-packages/cmdline_bootstrap-0.2.0-py2.7.egg
Processing dependencies for cmdline-bootstrap==0.2.0
Finished processing dependencies for cmdline-bootstrap==0.2.0

See if (and where) the command has been created:

$ command -v bootstrap
/xxx/venvpy27/bin/bootstrap

Try it:

$ bootstrap arg
Executing bootstrap version 0.2.0.
List of argument strings: ['arg']
Stuff and Boo():
<class 'bootstrap.stuff.Stuff'>
<bootstrap.bootstrap.Boo object at 0x7f1234d31190>

Great. Repeat the same steps for venvpy33, and validate:

$ command -v bootstrap
/xxx/venvpy33/bin/bootstrap
$ bootstrap argtest
Executing bootstrap version 0.2.0.
List of argument strings: ['argtest']
Stuff and Boo():
<class 'bootstrap.stuff.Stuff'>
<bootstrap.bootstrap.Boo object at 0x7f4cf931a550>

A note on automated tests

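In the test/ directory you can set up automated tests for your application. You can always directly import the development version of your modules from e.g. test/test_api.py, if you modify sys.path. Note that the relative path '..' is resolved against the current working directory, so this assumes that the tests are run from within test/:

import os
import sys
sys.path.insert(0, os.path.abspath('..'))
from bootstrap.stuff import Stuff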

If you need to directly test the command line interface of your application, then bootstrap-runner.py is your friend. You can easily invoke it from e.g. test/test_cmdline.py via the subprocess module.
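A minimal sketch of such a helper follows. The function name run_cli is hypothetical, and the helper works with any Python script path; in the template you would point it at bootstrap-runner.py.

```python
import subprocess
import sys


def run_cli(script_path, *args):
    """Run a Python script with the current interpreter.

    Returns a tuple (exit code, combined stdout/stderr as text).
    """
    cmd = [sys.executable, script_path] + list(args)
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode("utf-8")


# Possible usage from test/test_cmdline.py, assuming the tests are run
# from within the test/ directory:
#
#     code, output = run_cli("../bootstrap-runner.py", "arg1")
#     assert code == 0
#     assert "['arg1']" in output
```

Using sys.executable makes sure the child process runs under the same interpreter (and virtual environment) as the test itself.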

Upload your distribution file to PyPI

Create a source distribution of your project. By default, this is a gzipped tarball:

$ python setup.py sdist
$ /bin/ls dist
cmdline-bootstrap-0.2.0.tar.gz

Register your project with PyPI. Then use twine to upload your project (twine is still to be improved!):

$ pip install twine
$ twine upload dist/cmdline-bootstrap-0.2.0.tar.gz 
Uploading distributions to https://pypi.python.org/pypi
Uploading cmdline-bootstrap-0.2.0.tar.gz
Finished

Final test: install from PyPI

Create another virtual environment, activate it, install cmdline-bootstrap from PyPI and execute it:

$ virtualenv --python=/xxx/bin/python3.3 venvpy33test
...
$ source venvpy33test/bin/activate
$ bootstrap
bash: bootstrap: command not found
$ pip install cmdline-bootstrap
Downloading/unpacking cmdline-bootstrap
  Downloading cmdline-bootstrap-0.2.0.tar.gz
  Running setup.py egg_info for package cmdline-bootstrap
 
Installing collected packages: cmdline-bootstrap
  Running setup.py install for cmdline-bootstrap
 
    Installing bootstrap script to /xxx/venvpy33test/bin
Successfully installed cmdline-bootstrap
Cleaning up...
 
$ bootstrap testarg
Executing bootstrap version 0.2.0.
List of argument strings: ['testarg']
Stuff and Boo():
<class 'bootstrap.stuff.Stuff'>
<bootstrap.bootstrap.Boo object at 0x7faf433edb90>

That was it. I hope this is of use to some of you. All code is available on GitHub.

  • durden20

    Great article. I really appreciate having all this information condensed here for reference later.

  • Alessandro Pisa

    Very interesting reading, thanks a lot for putting all those pieces together!
    One note: please suggest to make pypi upload tests using the test pypi server https://wiki.python.org/moin/TestPyPI, otherwise we will find gazillions of “bootstrap like” projects on pypi, it already happens with the printer of nested lists :)
    Thanks again for the awesome writing!

    • Thanks for the feedback and point taken. I’ll probably update the article correspondingly.

  • Kyle

    Awesome, thank you!

  • Nick

    This is great! Thanks

  • Василий Макаров

    Please, explain in more detail why I shouldn’t import my module in setup.py? Under what conditions it may fail for user?

    • Very good question. It is actually quite simple. Installation of your package requires execution of your setup.py. If this setup.py imports your package, then your package usually imports its dependencies. If these dependencies are not available on the user’s system, this import will fail, i.e. execution of setup.py crashes with an ImportError and the installation aborts. I once fell for this in my gipc project: https://bitbucket.org/jgehrcke/gipc/issue/1/incompatibility-with-pip-requirementstxt#comment-3827522

      On the other hand, if your package has no dependencies that might not be available on the user’s system, I guess it is safe to import your package in setup.py. But remember to remove this import when you add a dependency later on ;-).

      Do whatever you think is the right thing and test if it works. The general recommendation however should be not to import your module/package from within setup.py.

      • Василий Макаров

        Ah, now it’s clear. Thanks!

  • Amr Mostafa

    Instead of the runner, I usually use “sudo python setup.py develop”, which installs all console scripts as symlinks pointing to my development source tree

  • Aleksi

    Very useful – thanks!

  • JT

    This is incredibly useful, thanks. I’m setting up a new project with this structure now and one thing I’m having to work out is which parts of the bootstrap template to rename. I’ve assumed that I should rename anything called bootstrap to be my_project_name instead, e.g. the bootstrap directory and bootstrap.py within it. Is that what you tend to do, or do you leave it as is and build the project around it?

    • Renaming all “bootstrap” occurrences to your project name is a proper approach. Just make it step by step and after every step verify that things still work.

      • JT

        Thanks Jan-Philip. I’ve just started using PyCharm and that’s handling that refactoring very nicely.

  • Thanks a lot for this article, it’s a great help! I just wanted to recommend tox (if you don’t know of it yet): https://tox.readthedocs.org/en/latest/. It automates the process of setting up multiple virtual environments to test a distribution for various Python versions.
    Best,
    Peter

    • Thanks for commenting! Sure, I know tox. And I also love pyenv + pyenv-virtualenv! A perfect tool combination.

  • Thanks for this! This inspired me to create small tool to automatically create files and directories you describe. PyPi: https://pypi.python.org/pypi/pipapp and GitHub: https://github.com/samisalkosuo/pipapp

  • abpindia1944

    Can you please explain why we use ‘from .stuff import Stuff’ and not ‘from stuff import Stuff’? Also, if I have another module, say foo, from which stuff will import, should it be like this ‘from .foo import bar’ or without the ‘.’ (the import goes in stuff.py and not bootstrap.py)?
