
Hello SPF record.

Introduction

Say Google’s or Yahoo’s mail transfer agent (MTA) receives an email sent from an MTA running on the host with the (fictional) IP address 1.1.1.1. Say the sending MTA identifies itself as sending on behalf of important.com (within the EHLO/HELO handshake, or with an envelope sender (MAIL FROM) address in the domain important.com). Then the receiving MTA (Google or Yahoo, or something else) usually queries the important.com domain via DNS for the so-called SPF (Sender Policy Framework) record. Using this SPF record the receiving MTA can verify whether the owner of the domain important.com intended to send mail from 1.1.1.1 or not.

The SPF record might say that important.com does not send mail at all. Then the receiving MTA knows for sure that the incoming mail is invalid/fake/spam. Or the record might say that important.com actually sends mail, but only from the (fictional) IP address 2.2.2.2. Then the receiving MTA also knows for sure that something is wrong with the mail. Another possibility is that the record states that important.com actually sends mail from 1.1.1.1, in which case the incoming mail was likely intended by the owner of the domain important.com. This does not mean that the mail is not spam, but at least one can be quite sure that nobody forged it on behalf of important.com.

That was a simplified description of why it is important to have a proper SPF record set for a domain if one intends to send mail from it. In the next paragraphs I explain how I have set the SPF record for gehrcke.de and why.

SPF record for gehrcke.de

Domainfactory is the tech-c for gehrcke.de. However, the DNS entries for gehrcke.de are managed by Cloudflare (free, and quite comfortable). Today, I added an SPF record for gehrcke.de via Cloudflare’s web interface. The first important thing to take note of is that the SPF record actually is a TXT record. A distinct SPF record type was once proposed, but in RFC 7208 (which specifies SPF) we clearly read:

The SPF record is expressed as a single string of text found in the RDATA of a single DNS TXT resource record

and

SPF records MUST be published as a DNS TXT (type 16) Resource Record (RR) [RFC1035] only. Use of alternative DNS RR types was supported in SPF’s experimental phase but has been discontinued.

By now it should be clear that I had to add a new TXT record for gehrcke.de, with a special syntax, the SPF syntax. This syntax is specified here. What I need to express in the record is that mail sent on behalf of gehrcke.de is only sent by the machine with the IPv4 address 5.45.109.1 or from the IPv6 subnet fe80::5054:8dff:fe9b:358d/64. This is the machine that I know should send mail. No other people or machines should send mail on behalf of my domain. The corresponding record is:

v=spf1 ip4:5.45.109.1 ip6:fe80::5054:8dff:fe9b:358d/64 -all

The prefix v=spf1 declares the SPF version; the ip4 and ip6 mechanisms declare that mail from a certain address or subnet is valid. -all expresses that mail from all other hosts is disallowed. The fact that I do not “allow” others to send mail on behalf of gehrcke.de is just my declaration of intent. A receiving MTA is not required to respect my SPF record. However, larger mail providers should respect it and at least classify a mail that does not pass the SPF test as spam, if not reject it right away.
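To make the evaluation order concrete, here is a minimal sketch (Python 3, standard library only) of how a receiver could check a sender IP against a record like the one above. It is an illustration, not a complete SPF implementation: the function name check_spf is made up for this example, only the ip4, ip6 and all mechanisms are handled, and everything else in RFC 7208 (a, mx, include, redirect, macros, DNS lookups) is ignored.

```python
import ipaddress

def check_spf(record, sender_ip):
    """Evaluate an SPF record that uses only ip4, ip6 and all mechanisms."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    # Mechanisms are evaluated left to right; the first match decides.
    for term in terms[1:]:
        result = "pass"  # a missing qualifier means "+"
        if term[0] in qualifiers:
            result = qualifiers[term[0]]
            term = term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return result
        if name in ("ip4", "ip6"):
            # strict=False accepts networks written with host bits set,
            # such as the ip6 entry in the record above.
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return result
        # Other mechanisms are skipped in this sketch.
    return "neutral"  # nothing matched and there was no "all" term

record = "v=spf1 ip4:5.45.109.1 ip6:fe80::5054:8dff:fe9b:358d/64 -all"
print(check_spf(record, "5.45.109.1"))  # the listed machine
print(check_spf(record, "1.1.1.1"))     # any other host hits -all
```

The first call matches the ip4 mechanism, the second falls through to -all.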

Whenever I switch machines or IP addresses, I need to remember to update this record. This is a minor disadvantage. To circumvent this, in many cases a short

v=spf1 a -all

would be enough. It expresses that mail sent from the IP address corresponding to the DNS A record of the domain is allowed. In my case, however, this does not work since the A record points to Cloudflare’s cache rather than directly to my machine.
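The effect of a proxy in front of the domain can be sketched like this (Python 3). The helper names are hypothetical, and the proxy addresses in the usage example are placeholders from a documentation range, not real Cloudflare addresses. The resolver is passed in as a parameter so that the logic can be tried without touching DNS.

```python
import ipaddress
import socket

def resolve_addresses(domain):
    """Return the set of IP addresses the domain name resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(domain, None)}

def a_mechanism_matches(domain, sender_ip, resolve=resolve_addresses):
    """True if sender_ip is one of the addresses the domain resolves to."""
    sender = ipaddress.ip_address(sender_ip)
    return any(ipaddress.ip_address(addr) == sender for addr in resolve(domain))

# Placeholder resolver standing in for a domain whose A record points to a proxy.
proxied = lambda domain: {"203.0.113.10", "203.0.113.11"}
print(a_mechanism_matches("gehrcke.de", "5.45.109.1", resolve=proxied))
```

With the proxied resolver, the sending machine's own address is not among the resolved addresses, so an "a" mechanism would not match it.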

Testing the record

There are many web-based tools for checking whether a certain domain has an SPF record set and whether its syntax is valid. Particularly useful is, for instance, mxtoolbox.com/SuperTool.aspx. However, a real full-system validation obviously requires you to send a mail to an external service that resembles a serious mail provider and debugs the communication for you, meaning that it informs you about all externally visible details of your mail setup, including the SPF record. A great verification tool that does exactly this is provided by Port25. On the command line I used my local MTA (Exim4) to send mail to their service, instructing them to return the corresponding report to my Googlemail address:

$ echo "This is a test." | mail -s Testing check-auth-jgehrcke=googlemail.com@verifier.port25.com
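The recipient address in this command encodes the address the report should be returned to. Judging only from the command above (this is an inference from that one example, not taken from Port25's documentation), the "@" of the return address is replaced by "=" and the result is prefixed with check-auth-. A small sketch with a hypothetical helper name:

```python
def port25_verifier_address(reply_to):
    """Build the verifier recipient address for a given return address."""
    local, sep, domain = reply_to.partition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form user@domain")
    return "check-auth-{}={}@verifier.port25.com".format(local, domain)

print(port25_verifier_address("jgehrcke@googlemail.com"))
```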

The following is an excerpt of the test result, indicating that the actual mail sender matches the data provided in the SPF record:

HELO hostname:  gurke2.gehrcke.de
Source IP:      5.45.109.1
mail-from:      user@gehrcke.de
 
----------------------------------------------------------
SPF check details:
----------------------------------------------------------
Result:         pass 
ID(s) verified: smtp.mailfrom=user@gehrcke.de
DNS record(s):
    gehrcke.de. SPF (no records)
    gehrcke.de. 300 IN TXT "v=spf1 ip4:5.45.109.1 ip6:fe80::5054:8dff:fe9b:358d/64 -all"

Great.

Good to know checkrestart from debian-goodies

The security audits triggered by Heartbleed have led to an increasing discovery of security issues in libraries such as GnuTLS and OpenSSL within the past weeks. These issues were dealt with responsibly, i.e. they were usually published together with the corresponding binary updates by major Linux distributions such as Debian (subscribing to debian-security-announce or comparable sources is highly recommended if you want to keep track of these developments).

After downloading an updated binary of a shared library, it is important to restart all services (processes) that are linked against this library in order to make the update take effect. Usually, processes load a shared library’s code segment into random access memory once during startup (actually, the program loader / runtime linker does this), and do not reload that code afterwards throughout their lifetime, which may be weeks or months in the case of server processes. Prime examples are Nginx/Apache/Exim being linked against OpenSSL or GnuTLS: if you update the latter but do not restart the former, you have changed your disk contents but not updated your service. In the worst case the system is still vulnerable. It should therefore be the habit of a good system administrator to question which services are using a certain shared library and to restart them after a shared library update, if required.

There is a neat helper utility in Debian, called checkrestart. It comes with the debian-goodies package. First of all, what is debian-goodies? Let’s see (quote from Wheezy’s package description):

Small toolbox-style utilities for Debian systems
 
These programs are designed to integrate with standard shell tools, extending them to operate on the Debian packaging system.
 
 dgrep  - Search all files in specified packages for a regex
 dglob  - Generate a list of package names which match a pattern
 
These are also included, because they are useful and don't justify their own packages:
 
 debget          - Fetch a .deb for a package in APT's database
 dpigs           - Show which installed packages occupy the most space
 debman          - Easily view man pages from a binary .deb without extracting
 debmany         - Select manpages of installed or uninstalled packages
 checkrestart    - Help to find and restart processes which are using old
                   versions of upgraded files (such as libraries)
 popbugs         - Display a customized release-critical bug list based on
                   packages you use (using popularity-contest data)
 which-pkg-broke - find which package might have broken another

Checkrestart is a Python application wrapping lsof (“list open files”). It tries to identify files used by processes that are not in the file system anymore. How so?

Note that during an update a certain binary file gets replaced: the new version is first downloaded to disk and then rename()ed in order to overwrite the original. During POSIX rename() the old file is deleted. But the old file is still in use! The standard says that if any process still has a file open during its deletion, that file will remain “in existence” until the last file descriptor referring to it is closed. While these files are still held “in existence” for running processes by the operating system, they are not listed in the file system anymore. They can, however, easily be identified via the lsof tool. And this is exactly what checkrestart does.
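This behavior is easy to reproduce. The following sketch (Python 3, POSIX systems only) plays both roles: it keeps a file open like a long-running process would, replaces the file via rename like an update would, and then compares what the open file handle sees with what is on disk. The file names are made up for the demonstration.

```python
import os
import tempfile

def read_after_replace():
    """Replace a file while it is open; return (old handle view, disk view)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "libexample.so")      # hypothetical name
        update = os.path.join(tmp, "libexample.so.new")  # hypothetical name
        with open(target, "w") as f:
            f.write("old version")
        with open(update, "w") as f:
            f.write("new version")

        running = open(target)      # a "process" opens the library
        os.rename(update, target)   # the "update" replaces it
        seen_by_process = running.read()
        running.close()             # only now is the old file really gone

        with open(target) as f:
            seen_on_disk = f.read()
    return seen_by_process, seen_on_disk

print(read_after_replace())
# → ('old version', 'new version')
```

The open handle keeps reading the old contents although the path already points to the new file, which is the situation a not yet restarted service is in after a library update.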

Hence, checkrestart “compares” the open files used by running processes to the corresponding files in the file system. If the file system contains other (e.g. newer) data than the process is currently using, then checkrestart proposes to restart that process. In a tidy server environment, this usually is the case only for updated shared library files. Below you can find example output after updating Java, Python, OpenSSL, and GnuTLS:

# checkrestart
Found 12 processes using old versions of upgraded files
(5 distinct programs)
(5 distinct packages)
 
Of these, 3 seem to contain init scripts which can be used to restart them:
The following packages seem to have init scripts that could be used
to restart them:
nginx-extras:
    20534   /usr/sbin/nginx
    20533   /usr/sbin/nginx
    20532   /usr/sbin/nginx
    19113   /usr/sbin/nginx
openssh-server:
    3124    /usr/sbin/sshd
    22964   /usr/sbin/sshd
    25724   /usr/sbin/sshd
    22953   /usr/sbin/sshd
    25719   /usr/sbin/sshd
exim4-daemon-light:
    3538    /usr/sbin/exim4
 
These are the init scripts:
service nginx restart
service ssh restart
service exim4 restart
 
These processes do not seem to have an associated init script to restart them:
python2.7-minimal:
    2548    /usr/bin/python2.7
openjdk-7-jre-headless:amd64:
    4348    /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java

Nginx (web server) and OpenSSH (SSH server) are linked against OpenSSL, and Exim (mail transfer agent) is linked against GnuTLS. So that output makes sense. Obviously, after an update of Python and Java, processes using this interpreter or VM (respectively) also need to be restarted in order to use the new code. Checkrestart is extremely helpful; thanks for this nice tool.
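On Linux, similar information is visible without lsof: the kernel marks replaced files in /proc/<pid>/maps with a " (deleted)" suffix. The sketch below (Python 3) parses the text of such a maps file. It is not how checkrestart is implemented (as described above, that tool wraps lsof), the function name is made up, and it does none of the filtering a real tool needs, for example for deleted temporary files or shared memory segments.

```python
def deleted_mappings(maps_text):
    """Return paths of mapped files that are marked as deleted."""
    suffix = " (deleted)"
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, permissions, offset, device, inode, pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        if path.startswith("/") and path.endswith(suffix):
            paths.add(path[:-len(suffix)])
    return sorted(paths)

# Usage on a Linux system: inspect the current process.
# with open("/proc/self/maps") as f:
#     print(deleted_mappings(f.read()))
```

Reading the maps file of another user's process generally requires root privileges, which is one reason to run such a check as root.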

Update 2014-09-03:

With the upcoming Debian 8 (Jessie, currently unstable), a new package has been introduced: needrestart. It is a more modern version of checkrestart, and tightly integrates with the systemd service infrastructure. Notably, needrestart comes in its own package and with the goal to become more popular than checkrestart (which was hidden in debian-goodies, as discussed above). A discussion is currently under way on the debian-security mailing list about installing and running needrestart in a default Debian installation.

exim4: add hostname for local routing

Recently, I was confused by exim4’s behavior on Debian:

# hostname
gurke2
# cat /etc/hostname
gurke2
# exim -bt herbert@gurke2
R: nonlocal for herbert@gurke2
herbert@gurke2 is undeliverable: Mailing to remote domains not supported

So, although the recipient herbert@gurke2 clearly should be treated as a recipient in the local domain (the local host is named gurke2), exim thinks that gurke2 is a remote domain.

After digging into the config files, I came across the options named local_domains and MAIN_LOCAL_DOMAINS, searched the web for these and found the solution in this thread. The dc_other_hostnames setting in /etc/exim4/update-exim4.conf.conf is the right place to add host names (or fully qualified domain names) that should be treated as local and thus be handled by exim’s local_user router. I set this to gurke2, reloaded the exim configuration, and tested again. It works:

# vi /etc/exim4/update-exim4.conf.conf
# cat /etc/exim4/update-exim4.conf.conf | grep dc_other_hostnames
dc_other_hostnames='gurke2'
# service exim4 reload
[ ok ] Reloading exim4 configuration files: exim4.
# exim -bt herbert@gurke2
R: system_aliases for herbert@gurke2
R: userforward for herbert@gurke2
R: procmail for herbert@gurke2
R: maildrop for herbert@gurke2
R: lowuid_aliases for herbert@gurke2 (UID 1000)
R: local_user for herbert@gurke2
herbert@gurke2
  router = local_user, transport = mail_spool
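To check this setting on several machines, the value can be read programmatically. A minimal sketch (Python 3) that extracts the host names from the text of update-exim4.conf.conf: the function name is made up, and the assumption that multiple entries are separated by semicolons or colons should be verified against the update-exim4.conf man page of your Debian release.

```python
import shlex

def other_hostnames(conf_text):
    """Return the host names listed in dc_other_hostnames."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("dc_other_hostnames="):
            continue
        # shlex removes the shell-style quotes around the value.
        tokens = shlex.split(line)
        value = tokens[0].partition("=")[2] if tokens else ""
        # Assumption: entries are separated by ';' or ':'.
        names = value.replace(";", ":").split(":")
        return [name.strip() for name in names if name.strip()]
    return []

print(other_hostnames("dc_other_hostnames='gurke2'"))
# → ['gurke2']
```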

numpy built with recent Intel compilers: MKL FATAL ERROR

I just spent hours trying to build numpy 1.8.0 with the most recent Intel MKL, Intel C++ Composer, and Intel Fortran Composer (Version 2013 SP1, from January 2014). numpy built fine, imported fine, and basic math worked fine. But one of the earlier tests (when running numpy.test()) triggered this error:

MKL FATAL ERROR: Cannot load libmkl_mc3.so or libmkl_def.so.

This error message is obviously not generated by the system, but written out by some MKL code. I double-checked that LD_LIBRARY_PATH was set correctly, and that these libraries were actually available. MKL itself seems to be confused about how to access these libraries.
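The availability check can be scripted. The following sketch (Python 3) looks for the two libraries named in the error message in the directories of a search path such as LD_LIBRARY_PATH. The function name is made up, and the check is deliberately simplistic: the dynamic loader also consults other locations (for example its cache and run-time search paths embedded in binaries), so a file that is not found here may still be loadable.

```python
import os

def find_libraries(names, search_path):
    """Map each file name to the first directory on search_path containing it."""
    found = {}
    for name in names:
        found[name] = None
        for directory in search_path.split(os.pathsep):
            if directory and os.path.isfile(os.path.join(directory, name)):
                found[name] = directory
                break
    return found

wanted = ["libmkl_mc3.so", "libmkl_def.so"]
print(find_libraries(wanted, os.environ.get("LD_LIBRARY_PATH", "")))
```

A value of None for one of the names means that no directory on the search path contains that file.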

There is not too much to find about this error — the two most important resources being

  • An Intel forum thread, where even the Intel engineers struggle to identify the problem. The user there solved his issue by accident; there was no logical solution.
  • and a StackOverflow answer saying “Intel people claim this was MKL library bug.”

More people have run into this kind of problem.

The Intel guys have a general “solution” to this class of problem: http://software.intel.com/en-us/articles/mkl-fatal-error-cannot-load-neither-xxxx-with-intel-mkl-10x — this article provides insights into how things should work, but did not really help in my scenario.

Some of the workarounds that magically solve this issue for some people involve explicit declaration of a list of libraries to link to (via mkl_libs in site.cfg in case of numpy), but that did not help in my case. According to http://software.intel.com/en-us/articles/numpy-scipy-with-mkl and other Intel resources, it should be the best idea to rely on mkl_rt for dynamic dependency resolution, so this is what I wanted to do. And most importantly the mkl_rt method should, according to Intel, prevent this error from happening (reference). This was also propagated in a forum thread in 2010 (here):

“To who’m concerns,

The mkl dynamic library issue will be fixed thoroughly in MKL 10.3 and future version. A new librar mkl_rt was introduced. Now to build the custom MKL library based on dynamic MKL, one can easilyuse the library libmkl_rt.so instead of -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core

The new mkl version will be integrated to next intel Compiler version too, which targeted to be release in Nov.”

Yeah, that’s ironic. I do not want to say that it is impossible to get things running using recent Intel compiler suites; that statement would be wrong. But it is definitely tedious! See https://github.com/GalSim-developers/GalSim/issues/261. One quote from that thread:

Actually, this might be harder than I thought. I googled the error you were having and found a thread that ends with this comment:

“I remember now that I had the same problem recently – it is a
fundamental incompatibility between MKL and Python way of loading shared
libraries through dlopen. AFAIK, there is no solution to this problem,
except for using the static libraries.”

I’m not sure what they mean about using static libraries though, since python usually can’t handle static libraries.

For sure, the conclusion after this bit of research is that this error message is the result of a serious issue, and the direct solution to it is not well documented. There is a simpler way to tackle this:

I went a few steps back and built with the 12.1.3 suite (MKL, icc, ifort). It just works (and suggests that there are actual issues with newer releases of MKL/icc/ifort).

I describe the build process in another blog post here.

Building numpy and scipy with Intel compilers and Intel MKL on a 64 bit machine

At work we make heavy use of the Python/numpy/scipy stack. We have built numpy and scipy against Intel’s MKL, with Intel’s C++ compiler icc and Intel’s Fortran compiler ifort. As far as I know, and also from my own experience, this combination of tools lets you get close-to-optimum performance for numerical simulations on classical hardware, especially for linear algebra calculations. If you get the build right, and write proper code, i.e. use numpy’s data types and numpy’s/scipy’s functions the right way, then — spoken in general terms — no commercial or other open source software package is able to squeeze more performance from your hardware. I just updated the installation to the most recent Python 2 (2.7.6), numpy 1.8.0 and scipy 0.13.3. I built on a 64 bit machine with the Intel suite version 12.1.3 — newer versions of the Intel suite have troubles with respect to resolving library dependencies (which is the topic of this blog post). The documentation on the topic of building numpy and scipy with the Intel suite is sparse, so I describe the procedure I took in this article.

A few notes on performance and version decisions: compiling numpy and scipy against MKL provides a significant single-thread performance boost over classical builds. Furthermore, building against Intel’s OpenMP library can provide extremely efficient multicore performance through automatic threading of normal math operations. These performance boosts are significant. You want this boost, believe me, so it makes a lot of sense to build numpy/scipy with the Intel suite. However, it does not really make sense to build (C)Python with the Intel suite. It is not worth the effort — somewhere I have read that it might even be slower than an optimized GCC build. In any case, Python is not so much about math. It is about reliability and control. We should do math in numpy/scipy and use optimized builds for them — then there simply is no significance in optimizing the Python build itself. Regarding the Intel suite, I am pretty sure that newer versions do not provide large performance improvements compared to the 12.1.3 one. These reasons made me build things the following way:

  • CPython 2.7.6 in classical configure/make/make install fashion, with the system’s GCC.
  • numpy and scipy with Intel suite (MKL, icc, ifort) version 12.1.3.

Prerequisites

I assume that you have built Python as a non-privileged user and installed it to some location in your file system (btw: never just overwrite your system’s Python with a custom build!). Set up the environment for this build, so that

$ python

invokes it. You can validate this via $ python --version and $ command -v python. In the following steps, we will work with the same user used for building and installing Python: numpy and scipy installation files will go right into the directory tree of the custom Python build.

I also assume that you have a working copy of MKL, icc, ifort, and that you have set it up like this:

$ source /path_to_intel_compilers/bin/compilervars.sh intel64

After invoking compilervars.sh, your PATH and LD_LIBRARY_PATH environment variables should contain the directories where all the Intel binaries reside. A simple test (which you also should perform):

$ icc --version
icc (ICC) 12.1.3 20120212
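If the environment is set up by a script, the same check can be automated. A small sketch (Python 3, using shutil.which from the standard library) that reports which of the required compiler executables are not on the search path; the function name and its defaults are chosen for this example only.

```python
import shutil

def missing_tools(tools=("icc", "ifort"), path=None):
    """Return the tools that cannot be found on the executable search path."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]

missing = missing_tools()
if missing:
    print("not found on PATH: " + ", ".join(missing))
else:
    print("icc and ifort are available")
```

With path=None, shutil.which searches the directories of the PATH environment variable, which is what compilervars.sh is supposed to have extended.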

Prepare, build, install, and validate numpy

In the numpy source directory create the file site.cfg with the following content:

[mkl]
library_dirs = /path_to_intel_compilers/mkl/lib/intel64/
include_dirs = /path_to_intel_compilers/mkl/include/
mkl_libs = mkl_rt
lapack_libs =

That is right, no more contents are needed in that file, it can be that simple. This approach is also recommended in http://software.intel.com/en-us/articles/numpy-scipy-with-mkl.
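When the same build is repeated for several installations, the file can be generated from the Intel installation directory. A minimal sketch (Python 3) that produces exactly the content above; the function name is made up, and the directory layout below the installation root (mkl/lib/intel64 and mkl/include) is taken from the paths shown above and may differ for other versions of the Intel suite.

```python
import posixpath

def mkl_site_cfg(intel_root):
    """Return site.cfg content for an MKL build that links against mkl_rt."""
    mkl_root = posixpath.join(intel_root, "mkl")
    lines = [
        "[mkl]",
        "library_dirs = " + posixpath.join(mkl_root, "lib", "intel64"),
        "include_dirs = " + posixpath.join(mkl_root, "include"),
        "mkl_libs = mkl_rt",
        "lapack_libs =",
    ]
    return "\n".join(lines) + "\n"

print(mkl_site_cfg("/path_to_intel_compilers"))
```

Writing the returned string to site.cfg in the numpy source directory gives the same starting point as creating the file by hand.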

What is left to be done is setting compiler flags. The build process uses compiler abstractions stored in the two files: numpy/distutils/fcompiler/intel.py and numpy/distutils/intelccompiler.py. You might feel the need to treat these files with respect, because they appear to be so important. I am frightened of these files, because they are partly inconsistent in themselves and have a quite bloaty appearance not justified by the very simple purpose I expect them to serve. My point is: do not be afraid, and make some drastic reductions in order to be sure about what happens during the build. In numpy/distutils/fcompiler/intel.py you can safely edit the get_flags* methods of the IntelEM64TFCompiler class, so that they look like this:

class IntelEM64TFCompiler(IntelFCompiler):
    compiler_type = 'intelem'
    [...]
 
    def get_flags(self):
        return ['-O3 -g -xhost -openmp -fp-model strict -fPIC']
 
    def get_flags_opt(self):
        return []
 
    def get_flags_arch(self):
        return []

See, this compiler class describes itself as intelem and that is the compiler type we are going to use in the build command line. The class name “IntelEM64TFCompiler” itself has no meaning at all. We are just editing one of the classes to make sure that it has the compiler flags we want, and eventually use this compiler abstraction during the build. In the code above, I have made sure that the compiler flags '-O3 -g -xhost -openmp -fp-model strict -fPIC' are used by making the method get_flags return them, and by making all other related methods return “nothing”. Additionally (not shown above), I have also edited the possible_executables attribute of the class:

possible_executables = ['ifort']

just to make sure that ifort will be used. I have also removed all compiler classes not needed from that file, just to get a better overview.

With the steps above, the Fortran compiler has been set up. Next, edit numpy/distutils/intelccompiler.py to configure the C compiler. I have deleted tons of stuff from this file. What follows is all the content remaining (you can copy/paste it like this; I think this should be safe for most purposes):

from __future__ import division, absolute_import, print_function
 
from distutils.unixccompiler import UnixCCompiler
from numpy.distutils.exec_command import find_executable
 
class IntelEM64TCCompiler(UnixCCompiler):
    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
    """
    compiler_type = 'intelem'
    cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'
    #cc_args = "-fPIC"
    def __init__ (self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__ (self, verbose, dry_run, force)
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared')

Regarding the compiler flags I trusted the Intel people somewhat and took most of them from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl. I read the docs for all of them, and they seem to make sense. But you should think these through for your environment. It is good to know what they mean.

Then build numpy:

python setup.py build --compiler=intelem | tee build.log

Copy the build to the Python tree (install):

python setup.py install | tee install.log

Validate that the numpy build works:

$ python
Python 2.7.6 (default, Feb 18 2014, 15:09:15) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.8.0'
>>> numpy.test()
Running unit tests for numpy
NumPy version 1.8.0
NumPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy
Python version 2.7.6 (default, Feb 18 2014, 15:09:15) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
[... snip ...]
----------------------------------------------------------------------
Ran 4969 tests in 63.975s
 
OK (KNOWNFAIL=5, SKIP=3)
<nose.result.TextTestResult run=4969 errors=0 failures=0>

That looks great. Proceed.

Build, install, and validate scipy

Now that numpy is installed(!), we can go ahead with building scipy. Extract the scipy source, enter the source directory and invoke

python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install | tee build_install.log

See how we enforce usage of intelem (which is just a label/name) for all components here? This is important. The scipy build process uses the same build settings as used by numpy, especially the distutils compiler abstraction stuff (which is why numpy needs to be installed beforehand; this is a simple fact that the official docs do not explain well). I built and installed at the same time, on purpose. When doing build and install in separate steps, the install step involves some minor compilation tasks which are then performed using gfortran instead of ifort. At first I thought that this was not an issue, but when executing scipy.test() I soon got a segmentation fault, due to the mixing of compilers. When using the command as above (basically taken from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl), the test result is positive:

Python 2.7.6 (default, Feb 18 2014, 15:09:15) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.test()
Running unit tests for scipy
NumPy version 1.8.0
NumPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy
SciPy version 0.13.3
SciPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy
Python version 2.7.6 (default, Feb 18 2014, 15:09:15) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy/lib/utils.py:134: DeprecationWarning: `scipy.lib.blas` is deprecated, use `scipy.linalg.blas` instead!
  warnings.warn(depdoc, DeprecationWarning)
/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy/lib/utils.py:134: DeprecationWarning: `scipy.lib.lapack` is deprecated, use `scipy.linalg.lapack` instead!
  warnings.warn(depdoc, DeprecationWarning)
 
[ ... snip ...]
======================================================================
ERROR: test_fitpack.TestSplder.test_kink
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy/interpolate/tests/test_fitpack.py", line 329, in test_kink
    splder(spl2, 2)  # Should work
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy/interpolate/fitpack.py", line 1186, in splder
    "and is not differentiable %d times") % n)
ValueError: The spline has internal repeated knots and is not differentiable 2 times
 
----------------------------------------------------------------------
Ran 8934 tests in 131.506s
 
FAILED (KNOWNFAIL=115, SKIP=220, errors=1)
<nose.result.TextTestResult run=8934 errors=1 failures=0>

The one failing test is a corner case known to fail in some scenarios (https://github.com/scipy/scipy/issues/2911). I might be able to narrow it down and help resolve that issue.
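A note on the long scipy build command used above: it repeats the same two options for every stage, and a stage that is accidentally left without them is exactly what leads to mixed compilers. One way to avoid typos is to assemble the command programmatically. A minimal sketch (Python 3), with a made-up function name:

```python
def scipy_build_command(compiler="intelem", fcompiler="intelem"):
    """Assemble the setup.py call with identical compiler options per stage."""
    flags = ["--compiler=" + compiler, "--fcompiler=" + fcompiler]
    command = ["python", "setup.py"]
    for stage in ("config", "build_clib", "build_ext"):
        command.append(stage)
        command.extend(flags)
    command.append("install")  # build and install in one go, on purpose
    return command

print(" ".join(scipy_build_command()))
```

The printed line is the command from above without the tee part. The list can also be handed to subprocess.check_call() from within the scipy source directory.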

Congratulations, done. Enjoy the performance.