Concatenate byte strings in Python 3

What is the recommended way to concatenate two (or a few) byte strings in Python 3? And what is the recommended way if the code should work on both Python 2 and Python 3?

In Python 2 we can easily concatenate two or more byte strings using string formatting:

>>> a = "\x61"
>>> b = "\x62"
>>> "%s%s" % (a, b)
'ab'

In Python 3 it is payback time: repr("%s" % b"a") semi-intuitively returns '"b\'a\'"' (tested with Python 3.3), and b"%s" % b"a" throws TypeError: unsupported operand type(s) for %: 'bytes' and 'bytes'. (The latter applies to Python 3.0 through 3.4; %-formatting for bytes was added back in Python 3.5 by PEP 461.) This is the result of Python 3's strict distinction between text (a sequence of unicode code points) and bytes (a sequence of raw bytes). Consequently, in Python 3 the concatenation of byte strings via string formatting yields something entirely different from what Python 2 does:

>>> a = b"\x61"
>>> b = b"\x62"
>>> "%s%s" % (a,b)
"b'a'b'b'"

The outcome is text (a sequence of unicode code points) instead of a byte string (a sequence of bytes). More precisely, %s calls str() on each argument, and for a bytes object str() falls back to repr(). The result is therefore the concatenation of the representations (repr() / __repr__()) of these byte strings, not of their contents.
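To make the difference concrete, here is a minimal sketch, assuming Python 3. It contrasts text formatting of bytes objects with bytes formatting; the latter is guarded by a version check because it is only available from Python 3.5 onwards (PEP 461). The variable names as_text and as_bytes are just illustrative.

```python
import sys

a = b"\x61"
b = b"\x62"

# Text formatting: %s embeds the repr() of each bytes object, the result is str.
as_text = "%s%s" % (a, b)
print(type(as_text).__name__, as_text)

# Bytes formatting: only available on Python 3.5 and newer (PEP 461).
as_bytes = None
if sys.version_info >= (3, 5):
    as_bytes = b"%s%s" % (a, b)
    print(type(as_bytes).__name__, as_bytes)
```

Since the bytes variant fails on Python 3.0 through 3.4, it is not a good choice for code that has to run on those versions.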

Concatenating two byte strings

In Python 3, the + operator (implemented by __add__) returns what we want:

>>> a + b
b'ab'

The + operator also works in Python 2:

>>> a + b
'ab'
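One thing to keep in mind, sketched here under the assumption that the code runs on Python 3: + only concatenates when both operands are byte strings. Mixing bytes and text is rejected with a TypeError instead of being silently coerced as in Python 2. The names joined and mixed_error are illustrative only.

```python
a = b"\x61"
b = b"\x62"

# Two bytes objects concatenate to a new bytes object.
joined = a + b
print(joined)

# Mixing bytes and text raises TypeError on Python 3.
mixed_error = None
try:
    a + u"\x62"
except TypeError as exc:
    mixed_error = exc
    print("TypeError:", exc)
```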

Concatenating many byte strings

The above is the preferred method if you only want to concatenate two byte strings. If you need to concatenate a longer sequence of byte strings, the good old join() works in both Python 2.7 and 3.x.

Python 3 output:

>>> b"".join([a, b])
b'ab'

Python 2.7 output:

>>> b"".join([a, b])
'ab'

In Python 3, the 'b' prefix on the separator string is crucial (a sequence of bytes objects must be joined with a separator of type bytes). In Python 2.7, the 'b' prefix is ignored, since string literals are byte strings by default anyway. In Python 2.5 and older, the 'b' prefix is a syntax error.
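The following sketch, assuming Python 3, shows join() with a longer sequence, with a non-empty separator, and with a text separator that does not match the element type. The list name chunks and the result names are made up for illustration.

```python
chunks = [b"\x61", b"\x62", b"\x63"]

# An empty bytes separator glues the pieces together directly.
glued = b"".join(chunks)
print(glued)        # b'abc'

# Any bytes separator works, and any iterable of bytes is accepted.
listed = b", ".join(c.upper() for c in chunks)
print(listed)       # b'A, B, C'

# A text separator cannot join bytes objects on Python 3.
join_error = None
try:
    "".join(chunks)
except TypeError as exc:
    join_error = exc
    print("TypeError:", exc)
```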

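For a largish sequence of byte strings, join()-based concatenation is clearly more efficient than the +-based one. Byte strings are immutable, so each + generally allocates a new object and copies everything accumulated so far, whereas join() determines the total length once and copies each piece a single time.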
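If you want to check this on your own machine, a rough timing sketch could look like the one below. The workload (5,000 pieces of 16 bytes each) and the helper names concat_with_plus and concat_with_join are arbitrary assumptions; the absolute numbers depend on the interpreter and hardware, so no output is shown.

```python
import timeit

# Hypothetical workload: 5,000 small byte strings.
chunks = [b"x" * 16] * 5000

def concat_with_plus(parts):
    result = b""
    for part in parts:
        # Each iteration builds a new bytes object from the old one plus part.
        result = result + part
    return result

def concat_with_join(parts):
    # A single call that assembles the final bytes object.
    return b"".join(parts)

# Both approaches must produce the same byte string.
assert concat_with_plus(chunks) == concat_with_join(chunks)

print("plus:", timeit.timeit(lambda: concat_with_plus(chunks), number=3))
print("join:", timeit.timeit(lambda: concat_with_join(chunks), number=3))
```

The gap should widen as the number of pieces grows, because the loop re-copies the accumulated result on every step.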