Monthly Archives: January 2014

NDR interview with Edward Snowden

Don't just listen to what people are saying about him; get to know him a bit.

NDR journalist Hubert Seipel interviews Edward Snowden in Moscow (~30 minutes):

Original source: http://www.ndr.de/ratgeber/netzwelt/snowden271.html
If this video is blocked for you or has been deleted, you might want to try https://www.google.de/search?q=vimeo+ndr+snowden.

Wannabe scientists as lens benchmarkers

Today's topic: photography. Unfortunately, many lens reviews mainly discuss quantitative aspects that are of negligible relevance in practice.

Regarding the optical performance of a lens, especially image sharpness, there are plenty of metrics that can be measured. Not surprisingly, these metrics can be measured uber precisely, given expensive lab equipment. Obviously, lenses differ in these metrics. The issue is that many of the photography review sites out there forget to put these differences into context. They forget to explain that these differences are often insignificant when doing photography under real-world conditions. At the same time, they treat the differences among lenses that actually matter in practice (such as focus speed, bokeh, material, and weight) as a side note. Why is that? I guess people are easily blinded by their cool and uber-precise lab equipment as well as by their ‘research results’, so that they forget what actually matters.

Talking bottlenecks

The concept of bottlenecks is one that every engineer understands and should cherish. Actually, everybody understands it, I guess. However, many people do not apply it on a day-to-day basis (including those wannabe scientists writing lens reviews). Applying the concept of bottlenecks means identifying the weakest component in a system with respect to a certain output criterion. That thing, the weakest component, the bottleneck, determines the output quality (a toy sketch follows the two insights below).

  • Insight 1: improving everything except the bottleneck does not change the output quality. As a direct consequence, any effort spent improving other components is wasted.
  • Insight 2: it is possible to improve the bottleneck component so much that a different component becomes the new bottleneck. That would be a serious improvement.
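To make this concrete, here is a toy sketch in Python; the component names and quality numbers are made up purely for illustration:

# Toy model: the output quality of a system is capped by its weakest
# component (the bottleneck). The numbers are invented for illustration.
components = {
    "subject movement vs. exposure": 3.0,
    "focal plane alignment": 4.0,
    "optical performance of the lens": 9.0,
}

bottleneck = min(components, key=components.get)
print("Bottleneck: %s (quality %.1f)" % (bottleneck, components[bottleneck]))

# Insight 1: raising any value other than the minimum leaves the
# minimum (and hence the output quality) unchanged.
# Insight 2: raising the minimum far enough makes another component
# the new bottleneck.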

Now, regarding image sharpness, there are three major components in the optical system that could potentially be bottlenecks, depending on the scenario:

  • Subject movement vs. exposure time
  • Subject location and dimension vs. focal plane location and orientation
  • Optical performance of the lens (or: “how sharp is the image under ideal conditions, i.e. subject is ideally aligned with the focal plane and the subject does not move”)

When reading these points, you probably already realize that in real-world photography the first two are the real bottlenecks. Whereas the third point is about physical limitations imposed by the lens, the first two points are physical limitations imposed by the scene itself. There is no way around these physical limitations.

Unbreakable limits

Let’s explicitly state what these physical limitations are:

  • Subjects move. Think people, animals, or plants moving in the wind. Things just move. There is at least some micro movement. When things practically don’t move, we’re under laboratory conditions.
  • Depending on the exposure time and the camera movement, subject movement yields a certain degree of blurriness in the image. This can be very large or very small. In the image, it might even appear that the subject is frozen whereas the background is blurry. In any case, any movement in the scene decreases sharpness somewhere in the image (see the sketch after this list for a back-of-the-envelope estimate).
  • Taking an image of a scene means projecting 3D onto 2D. There is only one focal plane where this projection is sharp (ignoring movement). It helps to vividly imagine a real plane standing somewhere in the scene. Imagine how this plane cuts the scene and the actual subject. Clearly, almost all of the scene is *not* in focus, i.e. does not cut the focal plane. These parts of the scene are not sharp, even if they are not moving. Regarding the background, this might be what you want. Regarding the actual subject, however, we also need to appreciate that only a tiny fraction of its 2D projection is ever in focus. In other words, if the subject has an obvious 3D shape, such as a face, an animal, a building, or a plant, most of the subject is *not* in focus (arguably, it might be pretty close to focus). If the subject is flat and its alignment with the focal plane is perfect, we're under laboratory conditions.
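To get a feeling for the numbers, here is a rough back-of-the-envelope sketch in Python. It uses the thin-lens magnification approximation, and all parameter values are assumptions chosen for illustration, not measurements:

# Estimate motion blur in sensor pixels for a slightly moving subject.
# All numbers below are assumed for illustration.
focal_length = 0.05       # 50 mm lens, in meters
subject_distance = 3.0    # meters
subject_speed = 0.05      # 5 cm/s: a person swaying slightly
exposure_time = 1.0 / 60  # seconds
pixel_pitch = 5e-6        # 5 micrometers, plausible for an APS-C sensor

# For subject_distance >> focal_length, a displacement x in the scene
# maps to roughly x * focal_length / subject_distance on the sensor.
magnification = focal_length / subject_distance
blur_on_sensor = subject_speed * exposure_time * magnification
blur_in_pixels = blur_on_sensor / pixel_pitch

print("Motion blur: %.1f pixels" % blur_in_pixels)

Under these assumptions, the result is about 2.8 pixels of motion blur: even a gently swaying subject already smears the image over more pixels than typical lab-measured sharpness differences between decent lenses amount to.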

As you can see, the physical limitations can be minimized under laboratory conditions; no surprise, that is what a lab is good for. The right question to ask at this point: honestly, considering your shooting habits, how large is the fraction of photos you take under laboratory conditions? I guess 1 % would already be a large fraction.

Insight: we usually don’t shoot under idealized conditions and, therefore, there are severe physical limitations due to scene movement and focal issues. Given a certain scene, these limitations are unbreakable theoretical limits.

The photographer is the bottleneck. Only then, …

I stressed the phrase “in the optical system” above, ignoring the photographer as part of the system. But he or she does belong to the real system, right? So, a little abstraction here: what do the bottlenecks “subject movement vs. exposure time” and “subject location and dimension vs. lens focus” mean, considering that the photographer needs to master focus and exposure? Let's bring things in order: regarding image sharpness and thinking in the concept of bottlenecks,

  • the photographer's own skill in mastering focus and exposure is the most relevant bottleneck.
  • for a reasonably skilled photographer, the physical limitations of focus and exposure are the next relevant bottlenecks.
  • only under idealized (laboratory) conditions can the physical limitations regarding subject movement and focal-plane alignment be controlled so well that the actual sharpness of the lens plays a role.

Lens sharpness? Seriously?

What we know is that practically *all* (D)SLR lenses on the market, even the cheapest ones, provide an image sharpness that is good enough for the physical limitations discussed above to be more limiting by orders of magnitude. According to the concept of bottlenecks, the differences between these lenses in terms of image sharpness under laboratory conditions do not matter in real-world photography. So why do many lens reviews spend 90 % of their text talking about the optical limitations of lenses? An open question.

A quote from Ken Rockwell, a popular lens reviewer who writes about what actually matters in lenses:

It’s the least skilled hobbyists who waste the most time blaming fuzzy pictures on their lenses, while real shooters know that few photos ever use all the sharpness of which their lenses are capable due to subject motion and the fact that real subjects are rarely perfectly flat.

As a reminder: the 2013 Elbe flood in Dresden

Quite an oppressive mood for so much chocolate pudding. I remember this moment: June 5, 2013, a quarter of an hour after sunset. I put the camera on the old wall of Dresden's Albertbrücke, steadied the new wide-angle lens with the camera strap, and exposed for 4 seconds:

Dresden flood 2013, Albertbrücke

Maybe I will take a picture from the same perspective in the next few days. We have less water now, but a bit of snow instead.

Filter web server logs for missing file errors (a grep, sort, uniq example)

I suspected that some (image) files belonging to pages of my web presence might no longer be in the proper place. To find these cases systematically, I looked into my nginx error logs and saw various missing file errors such as

2014/01/23 09:18:18 [error] 22000#0: *3901754 open() "/XXX/apple-touch-icon.png" failed (2: No such file or directory), client: 173.245.53.224, server: gehrcke.de, request: "GET /apple-touch-icon.png HTTP/1.1", host: "gehrcke.de"

The above is perfectly valid (Apple devices poll for apple-touch-icon* files by default). To find the real problems, I wanted to filter all missing file errors and sort them by frequency. This one-liner helps:

cat nginx_error.log | grep -Eo 'open\(\) "/.+" failed' | sort | uniq --count | sort -nk 1 | less

At the tail of the output I found that I am really missing two important image files on the server that belong to one blog post:

2681 open() "/XXX/wp/blog_content/websocket_test_empty.png" failed
2688 open() "/XXX/wp/blog_content/websocket_test.png" failed

The above one-liner works in the following way:

  • First, cat writes the nginx error log to stdout.
  • Then, grep reads these lines from stdin and processes them line by line, looking for a pattern interpreted in extended regex mode (-E option). It writes lines containing the pattern to stdout; due to the -o option, it writes only the part of each line that matches the pattern rather than the entire line.
  • sort brings these lines into order, i.e. repetitive occurrences (duplicate lines) become adjacent to each other.
  • uniq --count merges adjacent duplicate lines into a single line and prepends the number of occurrences to it.
  • sort -nk 1 sorts these merged lines by the first whitespace-separated field (-k 1 option) in numerical mode (-n option), in ascending order.
  • The final less visualizes the outcome. Go to the end of the output to find the most frequent missing file errors.
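If you need something more reusable than the shell pipeline, the same counting logic fits into a few lines of Python. This is a minimal sketch, assuming the error log file is called nginx_error.log and its lines look like the example above:

import re
from collections import Counter

# Mirror the shell pipeline: extract the matching substring per line,
# count duplicates, and print them sorted by frequency (ascending).
pattern = re.compile(r'open\(\) "/.+" failed')

counts = Counter()
with open("nginx_error.log") as logfile:
    for line in logfile:
        match = pattern.search(line)
        if match:
            counts[match.group(0)] += 1

for entry, count in sorted(counts.items(), key=lambda item: item[1]):
    print("%7d %s" % (count, entry))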

Consumer hard disks: a large-scale reliability study

Backblaze, an online backup service, has published statistics about the reliability of consumer hard disk drives (HDDs) in their enterprise environment: http://blog.backblaze.com/2014/01/21/what-hard-drive-should-i-buy/

Throughout the last couple of years, they have deployed different models of Seagate, Western Digital, and Hitachi drives at large scale (thousands of individual disks) in their storage pools, enabling them to collect conclusive statistical data, such as the “annual failure rate” (number of failures per cumulative drive-year of service). An impressive visualization is the three-year survival rate:

Backblaze data: disk survival rate by vendor

Note: for a typical consumer, a Seagate drive should (on average) last longer than indicated by the graph above, since Backblaze puts its drives under heavier load than normal users do (Backblaze surely performs significantly more write operations on the disks, and a normal user does not spin the drives 24/7). While there is no doubt that Seagate drives were identified as significantly less reliable in this study (compared to WD and Hitachi), Backblaze still likes Seagate drives because they are usually quite cheap.

You might be wondering why exactly Backblaze uses consumer hard disk drives for running their service instead of enterprise-class disks. It is all about money, especially the value-for-money ratio. Backblaze realized that the relative gain in reliability and warranty of enterprise-class disks compared to consumer HDDs is not as high as the relative price increase. On a small scale, it might be worth having enterprise disks, because one has less trouble with failing hardware. When operating at large scale, however, Backblaze concluded that it is cheaper to operate with consumer HDDs. Obviously, Backblaze has to make sure (and they do, I guess) that data is stored with sufficient redundancy, so that the overall likelihood of data loss is on the same level (almost zero) as with enterprise disks. Also, Backblaze has to properly adjust their infrastructure and internal processes/guidelines to the “high” rate of disk failures. Considering 30,000 disks being operated simultaneously and an annual failure rate of 5 % (which is realistic, according to their data), about 4 disks have to be exchanged per day (30000 * 0.05 / 365; see the short calculation below).
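That back-of-the-envelope estimate in code form, assuming failures are spread evenly over the year (the fleet size and failure rate are the rough figures from above):

# Expected daily disk replacements for a large fleet, assuming
# failures are spread evenly over the year. Rough figures from above.
fleet_size = 30000
annual_failure_rate = 0.05

replacements_per_day = fleet_size * annual_failure_rate / 365.0
print("%.1f disk replacements per day" % replacements_per_day)  # ~4.1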

For my NAS, I recently bought Western Digital's 3 TB disks from the Red line, which are energy-saving consumer disks ‘certified’ for 24/7 operation: a very good choice, according to technical reviews and Backblaze's data. You might want to look into these Red line models yourself; they come at a very good price. Until now, I had assumed that the differences in HDD quality on the consumer market are small. Backblaze's data disproves this assumption, and although Seagate models are surely fine for many types of applications, I won't blindly buy Seagate anymore. It is impressive to see certain models being much more reliable than others (on average) under heavy load conditions.

I am looking forward to seeing more conclusive data from Backblaze in the next couple of years. This kind of data is usually kept secret by the manufacturers, and normal tech reviews simply cannot provide it for lack of statistics.