The scientific method: in normal times you don’t give a fuck about it

“The statistics are flawed”, “the numbers don’t convince me” are some of the more common things I hear and read these days in view of the ongoing pandemic of coronavirus disease 2019 (COVID-19).

One cannot disagree with that. I cannot disagree with you when you say that. In fact, I can very much relate to that. I feel you. And I’d love to work together with you to discuss the numbers, just as much as I like to do that for and with myself.

Every single person who says that “the numbers are not correct” is correct. Do you want to be one of them? I can understand that.

All the skeptical people are right. Really. You are absolutely right when you don’t quite trust the current statistics and numbers.

But some of you are also fucking fundamentally wrong and destructive.

When you think and propagate that it isn’t all as bad as “they” are trying to tell us.

When you use that argument to morally justify not following guidelines.

When you try to convince your peers of your “insight”.

That disappoints me so much. What I hear and read these days is a reminder of how deeply we as humanity have to worry about our own ignorance and stupidity. It’s our own collective stupidity that we suffer from the most.

Of course you would agree with that. Because clearly “they” are all idiots, right?

You may even think that you contribute something useful when you say “the numbers are not correct”. I am pretty sure: it’s not a productive contribution. You need to care a little more, and try a little harder (please read on).

Fortunately, the relevant people have already considered the uncertainty of the data for us. They know that they are making their decisions on potentially thin ice. They work with the fact that all models are wrong, some are useful (cheers to Mia). Finding more of the latter is hard work.

 

There is an important difference between them and you.

Most of “them” are trying to help. For example, by erring on the side of caution. Some of “them” are making decisions that matter. Because they are asked to do so. And at least in Germany I am quite grateful for the average decision-making that happened to date in the context of COVID-19. I am especially grateful that it’s not people like you who get to make decisions that matter.

And here we are: the big difference between “them” and you is that you don’t need to make decisions that matter. You just use this opportunity to express how little you trust the people around you to do the right thing, how little you trust their desire to actually protect you from severe illness. I would love to see a shift in perspective: how would you think if you had to make difficult decisions that matter? Think about it, please.

 

You are Captain Obvious.

For undeniable, credible, bulletproof statistics we need dense sampling. We need exhaustive spatial and temporal coverage.

That requires a large amount of time, and many countries being affected by definition. It will take months and years to quantitatively and with great confidence answer all these questions that we seek answers for right now.

However, we needed to make decisions in the meantime. And we still need to make decisions right now, and we are going to have to make decisions every minute from now. All in view of uncertainty.

There is a fundamental conflict here that implies that we’re in a shitty situation.

 

You don’t get kudos for finding out that we’re in fact in a shitty situation.

Working with the uncertainty at hand is exactly what every single scientist working in the field of epidemiology tried their best to be prepared for. Dealing with uncertainty is the biggest challenge of every single politician who needs to make decisions right now.

What do you think are most of these discussions about in meeting rooms and phone conferences? I would assume: uncertainty.

Now after we’ve made some decisions (in view of uncertainty) you come in, Captain Obvious.

Glad we have you. Genius. You’ve come up with some incredible findings.

For example, you have found that some test methods might be affected by false positives and false negatives. You have found that the time evolution of the case count numbers in region X is not conclusive because we don’t quite know what the time evolution of the corresponding testing effort looked like. You have heard that some COVID-19-attributed deaths might actually have been misclassified. No! Why didn’t we think of THAT?

 

Thanks, Genius. You’re right.

Genius! Fortunately you had a look at the website of the European mortality monitoring institution and found that it doesn’t quite show anything unusual for Europe in March 2020. Thanks for looking. I mean, really! We almost missed looking at their website. Now we know that it probably isn’t even unusual that all these people die. Probably just a local outlier!

In view of that: you are of course right! It’s very likely that whatever is happening in Italy, Spain, France is completely anecdotal. Of course! They’re probably just having bad luck over there, don’t have their shit together, or are just plain stupid. I mean, right, of course! There must be some plausible explanation for what’s happening there other than “this is dangerous”. And even if we don’t find a plausible explanation: we cannot yet conclude that “this is dangerous” because all the data we have so far are so damn inconclusive! We don’t really know anything! How can we act not knowing? You are right, the best course of action is to eat some crackers and every now and then refresh the website of the European mortality monitoring institution. Really cool plots they have.

You’re right. We should amend our conclusions: it’s unlikely that anything bad is going to happen to us because we don’t have bulletproof statistics yet about Italy, Spain, France. China’s numbers? Made up. And about the U.S…. I mean, their president simply fucks it all up, right (yes, he does)?

 

Denial of observation

Faced with “anecdotal evidence”, some of you even flip into conspiracy land. You happily ignore or trivialize the qualitative in the absence of the quantitative (thanks, Gustav).

I really wish you would try to understand what I just tried to say here.

Example: you do not actually show genuine, deep interest in the situation in Italian hospitals (and refuse to analyze how we got there) until someone presents bulletproof evidence on a silver platter convincing YOU that .. oh, fuck, there were indeed unusually many people “dying”.

 

Did I just say “dying”? Hah.

In your view it seems to be only about dying or surviving. You seem to pick your battle with a potential number of deaths from a specific disease.

I would love to remind us that the spectrum of suffering includes more than just people dying. Do you know what it’s like to be intubated? I don’t. And I don’t want to find out. Seriously. Did you already hear that when you “survive” (which is likely, yes) you might do so with some long-term lung tissue damage?

I don’t want to create fear. But at a higher level I would like you to consider what it means to fight a disease that we barely understand yet. There is a spectrum between “dying” and “surviving” that very much defines how we experience this disease. Please, differentiate.

From your arguments I infer that you all seem to understand what COVID-19 really is, how straight-forward it is to treat. You seem to know your enemy very well. Cool.

I can tell you that our measures of precaution are fortunately not one-dimensionally based on the potential number of people dying. Yes, it’s what most of us talk about. Because of our stupidity, and our inability to talk in more than one dimension through mainstream media.

But in the meeting rooms and in the phone conferences “they” talk about details. Details that matter.

You may want to consider for yourself that you know NOTHING about all that, compared to all the things that one can know about this disease, and how it spreads through populations.

 

Stop telling your peers that the lockdown measures are too much, that things are “not as bad as they are trying to tell us”.

Shut up. Until you know. You should pause here for a moment. And fucking think about the word “know”.

Also think about the following: given your limited efforts it is unlikely that you understand more about this than those people who try really really hard. Try harder, and then we can talk (see below).

My most important point really is: don’t influence the people around you with your personal risk assessment.

Yes, that assessment that you pulled out of your ass. Based on how severe this sickness was for you and your friends. OH WAIT, don’t have personal experiences yet? Maybe wait for things to happen. This. has. just. begun.

Free of irony: the timescale of the observation period that you base your intuition on is likely to be vastly different from the timescale on which this pandemic will change things for you and us. Maybe account for the possibility that your intuition can mislead you.

Please, do not assume that the majority of the people providing guidance towards decision-making do not know what they are doing.  Stop believing that you are a productive member of society by finding simple flaws in the “statistics” (call it whatever you’d like) underlying the political decisions of the past weeks. I mean it’s cool that you care! But you should realize that (tens of) thousands of people working in scientific fields are actually doing their job here pretty well and you suggesting otherwise is not helpful. It creates damage.

Have you ever thought about how hard it is to have an in-depth expert-level discussion AND at the same time communicate the subtle findings to the public so that we would be able to “understand”? It’s close to impossible. What you see through the media is (sadly, yes!) a very very very very tiny fragment of what actually matters.

Statements will be refined and revised. Measures will be extended and relaxed. That’s all part of the process. Personally, given the risk implied by the unknown, I quite like it when this process starts by erring on the side of caution, and then relaxes over time. Yes, you will find many statements that appear to be “wrong”. Especially in hindsight. That does not indicate that we are taking a bad path through this crisis.

 

Invitation to contribute

I want to say that if you really believe that you can contribute anything substantial to the discussion about models, statistics, numbers: you can. You’re even invited to. All you need to do is take some time and choose the right words. The scientific community is more open and accessible than ever.

Please, join the discussion. Be productive. This discussion leads to iteration. That iteration results in a refined understanding of the current situation. We derive knowledge.

This is called the scientific method.

In normal times you don’t give a fuck about it.

More knowledge leads to better decision-making. This is how this world works, on average.

I am calming myself down and genuinely ask you to please please have some faith, please assume good intentions. At the very least when you think about the science behind all that.

VS Code: I cannot quite figure out how to search and replace in a selection

Gosh, search and replace in a selection of text in VS Code drives me nuts.

I can’t even describe how exactly it does so, but it works against my intuition.

It’s funny how in 2019 we haven’t quite settled on UX for that!

For me, it’s super hard to grasp how this (search and replace in a selection of text) is supposed to work in VS Code, as in: how would the people who designed this part of the product like me to use it? Which workflow did they have in mind? I don’t seem to understand the magic behind it. I often seem to end up replacing things in the entire document as opposed to in the selection that I just carefully created a second before. All I can conclude so far is that trial and error don’t seem to get me far :-).

Trying to understand the behavior based on docs and reading GitHub issues then reveals that this thing is complex, that it can be configured with a number of special configuration options, and that many users experience WTF moments on a daily basis. Users keep posting videos and GIFs of their weird experiences (thanks!). Two of these WTF moments are well captured in the GIFs in this comment: https://github.com/microsoft/vscode/issues/17563#issuecomment-542656591 (from October 2019).

I can say that search and replace in a selection of text works much better, so much more intuitively, in Sublime Text 3 (by default).

In VS Code it turned out that setting editor.find.autoFindInSelection to true helps me quite a bit towards getting more predictable outcomes.

Feel free to leave a comment below if you have an anecdotal opinion about all this.

If you’d like to read along, I find these issues pretty entertaining:

I particularly agree with this statement from November 2019:

I appreciate all the work and attention that has gone into this, but I don’t think I will ever be truly comfortable with Find/Replace in VS Code. The issue for me is that the “find in selection” button looks and feels like a toggle switch, not a trigger (or “fire button”).

If you flip a toggle switch from one position to the other, and then back to the original position, it should be as though you had never flipped the switch at all. With this mental model, it is always a shock and a disappointment when flipping the switch irreversibly destroys my carefully crafted manual selection.

 

 

 

Bokeh: disable touch interaction (disable drag, zoom, pan)

Bokeh is quite cool. I was looking for a way to disable touch interaction with a plot, though. Was a little tedious to find a solution on the web. Found it, buried in the docs. Therefore this quick post.

Say you have created a figure object with

fig = bokeh.plotting.figure(...)

Then you can disable the individual touch controls with e.g.

fig.toolbar.active_drag = None
fig.toolbar.active_scroll = None
fig.toolbar.active_tap = None

Very easy to do, you just need someone to tell you :-).

(Tested with Bokeh 2.0.0)
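
Put together, a minimal self-contained sketch could look like the following. The tool list, the title and the data points are made up for illustration; only the three toolbar properties come from the snippet above.

```python
from bokeh.plotting import figure

# Hypothetical example figure: tools, title and data are illustrative only.
fig = figure(title="static plot", tools="pan,wheel_zoom,tap")
fig.line([1, 2, 3], [4, 6, 5])

# Deactivate the default drag, scroll and tap tools so that
# touch gestures on the plot area do not pan, zoom or select.
fig.toolbar.active_drag = None
fig.toolbar.active_scroll = None
fig.toolbar.active_tap = None
```

Note that the tools themselves stay in the toolbar, so a user could still switch them on by hand. If that is not desired, hiding the toolbar with fig.toolbar_location = None is one option.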

Bulma: sticky footer (flexbox solution)

Bulma is nice. I was looking for a way to get a sticky footer, though. Like many others, I was a little surprised that it’s not built-in: https://github.com/jgthms/bulma/issues/47

It’s of course doable. I have created a demo / minimal working example based on the solution proposed here.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>Footer at the bottom / sticky footer</title>
    <link
      rel="stylesheet"
      href="https://cdn.jsdelivr.net/npm/bulma@0.8.0/css/bulma.min.css"
    />
    <style type="text/css" media="screen">
      body {
        display: flex;
        min-height: 100vh;
        flex-direction: column;
      }
 
      #wrapper {
        flex: 1;
      }
    </style>
  </head>
  <body>
    <div id="wrapper">
      <section class="hero is-medium is-primary is-bold">
        <div class="hero-body">
          <div class="container">
            <h1 class="title">
              The footer is at the bottom. Seriously. 🦠
            </h1>
            <h2 class="subtitle">
              By <a href="https://gehrcke.de">Jan-Philip Gehrcke</a>
            </h2>
          </div>
        </div>
      </section>
    </div>
    <footer class="footer">
      <div class="content has-text-centered">
 
          I am down here.
 
      </div>
    </footer>
  </body>
</html>

I posted this in the GH thread, too: https://github.com/jgthms/bulma/issues/47#issuecomment-603547213

COVID-19 case numbers: please don’t cite JHU. Cite the state ministries instead.

Dear ARD Tagesschau/Tagesthemen and dear ZDF heute journal: you are doing a fantastic job covering COVID-19. Thank you for that!

But I have one request for you and for many other German media outlets: please don’t cite Johns Hopkins University (JHU) when you report the current number of confirmed COVID-19 infections in Germany.

The numbers themselves are good. The source “JHU”, however, causes quite a lot of confusion. Please cite our local health authorities (Gesundheitsämter) and state ministries instead: same numbers, correct source, less confusion. That is what I try to explain in this post.

You know yourselves that many citizens wonder where the discrepancy between the JHU numbers and the numbers published by the Robert Koch Institute (RKI) comes from.

I, too, wondered why the heute journal cites JHU and not the RKI. Given how different the numbers are, I wanted you to explain that to me.

The heute journal addressed this with a segment in its broadcast of March 19 (you can watch it again in this tweet). But I was disappointed: we were told that JHU relies on sources such as “WHO, CDC, ECDC, NHC, DXY, …”. That left us none the wiser. It was not a plausible explanation for the discrepancy between the data sets, nor for the decision to cite JHU instead of the RKI.

A brief digression on that discrepancy, using March 21 as an example. The RKI situation report for March 21, 2020 says: 16,662 confirmed cases. JHU claims this evening, March 21: about 22,000 confirmed cases. Comparable? The RKI situation report states that it reflects the data as of March 21, 2020, 0:00. So that number should be compared with the last JHU data point for March 20. Let’s look it up: time_series_19-covid-Confirmed.csv, line 13, in the git repository CSSEGISandData/COVID-19 on GitHub. For March 20 it says 19,848.

So a fair question is: “Why does the RKI report about 16,600 cases for the end of the day on March 20, while JHU reports about 19,800 cases for the same point in time?”
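
To put a number on that gap, here is a small back-of-the-envelope sketch that uses only the two figures quoted above. The variable names are my own, chosen for illustration.

```python
# Confirmed cases in Germany for the end of March 20, 2020,
# as quoted in the text above (variable names are illustrative).
rki_cases = 16_662  # RKI situation report, data as of 2020-03-21, 0:00
jhu_cases = 19_848  # JHU time series value for 2020-03-20

gap = jhu_cases - rki_cases
relative_gap = gap / rki_cases

print(f"absolute gap: {gap}")               # absolute gap: 3186
print(f"relative gap: {relative_gap:.1%}")  # relative gap: 19.1%
```

So for the same point in time, the JHU value is roughly a fifth higher than the RKI value.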

In my opinion there is a clear and satisfying answer: the JHU data correspond to current, official data from the state ministries. The RKI data are verified data from the same state ministries, published with a delay.

Each of us can, at any time of day, take the small trouble to look up the current COVID-19 case numbers of each of the 16 federal states. Let me demonstrate, with current numbers from six federal states (as I write this text):

The principle is clear. These case number portals and statistics of the individual states are already quite well done. Thank you, dear state ministries! Here I link to a few more of them.

If I now wanted to inform the public about the current number of confirmed COVID-19 cases in these six states, I could simply add them up and write: about 17,400 cases, as of March 21 (morning/midday). Source: health ministries of the federal states.

Couldn’t I? A plausible, trustworthy source. In contrast to the mysterious, not immediately plausible source “JHU”.

If you now add the other 10 states, you arrive at the current count (as I write) of about 22,000 confirmed cases (we already saw that number above: it is what JHU is reporting as I write this).

As far as I know, this approach (consulting the state authorities) is exactly what ZEIT ONLINE and Berliner Morgenpost do meticulously, several times a day. They immediately feed the data they obtain into their visualizations:

And the RKI’s numbers? The RKI itself writes:

“Between a case becoming known locally, the report to […], the entry of the data into […], and the transfer from there to the RKI, a certain amount of time passes. According to the requirements of the Infection Protection Act (Infektionsschutzgesetz), this can be two to three working days.”

Conclusion 1: The state ministries publish their current data, well prepared, at any point in time. Aggregated from the local health authorities that report to them. These are official numbers. And probably quite good (reliable) numbers, too. But of course these numbers, too, already lag behind the reports of the local health authorities!

Conclusion 2: I think we can use the numbers from ZEIT ONLINE and Berliner Morgenpost with a clear conscience. Based on my spot checks, they exactly reflect what the state ministries publish.

Conclusion 3: The RKI processes the data slowly and, compared to the state ministries, adds a further delay of 1 to 2 days to the publication of the data. Why that is: unclear. A thorough review of the data? Maybe. I would like a better explanation for that!

Conclusion 4: Tagesschau and heute journal: please name the state (health) ministries or the local health authorities as the source for the current case numbers. That is correct and causes less confusion.

Perhaps you could additionally cite ZEIT ONLINE or the Berliner Morgenpost, which do the consulting for us and keep the current total ready for us.

Thank you.

Side note 1: I wrote about this a few days ago already, and have tried to sort out my thoughts a little more clearly here.

Side note 2: I am working on https://github.com/jgehrcke/covid-19-germany-gae. A consolidated, machine-readable data source of the current case numbers in Germany and its individual federal states, including the historical time series.