
Steven D. Schafersman’s Introduction to Science

In 1997, Steven D. Schafersman (professor of geology, Miami University) published an essay titled “Scientific Thinking and the Scientific Method”. Its primary source is this website: http://pbisotopes.ess.sunysb.edu/esp/files/scientific-method.html, created with FrontPage 3.0 (and with a nice brownish-yellowish background). I feel the urge to share this text with you and to keep it from getting lost on the net. The essay manages to bridge fundamental scientific approaches and philosophy in simple words. As a scientist, I often have those unstructured thoughts about “why the hell am I doing this”, and while reading the text I got the feeling that it brings some of these thoughts into order. Also, for some of us it might be nice to read explicitly why we have been educated the way we were. Keep in mind that this is only an introduction to the huge philosophical topic of scientific theory, but it serves well as a primer.

The essay is referenced and quoted in many places, including books. However, most of the online references contain dead links. Hence, I want to provide another platform for the essay, in modern HTML and in a readable, even mobile-friendly, form. Here it is.


An Introduction to Science

Scientific Thinking and the Scientific Method

Steven D. Schafersman
Department of Geology
Miami University
January, 1997

Introduction

To succeed in this science course and, more specifically, to answer some of the questions on the first exam, you should be familiar with a few of the concepts regarding the definition of science, scientific thinking, and the methods of science. Most textbooks do an inadequate job of this task, so this essay provides that information. This information in its present form is not in your textbook, so please read it carefully here, and pay close attention to the words in boldface and the definitions in italics.

The Definition of Science

Science is not merely a collection of facts, concepts, and useful ideas about nature, or even the systematic investigation of nature, although both are common definitions of science. Science is a method of investigating nature–a way of knowing about nature–that discovers reliable knowledge about it. In other words, science is a method of discovering reliable knowledge about nature. There are other methods of discovering and learning knowledge about nature (these other knowledge methods or systems will be discussed below in contradistinction to science), but science is the only method that results in the acquisition of reliable knowledge.

Reliable knowledge is knowledge that has a high probability of being true because its veracity has been justified by a reliable method. Reliable knowledge is sometimes called justified true belief, to distinguish reliable knowledge from belief that is false and unjustified or even true but unjustified. (Please note that I do not, as some do, make a distinction between belief and knowledge; I think that what one believes is one’s knowledge. The important distinction that should be made is whether one’s knowledge or beliefs are true and, if true, are justifiably true.) Every person has knowledge or beliefs, but not all of each person’s knowledge is reliably true and justified. In fact, most individuals believe in things that are untrue or unjustified or both: most people possess a lot of unreliable knowledge and, what’s worse, they act on that knowledge! Other ways of knowing, and there are many in addition to science, are not reliable because their discovered knowledge is not justified. Science is a method that allows a person to possess, with the highest degree of certainty possible, reliable knowledge (justified true belief) about nature. The method used to justify scientific knowledge, and thus make it reliable, is called the scientific method. I will explain the formal procedures of the scientific method later in this essay, but first let’s describe the more general practice of scientific or critical thinking.

Scientific and Critical Thinking

When one uses the scientific method to study or investigate nature or the universe, one is practicing scientific thinking. All scientists practice scientific thinking, of course, since they are actively studying nature and investigating the universe by using the scientific method. But scientific thinking is not reserved solely for scientists. Anyone can "think like a scientist" who learns the scientific method and, most importantly, applies its precepts, whether he or she is investigating nature or not. When one uses the methods and principles of scientific thinking in everyday life–such as when studying history or literature, investigating societies or governments, seeking solutions to problems of economics or philosophy, or just trying to answer personal questions about oneself or the meaning of existence–one is said to be practicing critical thinking. Critical thinking is thinking correctly for oneself that successfully leads to the most reliable answers to questions and solutions to problems. In other words, critical thinking gives you reliable knowledge about all aspects of your life and society, and is not restricted to the formal study of nature. Scientific thinking is identical in theory and practice, but the term would be used to describe the method that gives you reliable knowledge about the natural world. Clearly, scientific and critical thinking are the same thing, but where one (scientific thinking) is always practiced by scientists, the other (critical thinking) is sometimes used by humans and sometimes not.

Scientific and critical thinking was not discovered and developed by scientists (that honor must go to ancient Hellenistic philosophers, such as Aristotle, who also are sometimes considered the first scientists), but scientists were the ones to bring the practice of critical thinking to the attention and use of modern society (in the 17th and 18th centuries), and they are the most explicit, rigorous, and successful practitioners of critical thinking today. Some professionals in the humanities, social sciences, jurisprudence, business, and journalism practice critical thinking as well as any scientist, but many, alas, do not. Scientists must practice critical thinking to be successful, but the qualifications for success in other professions do not necessarily require the use of critical thinking, a fact that is the source of much confusion, discord, and unhappiness in our society.

The scientific method has proven to be the most reliable and successful method of thinking in human history, and it is quite possible to use scientific thinking in other human endeavors. For this reason, critical thinking–the application of scientific thinking to all areas of study and topics of investigation–is being taught in schools throughout the United States, and its teaching is being encouraged as a universal ideal. You may perhaps have been exposed to critical thinking skills and exercises earlier in your education. The important point is this: critical thinking is perhaps the most important skill a student can learn in school and college, since if you master its skills, you know how to think successfully and reach reliable conclusions, and such ability will prove valuable in any human endeavor, including the humanities, social sciences, commerce, law, journalism, and government, as well as in scholarly and scientific pursuits. Since critical thinking and scientific thinking are, as I claim, the same thing, only applied for different purposes, it is therefore reasonable to believe that if one learns scientific thinking in a science class, one learns, at the same time, the most important skill a student can possess–critical thinking. This, to my mind, is perhaps the foremost reason for college students to study science, no matter what one’s eventual major, interest, or profession.

The Three Central Components of Scientific and Critical Thinking

What is scientific thinking? At this point, it is customary to discuss questions, observations, data, hypotheses, testing, and theories, which are the formal parts of the scientific method, but these are NOT the most important components of the scientific method. The scientific method is practiced within a context of scientific thinking, and scientific (and critical) thinking is based on three things: using empirical evidence (empiricism), practicing logical reasoning (rationalism), and possessing a skeptical attitude (skepticism) about presumed knowledge that leads to self-questioning, holding tentative conclusions, and being undogmatic (willingness to change one’s beliefs). These three ideas or principles are universal throughout science; without them, there would be no scientific or critical thinking. Let’s examine each in turn.

1. Empiricism: The Use of Empirical Evidence

Empirical evidence is evidence that one can see, hear, touch, taste, or smell; it is evidence that is susceptible to one’s senses. Empirical evidence is important because it is evidence that others besides yourself can experience, and it is repeatable, so empirical evidence can be checked by yourself and others after knowledge claims are made by an individual. Empirical evidence is the only type of evidence that possesses these attributes and is therefore the only type used by scientists and critical thinkers to make vital decisions and reach sound conclusions.

We can contrast empirical evidence with other types of evidence to understand its value. Hearsay evidence is what someone says they heard another say; it is not reliable because you cannot check its source. Better is testimonial evidence, which, unlike hearsay evidence, is allowed in courts of law. But even testimonial evidence is notoriously unreliable, as numerous studies have shown. Courts also allow circumstantial evidence (e.g., means, motive, and opportunity), but this is obviously not reliable. Revelatory evidence or revelation is what someone says was revealed to them by some deity or supernatural power; it is not reliable because it cannot be checked by others and is not repeatable. Spectral evidence is evidence supposedly manifested by ghosts, spirits, and other paranormal or supernatural entities; spectral evidence was once used, for example, to convict and hang a number of innocent women on charges of witchcraft in Salem, Massachusetts, in the seventeenth century, before the colonial governor banned the use of such evidence, and the witchcraft trials ended. Emotional evidence is evidence derived from one’s subjective feelings; such evidence is often repeatable, but only for one person, so it is unreliable.

The most common alternative to empirical evidence, authoritarian evidence, is what authorities (people, books, billboards, television commercials, etc.) tell you to believe. Sometimes, if the authority is reliable, authoritarian evidence is reliable evidence, but many authorities are not reliable, so you must check the reliability of each authority before you accept its evidence. In the end, you must be your own authority and rely on your own powers of critical thinking to know if what you believe is reliably true. (Transmitting knowledge by authority is, however, the most common method among humans for three reasons: first, we are all conditioned from birth by our parents through the use of positive and negative reinforcement to listen to, believe, and obey authorities; second, it is believed that human societies that relied on a few experienced or trained authorities for decisions that affected all had a higher survival value than those that didn’t, and thus the behavioral trait of susceptibility to authority was strengthened and passed along to future generations by natural selection; third, authoritarian instruction is the quickest and most efficient method for transmitting information we know about. But remember: some authoritarian evidence and knowledge should be validated by empirical evidence, logical reasoning, and critical thinking before you should consider it reliable, and, in most cases, only you can do this for yourself.)

It is, of course, impossible to receive an adequate education today without relying almost entirely upon authoritarian evidence. Teachers, instructors, and professors are generally considered to be reliable and trustworthy authorities, but even they should be questioned on occasion. The use of authoritarian evidence in education is so pervasive, that its use has been questioned as antithetical to the true spirit of scholarly and scientific inquiry, and attempts have been made in education at all levels in recent years to correct this bias by implementing discovery and inquiry methodologies and curricula in classrooms and laboratories. The recently revised geology laboratory course at Miami University, GLG 115.L, is one such attempt, as are the Natural Systems courses in the Western College Program at Miami. It is easier to utilize such programs in humanities and social sciences, in which different yet equally valid conclusions can be reached by critical thinking, rather than in the natural sciences, in which the objective reality of nature serves as a constant judge and corrective mechanism.

Another name for empirical evidence is natural evidence: the evidence found in nature. Naturalism is the philosophy that says that "Reality and existence (i.e. the universe, cosmos, or nature) can be described and explained solely in terms of natural evidence, natural processes, and natural laws." This is exactly what science tries to do. Another popular definition of naturalism is that "The universe exists as science says it does." This definition emphasizes the strong link between science and natural evidence and law, and it reveals that our best understanding of material reality and existence is ultimately based on philosophy. This is not bad, however, for, whether naturalism is ultimately true or not, science and naturalism reject the concept of ultimate or absolute truth in favor of a concept of proximate reliable truth that is far more successful and intellectually satisfying than the alternative, the philosophy of supernaturalism. The supernatural, if it exists, cannot be examined or tested by science, so it is irrelevant to science. It is impossible to possess reliable knowledge about the supernatural by the use of scientific and critical thinking. Individuals who claim to have knowledge about the supernatural do not possess this knowledge by the use of critical thinking, but by other methods of knowing.

Science has unquestionably been the most successful human endeavor in the history of civilization, because it is the only method that successfully discovers and formulates reliable knowledge. The evidence for this statement is so overwhelming that many individuals overlook exactly how modern civilization came to be (our modern civilization is based, from top to bottom, on the discoveries of science and their application, known as technology, to human purposes). Philosophies that claim to possess absolute or ultimate truth invariably find that they have to justify their beliefs by faith in dogma, authority, revelation, or philosophical speculation, since it is impossible to use finite human logic or natural evidence to demonstrate the existence of the absolute or ultimate in either the natural or supernatural worlds. Scientific and critical thinking require that one reject blind faith, authority, revelation, and subjective human feelings as a basis for reliable belief and knowledge. These human cognitive methods have their place in human life, but not as the foundation for reliable knowledge.

2. Rationalism: The Practice of Logical Reasoning

Scientists and critical thinkers always use logical reasoning. Logic allows us to reason correctly, but it is a complex topic and not easily learned; many books are devoted to explaining how to reason correctly, and we can not go into the details here. However, I must point out that most individuals do not reason logically, because they have never learned how to do so. Logic is not an ability that humans are born with or one that will gradually develop and improve on its own, but is a skill or discipline that must be learned within a formal educational environment. Emotional thinking, hopeful thinking, and wishful thinking are much more common than logical thinking, because they are far easier and more congenial to human nature. Most individuals would rather believe something is true because they feel it is true, hope it is true, or wish it were true, rather than deny their emotions and accept that their beliefs are false.

Often the use of logical reasoning requires a struggle with the will, because logic sometimes forces one to deny one’s emotions and face reality, and this is often painful. But remember this: emotions are not evidence, feelings are not facts, and subjective beliefs are not substantive beliefs. Every successful scientist and critical thinker spent years learning how to think logically, almost always in a formal educational context. Some people can learn logical thinking by trial and error, but this method wastes time, is inefficient, is sometimes unsuccessful, and is often painful.

The best way to learn to think logically is to study logic and reasoning in a philosophy class, take mathematics and science courses that force you to use logic, read great literature and study history, and write frequently. Reading, writing, and math are the traditional methods that young people learned to think logically (i.e. correctly), but today science is a fourth method. Perhaps the best way is to do a lot of writing that is then reviewed by someone who has critical thinking skills. Most people never learn to think logically; many illogical arguments and statements are accepted and unchallenged in modern society–often leading to results that are counterproductive to the good of society or even tragic–because so many people don’t recognize them for what they are.

3. Skepticism: Possessing a Skeptical Attitude

The final key idea in science and critical thinking is skepticism, the constant questioning of your beliefs and conclusions. Good scientists and critical thinkers constantly examine the evidence, arguments, and reasons for their beliefs. Self-deception and deception of yourself by others are two of the most common human failings. Self-deception often goes unrecognized because most people deceive themselves. The only way to escape both deception by others and the far more common trait of self-deception is to repeatedly and rigorously examine your basis for holding your beliefs. You must question the truth and reliability of both the knowledge claims of others and the knowledge you already possess. One way to do this is to test your beliefs against objective reality by predicting the consequences or logical outcomes of your beliefs and the actions that follow from your beliefs. If the logical consequences of your beliefs match objective reality–as measured by empirical evidence–you can conclude that your beliefs are reliable knowledge (that is, your beliefs have a high probability of being true).

Many people believe that skeptics are closed-minded and, once possessing reliable knowledge, resist changing their minds–but just the opposite is true. A skeptic holds beliefs tentatively, and is open to new evidence and rational arguments about those beliefs. Skeptics are undogmatic, i.e., they are willing to change their minds, but only in the face of new reliable evidence or sound reasons that compel one to do so. Skeptics have open minds, but not so open that their brains fall out: they resist believing something in the first place without adequate evidence or reason, and this attribute is worthy of emulation. Science treats new ideas with the same skepticism: extraordinary claims require extraordinary evidence to justify one’s credulity. We are faced every day with fantastic, bizarre, and outrageous claims about the natural world; if we don’t wish to believe every pseudoscientific allegation or claim of the paranormal, we must have some method of deciding what to believe or not, and that method is the scientific method which uses critical thinking.

The Scientific Method in Practice

Now, we are ready to put the scientific method into action. Many books have been written about the scientific method, and it is a long and complex topic. Here I will only treat it briefly and superficially. The scientific method, as used in both scientific thinking and critical thinking, follows a number of steps.

  1. One must ask a meaningful question or identify a significant problem, and one should be able to state the problem or question in a way that it is conceivably possible to answer it. Any attempt to gain knowledge must start here. Here is where emotions and outside influences come in. For example, all scientists are very curious about nature, and they have to possess this emotional characteristic to sustain the motivation and energy necessary to perform the hard and often tedious work of science. Other emotions that can enter are excitement, ambition, anger, a sense of unfairness, happiness, and so forth. Note that scientists have emotions, some in high degree; however, they don’t let their emotions give false validity to their conclusions, and, in fact, the scientific method prevents them from trying to do this even if they wished.

    Many outside factors can come into play here. Scientists must choose which problems to work on, they decide how much time to devote to different problems, and they are often influenced by cultural, social, political, and economic factors. Scientists live and work within a culture that often shapes their approach to problems; they work within theories that often shape their current understanding of nature; they work within a society that often decides what scientific topics will be financially supported and which will not; and they work within a political system that often determines which topics are permitted and financially rewarded and which are not.

    Also, at this point, normally nonscientific emotional factors can lead to divergent pathways. Scientists could be angry at polluters and choose to investigate the effects of pollutants; other scientists could investigate the results of smoking cigarettes on humans because they can earn a living doing this by working for tobacco companies; intuition can be used to suggest different approaches to problems; even dreams can suggest creative solutions to problems. I wish to emphasize, however, that the existence of these frankly widespread nonscientific emotional and cultural influences does not compromise the ultimate reliability and objectivity of scientific results, because subsequent steps in the scientific method serve to eliminate these outside factors and allow science to reach reliable and objective conclusions (admittedly it may take some time for subjective and unreliable scientific results to be eliminated). There exists a school of thought today in the humanities (philosophy, history, and sociology) called post-modernism or scientific constructivism, that claims that science is a social and cultural construct, that scientific knowledge inevitably changes as societies and cultures change, and that science has no inherently valid foundation on which to base its knowledge claims of objectivity and reliability. In brief, post-modernists believe that the modern, scientific world of Enlightenment rationality and objectivity must now give way to a post-modern world of relativism, social constructivism, and equality of belief. Almost all scientists who are aware of this school of thought reject it, as do I; post-modernism is considered irrelevant by scientists and has had no impact on the practice of science at all. We will have to leave this interesting topic for a later time, unfortunately, but you may be exposed to these ideas in a humanities class. If you are, remember to think critically!

  2. One must next gather relevant information to attempt to answer the question or solve the problem by making observations. The first observations could be data obtained from the library or information from your own experience. Another source of observations could be from trial experiments or past experiments. These observations, and all that follow, must be empirical in nature–that is, they must be sensible, measurable, and repeatable, so that others can make the same observations. Great ingenuity and hard work on the part of the scientist is often necessary to make scientific observations. Furthermore, a great deal of training is necessary in order to learn the methods and techniques of gathering scientific data.
  3. Now one can propose a solution or answer to the problem or question. In science, this suggested solution or answer is called a scientific hypothesis, and this is one of the most important steps a scientist can perform, because the proposed hypothesis must be stated in such a way that it is testable. A scientific hypothesis is an informed, testable, and predictive solution to a scientific problem that explains a natural phenomenon, process, or event. In critical thinking, as in science, your proposed answer or solution must be testable, otherwise it is essentially useless for further investigation. Most individuals–noncritical thinkers all–stop here, and are satisfied with their first answer or solution, but this lack of skepticism is a major roadblock to gaining reliable knowledge. While some of these early proposed answers may be true, most will be false, and further investigation will almost always be necessary to determine their validity.
  4. Next, one must test the hypothesis before it is corroborated and given any real validity. There are two ways to do this. First, one can conduct an experiment. This is often presented in science textbooks as the only way to test hypotheses in science, but a little reflection will show that many natural problems are not amenable to experimentation, such as questions about stars, galaxies, mountain formation, the formation of the solar system, ancient evolutionary events, and so forth. The second way to test a hypothesis is to make further observations. Every hypothesis has consequences and makes certain predictions about the phenomenon or process under investigation. Using logic and empirical evidence, one can test the hypothesis by examining how successful the predictions are, that is, how well the predictions and consequences agree with new data, further insights, new patterns, and perhaps with models. The testability or predictiveness of a hypothesis is its most important characteristic. Only hypotheses involving natural processes, natural events, and natural laws can be tested; the supernatural cannot be tested, so it lies outside of science and its existence or nonexistence is irrelevant to science.
  5. If the hypothesis fails the test, it must be rejected and either abandoned or modified. Most hypotheses are modified by scientists who don’t like to simply throw out an idea they think is correct and in which they have already invested a great deal of time or effort. Nevertheless, a modified hypothesis must be tested again. If the hypothesis passes the further tests, it is considered to be a corroborated hypothesis, and can now be published. A corroborated hypothesis is one that has passed its tests, i.e., one whose predictions have been verified. Now other scientists test the hypothesis. If further corroborated by subsequent tests, it becomes highly corroborated and is now considered to be reliable knowledge. By the way, the technical name for this part of the scientific method is the "hypothetico-deductive method," so named because one deduces the results of the predictions of the hypothesis and tests these deductions. Inductive reasoning, the alternative to deductive reasoning, was used earlier to help formulate the hypothesis. Both of these types of reasoning are therefore used in science, and both must be used logically.

    Scientists never claim that a hypothesis is "proved" in a strict sense (but sometimes this is quite legitimately claimed when using popular language), because proof is something found only in mathematics and logic, disciplines in which all logical parameters or constraints can be defined, and something that is not true in the natural world. Scientists prefer to use the word "corroborated" rather than "proved," but the meaning is essentially the same. A highly corroborated hypothesis becomes something else in addition to reliable knowledge–it becomes a scientific fact. A scientific fact is a highly corroborated hypothesis that has been so repeatedly tested and for which so much reliable evidence exists, that it would be perverse or irrational to deny it. This type of reliable knowledge is the closest that humans can come to the "truth" about the universe (I put the word "truth" in quotation marks because there are many different kinds of truth, such as logical truth, emotional truth, religious truth, legal truth, philosophical truth, etc.; it should be clear that this essay deals with scientific truth, which, while certainly not the sole truth, is nevertheless the best truth humans can possess about the natural world).

    There are many such scientific facts: the existence of gravity as a property of all matter, the past and present evolution of all living organisms, the presence of nucleic acids in all life, the motion of continents and giant tectonic plates on Earth, the expansion of the universe following a giant explosion, and so forth. Many scientific facts violate common sense and the beliefs of ancient philosophies and religions, so many people persist in denying them, but they thereby indulge in irrationality and perversity. Many other areas of human thought and philosophy, and many other knowledge systems (methods of gaining knowledge), exist that claim to have factual knowledge about the world. Some even claim that their facts are absolutely or ultimately true, something science would never claim. But their "facts" are not reliable knowledge, because–while they might fortuitously be true–they have not been justified by a reliable method. If such unreliable "facts" are true–and I certainly don’t maintain that all such knowledge claims are false–we can never be sure that they are true, as we can with scientific facts.

  6. The final step of the scientific method is to construct, support, or cast doubt on a scientific theory. A theory in science is not a guess, speculation, or suggestion, which is the popular definition of the word "theory." A scientific theory is a unifying and self-consistent explanation of fundamental natural processes or phenomena that is totally constructed of corroborated hypotheses. A theory, therefore, is built of reliable knowledge–built of scientific facts–and its purpose is to explain major natural processes or phenomena. Scientific theories explain nature by unifying many once-unrelated facts or corroborated hypotheses; they are the strongest and most truthful explanations of how the universe, nature, and life came to be, how they work, what they are made of, and what will become of them. Since humans are living organisms and are part of the universe, science explains all of these things about ourselves.

    These scientific theories–such as the theories of relativity, quantum mechanics, thermodynamics, evolution, genetics, plate tectonics, and big bang cosmology–are the most reliable, most rigorous, and most comprehensive form of knowledge that humans possess. Thus, it is important for every educated person to understand where scientific knowledge comes from, and how to emulate this method of gaining knowledge. Scientific knowledge comes from the practice of scientific thinking–using the scientific method–and this mode of discovering and validating knowledge can be duplicated and achieved by anyone who practices critical thinking.

Copyright © 1997 by Steven D. Schafersman

Websites using WP-GeSHi-Highlight: a NerdyData search

For some time now, I have been maintaining WP-GeSHi-Highlight, a code syntax highlighting plugin for WordPress, based on the established PHP highlighting library GeSHi.

The plugin has a very basic feature set, and I have never really promoted it — still, I was wondering how popular it is. I mean, it just works, reliably, and therefore it is a quite appropriate choice for many types of WordPress sites.

I came across NerdyData, a search engine for website source code:

We index the HTML, Javascript, CSS, and plaintext of hundreds of millions of pages.

Honestly, their web interface makes a pretty crappy impression. Nevertheless, it found about 140 websites that include the term “wp-geshi-highlight” in their source: https://search.nerdydata.com/search/#!/searchTerm=wp-geshi-highlight/searchPage=1/sort=pop

This is a short list of some web pages using WP-GeSHi-Highlight, found via NerdyData:

Looking at these, I found it interesting that many sites make use of the simple customization options and deviate from the default styling of WP-GeSHi-Highlight code blocks.

A brief interjection on the topic of WotzApp

A new star in the sky of billion-dollar corporations. One should not praise WhatsApp so much for being simple to use and for working. Others can do that, too. Something else matters. You may remember: until recently it was very easy to send WhatsApp messages in other people's names. It has become harder, but it still works. What is actually certain about WhatsApp?

  • The user is the product.
  • Privacy has low priority.

The first point has been clear since 19 February 2014 at the latest. The second point: for example, messages sent over the local Wi-Fi can be read along with relatively simple means. There are surely many other smaller and larger privacy problems, but all of these things interest only a small fraction of the general public.

WhatsApp: no standards, no security. Just 90s-style chat for the phone.

In the beginning, the WhatsApp engineers worked quick & dirty and did not align the basics of their architecture with common standards. IT security and privacy without adhering to standards? That is ill-posed in the mathematical sense from the outset. We have been aware of the carelessness in WhatsApp's technical implementation for years. After all, it is clear to us what WhatsApp is at its core: a very simple, totally indifferent form of chat. Security and privacy do not matter at all. The following impressed me years ago already: you do not have to “log in” to WhatsApp; there is no (shared) secret.

Crypto crash course

You do not even need to look any further into WhatsApp to be highly skeptical about its privacy. People are writing to each other without exchanging a secret beforehand. Time for a very small, short mini excursion into crypto; maybe I will reach a narrow mass, that is, a small part of the broad masses. A secret is something that only YOU know. Your phone number is not a secret. Your IMEI is not a secret either. Without a secret you cannot

  • authenticate securely (identify yourself unambiguously),
  • encrypt data securely,
  • guarantee the integrity of transmitted data.

Conversely, for communication without a secret this means:

  • Anyone can (with more or less effort) send messages in your name.
  • Anyone on the communication path between you and the recipient (Wi-Fi, ISP, …) can read your messages.
  • Anyone on the communication path between you and the recipient (Wi-Fi, ISP, …) can modify your messages.
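
To make the role of a shared secret a bit more concrete, here is a minimal Python sketch using the standard library's hmac module. It is purely an illustration and says nothing about how any real messenger works; the secret, the messages, and the function names are made up for the example. It covers authentication and integrity only and does not encrypt anything.

```python
import hashlib
import hmac
import os


def sign(secret: bytes, message: bytes) -> bytes:
    """Return a tag that only someone holding `secret` can compute."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify(secret: bytes, message: bytes, tag: bytes) -> bool:
    """Compare tags in constant time; False means forged or modified."""
    return hmac.compare_digest(sign(secret, message), tag)


# Hypothetical secret that both parties exchanged beforehand.
shared_secret = os.urandom(32)
message = b"See you at eight"
tag = sign(shared_secret, message)

# The receiver knows the secret and accepts the untouched message.
accepted = verify(shared_secret, message, tag)

# Someone on the path who changes the text cannot fix up the tag.
tampered = verify(shared_secret, b"See you at nine", tag)

# Someone who only knows public data (a made-up phone number here)
# cannot produce a valid tag either.
forged = verify(shared_secret, message, sign(b"+49 170 0000000", message))

print(accepted, tampered, forged)
```

Swap the random secret for something public like a phone number and everybody can compute valid tags, which is exactly the situation described in the list above.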

What do people actually want? Security, not so much.

That is already enough for the crash course. But let's be honest: e-mail and SMS suffer from the same problems. And ICQ and Skype do not offer theoretically complete protection either, even though a secret is used there, namely the login credentials (that is the wrong kind of secret, but we will not go into that here). And De-Mail is worth getting upset about, because it promises security that does not exist.

Hardly anyone cares about any of this. And I believe this is the core, an important insight: people do not necessarily want real security. Most of the time they simply do not care whether “someone” can read along. Is the general public somewhere between naive and deluded here? Maybe, but it does not matter. Conceptually perfect IT security and everyday human communication do not quite fit together. We saw the same indifference with the NSA scandal. Where is the outcry? #aufschrei? #aufschrei3000? To some extent I do not care about the whole thing either; after all, I use ICQ, send e-mails and SMS, and recently became a WhatsApp user, too. And yet for each of these technologies I know exactly how it can be attacked. Have you really never captured your flatmates' messages with tcpdump on the router ;-)?

But please be clear about what is happening here.

What matters in my opinion: awareness of who can collect data from whom, and on what scale. And awareness that you may be selling yourself. Have a look at this official WhatsApp blog post from 2012:

Remember, when advertising is involved you the user are the product.

There they explain that the WhatsApp user is not the product, because they do not use advertising. Then they talk about their oh-so-great architecture and say that the simple form of communication is their product, not the user or their data:

That’s our product and that’s our passion. Your data isn’t even in the picture. We are simply not interested in any of it.

19 billion $ for what exactly? Oh, right, of course.

The statements above always sounded dirty. Lately, however, they appear in a particularly mangy light; let me put it simply:

  • 19 billion $ for an (IT) architecture that many could have built better? Nope.
  • 19 billion $ for a huge number of users? Yes.

So what is the product? The users, exactly, as always. Be aware of that.

What else I want to say: heml.is and BitTorrent Chat

If you really do need secure communication at some point, you have to know where to get it. The media have just settled on Threema. Good for the Swiss; the cash registers are surely ringing nicely. As far as I can see, it is cryptographically solid. They adhered to recognized standards. And they work with real secrets. The payload, i.e. the message contents, appears to be secure. But you have to know that the Threema people can of course also see and collect metadata, that is, who communicates with whom, when, how much, and so on (basically everything except what, and maybe why :-)). Besides, Threema is not free of charge, unlike some competitors. As a side note: you do not need to feel guilty when you install a free app. The Flappy Bird guy made $50,000 a day in advertising revenue share simply through being present in the respective app stores.

I would like to draw your attention to heml.is and BitTorrent Chat. https://heml.is/ has been under development by three Swedes for quite a while. Through their planning and information policy they make an extremely likeable and professional impression. They are close to release, and people are already tweeting at them that they should release right now, immediately, blah blah, but they react rather coolly:

A car without wheels may be 99% complete but is pretty useless, right?

So, https://heml.is/, keep that in mind. It makes a better impression than Threema. BitTorrent Chat is also very promising, in a different way. As always with anything “Torrent”, a decentralized approach is pursued here. A self-regulating P2P network. Only with such an approach can anonymity be achieved (similar to Tor); only this way can the efficient collection of metadata be prevented. BitTorrent Chat is not finished yet either, but is close to release.

Building numpy and scipy with Intel compilers and Intel MKL on a 64 bit machine

At work we make heavy use of the Python/numpy/scipy stack. We have built numpy and scipy against Intel’s MKL, with Intel’s C++ compiler icc and Intel’s Fortran compiler ifort. As far as I know, and also from my own experience, this combination of tools lets you get close-to-optimum performance for numerical simulations on classical hardware, especially for linear algebra calculations. If you get the build right and write proper code, i.e. use numpy’s data types and numpy’s/scipy’s functions the right way, then, generally speaking, no commercial or other open source software package is able to squeeze more performance from your hardware. I just updated the installation to the most recent Python 2 (2.7.6), numpy 1.8.0 and scipy 0.13.3. I built on a 64 bit machine with the Intel suite version 12.1.3; newer versions of the Intel suite have trouble resolving library dependencies (which is the topic of this blog post). The documentation on building numpy and scipy with the Intel suite is sparse, so in this article I describe the procedure I followed.

A few notes on performance and version decisions: compiling numpy and scipy against MKL provides a significant single-thread performance boost over classical builds. Furthermore, building against Intel’s OpenMP library can provide extremely efficient multicore performance through automatic threading of normal math operations. These performance boosts are significant. You want this boost, believe me, so it makes a lot of sense to build numpy/scipy with the Intel suite. However, it does not really make sense to build (C)Python with the Intel suite. It is not worth the effort — somewhere I have read that it might even be slower than an optimized GCC build. In any case, Python is not so much about math. It is about reliability and control. We should do math in numpy/scipy and use optimized builds for them — then there simply is no significance in optimizing the Python build itself. Regarding the Intel suite, I am pretty sure that newer versions do not provide large performance improvements compared to the 12.1.3 one. These reasons made me build things the following way:

  • CPython 2.7.6 in classical configure/make/make install fashion, with the system’s GCC.
  • numpy and scipy with Intel suite (MKL, icc, ifort) version 12.1.3.

Prerequisites

I assume that you have built Python as a non-privileged user and installed it to some location in your file system (btw: never just overwrite your system’s Python with a custom build!). Set up the environment for this build, so that

$ python

invokes it. You can validate this via $ python --version and $ command -v python. In the following steps, we will work with the same user used for building and installing Python: numpy and scipy installation files will go right into the directory tree of the custom Python build.
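
If you prefer to check from inside the interpreter, the following small sketch prints where the running Python lives. It only uses the standard sys module; the helper name describe_interpreter is my own invention for this example.

```python
import sys


def describe_interpreter():
    """Collect the facts that identify which Python build is running."""
    return {
        "executable": sys.executable,  # path of the running binary
        "prefix": sys.prefix,          # root of the installation tree
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


info = describe_interpreter()
for key in ("executable", "prefix", "version"):
    print("%-10s %s" % (key, info[key]))
```

The prefix should point into the directory tree of your custom build, not into /usr.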

I also assume that you have a working copy of MKL, icc, ifort, and that you have set it up like this:

$ source /path_to_intel_compilers/bin/compilervars.sh intel64

After invoking compilervars.sh, your PATH and LD_LIBRARY_PATH environment variables should contain the directories where all the Intel binaries reside. A simple test (which you also should perform):

$ icc --version
icc (ICC) 12.1.3 20120212
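
A second, optional check is to look at the environment variables themselves. The sketch below lists the PATH and LD_LIBRARY_PATH entries that live below the Intel installation root. The helper dirs_under is a hypothetical name, and INTEL_ROOT is just the placeholder path used in this article, so adapt it to your system.

```python
import os


def dirs_under(prefix, path_value):
    """Return the entries of a PATH-style string that live below `prefix`."""
    prefix = os.path.normpath(prefix)
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        entry = os.path.normpath(entry)
        if entry == prefix or entry.startswith(prefix + os.sep):
            hits.append(entry)
    return hits


# Placeholder from the text above; replace it with the real installation root.
INTEL_ROOT = "/path_to_intel_compilers"

for var in ("PATH", "LD_LIBRARY_PATH"):
    found = dirs_under(INTEL_ROOT, os.environ.get(var, ""))
    print("%s: %d entries below %s" % (var, len(found), INTEL_ROOT))
```

Zero entries for either variable suggests that compilervars.sh was not sourced in the current shell.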

Prepare, build, install, and validate numpy

In the numpy source directory create the file site.cfg with the following content:

[mkl]
library_dirs = /path_to_intel_compilers/mkl/lib/intel64/
include_dirs = /path_to_intel_compilers/mkl/include/
mkl_libs = mkl_rt
lapack_libs =

That is right, no more content is needed in that file; it can be that simple. This approach is also recommended in http://software.intel.com/en-us/articles/numpy-scipy-with-mkl.
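
Typos in site.cfg are easy to make and hard to spot in the build log. As a sanity check, the file can be parsed with Python's own configparser before starting the build. This is a sketch for Python 3 and is independent of how numpy reads the file; the function name read_mkl_section is hypothetical, and the content is the placeholder configuration from above.

```python
import configparser

SITE_CFG = """\
[mkl]
library_dirs = /path_to_intel_compilers/mkl/lib/intel64/
include_dirs = /path_to_intel_compilers/mkl/include/
mkl_libs = mkl_rt
lapack_libs =
"""


def read_mkl_section(text):
    """Parse site.cfg content and return the [mkl] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["mkl"])


mkl = read_mkl_section(SITE_CFG)
for key in sorted(mkl):
    print("%-13s %r" % (key, mkl[key]))
```

For a real file, read its content with open("site.cfg").read() and additionally check with os.path.isdir that both directories exist.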

What is left to be done is setting compiler flags. The build process uses compiler abstractions stored in two files: numpy/distutils/fcompiler/intel.py and numpy/distutils/intelccompiler.py. You might feel the need to treat these files with respect, because they appear to be so important. I am frightened of these files, because they are partly inconsistent in themselves and look quite bloated for the very simple purpose I expect them to serve. My point is: do not be afraid, and make some drastic reductions in order to be sure about what happens during the build. In numpy/distutils/fcompiler/intel.py you can safely edit the get_flags* methods of the IntelEM64TFCompiler class, so that they look like this:

class IntelEM64TFCompiler(IntelFCompiler):
    compiler_type = 'intelem'
    [...]
 
    def get_flags(self):
        return ['-O3 -g -xhost -openmp -fp-model strict -fPIC']
 
    def get_flags_opt(self):
        return []
 
    def get_flags_arch(self):
        return []

See, this compiler class describes itself as intelem, and that is the compiler type we are going to use in the build command line. The class name “IntelEM64TFCompiler” has no meaning at all. We are just editing one of the classes, making sure that it has the compiler flags we want, and eventually using this compiler abstraction during the build. In the code above, I have made sure that the compiler flags '-O3 -g -xhost -openmp -fp-model strict -fPIC' are used by making the method get_flags return them, and by making all other related methods return “nothing” (an empty list). Additionally (not shown above), I have also edited the possible_executables attribute of the class:

possible_executables = ['ifort']

just to make sure that ifort will be used. I have also removed all compiler classes not needed from that file, just to get a better overview.

With the steps above, the Fortran compiler has been set up. Next, edit numpy/distutils/intelccompiler.py to configure the C compiler. I have deleted tons of stuff from this file. What follows is all the content remaining (I think you can copy/paste it like this; it should be safe for most purposes):

from __future__ import division, absolute_import, print_function
 
from distutils.unixccompiler import UnixCCompiler
from numpy.distutils.exec_command import find_executable
 
class IntelEM64TCCompiler(UnixCCompiler):
    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
    """
    compiler_type = 'intelem'
    cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'
    #cc_args = "-fPIC"
    def __init__ (self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__ (self, verbose, dry_run, force)
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared')

Regarding the compiler flags I trusted the Intel people somewhat and took most of them from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl. I read the docs for all of them, and they seem to make sense. But you should think these through for your environment. It is good to know what they mean.
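
As a small aid for thinking the flags through, the sketch below splits the cc_exe string from above into individual flags and attaches a one-line note to each. The notes are my own short paraphrases, not quotes from the Intel documentation, so please verify them against the docs of your compiler version. The names FLAG_NOTES and split_flags are made up for this example.

```python
CC_EXE = "icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost"

# Paraphrased notes (an assumption of this sketch, check the compiler docs).
FLAG_NOTES = {
    "-O3": "aggressive optimization level",
    "-g": "emit debug symbols",
    "-fPIC": "position-independent code, needed for shared objects",
    "-fp-model strict": "strict floating-point semantics",
    "-fomit-frame-pointer": "free the frame-pointer register",
    "-openmp": "enable OpenMP threading",
    "-xhost": "target the instruction set of the build machine",
}


def split_flags(command):
    """Split a compiler command into (executable, flags).

    '-fp-model <value>' is kept together as a single flag.
    """
    tokens = command.split()
    exe, rest = tokens[0], tokens[1:]
    flags = []
    i = 0
    while i < len(rest):
        if rest[i] == "-fp-model" and i + 1 < len(rest):
            flags.append(rest[i] + " " + rest[i + 1])
            i += 2
        else:
            flags.append(rest[i])
            i += 1
    return exe, flags


exe, flags = split_flags(CC_EXE)
undocumented = [f for f in flags if f not in FLAG_NOTES]
for flag in flags:
    print("%-22s %s" % (flag, FLAG_NOTES.get(flag, "?")))
```

One thing to keep in mind with -xhost: the resulting binaries are tuned for the CPU of the build machine and may not run on older processors.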

Then build numpy:

python setup.py build --compiler=intelem | tee build.log

Copy the build to the Python tree (install):

python setup.py install | tee install.log

Validate that the numpy build works:

$ python
Python 2.7.6 (default, Feb 18 2014, 15:09:15) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.8.0'
>>> numpy.test()
Running unit tests for numpy
NumPy version 1.8.0
NumPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy
Python version 2.7.6 (default, Feb 18 2014, 15:09:15) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
[... snip ...]
----------------------------------------------------------------------
Ran 4969 tests in 63.975s
 
OK (KNOWNFAIL=5, SKIP=3)
<nose.result.TextTestResult run=4969 errors=0 failures=0>

That looks great. Proceed.
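
A passing test suite does not tell you whether MKL was actually picked up. numpy.show_config() prints the build configuration, and the following sketch (written for Python 3) captures that output and searches it for "mkl". The substring search is only a heuristic of mine, and the helper name numpy_build_info is hypothetical. In a Python 2.7 session you can simply call numpy.show_config() and read the output.

```python
import contextlib
import io


def numpy_build_info():
    """Return the text printed by numpy.show_config(), or None without numpy."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


info = numpy_build_info()
if info is None:
    print("numpy is not importable from this interpreter")
else:
    # Case-insensitive substring search; a heuristic, not a proof.
    print("MKL mentioned in build info:", "mkl" in info.lower())
```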

Build, install, and validate scipy

Now that numpy is installed(!), we can go ahead with building scipy. Extract the scipy source, enter the source directory and invoke

python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install | tee build_install.log

See how we enforce usage of intelem (which is just a label/name) for all components here? This is important. The scipy build process uses the same build settings as used by numpy, especially the distutils compiler abstraction stuff (which is why numpy needs to be installed first; this is a simple fact that the official docs do not explain well). I built and installed at the same time, on purpose. When doing build and install in separate steps, the install step involves some minor compilation tasks which are then performed using gfortran instead of ifort. At first I thought that this was not an issue, but when executing scipy.test() I soon got a segmentation fault, due to mixing of compilers. When using the command as above (basically taken from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl), the test suite runs through with a single error:

Python 2.7.6 (default, Feb 18 2014, 15:09:15) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.test()
Running unit tests for scipy
NumPy version 1.8.0
NumPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy
SciPy version 0.13.3
SciPy is installed in /projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy
Python version 2.7.6 (default, Feb 18 2014, 15:09:15) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy/lib/utils.py:134: DeprecationWarning: `scipy.lib.blas` is deprecated, use `scipy.linalg.blas` instead!
  warnings.warn(depdoc, DeprecationWarning)
/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/numpy/lib/utils.py:134: DeprecationWarning: `scipy.lib.lapack` is deprecated, use `scipy.linalg.lapack` instead!
  warnings.warn(depdoc, DeprecationWarning)
 
[ ... snip ...]
======================================================================
ERROR: test_fitpack.TestSplder.test_kink
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy/interpolate/tests/test_fitpack.py", line 329, in test_kink
    splder(spl2, 2)  # Should work
  File "/projects/bioinfp_apps/Python-2.7.6/lib/python2.7/site-packages/scipy/interpolate/fitpack.py", line 1186, in splder
    "and is not differentiable %d times") % n)
ValueError: The spline has internal repeated knots and is not differentiable 2 times
 
----------------------------------------------------------------------
Ran 8934 tests in 131.506s
 
FAILED (KNOWNFAIL=115, SKIP=220, errors=1)
<nose.result.TextTestResult run=8934 errors=1 failures=0>

The one failing test is a corner case known to fail in some scenarios (https://github.com/scipy/scipy/issues/2911). I might be able to narrow it down and help resolve that issue.
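
Since the combined scipy build command repeats the same two options for every sub-command, it is easy to forget one of them. The following sketch assembles the command as an argument list, so the compiler label is given in exactly one place. The function name scipy_build_command and its parameters are my own; the resulting command line is the one used earlier in this article.

```python
def scipy_build_command(label="intelem", python="python"):
    """Assemble the combined config/build/install command as an argument list."""
    compiler_opts = ["--compiler=" + label, "--fcompiler=" + label]
    command = [python, "setup.py"]
    for step in ("config", "build_clib", "build_ext"):
        command.append(step)
        command.extend(compiler_opts)
    command.append("install")
    return command


command = scipy_build_command()
print(" ".join(command))
```

From within the scipy source directory, the list can be handed to subprocess.check_call(command) if you want to drive the build from a script.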

Congratulations, done. Enjoy the performance.