DragonWorks

ye shall overcome...

23 MAY 2017

Chapter 8

Statistics

Mathematical statistics (also called statistical theory) is the branch of applied mathematics that uses probability theory and analysis to examine the theoretical basis of statistics.

Lying with statistics is easy; in fact, even mathematical amateurs can do it. Statistical prevarication is the art of straddling both sides of an issue or idea. A misuse of statistics occurs when a statistical argument asserts a falsehood.

While illiteracy is often discussed publicly, hardly anybody talks about innumeracy: a lack of knowledge and understanding of mathematical concepts and methods. The problem is that many people are functionally innumerate. They cannot deal with graphical displays of data or estimate risks (probability), and they lack the logic needed for statistical analysis. The main difference between being illiterate and being innumerate is that most innumerate people don't even realize they are affected. It's a problem even mathematicians are not completely free of.

Let us assume a statistic is true. It still represents only one part of the whole picture, leaving the rest to be examined elsewhere. One should focus on identifying the sources and limits of the statistic to eliminate prevarication (duplicity). However, the limits of a statistic are rarely mentioned when it is quoted.

If you take a room full of monkeys and give them computers, they will eventually write “Moby Dick.” Innumeracy (sounds plausible): statistical calculations show that the odds are too remote for this to happen in practice. Mathematicians indicate that the “odds” number, for the first line alone, is larger than the estimated number of atoms in our galaxy.

There are 101 keys on the modern keyboard plus 12 function keys. I will take the smaller figure of 101. The odds number is 101, the number of possible keys for each character, raised to the 80th power, the number of characters in a line of text: 101^80, or roughly 2.2 × 10^160 (a number with 161 digits). That is a large number, and now that we have rapid computer simulations we can see that the monkey assertion is, in any practical sense, not only false but ridiculous.
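The arithmetic can be checked directly. The sketch below assumes, as the text does, 101 equally likely keys and an 80-character line, so the number of distinct lines is 101 multiplied by itself 80 times. The typing rate of one billion lines per second is a hypothetical figure chosen only to give a sense of scale.

```python
import math

KEYS = 101         # keys on the keyboard, as assumed in the text
LINE_LENGTH = 80   # characters in one line of text

# Each of the 80 positions can hold any of the 101 keys,
# so the number of distinct lines is 101 ** 80.
possible_lines = KEYS ** LINE_LENGTH

# Size of that number: its decimal digit count and its base-10 exponent.
digits = len(str(possible_lines))
exponent = LINE_LENGTH * math.log10(KEYS)

# Hypothetical assumption: one billion random lines typed every second.
ATTEMPTS_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
years_to_try_all = possible_lines // (ATTEMPTS_PER_SECOND * SECONDS_PER_YEAR)

print(f"possible lines: about 10^{exponent:.1f} ({digits} digits)")
print(f"years to try every line: a {len(str(years_to_try_all))}-digit number")
```

Even at that invented rate, the number of years needed to try every possible line is itself a number with well over a hundred digits.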

Another part of statistics is the concept of correlation. Statistical analysis of data may reveal that two variables (that is, two properties of the population under consideration) tend to vary together, as if they are connected. For example, a study of annual income and age at death might find that poor people tend to have shorter lives than affluent people. The two variables are said to be correlated. However, one cannot immediately infer the existence of a causal relationship between the two. You see, correlation does not imply causation; that is, a correlation between A and B does not show that A causes B. Consider the following:

  1. B may actually be the cause of A.
  2. B may be the cause of A at the same time as A is the cause of B… this is a self-reinforcing system.
  3. Some unknown third factor may actually cause the relationship between A and B to exist.
  4. The relationship may be so complex that the only inference is that they occur at the same time.

In other words, no conclusion can be drawn about the existence or the direction of a cause-and-effect relationship from the mere fact that A is correlated with B. However, it is my experience that most individuals who use statistical inferences imply a direct causal relationship.

These examples help to show that things with statistically significant correlations are not necessarily causally related:

"Statistics show that of those who contract the habit of eating, very few survive." -- Wallace Irwin.

Statistics show that birthdays are good for you: the more you have, the longer you live.

Statistics also show that the cause of death is birth, as there is a one-to-one relationship. After all, if you aren’t born, you don’t die.

And a not-so-ridiculous example: ice cream consumption and murder rates are highly correlated. Now, does ice cream incite murder, or does murder increase the demand for ice cream? Neither: they are joint effects of a common cause, or lurking variable, namely hot weather. Another look at the sample shows that it failed to account for the time of year: both rates rise in the summertime.
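A lurking variable is easy to reproduce in a small simulation. The sketch below uses invented numbers, not real ice cream or crime data: a made-up daily temperature drives both quantities, and neither one influences the other. The coefficients, the noise levels, and the helper `pearson` are assumptions chosen purely for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented data: temperature is the common cause of both series.
temperature = [rng.uniform(0, 35) for _ in range(20000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
murders = [0.5 * t + rng.gauss(0, 5) for t in temperature]

# Across all days the two series move together.
overall = pearson(ice_cream, murders)

# Hold the common cause roughly fixed: keep only days between 20 and 22 degrees.
band = [(i, m) for t, i, m in zip(temperature, ice_cream, murders) if 20 <= t < 22]
within_band = pearson([i for i, _ in band], [m for _, m in band])

print(f"correlation over all days: {overall:.2f}")
print(f"correlation at nearly constant temperature: {within_band:.2f}")
```

Across all days the two series are clearly correlated, but once temperature is held nearly constant the correlation all but disappears, which is what one would expect when a third factor is doing the work.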

Statistical analysis is used in surveys. The answers to surveys can often be manipulated by the wording of the question. For example, consider these two questions:

  1. Do you support the attempt by the USA to bring freedom and democracy to other places in the world?
  2. Do you support the unprovoked military action of the USA?

These two questions would be answered differently, yet they are essentially the same question. And there is the famous lawyer's question, “Have you stopped beating your wife?”, to which there is no correct answer.

There are also biased samples, overgeneralization, misreported estimated errors, misunderstood estimated errors, data manipulation, and the discarding of unfavorable data, all of which cause polls or surveys to be erroneous and misleading.
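The effect of a biased sample can be illustrated with a toy poll. Everything in the sketch below is invented for the purpose: a population in which 30% are "urban" with 70% support for some proposal and the rest are "rural" with 40% support, and the helper `share_supporting`. It compares a random sample with one drawn only from the urban group.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented population of (is_urban, supports_proposal) pairs.
population = []
for _ in range(100_000):
    urban = rng.random() < 0.30
    supports = rng.random() < (0.70 if urban else 0.40)
    population.append((urban, supports))

def share_supporting(people):
    """Fraction of the given people who support the proposal."""
    return sum(supports for _, supports in people) / len(people)

true_share = share_supporting(population)

# A fair poll: 1000 people drawn at random from everyone.
random_sample = rng.sample(population, 1000)

# A biased poll: 1000 people drawn only from the urban group.
urban_only = [person for person in population if person[0]]
biased_sample = rng.sample(urban_only, 1000)

print(f"true support:        {true_share:.2f}")
print(f"random sample poll:  {share_supporting(random_sample):.2f}")
print(f"biased sample poll:  {share_supporting(biased_sample):.2f}")
```

The random sample lands close to the true figure, while the urban-only sample overstates support by a wide margin even though both polls asked the same number of people.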

My conclusion:
When facing a statistical inference, we must understand that the individual presenting it is trying to prove a point. If we are to understand the statistic at all, we must know which correlation the inference rests on and to which population it applies. We must know how that statistic was developed to draw any worthwhile conclusion.

We must learn about the mathematical models statistics uses and become familiar with its language. We also need some knowledge of probability. In short, we cannot be innumerate. We must then consider the whole, as in the ice cream example. The whole must be counted in its entirety to remove prevarications or double meanings.