The Personal Website of Mark W. Dawson
Containing His Articles, Observations, Thoughts, Meanderings,
and some would say Wisdom (and some would say not).
Oh What a Tangled Web We Weave
When first we practise to deceive!
- Sir Walter Scott
Knowing what is important, what is unimportant, and what is misleading when reviewing studies or statistics is crucial to discovering the truth. To repeat my comments on “Studies Show” and “Statistics Show” in my observation on “Phrases”:
Studies can show anything. For every study that shows something, there is another study that shows the opposite. This is because every study has an inherent bias of the person or persons conducting the study, or of the person or organization that commissioned the study. A very good person conducting a study recognizes their biases and compensates for them, to ensure that the study is as accurate as possible. Having been the recipient of many studies (and the author of a few), I can attest to this fact. Therefore, you should be very wary when a person says "studies show". You should always look into a study to determine who the authors are and who commissioned the study, and to examine the study for any inherent biases.
Everything that I said about "studies show" also applies to "statistics show". However, "statistics show" requires more elaboration, as it deals with the rigorous mathematical science of statistics. Statistics is a science that requires very rigorous education and experience to get right. The methodology of gathering data, processing the data, and analyzing the data is very intricate. Interpreting the results accurately requires that you understand this methodology and how it was applied to the statistics being interpreted. If you are not familiar with the science of statistics, and you did not carefully examine the statistics and how they were developed, you can often be led astray. Also, many statistics are published with a policy goal in mind, and therefore should be suspect. As a famous wag once said, "Figures can lie, and liars can figure". So be careful when someone presents you with statistics. Be wary of both the statistics and the statistician.
Some additional comments on this subject are as follows:
Studies and statistics often claim to be scientific and rigorous. However, most of them are not as scientific or as rigorous as we may believe. Most studies are based on statistics, and most statistics become studies. But most studies based on statistics have issues with correlation, sampling, and confidence level, not to mention risk factors and probabilities, along with a host of other issues. The best book I have read that explains these issues is “Studies Show: A Popular Guide to Understanding Scientific Studies” by John H. Fennick.
One of the major issues with statistics that you should be aware of, especially when a politician starts quoting statistics to support their policy position, is the problem of Correlation vs. Causality. Correlation is when two or more statistics seem to move in sync when they are compared, especially when they are graphed. Causality occurs when two or more statistics are related in such a way that a change in one of them produces a change in the other(s). But as statisticians are trained, “Correlation does not imply Causation”, as exemplified in the following statistical graph:
Both statistics correlate, but neither causes the other. These examples are extreme, but most Correlation vs. Causality issues are much subtler. Whenever you are presented with a statistic you should carefully consider the Correlation vs. Causality issue that it may contain.
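One common way that two unrelated statistics end up correlating is that both simply rise over time. The sketch below illustrates this with two invented series (the numbers are hypothetical, not real data). It measures the correlation of the raw yearly figures and then, as one possible sanity check, the correlation of their year-to-year changes. This check is only an illustration: a weak correlation of changes does not prove the absence of causation, any more than a strong correlation of levels proves its presence.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def changes(series):
    """Year-to-year changes: each value minus the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical yearly figures (invented for illustration only).
# Both series trend upward; neither has any influence on the other.
series_a = [10, 11, 14, 15, 18, 19, 22, 23, 26]
series_b = [200, 202, 204, 210, 216, 218, 220, 226, 232]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of the raw figures:      {r_levels:.2f}")
print(f"correlation of year-to-year changes: {r_changes:.2f}")
```

With these made-up figures the raw series correlate very strongly, while their year-to-year changes show essentially no relationship, which suggests that the shared upward trend, rather than any link between the two, is doing the work.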
For more on this subject refer to the Economics section of my “Further Readings”.
Benjamin Disraeli is famously credited with saying, "There are three kinds of lies: lies, damned lies, and statistics."
However, there are actually four kinds of lies: mistakes, lies, damned lies, and statistics.
Mistakes are when you have said something that you believed to be true, but later discover it was untrue. After the discovery of your mistake, you have a moral responsibility to correct the record with those whom you misinformed.
Lies make the world go around. They are told to protect the feelings of others or to prevent embarrassment to ourselves. They should only be told if no harm comes from them. Otherwise, they will become Damned Lies.
Damned Lies are told to gain an advantage for ourselves or to demonize, denigrate, or disparage an opponent. They are despicable, and when they are discovered the teller should be roundly condemned.
It has been said that "Figures Can Lie, and Liars Can Figure." Statistics are often utilized to justify a belief, but statistics only provide a guide to what has happened and what may happen. They are open to interpretation, and this interpretation must be done with care, responsibility, and honesty (and often by individuals highly trained in statistics). Probabilities are less open to interpretation, but they are usually wrong when the data being utilized is incorrect or the algorithms being utilized are incorrect (the Boolean logic or arithmetic operations are wrong). Therefore, one must always be careful and skeptical when utilizing statistics and probability to discuss an issue.
Finally, the other important issue with studies and statistics is knowing what you know, knowing what you don't know, and allowing for what you don't know that you don't know. A good analyst or statistician always points out what they are certain of, what they are uncertain of, and that they may be unaware of facts or circumstances that could potentially skew their results. A bad analyst or statistician will often obscure these factors in order to achieve the desired results.
A good, highly readable, understandable, and entertaining book about statistics is “Naked Statistics: Stripping the Dread From The Data” by Charles Wheelan. For those of my readers who are interested in obtaining more information on statistics, with minimal mathematics, I would recommend the book “The Art of Statistics: How to Learn from Data” by David Spiegelhalter.
The following observations are a few examples that demonstrate how you must be careful when examining statistics.
An example of using figures and studies inappropriately is the pair of opposing posters that were circulated on the Internet a few years ago:
All of the statistics on both of these posters are true. But there are unreported statistics as well: one side gives only the favorable statistics, while the other side gives only the unfavorable statistics. There is also the cherry-picking of statistics, and of when to start and stop collecting statistics (in the above example the starting points were in the depths of an economic recession, and the endpoints were during an economic recovery). It is easy to believe the statistics that support your worldview, and easy to discount the statistics that contravene your worldview. But for an accurate worldview, you must consider all of the statistics (both pro and con) before forming or changing your worldview. A more complete picture of the facts on the above posters is as follows:
Even the above chart does not reveal the whole truth. It shows different blocks of statistics but doesn’t show the interrelationships between the blocks, nor how changes within one block would affect the other blocks. You must always consider these interrelationships and effects to help you determine the true state of affairs.
Of course, you also need to statistically analyze these numbers as to their cause and effect, contributing factors, and government policy & regulation impacts. Even then you cannot get a complete or accurate representation of the truth, as there are too many constants and variables, interactions between the constants and variables, and perhaps insufficient time for the actual results to be measured. Again, one must always be careful and skeptical when utilizing statistics to discuss an issue.
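The effect of choosing when to start and stop collecting statistics can be shown with simple arithmetic. The sketch below uses an invented economic index (the years and values are hypothetical, chosen only to mimic a peak, a recession, and a recovery) and computes the percentage change to the same endpoint from two different starting points.

```python
# Hypothetical economic index by year (invented numbers, not real data):
# a peak in 2007, a recession trough in 2009, then a recovery.
index_by_year = {
    2006: 100, 2007: 102, 2008: 95, 2009: 85,
    2010: 88, 2011: 92, 2012: 97, 2013: 101,
}

def percent_change(series, start_year, end_year):
    """Percentage change in the series between two chosen years."""
    start, end = series[start_year], series[end_year]
    return (end - start) / start * 100

# Same endpoint, two different (and equally "true") starting points.
from_trough = percent_change(index_by_year, 2009, 2013)
from_peak = percent_change(index_by_year, 2007, 2013)

print(f"Measured from the recession trough: {from_trough:+.1f}%")
print(f"Measured from the pre-recession peak: {from_peak:+.1f}%")
```

Both figures are computed correctly from the same data, yet one suggests strong growth and the other a slight decline. Whenever a percentage change is quoted, it is worth asking why that particular starting point was chosen.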
Another example of utilizing statistics improperly is confusing Calendar Time with Labor Effort. When I was awaiting my security clearances at GE Aerospace, it took almost seven months for me to receive them. This was not because it took seven months to investigate my background, but because it took six months for the processing of my application to start. Once processing started, it only took about a week of work to complete. Therefore, the labor effort was one week, but the calendar time was almost seven months. The calendar time was only an indication of the administrative backlog. If someone had only mentioned the calendar time, you would have thought that this was an excessive amount of time to do the work. The labor effort was the important statistic for determining the work required to process an application.
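The distinction can be made concrete with a small calculation. In the sketch below the dates are hypothetical stand-ins (the year and the exact days are assumptions, chosen only to give roughly a six-month wait followed by one week of work); the point is that the same three dates yield two very different statistics.

```python
from datetime import date

# Hypothetical dates, assumed purely for illustration.
submitted = date(2000, 1, 3)      # application handed in
work_started = date(2000, 7, 3)   # processing actually begins
work_finished = date(2000, 7, 10) # clearance granted

calendar_days = (work_finished - submitted).days   # what the applicant experiences
labor_days = (work_finished - work_started).days   # what the work actually required
backlog_days = (work_started - submitted).days     # time spent waiting in the queue

print(f"Calendar time: {calendar_days} days")
print(f"Labor effort:  {labor_days} days")
print(f"Backlog:       {backlog_days} days")
print(f"Share of calendar time spent on actual work: {labor_days / calendar_days:.0%}")
```

Quoting only the calendar time would describe the backlog, not the work. Which number matters depends on the question being asked: how long an applicant must wait, or how much effort an application takes.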
Another example is using only one statistic when two or more statistics are required to gain a fuller understanding of the actual situation. This is best shown by employment within the United States. To better understand employment in the United States you need to know both the people who are without a job and looking for a job (the Unemployment Rate), and the people who are without a job and have given up looking for a job (who are counted only in broader measures such as the Underemployment Rate). Many people go back and forth between these categories, and you need to be aware of both rates to truly determine the state of employment within the United States. Very often a change in one rate influences a change in the other rate. The following charts exemplify this:
As can be seen in the above chart, the percentage of people who are not working in the United States can vary between 10 and 20 percent. And it is important that we know the total number of people without work so that we can provide them with the proper social services (mostly food and housing) while they have no work. Combining these and other factors you get the Civilian Labor Force Participation Rate. This rate shows what percentage of the working-age U.S. population is either employed or actively looking for work, as follows:
As can be seen from this chart you can obtain a better perspective of employment within the United States. This chart, of course, can be analyzed with additional statistics, and be broken down into finer detail for a more comprehensive analysis. You must be careful in obtaining the proper statistics to determine your objective. This becomes more apparent in the next observation.
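How these employment measures relate to one another can be sketched with a little arithmetic. The head counts below are invented, and the formulas are deliberately simplified versions of the standard definitions (the official measures have more detailed rules about who is counted), so treat this as an illustration rather than an official calculation.

```python
def rate(part, whole):
    """Express part as a percentage of whole."""
    return part / whole * 100

# Hypothetical head counts, in thousands (invented for illustration).
working_age_population = 1000
employed = 600
unemployed_looking = 40   # without a job and actively looking
discouraged = 20          # without a job and no longer looking

# Simplified assumption: the labor force is everyone employed or actively looking.
labor_force = employed + unemployed_looking

unemployment_rate = rate(unemployed_looking, labor_force)
# A broader rate that also counts those who have given up looking.
broader_rate = rate(unemployed_looking + discouraged, labor_force + discouraged)
participation_rate = rate(labor_force, working_age_population)
employment_to_population = rate(employed, working_age_population)

print(f"Unemployment rate:                   {unemployment_rate:.2f}%")
print(f"Broader rate (adds discouraged):     {broader_rate:.2f}%")
print(f"Labor force participation rate:      {participation_rate:.1f}%")
print(f"Employed share of working-age pop.:  {employment_to_population:.1f}%")
```

Notice that if some of the people actively looking for work simply gave up, the headline unemployment rate would fall even though nobody found a job. That is why a single rate, quoted on its own, can give a misleading picture.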
Another example of the misuse of statistics is documented in the “AA Efficacy Rates” section of my “Addiction” observation. Reading this carefully will demonstrate how difficult it is to gather, process, and analyze data properly. This is also a good example of both “Studies Show” and “Statistics Show”, and of how it is possible to reach an incorrect conclusion without a broader and fuller analysis of studies or statistics.
Another example is a recent University of Iowa study of its students to determine what men and women want in a life partner. The researchers studied their students and gathered statistics as to what the students wanted or didn’t want in a partner. I have no doubt as to the veracity of this study, but I have serious doubts about the appropriateness of its conclusions. First the results of their study:
The first problem is that they polled their student body, whose members have spent most of their lives in an academic environment (K-12 & College). This study sample makes no allowance for non-academic experience in a normal social or work environment, which we all know has a significant impact on our attitudes and values. Or, as Mark Twain is reputed to have said:
“When I was sixteen I thought that my father was the dumbest most ignorant man in the world. And when I turned twenty I was amazed how much he had learned in four short years.”
Therefore, this study was only appropriate for college students and has little bearing on what an adult may think after a few years in a normal environment.
The other problem with this study is that modern medical science knows that a human brain does not fully mature until sometime between the ages of twenty-two and twenty-four. And the last part of the brain that develops is the center that makes judgments based on the possible future consequences of your actions. That is why most young people behave in a wild and crazy manner: their brains have not developed sufficiently to control their actions. Therefore, this study is utilizing immature brains as its sample group. Who knows what judgments may change after the brain is fully mature?
There is also the question of self-serving and biased answers. Did these students answer the questions the way they really think or the way they believed they should answer? Take the last category, “Unimportant characteristics”, as an example. The Chastity answer raises the questions “Are the men answering this way to convince the women that sexual promiscuity is acceptable?” and “Are the women answering this way to justify their own promiscuity?”. It would have been a much more meaningful statistic if it were broken down by students who were virgins, students with 1 to 3 sexual partners, students with 4 to 9 sexual partners, and students with ten or more sexual partners. Another consideration is whether it would make a difference if one of the partners were much more promiscuous than the other. A breakdown by the sexuality of the students (heterosexual, homosexual male, homosexual female, bisexual, etc.) is also necessary. The answer should also be broken down by how students identify themselves by ideology (conservative, moderate, liberal, leftist), as well as by religiosity (strongly religious, mildly religious, no religiosity, or atheistic). You could then judge the weight of this answer based on these backgrounds.
The “Similar political background” answer also needs to be broken down by how students identify themselves by ideology (conservative, moderate, liberal, leftist), as well as by religiosity (strongly religious, mildly religious, no religiosity, or atheistic), and by other possible categories.
The other categories also raise the same types of questions as to self-serving and biased answers. Not having read the study (I couldn’t find the actual study, but I found the graphic being touted by special interest groups), I do not know if any of these items were broken down, and in which ways they were broken down (this is why a synopsis or simple graphic of a statistic is not a good basis for making a judgment).
To be truly useful this study also needs follow-ups with the original students five, ten, fifteen, and twenty years after the original study was performed. You could then study whether they were married and/or divorced, whether they had children, their socio-economic status, and other factors that may have changed their opinions. Such a study and follow-up would then be very useful to determine what men and women really want.
Given the above, it can be seen that this study is only useful for people in a constricted environment (academics) with immature brains. It probably has little bearing on a mature person with life experiences and a fully developed brain. As such it should only be utilized for analysis within its constricted study sample.
Studies and statistics are often used and abused to justify a political or social point of view. They are, however, often used and abused in all arenas. Therefore, you should be wary of all studies and statistics until you can review them to ascertain their veracity. To do this you will need more knowledge than this article can provide. Statistics and studies can be a dense and dry subject, but there are two books I would highly recommend, as they are very readable and enjoyable. These books are “Naked Statistics: Stripping The Dread From The Data” by Charles Wheelan and “Studies Show: A Popular Guide To Understanding Scientific Studies” by John H. Fennick. If you are interested in knowing more about statistics and studies, then these books are for you.