Professor '''John Brignell''' held the Chair in Industrial Instrumentation at the University of Southampton (UK) from 1980 until the late 1990s. [http://www.ecs.soton.ac.uk/~jeb/cv.htm]
Brignell retired from his academic career in the late 1990s and now devotes his time to debunking what he regards as the false statistics common in much of today's media. He presents his views on his website ''Numberwatch'', launched in July 2000, which is "devoted to the monitoring of the misleading numbers that rain down on us via the media. Whether they are generated by Single Issue Fanatics (SIFs), politicians, bureaucrats, quasi-scientists (junk, pseudo- or just bad), such numbers swamp the media, generating unnecessary alarm and panic. They are seized upon by media, hungry for eye-catching stories. There is a growing band of people whose livelihoods depend on creating and maintaining panic." [http://www.numberwatch.co.uk/number%20watch.htm]
Brignell has expressed delight at the "encouragement and support I have received from some of the giants of the pro-science movement in the USA -- in no particular order [[Steve Milloy]], [[Alan Caruba|Alan Coruba]] [''sic''], [[James Randi]], [[Bob Caroll]], [[Michael Fumento]] and [[S. Fred Singer]]." [http://www.numberwatch.co.uk/term%20end.htm]
A number of popular, politically correct theories are based on unsound mathematical evidence. In choosing to debunk such theories, Brignell occasionally provokes the wrath of their self-interested supporters, which he seems to enjoy.
== Statistical Significance (P<0.05) ==
Brignell suggests that one common source of error in experiments is the use of undemanding significance thresholds in statistical testing, particularly P<0.05.
If one applies a statistical test at a significance level of 0.05, one is accepting a probability of 0.05 of a false positive; that is, a 1 in 20 chance that the result will appear significant when there is in fact no real effect. Note that these odds apply to every test carried out, not just the ones that return significant results. This has been called the 1 in 20 lottery.
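The effect is easy to demonstrate with a small simulation (an illustrative sketch, not taken from ''Numberwatch''; the sample sizes and number of trials are arbitrary): when two samples are repeatedly drawn from exactly the same population, a test at P<0.05 still declares a "significant" difference about once in every 20 attempts.
<pre>
# Illustrative sketch: both samples come from the SAME population, so every
# "significant" result is a false positive. Expect roughly 5% of them.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trials = 10_000
false_positives = 0
for _ in range(trials):
    a = rng.normal(0, 1, 30)
    b = rng.normal(0, 1, 30)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(false_positives / trials)   # close to 0.05: the 1 in 20 lottery
</pre>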
This can cause problems when combined with publication bias. Studies that produce significant results tend to be published; ones that don't tend not to be. However, the number of false positives depends on the total number of studies carried out rather than the number published: roughly 5 false positives are to be expected among every 100 tests of effects that do not actually exist, and that number does not shrink just because fewer studies are published. In simple terms, if 1 in 2 studies produce significant results, then 50% of them will be published, of which 10% will be bogus. If 1 in 5 studies produce significant results, then 20% will be published, of which 25% will be bogus. If 1 in 10 studies produce significant results, then 10% will be published, of which 50% will be bogus. How many studies produce significant results? What percentage of published studies are bogus? The numbers are simply not known.
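The arithmetic behind these figures can be set out explicitly. A minimal sketch, using the same invented proportions and assuming roughly 5 false positives per 100 studies, all of which reach publication:
<pre>
# Sketch of the publication-bias arithmetic: the ~5 false positives expected
# per 100 studies make up a larger share of the published record the fewer
# studies reach significance.
total_studies = 100
false_positives = 0.05 * total_studies            # about 5 bogus "significant" results

for significant_fraction in (1/2, 1/5, 1/10):     # 1 in 2, 1 in 5, 1 in 10 significant
    published = significant_fraction * total_studies
    print(f"{published:.0f} published, {false_positives / published:.0%} bogus")
# -> 50 published, 10% bogus; 20 published, 25% bogus; 10 published, 50% bogus
</pre>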
Another source of problems occurs when this level of significance is used in combination with categorisation. This occurs when the data in a study is broken up into a number of categories, and a statistical test is applied in each category. The effect is that of having multiple goes at the lottery: the more categories, the better the chance of producing a false positive. If the test is applied to 1 category, the odds of getting at least one bogus result are 1 in 20 (0.05). If there are 4 categories, the odds are nearly 1 in 5 (0.185). If there are 8 categories, the odds are about 1 in 3 (0.337). If there are 12 categories, the odds are about 9 in 20 (0.46). If there are 16 categories, the odds are about 5 in 9 (0.56). The odds continue to rise with the number of categories.
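These odds follow from the probability of at least one false positive among ''k'' independent tests, 1 - (1 - 0.05)^''k''. A minimal check of the figures quoted (assuming the tests are independent):
<pre>
# Probability of at least one false positive among k independent tests at P < 0.05.
for k in (1, 4, 8, 12, 16):
    print(k, round(1 - 0.95 ** k, 3))
# -> 0.05, 0.185, 0.337, 0.46, 0.56, matching the odds quoted above
</pre>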
Combining these sources of error compounds them. If 100 studies are conducted at a significance level of 0.05, each categorising its data into 10 categories, then about 40 of them (1 - 0.95^10 ≈ 0.40) can be expected to contain at least one bogus "significant" finding. If 50 of the studies are published, and publication bias favours the ones that found something, then as many as 40 of the published studies may be bogus and only 10 will have found any real significance.
If one hears that a study has been published that has "found a link between thyroid cancer and eating chocolate in women between the ages of 20 and 30 who eat more than three pieces of chocolate per day", how likely is it that the study is bogus? If one assumes that data was collected from a randomly selected sample of people, one could hypothesise that the data was categorised by sex (male, female), age (under 20, 20 to 30, 30 to 40, over 40), and number of pieces eaten (none, 1 to 3, more than 3), for a total of 2 × 4 × 3 = 24 categories. This would suggest that the basic chance of the study being bogus is about 70% (1 - 0.95^24 ≈ 0.71). Incorporating publication bias might raise it to over 90%. This of course assumes that the level of significance used is 0.05, a safe assumption in most such studies.
Brignell suggests that the use of a significance level of 0.05 is inherently unsound: "It is difficult to generalise, but on the whole P<0.01 would normally be considered significant and P<0.001 highly significant." [http://www.numberwatch.co.uk/significance.htm]
Brignell suggests: "Many leading scientists and mathematicians today believe that the emphasis on significance testing is grossly overdone. P<0.05 had become an end in itself and the determinant of a successful outcome to an experiment, much to the detriment of the fundamental objective of science, which is to understand."
== Fitting Linear Trends ==
Brignell states "One of the major problems in using a finite sequence of data to represent a source that is effectively infinitely long is that the process of chopping off the ends is a distortion. [...] The other is the fact that, even when there is no linear trend in the original process, there is always one in the finite block of data taken to represent it."
Picture the sine wave y=sin(x). It is a continuous curve centred on the x-axis; the x-values are in radians, and the y-values range between -1 and 1. Let's take some samples from the graph.
*Sample 1: The points are (0.44,0.42), (0.52,0.50), (0.61,0.57), (0.70,0.64) and (0.79,0.71), and the fitted curve is y = 0.82 x + 0.07
*Sample 2: The points are (0.79,0.71), (0.87,0.77), (0.96,0.82), (1.05,0.87) and (1.13,0.91), and the fitted curve is y = 0.57 x + 0.26
*Sample 3: The points are (1.13,0.91), (1.22,0.94), (1.31,0.97), (1.40,0.98) and (1.48,1.00), and the fitted curve is y = 0.26 x + 0.62
*Sample 4: The points are (1.48,1.00), (1.57,1.00), (1.66,1.00), (1.75,0.98) and (1.83,0.97), and the fitted curve is y = -0.09 x + 1.13
*Sample 5: The points are (1.83,0.97), (1.92,0.94), (2.01,0.91), (2.09,0.87) and (2.18,0.82), and the fitted curve is y = -0.42 x + 1.74
So, depending on which set of points is selected, the associated linear trend is either an 82% rise, a 57% rise, a 26% rise, a 9% fall, or a 42% fall!
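These fitted lines can be approximately reproduced with an ordinary least-squares fit. A minimal sketch (the sample spacing of roughly 0.087 radians is inferred from the points listed above):
<pre>
# Fit a straight line to five consecutive samples of y = sin(x), starting the
# window at different points, and watch the "trend" change with the window.
import numpy as np

for start in (0.44, 0.79, 1.13, 1.48, 1.83):
    x = start + 0.0873 * np.arange(5)          # five samples about 0.087 rad apart
    y = np.sin(x)
    slope, intercept = np.polyfit(x, y, 1)     # least-squares line of best fit
    print(f"start {start:.2f}: y = {slope:+.2f} x {intercept:+.2f}")
# slopes swing from roughly +0.8 down to roughly -0.4, in line with the fits above
</pre>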
A sine wave is a continuous, cycling curve; it has no linear trend. If one were to fit a line of best fit to the whole wave, the best one could do would be the x-axis, y=0. Yet none of these samples has given that line. It should be obvious that, depending on which data points are used, it is possible to get just about any line of best fit that one could desire. The linear trend found is there, not because of any underlying cause, but because it is generated by the act of selecting the data set.
As an alternative example, consider a set of points generated at random. There can obviously be no linear trend to such data. And yet, a line of best fit can be applied to any subset of these points, and a linear trend deduced from it. Again, the linear trend found is there, not because of any underlying cause, but because it is generated by the act of selecting the subset.
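The same effect can be demonstrated by fitting lines to pure noise. A brief sketch (illustrative, not from ''Numberwatch''; the run length and number of repetitions are arbitrary):
<pre>
# Fit straight lines to short runs of trendless random noise: the underlying
# process has no trend, yet the fitted slope is essentially never zero.
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(10)
slopes = np.array([np.polyfit(x, rng.normal(0, 1, 10), 1)[0] for _ in range(1000)])

print(slopes.mean())                    # near zero on average...
print((np.abs(slopes) > 0.1).mean())    # ...but individual fits often show a sizeable "trend"
</pre>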
These examples illustrate the "greatest hazard of trend estimation. The trend is a property of the data points we have and not of the original process from which they came. As we can never have an infinite number of readings there is always an error introduced by using a restricted number of data points to represent a process in the real world. Naturally the error decreases rapidly as we increase the number of data points, but it is always there and, in fact, the calculated trend is never zero".
It is not enough to fit a line to a set of points and declare a linear trend. One must understand the underlying data in order to know whether there actually is a real trend. It may simply be that the trend one sees is an artifact of the selection of data points, and doesn't actually exist in the underlying process.
Brignell's comments can be found [http://www.numberwatch.co.uk/Trends.htm here].
== The End Effect ==
Brignell states "A major problem is the end effect, which relates to the huge changes in apparent slope that can be wrought just by the choice of where to start the data selection." "The reason for this is that in the calculation of the slope, the contribution of each data point is weighted according to its distance from the centre", so variation in the end points has much more effect on the final result than variation in the central points.
If we start with the nine points (1,1), (2,2), (3,3), ..., (9,9), the slope of the fitted line is 1. (Obviously.) If we replace the points (3,3) through (7,7) with (3,8), (4,8), (5,8), (6,8) and (7,8), the slope of the fitted line is 0.83: the change has flattened the line somewhat. If we instead replace only the end point (1,1) with (1,4), the slope of the fitted line is 0.80. Changing just one end point has a greater effect on the slope than changing five central points.
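A minimal sketch reproducing these three slopes with a least-squares fit:
<pre>
# End-effect example: changing one end point moves the fitted slope more
# than changing five central points.
import numpy as np

x = np.arange(1, 10)
cases = {
    "original":                   np.arange(1, 10),                        # (1,1) ... (9,9)
    "five centre points changed": np.array([1, 2, 8, 8, 8, 8, 8, 8, 9]),   # (3,3)-(7,7) -> y = 8
    "one end point changed":      np.array([4, 2, 3, 4, 5, 6, 7, 8, 9]),   # (1,1) -> (1,4)
}
for label, y in cases.items():
    print(f"{label}: slope = {np.polyfit(x, y, 1)[0]:.2f}")
# -> 1.00, 0.83, 0.80
</pre>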
Brignell offers the following as an example of the end point effect: [http://www.numberwatch.co.uk/2003%20May.htm#wheeze]
The graph contains 15 data points displaying a strong linear trend, yet removing 2 points from each end removes the trend, so the trend is really supported by only those 4 data points.
Brignell's comments can be found [http://www.numberwatch.co.uk/Trends.htm here].
== Trojan Numbers ==
The term Trojan number was coined by Brignell to describe a number used by authors to "get their articles or propaganda into the media." "The allusion is, of course, to the mythical stratagem whereby the Greeks infiltrated the city of Troy inside a giant wooden horse." The number looks impressive, but on further examination isn't.
Brignell states "The major form of Trojan Number is the size of study. Early on in the piece it will be mentioned that (to invent some arbitrary numbers) there were 60,000 people in the study. The number experiencing the condition in question, say toe-nail cancer, is, however, much smaller, perhaps 60. Of these the number indulging in the putative cause, say passive drinking, is even smaller say 20. There is a number expected (as a proportion of the 60) at random from knowledge of the statistics for the general population, say, 14. Thus the number that really matters, the excess number of cases, is half a dozen. It is surprising how often an apparently huge study whittles down to an excess that you can count on your fingers. If the number 6 had been mentioned at the outset, the claim would have been laughed out of court, so it is never mentioned, though you can often have a pretty good stab at deducing it. In the statistics of rare events an excess of 6 on an expectation of 14 would be unsurprising. The rest of the 60,000 are mere bystanders." In fact, finding an extra 6 would not be significant.
Trojan numbers can be repeatedly presented in different ways. For example, "3.21% of passive drinkers are depressed" or "72.45% of women under 35 are unaware that passive drinking causes toe-nail cancer". In each case, after the headline and a couple of sentences, the body of the article is a repeat of the material already presented. The repeated presentation of the same material helps to lodge it into the public consciousness, as well as to raise the profile of the academic doing the research.
Brignell notes "One of the most effective forms of Trojan Number is the Virtual Body Count. Sub-editors cannot resist a headline Thousands to die of X." The body count is of course obtained by dividing the country's population by the size of the study and multiplying the result by the excess number of cases (the 6 of the example above).
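With the invented numbers from the study example above (60,000 people studied, an excess of 6 cases) and a national population of, say, 60 million, the arithmetic is (60,000,000 / 60,000) × 6 = 6,000 projected victims: a headline-ready virtual body count conjured from half a dozen cases.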
Brignell's comments on Trojan numbers can be found [http://www.numberwatch.co.uk/trojan_number.htm here].
== DDT ==