Apocalypse soon: Unwarranted skepticism and the growth fetish

The dire predictions of the Club of Rome’s 1972 report on The Limits to Growth have supposedly been refuted by subsequent studies, but the refutations have serious shortcomings. This article is based on a presentation to the NZ Skeptics 2009 conference in Wellington, 26 September.

We belong to a species that dominates the planet. After millennia of steady growth which have altered regional environments and killed off many species, the human population has exploded during one lifetime. Whereas it took millennia to reach the first billion, the human population tripled in 140 years to three billion by 1960, and is currently trebling again in just 80 years, to nine billion in 2040. We have become a plague.

Many scientists, including myself, have been concerned by this picture. There is considerable evidence describing an overpopulated world, threatened by food and water shortages, dwindling oil supplies, and huge changes due to global warming. Consider the message in the figure on the right, which adds more recent data to the Limits to Growth forecasts of Meadows et al (1972) for the Club of Rome’s Project on the Predicament of Mankind. World population may hit a peak around 2040-2050 and then rapidly decline. My own research, including work with a number of international forecasting projects, suggests that the peak will happen sooner, around 2030.

This model was based on a considerable body of research and is supported by many other more detailed studies. Here we have a picture of a world in which population may plummet following an overshoot-and-decline pattern when limits are passed.
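To make the overshoot-and-decline shape concrete, here is a deliberately crude sketch in Python. It is emphatically not the World3 model used by Meadows et al, just a two-variable caricature (population chasing a carrying capacity set by a dwindling, non-renewable resource) with invented parameters; its only purpose is to show how growth against a finite base can rise, peak and then fall away, which is the qualitative pattern of the standard run.

```python
# A toy overshoot-and-decline sketch. This is NOT the World3 model of
# Meadows et al; it is a minimal two-variable caricature in which population
# grows towards a carrying capacity set by a finite, non-renewable resource.
# All parameter values are invented for illustration.

def simulate(steps=300, pop=1.0, resource=1000.0,
             growth_rate=0.05, per_capita_use=0.1):
    history = []
    for t in range(steps):
        capacity = 0.1 * resource                 # carrying capacity tracks what is left
        ratio = pop / max(capacity, 1e-9)
        pop = max(pop * (1 + growth_rate * (1 - ratio)), 0.0)
        resource = max(resource - per_capita_use * pop, 0.0)
        history.append((t, pop, resource))
    return history

if __name__ == "__main__":
    for t, pop, resource in simulate()[::25]:
        print(f"step {t:3d}  population {pop:7.1f}  resource {resource:7.1f}")
```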

I looked at this some 35 years ago with the eyes of an applied mathematician. I had seen that a model can capture the essence of a situation and provide realistic guidance, just so long as the model is based on the key aspects of a greater complexity. The thought of possible global collapse within one lifetime impressed me and I set off on a new career. I have found that the picture based on physical science can readily be fleshed out by reference to past historical events. It is easy to foresee the repetition of population collapse, social breakdown and war.

So here am I proclaiming apocalypse in just 20 years. What do you make of it? Either I am mistaken or society is just a little bit crazy. J B Priestley made this point in relation to William Blake:

And no doubt those who believe that the society we have created during the last hundred and fifty years is essentially sound and healthy will continue to believe, if they ever think about him, that Blake was insane. But there is more profit for mind and soul in believing our society to be increasingly insane, and Blake (as the few who knew him well always declared) to be sound and healthy.

I introduce this point as I have been treated as a pariah for taking up an extremely important scientific endeavour. Should you be sceptical of those like me who talk of impending catastrophe? Certainly, but consider the alternative, which is to put your faith in those who have dismissed the reality of a finite Earth. The Limits to Growth was the subject of widespread denunciation by the supporters of status quo economic growth. Let’s look at the validity of some of the criticisms; we at the DSIR considered many and found some bizarre arguments.

One key critique was a 1977 Report to the United Nations, The Future of the World Economy, by a team headed by Nobel Prize-winning economist Wassily Leontief. The Dominion reported that:

Among the most significant aspects of the study are its rejection of predictions by the Club of Rome that the world will run out of resources and choke on its pollution if it continues to expand its economy.

The summary of the report emphasised this theme:

No insurmountable physical barriers exist within the twentieth century to the accelerated development of the developing regions.

Read that carefully. It says “within the twentieth century”. The Limits to Growth’s authors made a forecast of a possible calamitous population collapse around 2050 – not within the twentieth century. By stopping their model 50 years earlier, in 2000, the UN team made quite sure that they avoided any possibility of such an event. In fact, as far as they went, their forecasts are very similar to those of The Limits to Growth.

Such sleight of hand is not uncommon. In 1978 I worked for six months with the OECD Interfutures project. While I was able to study an extensive collection of input information, I had no real part in the analysis, which was dominated by a small core group. The 1979 report includes a claim that would be satisfactory to the clients, the wealthy nations of the world:

Economic growth may continue during the next half-century in all the countries of the world without encountering insurmountable long-term physical limits at the global level.

There are two reasons why this statement is misleading. Firstly, all their many computer model calculations stopped in 2000 and did not reach out that far, so this is not in any way based on the work of so many of us in this project. Secondly, they look ahead for just 50 years, thus stopping short of 2050, the forecast time for crisis. It is always easy to dodge a crisis by stopping short of the due date, like the fellow falling off a building who felt that all was well as he sailed down, before he reached the pavement. They knew what The Limits to Growth forecast; they knew what they were doing.

These are examples of the way in which organisations employ expertise to generate desired results and make unjustified claims. Many readers will be sceptical of the warnings of approaching limits. Such scepticism may be better applied to many of the arguments for continuation of growth; here is a New Zealand example.

In 1990 the Planning Council published a report, The fully employed high income society (Rose 1990), which received nationwide publicity for its suggestion that sustainable full employment with full incomes was possible by 1995, given high rates of productivity increase but otherwise a continuation of current policies.

When I read the document carefully I found some very questionable points:

(a) Estimates of employment requirements commenced in 1988 and ignored the significant loss of jobs between that date and 1990.

(b) The modelling of productivity increases built on earlier modelling which has proved unrealistic and overly optimistic, and assumed a further doubling of productivity.

(c) The model run commenced in 1984 with these increases in productivity in order to generate an optimistic result in 1995, thus ignoring the negative experiences of 1984-90.

(d) The model was instructed to produce full employment by 1995 – this was not a consequence of the modelling based on policy changes as represented by input parameters.

(e) Full employment was completely generated by additional capital investment.

The model failed the most basic scientific test of forecasting even before it was published. In the four years from 1986, the date one model run commenced, to 1990, the date of the report, the model had suggested an increase in employment of 38,000, whereas the actual experience was of a fall of 90,000. Nor was that followed by a fully employed society – indeed unemployment was 11 percent in 1991.

The main feature of this work was a failure to produce the required result of full employment within realistic model parameters. The correct process would then have been to report that finding, which would have been in line with what actually happened, but they chose to tweak the model by the introduction of massive capital investment. This artificial process forced the model to say what was wanted and the result was then widely publicised.

Whereas the growth merchants have feet of clay, the limits forecasts from the 1970s hold up well when put to the test. When in 2008 the Australian CSIRO returned to the 1972 forecasts of The Limits to Growth and considered whether the real world had followed the forecast trends, the results were convincing. They considered measures of population (birth rates, death rates and population growth), food (and food per capita), services (basic education, electricity and suchlike), industrial output per capita, non-renewable resources and global pollution. All were tracking along the forecast paths towards the coming crisis.

The graph of population on page 3 is typical. These further graphs (right) of food and industrial output per capita, non-renewable resources and global persistent pollution show the same correlation between forecast and observation.

Data since 1972 follow the standard run closely, and do not deviate to follow alternative paths. This result echoes a study I carried out in 2000, when I found that my worrying picture built up around 1980 was robust. Trends have been intriguingly following the expected pattern, including more recently the 2008 oil peak and economic collapse, galloping global warming and the appearance of boat people off the Australian coast.

When I studied the futures literature back in 2000 I found two very different dominant themes. Each followed observed trends and each could describe features of the coming decades. Some of the articles suggested the possibility of food shortages, which would exacerbate the considerable inequalities observed today. That negative scenario may be exacerbated by water shortages and climate change. However a much more prevalent picture was of increasing human capabilities, new technologies and wealth.

No choice is needed; both sets of forecasts may prove robust, as existing trends take different regions or different groups along very different paths. There is then the possibility of the coexistence of two very different societies in the future. This is quite likely; after all it was like that in mediaeval times and in eighteenth and nineteenth century Europe, and this is the reality in many parts of the world today.

I have described the application of the scientific method to long-term forecasting. This is the way a scientist operates, in a search for the truth. An opposite process is followed in economics, where false analyses are widely publicised, and the fit of forecast to reality is ignored. New Zealand discourse is dominated by shonky science. The key work on global crisis comes from the Australian CSIRO while the DSIR, where I started my work, is no more. Here science is in a straitjacket of controls, totally gutted. In a recent round of grants eight out of nine applications were turned down, and initiative is killed as scientists waste time writing proposals for guaranteed results rather than asking questions and exploring the world. The human cost has also been enormous with the crushing of the lively, questioning spirit in true science. The fun of science is gone. Sadly the spokesman for the scientific community, the Royal Society (RSNZ) is quiescent.

Even in economics much more can be done. In 1989 I was able to foresee the collapsing system we have now. Sometimes I dream that we can recover the spirit of the 1970s when the debate was well-informed, when an initiative in the DSIR was supported and the Commission for the Future was set up. It is nowhere on the horizon. This is a country that is deep in denial, which can sign up to Kyoto and then do nothing as greenhouse gas emissions from 1990 to 2007 increase 39.2 percent for energy and 35 percent for industrial processes. Where is the madness here?

Ignorance goes nowhere. A people which faces the world with eyes wide open can gain a national spirit and decide to work towards a satisfying and full life for all, even in the face of adversity, rather than put up with the massive inequality introduced in 1984 and still touted as the way forward.

Graphs are reproduced with permission from Graham M Turner 2008: A comparison of The Limits to Growth with 30 years of reality, Global Environmental Change 18(3): 397-411.

mp3 blues

HAVING recently joined the happy hordes of mp3 player owners, our household has been getting an object lesson in the nature of random events. For those who have yet to succumb to the charms of these amazing little gadgets, they can hold thousands of songs in memory and play them back in many different ways. You can, for example, just play a single album, or make up a playlist of songs for a party, or to encapsulate a particular mood.


Superstitious? Me? That depends

When the Sunday Star-Times decided to survey the nation on how superstitious New Zealanders are and about what, Vicki Hyde got used as a guinea pig. Part One of her responses was published in the last issue of the NZ Skeptic. This is Part Two.

The Paranormal

Paranormal phenomena are things that cannot be explained and/or proven by current scientific methods. Put a number between 1 and 7 next to each item to indicate how much you agree or disagree with that item.

7 = Strongly agree, 1 = Strongly disagree, 4 = Neutral

Astrology is a way to accurately predict the future.

1 – Having done lots of charts, I know it’s applied psychology – people will read into it what they want to. No accuracy, no prediction.

Psychokinesis, the movement of objects through psychic powers, does exist.

7 or 1 – If you’d said mental abilities instead of psychic powers, I would have agreed. We have a growing number of examples of neurological manipulation of an external environment, such as people able to move cursors around a computer screen by thinking at it. That’s real with the right kind of technology behind it. And pretty darned amazing, not to mention hugely inspiring for people with motor disabilities, given the possibilities for future development.

However, using psychic powers, a la X-Men, to shift things, that’s not been demonstrated.

During altered states, such as sleep or trances, the spirit can leave the body.

1 – Presupposes the existence of the spirit in the first place …

Out-of-body experiences (OOBEs) are fascinating and real in the sense that the people who experience them – me, for one! – feel as if they are real. However, neuroscience is starting to paint a very interesting picture of how these experiences occur and even how to induce them. This does not involve the spirit departing the body, nor have such experiences been able to demonstrate conclusive proof of knowledge gained solely from such a spirit wandering.

The Loch Ness monster of Scotland exists.

1 – Though it would be great if it did. Imagine a plesiosaur living in these times; that would be a magnificent survival story. But you only have to stop and think for a bit to see how unlikely it is. We’ve got much more chance for the Fiordland moose or the moa to pop up here than Scotland’s favourite cryptozoological beastie lurking in the depths.

The number ’13’ is particularly unlucky or particularly lucky

1 – Only if you’re culturally responsive to it. Other cultures don’t like four or seven or NEE!

Reincarnation does occur.

1 – I haven’t seen any good evidence for agreeing with this, and it presupposes a whole host of entities and processes to support it for which there is no evidence.

There is life on other planets.

7 – I’d prefer it if you said “likely to be life on other planets”, as we still don’t have any specific examples, but I’ll take a punt and be definite on this one. It’s a big universe out there and it would be rather presumptuous of us to assume that our planet was the only one to experience the right conditions for life to occur.

Most card-carrying skeptics would agree with this one. Where we tend to demur is at the idea that such life must therefore be intelligent and buzzing our planet teasing the natives …

Some psychics can accurately predict the future.

1 – Only if you define accurately to mean “roughly right if you let them reinterpret what they said after the event”. Anything beyond their very generalised predictions has failed on a regular basis. Here are some examples:

For 2001, psychics predicted that:

  • the nine US Supreme Court judges would vanish without a trace
  • the Mississippi River would flood, forming a new ocean in the US heartland
  • Pope John Paul II would die and his successor would be Italian

And the big story they missed – the 9/11 attack on the Twin Towers in New York.

In 2005, professional psychics saw the usual mix of the banal and bizarre, including that:

  • terrorists would start World War III by shooting a nuclear missile into China
  • the winner of a new reality TV show would gain fame by killing and eating a contestant
  • the San Andreas Fault in California would have a massive rupture on June 17 with a death toll reaching 4,568,304

And what did they miss? Hurricane Katrina, which made thousands homeless in the southern US, and the devastating earthquake that hit Pakistan and India in October, killing 73,000 people.

There are actual cases of witchcraft.

5 – It depends on your definition of witchcraft, which is a culturally and historically complex concept. Riding on broomsticks, outside the Harry Potter movies, is right out, though there might be a technological fix for that in the future, which could be fun.

In a strong cultural context, makutu, maleficus, pointing the bone, voodoo and a whole pile of other psychological techniques can certainly affect a compliant individual immersed in the belief system.

It is possible to communicate with the dead.

1 – Certainly not going by the current crop of rather banal, self-similar pronouncements by those professionals claiming to have this ability.

Taniwha do exist.

4 – Culturally yes, physically no. And this makes it different to the Loch Ness Monster or the Yeti, where people claim such things can be found and photographed.

During the 2002 furore over the Waikato taniwha lurking inconveniently in the path of the main south highway no-one went and actually looked for Karu Tahi. It was understood that the taniwha was a cultural matter, not a physical matter, and that regardless of that, it had a role to play in the debate about development.

Have you ever had a ‘paranormal’ experience – one that can’t be explained scientifically, or ‘proven’ in ways that a scientist would accept? If so, what was it?

Not one that I haven’t been able to think of an alternative non-paranormal explanation for.

You’ve got to remember that, based on general experiences and basic maths, you should experience a million-to-one coincidence roughly every two years – so the world will throw up mysterious experiences from time to time. How we explain those experiences by observation, examination, replication and just plain hard thinking is a lot of fun, and far more interesting than the quick jump to a paranormal pablum.

Lotto

How frequently do you buy a Lotto ticket?

Not in about 10 years.

If you buy one often, do you regularly use the same numbers? (Y/N)

Nope, but I do know what numbers to use to increase my winnings. Send me $10 and I’ll tell you how … 🙂

But seriously, you can improve your winnings by doing the following:

  • select sequences – most people think these can’t come up as they aren’t random, but they are as random as any other set of numbers (don’t choose 1, 2, 3, 4… or …37, 38, 39, 40 as these are the sequences most likely to be chosen by others).
  • don’t choose any numbers with 7 in them; seven is commonly considered a lucky number, so when the numbers 7, 10, 17, 23, 27, 33, 37 came up in one draw, 21 people shared the first division prize and 80 people took the second division. The average numbers of winners at that time were 3 and 19 respectively, so every winner of that draw got a much smaller slice of the pie.
  • don’t choose double digits or numbers ending in 0 – these are more likely to be picked by people playing numbers.

These strategies do not affect your chances of winning, but can be used to improve the amount you win. This is because you are not playing merely against the machine, but also against everyone who has a Lotto ticket. Pick the more ‘popular’ numbers and you’ll have to share the prize with more people. Select ‘uncommon’ numbers or ‘unlikely’ sequences and you have a good chance of not having to share the winnings.
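A rough sketch of that arithmetic (with invented figures for the prize pool and for how many other players are likely to share a given combination) shows the effect: the chance of winning is identical in every case, but the expected prize per winner shrinks as the popularity of the numbers rises.

```python
# Sketch: why unpopular numbers pay better, even though every combination
# is equally likely to be drawn. All figures below are invented for illustration.

def expected_prize(prize_pool, expected_other_winners):
    """First-division prize per winner if you win, given how many other
    players are expected to hold the same combination."""
    return prize_pool / (1 + expected_other_winners)

prize_pool = 1_000_000  # hypothetical first-division pool

# Hypothetical counts of other players expected to share the combination.
scenarios = {
    "popular numbers (birthdays, lucky 7s)": 20,
    "average combination": 3,
    "unpopular combination (high numbers, no lucky 7s)": 0.5,
}

for label, others in scenarios.items():
    print(f"{label:>50}: ${expected_prize(prize_pool, others):,.0f} per winner")
```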

Who said maths wasn’t useful …

Religion

Do you consider yourself to be a religious/spiritual person? (Y/N)

No. Ethical, yes; moral, yes; honourable yes, but I don’t think you have to be religious or spiritual for any of that.

If so, what religion/teachings do you follow?

I guess the closest I’d get to one would be the Golden Rule, found in many a religion and philosophy – variously described as “do as you would be done by”. Sure there are critiques of this ethic of reciprocity, but it’s not a bad one-liner to start with.

Conspiracy Theories

Below is a list of theories about the causes of important or controversial events. Please read through, and indicate how likely these are as actual explanations.

7 = very likely, 1 = extremely unlikely

The All Blacks were deliberately poisoned before the 1995 rugby world cup final

5 – Put enough people together in a group environment under stress and it’s not unlikely some will fall ill. ‘Course the circumstances can seem more suspicious depending on the situation, and I’d tip this one on the more likely side just because of the circumstances surrounding it. On the other hand, sh*t happens …

Princess Diana was killed by British secret service in order to prevent a Royal scandal

1 – I just don’t think they’re that competent …

A secret cabal of American and European elite control the election of national leaders, the world economy, and direct the course of history in their favour

1 – At some times, in some places, there have been powerful non-elected forces at work behind the scenes, but an all-powerful Illuminati seems very unlikely.

There is a deliberate political conspiracy to suppress the rights of minorities in NZ

3 – Not a conspiracy, but possibly just basic human psychology at work. Never put down to malice what can be achieved through thoughtlessness …

Of course, you could argue that democracy and consensus-building, by their very nature, are going to ride over minorities in their general quest for the greatest good for the greatest number. But I’d need a lot more red wine in me to get into that debate …

NASA faked the first moon landings for publicity

1 – Only the first?

I think the saddest thing about this one is that my kids, and a whole lot of other people, are growing up in a world where they’ve never seen a moon shot to inspire them with a sense of awe at what humanity is capable of achieving. When everyone in my fourth form class had a poster of the Bay City Rollers stuck to their desk-lid, I had the famous shot of Buzz Aldrin standing on the Moon. It still makes my heart lift.

The war in Iraq has less to do with promoting democracy than it does with controlling oil production in the East

6 – The reasons for going into Iraq were pretty shonky in the first place. But few things are done for just one reason …

Elvis Presley faked his death to escape the pressures of fame, the shame of his decline, or the unwanted attentions of the Mob

1 – Nope, he just carked it. Now if you’d cited Jim Morrison I might’ve wondered as I think he’d have been smart enough to pull it off …

World governments are hiding evidence that the earth has been visited by aliens

1 – Too big a story, too incompetent a collection to let that one run for any length of time.

The American government was either involved in, or knew about, the September 11 attacks before they happened

2 – I gather they were aware that an attack of some kind was being planned, but the rest of the conspiracy ideas around this are just sickening and demonstrably incorrect in many cases. People want to find an explanation for such things and someone to blame and, for some, governments or Big Business or the MIB or the Gnomes of Zurich serve as the first port of blame.

Playing the numbers game

Some risks in life are distributed throughout a population, others are all-or-nothing. There’s a big difference. This article is based on a presentation to last year’s Skeptics Conference.

Many organisations, not excluding certain government agencies, rely heavily on public fear to influence public decisions and to provide their on-going funding. That provides strong motivation to generate fake fears even where there is no real public danger.

There are several methods in use:

  1. The distinction between evenly distributed risk and all-or-nothing risk is obscured.
  2. Forecasts that should be written as fractions are multiplied, unjustifiably, into a purported risk to individuals (“10 deaths per million”).
  3. An obscure statistical trick is used to treat the most extreme possibility as though it is the most likely.
  4. Numbers are reported with unjustifiable levels of precision to provide a reassuring air of scientific competency. A check of your food cupboard will reveal boxes claiming, say, 798 mg of protein, where natural variations in ingredient composition can justify only “0.8 g”.

Kinds of risks

Imagine that a maniac injected a lethal dose of undetectable poison into one orange in a box of 1000 fruit. Would you willingly eat an orange from that box? Surely the risk of dying is more important than the minor pleasure of a juicy fruit.

On the other hand, assume that these 1000 fruits are converted into juice. Everyone who drinks a portion of juice will ingest one-thousandth of a lethal dose. Aside from the yuk-factor, would you drink this beverage? I would, since a dose of 0.001 of the lethal dose will not, in the absence of any other negative factors, harm me. A critical enzyme that is blocked by 0.1 percent will still provide 99.9 percent functionality. In fact, most enzyme systems are down-regulated (throttled back) by our natural feedback controls. We routinely consume significant but harmless amounts of natural poisons such as cyanide. Our bodies are rugged; we are not delicate mechanisms. We can function despite losing half our lungs, half our kidneys, most of our liver, and even parts of our brains. Most critical parts of metabolism are backed up by duplicate mechanisms. The key term here is ‘biological threshold’. To speak of a ‘low dose of poison’ is to mouth a meaningless collection of sounds: only high doses of poisons are poisonous.

Evenly distributed risk versus all-or-nothing distribution

Returning to my example of the poisoned oranges, the risk factor is one in a thousand for an individual fruit and for a glass of juice. The difference between the two is that risk is lumped all-or-nothing in one case, and distributed uniformly in the other case. The maths may be the same but the practical conclusions are different.

Here is another example: the Bonus Bond Lottery. My $10 bond receives four percent interest a year, 40 cents annually, about three cents per month. Please don’t clutter my letter box with bank statements reporting another three cents! No Bonus Bond holder wants his or her earnings to be distributed uniformly. Obligingly, the bank turns all these tiny individual earnings into a single monthly prize of $300,000 going to just one lucky person. Three cents is trivial, but $300,000 is life-changing. No wonder people hold onto Bonus Bonds.
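The contrast is easy to put in code. In the sketch below (invented numbers) both schemes hand out exactly the same total dose, so the average per person is identical; only the lumped scheme pushes anyone over a harm threshold.

```python
import random

# Sketch: identical expected value, very different distribution of outcomes.
# 1000 "oranges", one of which carries the whole lethal dose.
# All numbers are invented for illustration.

random.seed(1)
N = 1000
lethal_dose = 1.0

# All-or-nothing: one person gets the whole dose, the rest get none.
lumped = [0.0] * N
lumped[random.randrange(N)] = lethal_dose

# Distributed: the juice scenario -- everyone gets 1/1000 of the dose.
distributed = [lethal_dose / N] * N

print("mean dose, lumped:     ", sum(lumped) / N)
print("mean dose, distributed:", sum(distributed) / N)
print("people above a harm threshold of 0.5 dose (lumped):     ",
      sum(d > 0.5 for d in lumped))
print("people above a harm threshold of 0.5 dose (distributed):",
      sum(d > 0.5 for d in distributed))
```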

Injury from damaging chemicals or conditions is not random

The standard way to evaluate toxicity is to treat a number of animals with increasing amounts of chemical (or radiation, etc). The LD50 is the dosage where half the animals die (or develop cancer, etc.) Is the LD50 an example of all-or-nothing risk? In fact, it is a distributed risk applied over a heterogeneous population. All the animals given the LD50 dose were seriously ill but only half of them succumbed. Perhaps they had been fighting, or simply were genetically weaker. It’s hard to tell. Presumably the healthiest animals were most likely to survive.

Examine a lower dosage, where only 10 percent of the animals succumbed. The survivors didn’t go off to play golf! All were affected, but one in 10 was too weak to survive. If you imagine a bell-shaped curve where ‘health’ is on the x-axis, then a low dosage shifts all the animals to the left; those who started on the low-health side of the curve were likely to drop off the mortality cliff.
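That bell-curve argument can be sketched numerically: give a simulated population a normally distributed ‘health’ score, shift everyone down by a dose-dependent amount, and count how many fall below a survival threshold. The figures below are invented purely to illustrate the shape of the argument; note that the largest dose shown behaves like an LD50.

```python
import random

# Sketch: a dose shifts the whole "health" distribution to the left,
# but only those already near the threshold drop below it.
# All numbers are invented for illustration.

random.seed(42)
population = [random.gauss(100, 15) for _ in range(100_000)]  # baseline "health"
threshold = 40                                                # below this, the animal dies

for dose_effect in (0, 10, 20, 40, 60):
    deaths = sum((h - dose_effect) < threshold for h in population)
    print(f"dose effect {dose_effect:3d}: {deaths:6d} deaths per 100,000")
```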

We can apply this logic to episodes of severe air pollution. We can predict, in advance, who is likely to die and who is likely to survive. There might be, say, 20 deaths per 100,000, but that does not mean an otherwise healthy man or woman has 20 chances in 100,000 of dying. It’s the people with pre-existing respiratory problems who are at risk, not everyone.

Groups that forecast a certain number of deaths per million, from a particular environmental contaminant, should be challenged to describe in advance the characteristics of the victims. Purported victims of tiny doses of chemicals will be abnormally inept at detoxification, perhaps because of liver disease. They are probably hypersensitive to many chemicals in addition to one particular man-made chemical. One may wonder how such people have survived to adulthood.

Turning 100 rats into millions of people

The rules for long-term testing of potential carcinogens are:

  1. 100 rodents per concentration (x 2 for both males and females).
  2. At least three dose levels.
  3. Highest dose levels such that animal growth is inhibited about five or 10 percent (ie, partly toxic doses).
  4. Lowest dose is one-tenth of highest. (Very narrow range).
  5. Attempt to minimise number of animals for both humanitarian and economic reasons (US$600,000 per study).

Successful experiments are those where, at high doses, death or cancer rates of 10 to 90 percent occur. For statistical and practical reasons, responses below 10 percent are difficult to measure. (A massive $3 million ‘mega-mouse’ effort failed to confirm a forecast of one percent cancer caused by low levels of a known carcinogen.) So estimates for risk at low doses can only be done by extrapolating high-dose results.

There are different ways to calculate hypothetical response to doses lower than tested. These are:

  • quadratic
  • linear
  • power
  • non-linear transition
  • threshold

Do we really care which extrapolation equation is used? Arguments between log-linear, probit, or threshold models are no better than discussions about whether the angels dancing on the head of a pin are doing the waltz or the two-step. It’s the unjustified extrapolation that’s at fault.

Remember, the number of rodents used is in the hundreds. Extrapolations give rise to predictions of, say, 0.001 cancers per 100 rats at a certain dosage. The meaning of this number is obscure. What is one-thousandth of a cancer?

Obedient computers, run by scientific spin doctors, multiply the numbers by 1000. So now the prediction is one cancer per 100,000 rats. Better yet, “10 cancers per million rats”, or perhaps “9.8 cancers per million”, to simulate spurious precision. We can now perform the brainless arithmetic of multiplying “10 per million” by the population of New Zealand, resulting in newspaper headlines of “40 cancer cases” per year. All this from fewer than one thousand rats!
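The procedure being criticised looks roughly like the following sketch: fit a crude straight line through high-dose rodent results, read off a vanishingly small extrapolated risk at an environmental dose far below anything tested, then multiply it through a national population to manufacture a headline figure. Every number here is invented; the point is the chain of arithmetic, not the values.

```python
# Sketch of the extrapolation-and-multiplication procedure criticised above.
# All dose/response numbers are invented for illustration only.

# High-dose rodent results: (dose in mg/kg, fraction of animals with tumours).
high_dose_data = [(100, 0.10), (300, 0.40), (1000, 0.90)]

# Crude linear-through-origin fit (slope = average response per unit dose).
slope = sum(resp / dose for dose, resp in high_dose_data) / len(high_dose_data)

low_dose = 0.01                      # an environmental exposure far below any tested dose
risk_per_individual = slope * low_dose
population = 4_000_000               # roughly a New Zealand-sized population

print(f"extrapolated individual risk: {risk_per_individual:.2e}")
print(f"headline 'cases per year':    {risk_per_individual * population:.1f}")
```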

Something is seriously wrong with this approach to toxicity testing. It predicts, with unjustifiable precision, death or cancer rates that are forever unverifiable. Moreover, high-dose tests can overwhelm natural defences, falsely suggesting damage from lower doses, damage that is never observed.

An alternative way to handle toxicological data is to see how long it takes for chronic doses to cause damage. An important paper by Raabe (1989) did this for both a chemical carcinogen and for radiation damage. His plots show, logically, that as the dosage was lowered, the animals survived longer. In fact, at low doses the animals died of old age!

Toxicologists using this approach could estimate “time-to-damage” with considerable reliability. (Converting rodent risk to human risk would remain problematical.) The results would be reported as, for example, “At this low dosage, our data predict onset of cancer in individuals with more than three centuries of exposure at the permitted level.” This kind of reassuring forecast would not, unfortunately, inspire larger budgets for the testing agency.

The ‘upper boundary scam’

The bell-shaped normal curve offers another way to fool the public. At the standard threshold of ’95 percent confidence level’, there is an upper limit point and a corresponding lower limit point. We expect the ‘real answer’ to lie between those points, but the most likely value is around the middle of the range. An honest report of our results should include both the mean (middle) value together with some indication of how broad our estimate is.

I have a homoeopathic weight-loss elixir to market. I have run simulated tests on 30 women using Resampling Statistics. (This is a low-budget business and I don’t want to waste my advertising budget on real experiments.) The mean result was, of course, zero change, but there was an upper-bound of 2 kg weight loss. (There was also a corresponding figure of minus 2 kg weight loss, ie, gain, but we won’t worry about that!).
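For anyone who wants to see how little an ‘up to 2 kg’ claim would rest on, here is a sketch of the same trick: simulate 30 weight-loss results from a treatment with zero real effect, compute a 95 percent confidence interval for the mean, and quote only the flattering end. The spread of 5 kg is an invented figure.

```python
import random
import statistics

# Sketch of the 'upper boundary' trick: a treatment with zero real effect,
# reported only by the flattering end of its 95% confidence interval.
# The spread (sd = 5 kg) is invented purely for illustration.

random.seed(7)
losses = [random.gauss(0, 5) for _ in range(30)]   # simulated weight loss, kg (true effect: zero)

mean = statistics.mean(losses)
sem = statistics.stdev(losses) / len(losses) ** 0.5
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem

print(f"mean loss:      {mean:+.2f} kg")
print(f"95% interval:   {lower:+.2f} to {upper:+.2f} kg")
print(f"the advertisement quotes only the top end: 'lose up to {upper:.1f} kg'")
```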

Am I entitled to advertise ‘Lose up to 2 kg’? After all, 2 kg loss was a possibility, even though it’s right at the edge of statistical believability. My proposed advertisement is sharp practice, probably fraudulent. Surely no reputable organisation would distort their results this way!

Such considerations do not seem to bother US and NZ environmental agencies, which happily quote ‘upper bound’ forecasts. The US Environmental Protection Agency (EPA) wrote:

“We were quite certain any actual risk would not exceed that [upper bound] and would be a very conservative application and be quite protective. It does not necessarily have scientific basis, but rather a regulatory basis…. EPA considers the use of the upper 95 percentile as a conservative estimate.”

Problems

One problem with the upper bound estimate of risk is that the worse the underlying data, the more extreme the upper bound, and hence the greater the forecast risk! Some people might suggest that this gives the EPA an incentive not to refine and expand its experimental database, but I couldn’t possibly comment.

There is nothing inherently wrong with using the upper bound; it does indeed offer reasonable confidence that no damage will occur. But it is dishonest if the central prediction, the mean (average) is not also given. The difference between the mean and the upper-bound tells the public whether or not the estimates are too crude to believe. Unfortunately both numbers are rarely given. One EPA spokesman stated that “The upper bound and maximum dose estimate is usually within two orders of magnitude” (my italics).

That’s an error margin of 100-fold!

A revealing example of how this works was given by EPA in a now-defunct web page devoted to estimating risks from chlorine residues in swimming pool water. That page seems to have been withdrawn, but my own copy of it is at www.saferfoods.co.nz/EPA_drinking_water.html

This amazing document considered the health risk if drinking water were contaminated by swimming pool water. The upper-bound risk was 24 bladder cancer cases per year (for the entire American population). Precautions against this happening would cost $701 million (note the convincing precision here). This would save $45 million of medical and other losses. Many people would have considered that such poor payback would be sufficient reason to drop the proposal.

The “24 cases” of bladder cancer represented the upper-bound estimate. The most likely estimate was, however, 0.2 cases per year. This agrees with their admission of 100-fold discrepancy between upper-bound and most-likely. The most likely outcome of spending $700 million would be to avert one case of bladder cancer every five years! Is that what EPA considers “conservative” and “protective”? Incidentally, the US has more than 60 thousand new cases of bladder cancer every year. An effective anti-smoking campaign would halve that number. What the EPA described as a “conservative” approach turns out to be a proposal to waste millions of dollars for little or no benefit.

There is a further consequence of the EPA mathematics. Since low doses of toxins can sometimes improve health (‘hormesis’), EPA’s figures imply a lower-bound estimate that two dozen cases of bladder cancer could be prevented by drinking dilute swimming pool water.

When the American Council on Science and Health petitioned the EPA to eliminate ‘junk science’ from its administrative process, EPA eventually announced that “Risk Assessment Guidelines are not statements of scientific fact … but merely statements of EPA policy.”

We might expect such behaviour from the EPA. After all, it has 18 thousand employees and a budget of more than US$6 billion. You don’t get that kind of money by telling the American public that they need not worry.

Meanwhile in NZ…

Surely our New Zealand government agencies won’t stoop to the dubious flim-flam of the EPA? Consider the NZ Ministry for the Environment report entitled in part, “Evaluation of the toxicity of dioxins … a health risk appraisal”.

www.mfe.govt.nz/publications/hazardous/dioxin-evaluation-feb01.pdf

“The current appraisal has estimated that the upper bound lifetime risk for background intake of dioxin-like compounds for the New Zealand population may exceed one additional cancer per 1000 individuals.

This cancer risk estimate is 100 times higher than the value of 1 in 100,000 often used in New Zealand to regulate carcinogenic exposure from environmental sources. Of course, if there were a threshold above current exposures the actual risks would be zero. Alternatively, they could lie in a range from zero to the estimate of 1 in 1000 or more.”

This confusing prose says, I think, that the likelihood of risk from dioxin-like compounds is much less than one per 1000. It might be zero, but the Ministry has not provided enough vitally important data. Should public policy be made on the basis of unlikely, unprovable, worst-case guestimates?

The upper boundary scam is, I believe, a despicable misapplication of ‘science’. It’s junk science. Should a government agency be allowed to misinterpret data in a way that would lead to a false-advertising claim if tried on by private merchandisers? I think not.

Our brains are not wired to handle low probabilities. We jump to conclusions on the basis of inadequate information. It’s in our genes. Cave men or women who waited around for more information were eaten by sabre-tooth tigers and didn’t pass their cautionary genes on to us. The cave woman who ran at the slightest unusual sound or smell passed her quick-to-act genes on to us. This is not a good recipe for evaluating subtle statistical issues.

In today’s world, there are many people and organisations ready to push their narrow point of view. We need to be as suspicious about groups claiming to ‘protect’ us or the Earth as we are about time-share salesmen and politicians.

Slops the latest Health Threat

The World Health Organisation has issued a new warning against non-essential travel to the entire Western Hemisphere following renewed concerns about the spread of Severe Loss of Perspective Syndrome (Slops).

Officials are warning travellers not to visit western Europe and North America where outbreaks of the disease have led to mass panic among the media, and increased profits from DIY stores as the gullible public rush to bulk-buy face masks and boiler suits.

A WHO spokesman said, “You’d be much better going to somewhere like Thailand or China, because all you’ve got to worry about there is Sars, and let’s face it, you’re about as likely to die from that as you are to get kicked to death by a gang of zombie nuns.”

The Sars virus has now claimed a staggering 500 lives in only six months, which makes it considerably more deadly than, say, malaria, which only kills around 3000 people every single day. Malaria, however, mainly affects only natives what speak foreign, whereas SARS has made at least one English person feel a bit iffy for a couple of days, and is therefore considered much more serious.

The spread of Slops has now reached pandemic proportions, with many high-level politicians seemingly affected by the disease. Its rapid spread has been linked to the end of the war in Iraq and the need for western leaders to give the public something to worry about. Otherwise, they might start asking uncomfortable questions about domestic issues.

Behind the Screen

Mass screening programmes have generated considerable controversy in this country. But these programmes have inherent limitations, which need to be better understood.

In 1996 the Skeptical Inquirer published an article by John Allen Paulos on health statistics. Among other things this dealt with screening programmes. Evaluating these requires some knowledge of conditional probabilities, which are notoriously difficult for humans to understand.

Paulos presented his statistics in the form of a table; a modified version of this is shown in the table below.

                  Have the       Do not have       Totals
                  condition      the condition
Test Positive     990            9,990             10,980
Test Negative     10             989,010           989,020
Totals            1,000          999,000           1,000,000
Table 1. Base rate is 0.1%; level of false positives = 1%; level of false negatives = 1%

Of the million people screened, one thousand (0.1%) will have the condition. Of these, 1% will falsely test negative (10) and 99% will correctly exhibit the condition. So far it looks good, but 1% of those who do not have the condition also test positive, so that the total number who test positive is 10,980. Remember that this is a very accurate test. So what are the odds that a random person who is told by their doctor that s/he has tested positive actually has the condition? The answer is 990/10,980, or 9%.

In this hypothetical case the test is 99% accurate, a much higher accuracy rate than any practical test available for mass screening. Yet over 90% of those who test positive have been diagnosed incorrectly.

In the real world (where tests must be cheap and easy to run) a very good test might achieve 10% false negatives and positives. To some extent the total percentage of false results is fixed, but screening programmes wish to reduce the number of false negatives to the absolute minimum; in some countries they could be sued for failing to detect the condition. This can only be done by increasing the chance of false positives or inventing a better test. Any practical test is likely to have its results swamped with false positives.

Consider a more practical example where the base rate is the same as previously, but there are 10% false negatives and positives, ie the test is 90% accurate. Again 1 million people are tested (see Table 2 below).

                  Have the       Do not have       Totals
                  condition      the condition
Test Positive     900            99,900            100,800
Test Negative     100            899,100           899,200
Totals            1,000          999,000           1,000,000
Table 2. Base rate is 0.1%; level of false positives = 10%; level of false negatives = 10%

This time the total number testing positive is 100,800. But nearly one hundred thousand of them do not have the condition. The odds that any person who tested positive actually has the condition are 900/100,800, or a little under 1%. This time, although the test has correctly classified 90% of those screened, 99% of those who test positive have been diagnosed incorrectly.
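The arithmetic behind both tables reduces to a few lines. The sketch below rebuilds the cell counts from a base rate and the two error rates, and reports the only number the patient really cares about: what fraction of positive tests are true positives.

```python
# Sketch: the arithmetic behind the screening tables above.
def screening_table(population, base_rate, false_negative_rate, false_positive_rate):
    have = population * base_rate
    have_not = population - have
    true_pos = have * (1 - false_negative_rate)
    false_pos = have_not * false_positive_rate
    total_pos = true_pos + false_pos
    print(f"test positive: {total_pos:,.0f}  "
          f"(of whom {true_pos:,.0f} actually have the condition, "
          f"i.e. {true_pos / total_pos:.1%})")

# Table 1: 99% accurate test, 0.1% base rate -> about 9% of positives are real.
screening_table(1_000_000, 0.001, 0.01, 0.01)

# Table 2: 90% accurate test, 0.1% base rate -> under 1% of positives are real.
screening_table(1_000_000, 0.001, 0.10, 0.10)
```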

In both these cases the incidence of the condition in the original population was 0.1%. In the first example the screened population testing positive had an incidence two orders of magnitude higher than the original population, but this was unrealistic. In the second example those testing positive in the screened population had an incidence one order of magnitude higher than the general population.

This is what a good mass screening test can do: raise the incidence of the condition among those who test positive to one order of magnitude above that in the general population. However any individual who tests positive is still unlikely to have the condition, and all who test positive must now be further investigated with a better test.

So screening programmes should not be aimed at the general population, unless the condition has a very high incidence. Targeted screening does not often improve the accuracy of the tests, but it aims at a sub-population with a higher incidence of the condition. For example, screening for breast cancer (a relatively common condition anyway) is aimed at a particular age group.

Humans find it very difficult to assess screening, and doctors (unless specifically trained) are little better than the rest of the population. It has been shown fairly convincingly that data are most readily understood when presented in tables as above. For example the data in Table 3 was presented to doctors in the UK. Suppose they had a patient who screened positive; what was the probability that that person actually had the condition?

When presented with the raw data, 95% of them gave an answer that was an order of magnitude too large. When shown the table (modified here for consistency with previous examples) about half correctly assessed the probability of a positive test indicating the presence of the disease.

                  Have the       Do not have       Totals
                  condition      the condition
Test Positive     8,000          99,000            107,000
Test Negative     2,000          891,000           893,000
Totals            10,000         990,000           1,000,000
Table 3. Base rate is 1%; false negative rate = 20%; false positive rate = 10%

This time the total number who test positive is 107,000. But nearly one hundred thousand of them do not have the condition. The odds that any person who tested positive actually has the condition are 8,000/107,000, or about 7.5%. Now remember that nearly half the UK doctors, even when shown this table, could not deduce the correct result. If your doctor suggests you should have a screening test, how good is this advice?

Patients are supposed to be supplied with information so that they can make an informed decision. Anybody who presents for a screening test in NZ may find it impossible to do this. My wife attempted to get the data on breast screening from our local group. She had to explain the meaning of “false negative”, “false positive” and “base rate”. The last is a particularly slippery concept. From UK figures the chance of a 40-year-old woman developing breast cancer by the age of 60 is nearly 4% (this is the commonest form of cancer in women). However, when a sample of women in the 40-60 age group is screened, the proportion who should test positive is only about 0.2%. Only when they are screened each year will the total of correct positives approach 4%.

The number of false positives (again using overseas figures) is about 20 times the number of correct positives, so a woman in a screening programme for 20 years will have a very good chance of at least one positive result, but a fairly low probability of actually having breast cancer. I do not think NZ women are well prepared for this.
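That ‘very good chance of at least one positive result’ is straightforward to quantify. Assuming, purely for illustration, a false positive rate of about 10 percent per screen (the sort of figure quoted earlier for a good practical test), the cumulative probability over repeated annual screens is sketched below.

```python
# Sketch: cumulative chance of at least one false positive over repeated screens.
# A per-screen false positive rate of 10% is assumed purely for illustration.

false_positive_rate = 0.10

for screens in (1, 5, 10, 20):
    p_at_least_one = 1 - (1 - false_positive_rate) ** screens
    print(f"{screens:2d} screens: {p_at_least_one:.0%} chance of at least one false positive")
```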

The Nelson group eventually claimed that the statistics my wife wanted on NZ breast cancer screening did not seem to be available. But, they added, “we (the local lab) have never had a false negative.” From the recent experience of a close friend, who developed a malignancy a few months after a screening test, we know this to be untrue. What they meant was that they had never seen a target and failed to diagnose it correctly as a possible malignancy requiring biopsy. This may have been true but it is no way to collect statistics.

Screening for breast cancer is generally aimed at the older age group. In the US a frequently quoted figure is that a woman now has a one in eight chance of developing breast cancer, which is higher than in the past. This figure is correct but it is a lifetime incidence risk; the reason it has risen is that on average women are living longer. The (breast cancer) mortality risk for women in the US is one in 28. A large number who develop the condition do so very late in life and die of some other condition before the breast cancer proves fatal.

Common Condition

Breast cancer is a relatively common condition and would appear well suited for a screening programme. The evaluation of early programmes seemed to show they offered considerable benefit in reducing the risk of death. However later programmes showed less benefit. In fact, as techniques improved, screening apparently became less effective. This caused some alarm, and a study published in 1999 by the Nordic Cochrane Centre in Copenhagen looked at programmes worldwide and attempted to better match screened populations with control groups. The authors claimed that women in screening programmes had no better chance of survival than unscreened populations. The reaction of those running screening programmes (including those in NZ) was to ignore this finding and advise their clients to do the same.

If there are doubts as to the efficacy of screening for breast cancer, there must be greater doubts about screening for other cancers in women, for other cancers are rarer. Any other screening programme should be very closely targeted. Unfortunately the risk factors for a disease may make targeting difficult. In New Zealand we have seen cases where people outside the target group have asked to be admitted into the screening programme, so they also “can enjoy the benefits”. Better education is needed.

Late-onset diabetes is more common among Polynesians than among New Zealanders in general, and Polynesians have very sensibly accepted that this is true. Testing Polynesians over a certain age for diabetes makes sense, particularly as a test is quick, cheap and easy to apply. Testing only those over a certain body mass would be even more sensible but may get into problems of political correctness.

Cervical cancer is quite rare so it is a poor candidate for a mass screening programme aimed at a large percentage of the female population. The initial screening is fast and cheap. If the targeted group has an incidence that is one order of magnitude higher than the general population, then the targeting is as good as most tests. Screening the whole female population for cervical cancer is a very dubious use of resources.

My wife and I were the only non-locals travelling on a bus in Fiji when we heard a radio interview urging “all women” to have cervical screening done regularly. The remarkably detailed description of the test caused incredible embarrassment to the Fijian and Indian passengers; we had the greatest difficulty in concealing our amusement at the reaction. The process was subsidised by an overseas charity. In Fiji, where personal hygiene standards are very high, and (outside Suva) promiscuity rates pretty low, and where most people pay for nearly all health procedures, this seemed an incredibly poor use of international aid.

Assessment Impossible

Screening for cervical cancer has been in place in NZ for some time. Unfortunately we cannot assess the efficacy of the programme because proper records are not available. An attempt at an assessment was defeated by a provision of the Privacy Act. The recent case of a Gisborne lab was really a complaint that there were too many false negatives coming from a particular source. However this was complicated by a general assumption among the public and media that it is possible to eliminate false negatives. It should be realised that reducing false negatives can only be achieved by increasing the percentage of false positives. As can be seen from the data above, it is false positives that bedevil screening programmes.

Efforts to sue labs for false negatives are likely to doom any screening programme. To some extent this has happened in the US, with many labs refusing to conduct breast X-ray examinations, as the legal risks from the inevitable false negatives are horrendous.

Large sums are being spent in NZ on screening programmes; taxation provides the funds. Those running the programmes are convinced of their benefits, but it is legitimate to ask questions. Is this spending justified?

Some Post-Scripts:

January 15 2000 New Scientist P3: Ole Olsen & Peter Gøtzsche of the Nordic Cochrane Centre in Copenhagen published the original meta-analysis of seven clinical trials in 2000. The resulting storm of protest, particularly from cancer charities, caused them to take another look. They have now reached the same conclusion: mammograms do not reduce breast cancer deaths and are unwarranted.

October 2001: In recent TV interviews some people concerned with breast cancer screening in NZ were asked to comment on this meta-analysis. Once again the NZ commentators stated firmly that they were certain that screening programmes in NZ “had saved lives” but suggested no evidence to support their view.

March 23 2002 New Scientist P6: The International Agency for Research on Cancer (IARC) funded by the WHO claims to have reviewed all the available evidence. They conclude that screening women below the age of 50 is not worthwhile. However, screening women aged from 50-69 every two years reduces the risk of dying of breast cancer by 35%.

According to New Scientist, the figures from Britain are that of 1000 women aged 50, 20 will get breast cancer by the age of 60 (2%); of these six will die. Screening every two years would cut the death rate to four. [It is obvious that these are calculations, not the result of a controlled study!]

The IARC states that organised programmes of manual breast examination do not bring survival benefits (they call for more studies on these). If NZ has similar rates then screening programmes aimed at 50-60 year old women should save approximately 50 lives per annum.

Skeptical Early Warning System

One of the arguments presented in favour of this year’s Bent Spoon award was that the NZ Skeptics increasingly provide an early warning system against strange notions from abroad. For example, Skeptical activities helped New Zealand develop some early immunity to the worst excesses of the “repressed memory” virus. While many members supported the Hitting Home award on similar grounds, some may have wondered whether Hitting Home was no more than a local aberration, and whether we were seeing international demons where none existed.

It seems not.

In Massachusetts, USA, a feminist coalition has promoted the view that there has been a widespread epidemic of violence against women in the community, and has succeeded in instituting legislative changes in response. But it turns out that the range of violent and abusive behaviour by males which has contributed to the epidemic includes the following:

  • claiming the truth
  • emotional withholding
  • telling jokes
  • changing the subject of a conversation.

Given these definitions, it should come as no surprise that abuse against women has reached extraordinary new levels.

The further lesson from Massachusetts is that such extended definitions have significance well beyond boosting statistics and writing reports. They have been applied to the administration of justice through vehicles such as the Restraining Order legislation, Section 209A, which allows a Massachusetts woman “in fear of abuse” to be granted an emergency restraining order against a husband or partner which can:

  • order the man immediately out of the home
  • order no contact between the man and the woman and any children
  • grant temporary custody of children to the victim
  • order the man to pay child support

Inevitably Section 209A, which was intended to protect women in genuinely violent and dangerous relationships, has been seized on as a powerful weapon in divorce and custody battles within the civil courts, where such orders have become weapons of domestic war rather than instruments of justice.

During the Hitting Home debate, several Skeptics wondered what could be the point of extending definitions of violence to include verbal sparring and the like, given that the justice system has no mechanism to intervene in such matters. The Massachusetts experience suggests that we were missing the point. These definitions have found their home in the adversarial legal environment where any weapon is legitimate if it assists the prosecution of the case.

Many of us have taken comfort from the fact that we live outside the culture of routine violence displayed so powerfully in Once Were Warriors. But only a brave, or foolish, man or woman could believe that divorce or custody disputes will never intrude into their family lives. During the public debate the Minister of Justice gave an assurance that Hitting Home (which focused on violence by men against women) was to be followed by similar studies focusing on violence by women against men and on violence within other relationships. The Ministry’s staff, when pressed on the matter, revealed that while this was what they had told the Minister, no funds were available for the job.

So, in the absence of local evidence, we must turn to US statistics and studies to test the common-sense assumption that most domestic violence is committed by men against women.

In 1975 and again in 1985, Murray A. Straus and Richard J. Gelles led one of the largest and most respected studies of family violence. They concluded not only that men are just as likely as women to be the victims of domestic violence, but that between 1975 and 1985 the overall rate of domestic violence by men against women decreased, while women’s violence against men increased. Responding to accusations of gender bias in reporting, Straus re-computed the assault rates based solely on the responses of the women in the 1985 study and confirmed that, even according to women, men are more likely than women to be assaulted by their partner.

There is no question that men on average are bigger and stronger than women, and hence can do more damage in a fist-fight. However, according to Professors R.L. McNeely and Coramae Richey Mann, “the average man’s size and strength are neutralised by guns and knives, boiling water, bricks, fireplace pokers and baseball bats.” Their opinion is endorsed by a 1984 study of 6,200 cases, which found that 86% of female-on-male violence involved weapons, compared with 25% of cases of male-on-female violence (McLeod, Justice Quarterly (2), 1984, pp. 171-193).

Several other US and Canadian studies reach similar conclusions while the following Justice Department statistics (1994) suggest that men receive no special favours from the “patriarchal” justice system of the US:

                                                     Men        Women
Proportion of murder victims in domestic violence    55%        45%
Acquitted of the murder of a spouse                  1.4%       12.9%
Receive probation for murdering a spouse             1.6%       16%
Average sentence for murdering a spouse              17 years   6 years

These statistics and data have been collected off the Internet and are subject to bias or even corruption by those who put together the material. However, for what it’s worth, during the time I lived in the United States I was exposed to only one example of genuine domestic violence. A recently married couple living in the apartment beneath me became embroiled in a typical domestic screaming match. The young wife telephoned her mother seeking assistance. Mother drove round to the rescue, wielding a pistol with which she attempted to shoot the son-in-law. Instead she shot her own daughter.

American women turn to guns and knives. The English and Europeans appear to favour poison. How do New Zealand women redress the sexual balance of power? Or have they been conditioned to literally “take it on the chin”? At present we do not know and Hitting Home tells us less than half the story.

For me the strongest lesson of the exercise has been that the scope of such a study is even more important than its internal integrity. Telling half the story may well be less informative, and indeed more damaging to public policy, than telling no story at all.

The 1995 Bent Spoon

This year’s Bent Spoon Award has ruffled a few feathers. In a controversial decision, what the Skeptics described as an “alarmist” Justice Department report on domestic violence in New Zealand has received the award.

“The report, entitled Hitting Home, paints a disturbing picture of New Zealand men as abusers of wives and partners, until you examine the fine print,” said Skeptics head Vicki Hyde.

“Since the report defines ‘abuse’ to include criticising your partner’s family, it is not surprising that half the men surveyed were guilty of some form of psychological abuse… By so exaggerating the extent of abuse, the report trivializes the real domestic violence that goes on in New Zealand,” Ms Hyde said.

For example, Hitting Home refers repeatedly to one particularly disturbing statistic, which was singled out in the Justice Department press release: “when they were shown some typical circumstances in which abuse occurs, 10% [of New Zealand men] said they approved and 56% did not really disapprove of hitting a woman. And in at least one circumstance, six out of ten men say the woman has only herself to blame for being hit.” This indeed would be alarming, were it true that bashing women was behaviour 60% of New Zealand males were willing to turn a blind eye to.

In fact, these figures were arrived at by showing men a list of possible provocations, including finding a partner “in bed with another man,” “physically abusing their child,” and hitting the man first in an argument. From the fact that the disapproval rating of respondents, once shown such circumstances, declined from “moderate to extreme” to “little or moderate” (even though 95%-98% disapproved), we’re served up the false conclusion that “56% did not really disapprove of hitting a woman.” They did disapprove, overwhelmingly; they simply disapproved less strongly than they did of hitting over “she hasn’t cleaned the house” and other trivial items on the list.

The report also inflates conclusions about the prevalence of abuse by its peculiar definition of “abuse” which runs the gamut from “Used a knife or gun on her” to “Kicked something” to “Put down her family and friends” to “Tried to keep her from doing something she wanted to do.” From this starting point, the report finds widespread “abuse” in New Zealand, as it would be a rare couple where a man had not at some time slammed a door or insulted a relation during an argument with his partner. Despite a title suggesting it is about domestic violence, Hitting Home is actually about abuse, understood as virtually any demonstration of anger. Even letting off steam to avoid “abuse” can be classified as “abuse.”

One of the report’s authors told the Listener that “Overall, the research found that New Zealand rates of abuse are about twice as high as rates based on what women say.” This is no surprise as the report’s inflated definition of abuse includes behaviours that even the “victims” didn’t think of as abuse.

In the press release, Vicki Hyde said, “It’s taken society a long time to recognize that domestic violence is a serious problem. It is vital, if we are to address this issue effectively, that research provides accurate, meaningful information on which policies can be based. By limiting its scope to men only and by defining abuse so broadly, Hitting Home misses the mark. It’s a great shame, since we desperately need well-founded social policies. This will disadvantage the women most vulnerable to serious violence. Surely, you can’t classify the experience of being strangled or threatened with a knife alongside hearing a rude comment about your brother.”

At our recent conference, Skeptic Hugh Young challenged the award. His remarks and others follow. Further contributions will appear in the next Skeptic.

The Skeptics awards for excellence went to journalists with TVNZ, Metro, and the Listener.

“TVNZ’s Assignment series shows that we can still have thoroughly researched, critical documentaries on television,” according to Ms Hyde. The Skeptics praised Assignment‘s “The Doctor Who Cried Abuse,” an investigation of a Dunedin physician whose unwarranted diagnoses had wreaked havoc on New Zealand families. “Ellis Through the Looking Glass,” an examination of the Christchurch Civic Creche case, was also singled out for accolades.

Vincent Heeringa of Metro magazine received an award for his article “Weird Science,” on the Auckland Institute of Technology Press, and Listener journalist Noel O’Hare, author of a cover story on False Memory Syndrome, received a Skeptics award for the second year running.