Mass screening programmes have generated considerable controversy in this country. These programmes have inherent limitations, which need to be better understood.

In 1996 the Skeptical Inquirer published an article by John Allen Paulos on health statistics. Among other things this dealt with screening programmes. Evaluating these requires some knowledge of conditional probabilities, which are notoriously difficult for humans to understand.

Paulos presented his statistics in the form of a table; a modified version of this is shown in the table below.

                Have the condition  Do not have the condition     Totals
Test Positive                  990                      9,990     10,980
Test Negative                   10                    989,010    989,020
Totals                       1,000                    999,000  1,000,000
Table 1

Of the million people screened, one thousand (0.1%) will have the condition. Of these, 1% (10) will falsely test negative and 99% (990) will correctly test positive. So far it looks good, but 1% of those who do not have the condition (9,990) also test positive, so that the total number who test positive is 10,980. Remember that this is a very accurate test. So what is the chance that a random person who is told by their doctor that they have tested positive actually has the condition? The answer is 990/10,980, or about 9%.

In this hypothetical case the test is 99% accurate, a much higher accuracy rate than any practical test available for mass screening. Yet over 90% of those who test positive have been diagnosed incorrectly.
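The arithmetic behind the 9% figure can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are invented for this example, and it assumes the false negative and false positive rates apply uniformly across the screened population.

```python
def positive_predictive_value(base_rate, false_negative_rate, false_positive_rate):
    """Chance that a person who tests positive actually has the condition."""
    # Share of the whole population that has the condition AND tests positive.
    true_positives = base_rate * (1 - false_negative_rate)
    # Share of the whole population that is healthy but tests positive anyway.
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# The hypothetical 99%-accurate test: base rate 0.1%, 1% false results each way.
ppv = positive_predictive_value(0.001, 0.01, 0.01)
print(f"Chance a positive result is correct: {ppv:.1%}")
```

With these inputs the script prints 9.0%, matching 990/10,980.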

In the real world (where tests must be cheap and easy to run) a very good test might achieve 10% false negatives and positives. To some extent the total percentage of false results is fixed, but screening programmes wish to reduce the number of false negatives to the absolute minimum; in some countries they could be sued for failing to detect the condition. This can only be done by increasing the chance of false positives or inventing a better test. Any practical test is likely to have its results swamped with false positives.

Consider a more practical example where the base rate is the same as previously, but there are 10% false negatives and positives, i.e. the test is 90% accurate. Again 1 million people are tested (see Table 2 below).

                Have the condition  Do not have the condition     Totals
Test Positive                  900                     99,900    100,800
Test Negative                  100                    899,100    899,200
Totals                       1,000                    999,000  1,000,000
Table 2. Base rate is 0.1%. Level of false positives = 10%; level of false negatives = 10%

This time the total number testing positive is 100,800. But nearly one hundred thousand of them do not have the condition. The chance that any person who tested positive actually has the condition is 900/100,800, or a little under 1%. This time, although 90% of all those screened have been correctly classified, over 99% of those who test positive have been diagnosed incorrectly.
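For readers who want to check the head counts, the four cells of such a table can be rebuilt from the population size, the base rate and the two error rates. The sketch below is illustrative only; the function and key names are made up for this example, and it rounds to whole people.

```python
def screening_table(population, base_rate, false_negative_rate, false_positive_rate):
    """Return the four cells of a screening table as whole-number head counts."""
    have = round(population * base_rate)           # people with the condition
    have_not = population - have                   # people without it
    false_neg = round(have * false_negative_rate)  # missed cases
    false_pos = round(have_not * false_positive_rate)  # healthy but flagged
    return {
        "true_positive": have - false_neg,
        "false_positive": false_pos,
        "false_negative": false_neg,
        "true_negative": have_not - false_pos,
    }


# One million people, base rate 0.1%, 10% false negatives and 10% false positives.
cells = screening_table(1_000_000, 0.001, 0.10, 0.10)
total_positive = cells["true_positive"] + cells["false_positive"]
share_correct = cells["true_positive"] / total_positive

for name, count in cells.items():
    print(f"{name:>15}: {count:>9,}")
print(f"Positives who really have the condition: {share_correct:.2%}")
```

The 900 true positives are buried among 99,900 false ones, which is why fewer than one in a hundred positive results is correct.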

In both these cases the incidence of the condition in the original population was 0.1%. In the first example the screened population testing positive had an incidence two orders of magnitude higher than the original population, but this was unrealistic. In the second example those testing positive in the screened population had an incidence one order of magnitude higher than the general population.
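The "orders of magnitude" comparison can be made concrete by dividing the incidence among those who test positive by the incidence in the whole population. The following is a rough sketch under the same assumptions as the two examples (equal false positive and false negative rates); the function name `enrichment` is invented here.

```python
import math


def enrichment(base_rate, error_rate):
    """How many times more common the condition is among those who test
    positive than in the population as a whole.

    Assumes false positives and false negatives occur at the same rate.
    """
    true_pos = base_rate * (1 - error_rate)
    false_pos = (1 - base_rate) * error_rate
    incidence_among_positives = true_pos / (true_pos + false_pos)
    return incidence_among_positives / base_rate


for label, error_rate in [("99% accurate", 0.01), ("90% accurate", 0.10)]:
    factor = enrichment(0.001, error_rate)
    print(f"{label}: {factor:.1f} times the base rate, "
          f"about {math.log10(factor):.1f} orders of magnitude")
```

The factors come out at roughly 90 and roughly 9, so "two orders" and "one order" are slight round-ups in each case.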

This is what a good mass screening test can do: raise the incidence of the condition among those who test positive by one order of magnitude above that in the general population. However, any person who tests positive is still unlikely to have the condition, and all who test positive must now be further investigated with a better test.

So screening programmes should not be aimed at the general population, unless the condition has a very high incidence. Targeted screening does not often improve the accuracy of the tests, but it aims at a sub-population with a higher incidence of the condition. For example, screening for breast cancer (a relatively common condition anyway) is aimed at a particular age group.
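The effect of targeting can be illustrated by holding the test fixed and varying only the base rate. In this sketch the 10% error rate comes from the earlier example, but the three base rates are hypothetical values chosen purely for illustration, not figures for any real sub-population.

```python
def chance_positive_is_correct(base_rate, error_rate=0.10):
    """Chance that a positive result is right, for a test with equal
    false positive and false negative rates (an assumption of this sketch)."""
    true_pos = base_rate * (1 - error_rate)
    false_pos = (1 - base_rate) * error_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical base rates: general population, a targeted group, a high-risk group.
results = {rate: chance_positive_is_correct(rate) for rate in (0.001, 0.01, 0.10)}

for rate, chance in results.items():
    print(f"Base rate {rate:.1%}: a positive result is correct {chance:.1%} of the time")
```

The same test that is right about a positive result less than 1% of the time in the general population is right about half the time when one person in ten has the condition.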

Humans find it very difficult to assess screening, and doctors (unless specifically trained) are little better than the rest of the population. It has been shown fairly convincingly that data are most readily understood when presented in tables as above. For example the data in Table 3 was presented to doctors in the UK. Suppose they had a patient who screened positive; what was the probability that that person actually had the condition?

When presented with the raw data, 95% of them gave an answer that was an order of magnitude too large. When shown the table (modified here for consistency with previous examples) about half correctly assessed the probability of a positive test indicating the presence of the disease.

                Have the condition  Do not have the condition     Totals
Test Positive                8,000                     99,000    107,000
Test Negative                2,000                    891,000    893,000
Totals                      10,000                    990,000  1,000,000
Table 3. Base rate is 1%. False negative rate = 20%; false positive rate = 10%

This time the total number who test positive is 107,000. But nearly one hundred thousand of them do not have the condition. The chance that any person who tested positive actually has the condition is 8,000/107,000, or about 7.5%. Now remember that nearly half the UK doctors, even when shown this table, could not deduce the correct result. If your doctor suggests you should have a screening test, how good is this advice?

Patients are supposed to be supplied with information so that they can make an informed decision. Anybody who presents for a screening test in NZ may find it impossible to do this. My wife attempted to get the data on breast screening from our local group. She had to explain the meaning of “false negative”, “false positive” and “base rate”. The last is a particularly slippery concept. From UK figures the chance of a 40-year-old woman developing breast cancer by the age of 60 is nearly 4% (this is the commonest form of cancer in women). However, when a sample of women in the 40-60 age group is screened, the number who should test positive is only about 0.2%. Only when they are screened each year will the total of correct positives approach 4%.

The number of false positives (again using overseas figures) is about 20 times the number of correct positives, so a woman in a screening programme for 20 years will have a very good chance of at least one positive result, but a fairly low probability of actually having breast cancer. I do not think NZ women are well prepared for this.
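A rough sense of that "very good chance" can be had from the figures quoted above: about 0.2% correct positives per screening round, and about 20 times as many false positives. The sketch below rests on two loud assumptions that are not in the source figures: that screening is annual, and that each round is independent of the last. Real results are correlated from year to year, so treat the output as a ballpark only.

```python
# Figures quoted in the text.
correct_positive_share = 0.002   # about 0.2% of women screened in any one round
false_to_correct_ratio = 20      # false positives are about 20 times as common

# Implied chance of a false positive in a single round (about 4%).
false_positive_per_round = correct_positive_share * false_to_correct_ratio


def chance_of_at_least_one(per_round, rounds):
    """Chance of one or more false positives over several rounds.

    ASSUMPTION: every round is independent, which real screening is not.
    """
    return 1 - (1 - per_round) ** rounds


# ASSUMPTION: one screening round per year for 20 years.
annual = chance_of_at_least_one(false_positive_per_round, 20)
print(f"At least one false positive in 20 annual rounds: {annual:.0%}")
```

On these assumptions a little over half of the women screened would receive at least one false positive during the 20 years.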

The Nelson group eventually claimed that the statistics my wife wanted on NZ breast cancer screening did not seem to be available. But, they added, “we (the local lab) have never had a false negative.” From the recent experience of a close friend, who developed a malignancy a few months after a screening test, we know this to be untrue. What they meant was that they had never seen a target and failed to diagnose it correctly as a possible malignancy requiring biopsy. This may have been true but it is no way to collect statistics.

Screening for breast cancer is generally aimed at the older age group. In the US a frequently quoted figure is that a woman now has a one in eight chance of developing breast cancer, which is higher than in the past. This figure is correct but it is a lifetime incidence risk; the reason it has risen is that on average women are living longer. The (breast cancer) mortality risk for women in the US is one in 28. A large number who develop the condition do so very late in life and die of some other condition before the breast cancer proves fatal.

Common Condition

Breast cancer is a relatively common condition and would appear well suited for a screening programme. The evaluation of early programmes seemed to show they offered considerable benefit in reducing the risk of death. However, later programmes showed less benefit. In fact, as techniques improved, screening apparently became less effective. This caused some alarm, and a study published in 1999 by the Nordic Cochrane Centre in Copenhagen looked at programmes worldwide and attempted to better match screened populations with control groups. The authors claimed that women in screening programmes had no better chance of survival than unscreened populations. The reaction of those running screening programmes (including those in NZ) was to ignore this finding and advise their clients to do the same.

If there are doubts as to the efficacy of screening for breast cancer, there must be greater doubts about screening for other cancers in women, for other cancers are rarer. Any other screening programme should be very closely targeted. Unfortunately the risk factors for a disease may make targeting difficult. In New Zealand we have seen cases where people outside the target group have asked to be admitted into the screening programme, so they also “can enjoy the benefits”. Better education is needed.

Late-onset diabetes is more common among Polynesians than among New Zealanders in general, and Polynesians have very sensibly accepted that this is true. Testing Polynesians over a certain age for diabetes makes sense, particularly as a test is quick, cheap and easy to apply. Testing only those over a certain body mass would be even more sensible but may get into problems of political correctness.

Cervical cancer is quite rare so it is a poor candidate for a mass screening programme aimed at a large percentage of the female population. The initial screening is fast and cheap. If the targeted group has an incidence that is one order of magnitude higher than the general population, then the targeting is as good as most tests. Screening the whole female population for cervical cancer is a very dubious use of resources.

My wife and I were the only non-locals travelling on a bus in Fiji when we heard a radio interview urging “all women” to have cervical screening done regularly. The remarkably detailed description of the test caused incredible embarrassment to the Fijian and Indian passengers; we had the greatest difficulty in concealing our amusement at the reaction. The process was subsidised by an overseas charity. In Fiji, where personal hygiene standards are very high, and (outside Suva) promiscuity rates pretty low, and where most people pay for nearly all health procedures, this seemed an incredibly poor use of international aid.

Assessment Impossible

Screening for cervical cancer has been in place in NZ for some time. Unfortunately we cannot assess the efficacy of the programme because proper records are not available. An attempt at an assessment was defeated by a provision of the Privacy Act. The recent case of a Gisborne lab was really a complaint that there were too many false negatives coming from a particular source. However this was complicated by a general assumption among the public and media that it is possible to eliminate false negatives. It should be realised that reducing false negatives can only be achieved by increasing the percentage of false positives. As can be seen from the data above, it is false positives that bedevil screening programmes.

Efforts to sue labs for false negatives are likely to doom any screening programme. To some extent this has happened in the US, with many labs refusing to conduct breast X-ray examinations, as the legal risks from the inevitable false negatives are horrendous.

Large sums are being spent in NZ on screening programmes; taxation provides the funds. Those running the programmes are convinced of their benefits, but it is legitimate to ask questions. Is this spending justified?

Some Post-Scripts:

January 15 2000 New Scientist P3: Ole Olsen & Peter Gøtzsche of the Nordic Cochrane Centre in Copenhagen published the original meta-analysis of seven clinical trials in 2000. The resulting storm of protest, particularly from cancer charities, caused them to take another look. They have now reached the same conclusion: mammograms do not reduce breast cancer deaths and are unwarranted.

October 2001: In recent TV interviews some people concerned with breast cancer screening in NZ were asked to comment on this meta-analysis. Once again the NZ commentators stated firmly that they were certain that screening programmes in NZ “had saved lives” but suggested no evidence to support their view.

March 23 2002 New Scientist P6: The International Agency for Research on Cancer (IARC) funded by the WHO claims to have reviewed all the available evidence. They conclude that screening women below the age of 50 is not worthwhile. However, screening women aged from 50-69 every two years reduces the risk of dying of breast cancer by 35%.

According to New Scientist, the figures from Britain are that of 1000 women aged 50, 20 will get breast cancer by the age of 60 (2%); of these six will die. Screening every two years would cut the death rate to four. [It is obvious that these are calculations, not the result of a controlled study!]
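The British figures quoted above can be turned into relative and absolute terms with simple arithmetic. The variable names below are illustrative; the inputs are taken directly from the paragraph and cover the ten years from age 50 to 60.

```python
# Figures quoted from New Scientist, per 1,000 women followed from age 50 to 60.
cohort = 1000
deaths_without_screening = 6
deaths_with_screening = 4

deaths_avoided = deaths_without_screening - deaths_with_screening

# Relative reduction: the fraction of breast cancer deaths that screening avoids.
relative_reduction = deaths_avoided / deaths_without_screening

# Absolute reduction: the change in risk for any one woman in the cohort.
absolute_reduction = deaths_avoided / cohort

# How many women must be screened over the ten years to avoid one death.
screened_per_death_avoided = cohort / deaths_avoided

print(f"Relative reduction in deaths: {relative_reduction:.0%}")
print(f"Absolute reduction in risk:   {absolute_reduction:.1%}")
print(f"Women screened per death avoided: {screened_per_death_avoided:.0f}")
```

The relative reduction is about a third, close to the 35% quoted by the IARC, but the absolute reduction is 0.2%: about 500 women must be screened for ten years to avoid one death.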

The IARC states that organised programmes of manual breast examination do not bring survival benefits (they call for more studies on these). If NZ has similar rates then screening programmes aimed at 50-60 year old women should save approximately 50 lives per annum.
