Opening a Dore?

A learning difficulties programme that purports to re-train the cerebellum makes some impressive claims which don’t stand up to close scrutiny.

DORE is an organisation that claims to treat learning difficulties without drugs. Their programmes supposedly

“… tackle the root cause of learning difficulties by improving the efficiency of the cerebellum – the brain’s ‘skill development centre’ – and the part of the brain now understood to play a significant role in learning, coordination, emotional control and motor skills.”

Recently the company held a series of information sessions to coincide with the opening of a new Dore centre in Lower Hutt, to go with their existing centres in Auckland and Christchurch. I attended a session to see what it was all about.

As we entered the room, video testimonials were playing, showing parents and their children claiming dramatic results for a range of learning disabilities and conditions, such as Asperger’s syndrome. An information pack was handed out, which included newspaper clippings and another testimonial. It claimed that Dore gets to the “core of learning difficulties”, “actively improves ability to learn”, is drug-free, based on scientific principles, is personally tailored and is not a “quick fix” or “soft option”. A FAQ stated that people who successfully complete the programme did the exercises accurately and consistently and if improvements don’t occur this is mainly because people are not sticking to the routine.

A video introduced Wynford Dore, who stated that his daughter had learning problems for which he had searched for a solution. Then a mother and her son related how the son had dyslexia and behavioural problems at school, which the mother was only made aware of after a few years when a teacher spoke to her. The child was already on a three-year programme with SPELD when the family discovered Dore; they followed this programme for a year concurrently with SPELD. They claimed significant improvement about three months after starting Dore.

The presentation went on to claim that approximately 16 percent of the New Zealand population had learning difficulties, with only four percent diagnosed; these were said to affect one in six New Zealanders. It was difficult to locate comparative figures, but SPELD estimates that seven percent of children have a specific learning disability, which would equate to about 50,000 school children.

The Dore programme claimed to assist with dyslexia, ADD/ADHD, dyspraxia (motor skills) and Asperger’s syndrome, and is targeted at people aged seven and over. The presenter briefly went over the typical feelings of those struggling with learning difficulties, and described how they thought these conditions manifest – as a multitude of literacy, numeracy, memory, attention, coordination, social and emotional problems. This was all claimed to be due to an inefficient cerebellum. Dore, they said, addresses underlying causes rather than symptoms (where have I heard that before, I wonder?).

The conditions treated all allegedly have a physiological basis and have nothing to do with other factors. Figures were presented, said to be from the Otago University longitudinal study and purporting to show that dyslexics were significantly disadvantaged compared with peers (with the consequent implication that treatment would help prevent this disadvantage).

Dyslexic students were more likely to leave school with no qualifications, much less likely to have a Bachelors degree, and none achieved Masters/Doctorate levels. Average income was more than $10,000 less than their peers. However, there was no word on whether this lack of achievement could be generalised to all people suffering dyslexia, given the long time period of the study and the considerable changes in educational services over that time.

In a further video presentation a Dr Sara Chamberlain claimed the cerebellum governs the automatic performance of simple tasks, and that this facility can be enhanced through exercise. We then heard about Dore’s assessment process. Following an initial phone consultation, prospective clients fill out a questionnaire, and there are a variety of tests and a medical assessment. Posture and ocular-motor skills are tested, and then dyslexia is screened for, apparently using a standard tool. Other conditions such as ADD/ADHD are assessed using the DSM-IV manual; the whole initial appointment takes three to four hours. The programme, it appears, is not suitable for everyone. Clients then have 1.5-hour interviews at three-monthly intervals and on completion of the course.

It was claimed that many scientific papers link the cerebellum with learning, attention, etc; these can be found on their website. They say they have done research themselves and written papers, and will provide details on request. They mentioned ongoing studies into ADHD at Ohio State University and by another US office; the Ohio State University testing appears to be a pilot study, but I couldn’t find any references to the other. A testimonial was introduced from a Dr Edward Hallowell, presented as an expert in ADD and ADHD. When I checked on this later, I found that he appears to be involved with the Dore programme and would hardly be an unbiased commenter.

We were presented with figures from self-evaluation claiming to show 86.5 percent of children and 88.5 percent of adults showed progress in literacy and numeracy after taking the Dore programme. For coordination the respective figures were 81 percent and 75.4 percent, and for social skills 78.1 percent and 72.6 percent. The exercise programme was claimed to be individualised, unlike other programmes like ‘Brain Gym’ that aren’t (for more information on Brain Gym see Ben Goldacre’s Bad Science blog).

The regime

The exercises take 10 minutes twice daily, with a mandatory four-hour break between; they have 400 exercises and 16 levels that could be completed. These involve such things as using a wobble board, or an exercise ball, or throwing and catching mini bean bags. Again, the cerebellum was claimed to be receiving, processing and automating sensory information from somatosensory, visual and vestibular inputs. The cerebral cortex (the thinking part of the brain) is apparently supposed to integrate all of this but with the conditions Dore say they treat, it is claimed the cerebellum isn’t working with the cerebral cortex.

The idea that defects in the cerebellum cause learning difficulties would seem to be a classic case of correlation not necessarily equating with causation. As noted by Oxford University psychologist Dorothy Bishop in her 2007 paper “Curing dyslexia and ADHD by training motor co-ordination: Miracle or myth?”, cause and effect would seem to be not so simple as presented at the session.

“The notion that the cerebellum might be implicated in some children’s learning difficulties is not unreasonable: both post-mortem and imaging studies have reported cerebellar abnormalities. Furthermore, some studies have reported behavioural deficits involving balance and automatisation of motor skills in a subset of people with dyslexia, consistent with a cerebellar deficit hypothesis. However, it is premature to conclude that abnormal cerebellar development is the cause of dyslexia, rather than an associated feature. Many people with dyslexia do not show any evidence of motor or balance problems. Furthermore, the cerebellum is a plastic structure which can be modified by training, raising the possibility that cerebellar abnormalities might be a consequence of limited experience in hand-writing in those with poor literacy.”

The programme used to use a book, but is now web-based. Exercises are carried out and then “marked” according to their criteria. They stressed that compliance was key, along with parental support. Times for completion vary, but are usually 12-14 months, with a weaning process at the end of the programme where the exercises are gradually wound down. The course is expensive, costing almost $5,400 or a little less for a one-off payment. They did say that they gave three “sponsored” places per month, but didn’t describe what exactly this entailed, outside of mentioning that it was for low income families and that children with a medical diagnosis could apply for a disability allowance through WINZ which could be used to access their programme.

A few questions

During question time, they were asked how they could be sure the child in the video testimonial had improved because of Dore and not the other programme he was on. The answer was fudged: they said they didn’t diagnose but looked for “sensory processing problems” and it was those they treated, which then enabled the person to learn. In other words, if there was improvement, it was Dore, not any other intervention specifically targeted at helping the person learn to overcome their disability and learn to read.

Another questioner asked why it was so costly given that the programme is mostly self-directed. They equivocated, talking about staffing costs, the website, and having support available. They said that braces cost much more and are basically cosmetic, whereas their programme “benefited a person for life” and so was worth the investment. Yet another question was about the doctors – why wouldn’t they use paediatricians and other suitably qualified professionals? They stated that for their purposes, the level of medical expertise was sufficient.

Dore has obviously learned from experience following actions taken by overseas advertising standards authorities, and no longer makes claims of “100 percent cure” and “miracle cure” for the conditions it claims to treat. In fact they seemed to be reasonably realistic in introducing caveats such as “it doesn’t work for everyone”. Despite this, they still claim to be proven to help overcome learning difficulties even though the evidence base is weak to non-existent. Although they make many claims to be “scientific” and have an extensive list of papers on their website, when the UK Advertising Standards Authority considered a complaint against Dynevor, Dore’s parent company, it assessed the studies submitted in support as poor, lacking control groups, and not supporting the treatment claims made:

“The ASA noted Dynevor’s interpretation of the ad. We considered, however, in the absence of any qualifying text to the contrary, that consumers were likely to understand the claim “Need help with Dyslexia, ADHD, Dyspraxia or Asperger’s?” to mean that the DORE programme could help treat the named conditions. We also considered that we would need to see robust, scientific evidence to support the claim. We noted that the two studies provided by Dynevor assessed the effect of the exercise-based DORE programme on children with reading difficulties and children and adults with ADHD respectively…

“… As neither the first nor second study referred to Asperger’s syndrome and only two participants in the first study had dyspraxia, we considered that the evidence was inadequate to support claims to treat those conditions. With regards to dyslexia and ADHD, we did not consider that the studies were sufficiently robust to support the treatment claims for those conditions, and we therefore concluded that the claim was misleading…”

The average person would have trouble verifying claims about the role of the cerebellum and the ability of an exercise programme to improve function. If it really was that easy everyone would be using Dore’s exercises. Their claim that dyslexia, dyspraxia, ADD/ADHD and Asperger’s syndrome have one cause, one cure, is insufficient. The conditions they claim to treat are disparate and cause and effect is not established. There was little discussion of how cerebellar function or dysfunction is assessed, or of the relevance of their testing of such things as eye tracking, and no discussion at all of how the exercises impact on the cerebellum or how outcomes are measured. Bishop says:

“The gaping hole in the rationale for the Dore Programme is a lack of evidence that training on motor-coordination can have any influence on higher-level skills mediated by the cerebellum. If training eye-hand co-ordination, motor skill and balance caused generalized cerebellar development, then one should find a low rate of dyslexia and ADHD in children who are good at skateboarding, gymnastics or juggling. Yet several of the celebrity endorsements of the Dore programme come from professional sportspeople.”

There is little real involvement from the company once the programme has commenced, with only a few appointments to follow up after the initial assessment. Many who join the programme apparently don’t have a formal diagnosis of the conditions Dore claims to treat, and they won’t get that from the company, as they state they don’t diagnose anything other than the alleged cerebellar problems.

It’s not surprising that some would see benefits though – the commitment and parental support required to do the programme would alone benefit some children. Then there is regression to the mean, the Hawthorne effect (subjects modify an aspect of their behaviour being experimentally measured simply in response to being studied) and natural improvements with growing maturity. On retesting later, there may appear to be improvements due to the client having done the test before and being aware of what is required. Many would concurrently use other services such as reading recovery, and Dore themselves recommend that if the child has spare time, that it is spent practising reading and writing. That extra practice reading could be extremely beneficial.

The high cost of the programme is concerning, especially when they acknowledge that not everyone will benefit. Despite this, they had parents travelling from the Wellington region to undertake assessments in Auckland – hence the opening of an office in the region. There may also be a financial risk to participants; Dore UK and Australia have both failed, leaving clients out of pocket. In New Zealand Dore was placed in liquidation in 2009 and the Companies Office states: “This Company currently has Liquidators, Receivers or Voluntary Administrators appointed” with the liquidators due to report again in May 2011.

Orthodoxy? – Revisiting the Cartwright Report (Part 2)

NZ Skeptic issues 96, 97 and 98 contained articles presenting different viewpoints on the ‘Unfortunate Experiment’ at National Women’s Hospital and its aftermath. Wellington registered nurse and NZ Skeptics treasurer Michelle Coffey continues the discussion in this web-only special.

I wrote my original article (NZ Skeptic 97) with the intention that it could stand alone as a more thorough discussion of the findings of the Cartwright Report and later research. This was because a number of important issues raised by the report, many of them systemic, have been almost lost in the debate. While I’m sure that sufficiently interested readers can source the relevant material and judge for themselves, Linda Bryder has responded in NZ Skeptic 98, and the statements made merit a response to clarify several points. I referenced Bryder’s book for a complete review of the topic but did not address it in the original article. While it does deal with aspects of the ‘Unfortunate Experiment’, the book ultimately fails to provide a complete assessment of the matter, because it neither investigates key figures such as McIndoe nor deals with the health care system (in particular its politics) as opposed to social movements.

1. “There was no medical certainty about the proportion of cases of CIS…”

None of the references support the contention that there was no medical certainty about the proportion of CIS cases that would advance to invasion, and in any case the proportion of cases isn’t the point – it’s whether CIS was considered to be a precursor of invasive cancer. It appears that it was. In the Cartwright Report1 (p23) a compilation of studies was introduced into evidence giving figures that indicate that, over time, a significant proportion would progress to invasion.

The 1976 Editorial2 cited is discussing screening and states “The report faces up to the problems which still cause fierce controversy – those of the natural progression and regression of early lesions, the discrepancy between total [CIS] cases and the combined number of clinical invasive cases, and the incidence and mortality rates.” The Walden Report3 it is referring to states unequivocally that “The significance of [CIS] as a precursor of invasive disease has been recognised for more than 3 decades. Several series of patients, followed for months or years, have demonstrated progression from [CIS] to invasive disease at rates ranging from 25 to 70%.” Any “controversy” appears to have lain in where earlier dysplastic changes fit, rather than in the concept of progression from pre-invasive lesion to invasion. The report placed these earlier changes a decade or so prior to invasive disease as a precursor state, stating “the concept that progressive degrees of cervical dysplasia are part of the natural history of neoplastic disease of the cervix now seems firm.” This is relevant to developing a screening programme, given that there is a window of many years in which the condition could be detected and treated. Ostor4, in his statement “The ultimate fate of patients with CIN is the most controversial issue facing investigators interested in cervical neoplasia”, is discussing issues similar to those discussed in the Walden report, which matter for assessing the relevance of findings and being able to predict the behaviour of these ‘atypias’. Most studies ended at the point of CIS. Ostor looked not only at progression to invasion but also at the likelihood of regression, persistence and progression to CIN3 (11% in the case of CIN1), concluding that the probability of invasion increases with severity of dysplasia, but that there is potential for regression, which reflects on therapy.

“One man’s dysplasia is another man’s carcinoma” (notably without insertions to influence the reader to place a particular meaning on it) is a statement that crops up frequently. One issue is the correlation between cytology and histological confirmation; while this wasn’t perfect, it was generally agreed that smears could reliably indicate an existing lesion. Histological confirmation was required, but there could be a lack of agreement between pathologists and laboratories on the histological criteria, meaning that the precise differentiation between dysplasia and CIS varied. These uncertainties don’t seem to have impacted on the confidence of pathologists regarding screening for cervical malignancies, and grading of a lesion was seen by surgical pathologists as more a statement of probability of progression, which had limited applicability in clinical management, as noted in Löwy’s history5. The precise definition didn’t matter as much as understanding that it was the same disease that was being managed. This is nothing other than a fairly typical debate, as biology and medicine rarely, if ever, give certainties.

1. “Coffey cites 1958 ‘official policy’… to show this.”

It’s important to note for clarity that there could be variation in policies in other areas, but what is more critical in this case was policy at NWH, the hospital where Green practised, which set the standard of care. Policies at NWH evolved over a period of time. In 1955 the formation of a cancer team to which all cases of carcinoma of the cervix were to be referred for treatment was unanimously supported. Over the next ten years, policies regarding the diagnosis and treatment of CIS and invasive cancer were regularly reviewed. This wasn’t just agreed to at a meeting of “…only nine senior consultants…”; the decision was made at a formal meeting of the Hospital Medical Committee, with a majority, which indicates that the committee was happy with the level of evidence for the policy. The clear majority and evolving policy don’t seem to fit too well with the narrative that there was considerable medical uncertainty and controversy about CIS and its progression.

2. “Professor Barbara Heslop explained this more appropriately…”

Heslop’s6 article is one to which I referred in writing my article, as I found some aspects of it informative. However, it is based on the opinions of the author, so it’s unsafe to use this article to make certain statements about Green. Heslop considers that Green was doing research but seeks to place this in context, stating “Herb Green aimed to ‘prove’ his hypothesis by carefully observing that dysplasia did not lead to cancer…Unfortunately, the proposed methodology was equally appropriate for showing dysplasia did lead to cancer. Paradoxically, and I am sure unintentionally, he ended up demonstrating…more convincingly than had been done before, the transition of dysplasia to cancer.” It was demonstrable that Green considered his work as a study initiated to test a theory, and his 1974 paper said (p65) “This…represents the nearest approach yet to the classical method of deciding such an issue as the change or not of a disease from one state to another – the randomised controlled trial. It has not been randomised and it is not well controlled but it has at least been prospective…”

While Baker may have had the presumption that the therapeutic relationship would predominate, little suggests this happening in the case of Green. Whether he knew about such things as falsifiability, Green set out to prove his ‘dormant cancer’ idea despite indications early on that following such patients was unsafe (such as three cases of invasive disease in patients followed with positive cytology occurring by 1969). If the therapeutic relationship was predominant, those cases should have prompted reconsideration of the hypothesis; instead they were reclassified and removed from the study.

3. “The 1966 management protocol was to ‘extend’ conservative treatment…”

What seems to be being said here is that “under 35” doesn’t mean what it says, but rather that older patients can be included as well. It should mean what it says, as this was a safeguard intended to protect patients, which Green then breached. With ageing, physiological changes make it more difficult to view areas of abnormality, and Green and his colleagues were aware of this and the additional risks. The report (p37) stated “As a woman gets older, the squamocolumnar junction is more likely to lie in the endocervical canal and therefore be invisible to the colposcopist.” This means that it can’t be determined whether lesions extending further are suspicious, and it was impossible to get a sample without a cone biopsy. Older women were more likely to have unsuspected invasive carcinoma. The use of words like treat is misleading, as the intention was not to extend conservative treatment but to monitor women with positive cytology to fulfil the aim of the proposal. As an example, the proposal stipulated punch biopsies and used the words treat and treated (p21: “four have been treated by punch biopsy alone”); however, this was regarded as a diagnostic procedure. The only way a punch biopsy could be a ‘treatment’ is if somehow, by accident or design, the biopsy managed to obliterate a small lesion.

4. “Coffey presents this as a negative outcome, as if it was unnecessary outcome for the women.”

It was. There is a difference between ongoing monitoring which often can be done at primary care level and repeated attendances at a hospital over many years for multiple tests and interventions. Patient 4M (p44) was first admitted in 1970 with abnormal smears. In between 1970 and 1983 she had 38 appointments and six biopsies (wedge, ring, cone, surface) were performed with two occasions being histologically incomplete. A review of patient notes (p42) showed many women had more than one cone biopsy and in some cases up to six. Testimony showed that doing this more than twice was not considered unless under exceptional circumstances and doing this procedure could have effects such as stenosis or haemorrhage and make later evaluation difficult. Bonham testified that this was a dangerous practice and with the third or fourth conisation, it was probably a greater risk than hysterectomy.

Nothing in medicine is benign, and there are obligations to treat patients ethically. This includes minimising unnecessary medical procedures as far as possible, as every intervention entails a number of risks. CIS is a highly treatable condition that could have been simply excised. Instead, over a period of time, many women had a number of procedures that were unnecessary and posed excess risk to them, yet still left them with positive cytology and hence at risk of progression, with its own complications. The associated disruption, pain and discomfort of these multiple interventions shouldn’t be trivialised.

5. Regarding the infant vaginal swabs, a press release by Judge Cartwright’s counsel stated “Mothers were told of the tests.”

Any kind of consent would have sufficed. Judge Cartwright stated (p141) “… there was no provision made to comply with the fundamental requirement that children are not included in research without the consent of their guardians.” This was not a test but a trial, and was non-therapeutic research that held no benefit for the infant. Green quickly realised, after 200 babies had undergone the procedure, that it was a waste of time, and lost interest in the study without communicating this to the nursing staff. This led to over 2000 babies being subjected to an unnecessary and potentially harmful vaginal vault smear for the purposes of research without the consent of their parent or guardian.

It is difficult to see that the randomisation in Green’s 1972 “R series” radiotherapy and hysterectomy trial conformed to international practice. Randomisation is aimed at preventing systematic differences between groups and preventing bias. In this case, although the selection criteria were made in advance, there was no allocation of patients prior to anaesthesia, grading and decision on surgical treatment, so there was no concealment. Enrolment could have been influenced by biases such as the need to enrol sufficient patients into the study, along with the potential for further bias to be added with the use of coin tossing. The patients were not given any opportunity to consent, and were misled about the treatment decision. Testimony on p170 states “Dr Green and myself and others discussed this question of informing women in the trial about it when it was initiated in 1972. We decided in the end not to tell patients about the trial. We told them they would be examined under anaesthetic when the most appropriate mode of treatment would be decided and then we would proceed accordingly.”

I can contrast this lack of any kind of consent from the parents or “R series” patients with the oral consent obtained by Sir William Liley for his intra-uterine transfusions, where he sufficiently informed the patient of the possible risks and that the treatment was experimental. His case study7, published in 1963, states “the patient and her husband were an intelligent couple, and the prognosis for the foetus, the possibility and uncertainty of intrauterine transfusion, and the potential hazards to the mother were fully explained to and discussed with them.” This was not the case with Green and his research projects, as no real attempt was made to obtain any kind of informed consent.

6. “Despite writing this, Coffey herself makes it clear that the two groups…had nothing to do with the two groups whose records Green analysed.”

This is an assertion, and no reason is given for it. As such, there is nothing there to counter other than to say they had everything to do with those groups. McIndoe et al8 was retrospective while Green’s research was prospective, which made a difference in how the study was conducted, but they were measuring the same thing, as Green’s 1974 paper (p65) describes: “This series of 750 cases of in situ cervical cancer, and the following of 96 of them with positive cytology for at least two years…” The McIndoe paper was also a comparison of two groups of women, one with normal follow-up cytology and one without, and was the final paper that Green never wrote, completing follow-up on the patients who were the subjects of his study. In my discussion, I highlighted the summary in the paper of patients who were included in the punch biopsy special series, and that alone should make clear the relationship between the “special series” and the study. I’m sure if Green could have asserted the same he would have, but he couldn’t. The report didn’t rest on this paper alone but reviewed 1200 patient files, and 226 were used as exhibits.

7. “Cartwright accepted this as ‘accurately reflect[ing] the findings of the 1984 McIndoe paper.’”

Except Judge Cartwright did not. This is selective quotation that distorts the statements in the report, and it falls short of what you would expect from a historian, who should take care to fairly represent the context and statements in documents. The statement is from Ch4 “Expressions of Concern”, where the article is addressed as it was the subject of public comment and had prompted the Hospital Board to request an inquiry. This put the article under scrutiny and criticism by some witnesses. Under the title “Was the magazine article accurate?” it is stated that the manuscript was submitted and editorial changes explained, but there were some errors in the article that was finally published. This section states:

1. Significant editorial changes: The matter of accuracy was raised firstly by the authors themselves. In her evidence Sandra Coney drew attention to two editing changes which she considered substantially altered the meaning of sentences in the magazine article.

a. “Twelve of the total number of women had died from invasive cancer as had four, or 0.5%, of the group-one women, and eight, or 6% of the group-two women who had limited or no treatment.”

In the original manuscript the authors had written: “Twelve of the total number of women died from invasive carcinoma. Four (0.5%) of the Group-one women, and eight (6%) of the Group-two women who had limited or no treatment. Thus women in the limited treatment group were twelve times more likely to die as the fully treated group.”

I accept that the unedited material more accurately reflects the findings of the 1984 McIndoe paper. The edited version is not accurate.

It’s clear when looked at in context that the statement was sourced from the original manuscript of the article and those words cannot be attributed to Cartwright. Cartwright is accepting that the original manuscript more accurately reflected the findings of the paper and is being misquoted to say something else. It is of note that in Bryder9 (p33) this statement is used to say “Cartwright too suggested differential treatment. In her report she quoted Coney and Bunkle’s statement that: ‘Twelve of the total number of women died from invasive carcinoma… [etc]’ Cartwright accepted that this accurately reflected the findings of the 1984 McIndoe paper.” The statement is again used misleadingly to say something other than what it actually says, and is being used inconsistently.

8. “How had they ‘returned to negative cytology’?”

McIndoe did not say treatment did not enter the study. The citation in Bryder used to reference this says only “The detailed management of patients is not under consideration in this paper…” The paper looks at the initial management and in some cases more detailed management of patients as Bryder would be aware. Here, it does become evident that there were differences, for instance in group 1 cone biopsies excision was incomplete in 24%, but in group 2, 74% were incomplete with the difference likely to be largely due to management where complete excision is not a necessity. The paper states “…any examination of the natural history of CIS of the cervix must depend on a representative, though incomplete, biopsy specimen on which to base the initial diagnosis. Thereafter, meticulous long-term follow-up of all patients using techniques such as clinical examination, cytology, and colposcopy, and if indicated biopsy, is required.” The paper detailed some limitations, such as small biopsies or possibly trauma eradicating lesions, or inadequate biopsies missing abnormalities. So in answer to that question, it was because initial management in group 1 patients either intentionally or unintentionally was adequate in treating the lesion and restoring them to negative cytology. Of this group only 0.7% had recurrence of CIS. In group 2, follow-up showed continuing positive cytology after initial management either by limited biopsy or incomplete treatment which was ideal for studying the natural history of CIS as set out in the 1966 proposal.

9. “Coffey refers to the 1986 paper…as critical of conservative treatment…”

This paper10 was only briefly mentioned before moving on with discussion of McIndoe et al as there was insufficient space to deal with it in detail. Here long-term follow-up of vulvar carcinoma shows that of 31 patients managed by surgical excision, there were four recurrences and one developed a vulvar carcinoma 17 years later. Four women managed only by biopsy progressed to invasion in 2-8 years, and one additional patient managed with incomplete excision after a lengthy period of observation progressed to invasion. The paper demonstrated that untreated lesions have significant invasive potential. This approach was an extension of Green’s study of CIS of the cervix, and in this case a biopsy cannot be considered treatment at all. While the authors were advocating conservative treatment, this was excision of the lesion, not biopsies or incomplete excision.

10. “Would a modern gynaecologist agree with this assessment?”

The relevant sentence is presented as a statement, but it omits a significant portion of the sentence, which reads “This needs to be explained, as those figures strongly suggest the progression of CIS to invasion when it is and was a totally curable lesion.” Gynaecologists would accept the statement that CIS is a curable lesion which can be readily treated with a variety of local destructive methods, with complete removal of the lesion and reversion to negative cytology, which then prevents the risk of the lesion progressing. In the quoted statement McIndoe et al are referring to group 1 patients, whose cytology had returned to normal. It states “However, contrary to what would be expected, of the 139 group 1 patients with incomplete excision of the original lesion, only five (3.5%) later developed invasive carcinoma. Thus whether or not the lesion is completely excised does not appear to influence the possibility of invasion occurring subsequently.” In this case it didn’t; the rate of recurrence was unexpectedly small, probably due to the initial intervention influencing the condition.

Treatment of a diagnosed lesion is then conflated with cervical cancer at a population level in asking for an explanation of why cervical cancer hasn’t been completely eliminated. In an ideal world this might be possible, but in the real world there are a number of difficulties to be faced in ensuring the entire population at risk is screened and treated if necessary. Green’s conclusion was that screening was not effective; however, the conclusion was unjustified. The report discusses this on page 56: crucially, treatment needs to improve the prognosis, because if subsequent cases are not adequately treated there is little value in screening in the first place. Also, if screening is done in low-risk cases and high-risk populations are missed, screening will be limited in its ability to affect morbidity and mortality. In McIndoe et al, the age-standardised incidence of invasive carcinoma in group 2 was 1141/100,000 compared with 18.2/100,000 in the general population in 1975. This has since dropped considerably.

11. “As stated above, group 1 and group 2 had a similar range of treatments…”

My statements stand on this matter that “this ignores that while many women were treated with various procedures, there was evidence of continuing disease, demonstrating that the intervention was inadequate. This was not followed up, posing a high risk of development of invasive disease.” Proving that CIS is not a premalignant disease required that the area be sampled for diagnosis, but in a way that left the lesion available for further study. In some cases there was no treatment, for instance the punch biopsy series, which only used a diagnostic method. The criteria included that “the colposcopically-significant area is large enough not to be completely excised by the diagnostic punch biopsy.” The intention was to leave the lesion as undisturbed as possible. The use of cone biopsy is covered in questions 5 and 9, as this could also be diagnostic. Of the hysterectomy series, only 4 out of 25 had the procedure for CIS, so the procedure was done but not often specifically for CIS. Either way, women were left with positive cytology which put them at risk.

12. “The methodology of the 2008 paper has been questioned by Sandercock and Burls…”

I would be embarrassed to cite this letter11 as an example of “questioning”. Every paper is flawed to a degree but this isn’t the right criticism to make. They cite a secondary source and claim this explains what they say is a problem with McIndoe et al – “He points out that, not only were the two group retrospectively divided on the basis of persistent abnormal cytology during follow-up and not prospectively as experimental groups for the comparison of different treatment strategies…” They misread the letter12, which does not appear to state anything regarding type of study, and apparently draw from Overton’s misleading statement that “…Green and other senior NWH clinicians endorsed policy changes in dysplasia management. Younger women were to be continuously monitored, by repeat smears, colposcopy, lesser biopsies and appropriate more major surgery if evidence of early cancer.” which omits mention of Green’s role and his published studies. Sandercock and Burls then reach the erroneous conclusion that McIndoe’s research should have been prospective and should have followed different treatments, without realising that prospective research had already been done by Green. They cannot have read McIndoe et al despite citing the paper, otherwise they would have seen that the paper outlined the 1966 proposal. A few minutes’ reading would have shown the difference between the statements, which they should have checked if they were honestly critiquing the study.

Sandercock and Burls then claim a similar “problem” with McCredie et al even though they are aware it was retrospective. The criticism might be correct for prospective studies that ask a question and look forward, such as Green’s, as this type of study should assess outcomes relative to interventions, but retrospective studies are meant to pose a question and then look back. McIndoe et al looked at the question of outcomes for patients with CIS, with the patient groups defined by the presence of positive or negative cytology, which categorised them according to the risk that they had persistent disease. McCredie13 takes this a step further, looking at outcomes for patient groups classified by whether management was adequate or inadequate. There is no problem with this approach; the problem lies with Sandercock and Burls.

13. “…It should be noted a study on outcomes cannot make such pronouncements…”

It can however tell a story, one that is further strengthened by understanding what the author is trying to achieve. Papers are meant to be considered in the light of all the evidence and that includes context. McCredie et al show that half the cancers in women initially managed with punch/wedge biopsy were diagnosed within 5 years of a finding of CIN3. It can be judged objectively from this that merely doing a diagnostic procedure in patients with CIN3 leads to a high risk of developing cancer in a relatively short period of time, while the context reveals much more, including the unethical nature of the original research which meant they were managed in that manner.

14. “Yet Green’s achievement was to encourage an openness to look at the evidence.”

Which story is it that is being referred to? The one where there is a controversy in medicine? If so, he wasn’t the spirited free-thinker he is being cast as. If it is the one where Green was the controversial one, willing to question modern medicine, then the controversy wasn’t in medicine. If he is going to be cast as a Galileo type of figure, persecuted for his heresy, the critical point is that Galileo was proven correct. So where are his papers? Even his supporters never present his papers to support their claims. Their resort is to complain about everything else.

Green’s ‘achievement’ was the reverse. On p108 of the report, in an Auckland Star article in 1972 it was reported that “Professor Green asserted that a woman with a positive cervical smear showing what is called [CIS] is no more likely to develop invasive or malignant cancer of the cervix than any other woman of the same age. In other words, in situ cancer is not a forerunner of invasive cancer, and the smear test is over-rated.” There was no shift in attitude over time, even though over the years much more would have been studied on the matter and medical practice would have changed. Green’s set views were taught, leading to Registrars and other staff being under the impression that screening for cancer precursors was a waste of time. Apparently he kept an Ogden Nash quotation on his blackboard for many years saying “My mind is made up – don’t confuse me with the facts”. None of this shows any willingness to debate the evidence; on the contrary, when faced with evidence of patients with invasive cancer that he had originally diagnosed with CIS, though not a trained pathologist, he reclassified them and excluded them from the study. They did not fit, so he changed the evidence to suit his theory. True scepticism is not about holding an idea or defending a position but about being open to the evidence and being willing to examine it and change if necessary. Hitting on the hard edges of scientific debate is a tough experience but it serves no one if the record is distorted to hold an untenable position and legitimate questioning of this is taken to be persecution instead of honestly examining whether the position is, in fact, a correct one to hold.

References

  1. “The Cartwright Report”: http://www.nsu.govt.nz/current-nsu-programmes/3233.asp
  2. “Screening for cervical cancer” 1976: BMJ 659-60
  3. The Walden Report: June 5, 1976: CMA Journal Vol. 114 1003-1012
  4. Ostor, AG 1993: Intern. J. Gyn. Path. 12, 2, 186-92
  5. Lowy, I July 2010 Historia, Ciencias, Saude – Manguinhos V. 17, supl. 1, 53-67
  6. Heslop, B 2004: NZMJ 117,1199
  7. Liley, A.W. 2 November 1963: BMJ Vol 2, Issue 5365 1107-1108
  8. McIndoe, WA; McLean, MR; Jones, RW; Mullins, PR 1984: Obstet Gynecol. 64, 4, 454.
  9. Bryder, L 2009: A history of the ‘Unfortunate Experiment’ at National Women’s Hospital, Auckland University Press, Auckland
  10. Jones, RW; McLean, MR; 1986: Obstet Gynecol. 68, 4, 499-503.
  11. Sandercock, J. Burls, A. 2010, NZMJ 123, 1320
  12. Overton, G.H. 2010, NZMJ 123, 1319
  13. McCredie, M. 2010, NZMJ 123, 1321

The Unfortunate Experiment: Revisiting the Cartwright Report

This article is a response to ‘Truth is the daughter of time, and not of authority’: Aspects of the Cartwright Affair by Martin Wallace, NZ Skeptic 96.

The Cartwright Inquiry1 was held after the publication of “An Unfortunate Experiment at National Women’s” in Metro magazine in June 1987. The events leading up to the publication of the article and the findings of the subsequent inquiry have been contested ever since.

The inquiry heard from 67 witnesses, many doctors, 84 patients and relatives, and four nurses. In addition, 1200 patient records were reviewed, with 226 used as exhibits. The final report released in August 1988 has had a long-lasting impact. It recommended many changes in the practice of medicine and research, including measures designed to protect patients’ rights and a national cervical screening programme. These have since been implemented. The Medical Council announced in 1990 that four doctors were to face disciplinary charges resulting from the inquiry’s findings of disgraceful conduct and conduct unbecoming a medical practitioner. Charges against Dr Herbert Green were dropped due to ill health.

The report of the Committee of Inquiry has withstood many challenges, including judicial reviews and many articles alleging its findings to be flawed. Yet there have been allegations of a miscarriage of justice, charges of a witch-hunt, even a feminist conspiracy.

Where does this leave Dr McIndoe and others who had mounting concerns for so many years? Why did so many women develop cancer? In this article I will explore the findings of the Cartwright Inquiry, its context, the research and the criticisms, and attempt to find a more nuanced understanding of the “unfortunate experiment” and its ongoing effects. Page numbers in parentheses refer to pages in the Cartwright Report. CIN3 and CIS are interchangeable terms for a lesion of the cervical epithelium which can be a precursor to invasive cancer.

The Findings of the Inquiry

The report found that Green, rather than developing a hypothesis, aimed to prove a point (p 21) that even at the time was known not to be the case. A 1961 compilation of studies from Paris, Copenhagen, Stockholm, Warsaw, and New York showed CIS progressed to invasive cancer in 28.3 percent of cases (p 23). As at 1958 the official policy was “… treatment of carcinoma of the cervix Stage 0, [CIS] should be adequate cone biopsy … provided the immediate follow-up is negative and … the pathologist is satisfied that the cone biopsy has included all the carcinomatous tissue” (p 26). Standard treatment of the time involved excising all affected tissue and the ‘conservative’ treatment of conisation was in use well prior to 1966.

Green’s initial proposal stated “… It is considered that the time has come to diagnose and treat by lesser procedures than hitherto, a selected group of patients with positive (A3-A5) smears. Including the four 1965 cases, there are at present under clinical, colposcopic, and cytological observation, 8 patients who have not had a cone or ring biopsy. All of these continue to have positive smears in which there is no clinical or colposcopic evidence of invasive cancer”… The minutes then record that “… Professor Green said his aim was to attempt to prove that carcinoma-in-situ (CIS) is not a premalignant disease”… (p 22). This appeared to come about because of concern about unnecessarily extensive surgery for CIS between 1949 and 1962. During this period, some centres were beginning to use cone biopsy as effective treatment; however there were limitations to its use (p 27).

There were some questions over whether the work was a research project. The inquiry concluded this was the case and that a research protocol, however flawed, was put in place (p 69). Green published in peer-reviewed journals on his hypothesis and findings. By 1969, three cases of invasive disease had occurred in patients with positive cytology monitored for more than a year, and this should have made it clear that following patients with persistent CIS was unsafe (p 52).

Green then explained those patients by concluding that they’d had invasive cancer that was missed at the outset. The report contends this was dangerous to the patients as it demonstrated that the proposal was incapable of testing the hypothesis. These patients were reclassified by Green and removed from the study (p 55). In addition, patients over the age of 35 were included in the research in breach of the protocol (p 49).

There were many subsequent issues, including lack of patient consent (p 136). Patients also had to return for repeated tests and other invasive procedures, often receiving general anaesthetics in the process (p 42-49). A collection of cervices from foetuses and stillborn infants and another of baby uteri in wax were collected by Green for research which was later abandoned. This did not appear to comply with the Human Tissue Act (1964) as no consent was obtained from the parents of the stillborn infants (p 141).

As part of an earlier 1963 trial to test whether abnormal cytology in women later developing CIS or invasive cancer was present at birth (pp 34 & 140), 2,244 new-born babies had their vaginas swabbed without formal consent from the parents (there was a decision to abandon this trial soon after it started but this wasn’t communicated to nursing staff until 1966).

Procedures such as vaginal examinations and IUD insertions/removals on hysterectomy cases were performed by students without patient knowledge or consent while they were under anaesthetic (p 172). There was a further study on carcinoma of the cervix treatment, where patients either had radiotherapy alone or hysterectomy and radiation (p 170). The method of randomisation was by coin toss.

The Research

The idea that patients were divided into two experimental groups arose from McIndoe et al (1984)2. The patients were divided retrospectively into two groups which overlapped strongly but not completely with groups defined by Green, that he called “special series”. In his 1969 paper, cited in the report (p 40-41) he stated: “The only way to settle the question as to what happens to carcinoma in situ is to follow adequately diagnosed but untreated lesions indefinitely … it is being attempted at NWH by means of 2 series of cases. (I) A group of 27 women … are being followed, without ‘treatment’, by clinical, colposcopic, and cytologic examination after initial histological diagnosis of carcinoma in situ … has been established by punch biopsy … (II) A group of 25 women who have had a hysterectomy (4 for cervical carcinoma in situ) and who now have histologically-proven vaginal carcinoma in situ, has been accumulated …” This was done semi-randomly, with cases presenting themselves fortuitously.

The outcome for the group of 25 who were included in the punch biopsy “special series” was summarised in the McIndoe et al (1984) paper. Nine out of 10 women who were monitored with continuing positive smears developed invasive cancer. Only one out of 15 women who had normal follow-up cytology later developed invasive cancer. While Coney and Bunkle may have made a mistake, it’s clear the judge didn’t. The report states: “Green’s 1966 proposal was not a randomised control trial, but it was experimental research combined with patient care” (p 63).

Green’s interpretation of the data in his 1974 paper is suspect, having concluded that the progression rate was 7-10/750 (0.9 to 1.3 percent) or 6/96 (6.3 percent) of ‘incompletely treated’ lesions (p 54). These were explained by suggesting that either invasive cancer was missed at the start, or over-diagnosed at the end. Dr Jordan (expert witness) deemed this interpretation incorrect as of the 750 cases, 96 had continuing positive cytology, meaning that the other 654 patients could be considered free of disease. Of that 96, 52 patients had not been assessed further, making it impossible to know whether or not this group already had unsuspected invasion. Of the 44 patients remaining with ongoing carcinoma in situ who had more investigations, seven were found with invasive carcinoma. The incidence of known progression was therefore 7/44 (16 percent), which approximates McIndoe et al (1984) findings. This means that the proportion of invasive cancer cases in those inadequately treated was much higher compared with those who had returned to negative cytology, even before any cases where slides were re-read and excluded are considered.
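The difference between the two readings comes down to the choice of denominator. As a minimal sketch, the arithmetic can be re-run using only the counts quoted in the paragraph above; the variable and function names are illustrative and are not taken from the report or from either paper.

```python
# Counts as quoted in the text above (Green's 1974 data as re-examined by Dr Jordan).
total_cases = 750            # all cases in the series
persistent_positive = 96     # cases with continuing positive cytology
not_assessed_further = 52    # persistent-positive cases with no further assessment
invasive_found = 7           # invasive carcinoma found among those investigated further


def percent(numerator, denominator):
    """Return numerator/denominator expressed as a percentage."""
    return 100 * numerator / denominator


# Patients who could be considered free of disease.
presumed_disease_free = total_cases - persistent_positive

# Patients with ongoing CIS who actually had more investigations.
assessed_further = persistent_positive - not_assessed_further

# Green's reading: invasive cases over the whole series (7 to 10 out of 750).
green_low = percent(7, total_cases)
green_high = percent(10, total_cases)

# Jordan's reading: invasive cases over those known to have ongoing disease
# and who were investigated further.
jordan_rate = percent(invasive_found, assessed_further)

print(f"Presumed disease free: {presumed_disease_free}")
print(f"Assessed further: {assessed_further}")
print(f"Green: {green_low:.1f} to {green_high:.1f} percent")
print(f"Jordan: {jordan_rate:.0f} percent")
```

The same seven cases give a rate of roughly one percent or roughly 16 percent, depending on whether the patients already restored to negative cytology are counted in the denominator.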

McIndoe et al (1984) covered the follow-up data for 948 patients with a histological diagnosis of CIS who had been followed for a minimum of five years; there was a further paper in 1986 regarding CIS of the vulva. The same method used by Dr Green to group women by cytology after diagnosis and treatment was used, but with the correct denominators and the original diagnosis. Patients who were diagnosed with invasive cancer within one year were excluded to avoid the possibility the cancer had been missed initially. The management was cone biopsy or amputation of the cervix in 673 patients, with 250 managed by hysterectomy. The only biopsies in 25 women were punch biopsy (11), wedge preceded by punch biopsy (7) and wedge biopsy alone (7). Twelve of the 817 group 1 patients (1.5 percent) developed invasive cancer. Given the lengthy follow-up with negative cytology for group 1 patients, the authors concluded these represented the development of new carcinoma. There were marked differences in the completeness of excision between the two groups and the second group shows markedly different results, with 29/131 (22 percent or 24.8-fold higher chance) with positive cytology developing invasive cancer. At 10 years this was 18 percent rising to 36 percent after 20 years, irrespective of the initial management or histologic completeness of excision. This needs to be explained, as those figures strongly suggest the progression of CIS to invasion when it is and was a totally curable lesion. The answer is that a prospective investigation, as done by Green, has to establish that invasive disease is not present, while conserving affected tissue that is required for later study. The argument has been posed that women in the second group did get cone biopsies and hysterectomies. This ignores the fact that while many women were treated with various procedures, there was evidence of continuing disease, demonstrating that the intervention was inadequate. This was not followed up, posing a high risk of development of invasive disease.
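The counts quoted above for McIndoe et al (1984) can be checked for internal consistency with a short calculation. This is only a sketch using the figures as given in this article; the names are illustrative. The crude ratio it computes is a simple division of the two proportions and is not the same quantity as the 24.8-fold figure quoted in the text, which is taken as reported and is not recomputed here.

```python
# Counts as quoted in this article for McIndoe et al (1984).
management = {
    "cone biopsy or amputation of the cervix": 673,
    "hysterectomy": 250,
    "biopsy only": 25,
}
biopsy_only = {
    "punch biopsy": 11,
    "wedge preceded by punch biopsy": 7,
    "wedge biopsy alone": 7,
}

# Group 1: negative follow-up cytology. Group 2: continuing positive cytology.
group1 = {"patients": 817, "invasive": 12}
group2 = {"patients": 131, "invasive": 29}


def proportion(group):
    """Proportion of a group that later developed invasive cancer."""
    return group["invasive"] / group["patients"]


total_patients = sum(management.values())
total_biopsy_only = sum(biopsy_only.values())
total_grouped = group1["patients"] + group2["patients"]

group1_percent = 100 * proportion(group1)
group2_percent = 100 * proportion(group2)

# Unadjusted ratio of the two proportions (an illustration only, see lead-in).
crude_ratio = proportion(group2) / proportion(group1)

print(f"Total patients by management: {total_patients}")
print(f"Total patients by cytology group: {total_grouped}")
print(f"Group 1: {group1_percent:.1f} percent, group 2: {group2_percent:.1f} percent")
print(f"Crude ratio of proportions: {crude_ratio:.1f}")
```

Both ways of dividing the cohort add up to the same 948 patients, and the two percentages match those given in the text.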

This differs from group 1 patients, who were successfully treated at the outset. It’s pertinent to point out that the Cartwright Report did not rely on this study (or the Metro article) to reach its conclusions, but on review of patient records.

There have been two follow-up studies. McCredie et al (2008)3 examined medical records, cytology and histopathology for all women diagnosed with CIN3 between 1955 and 1976, whose treatment was reviewed by judicial inquiry. This paper gave a direct estimate of the rate of progression from CIN3 to invasive cancer. For 143 women who were managed by only punch or wedge biopsy the cumulative incidence was 31.3 percent at 30 years, and 50.3 percent in a subgroup who had persistent disease at 24 months.

The cancer risk for 593 women who received adequate treatment and who were treated conventionally for recurrent disease was 0.7 percent at 30 years. These findings support McIndoe et al (1984) and extend the period of follow-up.

McCredie et al (2010)4 described the management and outcomes for women during the period 1965-74 and made comparisons with women diagnosed in 1955-64 and 1975-76. This showed that women diagnosed with CIN3 in 1965-74 were less likely to have treatment with curative intent (51 percent vs 95 percent and 85 percent), had more follow-up biopsies, were more likely to have positive cytology during follow-up and positive smears that were not followed by curative treatment within six months, and had a higher risk of cancer of the cervix or vaginal vault.

Those women initially managed by punch or wedge biopsy alone in the period 1965-74 had a cancer risk 10 times higher than women treated with intention to cure. This was despite the 1955-64 group being largely unscreened, which would have delayed diagnosis. This study is important as it shows the medical experience of the women, where they were subjected to many interventions that were not meant to treat but rather to monitor.

Whistle blowing

Scientific misconduct happens, and for those trying to address it the risks are high. Brian Martin5 looked at several cases, and stated: “In each case it was hard to mobilize institutions to take action against prestigious figures. Formal procedures, even when invoked, were slow and often indecisive.”

McIndoe and others encountered similar difficulties and ultimately failed to get Green’s proposal reviewed. The concept of “clinical freedom” (p 127), where the doctor was the arbiter of the best course of action for the patient, was one major issue to emerge from the report. Colleagues tended to be very reluctant to intrude upon this, and this meant that the proposal could continue with little oversight or intervention. McIndoe had mounting concerns, particularly after 1969, which were disregarded or treated lightly.

These concerns were shared by pathologist-in-charge Dr McLean, and were raised internally with Medical Superintendent Dr Warren, who consulted with the Superintendent-in-Chief, Dr Moody and an internal working party set up to look at the issue in 1975. Twenty-nine cases that had developed invasive disease were referred to it; however only 13 were examined, and having set up its own terms of reference it only considered whether the protocol had been adhered to and disregarded concerns about patient safety (p 83).

The 1966 proposal effectively ceased when McIndoe withdrew colposcopic services and Green reverted to cone biopsy in most new cases (p 88), but it was never formally terminated. While Green himself did not take any steps to prevent the review of records by McIndoe and colleagues, Bonham did, and wrote a letter to the Medical Superintendent (p 92).

There are some important lessons to be learned from this, including that those with the authority to deal with the situation should make the best effort to achieve a balanced view of the situation and assess it fairly to allow the claimant a fair hearing.

Conclusions

The potential risks of Green’s proposal outweighed any benefits such as avoiding hysterectomy or cone biopsy. Invasive cancer could not be ruled out because there were poor safeguards against the risk of progression. This was unethical from the outset, regardless of the issue of informed consent. In addition, patients that developed invasive disease had their slides reclassified and were removed by Dr Green from the study. This would be considered research misconduct then and now as it manipulated the data.

It does not matter if the initial motivations were sincere; they ultimately fail on these points. This proposal had a very human cost. Moreover Green’s views had long-term effects, including influence on undergraduate and postgraduate medical students, and support for the attitude that cervical screening was not worthwhile. This ‘atypical’ viewpoint was also promoted in the scientific literature and in the press, creating confusion within the medical scene and with the public.

It can be incredibly hard to admit our failings and let go of old loyalties. In the aftermath of the report many doctors objected to cervical screening, ‘unworkable’ consent forms and the intrusion of lay committees on practice6. It’s true this had negative effects on the perception of doctors overall, particularly in regard to practices that were widespread in hospitals at the time, and there were times that unfair criticisms were aired. This impacted on the nursing profession as well, for nurses are meant to be patient advocates.

This was also about power. The really unfortunate thing is that medical responsibilities to patients are almost totally ignored in the midst of the argument, when they should be brought to the forefront. Likewise respect, justice and beneficence were lacking for the patients involved. No doctor raised concerns about the lack of consent, even though from the 1950s there was the growing expectation that this be sought, particularly with participants in research.

The Medical Association working party that examined this stated that it was “regrettable that the trial deteriorated scientifically and ethically and did not change as scientific knowledge advanced or as adverse results were observed”7. They found it deplorable that patients involved did not know they were part of a trial, and that it took a magazine article for it to be investigated.

Unfortunately, instead of addressing this and examining whether Dr Green made any errors or misinterpretations himself, the findings in McIndoe et al (1984) and other papers were not accepted. There is the unfortunate implication that, rather than there being mounting and valid concerns over decades, Green was unfairly toppled and the resulting inquiry was a whitewash.

The report couldn’t have been written without the assistance of the medical community as expert witnesses and advisors. It’s not surprising that there would be loyalty for a colleague, but perhaps instead of attempting to rehabilitate Green it’s time McIndoe and his colleagues were vindicated. Morality did not totally fail and attempts were made to prevent patients being harmed8.

Acknowledgements: many thanks to Dr. Margaret McCredie of Otago University who assisted me with my research.

  1. The Cartwright Report: www.nsu.govt.nz/current-nsu-programmes/3233.asp
  2. W.A. McIndoe; M.R. McLean; R.W. Jones; P.R. Mullins 1984: J. Am. Coll. Obst. 64(4).
  3. M.R.E. McCredie; K.J. Sharples; C. Paul; J. Baranyai; G. Medley; R.W. Jones; D.C. Skegg 2008: The Lancet Oncology DOI:10.1016/S1470-2045(08)70103-7
  4. M.R.E. McCredie; C. Paul; K.J. Sharples; J. Baranyai; G. Medley; D.C. Skegg; R.W. Jones 2010: A&NZ J. Obst. Gyn. DOI:10.1111/j.1479-828X.2010.01170.x
  5. B. Martin 1989: Thought and Action 5(2), 95-102.
  6. J. Manning (Ed.) 2009: The Cartwright Papers: Essays on the Cervical Cancer Inquiry 1987-88. Bridget Williams Books Ltd.
  7. L. Bryder 2009: A History of the “Unfortunate Experiment” at National Women’s Hospital. Auckland University Press.
  8. C. Paul 2000: BMJ 320, 499-503.