Overdiagnosis in publicly organised mammography screening programmes: systematic review of incidence trends
BMJ 2009; 339 doi: https://doi.org/10.1136/bmj.b2587 (Published 09 July 2009) Cite this as: BMJ 2009;339:b2587
All rapid responses
Overdiagnosis is an identified problem of screening for breast cancer
as shown by Jørgensen et al.[1] However, there is another important
negative side effect of screening that is not highlighted very often.
When an abnormal screening mammogram turns out to be false-positive, this
can have a major impact on quality of life (QoL) and feelings of fear of
the woman concerned.
In the south of the Netherlands a multicentre, prospective
longitudinal study of QoL in women with breast cancer has been running
since 2002. Women are included when they have an abnormal screening
mammogram or a palpable lump. Questionnaires are completed before the
diagnosis is known and then at 6 fixed time points after diagnosis and
possible treatment.
Between 2002 and 2007, 385 women with an abnormal mammogram were included in
this study. Of these, 152 were diagnosed with breast cancer and 233 turned
out to have a false-positive screening mammogram. This means that 60.5%
of the women recalled after an abnormal mammogram turned out to have a
benign diagnosis.
Reaching this benign diagnosis required significantly more diagnostic
procedures than diagnosing malignancy. Almost 50% of the women needed
more than 4 procedures, of whom 18 (8%) eventually needed an excisional
biopsy to reach the diagnosis.
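The proportions quoted above can be rechecked from the reported counts. The short sketch below is only an illustration: the variable and function names are our own, and it assumes that the 8% for excisional biopsy is expressed relative to the 233 women with a false-positive result, which the text does not state explicitly.

```python
# Counts as reported in the response above.
recalled = 385             # women with an abnormal screening mammogram
cancers = 152              # diagnosed with breast cancer
false_positives = 233      # benign diagnosis after recall
excisional_biopsies = 18   # false positives that needed an excisional biopsy


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # The two outcome groups should add up to all recalled women.
    assert cancers + false_positives == recalled
    print("False positive among recalled:", percent(false_positives, recalled), "%")
    # Assumption: the denominator for the biopsy percentage is the 233 false positives.
    print("Excisional biopsy among false positives:",
          percent(excisional_biopsies, false_positives), "%")
```

Under that assumption the biopsy share comes out slightly below the rounded 8% given in the text.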
QoL in these women was mainly influenced by their personality, and
especially women with an anxious personality showed a diminished QoL and
an increase in feelings of fear. The Eta squared for QoL was 0.27 and for
state anxiety 0.44 (Eta squared is the effect size; higher than 0.14 is a
large effect). The negative effects were observed for at least 1 year
after the false-positive mammogram.
Women often overestimate their risk of breast cancer and the benefits
of screening and are not aware of the possible dangers.[2] Thus, it seems
important to inform women correctly and abundantly about the pros and cons
of screening. However, a cross-sectional study of 27 websites by interest
groups in different countries found that the most important dangers of
screening – overdiagnosis and overtreatment – were the best-kept
secrets.[3]
The information leaflets that accompany the invitation for breast
cancer screening should mention not only the possibility of overdiagnosis,
and thus overtreatment, but also the consequences of being recalled
because of an abnormality on the mammogram, such as (extra) diagnostic
procedures or even surgery, and the significant possibility of a decrease
in QoL.
References
1. Jørgensen KJ, Gøtzsche PC. Overdiagnosis in publicly organised
mammography screening programmes: systematic review of incidence trends.
BMJ 2009; 339: b2587. Doi: 10.1136/bmj.b2587
2. Nekhlyudov L, Li R, Fletcher SW. Information and involvement
preferences of women in their 40s before their first screening mammogram.
Arch Intern Med 2005;165:1370-4
3. Jørgensen KJ, Gøtzsche PC. Presentation on websites of possible
benefits and harms from screening for breast cancer: cross sectional
study. BMJ 2004;328:148-54
Competing interests:
None declared
The debates about overdiagnosis caused by mammography screening fit
well with Thomas Kuhn's ideas of scientific revolutions (1). The Forrest
report from 1986, which was the basis for the introduction of screening in
the UK, noted that overdiagnosis was not a problem in the New York trial,
as the number of breast cancers diagnosed in the two groups after 7 years
was equal (2). It also noted that 20 per cent more breast cancer had been
detected in the group offered screening after 6 years in the Two-County
trial and advised further follow-up to find out whether this excess
incidence persisted for the life-span of the women.
But the view at the time, when only two trials of screening
mammography had been published, was that overdiagnosis was not a problem.
The Forrest report is well worth quoting: “Screening is not likely to lead
to a significant increase in the numbers of breast cancers treated. During
the initial (prevalence) screen, there will be an increase in the numbers
treated, but this may eventually be compensated by a subsequent reduction
in the normally expected incidence. The only potential source of a general
increase in the numbers treated would seem to be from those that would not
otherwise be detected because death from another cause may occur prior to
any symptoms arising” (2). The use of the word "potential" suggests that
any additional diagnoses, causing some women to die prematurely, were
against expectations.
The tacit knowledge, as Kuhn calls it, in the scientific community
continued to be that overdiagnosis did not occur. Accordingly, the problem
was largely ignored in the scientific literature (3). In 1999, the Danish
National Board of Health asked The Nordic Cochrane Centre to review all
the trials because it had not been possible to see a reduction in breast
cancer mortality in Sweden, despite nationwide screening having started
already in 1986 (4). We found about 30% more diagnoses in the screened
groups, even after long follow-up (5) and were highly surprised that this
had not been described previously. Our finding was so unwelcome that the
editors of the Cochrane Breast Cancer Group did not allow us to publish it
in our first Cochrane review of screening from 2001 (5, 6). It took
several years of negotiations, which involved the Cochrane ombudsman,
before these data became available in the Cochrane review in 2006 (7).
Thus, the prevailing paradigm of "no harm done" was fiercely protected,
even by editorial members of our own organisation, The Cochrane
Collaboration, which otherwise strives to minimize bias and to expose
important harms in systematic reviews.
As the Kuhnian anomalies accumulated in trial after trial, and in
statistics from screening programmes, clearly demonstrating that more
diagnoses occurred in the screened groups, screening advocates stuck to
their paradigm, often by resorting to a short letter in The Lancet from
1994 by Boer et al. (8). Euler-Chelpin et al. (9) also cite this letter in
their criticism of our 2009 BMJ review of overdiagnosis (4). Boer et al.
claimed that screening led to less than 2% true extra incidence and based
this on a model that they did not describe (8). They predicted that the
large initial increase in incidence would be compensated by a sharp
decline in incidence immediately after the women left the programme when
they passed the age limit for screening (8), but this phenomenon was
wishful thinking and has never been observed in practice, in any country
(4).
Kuhn noted that observations that contradict the prevailing paradigm
are simply ignored by the scientific community. Accordingly, we have not
seen a single screening advocate try to explain how the New York Trial
could produce a large reduction in breast cancer mortality, 35% after 7
years, apparently without advancing the time of diagnosis, which is a
conditio sine qua non for cancer screening (7). The British statistician
Richard Peto asked the principal investigator of the trial three times at
a meeting in 1983 why the number of breast cancers was about the same in
the screened group as in the control group (10). This discussion is
largely unknown and is highly unlikely to be found in a literature search,
as only a formal paper, and not the ensuing discussion, is listed in
PubMed. We have calculated that far more women with prior breast cancer
were excluded from the screened group than from the control group, which
is why the number of cancers was similar in the two arms (7). The trial
is often cited, but the fact that it is flawed is rarely noted, as it
yielded the answer people wanted to hear: a large effect and no harms.
Euler-Chelpin et al. have previously argued that there is no
overdiagnosis in the Copenhagen mammography screening programme (11). They
continue their faulty line of reasoning from that paper (which is ref. 6
in their rapid response), as they now say that after the prevalence peak
"the incidence dropped back to a level well in line with the pre-screening
trend" (9). We have explained why their studies are inappropriate in the
Web Extra material that accompanies our review (4). To explain in more
detail, they calculated a 95% confidence interval around the expected
incidence level for each year after screening was introduced, and then
claimed that there is no overdiagnosis because the observed level in
Copenhagen and Funen, taken separately, was within this confidence
interval (11). This approach reminds us of the old adage that there is
none so blind as those who will not see. Obviously, any significant
difference will disappear if only the analysed subgroups get small enough.
There are only about 50,000 eligible women, both in the Copenhagen and the
Funen programmes, and there are therefore few diagnoses each year. The
authors could easily have shown a significant difference if they had used
commonly accepted methods and looked at the trend for the whole time
period, or just had included both geographical regions. They also made the
logical error of concluding "evidence of absence" from "absence of
evidence". In 2006, some of the same authors published a graph that showed
a large and sustained increase in diagnoses after screening was introduced
in Copenhagen, but there was not a word about overdiagnosis, although the
title of their paper was "Breast cancer incidence and mortality in the
Nordic capitals, 1970-1998. Trends related to mammography screening
programmes" (12). Euler-Chelpin et al. wonder why we have not looked at
the Copenhagen data, which are available from the Danish Cancer Register.
Indeed we have, and we have submitted a manuscript that looks not only at
Copenhagen but at the whole country, and also at groups that are too
young or too old to get screened. It will provide an estimate of
overdiagnosis in Denmark as well.
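The point made above about subgroup size can be illustrated with a small calculation. This is a sketch under stated assumptions, not a reanalysis of the Danish data: it assumes that annual case counts are roughly Poisson distributed, uses a normal approximation, and takes a hypothetical 30% excess over the expected count. The expected counts are invented for illustration and the function name is our own.

```python
import math


def z_for_excess(expected_cases: float, relative_excess: float) -> float:
    """Approximate z statistic for an observed count that exceeds a Poisson
    expectation by the given relative excess (normal approximation)."""
    observed_cases = expected_cases * (1 + relative_excess)
    # For a Poisson count the standard deviation is sqrt(expected).
    return (observed_cases - expected_cases) / math.sqrt(expected_cases)


if __name__ == "__main__":
    # Hypothetical expected counts: a small region in one year versus
    # progressively larger pooled groups (more regions or more years).
    for expected in (25, 100, 400):
        z = z_for_excess(expected, 0.30)
        verdict = "outside" if z > 1.96 else "inside"
        print(f"expected={expected}: z={z:.2f}, {verdict} the 95% interval")
```

With the same relative excess, the statistic grows with the square root of the expected count, so an excess that sits inside the interval for each small yearly subgroup can lie clearly outside it once regions or years are pooled.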
Daniel Kopans argues that we make an error when we refer to "cancers
diagnosed each year" as incidence (13). We are a bit puzzled by this, as
annual incidence is the number of new cases of a disease found each year.
Kopans focuses on the potential problem of mixing cancers detected at the
prevalence screens with other cancers, but this is not a problem for our
analyses, as we simply added up all cancers, no matter how they were
diagnosed. We have more than sufficient follow up to demonstrate that the
excess of cancers detected at the prevalence screens is not compensated
later on. After our BMJ paper was accepted, we became aware of more
up-to-date statistics from the UK and now have almost 20 years of follow up
after screening started with no sign of a compensatory drop (14). The
figure shows the incidence of invasive breast cancer per 100,000 women
among age groups too young to be screened (40-49 years), women screened
from 1988 (50-64 years), women screened from 2001-3 (65-69 years), and
women over the age limit for organised mammography screening (70+). The
dotted lines are the expected incidences in the absence of screening.
In the age group 65-69 years, there is still no sign of a
compensatory drop in incidence, although the follow up extends far beyond
the time when all had previously been invited to screening several times.
Yet when these women were included in the programme in 2001-3, their
incidence level increased dramatically, exactly as when the programme was
first offered to women who had never been screened before, in 1988.
The updated UK data set, and the data we have published from other
countries (4), dismiss the concerns raised about the role of hormone
replacement therapy and other factors that could increase the background
incidence (15). Despite wide variations in the time of introduction of
screening between regions and age groups, and within and between
countries, the introduction of screening is perfectly synchronised in time
with a marked and persistent increase in incidence relative to what was
expected without screening. Further, it is present in the relevant,
screened age group, and in this age group only. For New South Wales, women
too young to be screened (40-49 years) have closely similar observed
incidence rates to those we projected, just as for other countries (4). We
therefore do not share the concern of John Boyages about our projections
for the screened age ranges (15).
It is time to overthrow the 25-year-old paradigm in screening
mammography that it is possible to screen without overdiagnosis. In the
screened age group, one out of three diagnoses in a country with a
screening programme is an overdiagnosis (4). Screening for cancer without
overdiagnosis is biologically implausible, and overdiagnosis has long been
a known harm of screening for other cancers, e.g. lung cancer,
neuroblastoma in children and prostate cancer (16, 17). In these
cases, the excess incidence has been readily recognised as overdiagnosis.
What is therefore truly surprising is that so many breast screening
advocates have turned a blind eye to this rather obvious fact for so long,
or have buried it in statistical models when they saw it, using
assumptions that clearly must be wrong (18, 19).
1. Kuhn TS. The Structure of Scientific Revolutions. Chicago
University Press, 3rd Edition, Chicago 1996.
2. Forrest P. Breast Cancer Screening. Report to the Health Ministers
of England, Wales, Scotland & Northern Ireland. Department of Health
and Social Security 1986.
3. Jørgensen KJ, Klahn A, Gøtzsche PC. Are benefits and harms given
equal attention in scientific articles on mammography screening? A cross-
sectional study. BMC Medicine 2007; 5: 12.
4. Jørgensen KJ, Gøtzsche PC. Overdiagnosis in publicly organised
mammography screening programmes: systematic review of incidence trends.
BMJ 2009; 339: b2587.
5. Olsen O, Gøtzsche PC. Cochrane review on screening for breast
cancer with mammography. Lancet 2001; 358: 1340-2.
6. Horton R. Screening mammography – an overview revisited. Lancet
2001; 358: 1284-5.
7. Gøtzsche PC, Nielsen M. Screening for breast cancer with
mammography. Cochrane Database Syst Rev 2006;(4):CD001877.
8. Boer R, Warmerdam P, de Koning H et al. Extra incidence caused by
mammographic screening. Lancet 1994; 343:979.
9. Euler-Chelpin M, Njor SH, Lynge E. Answer to Jørgensen and
Gøtzsche. BMJ 2009. http://www.bmj.com/cgi/eletters/339/jul09_1/b2587.
10. Shapiro S. Discussion II. J Nat Cancer Inst Monographs 1985; 67:
75.
11. Svendsen AL, Olsen AH, von Euler-Chelpin M, Lynge E. Breast
cancer incidence after the introduction of mammography screening: what
should be expected? Cancer 2006; 106: 1883-90.
12. Törnberg S, Kemetli L, Lynge E, Olsen AH, Hofvind S, Wang H et
al. Breast cancer incidence and mortality in the Nordic capitals, 1970-
1998. Trends related to mammography screening programs. Acta Oncol 2006;
45:528-35.
13. Kopans D. A major error. BMJ 2009.
http://www.bmj.com/cgi/eletters/339/jul09_1/b2587#217871.
14. Cancer Research UK.
http://info.cancerresearchuk.org/cancerstats/types/breast/incidence/.
Accessed August 17th 2009.
15. Boyages J. Organised mammography screening programmes: a positive
perspective. BMJ 2009.
http://www.bmj.com/cgi/eletters/339/jul09_1/b2587#217871.
16. Welch HG. Should I be tested for cancer? California: University
of California Press, Ltd; 2004.
17. Schröder FH, Hugosson J, Roobol MJ, Tammela TLJ, Ciatto S, Nelen
V et al. Screening and Prostate-Cancer Mortality in a Randomized European
Study. N Engl J Med 2009; 360: 1320-8.
18. Gøtzsche PC, Jørgensen KJ. Stephen Duffy’s claims on the benefits
and harms are seriously wrong. BMJ 2009.
http://www.bmj.com/cgi/eletters/338/jan27_2/b86.
19. Zahl PH, Jørgensen KJ, Mæhlen J, Gøtzsche PC. Biases in estimates
of overdetection due to mammography screening. Lancet Oncol 2008; 9: 199-
201.
Competing interests:
None declared
The Jorgensen and Gotzsche analysis of overdiagnosis may have
inadvertently communicated a conservative estimate to the public of a
serious screening harm (1). Rather than the 10% used by the authors,
United States population data show that 19-24% of breast cancers are DCIS
between the ages of 50-65 (2).
Furthermore, the last sentence of the abstract states “One in three
cancers detected in a population offered organized screening is
overdiagnosed”. While all excess or overdiagnosed cancers (observed minus
expected) should be mammography-detected (unless radiation induces excess
interval or symptomatic cancers), not all detected or diagnosed cancers
are necessarily mammography-detected. The percentage of overdiagnosis
depends on the reference class of the denominator.
If overdiagnosis is expressed as (observed minus expected)/expected, so
that expected cancers form the denominator, the overdiagnosis percentage
is 52%. If it is expressed as (excess cancers)/observed, the percentage is
0.52/1.52 or 34% (1). In the screening-eligible populations, the observed
cancers should represent all cancers, including the subsets of nonscreened
symptomatic, screened symptomatic (interval), and asymptomatic
mammography-detected cancers. The last group would be the true
screen-detected cancers (3).
Do the authors know the percentages of screen-detected cancers in the
observed populations? Wishart et al found that 40% of invasive cancers in
an organized screening program were screen-detected. Assuming 10% DCIS,
this becomes 2849/6227 or 46% (4). Assuming this value is accurate, the
overdiagnosis percentage for screen-detected cancers is 0.52/ (1.52 times
0.46) or 74%, so perhaps 3 out of 4 mammography-detected cancers are
pseudocancers and overtreated? (5).
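The three percentages above differ only in the reference class used as the denominator, which a short calculation makes explicit. This is a minimal sketch using the figures quoted in this response (an excess of 52% over expected, and 46% of observed cancers taken to be screen-detected). The function names are our own, and the expected number of cancers is normalised to 1.0 for convenience.

```python
def excess_over_expected(observed: float, expected: float) -> float:
    """Excess cancers as a fraction of the expected (unscreened) number."""
    return (observed - expected) / expected


def excess_over_observed(observed: float, expected: float) -> float:
    """Excess cancers as a fraction of all cancers observed with screening."""
    return (observed - expected) / observed


def excess_over_screen_detected(observed: float, expected: float,
                                screen_detected_share: float) -> float:
    """Excess cancers as a fraction of screen-detected cancers only."""
    return (observed - expected) / (observed * screen_detected_share)


if __name__ == "__main__":
    expected = 1.0    # normalised expected number of cancers
    observed = 1.52   # expected plus the 52% excess reported in (1)
    share = 0.46      # assumed screen-detected share, derived from (4)
    print(f"{excess_over_expected(observed, expected):.0%} of expected")
    print(f"{excess_over_observed(observed, expected):.0%} of observed")
    print(f"{excess_over_screen_detected(observed, expected, share):.0%} "
          "of screen-detected")
```

The same excess of 0.52 therefore reads as roughly one half, one third or three quarters, depending on which group of cancers it is compared with.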
1. Jorgensen KJ, Gotzsche PC. Overdiagnosis in publicly organised
mammography screening programmes: systematic review of incidence trends.
BMJ 2009;339:b2587.
2. Keen JD, Keen JE. How does age affect baseline screening
mammography performance measures? A decision model. BMC Med Inform Decis
Mak 2008;8(1):40.
3. Welch HG, Schwartz LM, Woloshin S. Ramifications of screening for
breast cancer: 1 in 4 cancers detected by mammography are pseudocancers.
BMJ 2006;332(7543):727.
4. Wishart GC, Greenberg DC, Britton PD, Chou P, Brown CH,
Purushotham AD, et al. Screen-detected vs symptomatic breast cancer: is
improved survival due to stage migration alone? Br J Cancer
2008;98(11):1741-4.
5. Mulcahy N. 1 in 3 Breast Cancers Detected by Screening Is
Overdiagnosed, Overtreated. Medscape Medical News, July 14, 2009.
http://www.medscape.com/viewarticle/705886
Competing interests:
None declared
Institute of Public Health,
University of Copenhagen
Answer to Jørgensen & Gøtzsche (1)
As mammography screening aims at diagnosing breast cancer cases
before clinical symptoms emerge, it certainly has an impact on the
incidence of breast cancer. The dynamics of this process are well described
in the literature (2-5). Due to the lead time, initiation of screening
will lead to a prevalence peak. As screening continues, this is followed
by an incidence which, due to the “artificial aging”, is somewhat above
the level of unscreened women, and finally by a deficit in incidence
after the women have stopped attending screening. To study this
pattern, cohorts of women have to be followed over this entire span of
events. Jørgensen & Gøtzsche did not attempt to undertake such a
study.
Instead Jørgensen & Gøtzsche presented age specific incidence
data from before and after the introduction of screening. This approach
has, however, a major limitation. If say 50-69 year old women are screened
for the first time during a 2 year period, and the lead time is 3 years,
then the breast cancer incidence is expected to increase by 100% during
these 2 years. If the lead time is longer, then the expected increase in
incidence is even larger. However, if screening is introduced gradually or
if the screening sensitivity is low, then the prevalence peak will be
spread out over a longer time period, and it will then be impossible to
distinguish between the prevalence peak and overdiagnosis. Most of the
data sets presented by Jørgensen & Gøtzsche show modest prevalence
peaks, clearly pointing to gradual implementation of screening or – less
likely – to low screening sensitivity. These data sets are therefore not
optimal for the study of overdiagnosis.
Copenhagen – the home town of Jørgensen & Gøtzsche – offers an
excellent example of a transfer from no mammography screening (up to 31
March 1991), to screening offered to all women aged 50 to 69 within a 2
year period (1 April 1991 to 31 March 1993), to continued screening
offered to this age group every second year (6). In Copenhagen, the
incidence of breast cancer increased prior to the introduction of
screening (Figure 1). During the first screening round, with a participation
rate of 71%, the incidence almost doubled. During the subsequent screening
rounds, the incidence dropped back to a level well in line with the
pre-screening trend, taking into account the constant inflow of prevalent
screens in the youngest age group, 50-51 years, and the artificial aging,
which, based on the observed prevalence peak, must be above 3 years.
Clearly, the Copenhagen data are not compatible with an overdiagnosis of
52%. One may wonder why Jørgensen & Gøtzsche in their elaborate data
search have not looked at the Copenhagen data, which are readily available
from the Danish Cancer Register.
My von Euler-Chelpin,
Sisse H. Njor,
Elsebeth Lynge
References
1. Jørgensen KJ, Gøtzsche PC. Overdiagnosis in publicly organised
mammography screening programmes: systematic review of incidence trends.
BMJ. 2009 Jul 9;339:b2587. doi: 10.1136/bmj.b2587
2. Boer R, Warmerdam P, de KH, van OG. Extra incidence caused by
mammographic screening. Lancet 1994; 343(8903):979.
3. Moller B, Weedon-Fekjaer H, Hakulinen T, Tryggvadottir L, Storm
HH, Talback M et al. The influence of mammographic screening on national
trends in breast cancer incidence. Eur J Cancer Prev 2005; 14(2):117-28.
4. Biesheuvel C, Barratt A, Howard K, Houssami N, Irwig L. Effects of
study methods and biases on estimates of invasive breast cancer
overdetection with mammography screening: a systematic review. Lancet
Oncol 2007; 8(12):1129-38.
5. Duffy SW, Lynge E, Jonsson H, Ayyaz S, Olsen AH. Complexities in
the estimation of overdiagnosis in breast cancer screening. Br J Cancer
2008; 99(7):1176-8.
6. Svendsen AL, Olsen AH, von Euler-Chelpin M, Lynge E. Breast cancer
incidence after the introduction of mammography screening: what should be
expected? Cancer 2006 May 1;106(9):1883-90.
Competing interests:
None declared
Jorgensen and Gotzsche in the introduction to their paper on
overdiagnosis in mammographic screening state that ‘Autopsy studies found
that 37% of women aged 40-54 who died from causes other than breast cancer
had invasive or non-invasive cancer’(1).
This is misleading. First, of the 50 women without a history of breast
cancer in the study by Nielsen et al that they quote (2), only one had
invasive cancer, and this was microinvasive (undefined in the paper, but
the current definition is less than 1 mm). The median prevalence of
previously undiagnosed invasive cancer at autopsy in the review by Welch
and Black that they also quote (3) is only 1.3%. Second, Nielsen found 16
patients with carcinoma in situ, but some of the illustrations are not
convincing, and none of the lesions were larger than 5 mm, with some less
than 1 mm, suggesting overdiagnosis. Welch and Black rightly raise the
question of diagnostic accuracy. Since the introduction of mammographic
screening, ductal carcinoma in situ is much more frequently diagnosed and
the diagnostic criteria, in particular the distinction from atypical
ductal hyperplasia, are better established than at the time of the study
by Nielsen. Third, cases of classical lobular carcinoma in situ are
included. This is not appropriate as, despite the name, it is not managed
as carcinoma and is largely regarded as a risk factor.
Jorgensen and Gotzsche correctly state that mammographic screening
identifies cancers that will not cause symptoms or death, but they
overstate the frequency of asymptomatic breast cancer based on autopsy
studies. The immediately following paper in the BMJ emphasises the
importance of accurate citation (4). The paper of Nielsen et al is much
quoted, but the results are almost certainly an overestimate.
Andrew HS Lee
Ian O Ellis
Department of Histopathology,
Nottingham University Hospitals,
City Hospital Campus
1. Jorgensen KJ, Gotzsche PC. Overdiagnosis in publicly organised
mammography screening programmes: systematic review of incidence trends.
BMJ 2009;339:206-209
2. Nielsen M, Thomsen JL, Primdahl S et al. Breast cancer and atypia
among young and middle-aged women: a study of 110 medicolegal autopsies.
Br J Cancer 1987;56:814-819
3. Welch HG, Black WC. Using autopsy series to estimate the disease
“reservoir” for ductal carcinoma in situ of the breast: how much more
breast cancer can we find? Ann Intern Med 1997;127:1023-1028
4. Greenberg SA. How citation distortions create unfounded authority:
analysis of a citation network. BMJ 2009;339:210-213
Competing interests:
None declared
As an F1 doctor I have had the chance to undertake a rotation in
Breast Surgery. My interest in this field has already helped me decide
that I wish to pursue a career in Breast Oncology or palliative care.
However, on March 22nd my world was tipped upside down when my mother was
diagnosed with Breast Cancer. She had been picked up on Breast Screening
through Breast Test Wales. She had not felt a lump and the lump was hard
to palpate. It was removed and a sentinel lymph node was returned
positive. She then underwent a full axillary dissection which revealed 8
out of 14 positive nodes. Currently undergoing chemotherapy, she is a
shadow of her old self but she is alive and making a recovery. Without the
Breast Test Wales system she would still not know that she had cancer. Who
knows when it would have become apparent? When the metastases had reached
the liver and she turned yellow? When she had a fit and the mets had gone
to the Brain? When the troublesome cough wouldn't go away? Fortunately we
will not have to answer that question and thanks to mammography my mother
will hopefully be here for many years to come!
I recognise the limitations of mammography and screening in general
and yes, some women may never develop aggressive Breast Cancer. But is it
worth taking that risk? As Welch points out in his editorial, twisting the
figures to suit a viewpoint could show that mammography reduces death by
a third and not that it overdetects by a third. Every woman should be free
to choose, given accurate advice and information on the pitfalls and
downsides of screening but also on the real risks of taking the chance
that a lesion will not progress. Until we know which lesions do regress,
surely the mammogram will remain a lifesaver - it certainly has saved my
mum!
Competing interests:
None declared
The results obtained by Jorgensen and Gotzsche, while of potential interest to public health authorities, are likely to be of limited use to an individual woman contemplating adoption of screening. This is because these results arise from an unspecified mixture of screening regimens applied in a heterogeneous population of women.
Ultimately, the decision to adopt screening, even if prompted or encouraged by mass screening campaigns, is taken by an individual woman, ideally in consultation with her doctor.(1) Under consideration in this context is a specific screening regimen, including such component features as the technical characteristics of the mammography, the reading and interpretation of the image, and the pre-specified algorithm of actions given a specific result. Furthermore, both “true-positive” and “false-positive”/overdiagnosis rates depend not only on the particulars of the screening method but also on characteristics of the woman herself, in terms of her risk profile (e.g. age, history of HRT use), history of screening, as well as factors affecting the performance of the screening method (e.g., body weight, breast density).(2) Thus, even if the quantitative findings reported by Jorgensen and Gotzsche do apply to the aggregate of women on the “population level”, the question that remains unanswered is to what extent they apply to any given woman being offered a particular regimen of screening, with her unique set of relevant characteristics.
These issues were not addressed or even commented on by Jorgensen and Gotzsche. To be ultimately useful in individual decision-making, the research needs to allow for the requisite practice-relevant distinction-making in the study design and analysis such that appropriate risk models can be produced (and made available to women and their doctors). Moreover, this information would need to be presented along with an estimate of the woman’s life-expectancy, which can be calculated based on a set of relevant risk factors.
References:
1. Miettinen OS. Screening for a cancer: a sad chapter in today’s epidemiology. Eur J Epidemiol 2008; 23:647-653.
2. Banks E, Reeves G, Beral V, et al. Influence of personal characteristics of individual women on sensitivity and specificity of mammography in the Million Women Study: cohort study. BMJ 2004; 329:477.
Competing interests:
None declared
In reply to the authors’ attempt to quantify the level of
“overdiagnosis” arising from population based screening programmes, it has
long been acknowledged that screening an asymptomatic population will
inevitably yield disease which will ultimately prove to be non life
threatening. Indeed, this concept was headlined by Muir Gray et al
(2), who stated that “all screening programmes do harm; some do good as
well, and, of these, some do more good than harm at reasonable cost.”
The accompanying editorial (1) indicates the challenges of identifying
the numbers of non-lethal breast cancers detected by screening, a question
which remains answered inconclusively by this article.
More extensive scrutiny of the authors’ adjustments for known rising
incidence of breast cancer including the magnitude of the effect of
hormone replacement therapy (HRT) requires to be undertaken to validate
the claims of this paper. In the same way, the sub analysis of
differing grades of DCIS would be vital before dismissing the diagnosis of
DCIS as contributing to “overdiagnosis”. There is sound evidence that
high grade (grade 3) DCIS presents considerably more challenges than that
of low grade DCIS and the Sloane audit(3) indicates that 57% of screen
detected DCIS in the UK National Screening Programme is indeed high grade
(grade 3) disease.
The concept of identifying non lethal tumours is outwith our current
understanding and technological ability. Treatment regimes are allocated
according to evidence-based algorithms e.g. the Nottingham Prognostic
Index (NPI), providing the most appropriate management at the present
time, but obviously even then there will be individual patients whose
survival does not follow the predicted trends.
In the accompanying editorial, the authors suggest the development of
a protocol to biopsy lesions: the suggestion is to use size as a
criterion to differentiate lethal from non lethal tumours. This
suggestion is at odds with our current understanding of the natural
history of breast cancer.
Recognising that organised, population based screening is a programme
rather than an individual test, the monitoring of performance against
agreed criteria and standards is pivotal to maximising the benefits and
minimising the harm of a screening programme. Muir Gray et al (2)
published 20 years’ experience of the UK Screening Programme, showing
continuous quality improvement resulting in a cancer yield made up
mostly of small cancers, a significant proportion of which
are high grade disease. With the increasing accuracy of non operative
diagnosis, the vast majority of women attending the Screening Programme
are being reassured without recourse to open biopsy, whilst for those
women in whom cancer is diagnosed, their views as to surgical options
(conservation versus mastectomy plus or minus reconstruction) can be
validly taken on board.
At the time of diagnosis we are duty bound to treat cancers according
to the current, evidence based protocols. Quite rightly we should be
constantly challenged to question our treatment plans and their
accompanying evidence base.
DR HILARY M DOBSON.
Chairman - Quality Assurance Reference Centre,
Scottish Breast Screening Programme.
Clinical Director - West of Scotland Breast Screening Service
DR JEREMY ST J THOMAS.
Lead Breast Pathologist - NHS Lothian
1. Welch H G. Overdiagnosis and Mammography Screening. BMJ 2009;
339: 1425.
2. Muir Gray J A, Patnick J, Blanks R G. Maximising Benefit and
Minimising Harm of Screening BMJ 2008; 336: 480-483.
3. The Sloane Project. UK prospective audit of screen detected
non-invasive carcinomas and atypical hyperplasias of the breast. Progress
Report 2006/2007 and 2007/2008. Published by West Midlands Cancer
Intelligence Unit, Breast Cancer Research Trust, Pfizer Oncology, NHS
Cancer Screening Programmes.
Competing interests:
None declared
Screening mammography aims to detect breast cancer at an earlier more “curable” stage and women who have been screened should enjoy a lower incidence of breast cancer after they stop screening.
The UK NHS breast screening programme is organised such that in Scotland, some regions are attended by a screening van for 1 year every 3 years, with no screening occurring in the interim 2 years. A graph was plotted showing the population incidence of invasive breast cancer diagnosed in one region of Scotland (Fife), from 1980 to 2004. One would expect that in the first one or two rounds of screening the cancers in this very stable population would be detected and treated, so that in the subsequent two years the incidence would dip below the level before screening was introduced, by at least as much as the increase in diagnoses during the screening year.
In reality, the graph (figure 1) clearly shows that the population incidence of invasive breast cancer peaks every 3 years and these years correspond to the year when the region is visited by the breast screening van. However, the incidence in the following two years does not appear to reduce even after several successive rounds of screening and appears to remain at the level before screening was introduced. This suggests that these peaks represent the excess diagnoses (of early cancers) attributable to screening, without a corresponding reduction in symptomatic cancers. Although a new cohort of women would join the screening cohort every 3 years, the oldest cohort would leave- and after a few rounds of screening, the background incidence of symptomatic cancers detected in the years between screens should reduce. Firstly, the data show that it doesn’t reduce and secondly, the magnitude of the rise in incidence is far more than the number of cancers detected in those who are in the 50-53 age group. Thus, the incidence of breast cancer increases dramatically upon subjecting women to screening and this is not accompanied by a reduced incidence of symptomatic cancers in the following years.
Figure 1. Invasive breast cancer incidence in Fife
This figure shows the incidence of invasive and in situ breast cancer in the Fife region of Scotland. The incidence peaks every 3 years and these years correspond to the year when the region is visited by the breast screening van. However, the incidence (of symptomatic cancers) in the other two years does not appear to reduce even after several successive rounds of screening and remains similar to the incidence before screening was introduced. This suggests that these peaks represent the excess diagnoses (of early cancers) attributable to screening, without a corresponding reduction in symptomatic cancers.
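The expectation behind figure 1 can be made concrete with a toy calculation. The Python sketch below is illustrative only: the function name, the baseline of 100 cases, the excess of 30 cases and the even spreading of the deficit over the interim years are hypothetical assumptions, not values or methods taken from the Fife data. It also assumes that any lead time is shorter than one screening cycle.

```python
def expected_incidence(baseline, screen_excess, years, cycle=3,
                       overdiagnosed_fraction=0.0):
    """Toy model of yearly cases when a screening van visits every `cycle` years.

    In a visit year, `screen_excess` extra cancers are diagnosed. The share
    that is NOT overdiagnosed would have presented clinically anyway, so it
    is subtracted (spread evenly, an assumption) from the interim years.
    """
    interim_years = cycle - 1
    borrowed = screen_excess * (1.0 - overdiagnosed_fraction)
    series = []
    for year in range(years):
        if year % cycle == 0:
            # Screening year: baseline plus the screen-detected excess.
            series.append(baseline + screen_excess)
        else:
            # Interim year: deficit from cases already found by screening.
            series.append(baseline - borrowed / interim_years)
    return series


# Pure early detection: peaks are offset by dips below baseline.
no_overdiagnosis = expected_incidence(100, 30, 6)
# All excess is overdiagnosis: peaks with no compensatory dip.
all_overdiagnosis = expected_incidence(100, 30, 6, overdiagnosed_fraction=1.0)
```

Under these assumptions, the first series averages out to the baseline over each three-year cycle, whereas the second never falls below it, which is the pattern the letter reports seeing in the Fife graph.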
In a second analysis (figure 2), the population incidence of invasive breast cancer from Scotland was analysed for five cohorts of women separated by 10 years of age from 1984 to 2004. The 1st cohort represents women who were 25-29 years in 1984, 35-39 in 1994 and 45-49 in 2004 and the 5th cohort was 65-69 years in 1984 and 85+ in 2004; both these cohorts never received screening. The 2nd, 3rd and 4th cohorts received screening for 10, 14 and 4 years respectively.
Invasive breast cancer incidence amongst those cohorts who did not receive screening remained unchanged between 1984 and 2004. Amongst those cohorts who received screening, the incidence of breast cancer rose to a level usual for women 10-20 years older than them. Such a rise during the period of screening, however, did not reduce the incidence during the interval between screens, or after screening stopped at 65, when it was actually higher than expected.
Figure 2. Invasive breast cancer incidence in Scotland
This graph is plotted in an unusual way and represents the incidence of breast cancer in 5 cohorts of women over 20 years, who were in the age range as specified in the legend, in 1984. Thus, the following groups of points represent incidence for the same age group: CED, JAM, LK, as well as those points joined by the green lines. Women were undergoing screening at points A, K, L and M and between B and E (cohort 4), and K and D (cohort 3). The green lines demonstrate how the incidence of breast cancer has not increased significantly over the 20 years amongst those who did not receive screening. The red line demonstrates one example of the rise of incidence of breast cancer in age group 65-59, who had 15 years of screening compared with those who did not.
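Because figure 2 follows birth cohorts rather than fixed age groups, it may help to see how a cohort's age band shifts over time. The short Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the screening age range of 50-64 is assumed from the letter's remark that screening stopped at 65. The year in which screening began is not given in the letter, so the sketch does not attempt to reproduce the stated years of screening per cohort.

```python
def cohort_age_band(lower_age_1984, year, width=5, base_year=1984):
    """Return the (lowest, highest) age of a five-year cohort in `year`."""
    shift = year - base_year
    lower = lower_age_1984 + shift
    return (lower, lower + width - 1)


def in_screening_ages(band, start=50, end=64):
    """True if any age in the band falls inside the assumed screening ages."""
    lower, upper = band
    return lower <= end and upper >= start


# Lowest ages in 1984 of the five cohorts described in the letter.
cohort_lower_ages = [25, 35, 45, 55, 65]
bands_2004 = [cohort_age_band(age, 2004) for age in cohort_lower_ages]
```

For example, `cohort_age_band(25, 1994)` gives the 35-39 band quoted for the 1st cohort, and the 5th cohort's band lies above the assumed screening ages throughout 1984 to 2004.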
Women who undergo regular screening mammography raise their incidence of invasive breast cancer by 30% to 60%. This increase does not result in a reduction in incidence in later years. In the years between consecutive screens, as well as after stopping screening, rather than enjoying a reduced incidence, these women continue to experience a higher risk of invasive breast cancer.
Competing interests:
None declared
About benefits and harms of mammography screening
There is general agreement that modern therapies for early symptomatic breast cancer improve survival and can reduce disease-specific mortality. The widespread adoption of mammographic screening since the 1980s and the introduction of organised population-based screening programs have brought substantial improvements in awareness and access to quality breast health services for the population at large, not confined to the age group undergoing screening. This may well explain the early impact on mortality rates which, as Jørgensen and colleagues (2009a) argue, occurred too early to be attributed directly to the mammography exam itself. Resolution in this debate is particularly important for countries where breast cancer incidence is low and more compelling priorities limit resources that can be devoted to this disease. Clearly, control of breast cancer in these settings should start by making early diagnosis of clinically detectable tumours and quality treatment available to all cases.
But acknowledging the value of down-staging (reducing the proportion of cancers presenting at stage III or IV, when prognosis is poor) does not imply that mammography has no net effect over and above the early detection of palpable tumours. All randomised trials of mammography, except the Canadian trial, showed a benefit to differing extents (depending on when and where they were conducted) of advanced diagnosis of palpable tumours, but none lasted long enough to measure the long-term impact on mortality of treating sub-clinical tumours that are detected only by mammography. It is likely that the lead time of such tumours is much longer than the 2.4 years adopted by Duffy et al. (2008). If this is the case, the full impact of mass screening in the UK and in other regularly screened populations has begun to be manifest only in recent years. The latest reported incidence estimates for the UK showed a decline in 2006 in the age-group 50-64 years targeted for screening since the beginning of the programme (Cancer Research UK, 2009). If this decrease persists, it might indicate that invasive cancers were being prevented by treating in situ tumours. Weiss (2007) suggested that the decline in the incidence of invasive breast cancer in American women aged 40-49 years, between 1975 and 2002, could be a consequence of treating screen-detected in situ cancers that had been increasing rapidly in the same time period in that age group.
Many factors contribute to time trends in breast cancer incidence including varying screening intensity, modality, coverage as well as changes in the prevalence of late-stage promoting factors such as hormone replacement therapy. It is difficult by ecological analyses alone to interpret such trends, let alone to quantify the effects of the factors involved. A more reliable quantification of the long-term effects of screening will be soon available from on-going studies of time trends by screening history that link incident cases and deaths with information from screening registries.
The concern about the excess incidence caused by screening is that some of those tumours would not have progressed to a clinical stage in the woman’s lifetime. Some women would therefore suffer the distress of treatment with no benefit. However, around 1980, before the introduction of screening, half of the cases of breast cancer occurred in women younger than 65 years (figure); in 1985 British women aged 50 years were expected to live on average 30 more years (Office for National Statistics, 2009); the majority of women who were the target of screening were therefore destined to live for another two decades at least. The figure shows for comparison the incidence of prostate cancer, a disease for which screening with prostate-specific antigen is much more controversial. The age at which the incidence of prostate cancer in men begins to rise steeply is 20 years older than that seen for the rise in female breast cancer cases; moreover, the prevalence of sub-clinical tumours of the prostate is at least one order of magnitude greater than in the female breast. These observations suggest that for prostate cancer the proportion of screen-detected men who would be unlikely to benefit from early diagnosis and treatment is much greater than for women with early breast cancer.
From what we know of the natural history of breast cancer, most malignant tumours progress to a life-threatening stage sooner or later. At present we do not have reliable information on how long that takes in order to quantify the balance between harms (unnecessarily treated tumours) and benefits (saved lives). What we do know is that mortality from breast cancer in the UK has never been lower since 1950. With all-causes mortality rates falling and increasing life-expectancy in the UK (Office for National Statistics, 2009), more and more of those screen-detected tumours that Jørgensen and Gøtzsche (2009b) today count as over-diagnosed and unnecessarily treated may be counted as lives saved tomorrow.
Paola Pisani
University of Torino
Torino, Italy
David Forman
University of Leeds
Leeds, United Kingdom
Joe Harford
NCI/NIH
Bethesda, MD, USA
Cancer Research UK http://info.cancerresearchuk.org/cancerstats/types/breast/incidence/ accessed 30 July 2009.
Duffy SW, Lynge E, Jonsson H, Ayyaz S, Olsen AH. Complexities in the estimation of overdiagnosis in breast cancer screening. Br J Cancer. 2008;99(7):1176-8.
Gøtzsche PC, Jørgensen KJ, Mæhlen J, Zahl P-H. Estimation of lead time and overdiagnosis in breast cancer screening (Letter). Br J Cancer. 2009;100:219.
Jørgensen KJ, Brodersen J, Nielsen M, Hartling OJ, Gøtzsche PC. Fall in breast cancer deaths. A cause for celebration, and caution. BMJ. 2009a;338:b2126.
Jørgensen KJ, Gøtzsche PC. Overdiagnosis in publicly organised mammography screening programmes: systematic review of incidence trends. BMJ. 2009b;339:b2587.
Jørgensen KJ, Gøtzsche PC. Breast screening: fundamental errors in estimate of lives saved by screening. BMJ. 2009c;339:b3359.
Office for National Statistics. Interim Life Tables, United Kingdom, 1980-82 to 2005-07. http://www.statistics.gov.uk/STATBASE/Product.asp?vlnk=14459, accessed 1 Sept 2009.
Parkin DM, Whelan S, Ferlay J and Storm H. Cancer Incidence in Five Continents, Vol. I to VIII. IARC CancerBase No. 7, Lyon, 2005.
Weiss NS. Breast cancer trends (Letter). Epidemiology. 2007;18:284.
Competing interests:
None declared