Selection for the Master’s in Statistics - English Language Exam - 17/11/09
Read the text below carefully and answer the questions about its content.
Answers must be written in PORTUGUESE and in pen.
The Lady Tasting Tea (author: David Salsburg)
Review by David S. McIntosh
Imagine that you have just been tested for an incurable, fatal disease, and the test results
came back positive: you have the disease. But maybe you don’t, you tell yourself. Blood
tests aren’t perfect, especially for newly-discovered diseases like AIDS. The scientists that
developed the test say that it correctly identifies the disease in 90 percent of the people who
actually have it, and that it is 99 percent accurate in telling people who don’t have it that
they are uninfected. What’s the likelihood that you have the disease? Somewhere between
90 and 99 percent? It might be time to draw up a will.
Before you jump to conclusions, though, you need to know what percentage of the overall
population is infected. If ten percent is, then the chance that you are infected is nine out of
ten. But if only one tenth of one percent of the population is, then your chance of actually
having the disease is about one in twelve. However, it would be premature to break out the
champagne quite yet. When we said the population, did that mean the population of the
whole country, or the population of people being tested? If it is truly random who decides
to get tested, this is a moot point. But in reality, the people getting tested are probably
sicker than the average population, or at least more at risk of becoming infected. To know
how likely it is that you are infected, you have to know not just the accuracy of the test (in
both directions), but also the average condition of the other people being tested. The test
results you got back were telling, but they didn’t prove anything.
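
The figures in this example follow from Bayes’ rule. The sketch below is a minimal illustration, assuming the 90 percent figure is the test’s sensitivity and the 99 percent figure its specificity; the function and parameter names are hypothetical and do not come from the review.

```python
def posterior_infected(prevalence, sensitivity=0.90, specificity=0.99):
    """Probability of being infected given a positive test (Bayes' rule).

    Assumed reading of the text: sensitivity is the share of infected people
    who test positive; specificity is the share of uninfected people who
    test negative.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# The two prevalence values discussed in the text.
for prevalence in (0.10, 0.001):
    chance = posterior_infected(prevalence)
    print(f"prevalence {prevalence:.1%}: P(infected | positive) = {chance:.3f}")
```

With a prevalence of ten percent the result is about 0.91 (nine out of ten); with one tenth of one percent it is about 0.083 (roughly one in twelve).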
Looking at this example, we used words like test, accuracy, sample, population, chance,
percent, probably, random, average, and prove: words that have specific meaning in the world
of statistics. One cannot intelligently discuss medicine, physics, economics or baseball without an occasional regression into statistics. Statistics is an inescapable part of modern life.
What is surprising is that statistics is a very young science, with roots in the early 19th
century.
Newtonian mechanics gave astronomers the tools they needed to model the movements of
the planets. When astronomers’ observations of planetary locations didn’t match their predictions, they assumed the differences were errors in measurement. Presumably, as telescopes
got better, the errors would get smaller. After all, the universe operated according to the
deterministic laws of Newton and Kepler, as the discovery of Neptune in 1846 demonstrated.
Unfortunately, better telescopes produced observations at greater variance from what the
math had predicted. Either the universe has an intrinsic randomness to it, or observations
are inherently inexact. Or both. Either way, as the 19th Century rolled along, it became
clear that science would have to be based on a more statistical understanding of things.
David Salsburg’s book The Lady Tasting Tea takes its name from an anecdote involving
Ronald Aylmer Fisher, one of the greats of modern statistics. Fisher, the story goes, was
at a summer tea party, when one of the guests insisted that she could of course taste
the difference between tea poured into milk and tea that milk had been poured into. To
many of the guests, this was an absurd assertion. To Fisher it was an interesting question
of experimental design. How many cups of tea, in what order, would she have to taste,
with what accuracy, before one could conclude that her statement was true? Could one ever
know?
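
One way to make Fisher’s question concrete is to compute how often pure guessing would succeed. The sketch below assumes a commonly cited version of the design, which the review itself does not spell out: eight cups, four of each kind, with the taster told she must pick out the four milk-first cups. The function name and its parameters are hypothetical.

```python
from math import comb


def prob_at_least(correct, cups_per_kind=4):
    """Chance of naming at least `correct` milk-first cups by guessing alone.

    Assumed design: 2 * cups_per_kind cups, half of each kind, and the taster
    selects exactly cups_per_kind of them as milk-first. The number of correct
    picks then follows a hypergeometric distribution.
    """
    n = cups_per_kind
    total_selections = comb(2 * n, n)
    favourable = sum(comb(n, k) * comb(n, n - k) for k in range(correct, n + 1))
    return favourable / total_selections


print(prob_at_least(4))  # all four right: 1 chance in 70
print(prob_at_least(3))  # three or more right: 17 chances in 70
```

Under these assumptions a perfect score would happen by luck only once in 70 tries, while getting three or more right would happen about a quarter of the time, so only a perfect score would be persuasive evidence.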
Throughout the 1920s, while working at the Rothamsted Agricultural Experimental
Station, Fisher produced a stream of papers for the Journal of Agricultural Science with
snappy titles like “Studies in Crop Variation VI”. One of Fisher’s lasting contributions,
obvious in hindsight, was that the best statistical inferences are drawn from carefully designed
experiments with randomized sampling.
Sometimes this is impractical, such as when studying people who smoke. At other times,
testing is done on intentionally-less-than-random samples. The Nielsen ratings of television
viewing are based on the habits of a group of families (a “judgment sample”) chosen to mirror
the socioeconomic and geographical diversity of the TV-watching US population. The “man
on the street” polls that appear on the TV news are based on “opportunity samples”, or
whoever happens to be on the street that day. When polls go disastrously wrong, it is
because the sample populations were not representative of the real population. In 1936, the
Literary Digest confidently predicted that Alf Landon would trounce Franklin Roosevelt in
the presidential election. The magazine had conducted a telephone survey of its readers, and
found they would be voting for Landon. Embarrassingly, there were more people without
phones who voted for FDR than people with phones who voted for Landon.
This type of mistake crops up in medical research all the time. For the last ten years, it
has been standard practice to prescribe hormone replacement therapy for post-menopausal
women. Doctors had observed that women who took estrogen had fewer blood clots and
heart attacks. Then, in 1998, more careful research showed that the hormones actually
increase the risk of heart attack. The researchers realized that the women taking estrogen
had been in better health to start with, less likely to smoke, and more likely to see their
doctors regularly. Once again, correlation was confused with causality, and the culprit was
a non-random sample.
We all understand the idea of bell-shaped distribution curves, with most of the measurements clustering around the middle, and fewer and fewer as you get farther away. When the
Augustinian monk Gregor Mendel fudged the numbers in his famous genetic experiments,
he couldn’t have suspected that subsequent generations would expose him. But the data
he recorded didn’t show the normal degree of randomness that real data always does. To
paraphrase Johnnie Cochran, “If the data don’t fit, you must acquit.”
Distributions have outliers, which people often forget. Warren Buffett has an unrivaled
record as an investment manager, and year after year his Berkshire Hathaway Inc. outperforms the stock market. It’s unseemly to say so, but there is a chance he’s just been lucky.
Let’s assume there are 100,000 investment managers in the US, and it’s purely random
whether any of them will outperform the market. Fifty percent do and fifty percent don’t.
After ten years, 98 of them would have an unbroken record of success. After fifteen years,
only three. But those three weren’t any smarter than the other investment managers. They
just represent the tail end of a very large distribution curve. Maybe Buffett is just lucky.
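
The arithmetic behind these figures is a repeated coin flip. The sketch below is a minimal illustration, assuming each manager independently beats the market with probability one half every year; the function name is hypothetical.

```python
def expected_unbroken(managers, years, p_beat=0.5):
    """Expected number of managers who beat the market every single year.

    Assumes independent years and the same success probability for everyone,
    as in the thought experiment in the text.
    """
    return managers * p_beat ** years


for years in (10, 15):
    survivors = expected_unbroken(100_000, years)
    print(f"after {years} years: about {survivors:.1f} unbroken records")
```

This gives about 97.7 managers after ten years and about 3.1 after fifteen, matching the rounded figures of 98 and three.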
The statistician Chester Bliss once studied the efficacy of different insecticides. Bliss
found that no matter how much insecticide the bugs were exposed to, a few would always
manage to survive, and no matter how little they were exposed to, a few would die. It was
almost impossible to say how much insecticide would kill every insect, but relatively easy to
identify the lethal dose that would kill half the insects, or the LD-50. In a way, this is like
measuring the half-life of a radioactive substance. Interestingly, it takes considerably more
than five times as much data to determine the LD-10 as the LD-50. Just like accelerating a
particle ever closer to the speed of light, the amount of effort required increases exponentially.
Salsburg tells a story about his own experience running a study to find out what level of
a compound would be lethal to one percent of the mice exposed to it. He determined
that he would need several hundred million mice to get an acceptably certain LD-01, and
recommended the experiment be reconsidered.
Everybody knows that asbestos is a bad thing. When the crystalline particles of this
mineral lodge in the lungs, they can eventually lead to mesothelioma, asbestosis, or lung
cancer. A combination of lawsuits and regulations has reduced the amount of asbestos we are
exposed to. All of the companies that once produced it have been bankrupted. Regulations
now mandate that when buildings like schools and offices are renovated, all asbestos must
be removed or encapsulated by specially trained, protected, and insured firms. The legal
standard is that no amount of asbestos can be shown to be safe, and so no amount can be
tolerated. People have a very hard time making sense of the tail of a distribution curve.
Law, formal logic, and statistics are not always in concert about the meaning of proof and
causality.
Cigarette smoking is at least as bad as asbestos. People have referred to cigarettes as
coffin nails for the last eighty years. But it has been hard to prove that cigarette smoking
is harmful. The statistical tools used by epidemiologists have often been ones that Fisher
designed to test the significance of controlled experiments. Studies of smoking look at opportunity samples, or people who already started smoking. No one is proposing to create
a double-blind study that asks half the participants to start smoking two packs a day. But
the problem bedevils the courts: there is a preponderance of evidence that smoking is bad,
yet each one of the studies is in some way flawed. To statisticians like Jerome Cornfield, the
odds of all those studies being wrong, even if they are flawed, are so low as to constitute proof.
To people like Kip Viscusi, a Harvard Law School professor and the founding editor of the
Journal of Risk and Uncertainty, it is wrong to use the tools of statistics to assert what they
can’t. Law and statistics continue to be uneasy partners.
The Lady Tasting Tea makes the case that statistics changed the nature of science in the
20th Century. It is just as true that it has changed business. More than any other person,
W. Edwards Deming brought statistical thinking to the attention of corporate executives.
First in Japan and later in the US, he showed how quality control was a way to improve
both costs and quality. Each step in a production process has its own variability, and by
addressing the steps with the most variability first and getting to the others in turn, overall
variability can be reduced. Although many people associate Deming with Total Quality
Management, he in fact abhorred it. It is appropriate to reduce variability, he said, but
it is naive to try to eliminate all defects. Statistics says the world doesn’t work that way.
The Lady Tasting Tea is one of the more painless introductions to statistics I have seen.
Rather than taking the readers through the mathematical undergrowth, Salsburg organizes
his book around the individuals that created the field. Anecdotes abound. No formulas are
to be found. By using a historical narrative, Salsburg illustrates how the field developed: how
rivalries and world wars shaped people’s careers and how the mundane needs of farmers
and pharmacologists led to the creation of a new science. As a result, fields such as law,
economics, and even textual analysis have been changed forever. Of course, I can’t prove
that, but it’s probably true.
Answer the following questions based on the text. Answers must be
written in PORTUGUESE and in pen.
1) According to the text, what percentage of the people who are actually infected with the
AIDS virus test positive on the clinical test?
2) Before drawing conclusions about the probability that someone is infected, what
additional information is needed?
3) According to the text, which people are more likely to get tested for infection with the
AIDS virus?
4) According to the text, astronomers believed that observation errors would shrink
as better telescopes were built. What actually happened?
5) The title of David Salsburg’s book comes from an anecdote involving Ronald Fisher.
Describe the story.
6) What questions did the story of the lady drinking tea raise in Fisher’s mind?
7) What was one of the greatest contributions of Fisher’s work?
8) What was the reason for the major error in the Literary Digest’s forecast of the result of the 1936 United States presidential election?
9) What led doctors to adopt the practice of prescribing hormone replacement
therapy for post-menopausal women?
10) Which health problem has its risk increased by the use of estrogen?
11) What led scientists to conclude mistakenly that hormone replacement would reduce
the risk of heart problems in post-menopausal women?
12) Why is Gregor Mendel suspected of having falsified the data from his genetic experiments?
13) Describe what the statistician Chester Bliss initially observed about the effect of insecticides on insect deaths.
14) What does the abbreviation LD-50 stand for?
15) What amount of asbestos is legally tolerated?
16) According to the text, what is the problem with the statistical techniques used to demonstrate the
harms of cigarettes?
17) Who was responsible for drawing corporate executives’ attention to the use
of statistical techniques?
18) What was the contribution of quality control to industry?
19) According to the text, in what order should the problem of variability in
each step of the production process be tackled?
20) According to the author of the text, why is the book The Lady Tasting Tea a less
painful introduction to Statistics?