Journal of Experimental Psychology: Human Learning and Memory
Vol. 4, No. 6, November 1978

Judged Frequency of Lethal Events

Sarah Lichtenstein, Paul Slovic, Baruch Fischhoff, Mark Layman, and Barbara Combs
Decision Research, A Branch of Perceptronics, Eugene, Oregon

A series of experiments studied how people judge the frequency of death from various causes. The judgments exhibited a highly consistent but systematically biased subjective scale of frequency. Two kinds of bias were identified: (a) a tendency to overestimate small frequencies and underestimate larger ones, and (b) a tendency to exaggerate the frequency of some specific causes and to underestimate the frequency of others, at any given level of objective frequency. These biases were traced to a number of possible sources, including disproportionate exposure, memorability, or imaginability of various events. Subjects were unable to correct for these sources of bias when specifically instructed to avoid them. Comparisons with previous laboratory studies are discussed, along with methods for improving frequency judgments and the implications of the present findings for the management of societal hazards.

How well can people estimate the frequencies of the lethal events they may encounter in life (e.g., accidents, diseases, homicides, suicides, etc.)? More specifically, how small a difference in frequency can be reliably detected? Do people have a consistent internal scale of frequency for such events? What factors, besides actual frequency, influence people's judgments?

The answers to these questions may have great importance to society. Citizens must assess risks accurately in order to mobilize society's resources effectively for reducing hazards and treating their victims. Official recognition of the importance of valid risk assessments is found in the vital statistics that are carefully tabulated and periodically reported to the public (see Figure 1). There is, however, no guarantee that these statistics are reflected in the public's intuitive judgments.

Few studies have addressed these questions. Most investigations of judged frequency have been laboratory experiments ...

This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by the Office of Naval Research under Contracts N00014-76-C-0074 and N00074-78-C-0100 (ARPA Order Nos. 3052 and 5469) under subcontract to Oregon Research Institute and Subcontracts 76-030-0714 and 78-072-0722 to Perceptronics, Inc. from Decisions and Designs, Inc.
We would like to thank Nancy Collins and Peggy Roecker for extraordinary diligence and patience in typing and data analysis. We are also grateful to Ken Hammond, Jim Shanteau, Amos Tversky, and an anonymous reviewer for perceptive comments on various drafts of this article.
Requests for reprints should be sent to Sarah Lichtenstein, Decision Research, 1201 Oak Street, Eugene, Oregon 97401.
Copyright 1978 by the American Psychological Association, Inc. 0096-1515/78/0406-0551$00.75

[Figure 1. Newspaper clippings reporting statistics on lethal events. Legible headlines include "Earthquakes not more numerous," "Botulism outbreaks," "Fire fatalities," and "Traffic deaths."]

... min [p(x, y), p(y, z)], while weak stochastic transitivity requires only that p(x, z) > 1/2.

... triad with the log of the product of the geometric mean ratios for A:B and B:C. The relationship was linear with r = .99 (slope = 1.10; antilog of intercept = .83) for the college students and r = .97 (slope = 1.05; antilog of intercept = 1.09) for the league members. These results suggest that as a group, these subjects exhibited an interval scale of subjective frequency.

Between-groups comparisons. The responses of the students and the league members were highly similar. Across all 106 pairs, the correlation between the two groups was .93 for both percentage correct and geometric mean judged ratio. The high correlation between the two groups' secondary bias residuals is further evidence of this similarity. The league members had a somewhat higher percentage correct than the students (M = 76.8 vs. 71.3); their percentage correct was higher for 80 pairs, equal for 5 pairs, and lower for 21 pairs (sign test; p < .001). For the ratio judgments, however, the league members did not perform significantly better than the students; the geometric mean of their ratio judgments was closer to the true ratio for only 62 of the 106 pairs (sign test; z = 1.65, p > .10).

Individual performance. The performance of individual subjects was rather variable. Percent correct ranged between 56% and 84% for the students and between 60% and 89% for the league members.
Analysis of the correlations between log judged ratio and log true ratio over 101 items indicated that few individuals showed any appreciable ability to perform the ratio-estimation task. These correlations ranged between -.11 and .72 (Mdn = .45) for students and between .10 and .80 (Mdn = .51) for league members.

Further insight into the level of individual subjects' performance was obtained by calculating an error ratio, defined as the ratio of the judgment to the truth, or vice versa, whichever was greater than 1. A subject who always gave a judged ratio off by a factor of 10, that is, either 10 times as large or a tenth as large as the true ratio, would have a mean error ratio of 10. The median student subject erred by a factor of 22.5 ...

[Figure 8. Geometric means of subjects' ratio judgments as a function of true ratio for 100 pairs of words. Horizontal axis: true ratio. r = .90; slope = 1.03; antilog of intercept = 1.95.]

... (57% of subjects thought in was more likely than that; 56%, that more likely than for; and 51%, for more likely than in). Of the 20 triads contained in the occupations task, 17 showed strong stochastic transitivity, and 3 showed moderate stochastic transitivity.

The log geometric mean ratio response to the third pair of each triad was correlated with the log of the product of the responses of the other two pairs; these correlations were .94 for words (slope = 1.21, antilog of intercept = .80) and .76 for occupations (slope = .64, antilog of intercept = 5.32). Thus, words and occupations judgments showed considerable internal consistency, as found with causes of death.

Comparison with Experiment 1. The purpose of Experiment 2 was to find out whether the major findings of Experiment 1 were specific to lethal events. Three results of this comparison are noteworthy. First, subjects responded more accurately to words than to occupations; causes of death were worse yet.
This may be due to exposure: We experience many more samples of English text each day than examples of people working in occupations, and our exposure to death is even more limited. Another possible reason for poorer performance with causes of death is that our exposure to these events is systematically biased. We shall discuss this bias later in the article.

Second, we found that causes-of-death subjects tended to underestimate ratios larger than 50:1. Underestimation did not appear at all with words and was found with occupations only for ratios of 1,000:1. Thus, one cannot conclude that the primary bias found in Experiment 1 was simply due to difficulties in using large numbers rather than to insufficient discrimination between different causes of death.

Third, we found strong evidence in these new tasks that subjects possess consistent subjective frequency scales for these content areas, as they did for causes of death.

[Figure 9. Geometric means of subjects' ratio judgments as a function of true ratio for 95 pairs of occupations. Horizontal axis: true ratio. r = .81; slope = .84; antilog of intercept = 1.17.]

Experiment 3: Direct Estimates of Event Frequencies

Experiment 1 suggested that subjects have a consistent underlying scale for the frequency of lethal events, although that scale deviates markedly from the statistically correct one. Unfortunately, the incomplete paired-comparison design used in Experiment 1 did not permit the subjective scale to be uncovered for all events. When the judged relative frequencies for a given pair were in error, it was difficult to determine whether judgments were biased for one, the other, or both members of the pair. Experiment 3 elicited direct estimates to clarify the nature of the biases for individual lethal events.

Method

The subjects were 74 respondents to an advertisement in the University of Oregon campus newspaper.
Each subject was assigned to one of two groups. One group of 40 subjects was told that the frequency of deaths in the U.S. due to motor vehicle accidents was 50,000 per year (MVA group). Using this value as a standard, they were asked to estimate the frequency for the other 40 lethal events shown in Table 1. The remaining 34 subjects (Group E) were given the standard of 1,000 deaths by electrocution. The glossary used in Experiment 1, which defined some of the events, was provided. The 41 events were listed in alphabetical order on a single sheet. Subjects were encouraged to erase and change answers to make the relative frequencies of the entire set consistent with their best opinions.

Since there were about 205,000,000 persons in the United States when the data were collected, the rates per 10^8 shown in Table 1 were multiplied by 2.05 to provide statistical frequencies against which to compare subjects' judgments. The standards given to the subjects, 1,000 for electrocutions and 50,000 for motor vehicle accidents, were close to these computed statistical frequencies (1,025 and 55,350, respectively).

Results

The data for one subject from Group MVA and two subjects from Group E were excluded from all analyses because they gave unreasonably high estimates (the sum of their estimates for all 41 causes of death exceeded 50,000,000, whereas the sum of the statistical frequencies is 3,553,004). Another subject was excluded from Group E because of unusually low responses. All of this subject's responses were below 1,000 (the value of the standard); 38 of 40 responses were less than 100. As a result of these exclusions, the data presented below are based on 39 subjects in Group MVA and 31 subjects in Group E.

Because arithmetic means tend to be unduly influenced by occasional extreme values, the present results are based on the geometric means of the estimates. The use of medians leads to essentially the same results. For both groups, the correlation between log geometric mean and log median was .99 (for Group MVA, slope = 1.01, antilog of intercept = .97; for Group E, slope = 1.00, antilog of intercept = 1.17).

The log geometric mean direct estimates for Groups E and MVA were highly correlated (r = .98). However, as shown in Table 5, the geometric means for the MVA group were larger than those for Group E for 34 of 41 causes (sign test; p < .01). This difference may be due to MVA subjects anchoring on a larger standard than that presented to E subjects. (The two columns in Table 5 labeled ratio of judged to predicted will be discussed later in the article.)

Accuracy. Figures 10 and 11 show the geometric mean judgments plotted against the true rates (excluding smallpox). The best-fitting quadratic curves are also shown. For both groups, quadratic equations provided a significantly better fit (p < .01) to the data than linear equations. For the MVA group, the correlation between the log geometric mean responses and the predictions from the quadratic equation was .92; the linear correlation was .89. For Group E the correlations were .93 (quadratic) and .91 (linear).

Although the log geometric mean estimates correlated highly with the true frequency, these correlations, calculated over a true frequency range of over 800,000, do not indicate substantial accuracy. Large estimation errors were evident, as with the paired-comparison judgments. For example, as Table 5 indicates, accidental death was again judged about equal in frequency to all diseases (although death from disease is 15 times more likely), cancer was judged to be about twice as frequent as heart disease (the reverse is true), floods were estimated to take more lives than asthma (asthma is 9 times more likely), diabetes was seen as only half as frequent as fire and flames, homicides were judged almost as frequent as stroke, and so on.

Table 5
Results from Direct Estimates

                                                     MVA group               Electrocution group
                                      Rate per       Geometric  Ratio of     Geometric  Ratio of
Cause                                 2.05 × 10^8    mean       judged to    mean       judged to
                                                                predicted               predicted
Smallpox                                      0           88                      37
Poisoning by vitamins                         1          237      1.27            44      1.16
Botulism                                      2          379      1.97            88      1.96
Measles                                       5          331      1.39            85      1.47
Fireworks                                     6          331      1.54            77      1.26
Smallpox vaccination                          8           38       .17            14       .22
Whooping cough                               15          171       .69            51       .62
Polio                                        17          202       .80            47       .55
Venomous bite or sting                       48          535      1.67           233      1.85
Tornado                                      90          688      1.82           463      2.86
Lightning                                   107          128       .32            64       .37
Nonvenomous animal                          129          298       .71           102       .54
Flood                                       205          863      1.77           627      2.71
Excess cold                                 334          468       .81           211       .73
Syphilis                                    410          717      1.15           338      1.05
Pregnancy, childbirth, and abortion         451        1,932      2.98           935      2.78
Infectious hepatitis                        677          907      1.19           328       .80
Appendicitis                                902          880      1.03           416       .87
Electrocution                             1,025          586       .65         1,000*     1.96
Motor-train collision                     1,517          793       .74           598       .95
Asthma                                    1,886          769       .65           333       .47
Firearms                                  2,255        1,623      1.26         1,114      1.42
Poisoning                                 2,563        1,318       .96           778       .92
Tuberculosis                              3,690          966       .59           448       .43
Fire and flames                           7,380        3,814      1.62         2,918      1.86
Drowning                                  7,380        1,989       .85         1,425       .91
Leukemia                                 14,555        2,807       .81         2,220       .92
Accidental falls                         17,425        2,585       .68         2,768      1.03
Homicide                                 18,860        8,441      2.10         3,691      1.30
Emphysema                                21,730        3,009       .69         2,696       .86
Suicide                                  24,600        6,675      1.42         3,280       .97
Breast cancer                            31,160        3,607       .66         2,436       .61
Diabetes                                 38,950        2,138       .34         1,019       .22
Motor vehicle accident                   55,350       50,000*     6.34        33,884      5.76
Lung cancer                              75,850        9,723      1.00         9,806      1.33
Stomach cancer                           95,120        4,878       .43         2,209       .26
All accidents                           112,750       86,537      6.77        91,285      9.32
Stroke                                  209,100       10,668       .54         4,737       .31
All cancer                              328,000       47,523      1.70        43,772      2.00
Heart disease                           738,000       25,900       .49        21,503       .51
All disease                           1,740,450       80,779       .75        97,701      1.14

Note. MVA = motor vehicle accident.
* Standard.
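The comparisons cited in the Accuracy paragraph can be checked directly from Table 5. The following is a minimal sketch, not part of the original analysis: it copies the statistical frequencies and the MVA-group geometric means for the ten causes mentioned in the text and computes the true and judged ratio for each pair. The names `table5` and `ratios` are illustrative only.

```python
# (statistical frequency, MVA-group geometric mean estimate), copied from Table 5.
table5 = {
    "All accidents": (112_750, 86_537),
    "All disease": (1_740_450, 80_779),
    "All cancer": (328_000, 47_523),
    "Heart disease": (738_000, 25_900),
    "Flood": (205, 863),
    "Asthma": (1_886, 769),
    "Diabetes": (38_950, 2_138),
    "Fire and flames": (7_380, 3_814),
    "Homicide": (18_860, 8_441),
    "Stroke": (209_100, 10_668),
}


def ratios(a, b):
    """Return (true ratio, judged ratio) of cause a to cause b."""
    true_a, judged_a = table5[a]
    true_b, judged_b = table5[b]
    return true_a / true_b, judged_a / judged_b


# Each pair lists the statistically more frequent cause first.
pairs = [
    ("All disease", "All accidents"),
    ("Heart disease", "All cancer"),
    ("Asthma", "Flood"),
    ("Diabetes", "Fire and flames"),
    ("Stroke", "Homicide"),
]

for a, b in pairs:
    true_ratio, judged_ratio = ratios(a, b)
    print(f"{a} : {b}   true {true_ratio:6.2f}   judged {judged_ratio:5.2f}")
```

A judged ratio below 1 means that the group's geometric means reversed the true ordering of the pair, as happens for disease versus accidents and for asthma versus flood.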
The errors evident in the direct estimates were partitioned into primary and secondary components, as was done with the paired-comparison judgments in Experiment 1.

The primary bias was an overestimation of low-frequency events and underestimation of high-frequency events by both groups. As shown by the quadratic curve in Figure 10, the crossover point for Group MVA was at a true rate of about 800; all events with frequencies lower than that were overestimated, and all above that point were underestimated. For Group E (see Figure 11) the crossover point was less clear; it occurred around a true rate of 250.

[Figure 10. Geometric means (GM) of ratio judgments by motor vehicle accident group subjects as a function of true frequency (TF). Horizontal axis: true frequency. (Curved line is best-fitting quadratic: log GM = .07 (log TF)^2 + .03 log TF + 2.27.)]

Secondary bias. Deviations from the regression curves were quite similar for the two groups (see Figures 10 and 11). The correlation between the two groups' residual values (i.e., the vertical distance between each point and the regression curve) was .91 across the 40 items (excluding smallpox), indicating a consistent secondary bias above and beyond the primary bias evidenced by the regression curves. The antilogs of these residuals are shown in Table 5, in the columns labeled ratio of judged to predicted. Some of the items with large residuals are labeled on the two figures. The similarity between the two groups of subjects, relative to their own regression lines, is striking. Frequency of death due to all accidents, motor vehicle accidents, pregnancy, flood, tornado, and cancer was relatively overestimated by both groups. Death due to smallpox vaccination, diabetes, lightning, heart disease, tuberculosis, and asthma was relatively underestimated by both.

[Figure 11. Geometric means (GM) of ratio judgments by electrocution group subjects as a function of true frequency (TF). Horizontal axis: true frequency. (Curved line is best-fitting quadratic: log GM = .05 (log TF)^2 + .22 log TF + 1.58.)]

Comparison with Experiment 1. Overall, there is a close relationship between the direct estimates of the present experiment and the paired-comparison results of Experiment 1. From the geometric means of the direct estimates one can compute ratios for each of the 106 pairs studied in Experiment 1. The logs of these derived ratios were highly correlated with the logs of the geometric mean frequency ratios from Experiment 1 (college students): r = .94 for the MVA group and .93 for the E group (across all 106 pairs).

Neither the judged ratios from Experiment 1 nor the ratios derived from the direct estimates of the present experiment were consistently closer to the true ratios. The judged ratios from Experiment 1 were less accurate when the true ratio was low (< 10:1) and more accurate when the true ratio was high (> 10:1).

Individual performance. For each subject the correlation between log response and log true rate was calculated across the 40 stimuli (excluding smallpox). Individuals in Group E showed a range from .61 to .92 and a median of .77. Within Group MVA, correlations ranged from .28 to .90; the median was .66. Again, these correlations do not indicate substantial accuracy.
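One way to see why correlations of this size need not reflect accuracy is to simulate a judge who makes only a single rough discrimination. The sketch below is illustrative and not part of the original study: the judge, the cutoff of 1,000, and the two answers (100 and 10,000) are hypothetical choices; the true frequencies are the 40 statistical frequencies from Table 5 (excluding smallpox).

```python
import math

# Statistical frequencies from Table 5, excluding smallpox (40 causes).
true_freq = [
    1, 2, 5, 6, 8, 15, 17, 48, 90, 107,
    129, 205, 334, 410, 451, 677, 902, 1025, 1517, 1886,
    2255, 2563, 3690, 7380, 7380, 14555, 17425, 18860, 21730, 24600,
    31160, 38950, 55350, 75850, 95120, 112750, 209100, 328000, 738000, 1740450,
]


def crude_judge(tf, cutoff=1_000, low=100, high=10_000):
    """Hypothetical judge who only sorts causes into 'rare' and 'common'."""
    return low if tf < cutoff else high


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


log_true = [math.log10(tf) for tf in true_freq]
log_judged = [math.log10(crude_judge(tf)) for tf in true_freq]
r = pearson(log_true, log_judged)
print(f"r = {r:.2f}")
```

Under these assumptions the two-answer judge reaches a correlation of roughly .8, which is above the median individual correlation reported for either group, even though every estimate is one of only two numbers.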
Subjects who could make only the roughest discriminations, for example, knowing that death from botulism or lightning is less likely than death from all cancer or all accidents, would show high correlations.

Table 6
Ratings on Eight Predictor Variables

                                       Indirect           Direct             Newspaper  Newspaper  Catas-   Condi-
Cause                                  Death   Suffering  Death   Suffering  frequency  inches     trophe   tionality
Smallpox                               2.20    2.48       1.02    1.33          0          0       1.35      5.87
Poisoning by vitamins                  1.23    1.43       1.00    1.07          0          0       1         4.36
Botulism                               2.82    2.82       1.03    1.36          0          0       1.49     10.32
Measles                                2.07    3.00       1.05    2.41          0          0       1         1.81
Fireworks                              2.43    2.85       1.10    1.56          0          0       1         3.73
Smallpox vaccination                   1.30    1.71       1.03    1.41          0          0       1          .71
Whooping cough                         1.48    1.95       1.00    1.38          0          0       1         3.84
Polio                                  2.49    2.87       1.15    1.77          0          0       1         4.81
Venomous bite or sting                 2.41    2.97       1.05    2.15          0          0       1         6.84
Tornado                                3.46    3.75       1.07    1.38         36        153.5     4.51      6.25
Lightning                              2.34    2.38       1.05    1.23          1           .8     1.01     10.06
Nonvenomous animal                     2.30    2.89       1.03    1.82          4         33.8     1         3.19
Flood                                  3.66    4.05       1.12    1.56          4         41.8     5.57      6.52
Excess cold                            2.62    2.93       1.15    1.57          0          0       1.20     10.15
Syphilis                               2.51    3.67       1.07    1.79          0          0       1         5.19
Pregnancy, abortion, and childbirth    3.07    3.84       1.13    2.03          0          0       1.01      4.57
Infectious hepatitis                   2.03    2.77       1.12    2.02          0          0       1         7.79
Appendicitis                           2.00    2.67       1.10    2.30          0          0       1         3.53
Electrocution                          2.90    2.69       1.21    1.57          5         42.2     1        15.81
Motor-train collision                  3.03    2.85       1.23    1.28          0          0       2.12     14.87
Asthma                                 1.62    3.13       1.18    2.41          1          1.9     1         2.07
Firearm accident                       3.89    3.87       1.44    1.67          8         28.2     1.02     10.34
Poisoning solid/liquid                 3.02    3.05       1.10    1.61          3         17.9     1.03     10.81
Tuberculosis                           2.71    3.13       1.10    1.61          0          0       1.08      7.68
Fire and flames                        4.07    4.15       1.20    1.71         94        320.7     1.73     10.58
Drowning                               3.82    3.23       1.69    1.68         47        247       1.07     17.65
Leukemia                               3.56    3.38       1.36    1.23          1         14.8     1        15.00
Accidental falls                       3.18    3.54       1.31    2.43         15        124.8     1.03      4.79
Homicide                               4.69    4.33       1.39    1.23        278       5042.9     1.06     18.32
Emphysema                              3.02    3.36       1.31    1.75          1          1.1     1        11.03
Suicide                                4.00    3.66       1.74    1.71         29        356.7     1        17.23
Breast cancer                          3.03    4.33       1.38    2.00          0          0       1         9.39
Diabetes                               2.37    3.49       1.31    2.39          0          0       1         6.45
Motor vehicle accident                 4.69    4.71       2.03    2.61        298       1440.5     1.64      8.97
Lung cancer                            4.15    4.21       1.82    1.66          3         35.9     1        14.26
Stomach cancer                         2.89    3.08       1.59    1.59          0          0       1        11.87
All accidents                          4.44    4.64       2.05    2.43        715       2861.4     1.70      6.97
Stroke                                 3.87    3.98       1.95    2.18         12        130.7     1        11.76
All cancer                             4.54    4.59       2.38    2.34         25        188.5     1        13.16
Heart disease                          4.28    4.34       2.15    2.10         49        303.4     1        13.00
All disease                            4.48    4.49       2.25    2.44        111        727.1     1.19      8.00
Range of scale                         1-5     1-5        1-3     1-3         1-∞        1-∞       1-∞      0-20
M                                      3.04    3.35       1.35    1.80         42.4      295.5     1.31      8.77

Experiment 4: Experience and Bias

Experiments 1 and 3 demonstrated that the frequencies of some lethal events are consistently misjudged. In hopes of learning more about the nature of these errors and biases, Experiment 4 examined people's direct and indirect experiences with these events and some of the events' special characteristics. Eight different characteristics were assessed for each lethal event and then used to predict the errors found in Experiments 1 and 3. Four of the measures assessed how much experience subjects feel they have had with the different causes of death. Two measures reflected the frequency with which causes of death appear in newspaper articles. The final two measures reflected the degree to which the various causes of death were judged to be catastrophic (inflicting simultaneous multiple casualties) and lethal (inevitably producing death for people suffering from the condition).

Method

Experience ratings. A new group of 61 subjects recruited through the University of Oregon campus newspaper was asked to rate each of the 41 causes of death according to their personal experiences with the event as a cause of death and suffering.

Two ratings of indirect experience were obtained by asking subjects to indicate how often they had heard about the event via the media (newspapers, magazines, radio, television, etc.) as (a) a cause of death and (b) a cause of suffering (but not death).
Ratings were made on a 5-point scale whose extreme categories were never (coded as 1) and often (coded as 5).

Subjects' direct experience with the 41 events as causes of death was elicited by having them check one of the following three statements for each event: At least one close friend or relative has died from this (Code 3); someone I know (other than a close friend or relative) has died from this (Code 2); no one I know has died from this (Code 1). Direct experience with these events as causes of suffering was elicited with similar questions, with the word died replaced by the phrase suffered (but not died).

Thus, each subject provided four ratings for each of the 41 events. These were ratings of (a) indirect death (coded 1 to 5), (b) indirect suffering (coded 1 to 5), (c) direct death (coded 1 to 3), and (d) direct suffering (coded 1 to 3).

Newspaper coverage. The news media provide two kinds of information about causes of death. One, as noted earlier, is reports of statistical analyses (Figure 1). The other, far more prevalent, is the day-to-day reporting of fatalities as they happen. The latter is likely to be biased toward violent and catastrophic events (see, for example, Arlen's [1975] survey of television's treatment of death). Because of the potential importance of media exposure, we supplemented people's ratings of their indirect (media) experiences with a survey of newspaper reports. The local daily newspaper (the Eugene Register-Guard) was examined on all days of alternate months for a year, starting with January 1, 1975 (for a total of 184 days). Two tallies were made for each cause of death: the total number of deaths reported and the square inches of reporting devoted to the deaths (excluding photographs).

Catastrophe ratings. Economist Theodore Bergstrom (Note 2) has asked whether catastrophic events with multiple victims in close geographic and temporal proximity will be judged as more likely than events that take as many lives but in a less spectacular, one-at-a-time fashion. He hypothesized that catastrophes are more spectacular and thus more memorable, a speculation in keeping with availability considerations. On the other hand, the more frequent instances of noncatastrophic events may lead them to be judged more accurately, whereas casualties from catastrophic events may be underestimated because of their massed presentation (Hintzman, 1976). To assess catastrophic potential, 13 employees of the Oregon Research Institute were asked to estimate the average number of people who die from a single fatal episode of each of the 41 causes of death.

Table 7
Direct Estimates Correlation Matrix

Variable                     1      2      3      4      5
 1. MVA LGM                  -
 2. E LGM                   .98     -
 3. MVA group residuals     .40    .35     -
 4. E group residuals       .36    .38    .91     -
 5. Log true frequency      .89    .91    .00    .00     -
 6. Indirect death          .85    .86    .45    .48    .74
 7. Indirect suffering      .86    .86    .46    .44    .76
 8. Direct death            .90    .88    .19    .19    .82
 9. Direct suffering        .52    .50    .22    .16    .46
10. News frequency          .56    .54    .59    .56    .33
11. News inches             .45    .41    .45    .36    .29
12. Catastrophe            -.03    .02    .29    .40   -.12
13. Conditional death       .47    .51    .04    .08    .54

Columns 6 to 12 (only these entries are legible; row positions uncertain): .65 .37 .47 -.28 .10 .30 -.07

Note. MVA = motor vehicle accident; E = electrocution; LGM = log geometric mean.

Conditional death ratings. In Experiments 1 and 3 subjects appeared to underestimate (relative to the regression line) the frequencies of deaths due to events that are common in nonfatal form, such as smallpox vaccination and asthma. One possible explanation of this error is that subjects confused P(event x | death) with P(death | event x) and failed to appreciate the importance of base rates (Tversky & Kahneman, 1974; Bar-Hillel, Note 3).
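The relation between the two conditional probabilities can be written out with Bayes' rule, which makes the role of the base rate explicit. This is a sketch in standard probability notation rather than a formula taken from the article; x and y stand for any two events in the list.

```latex
P(x \mid \text{death}) = \frac{P(\text{death} \mid x)\, P(x)}{P(\text{death})}
\qquad\Longrightarrow\qquad
\frac{P(x \mid \text{death})}{P(y \mid \text{death})}
  = \frac{P(\text{death} \mid x)}{P(\text{death} \mid y)} \cdot \frac{P(x)}{P(y)}
```

A judge who compares only the first factor on the right, the relative lethality of the two conditions, leaves out the base-rate factor P(x)/P(y). When that factor is far from 1, the comparison can point in the wrong direction.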
Consider the question of whether a randomly selected death is more likely to be due to smallpox or smallpox vaccination. This question calls for comparing P(smallpox | death) with P(smallpox vaccination | death), the latter being statistically greater. However, subjects may be relying on P(death | smallpox) and P(death | smallpox vaccination) to answer such questions. If the base rates for the two events are discrepant (there are many more smallpox vaccinations than cases of smallpox), the resulting judgments will be in error. To explore the role of this characteristic, 31 college students were asked to rate the probability of death given that one suffered from or experienced each condition. The ratings were made on a scale from 0 (surely won't die) to 20 (surely will die).

Results

Mean values. Mean values for the six subjective scales and the two newspaper measures are shown in Table 6. As one would expect, subjects reported greater experience with these events as causes of suffering than as causes of death. The most frequently experienced event was motor vehicle accidents, while the lowest ratings were given to poisoning by vitamins.

During 184 days of newspaper reporting, 19 of the listed causes of death were never mentioned. Some of these 19 causes are quite frequent: cancer of the digestive system, diabetes, breast cancer, and tuberculosis. In contrast, the eighth most frequently reported cause of death in the newspapers, tornadoes, is in fact relatively rare. The reported tornado deaths may represent all deaths from this cause in the United States during the dates covered. Note also that homicide, which is 23% less frequent than suicide, was reported 9.6 times as often, with 15 times as much space devoted to it.⁵

Few of the listed causes of death were classed as catastrophic in terms of the judged number of people dying on a single occasion. Flood, tornado, and motor vehicle/train collisions led the catastrophe ratings.
The conditional death ratings seem reasonable. The lowest rating was given to smallpox vaccination, while the highest was to homicide, followed by drowning. Some chronic diseases (asthma, diabetes, syphilis, and tuberculosis) were rated below the overall mean of 8.77, but emphysema (11.03) and heart disease (13.00) were both rated well above the mean.

Correlations: direct estimates. Correlational analyses were performed to determine whether the eight measures predict the judgments and biases found in Experiment 3. Two aspects of the direct-estimate data were predicted from the eight characteristics: (a) the log geometric mean response to the 40 lethal events (excluding smallpox) and (b) the index of secondary bias used in Experiment 3 (the signed difference between the log geometric mean of the judged frequencies and the log geometric mean predicted by the quadratic regression curves shown in Figures 10 and 11).

Table 7 shows the intercorrelation matrix for the four response variables (log geometric mean frequencies and residuals for Group MVA and for Group E), the true frequency, and the eight predictor variables. The lower left rectangle of correlations indicates the predictive power of the eight independent variables. Three of the four experience ratings showed strong correlations with the four response variables. Note that these ratings correlated more highly with the subjects' responses than with the true frequencies. The ratings of direct suffering showed only moderate correlations with subjects' responses.

News frequency and news inches were also modestly good predictors of the response variables. They were poorly correlated with true frequency, demonstrating the biased view of reality that newspapers present.⁶

⁵ This result may be even more extreme than it appears, since there is good reason to suppose that the official records we used to establish "true" rates underestimate the frequency of suicide.
The catastrophe ratings showed quite low correlations with all other variables. This may be due, in part, to the lack of variance in these ratings; over half were equal to 1.0, and only 10 of 41 were greater than 1.08. Finally, conditional death ratings were moderately correlated with the geometric mean responses, but not with the residuals.

The correlations among the eight predictor measures are also shown in Table 7. Indirect death, indirect suffering, and direct death ratings showed fairly high intercorrelations but lower correlations with direct suffering. The two newspaper measures were highly intercorrelated. However, these newspaper measures correlated only moderately with the indirect death ratings, even though the instructions for the latter task emphasized newspaper coverage.

The direct estimates made by the subjects in Experiment 3 may have been biased because they were influenced by past experience with indirect sources of information (such as newspapers), which themselves were biased. We suspected that ratings of direct experience might be less biased and, therefore, might provide more accurate estimates of the true frequencies than did the direct estimates of frequency. This hypothesis was tested and was not supported. Although the direct death ratings did correlate more highly with the true frequency (r = .82) than did any of the other predictor measures, the direct estimates of Experiment 3 did even better (r = .89 and .91).

Correlations: paired comparisons. Similar correlational analyses were performed relating the eight measures with the paired-comparison judgments of Experiment 1. To do this, a difference score was formed on each measure for each of the 101 pairs (excluding smallpox) by subtracting the score associated with the less likely cause of death from the score associated with the more likely cause of death.
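The difference-score construction can be sketched as follows. This is a minimal illustration, not the authors' computation: the indirect-death ratings come from Table 6 and the statistical frequencies from Table 5, but the three pairs are hypothetical choices made for the example and are not claimed to be among the 101 pairs actually used.

```python
# Indirect-death ratings (Table 6) and statistical frequencies (Table 5).
indirect_death = {
    "Tornado": 3.46, "Asthma": 1.62, "Homicide": 4.69,
    "Diabetes": 2.37, "Flood": 3.66, "Stroke": 3.87,
}
true_freq = {
    "Tornado": 90, "Asthma": 1_886, "Homicide": 18_860,
    "Diabetes": 38_950, "Flood": 205, "Stroke": 209_100,
}

# Hypothetical pairs, chosen only to illustrate the calculation.
pairs = [("Tornado", "Asthma"), ("Homicide", "Diabetes"), ("Flood", "Stroke")]


def difference_score(pair, rating, freq):
    """Rating of the more likely cause minus rating of the less likely cause."""
    a, b = pair
    more, less = (a, b) if freq[a] >= freq[b] else (b, a)
    return rating[more] - rating[less]


scores = {
    pair: round(difference_score(pair, indirect_death, true_freq), 2)
    for pair in pairs
}
for pair, score in scores.items():
    print(pair, score)
```

A negative score means that the statistically less likely cause of the pair received the higher media-exposure rating, which is the pattern one would expect to accompany a misjudged pair.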
These difference scores were then correlated with four dependent variables (the log geometric mean responses and the index of secondary bias used in Experiment 1, for students and for league members), with the log true ratio, and with each other. The resulting correlation matrix is not shown here, because it was quite similar to Table 7.

As with the direct-estimate data, the ratio of the direct death ratings correlated with true ratio more highly (r = .62) than did any of the other predictor measures. However, it could not successfully be substituted for the judged ratios of Experiment 1 in an attempt to improve accuracy, since the judged ratios were even more highly correlated with true ratio (r = .69 for students and .75 for league members).

Regression analyses predicting responses and biases. To bring greater clarity to this mass of correlations, eight stepwise regressions were performed. Four of these analyses predicted the log geometric mean responses of the four separate groups of subjects: students' paired comparisons, league members' paired comparisons, Group E's direct estimates, and Group MVA's direct estimates. The other four stepwise regression analyses predicted secondary bias (the residuals from the correlations of each of these four groups with the statistical frequencies).

The predictor variables for each of the stepwise regressions were the eight measures previously described, using differences between 101 pairs to predict the paired-comparison data or 40 mean ratings to predict the direct estimates and their residuals. Because of the instability of stepwise regression solutions with highly intercorrelated predictors, our primary criterion for variable selection was replicability. Only variables that entered the equations for both league and student subjects in Experiment 1 or both Group E and Group MVA in Experiment 3 are discussed.

⁶ Similar evidence of bias in another newspaper may be found in Combs and Slovic (Note 4).

Table 8
Variables Emerging from Stepwise Multiple Regressions in Both Replications

Dependent variables
          Log geometric mean                           Residuals
Paired comparisons   Direct estimates     Paired comparisons    Direct estimates
Indirect suffering   Indirect suffering   Indirect death        News frequency
Direct death         Direct death         Direct death          Catastrophe
                     News frequency       Conditional death(a)
(a) Negative weight.

Table 8 lists the variables that emerged from both groups of subjects. The inclusion criterion was an F to enter⁷ of 3.0 or greater. The log geometric means were highly predictable, with Rs ranging from .88 to .96 using just three of the eight predictors. The residuals were also predictable, with Rs ranging from .64 to .80 using the variables selected by the stepwise regression.

Two variables, indirect suffering and direct death, did most of the job of predicting the subjects' log geometric mean responses for both paired comparisons and direct estimates. The regressions on the residuals show a more mixed pattern. For the residuals from the paired-comparison data, three predictors were common to both the student and league data: indirect death, direct death, and conditional death. Conditional death had a negative weight because of its low correlation with the dependent variable and its high correlation with indirect death. For the prediction of residuals from the direct estimates, news frequency and catastrophe ratings were the only predictors that were significant in both groups. In view of the highly skewed distributions of these two measures, it is somewhat surprising to see them emerge as valid predictors. However, news frequency correlated with direct-estimate residuals higher than did any other single predictor.
Of the seven causes of death with catastrophe ratings of 1.5 or greater, six (all accidents, motor vehicle accidents, flood, botulism, tornado, and fire and flames) were among the 10 causes of death with the highest residuals (i.e., the 10 most overestimated causes of death, relative to the regression line).

The above analyses indicate that measures tapping the availability of information about causes of death do a good job of predicting subjects' judgments of the frequencies and relative frequencies of these causes of death. Further, we have shown that the consistent errors people make (the secondary bias) can be predicted from salient features of the events such as their catastrophic nature and from ratings of experience with the lethal events made by a different group of subjects.

⁷ An "F to enter" tests the significance of the increase in the proportion of explained variance achieved by including an additional variable in the regression equation.

Experiment 5: Debiasing

Experiments 1 and 3 showed that subjects make severe and consistent errors in judging the frequency or relative frequency of lethal events. Experiment 5 was designed to see if subjects could correct these errors when they were told the hypothesized causes of the errors. Emphasis was placed on the secondary bias and its possible causes: uneven newspaper coverage and the effects of imaginability and memorability.

Study 5A

Method

In Study 5A, subjects made paired comparisons for 31 of the 106 pairs of Experiment 1. Twenty-one of these pairs were severely misjudged in Experiment 1 (either the percentage correct was less than 60% or the geometric mean was off by a factor of 9 or more). The geometric means of the remaining 10 were estimated moderately well (within a factor of 1.5).
The present study was conducted with a college student population similar to that in Experiment 1 and with the same instructions except that one group, the debiasing group (n = 30), was given the following special information:

Note: In a previous study of this kind we found that, for some pairs, the relative likelihoods were greatly misperceived. Sometimes the ratio of the more likely to the less likely item was judged to be much greater than it really was. In other cases the ratio was judged much too small or even in the wrong direction; that is, the less likely item was judged to be more likely.

We believe that when people estimate these likelihoods, they do so on the basis of a) how easy it is to imagine someone dying from such a cause, b) how many instances of such an event they can remember happening to someone they know, c) publicity about such events in the news media, or d) special features of the event that make it stand out in one's mind.

Reliance on imaginability, memorability, and media publicity, although often useful, can lead to large errors in judgment. When events are disproportionately imaginable or memorable, they are likely to be overestimated. When they are rather unmemorable or unpublicized or otherwise undistinguished, they are likely to be underestimated. Events such as ulcers that are common, but usually non-fatal, may also be underestimated because people tend to imagine or remember them in their non-fatal form.

Try not to let your own judgments be biased by factors such as imaginability, memorability, or media publicity.

A control group (n = 22) also judged the 31 pairs without receiving any special instructions.

Results

Examination of percentage correct revealed no evidence for debiasing. The original subjects (Experiment 1) were best on 9 pairs, the control subjects were best on 12 pairs, and the debiasing group subjects were best on 10 pairs.
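The two selection criteria quoted in the Method of Study 5A (percentage correct below 60%, or a geometric mean off by a factor of 9 or more, versus a geometric mean within a factor of 1.5) can be written out as a small sketch. The function names and the sample responses are hypothetical, and the sketch takes each judged ratio at face value; how wrong-direction answers entered the geometric mean in the study is not assumed here.

```python
from statistics import geometric_mean


def factor_off(judged_ratio, true_ratio):
    """Multiplicative error of a judged ratio, always >= 1."""
    r = judged_ratio / true_ratio
    return max(r, 1 / r)


def classify_pair(pct_correct, judged_ratios, true_ratio):
    """Label a pair using the two thresholds quoted in the Method text."""
    error = factor_off(geometric_mean(judged_ratios), true_ratio)
    if pct_correct < 60 or error >= 9:
        return "severely misjudged"
    if error <= 1.5:
        return "moderately well estimated"
    return "in between"


# Hypothetical responses for one pair whose true ratio is 100.
label = classify_pair(pct_correct=85, judged_ratios=[2, 5, 10, 20], true_ratio=100)
```

Measuring error as a factor treats a judgment that is ten times too small the same as one that is ten times too large, which matches the ratio scale on which the responses were collected.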
A further search for improvement in the data of Study 5A can be made by comparing the ratio judgments of these two new groups of subjects either with the true ratios (under the assumption that the instructions exhorted the subjects to come closer to the truth) or with the ratios predicted from the regression analysis of the original subjects (under the assumption that the instructions emphasized the nature of the secondary bias, not the primary bias). No evidence for effective debiasing can be seen under either comparison. For geometric means, when the comparison is made to the true ratio, the original group was best on 12 pairs, the controls on 6 pairs, and the debiasing group on 13 pairs. When compared with the predicted ratios, the original group was best on 12 pairs, the control group on 7, and the debiasing group on 12. Looking only at the 21 pairs that were originally judged poorly, there is still no evidence of improvement in the debiasing group. Even those pairs on which the debiasing group did best showed only modest improvement. For example, death by diabetes is 95 times more likely than death by syphilis. The debiasing group was "superior" in giving a geometric mean response of 9.7 rather than the original group's geometric mean of 2.4. Death by stroke is 102,000 times more likely than death by botulism. The value predicted by the regression analysis of the original subjects was 1,002. Those original subjects showed a strong secondary bias; their geometric mean response was 106. The debiasing group gave a mean response of 135.

Study 5B

Method

A second debiasing study was undertaken to provide subjects even more opportunity for using knowledge of the secondary biases to improve their performance. The subjects, drawn from the same student population, were shown 19 pairs of events. The instructions indicated that each of these pairs had been seriously misjudged in an earlier experiment (which was the case).
For each pair, the subjects were given the response from Experiment 1 and were asked to improve it, that is, to give a new response that they thought would be closer to the true ratio. The instructions for a debiasing group of 29 subjects included a discussion of the presumed sources of error, illustrated with several examples showing the possible effects of personal experience, media publicity, imaginability, and the like on previous subjects' judgments. A control group of 27 subjects did not receive this additional discussion.

The instructions read as follows. Brackets indicate material shown only to the debiasing group.

We recently studied the ability of University of Oregon students to judge the likelihood of various causes of death in the United States. For example, subjects were given a pair of events such as: A. Measles, B. Tornado. They were asked: Which causes more deaths annually in the U.S., A or B? They were also asked to estimate how many times more likely the more frequent cause of death was compared to the less frequent of the two.

We found that, for some pairs, the relative likelihoods were greatly misjudged. Sometimes the ratio of the more likely to the less likely item was judged much too small or even in the wrong direction; that is, the less likely item was judged to be more likely.

[We believe that when people estimate these frequencies, they do so on the basis of a) how easy it is to imagine someone dying from such a cause, b) how many instances of such an event they can remember happening to someone they know, c) publicity about such events in the news media, or d) special features of the event that make it stand out in one's mind.]

[When events are disproportionately imaginable or memorable, they are likely to be overestimated. When they are rather unmemorable or unpublicized or otherwise undistinguished, they are likely to be underestimated.
Events such as accidental falls, that are common but usually non-fatal, may also be underestimated because people tend to imagine or remember them in their non-fatal form.]

On the following pages there are 19 pairings of death-producing events. The relative likelihood of the more common to the less common event was greatly misperceived in each of these pairs. [We want to see whether you can reduce the magnitude of the errors for these pairs. To do this think about how factors such as media coverage or ease of imagining or remembering the event as a cause of death are likely to work to bias the judgments for each of the pairs.]

Here are some examples to illustrate the task:

A. Hepatitis
B. Drowning
Previous Answer: B 4.55
Your Answer:

The average subject chose B as more likely and judged it to be 4.55 times more likely than A. Which would you choose and what ratio would you give?

Actually, the correct answer is B and the true ratio is 10.9 to 1. We see that the average subject overestimated Hepatitis relative to Drowning. [Maybe this is because of the special attention given by the media to Hepatitis, especially in relation to abuse of hypodermic needles.]

Try this one:

A. Leukemia
B. Accidental Falls
Previous Answer: A 1.30
Your Answer:

The average subject thought death from leukemia was 30% more common (ratio 1.30 to 1) than death from falls. However, death from falls is really 20% more frequent. So the correct answer is B with a ratio of 1.20. [The error may stem from the dramatic nature of leukemia and the greater amount of media publicity it receives, or it may stem from the fact that accidental falls are common but usually non-fatal.]

For a final example, consider:

A. Poisoning by solid or liquid
B. Tuberculosis
Previous Answer: A 5.26
Your Answer:

The average subject thought death by poisoning was 5.26 times more likely than death from tuberculosis.
However, death from tuberculosis is really 44% more frequent than death from poisoning so the correct answer is B with a ratio of 1.44. [Again, it is easy to see how media publicity regarding poisoning and the dramatic nature of the event could cause subjects to overestimate it compared to the drab, undramatic, perhaps old-fashioned disease, tuberculosis.]

Note that a ratio of 1.20 means 20% more likely, 1.50 means 50% more likely, 1.80 means 80% more likely, etc.

For each pair, write the letter of the item you think is a more likely cause of death and give your judgment about how many times more frequent the more frequent item is.

Results

The special instructions given to the debiasing group had no effect on performance. Neither the debiasing group nor the control group was able to improve consistently upon the mean responses given by subjects in Experiment 1. For each pair, we calculated the percentage of subjects in the debiasing group and in the control group whose responses were closer to the true ratio than was the geometric mean of the original Experiment 1 group. We also calculated the percentage of subjects in both groups whose responses were closer to the ratio predicted from the Experiment 1 regression line (i.e., who had smaller secondary bias). In every case the percentage closer to the true ratio was equal to the percentage closer to the regression line. The average percentage of improved answers was only 53.8% for the experimental group (range of 21%-82%) and 52.4% for the control group (range of 37%-70%). The experimental group showed a better improvement percentage than the control group on 10 pairs, the control group was better for 8 pairs, and there was a tie on 1 pair.

Discussion

Psychological Significance

As in laboratory studies, our subjects exhibited some competence in judging frequency.
Frequency estimates for causes of death, words, and occupations generally increased with increases in true frequency; similarly, the discriminability of causes increased with the ratio of their statistical frequencies. Furthermore, our subjects' assessments of the frequencies of causes of death, both direct estimates and paired comparisons, correlated more highly with the true answers than did any other measures, such as newspaper reportage and ratings of direct experience with the causes of death.

Despite the sensitivity of judgments to true frequency, the overall accuracy of both paired comparisons and direct estimates of frequency was quite poor. Unless the true frequencies of a pair of lethal events differed by more than a factor of two, there was no guarantee that subjects could correctly indicate which was more frequent. Large errors were present in the judged ratios for many pairs of events. The high correlations between direct estimates and true frequency across almost a million-to-one range of the latter variable are deceptive. Large errors were present in these estimates, much as with the paired-comparison judgments.

Primary bias. Experiments 1 and 3 demonstrated a strong primary bias, consisting of overestimation of low frequencies and underestimation of both high frequencies and large ratios, much as has been found before by Attneave (1953), Teigen (1973), and others (Poulton, 1973).

We considered and rejected two possible reasons for this primary bias. One is that subjects avoid using extremely high (or low) numbers in making their responses. The absence of such biases with the words and occupations tasks of Experiment 2 makes this hypothesis implausible. Second, the underestimation of high ratios in Experiment 1 was not simply an artifact of averaging correct and incorrect answers. This is shown by the persistence of the effect for pairs in which nearly everyone got the correct answer.
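One way to see the primary bias numerically is to fit a straight line to log judged frequency against log true frequency: a slope below 1 means small frequencies are overestimated and large ones underestimated, with a crossover where the line meets the identity. The sketch below is a simplification under stated assumptions: the direct estimates in this article were fitted with quadratic curves, not a straight line, the function name `primary_bias_fit` is hypothetical, and the data are invented.

```python
import numpy as np


def primary_bias_fit(true_freq, judged_gm):
    """Fit log10(judged) = intercept + slope * log10(true).

    Returns (slope, intercept, crossover), where crossover is the true
    frequency at which the fitted line predicts judged == true.
    """
    x = np.log10(np.asarray(true_freq, dtype=float))
    y = np.log10(np.asarray(judged_gm, dtype=float))
    slope, intercept = np.polyfit(x, y, deg=1)
    crossover = 10 ** (intercept / (1 - slope))  # undefined when slope == 1
    return slope, intercept, crossover


# Hypothetical judgments that follow log10(judged) = 1.5 + 0.5 * log10(true).
true_freq = np.array([1.0, 100.0, 10_000.0, 1_000_000.0])
judged_gm = 10 ** (1.5 + 0.5 * np.log10(true_freq))
slope, intercept, crossover = primary_bias_fit(true_freq, judged_gm)
```

With these invented numbers the rarest event is judged far too frequent and the most common far too rare, which is the flattened pattern the text calls the primary bias.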
Another possible explanation of the primary bias is that it results from anchoring: Subjects first choose some representative value and then adjust upward or downward according to whatever considerations seem relevant to the case at hand. Studies of anchoring and adjustment procedures have shown that such adjustments tend to be insufficient (Lichtenstein & Slovic, 1971; Tversky & Kahneman, 1974). A number of laboratory studies of frequency estimation can be interpreted as showing a tendency to anchor on the average frequency in the lists learned (see Rowe & Rose, 1977). Insufficient adjustment would produce too flat a curve, a finding often noted in laboratory studies (see Hintzman, 1976). Perhaps the clearest evidence of anchoring may be found in Experiment 3, in which the one true frequency given to the subjects could easily have served as an anchor value. Group MVA, who were given a high anchor (50,000), generally assigned higher values to the items than did Group E, whose anchor value was 1,000.

In the paired-comparison tasks no such clear-cut anchor was provided. Nonetheless, Poulton (1968) has shown that in magnitude-estimation studies, the subjective magnitude of the first stimulus presented serves as an anchor for subsequent judgments. This view is supported by Carroll's (1971) finding of a .66 correlation between the log of individual subjects' first estimates and the mean log of all their responses in estimating word frequency. The present paired-comparison data are consistent with the notion that the response to the first stimulus serves as an anchor. The causes-of-death groups received, as their first stimulus, a pair they judged as having a relatively low ratio (pair 40; geometric mean response = 4.3 for students and 18.0 for league members), while the words and occupations groups' first stimulus was judged with relatively high ratios (116 and 265, respectively).
Both causes-of-death groups showed more underestimation of high ratios than did the words and occupations groups, as Poulton would predict.

Yet another possible explanation of the primary bias derives from the availability heuristic (Tversky & Kahneman, 1973), which states that assessments of frequency or probability are based on the number of instances of the event that come to mind. Cohen (1966) has found that when subjects manage to recall any of the words in a category, the mean number of words recalled per category is relatively independent of the number of words in that category. If this tendency is true also for categories learned outside the laboratory, such as causes of death, and if, as suggested by Tversky and Kahneman, people base their assessments on these all-too-equal recollections, a flattening of their responses, as observed, would result.

Secondary bias. Subjects' responses exhibited numerous strong and consistent secondary biases. Some portion of these errors may be due to the unrepresentative coverage of these causes of death in the news media. Others have also speculated about the effects of such media bias. For example, Zebroski (Note 5) blamed the media for people's concerns about nuclear reactor safety. He noted that "fear sells"; the media dwell on potential catastrophes and not on the successful day-to-day operations of power plants. Author Richard Bach (1973) made a similar observation about the fear shown by a young couple going for their first airplane ride:

In all that wind and engineblast and earth tilting and going small below us, I watched my Wisconsin lad and his girl, to see them change. Despite their laughter, they had been afraid of the airplane. Their only knowledge of flight came from newspaper headlines, a knowledge of collisions and crashes and fatalities. They had never read a single report of a little airplane taking off, flying through the air and landing again safely.
They could only believe that this must be possible, in spite of all the newspapers, and on that belief they staked their three dollars and their lives. (p. 37)

The present results suggest that the media have important effects on our judgments, not only because of what they don't report (successful plane trips or reactor operations), but because of what they do report to a disproportionate extent.

Subjects may also be misinformed because of bias in their direct exposure to the various causes of death. Young people, such as our student subjects, may be underexposed to death from diseases associated with age, such as stroke, stomach cancer, and diabetes, all of which were underestimated, and overexposed to death from motor vehicle accidents, all accidents, and pregnancy, all of which were overestimated relative to the regression line.

The two explanations of secondary bias given above assume that the bias occurs because the information received by the subject is inadequate or misleading. Another explanation can be found by examining hypotheses about the biases induced by people's cognitive storage and retrieval processes. Tversky and Kahneman's (1973) concept of availability, with its emphasis on vivid or sensational events, seems relevant. Examination of Figures 9 and 10 shows that among the most overestimated causes of death (relative to the regression line) were botulism, tornado, flood, homicide, motor vehicle accidents, all accidents, and cancer. These are all sensational events. Most of the causes of death that were underestimated relative to the regression line (asthma, tuberculosis, diabetes, stomach cancer, stroke, and heart disease) seem to be undramatic, quiet killers.

Some of the evidence of secondary bias is inconsistent with previous laboratory findings. One such finding is that more concrete and imaginable words are judged to be less likely than equally frequent abstract words (e.g., Ghatala & Levin, 1976).
While we had no direct measure of imaginability, one might assume that catastrophic events and those more heavily reported in the media tend to be more concrete and imaginable. However, all three of these surrogate measures of imaginability (catastrophe, news frequency, and news inches) were positively correlated with the residuals (for both paired comparisons and direct estimates). Thus, in this sense imaginable events tended to be judged more likely, as predicted by availability considerations.

Another difference between the present research and previous studies is found with catastrophic causes of death whose occurrences tend to be massed rather than distributed over time. Laboratory studies (e.g., Rowe & Rose, 1977) have consistently found that massing the occurrences of a word in a learned list tends to decrease its estimated frequency. Two explanations offered for this effect (Hintzman, 1976) are (a) encoding variability (spaced repetitions are more likely to receive differential coding than massed items) and (b) deficient processing of massed items. In the current experiments, catastrophic (massed) events tended to be overestimated relative to the regression line. One key difference between the usual laboratory experiments and the present study is that the former do not use stimuli that become sensational or emotionally charged when massed. Such special characteristics may lead to extra processing, rather than to deficient processing, for catastrophic causes of death.

When we have been able to compare the present results with previous laboratory work, we have found about as many mismatches as matches. The present study is based on material our subjects have learned in the real world; in most other laboratory work, the subjects were tested on material they had learned in the laboratory.
Mandler (Note 6) has speculated on this difference:

In terms of presentation of to-be-remembered material, the laboratory experiment fails, in comparison with the real world, with respect to three major problems: Frequency, salience, and context. The laboratory experiments fail with respect to frequency because the typical event that an individual must recall or recognize in everyday life has been encountered anywhere from a few to thousands of times; in the laboratory we look at the few and rarely look at the thousands. Salience must be of interest because encoding operations in the real world typically take place with particular attention to the relevance or salience of a particular event to other aspects of the mental apparatus; we encode what is important, while in the laboratory we are required to encode what is unimportant. Furthermore, the context of real world memory involves not simply a restricted number of materials presented in the laboratory, together with a computer or a memory drum, but rather the larger context of the individual's current plans and intentions, geographic location, and social conditions. (pp. 3-4)

Improving Judgments

One question raised by this study is how to improve intuitive judgments of frequency. We did not attempt here to correct the primary (overestimation/underestimation) bias. Work by Teigen (1973) suggests that this can be done by asking people to allocate frequencies as percentages of the total rather than having them estimate absolute numbers. This technique, however, might not prove helpful when (as with causes of death) the largest frequency is over a million times larger than the smallest frequency. It would be exceedingly difficult for subjects to express ratios even as high as 3,000:1 (as they did in the present study) using a percentage response mode. Statistical correction, using regression equations, might be the best way to correct the primary bias.
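The statistical correction suggested above can be illustrated by inverting a fitted regression line. The sketch assumes a straight-line relation in log-log coordinates, log10(judged) = intercept + slope * log10(true), which is a simplification chosen because it is easy to invert; the coefficients and the function name `corrected_estimate` are hypothetical and are not values reported in this article.

```python
import math


def corrected_estimate(judged, slope, intercept):
    """Invert log10(judged) = intercept + slope * log10(true) to recover true."""
    if slope <= 0:
        raise ValueError("slope must be positive for the line to be invertible")
    return 10 ** ((math.log10(judged) - intercept) / slope)


# Hypothetical calibration line, not coefficients reported in the article.
SLOPE, INTERCEPT = 0.5, 1.5

low = corrected_estimate(100, SLOPE, INTERCEPT)      # a judged count of 100
high = corrected_estimate(10_000, SLOPE, INTERCEPT)  # a judged count of 10,000
```

Under this invented line, a judged count of 100 is pulled down and a judged count of 10,000 is pushed up, stretching the flattened scale back out. Such a correction addresses only the primary bias; cause-specific departures from the line (the secondary bias) are left untouched.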
Since the secondary bias observed here seems linked to availability, we hoped to reduce that bias by informing subjects about its probable source. This information was not useful. The failure of such frontal attacks to eliminate biases (see also Fischhoff, 1977) suggests some directed restructuring of judgment tasks may be necessary. For example, Selvidge (1972) proposed having people make probability and frequency judgments on a scale in which other familiar events serve as marker points. In composing such a scale, great care would have to be taken to use only events whose subjective ordering fits their true ordering. Beyth-Marom and Fischhoff (1977) have shown that requiring people to work hard to produce specific examples of classes of events before estimating the frequencies of the classes can partially reduce availability bias.

Another promising suggestion comes from Armstrong, Denniston, and Gordon (1975), who found that numerical estimates can be improved by having estimators decompose the original question into a series of subquestions about which they are more knowledgeable and whose answers lead logically to the estimate of interest. For example, an answer to the question "How many people were killed in motor vehicle accidents in the United States in 1970?" might be improved by having people answer the following related questions: (a) What is the population of the U.S.? (b) How many automobile trips does the average U.S. citizen take in a year? (c) What is the probability of a fatal injury on any particular trip? From the answers to these questions, one can calculate an answer to the original question.

Societal Implications

Economist Frank Knight once observed that "We are so built that what seems reasonable to us is likely to be confirmed by experience or we could not live in the world at all" (Knight, 1921, p. 227).
But the pres- ent study and a growing body of other re- search (e.g., Kunreuther et al, 1978; Slo- vic, Kunreuther, & White, 1974; Kates, Note 1) indicate that in the evaluation of risks and hazards, Knight's optimistic as- sessment of human capabilities is wrong. People do not have accurate knowledge of the risks they face. As our society puts more and more effort into the regulation and control of these risks (banning cycla- mates in food, lowering highway speed limits, paying for emergency coronary-care equipment, etc.), it becomes increasingly important that these biases be recognized and, if possible, corrected. Improved pub- lic education is needed before we can expect the citizenry to make reasonable public- policy decisions about societal risks (Slovic, Fischhoff, & Lichtenstein, 1976; Slovic et al., 1974). And the experts who guide and influence these policies should be aware that when they rely on their own experience, memory, and common sense, they, too, may be susceptible to bias. We have, by necessity, studied sources of judgmental error in situations for which good estimates of true frequency exist. But our so- ciety must often make judgments about haz- ardous activities for which adequate statistical data is lacking, such as recombinant DNA re- search or nuclear waste disposal. We suspect that the biases found here (overestimation of rare events, underestimation of likely events, and an undue influence of drama or vivid- ness) may be operating, indeed, may even be amplified, in such situations. Reference Notes 1. Kates, R. W. Hasard and choice perceptions in flood plain management (Research Paper No. 78). Chicago: University of Chicago, Depart- ment of Geography, 1962. 2. Bergstrorn, T. C. Preference and choice in matters of life and death. In J. Hirshleifer, T. Bergstrorn, & E. Rappoport, Applying cost- benefit concepts to projects which alter human mortality. (Rep. UCLA-ENG 7478). 
Los An- geles : University of California, School of En- gineering and Applied Science, 1974. 3. Bar-Hillel, M. The base-rate fallacy in proba- bility judgments (Rep. 77-4). Eugene, Oreg.: Decision Research, 1977. 4. Combs, B., & Slovic, P. Causes of death: Biased newspaper coverage and biased judgments. Un- published manuscript, 1978. (Available from Decision Research, 1201 Oak St., Eugene, Ore- gon 97401.) 5. Zebroski, E. L. Attainment of balance in risk- benefit perceptions. In D. Okrent (Ed.), Risk- benefit methodolgy and application: Some pa- pers presented at the Engineering Foundation Workshop, Asilomar, Calif. (Rep. UCLA- ENG 7598). Los Angeles: University of Cali- fornia, School of Engineering and Applied Sci- ence, 1974. 6. Mandler, G. Memory research reconsidered: A critical review of traditional methods and dis- tinctions (Rep. No. 64). San Diego: University 578 LIECHTENSTEIN ET AL. of California, Center for Human Information Processing, 1976. References Arlen, M. J. The cold, bright charms of immor- tality. The New Yorker, January 27, 197S, pp. 73-78. Armstrong, J. S., Denniston, W. B., & Gordon, M. M. The use of the decomposition principle in making judgments. Organisational Behavior and Human Performance, 1975, 14, 257-263. Attneave, F. Psychological probability as a func- tion of experienced frequency. Journal of Ex- perimental Psychology, 1953, 46, 81-86. Bach, R. Nothing by chance. The American Way, 1973, 6, 32-38. Beyth-Marom, R., & Fischhoff, B. Direct mea- sures of availability and frequency judgments. Bulletin of the Psychonomic Society, 1977, 9, 236-238. Burton, I., Kates, R. W., & White, G. F. The environment as hazard. New York: Oxford University Press, 1978. Carroll, J. B. Measurement properties of subjec- tive magnitude estimates of word frequency. Journal of Verbal Learning and Verbal Be- havior, 1971, 10, 722-729. Cohen, B. H. Some-or-none characteristics of cod- ing behavior. 
Journal of Verbal Learning and Verbal Behavior, 1966, 5, 182-187.
Coombs, C. H., Dawes, R. M., & Tversky, A. Mathematical psychology: An elementary introduction. Englewood Cliffs, N.J.: Prentice-Hall, 1970.
Estes, W. K. The cognitive side of probability learning. Psychological Review, 1976, 83, 37-64.
Fischhoff, B. Perceived informativeness of facts. Journal of Experimental Psychology: Human Perception and Performance, 1977, 3, 349-358.
Ghatala, E. S., & Levin, J. R. Phenomenal background frequency and the concreteness/imagery effect in verbal discrimination learning. Memory & Cognition, 1976, 4, 302-306.
Hintzman, D. L. Apparent frequency as a function of frequency and the spacing of repetitions. Journal of Experimental Psychology, 1969, 80, 139-145.
Hintzman, D. L. Repetition and memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory. New York: Academic Press, 1976.
Hintzman, D. L. The psychology of learning and memory. San Francisco: Freeman, 1977.
Howell, W. C. Representation of frequency in memory. Psychological Bulletin, 1973, 80, 44-53.
Kates, R. W. Risk assessment of environmental hazard. Chichester, England: Wiley, 1978.
Knight, F. H. Risk, uncertainty, and profit. New York: Houghton-Mifflin, 1921.
Kucera, H., & Francis, W. N. Computational analysis of present-day American English. Providence, R.I.: Brown University Press, 1967.
Kunreuther, H., Ginsberg, R., Miller, L., Sagi, P., Slovic, P., Borkin, B., & Katz, N. Disaster insurance protection: Public policy lessons. New York: Wiley, 1978.
Lichtenstein, S., & Slovic, P. Reversals of preference between bids and choices in gambling decisions. Journal of Experimental Psychology, 1971, 89, 46-55.
Peterson, C. R., & Beach, L. R. Man as an intuitive statistician. Psychological Bulletin, 1967, 68, 29-46.
Postman, L. Short-term memory and incidental learning. In A. W. Melton (Ed.), Categories of human learning. New York: Academic Press, 1964.
Poulton, E. C. The new psychophysics: Six models for magnitude estimation. Psychological Bulletin, 1968, 69, 1-19.
Poulton, E. C. Unwanted range effects from using within-subject experimental designs. Psychological Bulletin, 1973, 80, 113-121.
Rowe, E. J., & Rose, R. J. Effects of orienting task, spacing of repetitions, and list context on judgments of frequency. Memory & Cognition, 1977, 5, 505-512.
Selvidge, J. Assigning probabilities to rare events. Unpublished doctoral dissertation, Harvard University, 1972.
Slovic, P., Fischhoff, B., & Lichtenstein, S. Cognitive processes and societal risk taking. In J. S. Carroll & J. W. Payne (Eds.), Cognition and social behavior. Hillsdale, N.J.: Erlbaum, 1976.
Slovic, P., Kunreuther, H., & White, G. Decision processes, rationality and adjustment to natural hazards. In G. F. White (Ed.), Natural hazards: Local, national and global. New York: Oxford University Press, 1974.
Teigen, K. H. Number and percentage estimates in sequential tasks. Perceptual and Motor Skills, 1973, 36, 1035-1038.
Thorndike, E. L., & Lorge, I. The teacher's word book of 30,000 words. New York: Columbia University, Teachers College, Bureau of Publications, 1944.
Tversky, A., & Kahneman, D. Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 1973, 5, 207-232.
Tversky, A., & Kahneman, D. Judgment under uncertainty: Heuristics and biases. Science, 1974, 185, 1124-1131.
Underwood, B. J. Attributes of memory. Psychological Review, 1969, 76, 559-573.
U.S. Bureau of the Census. Census of population: 1970. Occupation by industry. Final Report PC(2)-7C. Washington, D.C.: U.S. Government Printing Office, 1972.
White, G. F. (Ed.). Natural hazards: Local, national and global. New York: Oxford University Press, 1974.

Received January 17, 1978
Revision received June 23, 1978