
ingly, some members of the panel have indicated that a prohibitive statement in the law such as "shall be deemed unsafe if the additive is found to induce cancer when ingested by man or animal," is not desirable.

The need to rely on sound, informed, and competent scientific judgment has repeatedly been emphasized. It has been suggested that provision should be made for the use of an expert committee of scientists for review of controversial problems or interpretations. This might take the form of the scientific review committees provided for under the Pesticide Act, or it might be composed of members from the Departments of Health, Education, and Welfare and of Agriculture, and of outside scientists nominated by the National Academy of Sciences.

Several panelists have emphasized the need to make public the experimental information or data upon which judgments or decisions are based, and to make this public at the time the decisions are announced. This recommendation was made not only by members of this panel; it was similarly urged by members of the 1957 Panel on Food Additives provided by the Academy.

It is believed that publication of such data would be in the best public interest, and would greatly strengthen support for sound decisions. It was indicated yesterday that, in the opinion of some, full protection to the public would be provided without a specific cancer clause. Dr. Miller cited the Commissioner's statement on this matter in the 1957 hearings.

Cancer is but one of many considerations in assessing the safety of a proposed additive. Clearly a consideration of it must be included in deciding on the permissibility for use. An additive with a possibility of harmful effects of any sort at the proposed use level should and can be declared by the Commissioner as unsafe without the Delaney clause being in the legislation.

If, therefore, one simply identifies or recognizes cancer-producing effect as one of the properties upon which judgment must be based in determining safety, it would appear to me (this is my personal opinion) that adequate protection would be afforded by the law without the inclusion of the Delaney clause.

I think that a law providing this adequate protection, combined with ample provision for scientific review and judgment, plus publication of the basis of decisions, would serve as a sound, effective, workable framework. It would assure the maximum health benefit to the consumer, and provide us with the flexibility necessary for the adoption of new improvements as these are afforded by scientific advances.

As I say, I recognize that this is an interpretive summary. But it does seem to me that there are a number of points on which this panel is in considerable agreement. And I think it important that we recognize these points of agreement, and then turn our attention, if we may, to attempting to relate these, as I have tried to do, to the legislative question at hand.

Dr. Stewart?

Dr. STEWART. I think the record ought to show that this is Dr. Darby's personal statement. I don't think it really represents a summary of the views expressed by the panel yesterday. There are certainly many points that Dr. Darby has emphasized in his statement this morning that don't agree with my own personal opinions about this matter, and I think it would be a lot safer to have Dr. Darby's statement simply represent his own personal opinion, rather than a summary of the views of the panel.

Dr. DARBY. I have so stated that this is not a statement agreed to by the panel. It has not been reviewed by the panel. It is an attempt, first, to identify areas of agreement on scientific points, and then to interpret these in relation to the legislation. And I think that it is this that we should discuss. I would welcome discussion of these points. This is not intended as a final summarizing statement to conclude discussion. It is an attempt to point up discussion, Dr. Stewart.

Dr. STEWART. That is what the heading should be, then: an attempt to point up the discussion, not a summary of the views of the panel. With that I agree.

Dr. DARBY. Would you like to open the discussion?

Dr. STEWART. Yes. I might say a few things.

I think there are some "red herrings" in this. It was mentioned yesterday, for example, that sugar and salt in appropriate concentration and dose in some experiments have been shown to produce cancers. Now, I haven't gone over those particular papers. I don't know how many reliable cancer investigators in the country have studied this particular subject. I think when something like this comes up, it would be appropriate to have these substances studied by a number of competent investigators under a variety of experimental conditions. The other point to be made is that with respect to substances like salt or sugar, the amendment could specifically exempt them from the Delaney clause, or the Delaney clause might be modified with some sort of a statement to the effect that it applies to salt and sugar unless practical and scientific evidence has definitely shown that these chemicals or foodstuffs are not carcinogenic to man. Everybody knows we eat salt; everybody knows we eat sugar. And these experiments that have been done to show that these are carcinogenic really need careful study before they are used as an argument against a law that is designed to protect human beings from the addition to food of synthetic additives which are carcinogenic to animals under appropriate tests.

Now, Dr. Zavon mentioned yesterday that of 100 substances tested, I believe you said that 5 would show false positives.

Dr. ZAVON. There is a statistical probability of that.
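A worked form of the figure under discussion, on the assumption (ours, not stated in the record) that it reflects the conventional 5-percent significance level: if 100 genuinely inert substances are each tested once at level α = 0.05, the expected number of spuriously "positive" findings is

$$ \mathbb{E}[\text{false positives}] = n\alpha = 100 \times 0.05 = 5. $$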

Dr. STEWART. Now, have you published that?

Dr. ZAVON. I have the reference here for admission into the record.

Dr. STEWART. Has it been published in a journal?

Dr. ZAVON. It has been published in the Journal of the American Statistical Association, by Dr. Sterling, from my laboratory.

Dr. STEWART. I had not seen that particular reference.

Dr. BLUM. May I ask, Dr. Zavon, is that a blanket statement, with no qualifications as to the conditions? Doesn't it state the number of animals in the experiment? Doesn't it state some other things?


Dr. ZAVON. This is not a question of the number of animals. It is a question of the number of experiments, and the publication decisions; the title of the article is "Publication Decisions and Their Possible Effects on Inferences Drawn From Tests of Significance, or Vice Versa."

I would not attempt to qualify myself as a statistician, and, therefore, would beg off on a debate as to the validity or lack of validity of this publication. However, I note that it was followed up in the same journal by a letter from Gordon Tullock of the University of Virginia, in which he extended this concept. I would be happy to have both a copy of his letter from that journal, and the publication itself, submitted to the record for your perusal at a later date, if this is possible.

Dr. BLUM. I am still puzzled as to what the meaning of this is. Certainly there is the question of false positives, and so forth. But as to these exact figures, I don't know quite how to take them.

Dr. ZAVON. Dr. Sterling, whom I believe to be a competent and recognized statistician, and who has worked in this field for some years, has developed data indicating that the lacunae in the distribution of reports are, or can be, quite severe, owing to the limited publication outlets available for negative data. Therefore, over a period of time, we get in the literature an accumulation of positive reports. As you have indicated, Dr. Stewart, you have to go over the reports themselves and the data in order to evaluate what has actually been done and its true significance. When only positive data, or predominantly positive data, are reported, the tremendous energy, time, and consequently expense of running some of the types of experiment we have been discussing here almost preclude repetition in the ordinary course of events. So a positive statement has great difficulty in being refuted, however correct or incorrect it may be.

I am not attempting to impugn the validity of experiments. I am merely pointing out that it is extremely difficult to get repetition of the type of experimental procedure that we are talking about, because of the length of time necessary, the number of animals necessary, and all of the conditions surrounding this type of experimental procedure.
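Dr. Zavon's point about one-sided accumulation can be illustrated with a minimal simulation sketch (ours, not part of the hearing record; the significance level and the number of experiments are assumed for illustration):

import random

# Illustrative sketch: simulate many independent tests of substances
# that are in fact harmless, judged at a conventional 5-percent
# significance level. If only the "significant" (positive) results
# are reported, the resulting literature consists entirely of
# false positives.

random.seed(1)
ALPHA = 0.05          # assumed significance level
N_EXPERIMENTS = 1000  # assumed number of independent null experiments

reported = 0
for _ in range(N_EXPERIMENTS):
    # Under a true null hypothesis the p-value is uniform on [0, 1],
    # so a "significant" result occurs with probability ALPHA.
    if random.random() < ALPHA:
        reported += 1  # only the positive result reaches print

print(f"{reported} of {N_EXPERIMENTS} null experiments came out 'significant';")
print("every one of those reported findings is a false positive.")

Because the nonsignificant runs are never seen, nothing in the published record distinguishes these chance positives from genuine effects, which is the difficulty of refutation Dr. Zavon describes.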

The CHAIRMAN. Are you asking that this be included in the record?

Dr. ZAVON. Yes, sir, I am.

The CHAIRMAN. May we have a chance to look at it?

Dr. ZAVON. Surely.

Mr. DINGELL. Mr. Chairman, may I ask a question at this point?

The CHAIRMAN. Well, I think they are in the midst of discussing a particular point. Let them conclude the discussion, and then you may inquire.

Will that be satisfactory?

Mr. DINGELL. That will be very satisfactory.

Dr. DARBY. Maybe Dr. Levin can comment on this and help us.

Dr. LEVIN. I believe, Mr. Chairman, there is probably an honest confusion here between two points, one of which is a part of standard statistical theory, and the other of which is the point that when people get positive findings on various experiments, they are more apt to publish them than when they get negative results. Now, the fact that people are more apt to publish positive findings than negative ones I don't think is particularly pertinent to our discussion, since we are talking about tests which would be conducted specifically for the purpose of evaluating these additives, under the direction of industry or by the Food and Drug Administration. So this would not be a matter in which the evidence would depend upon publication.

Dr. ZAVON. I am sorry I have to disagree with you very energetically here, because if we conduct an experiment, and we get a negative report, and we give it to the Food and Drug Administration for evaluation, and they clear the material (and I think Dr. Kensler suggested several instances in which this could have occurred) at a certain level; if, on the other hand, we, for one reason or another, go to a higher level, which gives a positive answer...

Dr. LEVIN. That is where the confusion is. You are talking about the evaluation of a test in one breath, and the publication of tests in the other. These are not the same things.

Dr. ZAVON. These are the same thing in this sense. If we publish positive data, or we submit positive data, this is exactly the same thing. If we submit negative data...

Dr. LEVIN. Are you saying that publication and submission are the same thing?

Dr. ZAVON. In this sense they are, because we are submitting it to a regulatory agency for their perusal.

Dr. LEVIN. Obviously, they are not the same thing, and I don't see how you can make them the same thing, because you can submit data without ever publishing them. All you do is submit them.

Mr. YOUNGER. It is a public document as soon as it is submitted?

Dr. LEVIN. I think the term publication in this article refers to publication in a scientific journal. Now, the 5-percent error, Mr. Chairman, simply refers to the fact that if you have a characteristic which occurs in a population in a given percentage (say 10 percent of our population has violet eyes), and you take samples from that population of varying sizes, repeatedly, you obviously are not going to get 10 percent each time. You are going to get variations around 10 percent. And the variations which you get will usually form a normal curve, according to one type of observation.

Now, we usually say that the actual percentage is greater or less than 10 percent only if the chance of finding such a difference through sampling variation alone is 5 percent or less. In other words, we will accept that much error. And that is precisely the 5-percent figure used as a measurement of acceptable statistical error. It has nothing to do with any particular type of scientific experiment. It does not mean that you will always find that a certain substance will produce a certain effect in 5 percent of the cases. It has simply to do with the normal variation in a characteristic which will be found if you sample repeatedly from a large population. This is standard statistical theory, I think, with which we are all familiar, and that is all it means.
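Dr. Levin's violet-eyes example can be illustrated with a short simulation sketch (ours, not part of the hearing record; the sample size and number of samples are assumed for illustration):

import random

# Illustrative sketch: 10 percent of a population has the
# characteristic (violet eyes). Repeated samples of the same size
# do not each show exactly 10 percent; the observed percentages
# scatter around 10, and over many repetitions the scatter is
# approximately normally distributed.

random.seed(0)
TRUE_RATE = 0.10   # true population proportion, from the example
SAMPLE_SIZE = 200  # assumed number of people per sample
N_SAMPLES = 10     # assumed number of repeated samples

for i in range(N_SAMPLES):
    hits = sum(random.random() < TRUE_RATE for _ in range(SAMPLE_SIZE))
    print(f"sample {i + 1:2d}: {100 * hits / SAMPLE_SIZE:5.1f} percent")

Each run prints percentages scattering around 10; the 5-percent criterion then declares a difference real only when sampling variation alone would produce one that large less than 5 percent of the time.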

The CHAIRMAN. I have had occasion to look over the information suggested for the record. Is that the pamphlet mentioned to me yesterday?

Dr. ZAVON. That is right, sir.


The CHAIRMAN. All right. And the other material, from your publication, that you submitted with it, is that from the journal?

Dr. ZAVON. This is from the Journal of the American Statistical Association.

The CHAIRMAN. All right. Let it be received for the record, then. (The document referred to is as follows:)

PUBLICATION DECISIONS AND THEIR POSSIBLE EFFECTS ON INFERENCES DRAWN FROM TESTS OF SIGNIFICANCE, OR VICE VERSA ¹

(By Theodore D. Sterling, University of Cincinnati)

There is some evidence that in fields where statistical tests of significance are commonly used, research which yields nonsignificant results is not published. Such research, being unknown to other investigators, may be repeated independently until eventually, by chance, a significant result occurs (an "error of the first kind") and is published. Significant results published in these fields are seldom verified by independent replication. The possibility thus arises that the literature of such a field consists in substantial part of false conclusions resulting from errors of the first kind in statistical tests of significance.
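A worked form of the abstract's argument, assuming a conventional significance level of α = 0.05: if a true null hypothesis is tested independently k times, with the nonsignificant attempts going unpublished, the probability that at least one attempt reaches significance by chance alone is

$$ \Pr(\text{at least one significant result}) = 1 - (1 - \alpha)^k = 1 - 0.95^k, $$

which already exceeds one half at k = 14, since 1 − 0.95¹⁴ ≈ 0.51.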

It has become commonplace to speak of a "level of significance" in reporting outcomes of experiments. This significance level refers to the risk of rejecting the null hypothesis, H₀, erroneously, and seemingly has no other direct relationship to experimental work. The experimenter who uses so-called tests of significance to evaluate observed differences usually reports that he has tested H₀ by finding the probability of the experimental results on the assumption that H₀ is true, and he does (or does not) ascribe some effect to experimental treatments. What with the shortage of publication space and the desire for objectivity, it often seems that the responsibility for rejecting a hypothesis rests squarely on a crucial value in a table of probabilities.

The risk of choosing the incorrect inference from experimental observation depends on a stated risk of rejecting H₀ if true and on the risk of failing to do so if H₀ is not true. Here is a dilemma which is dealt with in practice by two conventions. As Savage notes [7, p. 256], publications tend to report the results of the test as well as that level of significance for which the corresponding test of the relevant family would be on the borderline between acceptance and rejection (in the view of the author). The individual reader then makes his own test at a level of significance appropriate to him. How much uncertainty such a reader is willing to tolerate in rejecting a hypothesis that might be true will depend on his confidence in the methods of data collection, his views concerning the relevance of alternative hypotheses, or the weight he gives to evidence from other sources. In addition, scientific readers differ in fundamental strategies for games against nature, and their tolerance for errors can hardly be expected to remain unchanged from one experimental problem to another. The type of reporting mentioned by Savage may well be most satisfactory for author and reader alike.

Some publications, notably of social science content, have adopted a somewhat more extreme convention. Here a borderline between acceptance and rejection of H₀ is taken as a relatively fixed point, usually at Pr(E|H₀) ≤ .05, or at that approximate region for which the probability Pr of the outcome E of the experiment, calculated on the assumption that H₀ is true, is no larger than five in a hundred [3] [6] [8]. General adherence to such a rigid strategy is interesting by itself but might have no further consequences on the decisions reached. However, when a fixed level of significance is used as a critical criterion for selecting reports for dissemination in professional journals, it may result in embarrassing and unanticipated results.

¹ The author wishes to express his thanks to Sir Ronald Fisher, whose discussion on related topics stimulated this research in the first place, and to Leo Katz, Oliver Lacey, Enders Robinson, and Paul Siegel for reading and criticizing earlier drafts of this manuscript.

² The fact that some tables present only the 0.05 and 0.01 levels of significance encourages the use of these two levels of significance [8, p. 292].
