(1) make any publication whereby the data furnished by any particular person, business establishment, or other organization can be identified; or

(2) permit anyone other than sworn officers or employees of the Center to examine the individual reports; or

(3) use the information for any purpose other than the compilation of statistics.

The potential for invasion of privacy by improper use and disclosure of information is related to and limited by the data stored in a Center. I will list three examples of this relationship.

First, much of the data would be in the form of summaries at appropriate levels of statistical aggregation. Some of these would show how many reporting units were included in a particular category, but since they would contain no identification of individual persons or businesses they would present no problem of disclosure or invasion of privacy.

Second, where original source material is needed for special analyses, only samples rather than whole populations or universes would be transferred to the Center. In exceptional cases where, in order to perform the functions of the Center, it is necessary to have access to data for a complete universe, this can be accomplished by having the work done at the Center with the source data remaining under the custody of the collecting agency.

Third, certain types of data which may be roughly characterized as the "dossier" or "case file" type would be excluded from storage in the Center. The following are illustrative examples:

Individual personnel records of Government agencies and Civil Service Commission files on individual employees and applicants (letters of reference, performance ratings, test scores, etc.).

Files compiled by FBI, regulatory or other agencies as a result of investigations on individual persons, or businesses or other organizations.

FBI fingerprint files and files on persons convicted of crimes.

Files of revoked drivers' permits.

Medical records on Government employees or applicants and patients of Government institutions.

In short, by carefully limiting the data put in such a Center, it will be possible to affect significantly the potential for improper use and disclosure of information.
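
The first two examples above (summaries at a level of aggregation, and samples rather than whole universes) can be illustrated with a minimal sketch. The record layout, the function names, and the 1-percent fraction are assumptions made for illustration only; the testimony does not describe how the Center would actually draw samples or build summaries.

```python
import random
from collections import Counter

def draw_sample(records, fraction, seed=0):
    """Return a simple random sample holding about `fraction` of the records."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    k = max(1, round(len(records) * fraction))
    return rng.sample(records, k)

def summarize(records, category_key):
    """Count reporting units per category; identifiers are never copied out."""
    return Counter(r[category_key] for r in records)

# Hypothetical source file held by a collecting agency.
source = [
    {"id": i, "industry": "mining" if i % 4 == 0 else "retail"}
    for i in range(1000)
]

sample = draw_sample(source, 0.01)       # only this subset would be transferred
summary = summarize(source, "industry")  # counts only, no identifiers
```

In this sketch the summary carries category counts and nothing else, while the sample still holds individual records and would therefore remain subject to the confidentiality restrictions described earlier.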

As I indicated earlier in my testimony, some of the fears which have been expressed about the creation of a Data Center have stemmed from misunderstanding. But there is also a real basis for concern over invasion of privacy attendant to such a Center. The retention in machine readable form and the use of computers in retrieval certainly enlarges the scale on which privacy could be invaded. The concentration of more information in one place represents a potential for more damage if confidentiality restrictions are violated. It may also increase the incentive for violation. On the other hand, the establishment of a Center would promote and facilitate greater attention to technological design features of information storage and retrieval (both hardware and software) which would assist in safeguarding against improper disclosure.

However, sole reliance cannot be placed on technology. The fixing of responsibility and accountability for the design and exercise of safeguards to assure confidentiality and privacy is desirable. To this end, in addition to the legislation which I have referred to, we would propose that periodic reports should be made public identifying the various bodies of information which have been transferred to the Center for storage and describing what and to whom information has been made available. We would further propose that a public advisory council be established, composed of members from outside the Federal Government representing public interests generally. This Council should include State and local government representatives, lawyers, computer scientists, and users and suppliers of information from the business and academic communities. Its functions would be to advise on public interest in policies, standards, and procedures regarding such matters as types of data to be preserved, processing schedules and programs, service functions performed, and protection of confidentiality. The Council should also audit and report to the public on the adequacy and effectiveness of procedures and practices of the Center to provide adequate protection of confidentiality and privacy.

Finally, I want to emphasize the distinction between a statistical information system which the Data Center we are considering would serve, and other information systems which are designed to compile information about individual reporting units, be they people, businesses or other organizational units. The distinction is important because of the difference in what they portend for invasion of privacy. Performance of the functions of a Statistical Data Center would not require, and law would not permit, the disclosure of information about individual persons, businesses or other organizations. In contrast, if systems designed to compile information about individual reporting units are to perform their functions, they must disclose. Their purpose is to provide information on the basis of which a decision can be made or action taken regarding a particular person. Such activities raise very fundamental questions about the invasion of privacy, but they should be clearly separated from consideration of the desirability of creating a Statistical Data Center.

Senator Long. Mr. Zwick, you mentioned that one of the safeguards is that the law would not permit this. Of course, we want a law that will not permit it, but I am not too impressed with just having laws. If we keep the privacy from being invaded, it would be enough. We have had a law against wiretapping; we have had departmental regulations against wiretapping, and invasion of privacy. We have had States with laws against wiretapping, but in spite of all of that we had many Federal agencies and agents who violate the Federal law and the State law and the regulations of their own department in wiretapping.

So it takes more than just a law to say that privacy must be maintained. There must be a strict enforcement of it, and there must be, I think, every possible safeguard, especially, in light of the news media, to be able to check to see what is going on and whether those rights are not being invaded to fully guarantee that they will not be.

I am curious, when do you think the proposal for a data bank will be ready to submit to the President or to the Congress?

Mr. ZWICK. Quite frankly we are still trying to work out in more concrete detail what such a Center will look like. I think to evaluate it you have to be very specific in terms of the piece of legislation, what type of data you will put in, what you mean by samples; whether we will make it this year is now an open issue. There has been no decision at this point in the executive branch, as to whether or not we will submit a bill during this session.

Senator Long. If I can respectfully suggest to you, in making your proposal, that every possible effort be made to protect the privacy of the information that is placed in the center in those computers (what goes out as well as what goes in), I am sure it will be of great benefit to you.

Mr. ZWICK. Well, sir, I agree completely with that point. And I also agree that just having a law does not guarantee that you are going to have compliance. But by concentrating in one place the responsibility we believe that it is easier to monitor whether or not the law is being abided by.

Secondly, by having a public advisory commission, which would audit and report publicly on the performance of the Data Center, we have another check on it.

Thirdly, the requirement to make public reports on what information is in the Center and who has had access to it is another check. In other words, by concentrating in one place we think you can, in fact, improve the privacy situation by making sure that one person who is responsible takes this responsibility very seriously, as the Census Bureau does today.

So that we may make real improvements by concentrating, focusing, on that one point, the whole privacy issue.

Senator Long. All right.
Mr. FENSTERWALD. I just have one question, Mr. Chairman.

Mr. Zwick, what do you think about the proposal that an individual, if he is willing to pay for retrieval, can see his own dossier any time?

Mr. Zwick. Mr. Fensterwald, I just had not thought of this issue before.

First, I would like to react to the word "dossier." It is not certain that we would even have within the Center the information on a particular individual. Let us assume again that you happen to be in the 1-percent sample of social security that came into the Center. It is not clear—and this is a technical detail—as to whether we would consolidate all that information with other information on one tape or several tapes. But it is clear that we could go through a search and find out everything about Mr. Fensterwald in the Center.

We have not faced the issue of whether it would be desirable to make that available to him. I have at the moment mixed reactions.

The more people who start asking, the more you are going to make publicly available who the exact individuals are that are in the information system, so there is some disclosure of privacy.

On the other hand, your point, I think, is well taken. If data on you is in the Center, you have a right to know what is in there. We have just not faced this one.

Mr. FENSTERWALD. That is all.
Senator Long. Mr. Kass?
Mr. Kass. No.
Senator LONG. Mr. Waters?

Mr. WATERS. I have just one. It is in connection with what the Chief Counsel has raised, and I think it is well taken.

I am concerned with the information in personal injury cases, for example, of individuals who can be compelled under existing discovery procedures to disclose many things about themselves which they cannot get from the Government. They can be required, for example, to turn over copies of their own income tax returns.

Do you contemplate if this were available to the individual that he might be required to secure it and turn it over to other people?

Mr. ZWICK. Well, that, of course, is one of the risks that is attendant to Mr. Fensterwald's questions. He comes to the Center and asks, “Am I in there?” Before he asks that question, there is only some small probability that there is anything on Mr. Fensterwald in that center. As soon as he has asked it, and we search and conclude indeed there is something on him, already a piece of information is available; namely that in the Data Center there is a set of information on Mr. Fensterwald, and since it will be public, what information is in the Center, we will have given both the fact that he is there, plus what information is in the Center to a third person that might want to invade his privacy. So I do think there is a balance here which I have not thought through.

Mr. WATERS. In your consideration of the proposed Federal Data Center or National Data Center, you will give consideration to this type of thing, will you not?

Mr. ZWICK. Yes.

Mr. WATERS. Thank you, Mr. Chairman. That is all.

Senator Long. Thank you, Mr. Zwick. Your statement has been very helpful to us.

Thank you, Mr. Bowman, for being with us. I am sorry we did not ask Mr. Zwick hard enough questions to get some assistance in answering from you. But he did very well.

Mr. Zwick. Thank you very much.

Senator Long. Our next witness was to have been Mr. D. Reid Ross, Vice President, Regional Industrial Development Corporation of St. Louis. But I am informed he will be unable to appear this morning due to the fact that he is presently recovering from an operation.

He has asked permission to submit his statement and some accompanying material and have it placed in the record at the appropriate time. Without objection that will be done.

(The documents referred to follow:)


St. Louis, Mo., March 9, 1967.
Hon. EDWARD V. LONG,
United States Senate, 3107 New Senate Office Building, Washington, D.C.

DEAR SENATOR LONG: Mr. Reid Ross is currently hospitalized and will not be available to testify as scheduled before the Subcommittee on Administrative Practice and Procedure on March 14 as presently planned. Mr. Ross has asked that his letter to you, dated February 2, 1967, and the proposal entitled "A Phased Plan for a Regional Economic Data Bank for the St. Louis Region" be entered into the record as a part of his testimony. He would also appreciate the record being held open so that he might submit additional testimony within the next three weeks. Thank you very much for your courtesy and consideration in this matter.

Very truly yours,

LEROY J. GROSSMAN, Director of Economic Research.


St. Louis, Mo., February 2, 1967.
Hon. EDWARD V. LONG,
U.S. Senate Office Building, Washington, D.C.

DEAR SENATOR LONG: Because of your concern that the creation of data banks may result in an invasion of personal privacy, it occurs to me that the enclosed report on the nature of the proposed Regional Data Bank that we are now establishing for the St. Louis region should be called to your attention. Attached to this report is an appendix itemizing the source and the nature of the economic data we propose to collect, store and retrieve for economic research purposes.

Let me assure you that we share your views regarding protection of individual privacy and will build adequate safeguards into our information system to protect individual and corporate information.

Briefly stated, most of the economic data we will utilize will come from published sources and will be so aggregated as to be identifiable only by geographic location and not by the name of the individual or the firm that provided the information. Further, the data on individuals that we would collect would (1) be limited to such items as transportation patterns, income and occupation, (2) be obtained on a voluntary basis since we have no governmental powers; and (3) be combined to provide only totals by some geographic unit (such as a traffic zone), thereby avoiding individual disclosure.
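
The combining of individual responses into totals by geographic unit, as described above, can be sketched as follows. This is an illustration under stated assumptions, not the procedure RIDC proposed: the field names are hypothetical, and the `min_respondents` rule (dropping any zone with too few respondents) is an added assumption that the letter does not mention.

```python
from collections import defaultdict

def totals_by_zone(responses, min_respondents=2):
    """Sum income and count respondents per zone; names are never carried over.

    Zones with fewer than `min_respondents` respondents are dropped, because a
    total built from one response would reveal that respondent's own figure.
    The threshold is an assumption of this sketch.
    """
    totals = defaultdict(lambda: {"respondents": 0, "income": 0})
    for r in responses:
        cell = totals[r["zone"]]
        cell["respondents"] += 1
        cell["income"] += r["income"]
    return {
        zone: cell
        for zone, cell in totals.items()
        if cell["respondents"] >= min_respondents
    }

# Hypothetical voluntary survey responses.
responses = [
    {"name": "A", "zone": "Z1", "income": 5000},
    {"name": "B", "zone": "Z1", "income": 7000},
    {"name": "C", "zone": "Z2", "income": 6000},
]

zone_totals = totals_by_zone(responses)
```

Here zone Z1 is reported as a total over two respondents, while zone Z2 is withheld because its single response would amount to an individual disclosure.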

Additionally, information obtained by us on individual firms would, of course, be voluntary and not be published except by some geographic or industry aggregation. Further, this information will come, in large measure, from our membership and financial supporters, whose interests we obviously would not violate. The maximum value to be obtained from data we collect in the fashion just described can be realized only if data collected by various federal agencies are also obtainable after aggregation in the same geographic or industrial categories we have indicated we will be utilizing. We are certain that suitable safeguards are being employed now by federal data collecting agencies such as the Census Bureau so that any published data obviates possible disclosure of individual or firm information, and is usually aggregated in a manner suitable to us.

I would deeply appreciate your comments concerning the nature and purpose of the data we seek to collect and utilize. I would also welcome your observations as to our concern about rights and methods guaranteeing individual privacy.

Additionally, I would like to make some observations about the proposed national data center. I have read the Ruggles Report, the Dunn Report and the Kaysen Report recommending creation of such an agency which would establish safeguards against intruding upon the privacy of individuals, and concur in these recommendations. As you know, the task force established by the Bureau of the Budget to study this matter was asked to consider "measures which should be taken to improve the storage of and access to U.S. Government Statistics."

RIDC is a large user of government statistics in the preparation of its industry studies. We attempt to identify products that can be made profitably in the St. Louis region and to measure the size of the market for these products that could be served by a St. Louis facility. To conduct such studies within our limited financial resources, we must have access to all published data, to avoid duplication and to minimize the time and expense of identifying data gaps and collecting and analyzing new data that we must generate.

Therefore, we have built extensive files of various U.S. governmental statistical reports and tables. Nevertheless, we are periodically amazed to learn that extremely useful data is often obtainable from various federal agencies. We learn about the data only after extensive efforts on our part, for no one seems to know anything about it. To collect this data ourselves, however, is financially impossible.

For example, on my recent trip to Washington, I visited the Industry Division of the Census Bureau and learned that certain statistics are available on the mining machinery industry that will be invaluable to us in analyzing the feasibility of manufacturing such equipment in St. Louis. In my fifteen years of experience in utilizing governmental statistics, I had never heard of this statistical series.

It has been said that we have doubled the world's scientific knowledge in the last fifteen years. Certainly, we are doing research in most fields more rapidly than most experts in those fields can even keep track of, much less read, comprehend or utilize, and simultaneously conduct their daily professional affairs. Fortunately, we have also entered the computer age in the last fifteen years and now have the capability to record, store and retrieve results of research in many fields, by use of machines.

For this reason, a federal data center becomes as much a necessity for today's researchers as the Library of Congress has been in the past. In fact, a data center is nothing more than a library of machine-readable statistics recorded on magnetic tape or punched cards instead of in books, and can and should be operated as a library.

For example, even public libraries have not permitted every reader to read every book in the library, nor have they bought every book that has been printed. They charge fees and some books are not released to children. This analogy has applicability to a national data center which certainly does not have to put into machine-readable form all data collected by federal agencies. Further, it can withhold some data from some users. It can also charge fees, thereby meeting some of its costs and at the same time, discouraging unproductive, or unlawful statistical "browsing". Likewise, it does not have to have on its premises all computerized data compiled by all government agencies, provided its comprehensive indexing system noted the existence and the location of the data not physically in the center. To further prevent inappropriate disclosures, the governmental agencies responsible would aggregate information when needed and thereby maintain security.
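
The library analogy above (an index that records where each data set is held, withholds some data from some users, and charges fees) can be sketched as a small catalog. Every name here, including `CatalogEntry`, `Catalog`, the `cleared` flag, and the fee amounts, is hypothetical and chosen only to illustrate the idea; the letter does not specify any such design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CatalogEntry:
    title: str
    location: str     # "center", or the agency that physically holds the data
    restricted: bool  # True if the series may be withheld from some users
    fee: float        # charge for a retrieval request

class Catalog:
    """Index of data sets, including those not stored at the center itself."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        self._entries[entry.title] = entry

    def locate(self, title):
        """Report where a data set is held without releasing any data."""
        return self._entries[title].location

    def request(self, title, cleared):
        """Return (location, fee) if the user may have access, else refuse."""
        entry = self._entries[title]
        if entry.restricted and not cleared:
            raise PermissionError(f"{title} is withheld from this user")
        return entry.location, entry.fee

catalog = Catalog()
catalog.add(CatalogEntry("Series A", "center", restricted=False, fee=5.0))
catalog.add(CatalogEntry("Series B", "Agency B", restricted=True, fee=20.0))
```

In this sketch a researcher could learn that "Series B" exists and is held by "Agency B" through `locate`, even though `request` would refuse to release it to a user who is not cleared.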

For these reasons, I am convinced that a national data center is fully in the public interest and a necessity in this age of exponentially expanding knowledge. Further, U.S. government statistics should be collected by a central agency so that we can integrate them with state and local government statistics in order
