My data, Your data, Everybody's data

by M. Spiropulu

The recent Notice of Proposed Rule Making (NPRM) extends the Freedom of Information Act (FOIA) to "require Federal awarding agencies to ensure that all data produced under an award will be made available to the public under the FOIA. If the agency obtaining the data does so solely at the request of a private party, the agency may authorize a reasonable user fee equaling the incremental cost of obtaining the data."
Senator Shelby (R-AL) submitted this proposed revision to the Office of Management and Budget (OMB).

The supplementary information provided in the NPRM states that "according to congressional floor statements made in support of the provision, its aim is to ``provide the public with access to federally funded research data'' that is ``used by the Federal Government in developing policy and rules.'' 144 Cong. Rec. S12134 (October 9, 1998) (Statement of Sen. Lott); see id. (Statement of Sen. Shelby) (the provision ``represents a first step in ensuring that the public has access to all studies used by the Federal Government to develop Federal policy''.) The proponents also stated that ``the amended Circular shall apply to all Federally funded research, regardless of the level of funding or whether the award recipient is also using non-Federal funds.'' Id. (Statement of Sen. Campbell). They also explained that ``[t]he Conferees recognize that this language covers research data not currently covered by the Freedom of Information Act. The provision applies to all Federally funded research data regardless of whether the awarding agency has the data at the time the request is made'' under the FOIA."

This amendment to the FOIA has opened a debate referred to as Secrecy in Science. In the past decade a number of books and articles on "Secrecy" have been published. Among them is the book of the same name by the well-known and respected Senator Daniel Patrick Moynihan (1998). In his book Sen. Moynihan makes the point that governmental agencies, by withholding crucial information, contributed to poorly made decisions that had disastrous consequences. However, it seems a very long stretch to go from Sen. Moynihan's point to the NPRM amendment.

Many points in the amendment raise reasonable questions, starting with the definition of "data": data is any sort of recorded information, regardless of where and how it is recorded. Therefore digital and analog data, software, writings in logbooks, designs and drawings, tables and charts and statistical records, calibration databases, alignment files, trigger tables and processing programs, Monte Carlo programs and detector simulations are all data. And in most cases, even if one manages to package the data in a concise and usable format, the experience of working in big collaborations shows that one needs substantial additional information to make sense of the data (information that is usually acquired by walking into the offices of a number of people and asking them). Can all this data be subject to release? And if so, can it be released before publication of all possible analyses that are based on the data? Or even before any analysis? Usually the data are collected and analyzed longitudinally, in successive runs, and perhaps only subsets of the data are analyzed at any one time. Are these subsets of data subject to release before the total sample of data has been analyzed?

How is the cost of acquiring the data determined, and who is paid for the data released? How much additional administrative support will the agencies need to comply with the law? For how long after the collection of the data ought the data to be available to the public? There is great concern among scientists themselves about how, and indeed whether, it makes sense to archive their data in ways that allow the data to be analyzed in the future. The technology of information storage is changing rapidly and the amount of information to be stored is huge. The only non-volatile and useful archiving of the data as of now is the publication of analyses of the data. Archiving the data over long periods of time might prove a huge, expensive and perhaps unnecessary challenge.

The amendment as now worded covers the data produced by agencies that are funded by the Federal government. But it does not state whether it makes a difference if the agencies are only partially funded by the Federal government (there are only floor statements on this issue). For example, there are projects in international HEP collaborations that use funds from foreign organizations, and releasing the data might cause a conflict between the participating institutions of a collaboration.

One last worry that comes to mind is the interpretation of the data. It is understandable that one might make data available to peer scientists for checks and reanalyses of the data and for duplication of the results. But it is unclear to me how the public can make sense of the data.

The NPRM amendment to the FOIA is vague and open to a number of possible interpretations. The words "data", "policy", "regulatory agency" and "cost" need to be defined in such a way that putting the law into effect will not prove disruptive and destructive to scientific work.

Public comments on the proposed revision are due by April 5, 1999.
ADDRESSES: Comments on this proposed revision should be addressed to: F. James Charney, Policy Analyst, Office of Management and Budget, Room 6025, New Executive Office Building, Washington, DC 20503. If possible, please include a word processing version of comments on a computer disk. Comments may also be submitted via E-mail to: fcharney@omb.eop.gov. Please include the full body of E-mail comments in the text of the message and not as an attachment. Please include the name, title, organization, postal address, and E-mail address in the text of the message. FOR FURTHER INFORMATION CONTACT: F. James Charney, Policy Analyst, Office of Management and Budget, at (202) 395-3993.

Send your opinion on the issue to Acqua alle Funi

Check the audience responses to a recent NPR Science Friday discussion on Secrecy in Science