KEYWORDS: Administrative Records, CD-ROM, Hypertext Markup Language (HTML), imaging, information repositories, jukebox, National Archives and Records Administration (NARA), public access, Superfund, Superfund Document Management System (SDMS), 386.
The following paper reviews some ways in which these issues affect the management of public records in the U.S. Environmental Protection Agency (EPA) during the transition to a digitally-imaged collection. The EPA's Dallas-based Region 6 office Records Management program is currently converting a number of program records to optical storage and retrieval systems. Activities to digitize the Superfund program's Administrative Records will be briefly examined.
Documents must be retrievable and accessible to be useful. This is equally true regardless of the medium. However, any handling of a paper document risks its deterioration or misplacement. The EPA's Office of Information Resources Management recently published a list of "Threats to Manual Information Systems" [6]:
1. Threats to availability
* Theft
* Accidental destruction of records
* Deliberate destruction of records
2. Threats to integrity
* Accidental damage
* Deliberate distortion of records through fraud or sabotage
3. Threats to confidentiality
* Unauthorized disclosure
Balancing the need for access against the challenges of preservation can be a delicate act. A document has no point if it has no utility, but neither has it utility if it vanishes. The above list can be extended. Since most records have no copies, if the original is lost or destroyed, it may be irreplaceable. Tracking records is like tracking the pages of a book -- you don't know one's missing until you look for it. Access to a unique paper document generally means that only one person at a time can use it. Imaging holds great promise to move the Agency beyond the majority of these access problems.
An imaged document may be viewed by many parties without conflict or delay. Copies may be downloaded to diskette without threat to the official copy. Users' access levels can be set to protect confidential information from improper disclosure, and electronic redaction can serve the same purpose. Another alternative is simply not to image confidential material unless and until it is declassified. Once a document is imaged, problems with damaged or missing pages disappear. By the same token, documents written to optical platters should not disappear for as long as the medium lasts. So long as standard backup and security procedures are followed, and one or more duplicate optical platters are stored offsite, preservation issues are addressed and more attention can be given to access. The tradeoff carries a risk: if the technology crashes and the original documents have been removed to remote storage, no one has ready access.
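The access-level control described above can be pictured with a small sketch. It is an illustration only, not a description of how SDMS enforces access: the level names, the `ImagedDocument` record, the `viewable` function, and the document identifiers are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical access levels; the source does not name SDMS's levels."""
    PUBLIC = 0
    STAFF = 1
    CONFIDENTIAL = 2


@dataclass(frozen=True)
class ImagedDocument:
    doc_id: str   # hypothetical identifier
    level: Level  # minimum level a user needs to view the image


def viewable(docs, user_level):
    """Return the IDs of documents a user at user_level may view."""
    return [d.doc_id for d in docs if d.level <= user_level]


# Hypothetical sample collection.
DOCS = [
    ImagedDocument("AR-001", Level.PUBLIC),
    ImagedDocument("AR-002", Level.STAFF),
    ImagedDocument("AR-003", Level.CONFIDENTIAL),
]

public_view = viewable(DOCS, Level.PUBLIC)
staff_view = viewable(DOCS, Level.STAFF)
```

Under these assumptions a public user sees only the first document, while program staff see the first two. The confidential image stays hidden from both without being removed from the collection.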
The EPA's ten Regional offices, and numerous satellites, reflect the distributed nature of the Agency. Environmental problems differ according to geographical and other local factors. Consequently, the Regional offices tend to operate with national oversight, but with great independence from one another. Perhaps more by evolution than design, or maybe in an attempt to respond to many voices with differing concerns, the SDMS is designed with a great deal of capacity. Each Region will have to adapt it to its needs, or else face changing its records management to meet the system's design.
The Superfund Records program in San Francisco uses a bar code method of identifying and tracking individual documents, and has no rigorous file structure. The Dallas office, on the other hand, has a well-established file structure relying on parent-child relationships to organize its documents. SDMS was not designed to manage hierarchical relationships between documents. The challenge for Region 6, then, has been to represent its collection through relational indices, and to design an interface to act as a filter to the system, without requiring extensive recoding of the application. Because of its size, SDMS lacks flexibility. It is hoped that the Region 9 design, and the Region 6 adaptation, will point the way for each successive Region to overlay its own "filter" on the general system. This method will enable a "one size fits all" approach, provided that the glass slipper is large enough for the feet of ten Cinderellas.
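One common way to carry a parent-child file structure in a relational index is to give each record a pointer to its parent. The sketch below shows that idea only. The source does not say how Region 6 built its indices, so the `documents` table, its columns, and the sample titles are assumptions. SQLite, from Python's standard library, stands in for the Oracle 7 database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        doc_id    INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES documents(doc_id),  -- NULL for a top-level record
        title     TEXT NOT NULL
    );
    -- Hypothetical sample records, not taken from any real file plan.
    INSERT INTO documents VALUES
        (1, NULL, 'Site file'),
        (2, 1,    'Remedial investigation'),
        (3, 2,    'Sampling report'),
        (4, 2,    'Laboratory data'),
        (5, 1,    'Correspondence');
""")


def subtree(conn, root_id):
    """Return (doc_id, title, depth) for root_id and everything filed beneath it."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(doc_id, title, depth) AS (
            SELECT doc_id, title, 0 FROM documents WHERE doc_id = ?
            UNION ALL
            SELECT d.doc_id, d.title, t.depth + 1
            FROM documents d JOIN tree t ON d.parent_id = t.doc_id
        )
        SELECT doc_id, title, depth FROM tree ORDER BY depth, doc_id
        """,
        (root_id,),
    )
    return rows.fetchall()


investigation = subtree(conn, 2)
```

Here `investigation` holds the "Remedial investigation" record at depth 0 and its two children at depth 1. The hierarchy lives entirely in the index, so an interface layered over the system could present a file tree without the underlying application knowing about parents and children.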
The indices for the Superfund Document Management System are stored in an Oracle 7 relational database. This is complemented by a fulltext database known as HighVIEW, developed by Highland Technologies. Searching is handled by ZyINDEX, which allows highly complex search strategies. Searches may run against the relational indices, the fulltext, or both. Boolean, proximity, and quorum operators are available, and numeric ranges may be specified as well. Wild card symbols can substitute for characters within a term: sh??e retrieves shore or shade, and sho* recalls show, shore, or short [7]. The end user accesses the system through a point-and-click Windows interface. As long as an employee's personal computer has at minimum a "386" processor and Microsoft Windows, the imaged document may be retrieved at the desktop across a Novell local area network.
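The wild card behavior in the examples above can be imitated with a short sketch. It does not reproduce ZyINDEX itself. It assumes, from the two examples in the text, that `?` stands for exactly one character and `*` for zero or more, and it restricts both to word characters. The function names and the sample vocabulary are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wild card term into a case-insensitive regular expression.

    Assumed semantics (inferred from the examples in the text):
      ?  exactly one word character
      *  zero or more word characters
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(r"\w")
        elif ch == "*":
            parts.append(r"\w*")
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts), re.IGNORECASE)


def match_terms(pattern, vocabulary):
    """Return the vocabulary terms that match the whole wild card pattern."""
    rx = wildcard_to_regex(pattern)
    return [term for term in vocabulary if rx.fullmatch(term)]


# Hypothetical index vocabulary.
VOCAB = ["shore", "shade", "show", "short", "shoe", "ship"]

fixed_width = match_terms("sh??e", VOCAB)
open_ended = match_terms("sho*", VOCAB)
```

With this vocabulary, `sh??e` matches only the five-letter terms shore and shade, while `sho*` matches every term beginning with "sho", including shoe. The first form fixes the length of the term and the second does not.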
Documents are batch scanned into a Pentium file server before being transferred to six-gigabyte 12" WORM optical platters running on a jukebox system. Present estimates predict that optical discs will suffer no significant deterioration of data for 10-30 years, assuming careful storage and handling [8]. A study recently concluded by NARA and the National Institute of Standards and Technology suggests that certain optical media may endure from 57 to 121 years, depending on the heat and humidity to which the discs are subjected [9].
Several Region 6 imaging programs are well underway [10]. In many respects the SDMS project may be the most ambitious. The collection is organic, and growing rapidly. It is heavily used both by EPA program staff and by the public through Freedom of Information Act requests. There is also an increasing interest in making databases and document collections openly accessible to any interested party across the Internet. Several EPA databases are already accessible through the EPA gopher and World Wide Web homepage. The following case provides examples of some immediate benefits to be won from imaging Government documents.
A common problem associated with the AR, particularly when it is only available in hard copy at a repository (typically a library or public facility near a Superfund site), is that parts of the document collection can be damaged from handling, or worse, simply "disappear" from the AR altogether. This hinders (or prevents) timely review of the AR by the public, which must take place within a set time frame (typically between 30 and 60 days). Furthermore, hard copy AR files often occupy many linear feet of stack/shelf space, and can be intimidating to the lay public who are often only interested in a small portion of the overall AR. Storing and managing these large volumes of organic files in Regional records centers is also problematic. Happily, evolving digital technology may offer a solution to all of these problems, as well as significant cost savings to the government (and thus, the taxpayer).
Superfund community involvement staff in the EPA Region 6 offices, Dallas, Texas, have taken limited, informal surveys of existing Superfund site information repositories to determine availability of CD-ROM hardware. The results of these surveys are promising: Upwards of 70% of the repositories either already possess CD-ROM capacity, or plan to acquire this technology in the coming months. Most library managers are well aware of the significant space savings and ease of information access afforded by compact discs.
This technology is particularly promising from a security standpoint. The recurring malady of damaged, missing, or incomplete documents in the AR can be virtually eliminated by simple control mechanisms already in place at most libraries. The CD-ROM is treated as a reference document: it is checked out to library patrons for use, then returned after review is complete. Users will also have the ability to make hard copies of the imaged information or, for a reasonable fee, to purchase a CD-ROM copy of the AR directly from EPA (e.g., under the Freedom of Information Act).
Discussions on using CD-ROM discs as delivery vehicles for Administrative Records are still in the preliminary stages. Recent developments in CD-Recordable technology promote this approach, but the mastering process from the 12-inch WORM optical platter needs to be designed. Consideration is being given to coding or selecting viewing software which will allow the CD to be launched with a simple command. The user would be able to navigate through menus to select either individual or groups of images. Emphasis will be placed on ease of use.
The situation is similar to that which the Agency has faced since its foundation: independent programs with distinct mandates settle on internal remedies to what may in fact be systemic problems. Or worse, local patches accumulate until they become system integration problems for the entire Agency. Reinvention is the catchword driving much of the Government's modernization efforts. Reinvention initiatives have made their way into information management principles and systems technologies. At the core of information systems are the documents which they are designed to handle. The EPA imaging projects are attempts to enhance user access to public documents and reduce the hassle of dealing with incompatible, and sometimes incomprehensible, Government information systems [11].
Optical technology is improving, and network technology and use are co-evolving at a revolutionary pace. Despite the perils, the paper medium has limitations and costs which may be addressed splendidly by developments in the digital cosmos. Optical technology offers enough advantages over present ways of doing business that the investment may prove well worth it.
THE VIEWS EXPRESSED IN THIS PAPER DO NOT NECESSARILY REPRESENT THOSE OF THE U.S. EPA OR THE UNITED STATES GOVERNMENT.