User Needs Assessment and Evaluation for the UC Berkeley Electronic Environmental Library Project: a Preliminary Report

Nancy A. Van House
School of Library and Information Studies
University of California
Berkeley, CA 94720-4600, USA
Tel. 1-510-642-9980
E-mail: nav-lis@cmsa.berkeley.edu

ABSTRACT

The UC Berkeley Electronic Environmental Library is a massive, distributed, electronic, work-centered library of information in a variety of formats supporting environmental planning. The goals of the user needs assessment and evaluation component of this project are to maximize usability by means of user-centered system design and evaluation, and to explore how the methods of user-centered system design, usability testing, and library evaluation can be applied to digital libraries. The digital library combines the characteristics of libraries, electronic information retrieval systems, and computer systems that support work, creating new and interesting problems of design and evaluation that require new methods.

KEYWORDS: Digital libraries, user-centered system design, evaluation, needs assessment.

INTRODUCTION

The UC Berkeley Electronic Environmental Library Project is a multi-disciplinary project funded under the NSF/NASA/ARPA Digital Libraries Initiative. The goal is to develop a massive, distributed, electronic, work-centered library of environmental information containing text, images, maps, sound, full-motion videos, numeric datasets, and hypertextual multimedia composite documents to support actual environmental planning decisions.

By work-centered, we mean a digital library designed to support the work of the users, in this case, environmental planning. These services are distinguished from those required of other forms of digital libraries, such as those found in education or entertainment, in that the key decisions about the system, and evaluation of it, are based on supporting the users' work tasks. In our case, the users are from diverse organizations but are united by their goal of environmental planning. This project is unusual among digital library research projects in its focus on information in support of public policymaking. The potential users and uses of the system are extremely varied and the applications are complex and of substantial practical significance.

Our primary goals are to provide a coherent, content-based view of a diverse distributed collection which will scale to very large collections and large numbers of clients and servers, and to improve data acquisition technology. We are addressing these problems by research focussing on:

The testbed consists of diverse information types that are used in environmental planning. The initial focus is water planning for the San Francisco Bay Delta, although in time we will widen our scope to other bioregions and issues. The target users are the participants in environmental planning: employees of state agencies, but also members of local and federal agencies, environmental and industry groups, and the public.

The testbed is being implemented on two tracks. A low-tech system consists of a search engine and data available over the World Wide Web. The corpus at this writing consists of photographs and both ASCII and OCRed versions of scanned documents, primarily state publications. The combination of OCRed and scanned text allows users both to manipulate the text and to view the original document, complete with images, tables, and other data that do not easily translate during OCR.
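
The pairing of OCRed text with the scanned page can be pictured with a small sketch. The record layout, field names, identifiers, and file paths below are illustrative assumptions, not the project's actual data model: each page keeps its OCR text for searching and a pointer to the scanned image for display.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One scanned page: OCR text for searching, image path for viewing."""
    doc_id: str       # hypothetical document identifier
    page_no: int
    ocr_text: str     # machine-readable text produced by OCR
    image_path: str   # scanned image that preserves tables, maps, photos


def find_pages(pages, term):
    """Return pages whose OCR text contains the term (case-insensitive)."""
    needle = term.lower()
    return [p for p in pages if needle in p.ocr_text.lower()]


# Invented sample data, for illustration only.
pages = [
    Page("bulletin-a", 1, "Delta smelt habitat and water quality", "bulletin-a/p1.gif"),
    Page("bulletin-a", 2, "Levee maintenance schedule", "bulletin-a/p2.gif"),
]

hits = find_pages(pages, "smelt")
for hit in hits:
    # The match is made on the OCR text; the user is shown the original scan.
    print(hit.doc_id, hit.page_no, hit.image_path)
```

Keeping both forms side by side means a search hit can always be checked against the original page, which matters when OCR has mangled a table or figure.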

The current search engine for textual materials is based on the Dienst protocol, with full-text searching to be added shortly. The photos are accessed via a geographical browser. This version is publicly available at http://elib.cs.berkeley.edu.

The second track is a high-tech system implemented on an Illustra database, accessed via a geographical browser and innovative text searching based on natural language processing. The data includes diverse georeferenced (that is, referring to a geographical area) datasets as well as text and still images of several different kinds, and ultimately video as well. Functions of the high-tech version will migrate to the low-tech version as feasible.
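
As a rough illustration of what georeferencing makes possible, the sketch below attaches a bounding box to each item and selects the items whose area overlaps a region of interest. The class names, fields, titles, and coordinates are invented for illustration and say nothing about how the Illustra-based system is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in decimal degrees (west/south/east/north)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True when the two boxes share any area or edge."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class GeoItem:
    title: str
    kind: str             # e.g. "text", "image", "dataset"
    extent: BoundingBox   # geographical area the item refers to


def items_in_region(items, region):
    """Return every item whose extent overlaps the query region."""
    return [item for item in items if item.extent.intersects(region)]


# Invented sample records with rough, made-up coordinates.
items = [
    GeoItem("Water quality series", "dataset", BoundingBox(-122.0, 37.8, -121.3, 38.4)),
    GeoItem("Mountain snowpack photo", "image", BoundingBox(-120.5, 38.5, -119.5, 39.5)),
]

region = BoundingBox(-121.9, 37.9, -121.5, 38.2)
for item in items_in_region(items, region):
    print(item.kind, item.title)
```

Because text, images, and datasets all carry the same kind of extent, one geographical query can return items of every type. This simple test ignores boxes that cross the 180-degree meridian.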

The project is proceeding by means of an iterative design process with substantial attention to the needs of users in the areas of content, resource discovery and retrieval methods, document analysis, interface design, and browsing.

This paper discusses the user needs assessment and evaluation component of the project: its underlying premises, methods, and initial findings. We are approximately six months into a four-year project, so these are the early findings of a developing project.

ENVIRONMENTAL PLANNING IN CALIFORNIA

Water is a key resource anywhere, but nowhere more so than in the arid West. California has, especially in recent years, experienced cycles of drought and flood. The state has an elaborate system of state, federal, and local water projects that collect, store, and redistribute water from one end of the state to the other. The functions of this complex and delicately balanced system include water supply, flood control, power generation, recreation, and fish and wildlife preservation. Water planning in California is complex, contentious, and highly political.

The San Francisco Bay Delta is a linchpin in the state water system. Two-thirds of the state's population get their water via the SF Bay Delta, which consists of over 50 man-made islands, a thousand miles of levees, and hundreds of miles of meandering waterways where fresh river water and salt water from the Bay come together. The Delta is also a fragile ecosystem that provides habitat for hundreds of species of fish, waterfowl, mammals, and plants while supporting extensive agriculture and recreation.

Environmental planning is complex and so are environmental documents. An environmental report for a part of the Delta may address waterways, water quality, endangered species, recreation, economic development, land use, agriculture, soil, transportation and utility infrastructures, shipping, flood control, flood insurance programs, political structure, legislation, ecosystem protection, and history. It will include text, maps, charts, graphs, tables, and photos.

Issues recur over time and across geographical areas and therefore across planning initiatives and documents. The Delta smelt, for example, an endangered species of fish, spends different parts of its lifecycle in different places and so will turn up in planning documents covering several areas. For the smelt to survive, these analyses have to be shared and the plans have to be coordinated.

The California Department of Water Resources (DWR) is the state agency with major responsibility for water planning. The DWR coordinates its efforts with a host of other agencies, ranging from the federal Bureau of Reclamation to the Environmental Protection Agency; other state agencies; and local agencies, ranging from water districts to mosquito abatement districts. Other key water planning stakeholder groups include environmental, agricultural, and industry groups.

DWR is also engaged in public education. California suffers from a chronic, mammoth, and growing undersupply of water, even in flood years. DWR's mission includes educating the public about the importance of water and its proper use by such means as developing curriculum materials for schools.

Public policymaking and planning, especially on an issue as critical as water, is information-intense. Water planning requires forecasting supply and demand; developing and modelling alternatives for the management of water supply and demand; and forecasting their costs and impacts. It must take into account the underlying science and the environmental context, past conditions and outcomes, sophisticated modelling of complex systems, and public policy priorities.

Water planning in California is a highly consultative process. It includes extensive analysis by state agency staff and others; exhaustive review and analysis by other stakeholders; and protracted public discussion.

State agencies are not the only producers of environmental information. Local and federal agencies are also involved, and many environmental and industry groups monitor and critique the state's work and develop their own data, analyses, and interpretations.

A major goal of the UC Berkeley Electronic Environmental Library project is to support this planning process. Its goals include providing effective access to existing information, reducing the duplication of effort in data collection and analysis, improving the coordination of planning across projects, enhancing the interagency and public review process, and aiding in the dissemination of the large amount of information generated as part of the planning process.

In summary, some key aspects of California water planning that impact digital libraries:

USER NEEDS ASSESSMENT AND EVALUATION

The primary goal of the user needs assessment and evaluation component of this project is to maximize the usability of the UC Berkeley Electronic Environmental Library by means of user-centered system design and evaluation. We are pursuing diagnostic or formative evaluation based on both expert and user evaluation beginning at an early stage in the system's life cycle [sweeney93] and proceeding through the life of the project.

A second goal is to explore how the methods of user-centered system design, usability testing, and library evaluation can be combined, adapted, and extended to the design and evaluation of digital libraries. It is our contention that the digital library combines the characteristics of its precursors -- libraries, electronic information retrieval systems, and computer systems that support work -- to create new and interesting problems of design and evaluation which require new methods for design and evaluation, or at least a rethinking of existing methods.

Usability has been defined as "[a system's] capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and support, to fulfill a specified range of tasks, within the specified range of environmental scenarios" (Shackel quoted in [dillon94], p. 14). This definition stresses the importance of context. Information-seeking can best be understood as a means toward a user-defined end of performing some higher-order task. A digital library must ultimately be evaluated according to how well it supports the users' tasks, and so users are key partners in making that assessment.

The usual approach to user-centered system design is an iterative process similar to the following, derived from Dillon [dillon94]:

Our initial stakeholder and user requirements analyses focussed on staff within the Department of Water Resources. The staff are varied, including scientists and engineers but also writers and education specialists who work with the public. We are currently moving out to other agencies and interest groups involved in SF Bay Delta planning. Ultimately our digital library will be used by people ranging from environmental planners to interested members of the public to schoolchildren.

User requirements analysis for a work-centered digital library serving a group much larger and more diverse than a single organization or work group must look well beyond task analysis. We have defined four levels of analysis at which user needs assessment and evaluation must operate:

The purpose of defining these levels is to guide user needs assessment, system design, and evaluation. Only by first understanding how environmental planning is done, by whom, and the role of information can we understand how a digital library can support this process. It is always difficult to anticipate the innovative uses to which new technology will be put, and designing systems to suit users' current behavior risks casting in concrete outmoded procedures. A focus on users' functional goals (as well as current practices) provides useful information about how information is used and therefore how technology may help [brown92, dillon94, nielsen94].

Different data collection methods are appropriate for different levels of analysis; for example, to understand the environmental planning process we are using unstructured interviews; sense-making approaches [dervin83] are useful for understanding information acts; and many standard usability methods are appropriate for evaluating system use.

Some initial findings from our interviews reinforce our initial focus on the larger context rather than information system preferences. Our interviews indicate that information search and retrieval are not particularly salient to many of our subjects. Their primary focus is their tasks. They generally reflect little on their information seeking and use: they use the tools to which they are accustomed, and rely largely on interpersonal information channels. However, when they see the potential of new tools to improve their work, some are eager to adopt them.

Users' evaluations of services are based on expectations, which are in turn based on prior experience [parasur90]. The library evaluation literature has found a low level of expectation among naive users [vanhouse93] and a tendency toward uncritical acceptance of what expert evaluators rate to be low levels of service. Users' evaluations of digital libraries, and their ideas about possible information products and services, are therefore useful but inadequate as a basis for the design of digital libraries. Hence evaluators and designers must consider but go beyond users' expectations and suggestions.

Our data collection methods include:

In practice, the distinction among these methods blurs. For example, during the interviews -- many of which take place at the user's desk so that we can observe their working environment and tools [suchman91] -- some users have logged onto the Web version of the system to illustrate their points, which often segues into an impromptu protocol analysis.

FINDINGS

At this early stage some interesting findings are already emerging. We divide them into those concerned with the user-centered system design process, and those concerned with the digital library, its content and users' needs and behaviors.

User-Centered System Design.

The idealized model of user-centered system design described above is an oversimplification in this complex setting. Unlike systems designed to serve a constrained user group for a specific function, and like traditional libraries which serve diverse users and purposes, the range of stakeholders here is potentially vast. How do we design for usability when the "dialog between partners" [adler92] (as user-centered system design has been called) involves so many heterogeneous groups?

Professional and disciplinary communities and government agencies each have their own schemas for understanding the world and their task domains [elio90]. And the public is likely to differ from these experts. Because water planning is a complex, on-going process, regular participants are likely to engage in a joint sense-making process that results in a shared view of the world and their tasks that differs from the public's [harris94]. How do we design digital libraries to accommodate these differences? Whose schemas get reflected in the system's design?

Design is always a political process, with differing interests and priorities [greenbaum91]. Evaluation is likewise political [childers93]. Environmental planning is of course intensely political. Multiple constituency models of organizational effectiveness [zammuto84] may provide some guidance, but there is no easy solution for adjudicating among different groups' needs. Can we design digital libraries to satisfy both subject experts and novices? Organizational insiders and outsiders?

The standard approach of setting performance targets and evaluating the system against them is also problematic. Neither the applications for which the system will be used nor the level of performance deemed acceptable is fixed. Users' needs and expectations develop along with the system, and their experience with this and other systems will affect their expectations and evaluations [parasur90]. The technology of digital libraries is continually changing. Evaluation needs to be fluid and dynamic. And yet the system builders need design targets.

The Content.

The content of work support systems is determined by the work to be aided. The content of traditional libraries is determined by the parent organization, which sets policy about what information is to be included. Organizational priorities and limits on acquisitions budgets and storage space determine what is contained. Digital libraries, with their own corpora and links to external information sources, raise new questions about the process and standards by which content decisions are made.

In the technical and political context of environmental information, conflicts arise over the appropriateness of including some information. The quality of some data may be challenged. Interest groups' analyses may differ substantially from those of government agencies. Individuals with varying levels of scientific and political legitimacy may seek to contribute to the corpus as a means of airing their views. The decisions about what information to include and what to exclude give rise to debates over ownership, control, censorship, and public participation in policymaking.

The Tasks.

Changes in information availability are likely to change the tasks supported by that information. Planning is essentially an information function: describing the current situation, developing alternatives, forecasting their impacts and costs, disseminating plans and recommendations, collecting and synthesizing input -- these are information activities. An effective digital library will not only assist this process but may profoundly change it. The current process is constrained by the tools available. Although government information is in theory freely available, government agencies often have much valuable information that is not widely distributed due to simple logistics.

Current trends in planning in California will probably make the usefulness of the digital library even greater. The state is working deliberately to make the process of water planning more open, with earlier public involvement, and with more emphasis on coordination across many smaller projects rather than large state initiatives. The digital library is a potentially valuable tool in making this process more open, dynamic, responsive, and transparent.

We may see subtle and complex changes in tasks and work processes as well as the higher-level planning functions. Ruhleder [ruhleder94] warns that information systems researchers must understand the effects of information systems on the codification of data used to accomplish tasks and the relations between users and their tools, techniques or systems for accessing and interpreting data. She points out that media, thought, artifacts, and work processes are deeply intertwined. Because environmental planning is information-intense, we expect changes in tools will change work practices. For example, environmental planning is profoundly influenced by, on the one hand, the existence of many complex time-series datasets, and, on the other, the logistical complexities of acquiring and using data from different sources and in different formats. Improved access to these data will likely change how they are used.

Information Acts.

Among our subjects so far, the information search process is primarily casual and piecemeal, with a heavy reliance on experts. Their behavior can perhaps better be described as information trolling than information search. When something relevant floats past, they snag it: a mention in a conversation, a paper sent to them to review, something stumbled across on the Web. When they do search for information, they are likely to simply call an expert. When they do formal literature searching, their attention to detail is surprisingly low.

One reason for this behavior appears to be the nature of the information need. Workplace users, at least these users of environmental information, want to retrieve information rather than documents per se. For example, they may want to know the total demands on water of a given river. This information may be in the files of an individual, a dataset, or a document. Their need is not for a document but for an answer to a question. In library terms, it is a reference question, not a document request. Users need powerful, complex retrieval and analysis of heterogeneous objects. Our users are enthusiastic about Textiles [hearst93] because it allows analysis and searching at the level of topic rather than document.
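
The difference between retrieving a document and retrieving the part of it that answers a question can be sketched as follows. This is a deliberately simple stand-in, ranking paragraphs by query-term overlap; it is not the subtopic-structuring method of [hearst93], and the function names and sample texts are invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def best_passages(documents, query, top_n=1):
    """Rank individual paragraphs (not whole documents) by query-term overlap."""
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        for index, passage in enumerate(text.split("\n\n")):
            overlap = len(query_terms & set(tokenize(passage)))
            if overlap:
                scored.append((overlap, doc_id, index, passage))
    # Highest overlap first; ties broken by document id and position.
    scored.sort(key=lambda row: (-row[0], row[1], row[2]))
    return [(doc_id, index, passage) for _, doc_id, index, passage in scored[:top_n]]


# Invented sample texts; paragraphs are separated by blank lines.
documents = {
    "report-1": ("History of levee construction in the region.\n\n"
                 "Total demands on river water by agriculture and cities."),
    "report-2": "Recreation and boating facilities.",
}

for doc_id, index, passage in best_passages(documents, "total water demands river"):
    print(doc_id, index, passage)
```

The result points to a passage within a report rather than to the report as a whole, which is closer to answering a reference question than a document-level hit list would be.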

Planning and analytical work consists of a continuum or flow, with reports as products that instantiate the work at a point in time. A work group needs to be able to integrate existing internal bodies of multi-datatype documents with external sources. They are continually creating new materials, requiring differing degrees and functionality of external access. Flexible authoring, structuring, and delivery mechanisms are required.

Finally, the digital library needs to be integrated into and augment the users' established work practices. It must interoperate with the work groups' other systems.

CONCLUSIONS

We are at an early stage in user-centered evaluation and iterative design of a project that we believe to be unique and influential. The development of a digital library with unique search features to support environmental planning for a critical resource is an opportunity to further research on both digital library design and user-centered design and evaluation methods.

The digital library combines the characteristics of its precursors -- libraries, electronic information retrieval systems, and computer systems that support work -- to create new and interesting problems of design and evaluation which require new methods for design and evaluation, or at least a rethinking of existing methods.

ACKNOWLEDGMENTS

This project is funded by the NSF/NASA/ARPA Digital Libraries Initiative. The UC Berkeley project is led by Robert Wilensky, Principal Investigator, and Michael Stonebraker, co-Principal Investigator. A description is available at

http://http.cs.berkeley.edu/~wilensky/proj-html/proj-summary.

This paper draws on the work and insights of a multidisciplinary group that includes Prof. Robert Twiss of the UC Department of Landscape Architecture, and Mark Butler, Lisa Schiff, and Gloria Stockton of the School of Library and Information Studies. Gary Darling of the California Resources Agency, Ray MacDowell of the California Department of Water Resources, and numerous employees of the Dept. of Water Resources have been invaluable in helping us to understand their work. The many other members of the UC Berkeley Digital Libraries project have also contributed to the work described here.

REFERENCES

[adler92] Adler, Paul S., and Terry A. Winograd. The usability challenge. In Adler, Paul S., and Terry A. Winograd, eds. Usability: Turning Technologies into Tools. Oxford University Press, NY, 1992, 3-14.

[brown92] Brown, John Seely and Paul Duguid. Enacting design for the workplace. In [adler92], 164-197.

[childers93] Childers, Thomas, and Nancy A. Van House. What's Good: Describing Your Public Library's Effectiveness. American Library Association, Chicago, 1993.

[dervin83] Dervin, Brenda. An overview of sense-making research: concepts, methods, and results to date. International Communication Association Annual Meeting, Dallas, May, 1983.

[dillon94] Dillon, Andrew. Designing Usable Electronic Text: Ergonomic Aspects of Human Information Usage. Taylor and Francis, Inc., Bristol, PA, 1994.

[elio90] Elio, Renee and Peternela B. Scharf. Modeling novice-to-expert shifts in problem solving strategy and knowledge organization. Cognitive Science 14 (1990), 579-639.

[greenbaum91] Greenbaum, Joan and Morten Kyng. Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum Associates, Hillsdale, NJ, 1991.

[harris94] Harris, Stanley G. Organizational culture and individual sensemaking. Organization Science 5,3 (August 1994), 309-321.

[hearst93] Hearst, M. and C. Plaunt. Subtopic structuring for full-length document access. In the 16th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, Pittsburgh, 1993, 59-69.

[nielsen94] Nielsen, Jakob. As they may work. Interactions 1.4 (1994), 419-24.

[parasur90] Parasuraman, A., Leonard L. Berry, and Valarie A. Zeithaml. An Empirical Examination of Relationships in an Extended Service Quality Model. Marketing Science Institute Working Paper, Report 90-122, Cambridge, MA, 1990.

[ruhleder94] Ruhleder, Karen. Rich and lean representations of information for knowledge work: the role of computing packages in the work of classical scholars. ACM Transactions on Information Systems 12, 2 (1994), 208-30.

[suchman91] Suchman, Lucy A. and Randall H. Trigg. Understanding practice: video as a medium for reflection and design. In [greenbaum91], 65-90.

[sweeney93] Sweeney, M., M. Maguire, and B. Shackel. Evaluating user-computer interaction: a framework. International Journal of Man-Machine Studies 38 (1993), 689-711.

[vanhouse93] Van House, Nancy, and Thomas Childers. The Public Library Effectiveness Study. American Library Association, Chicago, 1993.

[zammuto84] Zammuto, Raymond F. A comparison of multiple constituency models of organizational effectiveness. Academy of Management Review 9 (1984), 606-616.