Early Prototypes of the Repository for Patterned Injury Data

Prasun Dewan, Kevin Jeffay, John Smith, David Stotts
Computer Science Department
University of North Carolina
Chapel Hill, NC 27599-3175
colab-fac@cs.unc.edu

William Oliver
Armed Forces Institute of Pathology
Walter Reed Army Hospital
Washington DC
oliver@cs.unc.edu

ABSTRACT

We have constructed a proof-of-principle system for supporting collaborative forensic medicine. The early prototype is built on ABC/DGS, a graph-server and collaborative hypermedia system built in the UNC Collaboratory. A second prototype is underway that has more flexible control of multi-person creation of, and access to, the shared patient data and pathology artifacts. Created with Dewan's Suite, this version maintains consistent yet different independent views of the underlying data, and moderates access through these views. We conclude by describing a planned third prototype, to be built not on ABC, but on a modification of the WWW httpd distributed data server.
KEYWORDS: forensic pathology, ABC, DGS, Suite, view coupling, access control, group access, World Wide Web, httpd, distributed data server

OVERVIEW

Forensic specialists have long understood the importance of toolmarks and trace evidence in the investigation of violent crimes and in successfully prosecuting those who commit these acts. However, except in specific areas such as forensic odontology and tire and shoeprint impression analysis, there is little formal work in the area of patterned injury analysis as a problem in forensic pathology. A better ability to use patterned injuries to determine what object or objects were used to commit violence, matching a possible weapon or object to a wound with some degree of certainty, or of matching a mark on a body with an object at a scene would be a tremendous boon.

We believe there are three primary reasons why large-scale formal work has not been done. The first is that developing expertise in the area is profoundly experiential. The universe of possible objects and the variety of wounds that any particular object can cause is large, and experience is gained slowly. Second, there is no central collection of patterned injuries. A centralized collection of solved cases provides a uniform teaching base, a standard by which other cases and other approaches to cases can be evaluated, and a resource for approaching unsolved cases. Finally, it is difficult for pathologists to consult on a large-scale basis about patterned injuries. The usefulness of multi-person interactive consultation about images has been demonstrated in radiology and other medical disciplines, and is a driving force behind the imaging workstation systems being implemented by the FBI.

The Repository for Patterned Injury Data (RPID) project is addressing each of these problems. First, we are building a digital library for forensic medicine, an electronic information repository to support collaboration on medical cases involving patterned injuries. The library is to contain traditional data forms (text, images) as well as newer multi-media data forms (audio, video, handwriting). Second, we are assembling a computer and communications infrastructure to provide access to the library for real-time interactive consultation and database exploration. Third, we will carry out a continuous series of evaluative and experimental studies to validate the designs and guide further system development.

The design of the RPID was reported in the first Digital Libraries conference [dl94]. For clarity here, we briefly summarize the structure and goals of the project. Following this recap, we illustrate the current prototypes and discuss the interactions we have established for consulting pathologists. We conclude with our plans for subsequent versions of the RPID.

RECAP OF PROJECT GOALS AND DESIGN

Prevention of violent crime continues to be an important national priority, and it is increasingly important in the daily lives of our citizenry. Those actively involved in investigating these crimes and in apprehending the people who commit them must be able to pool their knowledge and expertise to provide maximum effectiveness. The Repository for Patterned Injury Data (RPID) we are creating, and our research with it, will enable more effective interaction of forensic pathologists. Specifically, it will enable forensic pathologists to consult a large testbed of patterned injury data, share new case data, apply enhancement and analysis algorithms to images, and consult remotely with one another.

The analysis of patterned injury in forensic pathology presents a challenge that draws uniquely upon both the medical and forensic expertise of the investigator. As a problem in wounding, it is a medical challenge. As a problem in image enhancement and pattern analysis, it is a forensic challenge. This skill benefits heavily from experience. The investigator must have both experience in wound analysis and a knowledge of the universe of discourse, wherein lies the object which caused the injury.

For instance, discerning the general properties of an object (that it is rounded, sharp-edged, etc.) is well within the general knowledge base of most forensic pathologists. Recognizing that a specific injury is likely to have been caused by, say, a socket wrench requires the examiner not only to know about pathology but also to have some knowledge of automotive tools. This need for an encyclopedic recall of objects and object properties makes much patterned injury analysis difficult, especially when injuries occur in areas which abound with specialized tools, implements, and objects (e.g., construction sites, factories, hair-dressing salons, etc.). Moreover, recognizing the types of marks made by partial or oblique blows is a further challenge in analyzing the geometry of impressions.

No forensic pathologist or odontologist can be an expert in all areas of hardware manufacture and utilization. Instead, we must rely on the experience of our colleagues. Unfortunately, when an expert in the field retires, a wealth of specialized experience is lost, usually along with a career's worth of valuable patterned injury data and analysis results. The RPID project is an attempt to save such knowledge and to make such experience permanently available to pathologists and investigators across the country. We are doing this by establishing an electronic registry of solved patterned injuries, and by developing a computing and communications infrastructure wherein pathologists and investigators can electronically access the data, search for cases germane to their current problems, and consult with one another.

Key capabilities of the RPID include:

Key benefits of RPID to law enforcement and forensic personnel include:

FIRST RPID PROTOTYPE

Figure 1: Screen from RPID prototype; shown are image list for a case, autopsy report with links to images, and links within an image to other information.

We have assembled an initial collection of forensic medicine cases for early experiments with RPID. Data for each case includes photos of the scene surrounding the events, photos of the person involved, text reports from the attending physicians, text reports from law enforcement personnel, traditional database information entered by the pathologists involved (victim age, sex, name, event classification, etc.), and computer-generated graphics resulting from analysis of case data.

The data for the repository is coming from the extensive archives of the Armed Forces Institute of Pathology (AFIP); from the Office of the Armed Forces Medical Examiner's (OAFME) Lindenberg collection of microscope slides and pictorial data; and from the Milton Helpern Forensic Pathology Museum. Other specialized collections, such as images of tire tread marks, muzzle imprints, and others, are maintained by individual pathologists and researchers; we intend to bring these data into the repository, once it has been established. Investigators who have expressed interest in submitting their data into the collection include Dr. Marcella Fierro, who has an extensive collection of muzzle imprints and soot deposition patterns, Dr. Richard Froede, former Armed Forces Medical Examiner, and Dr. Homer Campbell of the New Mexico Office of the Medical Investigator. The North Carolina Chief Medical Examiner's Office is also contributing 15 years of state autopsy records.

We have encoded this information in the ABC collaboration system [matrixCSCW92] and have adapted several of the ABC browsers to allow rendering and annotation of the images. Figure 1 shows an ABC screen of one case from RPID. In this view, autopsy photos are collected and links among them have been created to compose related views. We have also linked in the text report from the autopsy, with the link anchor connecting specific regions in the photos with specific paragraphs in the report. In the photo, link anchors show up as rectangles in contrasting colors.

In this prototype, we depend on non-integrated collaboration support systems for information sharing and group interactions. In the Colab we support the XTV shared X window system [Abd88a,Abe90a]. With XTV, the collaborators can view and analyze common images, with all collaborators seeing the same output from a single tool process. When reports are viewed, all see the same text. When a link in an image is selected, the destination image (or report) is displayed on all workstations. The common viewspace is achieved by multiplexing the X server protocol streams from the individual workstations participating in the shared X sessions.

INCREASED FLEXIBILITY IN SHARING DATA SPACES

An important abstraction in a digital library is the notion of a shared workspace. A shared workspace is a window on the screen that gets updated in response to actions of other users. Several projects, both within UNC and elsewhere, are actively researching issues in the design and implementation of shared workspaces. In this section, we show how this concept is being applied in RPID.

In the initial RPID prototype, sharing is done in XTV by multiplexing X protocol streams, giving multiple users identical screen views. We are developing a second prototype that encodes more specific, and more flexible, sharing semantics based on abstract data objects defined in the implementation code. The basic image database remains the same; for interaction control we are using a combination of the Suite multi-user interface building system [dewan-framework-transactions] and Trellis process models [cscw94].

In general, the RPID library helps pathologists resolve cases by producing autopsy reports on them. These reports are represented internally as linked data, amenable to printing or to browsing as hypermedia documents. Using Suite, we create roles for the people involved in creating and using data in each case, and we assign access privileges based on roles. Suite manages the administration of these privileges as well as basic concurrency control on the database.

Roles and Rights: An Example in Suite

The Suite-based implementation provides each user with a window on the data contained in each case. The overall structure and substructure of the data (object/field names and types) is shown, and according to the privileges of the specific user, values for the data are shown with appropriate levels of detail.

The designer of each case must specify access rights on a substructure basis, determined according to the roles played by the users. Examples of roles in RPID include pathologist, examiner, data entry, manager, judge, etc. The existence of different substructures can be revealed or hidden on a role-by-role basis; for revealed substructure, the values can be hidden, fully displayed, or partially displayed with various levels of detail.
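To make the role-by-role display rules concrete, the following Python sketch shows one way such a policy could be tabulated and applied. It is an illustration only, not the Suite implementation: the POLICY table, the Visibility levels, the "<protected>" marker, and the field values are all assumptions introduced here.

```python
from enum import Enum


class Visibility(Enum):
    HIDDEN = "hidden"      # the field's existence is not revealed
    OPAQUE = "opaque"      # field name shown, value replaced by a marker
    SUMMARY = "summary"    # value shown at a reduced level of detail
    FULL = "full"          # value shown in full


# Hypothetical policy table: (field name, role) -> visibility.
# Any pair not listed is treated as HIDDEN.
POLICY = {
    ("Subject", "examiner"): Visibility.FULL,
    ("Subject", "consultant"): Visibility.OPAQUE,
    ("Wounds", "examiner"): Visibility.FULL,
    ("Wounds", "consultant"): Visibility.SUMMARY,
    ("Toxicology", "examiner"): Visibility.FULL,
}


def render_case(case, role, summary_len=20):
    """Return the fields of `case` as a user playing `role` would see them."""
    shown = {}
    for name, value in case.items():
        vis = POLICY.get((name, role), Visibility.HIDDEN)
        if vis is Visibility.HIDDEN:
            continue  # substructure is not revealed at all
        if vis is Visibility.OPAQUE:
            shown[name] = "<protected>"
        elif vis is Visibility.SUMMARY:
            text = str(value)
            shown[name] = text if len(text) <= summary_len else text[:summary_len] + "..."
        else:
            shown[name] = value
    return shown
```

Under these assumed rules, a consultant would see the Subject field as an opaque marker, a truncated Wounds description, and no Toxicology field at all, while the examiner would see every value in full.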

Each case is assigned to an examiner, who is responsible for its final contents. The NewCase object in the windows of Figure 2 contains an outline of the general structure of such a report. The report identifies the subject, examiner, assistant, and toxicologist on the case, describes the wounds and evidence found, and contains the results of the toxicology tests.

Figure 2: Examiner and consultant share autopsy report and list of previous cases; they have different access rights to and views of the information.

In our hypothesized scenarios, several people, playing different roles, collaborate with the examiner in the production of the report. A secretary helps enter most of the information in the report. A toxicologist produces the results of the toxicology tests. The assistant signs the report once it is completed. Finally, one or more consultants help the examiner reach a verdict on the manner of death by entering consultation comments.

Shared workspaces provide high-level support for this collaboration among distributed users. In general, each collaborator manipulates a set of fields shown in his workspace. The workspaces of the different users are coupled by a set of coupling constraints defined by the library. These constraints ensure that users automatically receive information in which they are interested in a timely manner. Most of the previous work in shared workspaces constrains all coupled workspaces to be identical, thereby supporting WYSIWIS (What You See Is What I See) collaboration. The coupling constraints defined for this application, however, are more complex and depend on the roles of the users and the fields they are editing. For instance, the Manner field of the forensics report entered by the examiner is sent to others when he executes the Transmit command, while every character typed by a consultant in the Consultation field is immediately sent to all other consultants on the case (Figure 2).
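The difference between the two coupling behaviors in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Suite's coupling mechanism: the CoupledField class, the mode names "immediate" and "on_transmit", and the use of plain dictionaries to stand in for remote workspaces are all hypothetical.

```python
class CoupledField:
    """A field whose edits reach peer workspaces according to a coupling mode."""

    def __init__(self, name, mode):
        if mode not in ("immediate", "on_transmit"):
            raise ValueError(f"unknown coupling mode: {mode}")
        self.name = name
        self.mode = mode
        self.local_value = ""   # what the editing user currently sees
        self.peers = []         # dictionaries standing in for remote workspaces

    def attach(self, workspace):
        """Register a peer workspace that should receive this field."""
        workspace[self.name] = ""
        self.peers.append(workspace)

    def type_text(self, text):
        """Apply a local edit; push it at once only under immediate coupling."""
        self.local_value += text
        if self.mode == "immediate":
            self._push()

    def transmit(self):
        """Explicitly send the current value (the Transmit command)."""
        self._push()

    def _push(self):
        for workspace in self.peers:
            workspace[self.name] = self.local_value
```

In this sketch, a Manner field created with "on_transmit" leaves peer workspaces unchanged until transmit() is called, whereas a Consultation field created with "immediate" updates them on every call to type_text().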

In addition to the values of the objects in the workspace, the application can also couple the views of these objects. A view of an object selects the set of fields of the object that are displayed to the user. For instance, the view of the object, PreviousCases, determines which of the previous cases are shown to the user. The digital library currently provides users with several commands to change the view of this object. For instance, it allows them to select all previous cases, none of the previous cases, all cases that involve wounds to a particular part of the body, and all cases in which the wound length is greater than, equal to, or less than a specific length. The application allows consultants working on the case to browse through previous case information separately (Figure 2) or together (Figure 3).
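One way to think about a view of PreviousCases is as a predicate that selects which cases are displayed. The Python sketch below illustrates the view-changing commands listed above under that assumption; the field names body_part and wound_length, the helper names, and the sample values are hypothetical and do not come from the RPID implementation.

```python
import operator

# Comparison operators a wound-length view may use.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def all_cases():
    """View that shows every previous case."""
    return lambda case: True


def no_cases():
    """View that shows none of the previous cases."""
    return lambda case: False


def by_body_part(part):
    """View that shows cases with a wound to the given part of the body."""
    return lambda case: case["body_part"] == part


def by_wound_length(op, length):
    """View that shows cases whose wound length compares as `op` to `length`."""
    compare = COMPARATORS[op]
    return lambda case: compare(case["wound_length"], length)


def apply_view(cases, view):
    """Return the cases selected by `view`, in their original order."""
    return [case for case in cases if view(case)]
```

Coupling two users' views then amounts to having both workspaces apply the same predicate, for example by_wound_length("<=", 20) as in Figure 3, while uncoupled users each hold their own.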

Figure 3: Examiner and consultant share view showing all previous cases with wounds of length <= 20.

As potentially sensitive information becomes available to multiple users, it is important to ensure that it be protected from unauthorized access. Several access control schemes have been devised for objects managed by file systems and database systems. Some of these concepts apply directly to our library. For instance, the list of previous cases is associated with a Write right, which ensures that a previous case cannot be updated. Several additional access concepts, not found in traditional systems, are supported in this application to support the desired protection. Rights are associated with fine-grained objects, and these rights override those defined in the parent objects. For instance, the Assistant field is associated with its own Write right, which is used to ensure that only an Assistant can sign his name. The Read right must be interpreted differently, since users do not explicitly request reads. A read-protected object is made opaque in the display by replacing the display of its contents with a special string indicating that the item is readonly (Subject and Pathologists fields of Figure 2). In addition to the traditional Read and Write rights, a large number of other rights are supported. For instance, an Update right ensures that only the examiner can commit the current case to the database and the ViewCoupled right ensures that only a consultant can couple his view with the views of other consultants. In general, every operation that can be shared with another user is associated with its own right. We have a large number of such operations (over 50), and the rights to these operations are arranged in right groups such as DataRights and CouplingRights. A right group allows access to a whole group of operations to be given by one specification. Rights are given not to individual users but to roles such as Examiner, Assistant, Secretary, and Consultant. Moreover, these roles are arranged in a DAG, which allows a role to inherit rights from its parent roles. For instance, an Examiner is a Consultant, and thus inherits the rights of a Consultant, which it can override.
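The combination of right groups and role inheritance can be illustrated with a small Python sketch. It is not the Suite access-control model: the membership of each right group, the ValueCoupled right, and the rule that an unset right is granted if any parent role grants it are assumptions made here for illustration.

```python
# Hypothetical membership of the two right groups named in the text.
RIGHT_GROUPS = {
    "DataRights": {"Read", "Write", "Update"},
    "CouplingRights": {"ViewCoupled", "ValueCoupled"},  # ValueCoupled is invented
}


class Role:
    """A node in the role DAG that may inherit rights from parent roles."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.grants = {}  # right name -> True (granted) or False (denied)

    def set(self, right_or_group, allowed):
        """Grant or deny one right, or every right in a named right group."""
        for right in RIGHT_GROUPS.get(right_or_group, {right_or_group}):
            self.grants[right] = allowed

    def allows(self, right):
        """Resolve a right: the role's own setting overrides inherited ones."""
        if right in self.grants:
            return self.grants[right]
        # Assumed conflict rule: inherit the right if any parent grants it.
        return any(parent.allows(right) for parent in self.parents)
```

For example, if Consultant is granted CouplingRights and denied Update, and Examiner is declared with Consultant as a parent and granted Update, then the Examiner inherits ViewCoupled from Consultant while overriding the inherited denial of Update.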

In general, it may not be possible or even desirable for all collaborators to always work in the synchronous mode. For instance, two consultants separated by a slow or unavailable line may want to enter their comments independently (Figure 4). The library allows two separate versions of the workspace created by different users to be automatically merged, as shown in Figure 5.

Figure 4: Consultation comments are entered asynchronously.

Figure 5: Consultation comments are merged.
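The merging of independently entered consultation comments can be pictured with a deliberately simple stand-in: a three-way merge of append-only comment lists. This Python sketch is not the object merging framework of [munson-dewan]; the function name, the representation of a comment as an (author, text) pair, and the policy of appending one user's additions before the other's are assumptions.

```python
def merge_comments(base, mine, theirs):
    """Three-way merge of append-only comment lists.

    Keeps the common base, then appends the comments added in `mine`,
    then those added in `theirs`, skipping anything already present.
    """
    merged = list(base)
    seen = set(base)
    for version in (mine, theirs):
        for comment in version:
            if comment not in seen:
                merged.append(comment)
                seen.add(comment)
    return merged
```

Because each consultant only adds comments in this simplified setting, no conflict resolution is needed; handling concurrent edits to the same comment would require the richer merge policies described in the cited work.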

All of the capabilities mentioned here have been implemented using the Suite system [dewan-framework-transactions]. Individual papers on coupling [dewan-coupling-proceedings], access control [dewan-shen-access], and merging [munson-dewan] describe these capabilities in more detail. Here, we have shown the application of these capabilities in our library project.

Access rights, precedences, work flow, and roles are all described for now in C structures written within Suite. We are experimenting with having Suite communicate with the Trellis engine to get process information related to collaboration groups and group behavior. Once this is working we will have the ability to alter dynamically the interaction rules for collaborations without recompiling the system.

Currently, the shared workspace component of our work is separate from other components such as the hyper-linking and shared-window components. Our proposed work on the collaboration bus will integrate it with these other components.

PLANS FOR SUBSEQUENT PROTOTYPES

Our initial prototype has been built on the Distributed Graph Server (DGS) in UNC's Colab, using the ABC collaborative hypermedia system for organizing the data and accessing the DGS. For subsequent prototypes we plan to work with a scaled-down system designed around an augmentation of the WWW distributed data server httpd. We expect that this design approach will give great leverage, since httpd is well-tested, in widespread use, and gives off-the-shelf solutions to several problems such as user authentication, data encryption, and compatibility with other systems.

Use of the WWW server for distributed data exchange

The ABC system supporting the RPID is built on a distributed data server called DGS [dgsHT93], a research project in the UNC Colab. While a full implementation of the database within the ABC/DGS structure has many advantages, the unstructured flat graph paradigm of the World Wide Web has grown rapidly in the past two years to become the de facto standard of browsable and sharable information systems. The implementation of our system within the WWW would allow greater use of existing implementations and the concomitant savings of time and resources. The wide availability of WWW implementations across systems would allow dissemination of the product to users with a wider range of infrastructures. There are several major server modifications to make, as well as several new applications required to interact with the server to provide the collaboration functions characteristic of the RPID.

Using httpd we can get a full-scale RPID operating sooner, with more reliability and less expense. However, we will have some changes to make in the server, resulting in a specialized version. For example, we will need to add the ability for the server to manage hyperlinks that have multiple sources and multiple destinations; this structure has proven useful in other projects [cscw94] for representing collaboration protocols in hypermedia. The current httpd server does not support such links. Our modification will be upwardly compatible with existing httpd versions and with existing WWW data.
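A hyperlink with multiple sources and multiple destinations can be represented as a pair of anchor sets. The Python sketch below illustrates that data structure only; it says nothing about how httpd would be modified, and the Anchor, MultiLink, and LinkTable names, the link_type field, and the document paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    document: str     # hypothetical document identifier, e.g. a URL path
    region: str = ""  # optional region within the document


@dataclass(frozen=True)
class MultiLink:
    sources: frozenset       # set of Anchor
    destinations: frozenset  # set of Anchor
    link_type: str = "related"


class LinkTable:
    """Stores links that may have several sources and several destinations."""

    def __init__(self):
        self._links = []

    def add(self, sources, destinations, link_type="related"):
        link = MultiLink(frozenset(sources), frozenset(destinations), link_type)
        self._links.append(link)
        return link

    def follow(self, anchor):
        """Return every destination reachable from `anchor` in one step."""
        reached = set()
        for link in self._links:
            if anchor in link.sources:
                reached |= link.destinations
        return reached
```

An ordinary WWW link is the special case with exactly one source and one destination, which is one way to see how such an extension could remain compatible with existing data.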

Another alteration we must design and implement is a pseudoserver to sit between the interface clients and the server. The pseudoserver will capture traffic between clients and the server to broadcast actions to groups of collaborating users. When one user is browsing data, the collaborators will see the same image. The pseudoserver will also be a common collection point for capture of user/system interactions (called protocols in the evaluation section below). We must collect these activity traces to perform valid assessments of system performance and utility. The pseudoserver technique has been successfully used in other collaboration systems architectures [dgsHT93,menges93,abdelwahab91].
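The role of the pseudoserver can be sketched as an in-process object that forwards each request, records it, and relays the result to the other members of the requesting user's group. This Python sketch is an illustration under stated assumptions rather than a network implementation: the PseudoServer class, the backend callable standing in for httpd, the inbox lists standing in for client connections, and the trace format are all hypothetical.

```python
import time


class PseudoServer:
    """Sits between clients and a data server; broadcasts and logs requests."""

    def __init__(self, backend, clock=time.time):
        self.backend = backend  # callable: path -> content (stands in for httpd)
        self.clock = clock      # injectable clock, to keep traces testable
        self.groups = {}        # group name -> {user: inbox list}
        self.trace = []         # (timestamp, user, path) activity records

    def join(self, group, user):
        """Add `user` to `group` and return the inbox that receives broadcasts."""
        inbox = []
        self.groups.setdefault(group, {})[user] = inbox
        return inbox

    def request(self, group, user, path):
        """Fetch `path` for `user`, log the action, and relay it to the group."""
        content = self.backend(path)
        self.trace.append((self.clock(), user, path))
        for member, inbox in self.groups.get(group, {}).items():
            if member != user:
                inbox.append((path, content))
        return content
```

Keeping the activity trace in the same component that relays traffic reflects the point made above: a single interception point serves both the broadcast function and the collection of user/system interaction records.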

Custom-built application software

Though we plan to use a modified httpd distributed data server, current WWW interface clients do not provide all of the functionality we need for the data sharing and collaborative manipulation aspects of the RPID. Thus, we must construct several custom-designed application systems to interact with the data server.

To support browsing, objects will be linked with one another along a variety of dimensions. For example, all of the data associated with a given case could be linked into a tree that included as one branch the images, as another branch the various reports associated with the case, and, as a third branch, video clips of the crime scene. However, images in one case could also be linked to similar or related images in other cases according to specific features, such as length of wound, depth, or shape. Thus, users will be able to browse within the material from a given case but also across the primary structure of the library to data associated with other cases. The system will record a trace of users' browsing paths to facilitate subsequent analysis, as a learning aid, and as a means of capturing one form of expert knowledge. We will have to construct Web clients that give better support for hierarchy and link typing than the current flat HTML standard allows.

To support search, data will be characterized in terms of specific features and parameters on those features when they are entered in the repository. Features will be used to generate links in the hypermedia graph structure. They will also be stored in a conventional information retrieval system. Consequently, users will be able to submit queries to the retrieval system and obtain a set of objects, which they may then view or otherwise access through the hypermedia system. Once a user has arrived at a given object, he or she will be able to branch out using hypermedia browsing facilities to other objects linked to it. Thus, the system will combine capabilities of both hypermedia browsing and information retrieval.
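The combination of feature-based retrieval and link-based browsing can be sketched with a small in-memory structure. This Python example is an illustration only: the Repository class, the exact-match index, the rule that objects sharing a feature value are linked, and the object identifiers and feature names are assumptions, not the planned RPID design.

```python
from collections import defaultdict


class Repository:
    """Toy store combining a feature index with feature-derived links."""

    def __init__(self):
        self.features = {}             # object id -> feature dict
        self.index = defaultdict(set)  # (feature, value) -> object ids
        self.links = defaultdict(set)  # object id -> linked object ids

    def add(self, obj_id, **features):
        """Store an object, index its features, and link it to similar objects."""
        self.features[obj_id] = features
        for key, value in features.items():
            # Assumed rule: objects sharing a feature value are linked.
            for other in self.index[(key, value)]:
                self.links[obj_id].add(other)
                self.links[other].add(obj_id)
            self.index[(key, value)].add(obj_id)

    def search(self, **query):
        """Retrieval step: objects matching every feature in the query."""
        matches = [self.index.get(item, set()) for item in query.items()]
        return set.intersection(*matches) if matches else set()

    def neighbors(self, obj_id):
        """Browsing step: objects linked to the one the user has reached."""
        return set(self.links.get(obj_id, set()))
```

A user might first call search(shape="crescent") to obtain a starting set of images, then call neighbors() on one result to move across case boundaries to images that share some other feature with it.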

The system will be extensible so that additional applications can be included in the environment. This will enable users to invoke specialized tools, such as image enhancement programs. Once such applications have been run, their output can also be stored in the repository for future use. Thus, the system will provide a flexible, easy-to-use environment for exploratory data analysis. The system must also include provision for audio and video conversations and discussions. Initially, teleconferencing will be supplied by enhanced telephone lines and specialized hardware/software configurations incorporated into users' workstations. As digital technologies mature, we hope to include teleconferencing through the computer network.

Since consultation is so important and since it is so hard to get people together at the same time, particularly if their skills are in high demand, the system will also include facilities to enable a user to organize bodies of material, including his or her recorded statements, that can later be viewed and responded to by another, consulting user. Thus, collaborators will be able to carry on extended, asynchronous "conversations" with regard to specific data without both having to be available at the same time. A similar facility for collecting, recording, and playback will also be useful to forensic pathologists when presenting evidence in the courtroom.

Conferencing capabilities

Currently, WWW browsers do not support computer conferencing that would enable two or more users to consult the same data at the same time. Work is being done on this problem at several locations and we expect a generalized solution during the next year or two. In the meantime, we will build an interim solution that will enable multiple users to consult the same data but not edit it. Thus, they would be able during a teleconference to refer to the same information, but only one user would be able to modify that data.

Our approach for data sharing will be the pseudoserver described above, in conjunction with the WWW httpd server. For user-to-user conferencing, we will take a modular approach that will upgrade with advances in technology. Rather than construct custom audio and teleconferencing facilities, we will integrate existing solutions with the RPID architecture. A natural progression through several technology stages would be:

The first level of this progression is immediate. We expect to achieve the 4th level of technology in the final RPID prototype. The feasibility of the last two levels will be assessed, but we expect them to be beyond the budget of the current project.

User Authentication and Data Encryption

Security is a concern for a distributed system containing sensitive and confidential patient medical information. Several companies are producing secure versions of the httpd server, and we anticipate basing the RPID on such a secure server. In a secure server, transactions are encrypted in both directions (into and out of the server), and will allow safe transmission of data; if intercepted, the encrypted data will be unintelligible and so useless to any illicit recipient. We will have to acquire the source for a secure server so we can make our collaboration-supporting modifications and user evaluation modifications.

Related Work

While there are several projects reported in the literature that provide hypermedia data in a medical context [FrisseHTCACM88,BurgerHT91,Fowler94], the RPID project has some unique aspects.

The previous efforts have not been in the context of large or widely distributed data sets. They have allowed physicians to simulate in hypermedia their research notebooks, for example; or they have provided centralized facilities for small library subsets.

A unique emphasis of the RPID is enabling of collaboration among users of the library. Another unique aspect of RPID is the research focus on building an infrastructure that will allow the easy integration of new user tools and source data as the library grows. Thus, we are designing for, and investigating the practicality of, continual expansion of the data and increasing distribution among the library sites.

There are also numerous projects that provide image storage and manipulation facilities to medical personnel using images like radiographs and micrographs. These systems tend to focus heavily on the graphics capabilities, and are little more than traditional databases otherwise. A stunning, though currently non-digital, database is the "Visual Diagnosis of Child Abuse" slide set being collected by the National Resource Center on Child Abuse and Neglect. Marcella Fierro has collected an excellent database of muzzle imprints and related patterned injuries. Dr. Fierro's database is unfortunately not digitized; we hope to add it to our database system once a useful system of exploration and consultation is developed. Many departments of Pathology have established World Wide Web (WWW) sites, such as the University of Texas at Houston, University of Washington, University of Alberta, UVA, University of Utah, Fujita Health University, and the Victorian Institute of Forensic Pathology in Australia. These sites have (non-forensic) image galleries available or under construction, but do not represent interactive database efforts. A good listing of currently available links can be found at the Univ. of Alberta site http://synapse.uah.ualberta.ca/synapse/000p0025.htm. The NIJ is also funding a number of literature databases, but these do not involve image analysis directly.

A number of new and important databases and manipulation systems have been developed in non-medical forensics, including the AFIS fingerprint identification system, the Drugfire shell casing analysis database, the FBI shoe impression collection, and others. Again, however, there are important differences between these systems and the RPID with respect to interactivity, consultation, and image search methods.

REFERENCES

[Abe90a] M. Abel. Experiences in exploratory distributed organization. In Intellectual Teamwork: Social and Technological Foundations of Cooperative Work, pages 489-510. Lawrence Erlbaum, 1990.

[abdelwahab91] H. Abdel-Wahab and M. A. Feit. XTV: A framework for sharing X Window clients in remote synchronous collaboration. In Proc. IEEE Conf. on Communications Software: Communications for Distributed Applications and Systems, pages 159-167, April 1991.

[Abd88a] H. Abdel-Wahab, S. U. Guan, and J. Nievergelt. Shared workspaces for group collaboration: An experiment using internet and unix interprocess communications. IEEE Communications Magazine, 26(11):10-16, November 1988.

[BurgerHT91] A. M. Burger, B. D. Meyer, C. P. Jung, and K. B. Long. The virtual notebook system. In Proceedings of ACM Hypertext '91, pages 395-401. ACM, December 1991.

[dewan-coupling-proceedings] Prasun Dewan and Rajiv Choudhary. Flexible user interface coupling in collaborative systems. Proceedings of the ACM CHI'91 Conference, pages 41-49, April 1991.

[dewan-framework-transactions] Prasun Dewan and Rajiv Choudhary. A high-level and flexible framework for implementing multi-user user interfaces. ACM Transactions on Information Systems, 10(4):345-380, October 1992.

[Fowler94] J. Fowler, D. G. Baker, R. Dargahi, V. Kouramajian, H. Gilson, K. B. Long, C. Petermann, and G. A. Gorry. Experience with the virtual notebook system: Abstraction in hypertext. In Proc. of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW '94), pages 133-143, October 1994.

[FrisseHTCACM88] Mark E. Frisse. Searching for information in a hypertext medical handbook. Communications of the ACM, 31(7):880-886, July 1988.

[cscw94] R. Furuta and P. D. Stotts. Interpreted collaboration protocols and their use in groupware prototyping. In Proc. of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW '94), pages 121-131. ACM, New York, October 1994.

[matrixCSCW92] K. Jeffay, J. K. Lin, J. Menges, F. D. Smith, and J. B. Smith. Architecture of the artifact-based collaboration system matrix. In Proceedings of CSCW '92 (Toronto), pages 195-202. ACM Press, 1992.

[munson-dewan] Jon Munson and Prasun Dewan. A flexible object merging framework. In Proc. of the ACM Conf. on Computer Supported Cooperative Work (CSCW '94), pages 231-242, October 1994.

[menges93] J. Menges. The X engine library: A C++ library for constructing X pseudoservers. In Proc. of the 7th Annual X Technical Conference, pages 129-141, 1993.

[dewan-shen-access] Honghai Shen and Prasun Dewan. Access control for collaborative environments. In Proc. of the ACM Conference on Computer Supported Cooperative Work, pages 51-58, November 1992.

[dl94] D. Stotts, J. Smith, P. Dewan, K. Jeffay, F. D. Smith, D. K. Smith, S. Weiss, J. Coggins, and Wm. Oliver. A patterned injury digital library for collaborative forensic medicine. In Proc. of Digital Libraries '94, pages 25-33, June 1994.

[dgsHT93] D. E. Shackelford, J. B. Smith, and F. D. Smith. The architecture and implementation of a distributed hypermedia storage system. In Proceedings of ACM Hypertext '93, pages 1-13. ACM, November 1993.