[May 2006]

Dear friends:

 

The recent announcement of LC's decision concerning series authority records has given rise to considerable discussion about a number of issues which have concerned me for many years. The comments which I wish to make are, like many recent posts, intended to relate this single issue (series authorities decisions at one library) to more fundamental issues that involve us all. These are: 1) nature of cooperation and resource sharing; 2) successful bibliographic (re)search; 3) purpose and goals of cataloging; 4) direction and manner of change.

 

1) Cooperation. It is a fundamental principle of cooperation that all parties involved do their share of the work. A cooperative agreement in which one partner (e.g. LC) or a few are expected to do all or most of the work succeeds only so long as those partners continue doing the work in a satisfactory manner. The decision to cease providing series records is simply the latest evidence that the success of our cooperative agreements depends upon the reliability of the cooperating institutions. The decisions of OCLC to allow vendor records and of Cornell to produce below-minimal-level records should have sparked even stronger opposition. Since most libraries have counted on the contributions of LC and all others, staff have been reduced to a minimum (or lower) and now there is simply no possibility of picking up the extra workload. Predictions concerning shared cataloging led many libraries for many years to eliminate or not fill cataloger positions in the expectation that their cataloging would be done by someone at another institution. I think that everyone knows at this point that some (how many?) PCC libraries will follow LC regardless of the official PCC stance because they do not have adequate staff to take up what LC formerly did. LC knows this too.

 

In order for cooperation to work for all involved, the nature, purpose and goals of the work must be clearly understood and affirmed by all participants. These have been clearly stated in a number of public policy documents and it is toward the satisfaction of these publicly stated professional goals that most catalogers orient their work. Yet in a shared database such as OCLC the enormous diversity of participating institutions creates problems for determining mutually acceptable goals. Vendors need order records, not research tools. Public libraries cater to a user community which differs markedly from the academic users of a research library. Rare books require a significantly greater attention to description than current academic monographs, for which subject analysis is more important. Some members of the profession have an overriding concern with user perceptions of the ease of searching (everything through a single box Google style) and are satisfied with "getting information"; the model of the user for these librarians is the high school or undergraduate student who is required to "include a bibliography of at least 5 sources excluding encyclopedias". Others more concerned with the thorough and intelligent searching which serious scholarship requires argue for detailed subject indexing of all materials which alone will support cross-language subject searching. In a database created and maintained by participants with so many conflicting understandings of the goals of the information created and entered into the database, these local and conflicting goals must be pursued locally, which of course requires local staff.

 

In a cooperative database serving such disparate institutional needs, the option for local control of local demands depends upon local staffing; where staff are too few in number or lacking the requisite range of skills, the only options are either following whatever everybody else does or the abandonment of any evaluation for locally appropriate and adequate information.

 

2) Research. I use Google. A lot. But Ms. Marcum's insistence on the sufficiency of Google, and on researchers' preference for it over the library catalog, is tellingly illustrated with an account of how an undergraduate now goes about writing a paper, even stressing that the student wants an A. (http://www.loc.gov/library/reports/CatalogingSpeech.pdf) Ms. Calhoun's report to LC was based on a review of a small portion of the last five years' work in the English language. To do research as described by Marcum or published by Calhoun one does not need a library and certainly not a librarian, whether cataloger or administrator. Nor does one need a library catalog. However, some of us younger folk make much more rigorous demands of scholarship, not only of the argumentation but of the literature search as well. If everything were free and available in full text online, then perhaps neither a library nor a librarian would be necessary. But not everything is online, and we must live in the actual present rather than in someone's plans and dreams for the future. In the actual present most of the world's published material is not available online anywhere, and much of what is online is available only to subscribers. In this situation, which is ours, we need to be able to find what is available to us here and now in the library to which we have access. And when we find citations, anywhere and through whatever means, to relevant material which is not available online, we need to be able to find out whether or not it is available in or through our library and, if so, where and how to obtain it. We can, of course, digitize everything we have (necessitating many copyright violations) or at least our bibliographical information and make it available for web search engines to index, thereby allowing and forcing everyone to use Google to find out what is in the library.
Yet if we do the latter (contribute bibliographic information), then someone has to do that in some intelligible and searchable manner. If one wants to search by author, the author's name has to be identified as that of the author rather than being left indistinguishable from the title, and the same goes for series searching and every other kind of specific (intelligent) searching.

 

The trouble with Marcum, Calhoun et al. is that they are arguing for information seeking rather than research, and in this model any information found counts as a successful search. The Google model of information seeking is not a model of intelligent research; it is a model of easy information gathering. For someone who wants "a bibliography", Google works. For someone who wants "the bibliography", Google searching is a wonderful starting point and a great last-minute source for recent materials, but if Google alone is used, it fails miserably. No matter what the subject, one need only compare what a Google search retrieves with what one can find in a good multilingual bibliography on the same topic to realize the extent of Google's failure as a one-stop research method. Rather than taking a survey to determine what "most people" do first or last or find easiest, professional librarians in academic institutions ought to be responsible for developing, implementing and teaching a variety of methods, each of which may be appropriate for several different types of search/research but not necessarily useful or adequate for others.

3) Cataloging. The simplest argument for the continued necessity of cataloging is the demand for cataloging born-digital materials available online. The next simplest argument is that there is no machine of any sort which can take a book, article, painting, piece of music or computer file and inform the searcher of the book's (computer file's, score's, etc.) title, author, publisher, etc. Either these kinds of information about every document must be entered into the machine in a machine-readable fashion by someone who can interpret the document, or every item must be so standardized that the interpretation of the particular significance of every bit of text (sound, color, etc.) follows from its location in the item. The vast number of hits in a Google search is due to the fact that everything has the same significance in a Google search: one cannot search for authors, composers, titles, publishers etc. in any other way than as bits of text of no particular significance. Thus a Google search for Mongol and Java retrieves 326,000 hits, while a search for Mongols and Java retrieves 116,000. In contrast, a search in OCLC provides 10 hits, 9 of which are of very limited interest or useless (e.g. maps of the Mongol empire and general books on the Mongols), while a search in the Regenstein Library catalog retrieves 2 books, the only two non-fictional books ever written on the topic of the Mongol invasion of Java, neither of which is available online and both of which have multilingual bibliographies. Furthermore, only the more recently published book appears in the OCLC search, due to the poor subject analysis provided for the earlier book.
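The difference between matching undifferentiated text and searching information that has been identified as a subject can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the four records, their field names and the two search functions are invented, and none of it reproduces how Google, OCLC or any actual library catalog works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    # Hypothetical, much simplified bibliographic record.
    title: str
    author: str
    subjects: List[str] = field(default_factory=list)


# Invented records, for illustration only.
RECORDS = [
    Record("Mongol Empire Atlas", "Java, Q. (invented)", ["Mongols -- Maps"]),
    Record("The Mongol Expedition to Java", "Invented, A.",
           ["Mongols -- Indonesia -- Java -- History"]),
    Record("Java Programming for Beginners", "Mongol, B. (invented)",
           ["Java (Computer program language)"]),
    Record("Gardening Basics", "Invented, C.", ["Gardening"]),
]


def keyword_search(records, *terms):
    """Treat the whole record as one undifferentiated string of text."""
    hits = []
    for rec in records:
        text = " ".join([rec.title, rec.author, *rec.subjects]).lower()
        if all(term.lower() in text for term in terms):
            hits.append(rec)
    return hits


def subject_search(records, *terms):
    """Require all terms to occur together within a single subject heading."""
    return [
        rec for rec in records
        if any(all(term.lower() in heading.lower() for term in terms)
               for heading in rec.subjects)
    ]


if __name__ == "__main__":
    print([r.title for r in keyword_search(RECORDS, "Mongol", "Java")])
    print([r.title for r in subject_search(RECORDS, "Mongol", "Java")])
```

In this toy data the keyword search also returns the atlas and the programming manual, because an author's surname or a title word is just another bit of text, while the subject search returns only the record whose subject heading was assigned by someone who interpreted the document.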

 

Information technologies are helpless without information, and worthless if misinformation is input. The Indiana University white paper The Future of Cataloging recognizes this while the Calhoun report does not. One of the chief defects of automated indexing and analysis is the high rate of production of misinformation. In this respect, I have found that automatic indexing produces records similar in quality to the below-minimal-level records produced by Cornell: very bad to totally useless. David Banush of Cornell has so lovingly suggested in a posting to the PCCPOL list that catalogers are an aging and conservative group of secure and comfortable bureaucrats "vigorously (sometimes stridently) defending the status quo, or even the status quo ante" whose "prospects for long-term growth in a very dynamic global information economy are dim." I would reply that the prospects for long-term growth of an information economy without high quality information are even dimmer.

 

 

The library and the information economy in which it is embedded are portrayed in Marcum's papers, the Calhoun report, the California report and even the Indiana University white paper as a technical system in which there is no hint of the possibility of error and misinformation, or of how these will affect the efficiency of the technical system itself. There is a desperate need for library administrators to read in depth the ergonomic literature on error and that on failure in organizations. (I recommend Bogner, Human error in medicine; Dörner, The logic of failure; Hoc et al., Expertise and technology; Hollnagel, Cognitive reliability and error analysis method; Kerdellant, Le prix de l’incompétence; Lagadec, La civilisation du risque; Leplat and Terssac (eds.), Les facteurs humains de la fiabilité dans les systèmes complexes; Merry and Smith, Errors, medicine and the law; Morel, Les décisions absurdes; Reason, Human error; Silverman, Critiquing human error; Rasmussen et al., New technology and human error; Vestrucci, Modelli per la valutazione dell’affidabilità umana; Woods et al., Behind human error; Frese and Zapf (eds.), Fehler bei der Arbeit mit dem Computer; and all of the writings of Karl Weick on high reliability organizations. I have dealt with these matters at length in Theory and practice of bibliographic failure, and in a recent paper of mine ("Colorless green ideals in the language of bibliographic description", available in the articles in press section of the online edition of Language & Communication) I looked at the library catalog as a communication system, combining linguistic aspects of information retrieval and human error research.)

 

4) Change. Perhaps the most disastrous and shortsighted aspect of policy decisions such as minimal level records and the abandonment of series authorities is the fact that future technological capabilities will depend, as they do now, on the presence rather than the absence of information in the record. As many commentators have remarked, automatic authority checking, both for correction of mistakes and for collocation in the presence of variation, ONLY works if the information is both transcribed from the piece into the bibliographical record and the correct/authorized/standard form established in an authority file. Without that double aspect of bibliographic description and control, no automatic error correction and no collocation will be possible with any software available now or in the future. On these matters library administrators are almost universally technologically ignorant and have absolutely no idea how information technologies work or what the minimum requirements for their successful implementation are. Almost every change demanded by the Karen Calhouns and Deanna Marcums will ensure failure in the implementation of any future technological developments.
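The double requirement described above (a form transcribed from the piece, plus an authorized form with its variants in an authority file) can be sketched as follows. This is a minimal illustration under stated assumptions: the series heading, its variants, the record numbers and the function names are all hypothetical, and actual authority control systems are far more elaborate.

```python
from collections import defaultdict

# Hypothetical authority file: authorized series heading -> known variant forms.
AUTHORITY_FILE = {
    "Sample studies in bibliography": {
        "Sample stud. bibliogr.",
        "SSB",
    },
}

# Hypothetical bibliographic records: (record number, series as transcribed).
RECORDS = [
    (1, "Sample studies in bibliography"),
    (2, "Sample stud. bibliogr."),
    (3, "SSB"),
    (4, "Sampel studies in bibliography"),  # misprint, transcribed as found
]


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def build_variant_index(authority_file):
    """Map every known form (authorized or variant) to its authorized heading."""
    index = {}
    for authorized, variants in authority_file.items():
        for form in variants | {authorized}:
            index[normalize(form)] = authorized
    return index


def collocate(records, authority_file):
    """Group records under authorized headings; list what cannot be matched."""
    index = build_variant_index(authority_file)
    groups = defaultdict(list)
    unmatched = []
    for record_number, transcribed in records:
        authorized = index.get(normalize(transcribed))
        if authorized is None:
            unmatched.append((record_number, transcribed))  # needs human review
        else:
            groups[authorized].append(record_number)
    return dict(groups), unmatched


if __name__ == "__main__":
    print(collocate(RECORDS, AUTHORITY_FILE))
    print(collocate(RECORDS, {}))  # no authority file: nothing collocates
```

With the authority file present, records 1 to 3 are gathered under one heading and the misprint in record 4 is flagged for review; with an empty authority file the same code can gather nothing and flag nothing in particular, since every record is equally unmatched.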

 

Summary:

-There are vast differences in the expectations of the various users for what the shared utilities should accept and provide (book vendors vs. PCC Full Level standards).

-There are at least 2 completely different and mutually exclusive understandings of what research is and what researchers require (Calhoun vs. T. Mann).

-There are those who believe that information technologies do not require structured information or that software and/or the market will provide whatever is needed efficiently and adequately for research needs, and there are others who believe that information technologies are marvelous tools which require intelligent input and use, and that given the needs of academic research, the necessary production of information for technological manipulation and exploitation can only be successfully accomplished by persons who share the intellectual backgrounds, commitments and research activities of the academic community.

-There are those who believe that current promises of future technological possibilities must be believed and no alternative futures may be considered, and there are those who, cognizant of the preceding three professional debates and dilemmas surrounding cooperation, research practices and data quality, would reject all institutional action and forecasting that has been narrowly determined by technological hopes and expectations.

 

Sincerely

David Bade

Joseph Regenstein Library, University of Chicago