National Diet Library Newsletter
Designing Digital Library Services for 21st Century Research Libraries:
A view from New York
March 12, 2003
The National Diet Library
Tokyo, Japan
William D. Walker
Senior Vice President and
The Andrew W. Mellon Director
The Research Libraries
The New York Public Library
* Numbers in parentheses refer to accompanying slides (in Japanese) (total 79 pieces)
Ten short years ago, American libraries were just beginning to introduce end-user information technology and the Internet was not yet widely available. Today, networked technologies and end-user applications have become ubiquitous in American libraries of all types. American libraries are now busy meeting library users' increased access expectations by providing increasingly sophisticated digital resources to fuel educational activities and scholarly communication. To accomplish this, libraries are developing comprehensive discovery tools (online catalogs, archival finding aids, text retrieval systems), deep archives of digital content, robust image and multimedia archives, as well as dynamic links to global information resources via the World-Wide Web (WWW). All of these digital components fit into the design of a contemporary digital library program. However, to be fully engaged in digital library initiatives, libraries must also be involved in the development of standards; software architecture for storage, retrieval and display; and digital rights management systems. This is a challenging new agenda for traditional, print-based libraries. Our work at the New York Public Library has been evolutionary, not revolutionary. Over the past 10 years, we have introduced new organizational models that have helped us develop a capacity to mount digital library programs. During the next forty minutes, I will describe some of the ways that NYPL has approached the philosophical, technological, and developmental challenges of undertaking a digital library program. In September 2002, on the occasion of the 400th anniversary of Oxford's Bodleian Library, Michael Keller, University Librarian at Stanford University and the Publisher of HighWire Press, delivered a paper on the attributes of a great research library at the beginning of the 21st Century (2). Keller identified five essential characteristics of a great research library. 
Parenthetically, it is reassuring to note that both the National Diet Library and The New York Public Library meet all five criteria. According to Keller, a great research library must:
The administration and staff of the NYPL Research Libraries believe that Mr. Keller is accurate when he includes digital library program activities as a requirement. At NYPL, we believe that the creation and support of a substantial digital program is strategically the single most important activity that we can undertake to prepare for the future. We believe that our next generation of users will expect to connect to our collections and expert staff via digital networks. Already today, over 30% of our users "come" to the Research Libraries via our website portal, rather than coming to the library in person. This statistic has increased dramatically over the past two years with the introduction of E-reference services (3, 4) via the website for all divisions of the Research Libraries. Furthermore, we believe that our users soon will expect that a high percentage of the content that they wish to use will be in digital format (whether digitized in a "just-in-case" or a "just-in-time" scenario). In addition, researchers want access to malleable archives so that digital objects can be used to create new publications, products and tools, both for entrepreneurial and educational purposes. Thus, it seems imperative that our global library community develop digital rights management systems to break down the intellectual property barriers that currently exist between content providers, copyright holders, and user communities. This will be necessary to facilitate access to information in all disciplines: the sciences, humanities, and especially the fine and performing arts. Finally, it is important for standards and architecture issues to be resolved to facilitate long-term preservation of digital information. So, how is The New York Public Library meeting the challenges of these
strategic objectives? From this director's perspective, the digital
agenda has brought renewed energy to our traditional programs
(5). At many levels, staff is now fully engaged in developing
the future vision for the Library's programs and services. But we
are not undertaking
For those in the audience who are not familiar with the New York Public
Library, I would like to take several minutes to provide a brief description
to establish a context for my
From the day it opened over 90 years ago, The New York Public Library has collected from the whole world and for the whole world. Chartered as a private, not-for-profit organization, the Library is governed by an independent board of trustees. The Library is not an agency of city or state government, but a stand-alone private organization that serves the public good. Administratively, the Library is organized into two functional units: The Branch Libraries and The Research Libraries. A President and Chief Executive Officer, Dr. Paul LeClerc, oversees all programs (7). Currently, the Library has an annual operating budget of US$290 million (34,800,000,000 yen), a staff of 3,500 individuals, and an annual traffic of 15 million onsite user visits, with an additional 4.5 million user visits to the virtual library via the Web. The Branch Library System maintains 85 neighborhood libraries (8) in the three New York City boroughs of Manhattan, the Bronx, and Staten Island. The Branches serve local audiences, contain collections of books and multimedia that circulate (can be checked out), and provide community information, educational and entertainment services. For example, the Branch Libraries provide story hours for children (9), "Centers for Reading and Writing" to help adults who have difficulties with these basic skills (10), and an "English as a Second Language Program" to help immigrant populations assimilate into American culture. The Research Libraries, the administrative organization of NYPL over which I have responsibility, are located in four buildings, all in Manhattan (11).
The Humanities and Social Sciences Library is located in the Library's landmark building at Fifth Avenue and 42nd Street; the Schomburg Center for Research in Black Culture is located in Harlem; the New York Public Library for the Performing Arts is located on Lincoln Center's campus; and the newest research center, The Science, Industry and Business Library, is located near the Empire State Building in Manhattan. Over the past ten years, all Research Libraries' facilities have been updated with high-bandwidth telecommunications and data infrastructures to allow access to large digital datafiles, including multimedia. The Library's holdings now total over 43 million items: 15 million book-like items and 28 million non-book items (12, 13). This latter group is made up of maps, prints, audiovisual materials, letters, sheet music, and illuminated manuscripts, for example, a 15th Century copy of Boccaccio's "Des cleres et nobles femmes." The Library's primary source collections are also massive, with over 46 miles of archival collections that document both past and contemporary civilization. As is the case for the collections of other large, non-academic research libraries around the world, materials do not circulate and may be used only onsite. In addition to collecting content produced by other agencies, the Library itself creates documentation in the arts and literature. For example, The Library for the Performing Arts' "Theatre on Film and Tape" (TOFT) program is the only authorized library agency in America to film live theatre performances (14). The TOFT staff film every Broadway show, smaller theatre productions in New York, and regional theatre across the country. NYPL's general and special collections provide important content for the digital library program, and one important role that NYPL is playing within the US digital library community is that of a "content farm."
Researchers and scholars place great pressure on the library to make its rare and unique holdings available in digital formats. To that end, the Library has developed a substantial in-house infrastructure to support the reformatting of holdings into digital format, but on a very selective basis. The Library now estimates that perhaps no more than ten percent of the Research Libraries' holdings will be digitally reformatted over the next ten years. In the mid-1990s the NYPL Research Libraries' administration and staff made a strategic decision to realign, strengthen, and aggressively market access services in order to facilitate library users' ability to discover the extraordinary wealth of content contained in the collections. Traditional public services within the library were reengineered and enhanced. Public service staff began to assume new roles as teachers (15), and this staff then launched an ambitious training program for the public. In addition, the Library began to provide off-site users with access to library resources via an enriched website. Today, the Library's website (16) has become the main portal through which researchers can gain access to NYPL resources: it provides a gateway to our catalogs, our electronic resources, our reference services, and our classes, public programs, and exhibitions. In 1997, NYPL was among the first public libraries in the US to give offsite users access to electronic serials content via the Web (17). In our case, NYPL acquired licenses to give our 4 million registered library users (18) 24/7 access to the full-text content of several thousand electronic journals, first from ProQuest, then from a wide range of content providers (19). This was a huge service delivery enhancement, and this access was extremely popular with the citizens of New York City, since the Library was suddenly available to them at any time from any location.
NYPL's Model for Digital Advancement
Building upon these successes, the Library developed an incremental strategy for Digital Library Program development. As the illustration shows, at the basic level is an electronic library composed of the Web site, the integrated library system, and digital content from third parties. At a more advanced level, the library develops sophisticated discovery tools and finding aids. On the third level of the pyramid is the digitization of content, of all types and in all formats. Finally, at the top of the model are knowledge management activities, publications, and courseware that are derived from digital content and digital tools. As the Library becomes engaged at higher levels of digital library activities, it must increasingly strengthen its investment in human, technological and financial resources. The NYPL Research Libraries has taken measured steps to integrate digital activities carefully into its traditional agenda. Key to the Library's success have been rigorous training programs for both the staff and the library's users to ensure that both groups have the necessary competencies to work confidently and successfully with digital information. To illustrate the breadth of digital activities that the NYPL staff now manages, I would like to present several case studies to describe the availability of E-content, the user training program, efforts to produce enhanced discovery tools, NYPL's digital reformatting activities, and new product development.
Making content broadly accessible
These purchased resources are supplemented by gifts from publishers and content providers, as well as e-resources that are in the public domain, such as patents, census data, and government information. Today, over 90% of our electronic content is accessed directly from content providers' remote servers via the Web. The Library is now able to make available full-text archives of current and retrospective newspaper holdings, such as the "New York Times" (25, 26), the "Wall Street Journal," and the "Times of London" (27). This service is extremely popular, since it eliminates the need to consult microfilm, a labor-intensive process for both the staff and users. The Library also offers researchers many specialized electronic tools. For those who closely follow the stock market, NYPL provides access to the interactive network of "Bloomberg Financial Markets" (28) for news, stock market quotes, and market perspectives. For users researching a particular company or business, the Library offers access to powerful resources such as "D&B Million Dollar Directory," which covers over 1.6 million public and private businesses (29). Reference staff report that business directories in print format are almost never used anymore; researchers always opt for the more up-to-date, full-text electronic versions of these directories. In fact, NYPL's coverage is strong in almost all subject areas, except medicine and law. For those patrons pursuing their family histories, the Library makes available such resources as "Ancestry Plus", "Heritage Quest Online" (30), and "Burke's Peerage and Gentry Online." For patrons who need to conduct fine or decorative arts appraisals, we offer access to online auction records via "ArtFact" (31), "Artnet", and "Book and Auction Records Online." The Library also provides a number of options for image research (32).
For photographs, we provide access to the "Associated Press Photo Archive"; for paintings or sculpture, one can consult the "AMICO" (Art Museum Image Consortium) file or the Research Libraries Group's "Cultural Materials Archive." Later in this paper, I will describe NYPL's own initiative to provide digital access to images held in NYPL's collections. At NYPL, library users can access this electronic content through a variety of gateways, including links from online catalog records, web lists of E-resources available across the institution, or from disciplinary divisions' portals.
Library staff and end-user readiness: the SIBL training programs
In order to train the public, first one must train the staff. In the mid-1990s, NYPL's science and business staff, schooled with an orientation toward printed resources, needed to be prepared to work successfully in an electronic library environment. Thanks to a training grant from the Kellogg Foundation, staff was given the opportunity to learn basic technology competencies and to absorb the characteristics of electronic resources, both the search systems and e-content. The Library made a sizeable investment in the staff, both to provide remedial training and to develop new competencies. Staff members who were not successful in acquiring target competencies on the first round of training were given a second opportunity. Beyond the basic competency level, SIBL librarians also were expected to assume new roles as instructors, and so it was necessary to provide formal education in instructional pedagogy. Projects were assigned in database design and knowledge management. Eventually, over 90% of SIBL's staff members assumed roles as the new generation of librarians at the New York Public Library. The other 10% either migrated to more traditional positions or opted for retirement. SIBL opened at a time when few members of the public had experience using electronic information. A public training program was required to help researchers use the rich array of e-content successfully (36, 37). SIBL was built with four hands-on classrooms (15 seats each). Initially, staff offered a curriculum of ten classes in science and business disciplines. During the first year, and based on public demand, five additional classes were created and added by the SIBL librarians. By the end of the first 12 months, over 13,000 people had been trained in the program. In addition, customized, special classes were developed for targeted groups, including non-profits, high school science students, and library donors. As of 2003, over 65,000 people have attended classes at SIBL.
Today, two full-time librarians oversee the public education program, but all reference librarians have teaching responsibilities. Librarians develop new specialized classes on an ongoing basis, and specialized classes include such topics as Stocks and Mutual Funds, Market Research, the Apparel and Textile Industries, Companies and Contacts, and Conducting Export Research. Classes in the general curriculum fall into four main areas: Basic Library Skills, Science Resources, Business Resources, and Government Information. Several years ago, the Library stopped offering "Introduction to the Web" classes; the demand no longer exists. Based on SIBL's successful program, we have now built and opened public training facilities in the Humanities and Social Sciences Library and at the Library for the Performing Arts. Staff offers a curriculum that includes Biographical Research, Researching Maps, Art Information Services, the Culture of Italy, Locating Manuscript and Archival Collections, and Intellectual Property in the Humanities, to name a few. Staff is now working on the development of web-based courseware to better serve offsite users, and the Library has stepped up marketing of Research Libraries' classes through an online newsletter and promotion at conferences.
Enhanced discovery tools - access to finding aids
As an integral component of digital library programs, American libraries are focused on the information-seeking behaviors of researchers, often from a discipline-specific perspective. The Digital Library Federation (DLF) has sponsored several studies that examine how scholars look to libraries as a place to enable the exploration and the mining of ideas. In a 2001 DLF publication, "Scholarly Work in the Humanities and the Evolving Information Environment" (38), it was reported that humanists expect the library to design, produce, and deliver digital tools that permit extensive exploration of both print and digital content.
Highlighted in the report is the notion that scholars place high value on the availability of online finding aids for primary source materials. To this end, and during the past several years, The New York Public Library has made incredible progress in converting traditional finding aids to online format using the Encoded Archival Description (EAD) format (39, 40, 41, 42, 43). These fully searchable discovery tools have been mounted on the website, and we are now creating links to full-text digital content when it exists. Currently, the library has mounted over 400 comprehensive finding aids. Libraries across America have undertaken similar efforts, and the researchers in NYPL's reading rooms find it very exciting to be able to identify and view primary source materials held in other major repositories, without having to leave New York. For example, the Online Archive of California (www.oac.cdlib.org) provides a fine example of a multi-institutional effort (44-47). The University of California libraries have pooled their finding aids so that it is possible to explore collections across institutions through a unified search. Increasingly, links exist to full-text page images.
Digitization of US Libraries' holdings
In the US, there exists no single coordinating agency that has unified the digitization efforts of libraries (48). Likewise, there has been no central funding source for digitization. As a result, in the last decade of the 20th Century, few large-scale, library-based digitization initiatives existed outside of the Library of Congress' "American Memory Project" (49) and "Making of America" (50), a University of Michigan and Cornell University partnership. However, libraries have mounted a myriad of digital demonstration products. Many have taken the form of "online exhibitions," and a fair number of these have been derived from actual onsite exhibitions that have been mounted in libraries.
These web products enable a library to deliver an exhibition to an expanded audience and to include many value-added resources. For example, the current exhibition at NYPL, "Urban Neighbors," is intended to illustrate the natural history of New York City wildlife for an adult audience (51). The companion digital exhibition adds a strong educational component for high school students, including a teacher's guide and an interactive sighting guide for children (52).
The Digitization Projects of NYPL
In aggregate, these projects have begun to make the collections come alive for researchers. As one might imagine, library users find these resources useful glimpses into special collections. Staff members like the preservation benefits of electronic browsing, and researchers also place high value on being able to easily acquire digital images for publication.
Large scale digitization from NYPL's collections
Finally, in 2000, the Library was given the capacity to mount its first large-scale digitization project, known as NYPL Visual Treasures (60). A US foundation, the Atlantic Philanthropies, awarded NYPL a grant of $6 million to create a digital archive of over 600,000 images selected from the NYPL collections. These funds have been supplemented by grants from a number of other sources. The objective of the project is to digitize the images, develop software to manage the images and the associated metadata, and make the images accessible to the public anywhere in the world and at no charge. The first phase of the project (about 300,000 images) will become available to the general public during the summer of 2003. Currently, most units of the library are engaged in some aspect of the project, ranging from the selection and preservation of materials to the creation of metadata. To establish a production mode, NYPL has built in-house digital imaging labs, developed relationships with several outside digitization contractors, created teams of metadata specialists, and developed an Oracle software platform to hold and to provide access to the images and the metadata. In selecting materials for digitization, priority is given to images that have been in high demand from individuals in academia or those conducting advanced picture research, including work leading to publication or museum exhibitions.
The Library has a special interest in using the project to expand service to the museum community in the greater New York metropolitan area and elsewhere. A high priority is also given to materials in the Library's special collections, including rare and unique photographs, prints, maps, and other materials, especially items whose access is restricted due to their fragile condition. Selection of materials for digitization is also done to complement other major digital image collections across the US and to avoid duplication. The Library's approach is to include large, complete collections of images, rather than selected examples, so that the resulting aggregate research collection is one of extensive depth and can form the basis for advanced research. Given that the ongoing cost of digital storage is substantial, selection is also made with an eye toward entrepreneurship: selecting images that will be in demand for fee-based photographic or digital reprographic services that the Library provides. An external advisory group of photographers, art historians, teachers, museum curators and artists advises the library on selection policies and practices. (Samples of the Visual Treasures Archive are demonstrated, Slides 62-69) Ultimately, images will be shared with other union archives such as the Research Libraries Group "Cultural Materials" archive (70). The Library will share its images in this way in order to give researchers alternative search and browsing options and to place NYPL's images within a larger cultural context. A second large digitization project is underway at NYPL. The African American Migration Experience digital project will create a digital archive of important Black Studies materials, focusing on 13 major migrations, from the slave trade to the movement of Blacks across the American continent in the 19th and 20th Centuries (71, 72).
For this $2.5 million (300 million yen) project, the Library will digitize images, texts, recorded sound and video. The Schomburg Center's collections will serve as the major resource for digitization, but staff will also digitize from selected African American repositories across the United States. From this digital archive, editorial staff and academic specialists will create a thirteen-chapter website that will place the images, text and media into an educational context. The project will include education guides for teachers and suggested curricula for schools. To engage in large-scale digitization, NYPL has needed to create a new organization to manage digital library activities. In 2000, the Library created a distinct Digital Library Program unit which has responsibility for the digital imaging labs, software development, metadata, editorial and web development, quality control and contract administration operations. No images are captured if item-level metadata does not exist. Staff from this centralized metadata unit is assigned across the Library's divisions, as needed. These professionals are highly trained, often with either foreign language skills or subject expertise, but they are not necessarily librarians (73). Digitization is accomplished in several ways. We have built an in-house Digital Imaging Lab that is staffed with NYPL employees (74). These staff members have superb photography skills, and they handle our most fragile and valuable materials. In the library, we have assigned space to contractors who scan "medium rare" materials. Also, the Library sends materials out of house to three established vendors. NYPL staff performs quality control on images, whether shot by NYPL staff or contractors. Finally, we have added new crews of technology support staff, especially those with Oracle and Unix skills.
At any given moment, over 80 members of the Research Libraries staff are engaged in digital library activities, since many curatorial divisions lend staff to the digital effort. This represents an assignment of nearly 15% of the workforce. Referring back to my earlier illustration, at the highest level of effort and at the top of the pyramid of digital activity is the creation of new tools and content. At this level, library staff, faculty, and other users will be able to extract highly malleable data from digital repositories to create new publications, educational resources, and innovative products. One of the most noteworthy such products is the "Early English Books Online, Text Creation Partnership" (75). The Early English Books Online archive includes 125,000 titles published in England before 1700, from the first book printed in English by William Caxton, through the age of Shakespeare. The electronic resource is available from ProQuest, and it is searchable by author, title, printer, publication date, type of illustration, and Library of Congress subject heading. However, since the file contains page images, it is not possible to search the text. Since 2000, the University of Michigan and Oxford University have provided leadership for an exceptional partnership to create structured SGML/XML text editions for a significant portion of the Short Title Catalog of Early English books. Since ProQuest has already created digital images for 125,000 works, Michigan and Oxford, with the support of the international library community, are in the process of creating accurately keyboarded and tagged editions of a significant portion of these materials. The challenge of keying Middle English scripts and texts that were written before spelling conventions were standardized is enormous. But the impact of being able to search through text that is often undiscovered is also enormous.
For example, it is now possible to search through thousands of pages of unexplored text to find a concept such as "love." The keyed text is then linked back to the ProQuest page image archive, and it is possible to toggle back and forth between the two versions. As one scholar described the enhanced search / discovery process, "it is now possible to do in several minutes what it took me several years to do." (76-79) It remains to be seen how active libraries will be in the creation of new digital products. A more likely scenario is that libraries will partner with the academic community (scholars, faculty, and researchers) to achieve these results.
Challenges on the US digital landscape
I've painted a rather glowing picture of progress on digital programs at NYPL and in the US in general. Of course, it is indeed impressive to observe the breadth and diversity of the many digital library products now underway in America. However, I also want to mention several of the major challenges that have slowed down our progress. Balancing traditional library programs with the new agenda of digital library activities has proved to be a considerable strain on both financial and human resources. Very few large research libraries have been willing to displace funding for traditional collection activities in order to pay for digital library initiatives. Print collections continue to remain a core value. Similarly, few libraries have been able to reassign significant numbers of staff away from traditional activities to digital efforts. As a result, and in the absence of any central source of government funding, the evolution toward large-scale digital library activities has been very slow. Standards and architecture issues have not yet been resolved.
During the past five years, the Digital Library Federation members have worked diligently to develop and promote standards such as EAD (Encoded Archival Description), OAI (the Open Archives Initiative) and METS (Metadata Encoding & Transmission Standard), as well as models for user authentication. Currently, the Library of Congress is providing an essential service in sorting out issues related to the identification of long-term archival systems for digital information. However, much work remains to be undertaken in many areas to create, adopt and disseminate standards that can be easily used by the library community. And finally, in America, libraries have been hindered by the lack of a centralized organization to provide leadership and coordination for a mega-digital library project across all types of libraries and archival organizations. Currently, several thousand libraries have mounted digital library files that are not compatible or cross-searchable, nor, in many cases, easily integrated. I am happy to report that these challenges are being addressed. A recent Digital Library Federation strategic planning initiative has set forth a course of action, and this federation of libraries will assume renewed leadership in each of these areas. DLF members are deeply committed to advancing the digital library agenda in partnership, and all libraries in the United States will benefit.
Be assured that in New York and in America, libraries see their future
planted firmly in a networked environment. How exciting it is to
have these opportunities to transform information and related services
as a result of the technologies available. How very exciting to know
that researchers in Tokyo will be able to freely consult a wealth of resources
in New York and that New Yorkers will have access to the riches of libraries
across Japan!

