
CDNLAO


CDNLAO Newsletter

No. 61,  March 2008

Special topic: Metadata

Metadata: to what extent has Japan’s response to challenge advanced?

by Machiko Nakai, National Diet Library


The word “metadata,” which came into use in Japan in the late 1990s, had a fresh ring for librarians, symbolizing a coming digital age and changing libraries. Japanese libraries generally understood “metadata” to mean the bibliographic data of digital resources. Both catalogers and staff promoting digital library projects saw putting metadata into practice as a big challenge: how to create data for accessing and preserving digital information in addition to existing library materials, and which standard to apply when creating it.

To what extent has this task moved forward? Here I introduce the past, present and future of approaches to metadata in the Japanese library community, focusing on the National Diet Library (NDL).

1. Digital library and metadata: the beginning

1-1. National Diet Library Electronic Library Concept

In Japan, some government-led digital library projects started in the ’90s. In 1994, the NDL began its digital library work with the Pilot Electronic Library Project, which was a collaborative project with an external institution. In 1998, we formulated the National Diet Library Electronic Library Concept and promoted projects based on the concept. 

In this concept, the NDL pointed out the importance of organizing electronic publications and of bibliographic control, and included in its digital library projects the electronic archiving of Internet information that is important and of scholarly value. Based on this concept, the NDL’s digital library projects until around 2002 focused on collecting and archiving Internet resources as well as digitizing the library collection.

1-2. The NDL Metadata Element Set

Facing entirely new projects to build, we had many things to consider. One of them was metadata, a tool for accessing Internet resources. In 2000, the staff members responsible for the digital library and for bibliography set up a working group and discussed which metadata standards to apply. As a result, we specified the NDL Metadata Element Set by adding the necessary qualifiers to the fifteen elements of the Dublin Core Metadata Element Set, and released it in 2001.
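To make the idea of "fifteen elements plus qualifiers" concrete, here is a minimal sketch in Python of a record that mixes simple Dublin Core elements with qualified terms. It is an illustration under stated assumptions, not the NDL Metadata Element Set itself: the wrapper element `record`, the function `build_record` and the sample values are hypothetical, and the qualifiers shown are standard DCMI terms rather than the NDL's own.

```python
import xml.etree.ElementTree as ET

# Namespaces published by DCMI for the simple elements and the qualified terms.
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"


def build_record(fields):
    """Build an XML record from (namespace, name, value) tuples.

    The <record> wrapper is a hypothetical container used only for this sketch.
    """
    ET.register_namespace("dc", DC)
    ET.register_namespace("dcterms", DCTERMS)
    record = ET.Element("record")
    for namespace, name, value in fields:
        child = ET.SubElement(record, f"{{{namespace}}}{name}")
        child.text = value
    return record


# Sample values are invented for illustration.
record = build_record([
    (DC, "title", "Example web resource"),
    (DC, "creator", "Example Organization"),
    (DCTERMS, "alternative", "Example resource"),  # qualifier refining dc:title
    (DCTERMS, "issued", "2001-03-01"),             # qualifier refining dc:date
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The simple elements carry the core description, while each qualifier narrows the meaning of one element without changing the overall fifteen-element framework.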

1-3. WARP and Dnavi

In 2002, the NDL released the first version of its digital library system, which had been under development since 2000. WARP (Web Archiving Project), a project of collecting and preserving Internet resources, was launched concurrently with the public opening of Dnavi (Database Navigation Service), a metadata database of useful databases on the Internet. For both WARP and Dnavi, the NDL Metadata Element Set was adopted as bibliographic data for the resources collected and provided. As of 2007, about 2,000 websites and 1,500 titles of electronic journals are available in WARP, and 10,000 databases in Dnavi.

However, because the NDL’s digital library projects were developed by divisions newly organized for the purpose, there were no systems, experience or business models for creating metadata at the time. This is one of the reasons why only simple metadata have been assigned. As for the business model, although we had hoped that metadata would be supplied by the creators or publishers of resources, we have failed to find an effective way to make this happen.

1-4. Approaches for metadata in university libraries

University libraries in Japan also started to expand and enhance their digital library functions in the late 1990s. The National Institute of Informatics (NII), which has been operating NACSIS-CAT (union catalog of academic libraries and institutions), planned a joint project for developing a metadata database of digital resources held by university libraries and others and set to work on its development in 2002. This project, targeting various kinds of academic information resources available on universities’ websites, aimed to accumulate in the NII’s database the metadata created by each university and to have them shared. 

The Institute formulated the NII Metadata Element Set, which is based on the fifteen elements of Dublin Core with qualifiers added for its own use, and adopted it as the metadata standard for the project. In 2004, there were more than 250 participating institutions and 60,000 accumulated metadata records.

1-5. Digital libraries in Japan and Dublin Core

As described above, approaches to metadata in Japanese libraries may be said to have started with the adoption of DC metadata. The simplicity of its elements and its status as an international standard appealed to institutions that were about to start a new project. When the qualified DC appeared around 2000, the Japanese library community, which had had some doubts about the fifteen elements of the simple DC, came to accept them once qualifiers could be set to refine the elements. It can be said that the NDL and NII each formed their own application profile by adding qualifiers for the resources they needed to handle and for the tools they had applied to existing MARC data, such as authority files and classification schemes.
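The relation between qualifiers and the fifteen simple elements can be sketched as a "dumb-down" step, in which each refined term falls back to the element it refines. The following Python sketch assumes a record represented as a plain dictionary; the function name `dumb_down`, the sample record and the local term `transcription` are hypothetical, and the refinement table lists only a few standard DCMI terms.

```python
# The fifteen elements of simple Dublin Core.
SIMPLE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# A few DCMI qualifiers and the simple element each one refines.
REFINES = {
    "alternative": "title",
    "abstract": "description",
    "tableOfContents": "description",
    "created": "date",
    "issued": "date",
    "modified": "date",
    "extent": "format",
    "medium": "format",
    "spatial": "coverage",
    "temporal": "coverage",
}


def dumb_down(qualified):
    """Reduce a qualified record to the fifteen simple elements.

    Terms that are neither simple elements nor known refinements
    (for example, local extensions) are dropped.
    """
    simple = {}
    for term, values in qualified.items():
        element = term if term in SIMPLE_ELEMENTS else REFINES.get(term)
        if element is None:
            continue
        simple.setdefault(element, []).extend(values)
    return simple


# Invented sample record; "transcription" stands in for a local term.
qualified_record = {
    "title": ["Annual report"],
    "alternative": ["Nenpo"],
    "issued": ["2001-03-01"],
    "transcription": ["nenpo"],
}
simple_record = dumb_down(qualified_record)
print(simple_record)
```

A system that understands only the simple elements can still use such a record, at the cost of losing the finer distinctions that the qualifiers expressed.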

In 2001, the annual Dublin Core conference was held in Tokyo, co-hosted by the NDL, the NII and other institutions. It was the first to be held in Asia and a significant experience for us.

However, applying the DC to electronic resources was not easy in practice. It took trial and error because of the DC’s flat structure, the difficulty of determining the granularity of electronic resources, and the multiple writing systems of Japanese (Chinese characters, Katakana, etc.), which complicate processing. Moreover, in the early stage of the digital library, metadata for electronic resources were kept separate from bibliographic data for other existing materials, so the two kinds of data could not be searched together.

2. Turning point to digital archive

2-1. Next step of the NDL’s digital library

Following the start of digital library services in 2002 including WARP, Dnavi, and the Digital Library from the Meiji Era which provides digitized materials, the NDL formed the next plan in 2004, the Digital Library Medium Term Plan for 2004.

This plan states two principal objectives.

The first is the construction of a digital archive which will enable collection and long-term preservation of Internet resources on the premise of a future institutionalization of web archiving. For this purpose, the plan stipulates that “identifiers for long-term data storage and for preservation of uniformity will be provided, as are metadata for access or storage.” The second objective is the development of a portal function that “guides users not only to the NDL's digital archives, but also to digital information sources and information services provided by national and other public institutions.”

To accomplish these objectives, after reviewing the existing metadata standard, the NDL drew up two new metadata standards in February 2007 and opened them to the public on its website.

2-2. NDL Digital Archiving System Metadata Schemas (NDL-DA Metadata)

NDL-DA Metadata are metadata schemas developed for building a digital archiving system whose aim is to manage and preserve over the long term a variety of digital resources, including Internet resources collected by the NDL and digitized materials. Archiving requires, in addition to bibliographic metadata, information describing technical requirements, preservation information, rights information and management information, as well as an information package that keeps these metadata and the content together.

NDL-DA Metadata adopt METS (Metadata Encoding & Transmission Standard) as their information package and have original elements based on PREMIS for technical, rights and preservation metadata. For bibliographic metadata, MODS (Metadata Object Description Schema), developed by the Library of Congress, has been adopted. We chose MODS because it is compatible with MARC21, can easily be linked to bibliographic data for materials other than electronic resources, and is particularly well suited to describing digitized materials.

The NDL is developing a digital archiving system that implements the NDL-DA Metadata, to be put into operation in 2009.
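How a METS package can hold MODS bibliographic metadata, technical metadata and a pointer to the content may be easier to see in a small sketch. The Python code below uses the published METS and MODS namespaces, but everything else is an assumption made for illustration: the function `build_package`, the identifiers, the URL and the placeholder `format` element in the technical section are hypothetical and do not reproduce the NDL-DA schemas.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
XLINK = "http://www.w3.org/1999/xlink"


def q(namespace, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{namespace}}}{tag}"


def build_package(title, file_url, mime_type):
    """Build a minimal METS document wrapping MODS and technical metadata."""
    for prefix, uri in (("mets", METS), ("mods", MODS), ("xlink", XLINK)):
        ET.register_namespace(prefix, uri)
    mets = ET.Element(q(METS, "mets"))

    # Descriptive (bibliographic) metadata: MODS wrapped inside a dmdSec.
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS, "mods"))
    title_info = ET.SubElement(mods, q(MODS, "titleInfo"))
    ET.SubElement(title_info, q(MODS, "title")).text = title

    # Administrative metadata: a technical section with a placeholder element
    # (hypothetical; a real profile would define its own elements here).
    amd = ET.SubElement(mets, q(METS, "amdSec"))
    tech = ET.SubElement(amd, q(METS, "techMD"), ID="TECH1")
    tech_wrap = ET.SubElement(tech, q(METS, "mdWrap"), MDTYPE="OTHER")
    tech_data = ET.SubElement(tech_wrap, q(METS, "xmlData"))
    ET.SubElement(tech_data, "format").text = mime_type

    # Content files, linked to their technical metadata through ADMID.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS, "fileGrp"))
    file_el = ET.SubElement(
        group, q(METS, "file"), ID="FILE1", MIMETYPE=mime_type, ADMID="TECH1"
    )
    ET.SubElement(
        file_el, q(METS, "FLocat"), {"LOCTYPE": "URL", q(XLINK, "href"): file_url}
    )

    # Structural map tying the description and the file together.
    struct_map = ET.SubElement(mets, q(METS, "structMap"))
    div = ET.SubElement(struct_map, q(METS, "div"), DMDID="DMD1")
    ET.SubElement(div, q(METS, "fptr"), FILEID="FILE1")
    return mets


# Invented sample values.
package = build_package(
    "Digitized book (example)",
    "https://example.org/files/book.pdf",
    "application/pdf",
)
print(ET.tostring(package, encoding="unicode"))
```

The point of the package is that the description, the technical information and the reference to the content travel as one unit, which is what long-term preservation requires.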

2-3. National Diet Library Dublin Core Metadata Element Set (DC-NDL)

DC-NDL is a revised version of the NDL Metadata Element Set formulated in 2001. The aim of the revision was to rework the NDL Metadata Element Set so that it could serve as an application profile for libraries in Japan adopting DC metadata. This was done by adapting it to developments in DC metadata, typified by the DCMI Abstract Model, and by addressing problems in the NDL Metadata Element Set. There was also a more direct reason: the revision was needed for the development of the NDL Digital Archive Portal (PORTA).

PORTA is a system that realizes the portal function mentioned in the Digital Library Medium Term Plan for 2004. Its aim was to enable integrated search of several kinds of digital archives, either by cross search or by harvesting metadata. Its construction started in 2004 and its prototype was opened to the public in 2005. In October 2007 the system entered full-scale operation. As of February 2008, twenty systems are covered, including the NDL’s digital library services (the Digital Library from the Meiji Era, Rare Books Image Database, etc.), NDL-OPAC, and the digital archives of the National Archives and public libraries. PORTA uses DC metadata elements as the basis for integrated search across diverse forms of data. DC-NDL has been developed in tandem with PORTA. While the NDL-DA Metadata are intended as metadata for preservation, DC-NDL is meant as metadata for exchange.
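Integrated search over harvested metadata can be sketched as follows: records from each source are mapped onto a few common DC-style fields, collected into one index, and then searched together. This Python sketch is only an illustration of the general technique; the source names, field names, sample records and the functions `harvest` and `search` are all hypothetical and say nothing about how PORTA is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A harvested record reduced to a few common DC-style fields."""
    source: str
    title: str
    creator: str


def harvest(sources):
    """Map each source's native fields to common fields and pool the records.

    `sources` is a list of (source name, raw records, field mapping) tuples.
    """
    index = []
    for source_name, raw_records, mapping in sources:
        for raw in raw_records:
            index.append(Record(
                source=source_name,
                title=raw.get(mapping["title"], ""),
                creator=raw.get(mapping["creator"], ""),
            ))
    return index


def search(index, keyword):
    """Return records whose title or creator contains the keyword."""
    needle = keyword.casefold()
    return [
        record for record in index
        if needle in record.title.casefold() or needle in record.creator.casefold()
    ]


# Invented sample data from two differently structured sources.
opac_records = [{"245a": "History of Tokyo", "100a": "Yamada, Taro"}]
archive_records = [
    {"name": "Map of Tokyo Bay", "author": "Unknown"},
    {"name": "Kyoto temple records", "author": "Sato, Hanako"},
]

index = harvest([
    ("opac", opac_records, {"title": "245a", "creator": "100a"}),
    ("archive", archive_records, {"title": "name", "creator": "author"}),
])
hits = search(index, "tokyo")
for hit in hits:
    print(hit.source, hit.title)
```

Because every source is reduced to the same small set of elements before indexing, one query can reach records that were originally described in quite different formats.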

2-4. Institutional repositories in university libraries

In university libraries as well, the orientation toward digital archives became clear around 2004. Their focus was on building institutional repositories in which each university archives its own research output, with the aim of creating an information infrastructure that advances open access to academic information. The NII, which supports this work, harvests metadata from each institutional repository and makes the portal Junii+ available to the public on a trial basis.

As the metadata standard for Junii+, the NII formulated Junii2 in 2007, which uses DC metadata as a basis but is specialized for the description of academic theses. It can be said that the formulation of Junii2 was intended to encourage researchers at each institution to register their research output with its repository and to make it easier to create descriptive metadata for that output. With this move, it was decided to terminate, in April 2008, the joint metadata construction project that had started in 2002 and covered a wider range of resources.

3. Metadata in the future

3-1. Toward cooperation in digital archiving 

As described above, in 2007 both the NDL and NII formulated metadata standards, which differ from each other according to the roles and objectives of the two institutions. The former aimed to preserve electronic resources for the long term while taking international standards into account, and the latter focused on the construction of institutional repositories. However, we share the need for cooperation between digital archives. We also share the issue of how to ensure the interoperability of metadata, which we will tackle in the future along with discussions about how to share operations among Japanese institutions and what institutional measures will be required.

For the NDL, the top priority is to ensure the operation of metadata in its new digital archive system and fulfill the role as a reliable information infrastructure.

3-2. Web2.0 and Semantic Web

On another front, Web 2.0 technology has opened up new possibilities such as free citation, delivery, linking and tagging of data on the web. Activities around the Semantic Web have also produced various tools that make web data more convenient to use by organizing them structurally, and DC metadata have had to adapt to the Semantic Web framework. What is the role of the library in such a situation? Rather than pursuing conceptual standardization of metadata as it has done until now, the library will be required to provide metadata that are useful to all, both people and software agents; that is, to be an information infrastructure in this respect as well. This covers all the information resources libraries hold, including bibliographic data and authority data. It can be said that the concept and technology of metadata will grow in scope and importance for libraries in all aspects of creating and providing information.


Copyright (C) 2008 National Diet Library

