CDNLAO Newsletter
No. 58, March 2007
- Experimental projects of collecting, archiving and providing Internet information resources
- Report of the Legal Deposit System Council
- Creation of the Digital Archive
The National Diet Library (NDL) set out the "National Diet Library Electronic Library Concept" in May 1998, which states that information resources available on the Internet should be collected, archived and provided by the library. In 2000 the NDL formulated the "Basic Implementation Plan for Electronic Library Services", and in 2002 it started experimental projects. In February 2004 the NDL drew up the "National Diet Library Digital Library Medium Term Plan for 2004" and has since embarked on projects to construct digital archives.
Based on these plans, we have continued to develop projects for collecting and archiving Internet information resources. The main projects we have worked on are described below.
1. Experimental projects of collecting, archiving and providing Internet information resources
In FY2002, in tandem with the opening of the Kansai-kan, the NDL started experimental projects on collecting, archiving and providing Internet information resources. The following three projects were conducted as testbeds for designing the operational model.
WARP (Web Archiving Project) (http://warp.ndl.go.jp)
WARP is a project in which the NDL selectively collects Internet
resources, archives them, and makes them available to the public,
provided that the copyright holders give the library permission to do
so. The project started in FY2002 on an experimental basis and moved
into the operational stage in FY2006.
In WARP the NDL has collected the websites of national administrative agencies and institutions; prefectural governments; cities, towns and villages, together with their consolidation councils, before and after consolidation; public-interest corporations/organizations and national universities before their conversion into independent agencies; and various events such as the FIFA World Cup 2002. The NDL has also collected online periodicals issued on the Internet by organizations of these kinds.
These information resources are collected mainly by a web crawler,
with permission from the copyright holders. We assign metadata to
each website and online periodical, treating it as a unit of
organization (a title), and link to it the different versions (items)
collected on different dates. As of January 2007, we have archived the
following numbers of Internet resources.
Table: WARP contents (as of January 2007)
| Type | Number of titles | Number of items | Number of files (millions) | Volume of data (GB) |
|---|---|---|---|---|
| Online periodicals (total) | 1,499 | 6,837 | 4.57 | 560 |
| Websites (total) | 1,922 | 7,521 | 54.37 | 3,403 |
| National agencies | 38 | 298 | 8.85 | 683 |
| Prefectures | 8 | 98 | 8.31 | 495 |
| Cities, towns, and villages to be consolidated | 1,687 | 6,095 | 21.33 | 1,436 |
| Public-interest corporations/organizations | 85 | 780 | 12.73 | 615 |
| Universities | 71 | 106 | 2.61 | 155 |
| Events | 26 | 102 | 0.47 | 18 |
| Others | 7 | 42 | 0.07 | 1 |
| Total | 3,421 | 14,358 | 58.94 | 3,963 |
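The title and item relationship described above can be sketched as a small data model. This is only an illustrative sketch, not the NDL's actual schema: the class names, field names, and the example site are assumptions made for this example. On the figures in the table, each title has on average about 4.2 items (14,358 items for 3,421 titles).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Item:
    """One collected version of a title (hypothetical fields)."""
    collected_on: date
    file_count: int
    size_bytes: int


@dataclass
class Title:
    """A website or online periodical treated as one unit of organization."""
    name: str
    kind: str  # e.g. "website" or "online periodical"
    url: str
    items: List[Item] = field(default_factory=list)

    def add_item(self, item: Item) -> None:
        # Keep versions ordered by collection date, oldest first.
        self.items.append(item)
        self.items.sort(key=lambda i: i.collected_on)

    def latest(self) -> Optional[Item]:
        # Most recently collected version, or None if nothing is archived yet.
        return self.items[-1] if self.items else None


# Usage: one title with two versions collected on different dates.
site = Title(name="Example Town", kind="website", url="http://www.example.jp/")
site.add_item(Item(date(2005, 3, 1), file_count=1200, size_bytes=80_000_000))
site.add_item(Item(date(2004, 9, 1), file_count=900, size_bytes=50_000_000))
print(site.latest().collected_on)
```

Attaching the metadata to the title rather than to each item means a title is described once, however many times it is collected.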
Survey on Comprehensive Collection, Storage, and Archiving of Japanese Web Sites
This survey was conducted from October 2004 to March 2005 to study the feasibility and
methodology of collecting, storing, and archiving Japanese websites.
The survey targeted domestic websites including those in the JP domain.
Based on the survey results, we estimated the total volume of
Japanese websites at approximately 18.4 TB and the total number of
files at approximately 450 million. For more information on the
survey, please see the summary on the NDL website:
http://www.ndl.go.jp/en/aboutus/bulkresearch2005summary_e.html
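As a rough reading of the survey figures, the average file size follows by simple division. The sketch below assumes decimal units (1 TB = 10^12 bytes), which the survey summary quoted here does not state; the function name is made up for this example.

```python
# Figures reported by the survey (both approximate).
TOTAL_BYTES = 18_400 * 10**9  # 18.4 TB, assuming decimal terabytes
TOTAL_FILES = 450 * 10**6     # 450 million files


def average_file_size_kb(total_bytes: int, total_files: int) -> float:
    """Average size per file in kilobytes (1 kB = 1000 bytes)."""
    return total_bytes / total_files / 1000


avg_kb = average_file_size_kb(TOTAL_BYTES, TOTAL_FILES)
print(f"average file size: about {avg_kb:.1f} kB")  # about 40.9 kB
```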
Dnavi (Database Navigation Service) (http://dnavi.ndl.go.jp)
A large share of Internet resources in fact takes the form of
databases, which a web crawler cannot collect and which therefore
cannot be found through search sites. Dnavi guides users to the gateways
of databases in Japan. It started in FY2002 as an experimental project
and moved into the operational stage in FY2006 together with WARP.
There are 9,600 databases in Dnavi as of January 2006.
2. Report of the Legal Deposit System Council
In March 2002 the Chief Librarian of the NDL consulted the Legal
Deposit System Council to seek their views on the following questions:
Should networked electronic publications issued within Japan be
incorporated into the legal deposit system? If not, what selection
criteria should be applied to them, and by what means should they be
collected?
3. Creation of the Digital Archive
The NDL formulated the "National Diet Library Digital Library Medium Term Plan for 2004" in February 2004. This plan specifies the objectives of the NDL digital library services, one of which is to make the NDL a major base of digital archives in Japan.
The NDL has been developing the NDL Digital Archive System (NDL DA System) as an infrastructure system for digital archives. The system is intended to support the whole workflow, from collecting and organizing digital information through providing and preserving it, and to ensure its long-term preservation and availability. Digital information in this case includes not only Internet information resources but also packaged electronic publications such as CD-ROMs, as well as information digitized from paper publications.
The NDL DA System is based on the Open Archival Information System (OAIS) reference model (ISO 14721:2003) and consists of three layers: application, preservation and mass storage. Digital information is to be preserved for the long term as information packages based on the Metadata Encoding and Transmission Standard (METS). We intend to use the Metadata Object Description Schema (MODS) as our standard for creating descriptive metadata. We are working on system development in the hope that the NDL DA System will start operating in FY2009. The WARP system currently in operation will be migrated to the NDL DA System.
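To give a sense of what a METS-based information package with MODS descriptive metadata can look like, the following sketch builds a minimal METS document using Python's standard library. It is an illustration under stated assumptions, not the NDL DA System's actual package profile: the function name `build_package`, the IDs `DMD1` and `FILE1`, and the example title and URL are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs published for METS, MODS and XLink.
METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
XLINK = "http://www.w3.org/1999/xlink"

for prefix, uri in (("mets", METS), ("mods", MODS), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def q(ns: str, tag: str) -> str:
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def build_package(title: str, file_url: str) -> ET.Element:
    """Build a minimal METS package (hypothetical layout and IDs)."""
    mets = ET.Element(q(METS, "mets"))

    # Descriptive metadata: a MODS record wrapped inside a METS dmdSec.
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS, "mods"))
    title_info = ET.SubElement(mods, q(MODS, "titleInfo"))
    ET.SubElement(title_info, q(MODS, "title")).text = title

    # File section: one content file, referenced by URL.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS, "fileGrp"))
    file_el = ET.SubElement(group, q(METS, "file"), ID="FILE1")
    ET.SubElement(
        file_el,
        q(METS, "FLocat"),
        {"LOCTYPE": "URL", q(XLINK, "href"): file_url},
    )

    # Structural map: ties the file to its descriptive metadata.
    struct_map = ET.SubElement(mets, q(METS, "structMap"))
    div = ET.SubElement(struct_map, q(METS, "div"), DMDID="DMD1")
    ET.SubElement(div, q(METS, "fptr"), FILEID="FILE1")
    return mets


package = build_package("Example Town website", "http://www.example.jp/")
print(ET.tostring(package, encoding="unicode"))
```

A real preservation package would normally also carry administrative and technical metadata, which this sketch omits.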
The NDL is also working on the creation of Japan's digital archive portal. Since 2004 a prototype system of the NDL Digital Archive Portal has been made available to the public. In 2007 we will start the full-scale service.
The NDL aims to collect Internet information in the Japanese web domain on the basis of future legislation. To turn this vision into reality, the present legislation must be adapted to address current issues, including intellectual property. We do not expect a short-term solution, but we will continue to work hard to help realize such legislation.
There is also a need to solve technological problems such as the obsolescence of playback systems for collected digital information resources in order to make them accessible for the long term. The NDL has been conducting feasibility studies on various preservation technologies including migration and emulation. In addition we need to promote standardization of technologies necessary for playback, metadata, etc. The NDL intends to cooperate with other digital archives including libraries and archives in Japan as well as the private sector and researchers to promote standardization.
We will continue to work hard on collecting and preserving Internet resources. We hope that we will be able to promote standardization and cooperation in Japan as well as to contribute to the international community.
Copyright (C)2007 National Diet Library, Japan
