Library automation is impossible without an electronic catalog, which is the core of library technology. With the commissioning of the new building of the National Library of Belarus and the introduction of the Automated Library Information System (ALIS), the question of retroconversion of the card catalogs arose.
Retrospective conversion is the transfer of existing bibliographic information from traditional data carriers (card catalogs) into a machine-readable form that is more convenient for users.
Because of the specifics of the subject area, the retroconversion task required the specialists of OAO AGAT-SYSTEM to develop a unique industrial Information-Technological System of Retroconversion (RITS), comprising a set of hardware and software, and to carry out a whole range of measures for organizing workplaces and training personnel.
The task was to process about 3.5 million cards from six different card catalogs. The time allotted for the task was two years (November 2005 - October 2007).
The key point in the retroconversion process is to represent the information on a catalog card as text and then to process that text in order to mark the elements of the bibliographic description and form electronic records in MARC format. In general, retroconversion work includes:
- Scanning of catalog cards;
- Recognition (“decoding”) of the card images with special software in order to obtain their text;
- Processing of the text of catalog cards and identification of the individual elements of the bibliographic description;
- Formation of records in MARC format.
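The last two steps above can be illustrated with a minimal sketch. It assumes that the recognized card text follows ISBD-style punctuation (areas separated by ". - ", with " : " and " / " inside an area) and maps the pieces to UNIMARC-like tags 200, 210 and 215. The function name `mark_card_text` and the sample card are hypothetical; the actual marking rules used in RITS are not described in this text and are certainly richer.

```python
import re

# Assumption: areas of the description are separated by ". - " (ISBD-style).
AREA_SEPARATOR = re.compile(r"\.\s+-\s+")


def mark_card_text(text):
    """Split recognized card text into (tag, subfield, value) triples."""
    areas = AREA_SEPARATOR.split(text.strip().rstrip("."))
    fields = []

    # Area 1: "Title : other title information / statement of responsibility"
    title_part, _, responsibility = areas[0].partition(" / ")
    title, _, subtitle = title_part.partition(" : ")
    fields.append(("200", "a", title.strip()))
    if subtitle:
        fields.append(("200", "e", subtitle.strip()))
    if responsibility:
        fields.append(("200", "f", responsibility.strip()))

    # Area 2: "Place : Publisher, Year"
    if len(areas) > 1:
        place, _, rest = areas[1].partition(" : ")
        publisher, comma, year = rest.rpartition(", ")
        if not comma:  # no year present: treat the remainder as the publisher
            publisher, year = rest, ""
        fields.append(("210", "a", place.strip()))
        if publisher:
            fields.append(("210", "c", publisher.strip()))
        if year:
            fields.append(("210", "d", year.strip()))

    # Area 3: extent, for example "320 p"
    if len(areas) > 2:
        fields.append(("215", "a", areas[2].strip()))
    return fields


# Hypothetical card text, used only for illustration.
card = "Library automation : a handbook / I. Ivanov. - Minsk : Navuka, 1998. - 320 p."
for tag, code, value in mark_card_text(card):
    print(tag, "$" + code, value)
```

Real cards deviate from the punctuation rules often enough that a purely rule-based pass like this one can only pre-mark the text, which is consistent with the manual marking and quality control stages described below.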
Obtaining text from a card can be considered a routine operation, since there are special software packages that use pattern recognition to convert (or “decode”) text from a graphic source. Therefore, the RITS developed at OAO “AGAT-SYSTEM” processes information that has already been recognized after the catalog cards were scanned. The system is built on the Oracle DBMS with a client-server architecture, which supports the storage and processing of large amounts of information and intensive sharing of resources by users of specialized workstations.
RITS consists of:
- Databases (DB) for storing and accumulating information;
- A software system for maintaining the database, monitoring the completion of technological operations, managing the system and keeping personnel performance records;
- Several types of computer workplaces (CW), which differ functionally according to the work performed at the corresponding stage of the technological process:
- CW for text correction (standardization).
However good the recognition software is, some characters are recognized with low confidence or not at all, because they are handwritten or because the physical condition of the card did not allow a satisfactory image to be obtained by scanning. Therefore, all texts produced by software “decoding” must pass through a stage of manual processing, in which operators correct the texts using special software.
- CW for text quality control;
- CW for text marking;
- CW for text marking quality control;
- CW for designation quality control, where the assignment of cards of various types to the corresponding information flows is checked for correctness;
- CW of the system administrator;
- A software system for automated processing of the text of catalog cards, which marks the individual elements of the bibliographic description corresponding to the fields of BELMARC, the Belarusian communication format developed on the basis of UNIMARC. This software system is part of the text marking CW and significantly reduces the labor intensity of this technological operation, since it can automatically identify and mark up to 64 MARC fields in the texts of catalog cards;
- A software system for checking the fields of bibliographic descriptions and exporting data to output files for information exchange. Information is exported from the RITS database into files structured in accordance with GOST 7.14-98 (ISO 2709-96);
- A program for checking bibliographic records. This module has no interface with the RITS database and is intended for selective or complete checking of the output files during acceptance by the customer’s specialists.
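The exchange files mentioned in the list above follow the ISO 2709 record structure: a 24-character leader, a directory of 12-character entries, and the variable fields, with dedicated terminator characters. The sketch below shows that layout for data fields only. The helper names `build_iso2709` and `read_directory`, the sample fields and the placeholder values in the implementation-defined leader positions are assumptions for illustration; they are not taken from the RITS export module.

```python
FIELD_END = b"\x1e"   # field terminator
RECORD_END = b"\x1d"  # record terminator
SUBFIELD = b"\x1f"    # subfield delimiter


def build_iso2709(fields):
    """Serialize data fields given as (tag, indicators, [(code, value), ...])."""
    directory = b""
    data = b""
    for tag, indicators, subfields in fields:
        body = indicators.encode("ascii")
        for code, value in subfields:
            body += SUBFIELD + code.encode("ascii") + value.encode("utf-8")
        body += FIELD_END
        # Directory entry: 3-character tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FIELD_END
    base_address = 24 + len(directory)
    record_length = base_address + len(data) + len(RECORD_END)
    # Leader positions 5-9 and 17-19 are implementation-defined;
    # "nam  " and the blanks used here are placeholders only.
    leader = f"{record_length:05d}nam  22{base_address:05d}   450 ".encode("ascii")
    return leader + directory + data + RECORD_END


def read_directory(record):
    """Return (tag, field length, start offset) for every directory entry."""
    base_address = int(record[12:17])
    entries = []
    for i in range(24, base_address - 1, 12):
        entry = record[i:i + 12].decode("ascii")
        entries.append((entry[:3], int(entry[3:7]), int(entry[7:12])))
    return entries


# Hypothetical record content, used only for illustration.
record = build_iso2709([
    ("200", "1 ", [("a", "Library automation"), ("f", "I. Ivanov")]),
    ("210", "  ", [("a", "Minsk"), ("d", "1998")]),
])
print(record[:24].decode("ascii"))
print(read_directory(record))
```

Lengths and offsets are counted in bytes, so the 4-digit and 5-digit directory fields limit a single field to 9999 bytes and a record to 99999 bytes. A checking program of the kind described above can verify exactly these invariants without access to the database.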
Features of information processing implemented in the system.
The main feature of information processing in RITS is verification that the basic operations (text correction and text marking) have been performed correctly. In verification mode, the information from one catalog card is processed twice, by two different employees, and discrepancies between the two results are then detected automatically. Operators at the quality control workplace decide how to resolve the detected errors.
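The automatic detection of discrepancies between two operators' results can be sketched as follows. This is an illustrative comparison based on Python's standard `difflib` module, not the algorithm used in RITS; the function name `find_discrepancies` and the sample strings are hypothetical.

```python
import difflib


def find_discrepancies(first, second):
    """Return the fragments that differ between two versions of one card text."""
    matcher = difflib.SequenceMatcher(a=first, b=second, autojunk=False)
    return [
        (tag, first[i1:i2], second[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Two operators corrected the same recognized text independently.
operator_1 = "Minsk : Navuka, 1998"
operator_2 = "Minsk : Nauka, 1998"

differences = find_discrepancies(operator_1, operator_2)
if differences:
    # In this sketch, a card with any discrepancy goes to quality control.
    print("Send to quality control:", differences)
else:
    print("Versions agree")
```

Double processing catches errors only where the two operators disagree, so identical mistakes made by both pass unnoticed; this is one reason an additional sampling check of the output is still useful.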
Because the standard imposes special requirements on multi-level bibliographic descriptions, primarily for multi-volume documents as well as serials and other continuing resources, a special user and program interface was developed. RITS thus allows the data of several catalog cards to be “merged” or the data of one card to be “partitioned” into logical blocks, which are then processed further, and allows the inventory numbers of storage units to be linked with the corresponding volumes.
The principles by which information is distributed in the system during processing should also be noted. During text correction, cards are delivered to workplaces in random order, whereas during the most important operation, text marking (bibliographic markup), cards are delivered to workplaces as sequential blocks of limited size that correspond to their physical arrangement in the storage boxes. These technological principles guarantee that only completed data blocks, in multiples of a catalog box, can be exported.
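The two distribution modes and the export rule can be sketched in a few functions. All names here (`correction_queue`, `marking_blocks`, `exportable_boxes`), the card identifiers and the block size are assumptions made for illustration; the text does not state how RITS implements the distribution.

```python
import random
from collections import defaultdict


def correction_queue(card_ids, seed=None):
    """Correction stage: cards are handed out in random order."""
    queue = list(card_ids)
    random.Random(seed).shuffle(queue)
    return queue


def marking_blocks(cards, block_size):
    """Marking stage: consecutive cards of one box form blocks of limited size.

    cards is an iterable of (box_id, position_in_box) pairs.
    """
    by_box = defaultdict(list)
    for box_id, position in cards:
        by_box[box_id].append(position)
    blocks = []
    for box_id in sorted(by_box):
        positions = sorted(by_box[box_id])
        for start in range(0, len(positions), block_size):
            blocks.append((box_id, positions[start:start + block_size]))
    return blocks


def exportable_boxes(blocks, completed_blocks):
    """A box may be exported only when every one of its blocks is completed.

    completed_blocks is a set of indexes into blocks.
    """
    boxes = {box_id for box_id, _ in blocks}
    unfinished = {
        box_id
        for index, (box_id, _) in enumerate(blocks)
        if index not in completed_blocks
    }
    return sorted(boxes - unfinished)


cards = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2)]
blocks = marking_blocks(cards, block_size=2)
print(blocks)
print(exportable_boxes(blocks, completed_blocks={0, 2}))
```

Keeping marking blocks aligned with the physical boxes means that neighbouring cards, which often describe related volumes, reach the same operator together.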
The technological principles of RITS are aimed primarily at meeting the strict quality requirements for the bibliographic records created. In a 5% sample of the bibliographic records exported to an output file, the system ensures no more than one error in a marker (MARC field) per 10 records.
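The acceptance criterion can be expressed as a short check. The sketch assumes one reading of the requirement: the total number of erroneous MARC fields found in the sample, divided by the number of sampled records, must not exceed 1/10. The function names `draw_sample` and `batch_accepted` are hypothetical, and the error counts would come from manual or automated inspection.

```python
import random


def draw_sample(record_ids, percent=5, seed=None):
    """Pick a random sample of the given percentage, rounded up, at least one."""
    records = list(record_ids)
    size = max(1, (len(records) * percent + 99) // 100)
    return random.Random(seed).sample(records, size)


def batch_accepted(errors_per_record, max_errors_per_10_records=1):
    """errors_per_record holds the number of erroneous MARC fields per sampled record."""
    total_errors = sum(errors_per_record)
    # Integer form of: total_errors / records <= max_errors / 10
    return total_errors * 10 <= max_errors_per_10_records * len(errors_per_record)


sample = draw_sample(range(2000), percent=5, seed=42)
print(len(sample))

# 20 inspected records with 2 erroneous fields in total: exactly 1 per 10 records.
print(batch_accepted([0] * 18 + [1, 1]))
# 10 inspected records with 2 erroneous fields in total: above the limit.
print(batch_accepted([0] * 8 + [1, 1]))
```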