Pears Database Build Process

Like the content of Newton databases, records and terms for a Pears database must go through several import stages before they can be permanently committed to a .pdb file. In addition to storing the records, index terms must be extracted and postings information must be updated. This document graphically represents the update process for a Pears database.

Note: What follows below is a conceptual model for what happens behind the scenes during the update of a Pears database. The actual tasks that you perform to kick off this process are covered in the document Creating a New Pears Database.

SSDOT and the Bartlett Utility

The SiteSearch Database Operations Tool (SSDOT) for Pears is a Java-based application included in the 4.2.0 release of the SiteSearch Database Builder software. SSDOT for Pears manages the Bartlett utility and its embedded Java program, Bosc, to automate the database build process.


Description of the Update Process

Several stages are involved in transforming your source data into a searchable Pears database, beginning with the conversion of records in your input file and ending with a critical commit of records and indexes to the database. SSDOT for Pears is used to automate these stages by providing you with menu selections for registering and updating a Pears database.

1. Convert source data to ASN.1/BER format.

Data must be in BER format in order for the ZBase component to access and retrieve it. To make this happen for a Pears database, the Bartlett utility that operates behind SSDOT for Pears initially reads the database description configuration file to select the appropriate record handler. The record handler then pulls records from the input file and converts them to BER format. This stage is completed when the converted records are stored in a temporary journal file.
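
The conversion stage can be illustrated with a minimal sketch of BER tag-length-value framing. This is not the Bartlett record handler: the function names, the use of a plain OCTET STRING for each record, and the in-memory list standing in for the temporary journal file are all assumptions made for illustration. Real Pears records are structured according to the record handler selected from the database description configuration file.

```python
def ber_length(n: int) -> bytes:
    """Encode a length in BER definite form."""
    if n < 0x80:
        return bytes([n])  # short form: the length fits in one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body  # long form: count byte + length bytes


def ber_octet_string(payload: bytes) -> bytes:
    """Wrap payload as a universal, primitive OCTET STRING (tag 0x04)."""
    return b"\x04" + ber_length(len(payload)) + payload


def convert_to_journal(records, journal):
    """Hypothetical record handler: encode each record and append it to the journal."""
    for record in records:
        journal.append(ber_octet_string(record.encode("utf-8")))
    return journal


# Usage: the list below stands in for the temporary journal file.
journal = convert_to_journal(["first record", "second record"], [])
```

Each journal entry here starts with the tag byte 0x04, followed by the encoded length and the record bytes.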

 

2. Extract new index postings (NIPS) from BER records.

Bartlett extracts terms from the BER records according to index definitions in the database description configuration file and stores them in memory as new index postings (NIPS). The capacity of the temporary memory that holds the NIPS is internally set at 250,000 bytes (approximately 10,000 MARC records). Once memory capacity is exceeded or all NIPS have been extracted during an update, Bartlett calls the embedded utility Bosc to read the NIPS and update the indexes in the temporary journal file.
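
The buffer-and-flush behavior described above can be sketched as follows. This is a conceptual model only: the class name, the `flush` callback standing in for Bosc, the whitespace-and-punctuation tokenizer, and the per-posting size estimate (term bytes plus 4 bytes for a record ID) are assumptions, not the internals of Bartlett. Only the 250,000-byte capacity comes from the text.

```python
import re
from collections import defaultdict

NIP_CAPACITY_BYTES = 250_000  # capacity figure quoted in the text


class NipBuffer:
    """Holds new index postings in memory and flushes them when full."""

    def __init__(self, flush, capacity=NIP_CAPACITY_BYTES):
        self.flush = flush  # hypothetical stand-in for the call to Bosc
        self.capacity = capacity
        self.postings = defaultdict(list)  # term -> list of record IDs
        self.size = 0

    def add(self, term, record_id):
        self.postings[term].append(record_id)
        # Assumed cost of one posting: the term's bytes plus a 4-byte record ID.
        self.size += len(term.encode("utf-8")) + 4
        if self.size > self.capacity:
            self.drain()  # capacity exceeded: hand the postings over

    def drain(self):
        """Flush whatever is buffered; also called once all records are read."""
        if self.postings:
            self.flush(dict(self.postings))
            self.postings.clear()
            self.size = 0


def extract_terms(text):
    """Assumed index definition: lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())
```

A caller would feed every extracted term through `add()` and call `drain()` once at the end of the input, which mirrors the two flush conditions named in the text: capacity exceeded, or all NIPS extracted.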

 

3. Commit records and indexes to the database.

Designated records and indexes are committed to a Pears database in two stages:

  • Safe Commit: This commit adds new regions to the end of the <database>.pdb file based on updates it receives from the temporary journal file during the update process.
  • Critical Commit: During this final commit, the old regions in the <database>.pdb file are overwritten.
 

 

Note: A safe commit gets its name from the fact that new regions containing records and indexes are appended to the end of the <database>.pdb file. If something goes wrong during the update process, these regions can be quickly and automatically backed out and the database returned to its previous condition. Existing regions are not overwritten with new ones until the critical commit. Even if a critical commit fails, data is not permanently lost or corrupted: Bartlett keeps the temporary journal.pdb file so that it can be played back against the <database>.pdb file.
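
The difference between the two commits can be modeled with a toy in-memory sketch. Everything here is hypothetical: the class, its method names, and the representation of a .pdb file as a Python list of regions are illustrative assumptions, and the model covers only replacement of existing regions. It also assumes that the appended copies are dropped after a critical commit, which the text does not state.

```python
class PdbModel:
    """Toy model of a <database>.pdb file as a list of regions."""

    def __init__(self, regions):
        self.regions = list(regions)
        self.committed_len = len(self.regions)  # regions present before the update
        self.pending = {}  # old region index -> index of its appended replacement

    def safe_commit(self, index, new_region):
        """Append a replacement for region `index`; existing regions stay untouched."""
        self.regions.append(new_region)
        self.pending[index] = len(self.regions) - 1

    def back_out(self):
        """Undo all safe commits by discarding the appended regions."""
        del self.regions[self.committed_len:]
        self.pending.clear()

    def critical_commit(self):
        """Overwrite old regions with their appended replacements."""
        for old, new in self.pending.items():
            self.regions[old] = self.regions[new]
        del self.regions[self.committed_len:]  # assumption: appended copies are dropped
        self.pending.clear()


# Usage: a failed update is backed out, a successful one is made final.
db = PdbModel(["region 0", "region 1"])
db.safe_commit(1, "region 1 (updated)")
db.back_out()  # database returns to its previous condition
db.safe_commit(0, "region 0 (updated)")
db.critical_commit()  # the old region 0 is now overwritten
```

Because `safe_commit` only appends, `back_out` never has to restore overwritten data, which is the property the note above describes.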

 


See Also

Pears System Overview
Pears Record Handlers
Creating a New Pears Database
Pears Database Description Configuration File

