Product Information

Glossary

The glossary is updated as needed. If you don't see a term, tell us and we'll add it.

# |  A |  B |  C |  D |  E |  F |  G |  H |  I |  J |  K |  L |  M |  N |  O
P |  Q |  R |  S |  T |  U |  V |  W |  X |  Y |  Z

A

Accession Number
A number assigned to a database record when the record is entered into the database. This number uniquely identifies the record within the database. SiteSearch uses accession numbers as unique identifiers to locate and replace or delete records in a database.
ASN.1
An ISO standard (ISO 8824) language for describing the data in records. ASN.1, along with BER, facilitates the exchange of data between applications over networks, independent of machine architecture and implementation language.

B

BER
1. An ISO standard (ISO 8825) for encoding records, providing a universal representation of data values.
2. A set of rules describing a mechanism to encode the abstract notation of ASN.1 into bit-strings. OCLC's Newton database records, as well as various SiteSearch interprocess communication messages, are BER-encoded records.
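As a rough illustration of what "encoding abstract notation into bit-strings" means, the sketch below builds BER tag-length-value (TLV) triples for two ASN.1 universal types. It assumes single-byte tags and definite-length encoding only. The function names are made up for this example, and the output is generic BER, not the layout of an actual SiteSearch or Newton record.

```python
def encode_length(n: int) -> bytes:
    """Definite-length form: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Long form: first byte has the high bit set and counts the length bytes.
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Build one tag-length-value triple (single-byte tags only)."""
    return bytes([tag]) + encode_length(len(value)) + value


def encode_integer(value: int) -> bytes:
    """ASN.1 INTEGER (universal tag 2) as minimal two's complement."""
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1
    return encode_tlv(0x02, value.to_bytes(size, "big", signed=True))


def encode_octet_string(data: bytes) -> bytes:
    """ASN.1 OCTET STRING (universal tag 4), primitive form."""
    return encode_tlv(0x04, data)


print(encode_integer(5).hex())           # 020105
print(encode_octet_string(b"Hi").hex())  # 04024869
```

The same three-part pattern repeats at every level of a BER record, which is why nested fields can be addressed by a path of tags.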
ber2txt
A SiteSearch utility program that converts ASN.1/BER data into ASCII for data analysis and debugging. It can also be used to validate the structure of an ASN.1/BER file.

D

Database
See SiteSearch Database.
Database Build Process
The overall process by which ASN.1/BER data is converted into the five database files (HEDR, HDIR, POST, PDIR, and INDX) that make up a searchable SiteSearch database. Because the process involves several intermediate conversions, you typically run the build process from within the SiteSearch Database Operations Tool (SSDOT), which performs the necessary conversions automatically.
Database Build Software
The SiteSearch database utility programs that are used in the database build process to create a SiteSearch database. These programs include initdb, pippin, sortnip, and rome.
Database Description (.dsc) File
A text file used in the database build process to explicitly define how your database is built. The file contains definitions for database identification, database files, term adjacency, indexing routines, and more. The database build software uses the .dsc file to initialize the physical database files (HEDR, HDIR, INDX, PDIR, and POST) and build the searchable indexes that make up the database.
Database Files
The five files (HEDR, HDIR, POST, PDIR, and INDX) that together make up a searchable SiteSearch database. These files are created during the database build process.
Diacritic
A character or symbol that has no standard keyboard equivalent, such as â, æ, or ç. Because they cannot be typed on a standard keyboard, diacritics are typically stripped from source data during the database build process. The pippin utility program strips or substitutes diacritics during the build, and the substitution tables it uses can be customized.
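The general idea of stripping and substituting diacritics can be sketched in a few lines. This is not pippin's implementation, and the SUBSTITUTIONS table is a hypothetical stand-in for pippin's customizable tables, whose format is not described here. The sketch assumes Unicode input and uses only the Python standard library.

```python
import unicodedata

# Hypothetical substitution table for characters that do not decompose
# into a base letter plus a combining mark.
SUBSTITUTIONS = {"æ": "ae", "Æ": "AE", "ø": "o", "Ø": "O", "ß": "ss"}


def strip_diacritics(text: str) -> str:
    """Replace table entries, then drop combining marks from everything else."""
    out = []
    for ch in text:
        if ch in SUBSTITUTIONS:
            out.append(SUBSTITUTIONS[ch])
            continue
        # NFD splits "â" into "a" followed by a combining circumflex.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


print(strip_diacritics("façade"))  # facade
print(strip_diacritics("æther"))   # aether
```

Stripping at build time means a search for "facade" can match a record that originally contained "façade".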
Document Type Definition (.dtd)
1. In SGML, a .dtd is a text file that describes the structure of a document and defines the framework for the elements (such as chapters and chapter headings, sections, and topics) that constitute a document.
2. In the SiteSearch system, a .dtd is a text file used by the sgmlconv utility program to map fields within an SGML source file to fields within an ASN.1/BER file.

F

Field ID
Within a BER record, a field ID is a number that identifies a particular field of data within the record. For example, if the author's name is contained in field 23, its field ID is 23. Because BER records are structured in a numerical, nested hierarchy, a field ID can also refer to a data field's entire ID path. For example, if the author's name above is nested within an author information field with an ID of 5, then the field ID for the author's name could also be written as 5/23. Field IDs represented this way are also called BER "tag paths."
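The 5/23 example can be modeled with nested dictionaries keyed by field ID. This is only a sketch: the record contents and the find_by_tag_path helper are invented for illustration, and it assumes each field ID appears at most once per level, which real BER records do not guarantee.

```python
def find_by_tag_path(record, tag_path):
    """Walk a nested record one field ID at a time; return None if missing."""
    node = record
    for part in tag_path.split("/"):
        field_id = int(part)
        if not isinstance(node, dict) or field_id not in node:
            return None
        node = node[field_id]
    return node


# Made-up record: field 5 holds author information, field 23 the author's name.
record = {
    1: "example-accession-0001",
    5: {23: "Example Author", 24: "Example Affiliation"},
}

print(find_by_tag_path(record, "5/23"))  # Example Author
print(find_by_tag_path(record, "5/99"))  # None
```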

I

Index
A collection of like data (e.g., authors' names, titles, publication dates) that can be searched within a database. Thus, when you search a SiteSearch database, you must specify the index you wish to search, such as the author index, title index, keyword index, etc. Indexes within a SiteSearch database are created according to the index definitions contained in the database's description (.dsc) file.
Index Definition
Explicitly defines how a particular index within a SiteSearch database is created during the database build process. Index definitions are specified in the database description (.dsc) file. The definition itself contains instructions to the database build software on how to construct the index, specifying such attributes as whether the index is a keyword or phrase index, which BER fields to include in the index, whether to include term adjacency information within the index, etc.
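The difference between a keyword index and a phrase index can be shown with a toy in-memory model. This is a conceptual sketch only: it does not use .dsc syntax, does not reflect how the database build software stores postings, and omits term adjacency. The build_index function, the field names, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(records, field, kind):
    """Map each term to the set of accession numbers whose field contains it.

    kind="keyword" indexes every word separately;
    kind="phrase" indexes the whole field value as a single term.
    """
    postings = defaultdict(set)
    for accession, fields in records.items():
        value = fields.get(field, "")
        if not value:
            continue
        terms = value.lower().split() if kind == "keyword" else [value.lower()]
        for term in terms:
            postings[term].add(accession)
    return postings


# Made-up records keyed by accession number.
records = {
    1: {"title": "Database Design"},
    2: {"title": "Design Patterns"},
}

keyword_index = build_index(records, "title", "keyword")
phrase_index = build_index(records, "title", "phrase")

print(sorted(keyword_index["design"]))          # [1, 2]
print(sorted(phrase_index["database design"]))  # [1]
```

In this model a keyword search for "design" finds both records, while the phrase index matches only the complete field value.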
initdb
A SiteSearch Database Builder utility program that initializes the five database files (HEDR, HDIR, POST, PDIR, and INDX) according to the definitions contained in the database description (.dsc) file. The initdb program can also be used to update the database description information contained in the HEDR file and to enlarge the database files.

M

MARC
Machine-Readable Cataloging (MARC). A standard format for storing bibliographic information in electronic format.
marcconv
A SiteSearch utility program that converts MARC data into ASN.1/BER format.

N

Newton
The search engine specifically designed for searching SiteSearch databases. Newton is included as a standalone engine with SiteSearch 3.1 and previous versions. Newton functionality is incorporated into the ZBase package in SiteSearch 4.0 and future versions. See also Pears.

O

Overcite
A SiteSearch 3.1 program that formats BER data for display (i.e., within a browser). Overcite relies on special instructions contained within formatter control (.fcl) files to process BER data. Processing instructions tell Overcite what data to extract from a BER record, what information to add to the beginning/end of a particular piece of data, how to handle special characters and diacritics, etc. Overcite is included as a standalone program in SiteSearch 3.1 and previous versions. Overcite functionality is incorporated into the various members of the ORG.oclc.formatter package of SiteSearch 4.0.

P

Pears
Pears is a new database engine shipped with SiteSearch 4.2.0. The OCLC Office of Research developed Pears as a replacement for the Newton database engine used to build databases for many OCLC products and shipped with SiteSearch 4.0.x/4.1.x. The Pears source code is available from the OCLC Office of Research under an Open Source license. For more information, see the Pears System Overview.
pippin
A SiteSearch Database Builder utility program that 1) extracts index terms from ASN.1/BER data and stores them in a new index postings (NIP) file, 2) stores (or updates) BER data in the HEDR database file, and 3) updates the HDIR database file.

S

SGML
A generic markup language for representing documents. SGML aims to separate information from its presentation and thus facilitate different presentations of the same information. It is an ISO standard (ISO 8879:1986) published in 1986 and amended in 1988 (Amendment 1:1988). HTML is an application of SGML.
sgmlconv
A SiteSearch utility program that converts tagged data such as SGML into ASN.1/BER format. The sgmlconv program relies on a document type definition (.dtd) file for instructions on how to convert the tagged source data into ASN.1/BER format. It can also be used to generate a list of fields from tagged source data.
SiteSearch
A complete and customizable solution for creating, integrating, and accessing data resources and providing these resources to users worldwide.
SiteSearch Database
A database created with the SiteSearch database building tools for use with the SiteSearch system. A SiteSearch database consists of five files (HEDR, HDIR, POST, PDIR, and INDX) that together make it searchable.
SiteSearch Database Operations Tool (SSDOT)
A utility program for creating and managing SiteSearch databases.

T

Tag
Within a BER record, a tag is the numerical identifier, or label, of a data element. The terms "tag" and "field ID" are synonymous in regard to BER files. Within an SGML or other tagged data record, a tag is the structural label that identifies a data element, such as <author>, <title>, or <pub_date>.
Tag Path
The location of a data element within a BER record expressed as a list of numerical identifiers reflecting the structure of the BER record itself. For example, a data element with a "tag path" of 5/23 is located in node 5, sub-node 23.

W

WebZ Entity Language (WZEL)
A programming language used in SiteSearch 3.1 and previous versions for setting the values of entities, testing those values, and providing dynamic behaviors in documents based on them. WZEL "code" is included in the header section of HTML files and is executed by httpgate before the file is returned to the browser. WZEL code is not supported in SiteSearch 4.0.

Z

Z39.50
1. A computer-to-computer communications protocol designed to support searching and retrieval of information (full text documents, bibliographic data, images, and multimedia) from databases in a distributed network environment.
2. The identification number for "Information Retrieval Application Service Definition and Protocol Specifications for Library Applications", an ANSI standard adopted in 1988 and revised in 1995.
zclient
A SiteSearch 4.x utility program that acts as a minimal Z39.50 client used to test a database through a Z39.50 server.
zdemo
A SiteSearch 3.1 utility program that acts as a minimal Z39.50 client used to test a database through a Z39.50 server. The SiteSearch 4.x version of this program is zclient.
