Index Definitions

Index definitions define the indexes (e.g., author, subject, title) of an Open SiteSearch 4.0.x/4.1.x database in the database description (.dsc) file. They specify which ASN.1/BER fields are indexed and how the pippin utility program creates each index (keyword, phrase, sparse terms, term adjacency, and so on). Review the syntax description and index definition example below for further information.

Syntax

index(index_id):[sparse] [notrangeable] [plural] routine(param) from(fieldlist) [with(adjacency variable list)]

Element

Description

index(index_id):

Begins the index definition and assigns an id number used by the Open SiteSearch Database Builder 4.0.x/4.1.x software and the Newton search engine. index_id can be any integer between 1 and 254.

sparse

Instructs the Database Builder software to create an additional index entry that groups the index's terms together in the database. A language index, for example, may have only 30 terms in total. When the index is defined as a sparse index, its terms are bundled together at the top of the index list instead of being stored among unrelated terms.

This option is intended for use on indexes that have a limited number of sparsely distributed terms. When the postings for a sparse index are browsed, the scan is performed faster when the terms are stored together than if they were distributed throughout the database.

notrangeable

Indicates to the Newton search engine that ranging is not permitted on the index.

plural

Allows the index to be searched for both singular and plural forms of a search term. Pluralized searches occur when the patron enters the pluralization character (e.g., '+') at the end of the search term. Pluralized versions of a search term are then created by appending the items in the plural endings list to the search term and issuing each version as a separate query. See Plural Endings for more information.

routine(param)

Specifies which routine to use to create the index terms. See Index Routines for descriptions of the various routines available for use.

from(fieldlist)

Specifies the fields in the ASN.1/BER data to be indexed, where fieldlist is an ASN.1/BER tag path. fieldlist can contain multiple, comma-separated fields.

with(adjacency variable list)

Specifies which term adjacency definition(s) to apply to the index, where adjacency variable list can be a comma-separated list of definitions. See Term Adjacency Definitions for information about positional searching capabilities.
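
The plural option described above can be illustrated with a short Python sketch. This is not SiteSearch code: the function name expand_plural, the default '+' pluralization character, and the sample endings "s" and "es" are assumptions made for the example, and the real endings come from the plural endings list of your database. The sketch also assumes that the singular form is issued as a query alongside the pluralized versions.

```python
def expand_plural(term, endings, plural_char="+"):
    """Return the query terms for a possibly pluralized search term.

    `endings` stands in for the database's plural endings list and
    `plural_char` for the pluralization character (both hypothetical here).
    """
    if not term.endswith(plural_char) or term == plural_char:
        # No pluralization requested: search the term exactly as entered.
        return [term]
    stem = term[: -len(plural_char)]
    # Singular form first, then one variant per plural ending.
    return [stem] + [stem + ending for ending in endings]


# Each returned term would be issued as a separate query.
print(expand_plural("box+", ["s", "es"]))  # ['box', 'boxs', 'boxes']
print(expand_plural("box", ["s", "es"]))   # ['box']
```

Because every ending is appended mechanically, some generated terms (such as "boxs") are not real words. They simply match nothing in the index.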

Example

Below is an example of three index definitions. Index 01 is a keyword subject index that allows plural searching. Index 02 is a keyword author index. Index 03 is a phrase author index. Because author names are indexed both as keywords (index 02) and as phrases (index 03), users can search for all or part of an author's name, in any order, and still retrieve the matching records. Also notice the term adjacency notation 'with(fldid, pos)' at the end of index 01 and index 02, which allows those indexes to be searched using the adjacency operators WITH and NEAR.


/*********************************************************/
/* BASIC (words) */
/*********************************************************/
Index(01): plural words() from(\
/* 026 */026/01, 026/02, 026/03, 026/04, 026/05, 026/06, \
026/07, 026/08, 026/09, 026/10, 026/11, \
/* 035 */ 035/01, \
/* 135 */ 135/01, \
/* 036 */ 036/01, \
/* 038 */ 038/02, 038/03, 038/04, 038/05, 038/06, \
/* 038 */ 038/07, 038/08, 038/09, 038/10, 038/11, 038/12, \
/* 044 */ 044/01\
) with(fldid, pos)

/*********************************************************/
/* Author (words) (au) */
/*********************************************************/
Index(02): words() from(\
/* 027 */027/01\
) with(fldid, pos)

/*********************************************************/
/* Author (phrases) (au) */
/*********************************************************/
Index(03): phrase2() from(\
/* 027 */027/01)

Notice in the example above that the "/*" and "*/" symbols mark the beginning and end of a comment (file documentation), and that the "\" symbol acts as a line continuation marker, which lets you break a long line of text into several lines in the file.
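
To make the effect of the comment and continuation symbols concrete, the following Python sketch reduces a .dsc fragment to its logical lines. It is an illustration only, not the way the Database Builder software reads the file: the function name preprocess_dsc is hypothetical, and the order of operations (comments removed first, continuation lines joined second) is an assumption.

```python
import re


def preprocess_dsc(text):
    """Strip /* ... */ comments and join lines that end with a backslash."""
    # Remove comments; DOTALL lets a single comment span several lines.
    no_comments = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    logical_lines = []
    pending = ""
    for raw in no_comments.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            # Continuation marker: drop the backslash and keep accumulating.
            pending += line[:-1]
            continue
        pending += line
        if pending.strip():
            logical_lines.append(pending.strip())
        pending = ""
    if pending.strip():
        # A trailing continuation with nothing after it.
        logical_lines.append(pending.strip())
    return logical_lines


sample = (
    "/* Author (words) (au) */\n"
    "Index(02): words() from(\\\n"
    "/* 027 */027/01\\\n"
    ") with(fldid, pos)\n"
)
print(preprocess_dsc(sample))
# ['Index(02): words() from(027/01) with(fldid, pos)']
```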

Multiple tag paths in an index definition (and throughout the .dsc file) are separated by a comma followed by a blank space. However, the first field id listed in an index definition should start at the beginning of the line or directly after file documentation. DO NOT include a blank space before the first field id. Refer to the first line of index 01 above for an example.
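
As a rough illustration of how the syntax and the separator rule fit together, the Python sketch below splits one index definition into its elements. It assumes the definition has already been reduced to a single line with comments and continuation markers removed. The pattern, the function names, and the case-insensitive matching of the "index" keyword are assumptions made for the example, not the parser used by the Database Builder software. Routine parameters that contain parentheses are not handled.

```python
import re

# One logical line: index(id): [options] routine(param) from(fields) [with(adj)]
INDEX_RE = re.compile(
    r"^index\((?P<id>\d+)\):\s*"
    r"(?P<flags>(?:(?:sparse|notrangeable|plural)\s+)*)"
    r"(?P<routine>\w+)\((?P<param>[^)]*)\)\s*"
    r"from\((?P<fields>[^)]*)\)\s*"
    r"(?:with\((?P<adj>[^)]*)\))?\s*$",
    re.IGNORECASE,
)


def split_list(text):
    """Split a comma-separated list and drop surrounding blanks."""
    return [part.strip() for part in text.split(",") if part.strip()]


def parse_index_definition(line):
    """Return the elements of one index definition as a dictionary."""
    match = INDEX_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an index definition: {line!r}")
    index_id = int(match.group("id"))
    if not 1 <= index_id <= 254:
        # index_id can be any integer between 1 and 254.
        raise ValueError(f"index_id {index_id} is outside 1..254")
    return {
        "index_id": index_id,
        "flags": match.group("flags").lower().split(),
        "routine": match.group("routine"),
        "param": match.group("param").strip(),
        "fields": split_list(match.group("fields")),
        "adjacency": split_list(match.group("adj") or ""),
    }


parsed = parse_index_definition(
    "Index(01): plural words() from(026/01, 026/02) with(fldid, pos)"
)
print(parsed["flags"], parsed["routine"])     # ['plural'] words
print(parsed["fields"], parsed["adjacency"])  # ['026/01', '026/02'] ['fldid', 'pos']
```

A definition that omits a comma between two tag paths still matches this pattern, but the two paths end up in a single list entry, so checking each entry of "fields" is a simple way to spot that kind of typing error.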

See Also

Creating a Database Description (.dsc) File
Database Description (.dsc) File: Structure and Syntax
Database Description (.dsc) File Example
Creating a New SiteSearch Database

