Creating a Pears Database Description Configuration File Based on an Existing Newton .dsc File
Contents
Introduction
General Guidelines
[Bartlett] Section
[LockServer] Section
[DB] Section
[Handleinput_record_type] Section
[index_definition] Sections
Examples from the DC Database Framework
Examples from the MARCCat Database Framework
Introduction
You must have a Pears database description configuration file to convert a
Newton database to a Pears database. This document provides some guidelines
for creating a Pears database description configuration file with the same
characteristics as an existing Newton database description (.dsc) file.

You may also find these guidelines useful when you are creating a database
description file for a new Pears database and you want to base the database
on an existing Newton database.

Since this document supplements the Pears Database Description Configuration
File document, it organizes these guidelines according to the structure of
the database description configuration file.
General Guidelines
- The sample Pears databases available on the SiteSearch FTP site include
  database description configuration files for each database. You may find
  these useful as examples.
- As with WebZ configuration files:
  - use the # character to comment out all or part of a line
  - use #include to incorporate the contents of another configuration file
    into a file
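The comment rule above can be sketched in Python. This is only an illustration, not the Pears or WebZ parser: the function name strip_comment and the file name shared.ini are hypothetical, and the sketch assumes that a line beginning with #include is a directive rather than a comment.

```python
def strip_comment(line: str) -> str:
    """Return the line without its '#' comment, keeping #include lines whole."""
    stripped = line.strip()
    # Assumption: a line that begins with "#include" is a directive, not a comment.
    if stripped.startswith("#include"):
        return stripped
    # Otherwise everything from the first '#' to the end of the line is a comment.
    return line.split("#", 1)[0].rstrip()


sample = [
    "# a whole-line comment",
    "index = 2   # a trailing comment",
    "#include shared.ini",  # hypothetical file name
]
for text in sample:
    print(repr(strip_comment(text)))
```

A whole-line comment reduces to an empty string, a trailing comment is cut off, and the #include line passes through unchanged.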
[Bartlett] Section
[LockServer] Section
[DB] Section
- The input record type for converting a Newton database to a Pears database
  is DB. For a new Pears database or for subsequently adding new records to a
  converted Pears database, use the record handler appropriate to the type of
  input records you wish to add to the database.
- The Name variable is equivalent to the database identification in the .dsc
  file.
- The blocksize and filename variables take the place of the database file
  definitions in the .dsc file.
- The RecordIDIndex variable denotes the index for the field that contains
  the unique identifier in each record. This is equivalent to the Newton
  internalaccnindx value.
[Handleinput_record_type] Section
[index_definition] Sections
- See Pears-Newton Indexing Routine Comparison to create Pears index
  definitions that duplicate Newton indexing routines.
- You do not need to define sparse indexes in Pears. Pears creates the
  equivalent of a sparse index automatically.
- You can define term adjacency for any keyword index by adding this variable
  to its [index_definition] section:
      OccurrenceRoutine = ORG.oclc.pears.Bartlett.wordfield
- You can define plural indexes for keyword indexes by using the
  ORG.oclc.pears.IndexRoutines.PluralWords indexing routine.
- You can define a restrictor index by using ORG.oclc.pears.Bartlett.termrest
  as the indexing routine, specifying the index ID of the index for which you
  want to create a restrictor index, and providing a list of the restrictor
  values. You do not need to specify the size of the bitmaps that hold all
  restrictor values (the Newton restrictsize value). See Examples from the
  MARCCat Database Framework.
- If you make a Newton database available to patrons through the WebZ
  interface, keep the same index numbers in its Pears database description
  configuration file as those in the existing .dsc file. Then you won't have
  to change the alternateID variables for the database's index definitions in
  its WebZ database configuration file.
- Place each tagpath reference (which denotes a field to be indexed) on a
  separate line. You can use either a numeral in each tagpath variable
  (tagpath1, tagpath2, and so on) or an asterisk (tagpath*), but you cannot
  mix numerals and asterisks in the same index definition. If you use
  numerals, they must be in sequential order, with no numbers skipped.
- You define stopwords in a special index definition, usually named
  [stopwords]. See the example below and the [stopwords] section of Pears
  Database Description Configuration File.
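The tagpath naming rule in the list above lends itself to a quick check before you build a database. The following Python sketch is an illustration only and is not part of Pears: the function name check_tagpath_keys is hypothetical, and it assumes that numbered variables start at tagpath1, as the "tagpath1, tagpath2, and so on" wording suggests. Variable names in any other form (for example, the plain tagpath used in [stopwords]) are ignored.

```python
import re


def check_tagpath_keys(keys):
    """Return a list of problems in the tagpath variable names of one index definition."""
    numbers = []
    stars = 0
    for key in keys:
        match = re.fullmatch(r"tagpath(\*|\d+)", key)
        if match is None:
            continue  # not a numbered or asterisk tagpath variable; ignored here
        if match.group(1) == "*":
            stars += 1
        else:
            numbers.append(int(match.group(1)))

    problems = []
    if stars and numbers:
        problems.append("mixes numerals and asterisks")
    # Assumption: numbering starts at 1 and follows the order of the lines.
    if numbers and numbers != list(range(1, len(numbers) + 1)):
        problems.append("numerals are not sequential starting at 1")
    return problems


print(check_tagpath_keys(["tagpath*", "tagpath*"]))
print(check_tagpath_keys(["tagpath1", "tagpath3"]))
```

The keys are passed in the order they appear in the index definition, so a definition that lists tagpath2 before tagpath1 is also reported.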
The remainder of this section provides paired examples of the Pears and
Newton index definitions from the Record Builder DC(2) and MARCCat database
frameworks provided with the Record Builder application. Even if you don't
use Record Builder, these examples show how index definitions compare in
Pears and Newton. Look at the examples from both database frameworks; they
illustrate principles that apply to many types of databases.
Note: For readability, some parameters in the examples use a backslash (\)
as a continuation character or break a value across indented lines, such as:
    routine = \
        ORG.oclc.pears.IndexRoutines.
        PluralWords
In the database description configuration file, keep each parameter on a
single line, like this:
    routine = ORG.oclc.pears.IndexRoutines.PluralWords
or use backslash characters as continuation characters if necessary.
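To compare an example that uses continuation characters with its single-line form, the backslash convention described in the note can be sketched in Python. This is an illustration under stated assumptions, not the Pears parser: the function name join_continuations is hypothetical, and the sketch assumes that a trailing backslash is replaced by a single space when the following line is appended.

```python
def join_continuations(lines):
    """Merge each line that ends with a backslash into the line that follows it."""
    joined = []
    pending = ""
    for raw in lines:
        text = raw.strip()
        if text.endswith("\\"):
            # Drop the backslash and keep collecting the rest of the parameter.
            pending += text[:-1].rstrip() + " "
        else:
            joined.append(pending + text)
            pending = ""
    if pending:
        # The input ended on a continuation line; keep what was collected.
        joined.append(pending.rstrip())
    return joined


example = [
    "routine = \\",
    "ORG.oclc.pears.IndexRoutines.PluralWords",
    "tagpath* = 103/1001",
]
for line in join_continuations(example):
    print(line)
```

The sketch handles only backslash continuations. A value that is merely broken across indented lines without a backslash is left as separate lines.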
Examples from the DC Database Framework

Pears:
    [BasicIndex]
    index = 2
    routine = \
        ORG.oclc.pears.IndexRoutines.Words
    OccurrenceRoutine = \
        ORG.oclc.pears.Bartlett.wordfield
    tagpath* = 101/1001
    tagpath* = 101/1011/1001
    tagpath* = 102/1001
    tagpath* = 102/1011/1001
    tagpath* = 103/1001
    tagpath* = 103/1011/1001
    tagpath* = 104/1001
    tagpath* = 104/1011/1001
    ....
    tagpath* = 116/1001
    tagpath* = 90/202/1001
    tagpath* = 90/201/1001
Newton:
    /* Basic Index */
    index(2): plural words() from(\
    /* DC:Type */101/1001, 101/1011/1001, \
    /* DC:Format */102/1001, 102/1011/1001, \
    /* DC:Description */103/1001, \
    103/1011/1001, \
    /* DC:Language*/104/1001, 104/1011/1001, \
    ....
    /* DC:Author */116/1001, \
    /* SubjectPhrase */ 90/202/1001, \
    /* DeweyNumber */ 90/201/1001) \
    with(fldid, pos)

Pears:
    [DC:Type]
    index = 101
    routine = \
        ORG.oclc.pears.IndexRoutines.Words
    OccurrenceRoutine = \
        ORG.oclc.pears.Bartlett.wordfield
    tagpath* = 101/1001
    tagpath* = 101/1011/1001
Newton:
    /* DC:Type */
    index(101): sparse words() \
    from(101/1001, 101/1011/1001) \
    with(fldid, pos)

Pears:
    [DC:Description]
    index = 103
    routine = \
        ORG.oclc.pears.IndexRoutines.
        PluralWords
    OccurrenceRoutine = \
        ORG.oclc.pears.Bartlett.wordfield
    tagpath* = 103/1001
    tagpath* = 103/1011/1001
Newton:
    /* DC:Description */
    index(103): plural words() \
    from(103/1001, 103/1011/1001) \
    with(fldid, pos)

Pears:
    [DC:Identifier]
    index = 156
    routine = \
        ORG.oclc.pears.IndexRoutines.Words
    startOffset= 0
    OccurrenceRoutine = \
        ORG.oclc.pears.Bartlett.wordfield
    tagpath* = 106/1001
    tagpath* = 106/1011/1001
Newton:
    /* DC:Identifier */
    index(156): substr1(0) \
    from(106/1001, 106/1011/1001) \
    with(fldid, pos)

Pears:
    [DC:Title]
    index = 158
    routine = \
        ORG.oclc.pears.IndexRoutines.Phrase
    tagpath* = 108/1001
    tagpath* = 108/1011/1001
Newton:
    /* DC:Title */
    index(158): phrase2() \
    from(108/1001, 108/1011/1001)

Pears:
    [stopwords]
    index = 0
    routine = \
        ORG.oclc.pears.IndexRoutines.
        StopwordEnforcer
    tagpath = none
    stopword* = a
    stopword* = an
    stopword* = and
    stopword* = are
    stopword* = as
    stopword* = at
    stopword* = be
    stopword* = but
    stopword* = by
    stopword* = for
    ....
    stopword* = was
    stopword* = which
    stopword* = with
    stopword* = you
Newton:
    begin stopwords
    a
    an
    and
    are
    as
    at
    be
    but
    by
    for
    ....
    which
    with
    you
    end stopwords
Examples from the MARCCat (MARC Catalog) Database Framework

Pears:
    [rule2]
    index = 201
    routine = \
        ORG.oclc.pears.IndexRoutines.Words
    tagpath* = 1
    startOffset = 3
Newton:
    /* ID */
    index(201): sparse substr1(3) from \
    (1) with(fldid, pos)

Pears:
    [rule4]
    index = 1
    routine = \
        ORG.oclc.pears.IndexRoutines.Phrase
    tagpath* = 245
    subfield* = 1
    subfield* = 2
    joinFieldsWith = \u0020\u0020
Newton:
    index(01): combab() from(\
    /* 245 */245/01\
    ) with(fldid, POs)

Pears:
    [rule5]
    index = 2
    routine = \
        ORG.oclc.pears.IndexRoutines.
        PluralWords
    tagpath* = 130/01
    tagpath* = 130/20
    tagpath* = 130/14
    tagpath* = 130/16
    tagpath* = 130/11
    tagpath* = 130/19
    tagpath* = 130/07
    tagpath* = 130/04
    tagpath* = 211/01
    tagpath* = 212/01
    tagpath* = 214/01
    tagpath* = 222/01
    tagpath* = 222/02
    ...
    tagpath* = 830/01
    tagpath* = 830/20
    tagpath* = 830/14
    tagpath* = 830/16
    tagpath* = 830/11
    tagpath* = 830/19
    tagpath* = 830/04
    tagpath* = 830/22
    tagpath* = 873/01
Newton:
    /* Titles */
    index(02): plural words() from(\
    /* 130 */130/01, 130/20, 130/14, \
    130/16, 130/11, 130/19, 130/07, \
    130/04 \
    /* 211 */ 211/01\
    /* 212 */ 212/01\
    /* 214 */ 214/01\
    /* 222 */ 222/01, 222/02\
    ...
    /* 830 */ 830/01, 830/20, 830/14, \
    830/16, 830/11, 830/19, 830/04, \
    830/22 \
    /* 873 */ 873/01\
    ) with(fldid, POs)

Pears:
    #---LCCLASS---#
    [rule11]
    index = 18
    routine = \
        ORG.oclc.pears.IndexRoutines.LCClass
    tagpath* = 050/01
    tagpath* = 050/04
    tagpath* = 090/01
    tagpath* = 055/01
Newton:
    /* LC-type Class Numbers */
    index(18): lcclass() from(\
    /* 050 */050/01, 050/04\
    /* 090 */ 090/01\
    /* 055 */ 055/01\
    ) with(fldid, POs)

Pears:
    #---DeweyDecimalClass---#
    [rule12]
    index = 19
    routine = \
        ORG.oclc.pears.IndexRoutines.Phrase
    collapse = ?!,:;&_-<>[]@
    tagpath* = 082/01
    tagpath* = 092/01
Newton:
    /* Dewey Decimal Class Numbers */
    index(19): ddc() from(\
    /* 082 */082/01\
    /* 092 */ 092/01\
    ) with(fldid, POs)

Pears:
    #---Publication date ---#
    [rule23]
    index = 30
    routine = \
        ORG.oclc.pears.IndexRoutines.
        PublicationDate
    tagpath1 = 008
    startOffset = 7
    maxLength = 4
Newton:
    /* Publication date */
    index(30): notrangeable pubdate() from(008)

Pears:
    #---YearRestrictor---#
    [restrictor3]
    index = 30
    routine = \
        ORG.oclc.pears.Bartlett.termrest
Newton:
    restrict(30): year_strct(1) from(008) mask(0000001111111111)\
    norm(1077)

Pears:
    [rule25]
    index = 31
    routine = \
        ORG.oclc.pears.IndexRoutines.
        MarcLanguage
    tagpath* = 008
    startOffset = 35
Newton:
    /* Language */
    index(31): sparse marcla() from (008)

Pears:
    #---LanguageRestrictor---#
    [restrictor1]
    index = 31
    routine = \
        ORG.oclc.pears.Bartlett.termrest
    parameters = english german french
Newton:
    restrict(31): lang_strct(1) from(008)\
    mask(1100000000000000)\
    terms("English","German","French")

Pears:
    #---FormatRestrictor---#
    [Restrictor2]
    index = 4
    routine = \
        ORG.oclc.pears.Bartlett.termrest
    parameters = bks map mix com sco \
        ser rec vis
Newton:
    restrict(254): marc_rectyp_strct(1) from(00) mask(0011110000000000)\
    terms("bks","ser","med","map","mss","rec","sco",\
    "mrf","dx")
See Also
Converting a Newton Database to a Pears Database
Creating a New Pears Database
Pears Database Description Configuration File
Pears-Newton Indexing Routine Comparison
Pears Record Handlers
Pears Indexing Routines