Database Tag Definition (.dtd) File Example

The Open SiteSearch Database Builder 4.0.x/4.1.x software uses the sgmlconv utility to convert source data formatted in Standard Generalized Markup Language (SGML) to ASN.1/BER format as part of the database creation process. Sgmlconv needs a database tag definition (.dtd) file to serve as a data map of the SGML source data it converts. This document includes three example files that illustrate the input and output of the SGML conversion process: example.sgml is the SGML source data file, example.dtd is the corresponding .dtd file, and example.ber contains the ASN.1/BER record that results from converting the SGML source data. The introduction to each example below points out the aspects of that file that deserve particular attention when you create and edit your own files.

Example

This example contains an SGML source data file, example.sgml. Identify the fields below and their structural relationships to one another. Compare this hierarchy to the list of fields in the .dtd file, example.dtd.


<rec>

<obj_blk>

<objid>dbb_50-00-10t</objid>

<author>John Doe</author>

<status>Draft</status>

<datemod>10/14/97</datemod>

<datecomp>10/14/1997 2:02:52 PM</datecomp>

<product>dbb</product>

<page>Task</page>

<pagedescr>Includes the procedure for creating a SiteSearch database 

from Standard Generalized Markup Language (SGML) source data. Provides 

additional links with each step for more detailed 

instructions.</pagedescr>

<search>Creating a SiteSearch Database from SGML Data</search>

<para>Main -> Documentation -> Database Building -> Creating a 

SiteSearch Database -> Creating a SiteSearch Database from SGML 

Data</para>

<history_blk>

<para>None.</para>

</history_blk>

</obj_blk>

<title>Creating a SiteSearch Database from SGML Data</title>

<intro_blk>

<para>Creating a SiteSearch database from Standard Generalized Markup 

Language (SGML) source data requires an understanding of the tagging 

structure used to organize your data. Before you create the database, 

it is important to plan the database because you will be responsible 

for defining the structure and relationships between data in the first 

step, creating a .dtd file. Below is the procedure used to convert SGML 

data into ASN.1/BER format and to build a SiteSearch database from this 

BER data.</para>

</intro_blk>

<proc_blk>

<stitle>Procedure</stitle>

<para>The following steps describe how to create a SiteSearch database 

from existing SGML source data:</para>

<table_blk>

<step_blk>

<step>1. Create a .dtd file.</step>

<step>This step contains guidelines for creating a database tag 

definition (.dtd) data map file that defines the hierarchical 

relationships between the tagged instances within your SGML source 

data. SSDOT, a menu-driven interface designed to automate the creation 

and maintenance of databases, will use this file during the conversion 

process to translate the source data records into ASN.1/BER 

format.</step>

</step_blk>

<step_blk>

<step>2. Convert your SGML source data to ASN.1/BER format.</step>

<step>This step discusses using SSDOT to convert your SGML-formatted 

source data into ASN.1/BER format so that the records are usable by the 

SiteSearch system.</step>

</step_blk>

<step_blk>

<step>3. Create a database description (.dsc) file.</step>

<step>This step instructs you how to create a .dsc file. The .dsc file 

is a text file that defines how the SiteSearch database building 

software constructs your database. Attributes defined in the .dsc file 

include:</step>

<ul>

<li>database size</li>

<li>specific indexes to generate</li>

<li>index characteristics</li>

</ul>

</step_blk>

<step_blk>

<step>4. Build the database.</step>

<step>In this step, SSDOT converts the BER data into a searchable 

SiteSearch database.  This step is referred to as the database build 

process.</step>

</step_blk>

</table_blk>

</proc_blk>

<ref_blk>

<stitle>See Also</stitle>

<para>Planning a Database</para>

<para>Creating a SiteSearch Database</para>

<para>Creating a SiteSearch Database from MARC Data</para>

<para>Creating a SiteSearch Database from Other Source Data</para>

</ref_blk>

</rec>
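
One way to check the hierarchy you identified is to print each field indented by its nesting depth. The following is a minimal sketch in Python, not part of the SiteSearch software; the function name outline is hypothetical, and the sketch assumes every opening tag has a matching closing tag and that tags carry no attributes.

```python
import re

def outline(sgml: str) -> list[str]:
    """Return one line per opening tag, indented by nesting depth."""
    lines = []
    depth = 0
    # The first group is "/" for a closing tag and empty for an opening tag.
    for slash, name in re.findall(r"<(/?)(\w+)>", sgml):
        if slash:
            depth -= 1
        else:
            lines.append("  " * depth + name)
            depth += 1
    return lines

# Abbreviated record in the style of example.sgml.
sample = (
    "<rec><obj_blk><objid>dbb_50-00-10t</objid>"
    "<history_blk><para>None.</para></history_blk></obj_blk>"
    "<title>Creating a SiteSearch Database from SGML Data</title></rec>"
)
print("\n".join(outline(sample)))
```

Running the sketch on the abbreviated record lists objid and history_blk one level below obj_blk, and para one level below history_blk, which is the kind of nesting the .dtd file has to account for.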

Example

This example contains the example.dtd file you would use to convert the SGML data into ASN.1/BER format. Notice how the structural relationships, or "nested" elements, that you identified in example.sgml above are represented in the .dtd file below. For example, the <objid> field occurs only within the <obj_blk> field in the SGML source data. This relationship is recognized in the .dtd file by labeling obj_blk with 2 and objid with 21.

You might also note that the <rec> and </rec> tags used in example.sgml are not present in the .dtd file below. These tags mark the beginning and end of a record to be added to the database, and sgmlconv recognizes this notation when it converts the SGML source data into ASN.1/BER format. They are not defined in the .dtd file because sgmlconv translates them into record boundaries during conversion, and they are not indexed for searching.


obj_blk  2

objid  21

author  22

status  23

datemod  24

datecomp  25

product  26

page  27

pagedescr  28

search  29

history_blk  3

ol  31

ul  32

li  33

para  4

title  5

descr_blk  6

stitle  7

intro_blk  8

req_blk  9

proc_blk  10

step_blk  11

step  111

struct_blk  12

table_blk  13

ttitle  131

thdr  132

tdata1  133

syntax_blk  14

syntax  141

ref_blk  15

image_blk  16

ititle  161

image_source  162

image  163

note_blk  17

example_blk  18

example  181
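
When you compare a .dtd file against converted records, it can help to load the field names and tag numbers into a lookup table. The sketch below is an illustration in Python and is not how sgmlconv itself reads the file; it assumes the layout shown in the listing above (one field name and one tag number per line, separated by whitespace), and the function name parse_tag_map is hypothetical.

```python
def parse_tag_map(text: str) -> dict[str, int]:
    """Map each field name to its tag number."""
    tags = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines that look like "name  number"; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            tags[parts[0]] = int(parts[1])
    return tags

# A few lines copied from example.dtd.
sample = """
obj_blk  2
objid  21
author  22
pagedescr  28
para  4
"""
tags = parse_tag_map(sample)
# Reverse lookup: tag number to field name.
names = {number: name for name, number in tags.items()}
print(tags["objid"])
print(names[28])
```

The reverse table is convenient when reading converted output, where only the tag numbers appear: tag 28, for instance, maps back to pagedescr.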

Example

The example below shows the ASN.1/BER encoded record, example.ber, which the sgmlconv utility created when it converted the SGML source data file, example.sgml, into ASN.1/BER format using the example.dtd file above. The example.ber file was formatted with the ber2txt utility to make it human-readable. Refer to the Ber2txt Utility reference document for additional information about viewing formatted ASN.1/BER records.


tag=0, Class=1, form=1, count=5

  tag=2, Class=2, form=1, count=11

    tag=21, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=13

        data=dbb_50-00-10t

             6665332332337

             422F50D00D104

    tag=22, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=8

        data=John Doe

             46662466

             AF8E04F5

    tag=23, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=5

        data=Draft

             47667

             42164

    tag=24, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=8

        data=10/14/97

             33233233

             10F14F97

    tag=25, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=21

        data=10/14/1997 2:02:52 PM

             332332333323333333254

             10F14F199702A02A5200D

    tag=26, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=3

        data=dbb

             666

             422

    tag=27, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=4

        data=Task

             5676

             413B

    tag=28, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=198

        data=Includes the procedure for creating a SiteSearch database ..from

             4666766727662776666776266726766766626256765667662667666762006766

             9E3C54530485002F35452506F20325149E701039453512380414121350DA62FD

              Standard Generalized Markup Language (SGML) source data. Provid

             2576666762466676667662467677246667666225444227677662667622576766

             0341E4124075E521C9A540D12B500C1E751750837DC903F523504141E002F694

             es ..additional links with each step for more detailed ..instruc

             6720066667666662666672767626666277672667266762667666662006677776

             530DA144949FE1C0C9EB307948051380345006F20DF25045419C540DA9E34253

             tions.

             766672

             49FE3E

    tag=29, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=45

        data=Creating a SiteSearch Database from SGML Data

             476676662625676566766246766676267662544424676

             325149E70103945351238041412135062FD037DC04141

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=129

        data=Main -> Documentation -> Database Building -> Creating a ..SiteS

             4666223246676667676662232467666762476666662232476676662620056765

             D19E0DE04F35D5E4149FE0DE0414121350259C49E70DE0325149E7010DA39453

             earch Database -> Creating a SiteSearch Database from SGML ..Dat

             6676624676667622324766766626256765667662467666762676625444200467

             512380414121350DE0325149E70103945351238041412135062FD037DC0DA414

             a

             6

             1

    tag=3, Class=2, form=1, count=1

      tag=4, Class=2, form=1, count=1

        tag=1, Class=2, form=0, count=5

          data=None.

               46662

               EFE5E

  tag=5, Class=2, form=1, count=1

    tag=1, Class=2, form=0, count=45

      data=Creating a SiteSearch Database from SGML Data

           476676662625676566766246766676267662544424676

           325149E70103945351238041412135062FD037DC04141

  tag=8, Class=2, form=1, count=1

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=510

        data=Creating a SiteSearch database from Standard Generalized Markup 

             4766766626256765667662667666762676625766667624666766676624676772

             325149E70103945351238041412135062FD0341E4124075E521C9A540D12B500

             ..Language (SGML) source data requires an understanding of the t

             0046667666225444227677662667627677676726627666777666666266276627

             DAC1E751750837DC903F52350414102515925301E05E452341E49E70F6048504

             agging ..structure used to organize your data. Before you create

             6666662007777677762776627626766667627677266762246667627672676676

             1779E70DA3425345250535404F0F271E9A509F5204141E0256F2509F50325145

              the database, ..it is important to plan the database because yo

             2766266766676220067267266767766727627666276626676667626666776276

             0485041412135C0DA9409309D0F241E404F00C1E04850414121350253153509F

             u will be responsible ..for defining the structure and relations

             7276662662767766766662006672666666662766277776777626662766676667

             5079CC02502530FE392C50DA6F204569E9E70485034253452501E4025C149FE3

             hips between data in the first ..step, creating a .dtd file. Bel

             6677266776662667626627662667772007767226766766626226762666622466

             89030254755E0414109E04850692340DA3450C0325149E7010E444069C5E025C

             ow is the procedure used to convert SGML ..data into ASN.1/BER f

             6726727662776666776277662762666767725444200667626676245423244526

             F70930485002F3545250535404F03FE6524037DC0DA414109E4F013EE1F25206

             ormat and to build a SiteSearch database from this ..BER data.

             67667266627626766626256765667662667666762676627667200445266762

             F2D1401E404F0259C40103945351238041412135062FD048930DA25204141E

  tag=10, Class=2, form=1, count=3

    tag=7, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=9

        data=Procedure

             576666776

             02F354525

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=98

        data=The following steps describe how to create a SiteSearch database

             5662666667666277677266767666266727626766762625676566766266766676

             48506FCCF79E703450304533292508F704F03251450103945351238041412135

              ..from existing SGML source data:

             2006766267677666254442767766266763

             0DA62FD0589349E7037DC03F523504141A

    tag=13, Class=2, form=1, count=4

      tag=11, Class=2, form=1, count=2

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=22

            data=1. Create a .dtd file.

                 3224766762622676266662

                 1E0325145010E444069C5E

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=409

            data=This step contains guidelines for creating a database tag .

                 566727767266676667267666666672667267667666262666766672777620

                 48930345003FE419E3075945C9E5306F20325149E70104F35D5E4049050D

                 .definition (.dtd) data map file that defines the hierarchic

                 066666676662226762266762667266662766726666667276626667676666

                 A4569E949FE08E4449041410D10069C50481404569E53048508952123893

                 al ..relationships between the tagged instances within your 

                 662007666766676677266776662766276666626677666672767666276772

                 1C0DA25C149FE389030254755E0485041775409E341E353079489E09F520

                 SGML source ..data. SSDOT, a menu-driven interface designed 

                 544427677662006676225544522626667267676626676766662667666662

                 37DC03F52350DA4141E0334F4C010D5E5D42965E09E4526135045397E540

                 to automate the creation ..and maintenance of databases, wil

                 762677666762766267667666200666266667666666266266766676722766

                 4F0154FD14504850325149FE0DA1E40D19E45E1E350F60414121353C079C

                 l use this file during the conversion ..process to translate

                 627762766726666267766627662666767766620077666772762776676676

                 C053504893069C504529E7048503FE65239FE0DA02F353304F0421E3C145

                  the source data records into ASN.1/BER ..format.

                 2766276776626676276667672667624542324452006676672

                 048503F5235041410253F24309E4F013EE1F2520DA6F2D14E

      tag=11, Class=2, form=1, count=2

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=53

            data=2. Convert your SGML source data to ASN.1/BER format.

                 32246676772767725444276776626676276245423244526676672

                 2E03FE652409F52037DC03F52350414104F013EE1F25206F2D14E

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=157

            data=This step discusses using SSDOT to convert your SGML-formatt

                 566727767266767776727766625544527626667677276772544426676677

                 48930345004933533530539E70334F404F03FE652409F52037DCD6F2D144

                 ed ..source data into ASN.1/BER format so that the records a

                 662007677662667626676245423244526676672762766727662766676726

                 540DA3F52350414109E4F013EE1F25206F2D1403F0481404850253F24301

                 re usable by the ..SiteSearch system.

                 7627766662672766200567656676627777662

                 2505312C502904850DA3945351238039345DE

      tag=11, Class=2, form=1, count=3

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=45

            data=3. Create a database description (.dsc) file.

                 322476676262667666762667676776662226762266662

                 3E0325145010414121350453329049FE08E4339069C5E

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=215

            data=This step instructs you how to create a .dsc file. The .dsc 

                 566727767266777767727672667276267667626226762666622566226762

                 48930345009E342534309F508F704F0325145010E433069C5E04850E4330

                 file ..is a text file that defines how the SiteSearch databa

                 666620067262767726666276672666666726672766256765667662667666

      tag=11, Class=2, form=1, count=2

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=22

            data=4. Build the database.

                 3224766627662667666762

                 4E0259C40485041412135E

        tag=111, Class=2, form=1, count=1

          tag=1, Class=2, form=0, count=141

            data=In this step, SSDOT converts the BER data into a searchable 

                 462766727767225544526667677727662445266762667626276676666662

                 9E0489303450C0334F403FE65243048502520414109E4F01035123812C50

                 ..SiteSearch database.  This step is referred to as the data

                 005676566766266766676222566727767267276667766276267276626676

                 DA3945351238041412135E0048930345009302565225404F013048504141

                 base build ..process.

                 667626766620077666772

                 21350259C40DA02F3533E

  tag=15, Class=2, form=1, count=5

    tag=7, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=8

        data=See Also

             56624676

             35501C3F

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=19

        data=Planning a Database

             5666666626246766676

             0C1EE9E701041412135

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=30

        data=Creating a SiteSearch Database

             476676662625676566766246766676

             325149E70103945351238041412135

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=45

        data=Creating a SiteSearch Database from MARC Data

             476676662625676566766246766676267662445424676

             325149E70103945351238041412135062FD0D12304141

    tag=4, Class=2, form=1, count=1

      tag=1, Class=2, form=0, count=53

        data=Creating a SiteSearch Database from Other Source Data

             47667666262567656676624676667626766247667256776624676

             325149E70103945351238041412135062FD0F485203F523504141
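
In the listing above, the two rows beneath each data= line appear to give the first and second hexadecimal digit of every byte, read top to bottom: for data=Task the rows 5676 and 413B correspond to the bytes 54 61 73 6B. The dots in the data text stand for bytes that do not print, such as carriage return and line feed (0D 0A). The Python sketch below reproduces that layout for a short value so that you can check a field by hand; it is an illustration of the display only, not the ber2txt implementation, and the function names are hypothetical.

```python
def hex_rows(data: bytes) -> tuple[str, str]:
    """Return the high-digit row and the low-digit row for data."""
    high = "".join(f"{byte >> 4:X}" for byte in data)
    low = "".join(f"{byte & 0x0F:X}" for byte in data)
    return high, low

def printable(data: bytes) -> str:
    """Show non-printing bytes such as CR and LF as dots."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)

for field in (b"Task", b"database \r\nfrom"):
    high, low = hex_rows(field)
    print("data=" + printable(field))
    print("     " + high)
    print("     " + low)
```

Reading a column of the two rows together gives one byte, which makes it possible to spot characters the data text cannot show, such as the line breaks carried over from the SGML source file.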

See Also

Creating a Database Tag Definition (.dtd) File
Creating a New SiteSearch Database

