
Creating a Database Tag Definition (.dtd) File

A database tag definition (.dtd) file is a standard text file that maps the relationships between different types of information (e.g., author, title, subject, and ISBN) in SGML data. The Open SiteSearch Database Builder 4.0.x/4.1.x SGML conversion utility, sgmlconv, uses the .dtd file as an interpreter to map the SGML-tagged data into ASN.1/BER format.

The .dtd file used during the conversion process is not a traditional SGML DTD. Instead, sgmlconv uses the .dtd file as a map: it identifies each tag in the source data and replaces it with the corresponding number defined in the .dtd file. These numbers are then converted into ASN.1/BER tag paths. For more information about the contents of the .dtd file, refer to the Procedure and Tip sections below.

Example

Use the following sample SGML data as a reference while you review how to develop a .dtd file in the Procedure section below.


<rec>
  <title>A Practical Guide to the UNIX System</title>
  <edition>Third Edition</edition>
  <author>
    <last_name>Sobell</last_name>
    <first_name>Mark</first_name>
    <middle_initial>G.</middle_initial>
  </author>
  <publisher>Addison-Wesley Publishing Company</publisher>
  <location>
    <city>Menlo Park</city>
    <state>California</state>
    <country>USA</country>
  </location>
  <copyright>1995</copyright>
  <isbn>0-8053-7565-1</isbn>
</rec>

Note:

Most of the tags above consist of a single word that describes the type of data labeled (e.g., <edition>Third Edition</edition>). If a tag name needs two words, separate the words with an underscore ('_') instead of a space (e.g., <first_name>, <last_name>, and <middle_initial>) so that the sgmlconv utility program can process the data.

Procedure

Use the following steps to create a .dtd file for your SGML source data. The .dtd file will be used by sgmlconv to convert your source data into ASN.1/BER format.

1. Create a list of the fields in your source data.

The example below shows a list of the fields used to identify the data within the sample file shown above.

Note:

The <rec> tag is not listed below and will not be included in the .dtd file because this tag is only used during the ASN.1/BER conversion process to mark a record for addition to the database.


title
edition
author
last_name
first_name
middle_initial
publisher
location
city
state
country
copyright
isbn
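
If your source file is large, the field list in step 1 can be collected with a short script. The following is a minimal sketch, not part of SiteSearch: list_fields is a hypothetical helper, and it assumes simple tags without attributes, as in the sample record. It returns each tag name once, in order of first appearance, and skips the <rec> tag.

```python
import re

# Sample record from the Example section above.
SAMPLE = """
<rec>
<title>A Practical Guide to the UNIX System</title>
<edition>Third Edition</edition>
<author>
<last_name>Sobell</last_name>
<first_name>Mark</first_name>
<middle_initial>G.</middle_initial>
</author>
<publisher>Addison-Wesley Publishing Company</publisher>
<location>
<city>Menlo Park</city>
<state>California</state>
<country>USA</country>
</location>
<copyright>1995</copyright>
<isbn>0-8053-7565-1</isbn>
</rec>
"""


def list_fields(sgml, record_tag="rec"):
    """Return field names in order of first appearance, skipping the record tag."""
    fields = []
    # Match opening tags only; closing tags start with "</" and are ignored.
    for name in re.findall(r"<([A-Za-z_]\w*)>", sgml):
        if name != record_tag and name not in fields:
            fields.append(name)
    return fields


print("\n".join(list_fields(SAMPLE)))
```

Run against the sample record, this prints the same thirteen field names listed above, one per line.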

2. Assign each field a unique number.

The example below identifies each of the fields shown in step 1 and uses a numbering scheme to label each of the fields. It is recommended that you organize the numbers in sequential order for ease of use, but this is not required for the conversion process to run successfully.


title  1
edition  2
author  3
last_name  4
first_name  5
middle_initial  6
publisher  7
location  8
city  9
state  10
country  11
copyright  12
isbn  13
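
Because each field needs a unique number, it can help to check a hand-edited list before using it. The sketch below is an illustration only: parse_tag_map is a hypothetical helper, and it assumes the file contains nothing but "name number" lines like those above. Rejecting a field name that appears twice is an extra assumption, not a documented sgmlconv rule. The check does not care whether the numbers are sequential.

```python
def parse_tag_map(text):
    """Parse "name number" lines and reject duplicate names or numbers."""
    tag_map = {}  # field name -> number
    used = {}     # number -> field name that already claimed it
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, number_text = line.split()
        number = int(number_text)
        if name in tag_map:
            raise ValueError(f"field {name!r} is listed twice")
        if number in used:
            raise ValueError(
                f"number {number} is used by both {used[number]!r} and {name!r}"
            )
        tag_map[name] = number
        used[number] = name
    return tag_map


# The numbered list from step 2.
LISTING = """
title  1
edition  2
author  3
last_name  4
first_name  5
middle_initial  6
publisher  7
location  8
city  9
state  10
country  11
copyright  12
isbn  13
"""

tag_map = parse_tag_map(LISTING)
print(len(tag_map), "fields parsed")
```

If two fields share a number, parse_tag_map raises a ValueError that names both fields.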

Tip: When assigning numbers, you can use an expressive notation to show the structural relationship ("nesting") among the fields. In the example below, each nested field's number begins with the number of the field that contains it (for example, last_name is 31 because it is nested within author, which is 3). Compare this example with the list above to see the differences in notation.


title  1
edition  2
author  3
  last_name  31
  first_name  32
  middle_initial  33
publisher  4
location  5
  city  51
  state  52
  country  53
copyright  6
isbn  7
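
The nested notation in this tip can also be derived from the source data. The following is a rough sketch under stated assumptions: nested_numbers is a hypothetical helper, the source is well formed and uses simple tags without attributes, and no field contains more than nine nested fields (otherwise the concatenated digits become ambiguous). It is not how sgmlconv assigns numbers.

```python
import re

# Sample record from the Example section above.
SAMPLE = """
<rec>
<title>A Practical Guide to the UNIX System</title>
<edition>Third Edition</edition>
<author>
<last_name>Sobell</last_name>
<first_name>Mark</first_name>
<middle_initial>G.</middle_initial>
</author>
<publisher>Addison-Wesley Publishing Company</publisher>
<location>
<city>Menlo Park</city>
<state>California</state>
<country>USA</country>
</location>
<copyright>1995</copyright>
<isbn>0-8053-7565-1</isbn>
</rec>
"""


def nested_numbers(sgml, record_tag="rec"):
    """Number each field; a nested field's number starts with its parent's number."""
    numbers = {}       # field name -> assigned number
    child_counts = {}  # parent name (None at top level) -> children numbered so far
    stack = []         # fields that are currently open, outermost first
    for closing, name in re.findall(r"<(/?)([A-Za-z_]\w*)>", sgml):
        if name == record_tag:
            continue  # the record tag is not part of the .dtd file
        if closing:
            stack.pop()
            continue
        parent = stack[-1] if stack else None
        if name not in numbers:
            child_counts[parent] = child_counts.get(parent, 0) + 1
            prefix = str(numbers[parent]) if parent else ""
            numbers[name] = int(prefix + str(child_counts[parent]))
        stack.append(name)
    return numbers


for field, number in nested_numbers(SAMPLE).items():
    print(f"{field}  {number}")
```

For the sample record, this reproduces the numbers in the tip: author is 3 and its nested fields are 31, 32, and 33.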

3. Save the list of fields and field numbers (as shown in step 2) in a text file, dbname.dtd, where dbname is any valid filename. It is recommended that you name the .dtd file after the database you are building so that you can identify it later.
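
If you built the field list with a script, step 3 can be scripted as well. This is a minimal sketch: write_tag_file and the filename books.dtd are hypothetical, and the two-space "name  number" layout simply copies the listings in step 2. Check the Database Tag Definition (.dtd) File Example for the exact layout that sgmlconv expects.

```python
from pathlib import Path

# Field numbers from the nested example in step 2.
TAG_MAP = {
    "title": 1, "edition": 2, "author": 3,
    "last_name": 31, "first_name": 32, "middle_initial": 33,
    "publisher": 4, "location": 5,
    "city": 51, "state": 52, "country": 53,
    "copyright": 6, "isbn": 7,
}


def write_tag_file(tag_map, path):
    """Write one "name  number" line per field, in dictionary order."""
    lines = [f"{name}  {number}" for name, number in tag_map.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


if __name__ == "__main__":
    # "books" stands in for dbname; use the name of your own database.
    write_tag_file(TAG_MAP, "books.dtd")
```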

To view an example source data file, .dtd file, and ASN.1/BER output after conversion, refer to the Database Tag Definition (.dtd) File Example.

Tip

If you are unfamiliar with your SGML source data, it might be helpful to use the sgmlconv utility program to create a .dtd file. When you run the sgmlconv utility from the UNIX command prompt, you do not need to include a .dtd file. Instead, the utility can create an initial .dtd file for you by identifying all of the fields within the source data. The utility program simply recognizes a tag and assigns it the next available number. In essence, the utility can perform steps 1 and 2 above for you by identifying every tag in your source data.

However, this primitive .dtd file does not contain an expressive numbering notation as shown in the second example within step 2 above. To include a notation that reflects "nested" data, you must edit the .dtd file created by the sgmlconv utility using a text editor.

Note:

If your source data is very simple and does not contain nested data, you can use the .dtd file generated from sgmlconv as the final .dtd file for the conversion process.
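
To decide whether the generated .dtd file is enough, you can check the source data for nesting. The sketch below is an illustration only: has_nested_fields is a hypothetical helper, and it assumes well-formed data with simple tags and a <rec> record tag. It reports whether any field is opened inside another field.

```python
import re


def has_nested_fields(sgml, record_tag="rec"):
    """Return True if any field appears inside another field."""
    depth = 0  # number of fields currently open, not counting the record tag
    for closing, name in re.findall(r"<(/?)([A-Za-z_]\w*)>", sgml):
        if name == record_tag:
            continue
        if closing:
            depth -= 1
        else:
            depth += 1
            if depth > 1:
                return True  # a field was opened while another was still open
    return False


flat = "<rec><title>UNIX</title><isbn>0-8053-7565-1</isbn></rec>"
nested = "<rec><author><last_name>Sobell</last_name></author></rec>"
print(has_nested_fields(flat), has_nested_fields(nested))
```

A result of False suggests the flat numbering is sufficient; True means you may want to edit the file to add the nested notation.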

To create a .dtd file using the sgmlconv utility program, refer to the Sgmlconv Utility reference documentation. It is recommended that you execute this utility program from the UNIX command prompt from within the dbs directory of your database directory.

See Also

The Sgmlconv Utility
Creating a New SiteSearch Database
Database Tag Definition (.dtd) File Example

