
Introduction to Using SGML Source Data to Build a Database

What is SGML?

Standard Generalized Markup Language (SGML) is an international standard (ISO 8879) for describing the relationship between a document's content and its structure. Unlike common document file formats that combine content with presentation, SGML represents only a document's content and its structure (the interrelationships among the data).

How is SGML data used by the Database Builder software?

Open SiteSearch Database Builder 4.0 requires that data exist in ASN.1/BER format so that it can be used and searched as a local database. Sgmlconv, a conversion utility included in the Database Builder software, creates ASN.1/BER formatted data from the SGML source data.
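
To give a sense of what BER-formatted data looks like at the byte level, the minimal sketch below encodes a single field value as a BER tag-length-value triple. It illustrates BER in general and is not a description of Sgmlconv's actual output layout: the function names, the use of the context-specific tag class, and the tag number 5 are all assumptions made for the example.

```python
def ber_length(n: int) -> bytes:
    """Encode a content length using BER definite-length rules."""
    if n < 0x80:
        # Short form: a single octet holds the length.
        return bytes([n])
    # Long form: first octet is 0x80 | number of length octets that follow.
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_primitive(tag_number: int, value: bytes) -> bytes:
    """Encode value as a primitive, context-specific BER element.

    Only low tag numbers (0-30) are handled; higher numbers need the
    multi-octet identifier form, which this sketch leaves out.
    """
    if not 0 <= tag_number < 31:
        raise ValueError("this sketch only supports tag numbers 0-30")
    identifier = 0x80 | tag_number  # class bits 10 = context-specific
    return bytes([identifier]) + ber_length(len(value)) + value


# Hypothetical: suppose the <city> field had been assigned tag number 5.
encoded = ber_primitive(5, b"Menlo Park")
print(encoded.hex(" "))
```

The identifier octet carries the field's number, the next octet (or octets) carries the length, and the content bytes follow. This is why the software needs a number for every tag before it can convert the data.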

What elements make up an SGML source data file?

Below is a simple example of an SGML source data file. Notice that each piece of record content is enclosed in a pair of tags, which makes it a field of data (e.g., <city>Menlo Park</city>). These fields are indexed within the database so that they can be searched.

Example


<rec>
  <title>A Practical Guide to the UNIX System</title>
  <edition>Third Edition</edition>
  <author>
    <last_name>Sobell</last_name>
    <first_name>Mark</first_name>
    <middle_initial>G.</middle_initial>
  </author>
  <publisher>Addison-Wesley Publishing Company</publisher>
  <location>
    <city>Menlo Park</city>
    <state>California</state>
    <country>USA</country>
  </location>
  <copyright>1995</copyright>
  <isbn>0-8053-7565-1</isbn>
</rec>
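
One way to see how the tags turn record content into fields is to walk a record and list every field with its position in the hierarchy. The sketch below is a small, hypothetical Python helper written for that purpose. It is not part of the Database Builder software, and it handles only the simple fully tagged style shown in the example above (no attributes, no omitted end tags).

```python
import re

# Matches an opening tag, a closing tag, or a run of text between tags.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>|([^<]+)")


def parse_record(text):
    """Parse tagged text into a list of (tag, children) nodes."""
    root = ("", [])
    stack = [root]
    for match in TOKEN.finditer(text):
        closing, name, data = match.groups()
        if data is not None:
            if data.strip():
                stack[-1][1].append(data.strip())
        elif closing:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed <{stack[-1][0]}>")
    return root[1]


def field_paths(node, prefix=""):
    """Yield (path, value) for every piece of tagged content under node."""
    tag, children = node
    path = f"{prefix}/{tag}" if prefix else tag
    for child in children:
        if isinstance(child, str):
            yield path, child
        else:
            yield from field_paths(child, path)


SAMPLE = """
<rec>
<title>A Practical Guide to the UNIX System</title>
<author>
<last_name>Sobell</last_name>
<first_name>Mark</first_name>
</author>
<location>
<city>Menlo Park</city>
</location>
</rec>
"""

paths = list(field_paths(parse_record(SAMPLE)[0]))
for path, value in paths:
    print(f"{path}: {value}")
```

Each printed line pairs a field's nesting path (for example, rec/location/city) with its content, which is the information the tags add to the raw text.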

What is a Database Tag Definition (.dtd) File?

The database tag definition (.dtd) file, a standard text file, is used by the Database Builder software to assign a number to each SGML tag in the SGML source data for its mapping to ASN.1/BER format.

If you are familiar with SGML, note that the .dtd file used by the SiteSearch system is not a standard SGML document type definition (DTD), but rather a basic data map. For more detailed information about how to write a .dtd file and how it works, refer to Creating a Database Tag Definition (.dtd) File.

Note:

The .dtd file does not determine the hierarchy of the SGML elements. The hierarchy and the relationships among the different types of tagged SGML data are determined by the relative position of the tags in the SGML source data. Refer to Database Tag Definition (.dtd) File Example to see a sample of an SGML source data file and its corresponding .dtd file.
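
The division of labor described above, a flat tag-to-number map on one side and hierarchy taken from the source data on the other, can be sketched as follows. The dictionary is an in-memory stand-in only: it does not show real .dtd file syntax, and the numbers in it are made up for illustration. The function name is likewise hypothetical.

```python
import re

# Hypothetical tag numbers; this dict is NOT the .dtd file format.
TAG_NUMBERS = {"title": 1, "author": 2, "last_name": 3, "first_name": 4}

TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>")


def numbered_outline(source, tag_numbers):
    """Return (depth, number, tag) for each mapped opening tag.

    Depth is derived purely from where the tags sit in the source;
    the map itself carries no nesting information.
    """
    depth = 0
    outline = []
    for match in TAG.finditer(source):
        closing, name = match.groups()
        if name not in tag_numbers:
            continue  # e.g. a record wrapper tag that is not in the map
        if closing:
            depth -= 1
        else:
            outline.append((depth, tag_numbers[name], name))
            depth += 1
    return outline


SOURCE = (
    "<rec><title>A Practical Guide to the UNIX System</title>"
    "<author><last_name>Sobell</last_name>"
    "<first_name>Mark</first_name></author></rec>"
)

outline = numbered_outline(SOURCE, TAG_NUMBERS)
for depth, number, name in outline:
    print("  " * depth + f"{name} -> {number}")
```

Reordering the entries in the map changes nothing in the output, whereas moving a tag inside or outside <author> in the source changes its depth.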

Why are the <rec>, <reprec>, and <delrec> tags important to the Database Builder software?

Each SGML record that will be converted to ASN.1/BER format must begin with an opening <rec>, <reprec>, or <delrec> tag and end with the matching closing tag. If your SGML records are not already surrounded by one of these tag pairs, as shown with the <rec> tag in the example above, use a text editor to add the appropriate beginning and ending tags to each record before you run the conversion utility. These tags tell the software which task to perform on each record as the database is built.

  • The <rec> tag labels the record as a new record that needs to be added to the database.
  • The <reprec> tag labels the record as a revised record that needs to replace a previous version of the record.
  • The <delrec> tag labels the record as a record that needs to be deleted from the database.

Note:

The <rec>, <reprec>, and <delrec> tags are not included in the database tag definition (.dtd) file, but are included in the source data file. These tags serve as markers for the Database Builder software and are not indexed as record content.
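
Before running the conversion utility, it can be useful to confirm that every record in a source file is wrapped in one of the three tags. The sketch below is a hypothetical pre-check written in Python. It is not part of the Database Builder software, the action labels are names chosen for this example, and the content of the <delrec> record in the sample is invented.

```python
import re

# Wrapper tag -> task the record requests (labels are this sketch's own).
ACTIONS = {"rec": "add", "reprec": "replace", "delrec": "delete"}

# A record is everything between an opening wrapper and its matching close.
RECORD = re.compile(r"<(rec|reprec|delrec)>(.*?)</\1>", re.DOTALL)


def split_records(source):
    """Return [(action, record_body), ...]; fail on unwrapped data."""
    records = []
    pos = 0
    for match in RECORD.finditer(source):
        stray = source[pos:match.start()].strip()
        if stray:
            raise ValueError(f"data outside a record wrapper: {stray[:40]!r}")
        records.append((ACTIONS[match.group(1)], match.group(2).strip()))
        pos = match.end()
    stray = source[pos:].strip()
    if stray:
        raise ValueError(f"data outside a record wrapper: {stray[:40]!r}")
    return records


SOURCE = """
<rec><title>A Practical Guide to the UNIX System</title></rec>
<delrec><isbn>0-8053-7565-1</isbn></delrec>
"""

records = split_records(SOURCE)
for action, body in records:
    print(action, body)
```

A file in which some data sits outside any wrapper raises ValueError, which signals that beginning and ending tags still need to be added with a text editor.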

How can I learn more about SGML source data and the Database Builder software?

As you begin to work with the Database Builder tools and utilities, you will learn more about your SGML source data. The following resources will provide you with the steps necessary to create a new database from your data and will give you tips for understanding, planning, and organizing your data in the process.

Creating a New SiteSearch Database
The Sgmlconv Utility
Creating a Database Tag Definition (.dtd) File
Database Tag Definition (.dtd) File Example

