Document Creation, Publishing, and Markup Languages,
Including: SGML, HTML and XML
Phillip A. Covington


Personal Web Site
Purpose of This Introduction
and Intended Audience


If you are thinking about using SGML for document creation, publishing, or management, the introduction that follows can help you determine whether SGML is the tool for you, and can serve as a launching pad to more technical resources. It is assumed that most readers will not actually be using SGML themselves, but are looking for more information about what SGML is and how it might fit into and benefit their organization. To help you judge how useful this introduction and SGML might be for you and your organization, below is a quick summary of the major applications SGML is best suited for:

This page last updated
20-Apr-2001



Applications SGML Is Best Suited For
  • Long documents (e.g., hundreds or thousands of pages)
  • Complex, formal, structured documents (e.g., technical and instruction manuals, specifications, rules and regulations)
  • Documents that require constant/repetitive revision
  • Those who must exchange documents with other SGML users
  • Creating one document (instead of multiple, different copies) that contains different versions of the same text so that different users can print or view only the version they need to see (e.g., the instructor's versus student's copy of a book or training manual)
  • Organization-wide compatibility and standardization
  • Anyone wanting to get a head start on getting their documents into the format likely to become the dominant standard (like Microsoft Word is today)



Note: This introduction is designed to be read starting at the beginning and moving to the end. Due to its length, however, an index is provided for the subject headings found on this page:

Table of Contents (This Page)


A Brief Description

SGML, HTML, and XML

The Process of Evolving Technology

The "Old" and New of The Internet

GML (Generalized Markup Language)

The Disadvantages of Non-Exchangeable Data

An Early Example Utilizing Limited Hardware

The Advantages of Text Markup Languages

The "Standard" In "Standardized"

Text Processing

The Limitations of Traditional Word Processing

How SGML Goes Beyond Simple Word Processing

Software Needed To Use SGML

Skills Required To Use SGML

A Simple Example of How SGML Works

A Sample Letter Marked Up

Drawbacks

Conclusion

SGML/XML Resources and Vendors




A Brief Description

As applied to computers, the term "text markup" has its roots in the handwritten proofreading and editorial marks still used today by writers, editors, typesetters, and publishers to note within a text what needs to be done: "put this in italics," "delete this word," "move this sentence down here," and so on. This handwritten editorial and proofreading "markup language" is fairly standardized, so anyone familiar with the process knows what the marks mean. The concept behind using a markup language to process text with computers is the same (standardization), but the result is, of course, much more powerful.

SGML, HTML, and XML

The computer text markup language that enjoys the most widespread use is SGML, which stands for: "Standard Generalized Markup Language." SGML is what is known in the computer field as a "metalanguage." This means that in addition to being a very powerful language in its own right, SGML can also be used as the basis for the creation of new text markup languages. The more recent HTML (Hypertext Markup Language) was designed because of the need for a language specialized for the Internet. HTML is based upon and is a less powerful version of SGML, but has evolved into a separate standard that is no longer directly compatible.
Because the explosive growth of the Internet and networks in general has overtaxed the capabilities of HTML, even newer languages are in the making to bring greater capabilities and standardization to Internet programming. One of the most promising is XML (Extensible Markup Language). XML is designed to address the limitations and shortcomings of HTML. XML is also the first step in an attempt to bring all software for processing textual information, including on the Internet, closer to a global standard that will allow access to or the exchange of documents regardless of the type of system they are created with or stored on.
Unlike HTML, which is based upon but different from SGML, XML is a fully compatible subset of SGML. In other words, any document written in XML will also be compatible with, and can be processed by, an SGML system. However, because XML is a subset of SGML (not the full version) it will not have all of the capabilities of SGML. You might be wondering: "If SGML is the most widespread standard why not just adapt SGML for use on the Internet?" Well, that's a good question. That's also where the "Extensible" part of XML comes in.
Being designed as an extensible language means XML's capabilities can be more easily extended over time as needed. It takes time for those who use any system to adapt to something new. So while it is true that SGML is the most powerful markup language in widespread use, more powerful systems also require a greater learning curve, require more computing resources, and are usually more expensive. XML is the most promising language as a logical next step in bridging those gaps and working toward the ultimate solution. However, because XML has not yet been approved as an official standard language it is understandable that many have chosen to wait before committing to it.

The Process of Evolving Technology

The evolution of the most widely used operating systems, those made by Microsoft, has taken a similar path and serves as an analogy to the above. An operating system based upon Microsoft's Windows NT will almost surely become the next "standard" operating system. Microsoft has felt that way for a long time, and so have many in the computer field. However, at the time this was first known in the early 1990s many organizations (especially smaller ones) didn't have the necessary computer capacity or in-house expertise needed to utilize an operating system slated as a mainframe-strength replacement for the Unix operating system, let alone individual users. So, first Windows 3.1 was introduced, which worked in conjunction with the old DOS and still gave users the choice of running either. Then came Windows 95, which brought Windows closer to the NT operating system but didn't require as much processing power or memory, was easier to use, more compatible, cheaper, etc.
As the end of the 1990s approaches, though, increasingly powerful processors and falling prices mean that most newer systems are now capable of running Windows NT. In addition, because Windows 95 (or the "98" version, etc.) and Windows NT now look much alike and operate much the same, it won't be that hard for most users to adapt to the full Windows NT operating system.
That's why it's been rumored that within a few years it is possible Microsoft may merge Windows 95/98/?? with Windows NT, after which there will no longer be two separate operating systems, just one with capabilities equal to or more powerful than NT. In the same way, even though the ultimately powerful SGML has been there all along, the creation of HTML, and now the progression to XML (and others), is affording a transition period to those who need time to adapt technologically and to acquire the necessary training and expertise.

The "Old" and New of The Internet

As hinted at in the section above, text markup languages such as SGML, HTML, and XML are, like the Internet, "new" technologies that have actually been around for quite some time in concept if not in use.

The core computer network that was and is the Internet was being used by scientific, academic, business, and government users years before most people even knew what it was. That's because in those early years the Internet could only be accessed by those with a fair degree of computer knowledge and resources or via an institution that provided them. Then, some brilliant people got together, formed the company now known to everyone, Netscape Communications, and with their Netscape Navigator browser software made the Internet available to the world. Text markup languages share a similar history. While terms like HTML and SGML are quickly becoming hotter topics and enjoying incredible growth in use, the underlying technology and preliminary versions of such languages have been around since the 1960s.

GML (Generalized Markup Language)

Today's SGML is the direct descendant of an earlier non-"Standardized" version called simply GML (Generalized Markup Language), invented in 1969 by Charles Goldfarb, an IBM researcher. Even that early in the history of computers Goldfarb and others recognized the benefits of having a standardized format for the creation and exchange of documents. In fact, it wasn't long before most of the documents at IBM, followed by other large organizations, ended up being processed with SGML or a similar language. Like the early Internet, however, because these resources were for years available only on mainframes, it is only in recent years that text markup languages have started to see widespread use on PCs.

The Disadvantages of Non-Exchangeable Data

As you will soon see, for many kinds of documents the lack of a standardized method for creating, processing, and exchanging them is inefficient for any organization, especially a larger one that processes huge numbers of documents. I learned typing (and computers) at an early age and my personal output has increased steadily over the years to a current rate of as many as 1,000 documents a year or more. So even without being a large organization I recognized early the advantages of text markup languages. Years before Microsoft Word or WordPerfect, which now have the ability to process data created by other software, or even WordStar, the thought of having to convert data to another system was not fun! That's why I developed a primitive text markup language for my own use. By the way, I can assure you this is not as impressive as it sounds; many did the same. In fact, in a later example we'll create an extremely basic markup language that will give you a pretty good idea of the types of techniques used by me, and others, before better solutions were available.

An Early Example Utilizing Limited Hardware

There are many advantages to using a text markup language, one of which is the ability to process text on many different types of computers, whether mainframe, or PC, or Apple, etc. That flexibility came in handy for me in 1987 when I wanted to write a book over 200 pages long on a computer with VERY limited capacity running the old DOS operating system. The computer was a Zenith Z-183 laptop which despite being "state of the art" by having the world's first bright, full-sized screen, and a hard drive, had only 10 megabytes of hard disk storage space!
The screen was text only and, even though fast for a laptop at that time, the Z-183 had less than 1/20th of the processing power of 1990s PCs. But that's the beauty of text markup languages. As long as the software runs on your computer, regardless of what kind, it has only to be concerned about the details of processing the text. Even a computer with a screen limited to displaying text only and no graphics can still utilize a text markup language. Remember, it wasn't until the late 1980s that the ability to display graphics on monitors became common.
For the above book project I used a markup language called PowerText, from a New York company started by a programmer/writer who had worked with SGML on mainframes. While the above illustrates an extreme example in relation to the modern business world and even most people at home, who now enjoy powerful, affordable computers, the concept is still just as valid. Thanks to the Internet-based markup language HTML people around the world have access to information, even if they happen to be using one of the thousands of older computers still in operation.

The Advantages of Text Markup Languages

For the modern company or organization with the latest in computer technology, the advantages of using a standardized markup language are at least as significant today as they were when computers' capabilities were more limited. Despite the fact that computers are now more standardized, a variety of different machines and operating systems still exists, often even within the same organization.
What if you need to move hundreds or thousands of documents from one of your machines running the Unix operating system to one running Windows? Perhaps you still have machines running OS/2, or how about Apple? What about going from a Digital VAX to a Sun, or moving data from an old mixed network to a new IBM AS/400? If you work within a large organization you may be familiar with these different types of computers, but if not, that's OK. What's important is that the need to be able to exchange text between many different types of computers is just as important as ever, and growing every day as companies create more and more documents.

The "Standard" In "Standardized"

The term "Standardized" as applied to a markup language such as SGML or HTML means that a recognized overseeing body, such as the ISO (International Organization for Standardization) or ANSI (American National Standards Institute), has issued a standard specification for the language which all software claiming to be in compliance must adhere to. This assures that regardless of the types of computers you want to exchange documents between, as long as your software adheres to the applicable standard, you can be certain the same documents will be usable on the machines at both ends. That's why many companies and departments of the government now require that any documents created for or exchanged with them be SGML compliant.

Text Processing

A feature of advanced markup languages that is just as important as the ability to exchange documents with others is the ability to perform processing operations on those documents. The popular term "word processing" has actually become somewhat inaccurate. It implies that word processing software processes words in much the same way a computer programming language processes computer instructions. The major difference is that once a computer program is completed, the resulting software requires little if any further intervention from the original programmer. In contrast, most word "processing" is the result of the operator formatting the text and preparing it for any necessary further processing. Once the needed instructions have been programmed into a markup-language based text system, though, those instructions can perform the same processing operations on the text day in and day out without any requirement for further programming or formatting by an operator.

The Limitations of Traditional Word Processing

Let's look at two similar examples to illustrate the above point. We are all familiar with the computerized form letter. Using word processing combined with a database of names and addresses the sender can easily "personalize" the same letter so that each is not only addressed to each recipient, but their name and/or that of their company appears in various places throughout each letter as well. Even though the letter may have been sent to 10,000 people each will see their own name and related information.
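For readers who like to see things concretely, the form-letter idea above can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular word processor works; the letter wording, the field names ($name, $company), and the recipient list are all made up for the example.

```python
from string import Template

# Hypothetical form letter; $name and $company are placeholder fields.
FORM_LETTER = Template(
    "Dear $name,\n\n"
    "Thank you for your order. Best wishes to everyone at $company."
)

# Hypothetical stand-in for a database of names and addresses.
RECIPIENTS = [
    {"name": "Pat Jones", "company": "Acme Tools"},
    {"name": "Lee Smith", "company": "Smith Hardware"},
]


def merge(template, records):
    """Return one personalized letter per record; the wording never changes."""
    return [template.substitute(record) for record in records]


letters = merge(FORM_LETTER, RECIPIENTS)
print(letters[0])
```

Notice that only the small personalized fields change from letter to letter. The content of the letter itself stays exactly the same for every recipient.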
You may also be familiar with document "templates," or "stylesheets," which are now also supported to some degree by most popular word processors. Form letters are created by adapting one standard document to numerous individuals. Templates are the exact opposite in that from a choice of numerous standard documents one is selected and adapted for the creation of a letter directed to a single individual. For instance, among your templates you might have one prepared in the format of a standard letter, while others might be set up with the appearance of a fax, or memo. When using a template the appearance, margins, spacing, commonly used wording, name and address of the sender, etc., have already been set up by you or someone else so instead of having to format and type an entirely new document you just apply the new text needed to fill in the blanks. These features and others like them do save considerable amounts of time and increase efficiency, but there are limitations.

How SGML Goes Beyond Simple Word Processing

Now let's look at an example using advanced markup language techniques that would be difficult or impossible to accomplish using the above word-processing software. In the above example small pieces of personalized information were inserted into a form letter where the content stays the same. What if, though, you needed the content of the document to also change depending on whom it is being sent to? One of the best examples is that of the changing sales tax amounts and legal requirements encountered from state to state. If your organization is nationwide or does business in multiple states you almost surely have people creating different versions which are identical except for different wording required by different state laws. In certain states you might be required to include a disclaimer or notice that consists of an entirely different sentence or paragraph which in other states does not need to be in the document.

It might be a phrase like:

"This product meets or exceeds all applicable emissions standards as required by the State of California and is equipped with a spark arrestor. There may be penalties for altering or removing such devices. Please consult your owner's manual for more information."

Large organizations, and law offices and legal departments especially, are routinely faced with document publishing and management tasks such as those above that require "boilerplate" text (sections of standard text that can be used over and over in different documents).

Without SGML, most organizations handle this by storing a separate copy of every possible variation of each document. Retrieving an appropriate document is fairly easy, but the process of creating and maintaining all those separate documents requires considerable man-hours and can be a major problem. Remember, every time a change needs to be made to ANY of the text, ALL of the documents (form letters, contracts, etc.) which contain that wording must be revised as well. Maintaining large numbers of documents in such a way is error prone, inefficient, costly, and sometimes results in wording being included or omitted for which a company could be legally liable.
With SGML, only one version of such a document needs to be created and maintained because all of the combinations of text that may need to be included or excluded can be kept within the same document. When it comes time to print or view the document SGML can make sure that users in different locations see only the version of the document which is applicable to their state, or to their different situation. There are many, many other situations where SGML offers significant advantages. However, the object here is not to cover every aspect of SGML, but to provide you with a sound understanding of what SGML is, how it is used, and the advantages of using it.
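The "one document, many versions" idea can be sketched roughly as follows. This is written in Python rather than SGML, purely to show the concept; SGML systems have their own facilities for this, which are not shown here. The data layout, the state codes, and the function name are assumptions invented for the illustration.

```python
# Hypothetical single-source document: each block of text carries the set of
# states it applies to. None means the block applies everywhere.
DOCUMENT = [
    (None, "Thank you for purchasing a BestGen generator."),
    ({"CA"}, "This product meets or exceeds all applicable emissions "
             "standards as required by the State of California and is "
             "equipped with a spark arrestor."),
    (None, "Please consult your owner's manual for more information."),
]


def render_for_state(document, state):
    """Keep the shared text plus any text marked for the given state."""
    return "\n".join(
        text for states, text in document
        if states is None or state in states
    )


california = render_for_state(DOCUMENT, "CA")
texas = render_for_state(DOCUMENT, "TX")
print(california)
```

If the California wording ever changes, it is edited in exactly one place, and every version produced from the document picks up the change.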

Software Needed To Use SGML

Because of the nature of SGML, and for maximum flexibility, it is available as a collection of separate, stand-alone components or can be purchased as part of an integrated document management software application designed to provide all necessary tools in one convenient package. The latter, an all-in-one package, seems to be becoming the most popular option for overall document creation and management purposes. For sophisticated applications, or those which require custom programming and/or system integration, separate components are still needed in addition to or in place of an all-in-one package.
The separate components of SGML include: an editor (essentially like a word processor but designed for the input of text marked up in SGML); a "parser" (which processes the edited text marked up in SGML, checks for errors, and can perform other operations); and a "formatter" (which is the program that takes the marked-up text processed by the parser and formats it so that it can actually be printed). SGML systems are also sometimes used in conjunction with a database from which various information in addition to that contained within the text may be accessed.
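To give a feel for how a parser and a formatter divide the work, here is a toy sketch in Python. It is not a real SGML parser (real ones check documents against far more elaborate rules); the tag names, the regular expression, and both functions are assumptions made up for the illustration.

```python
import re

# Matches an opening tag, a closing tag, or a run of plain text.
TOKEN = re.compile(r"<(/?)([a-z]+)>|([^<]+)")


def parse(marked_up):
    """Toy 'parser': return (tag, text) pairs and report markup errors."""
    elements, open_tag = [], None
    for closing, tag, text in TOKEN.findall(marked_up):
        if tag and not closing:
            if open_tag is not None:
                raise ValueError(f"<{tag}> opened inside <{open_tag}>")
            open_tag = tag
        elif tag and closing:
            if tag != open_tag:
                raise ValueError(f"unexpected </{tag}>")
            open_tag = None
        elif text.strip():
            if open_tag is None:
                raise ValueError("text outside any element")
            elements.append((open_tag, text.strip()))
    if open_tag is not None:
        raise ValueError(f"<{open_tag}> was never closed")
    return elements


def format_plain(elements, width=40):
    """Toy 'formatter': decide how each kind of element should be printed."""
    lines = []
    for tag, text in elements:
        lines.append(text.upper().center(width) if tag == "title" else text)
    return "\n".join(lines)


SOURCE = "<title>Welcome</title>\n<para>Thank you for your purchase.</para>"
elements = parse(SOURCE)
printed = format_plain(elements)
print(printed)
```

The parser only cares whether the markup is correct; the formatter only cares what the result should look like. Keeping the two jobs apart is what lets the components be bought and combined separately.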

Just as word processors like WordPerfect and Microsoft Word, or Internet browsers like Internet Explorer or Netscape Navigator, combine various components into one easier-to-use software package, so do SGML applications like those by Interleaf and SoftQuad, FrameMaker+SGML by Adobe, etc. The biggest difference is that due to the power of SGML packages they tend to be larger, more complex, and more expensive. And, unlike word processors, they are primarily designed for use by those with a technical background. Notably, SGML capability is a standard built-in feature of newer versions of WordPerfect. Microsoft, in conjunction with Interleaf, offers an add-on SGML product called SGML Author, which is designed to work with Microsoft Word.

Skills Required To Use SGML

The term "text markup language" can be confusing to both technical and non-technical users because the word "language" generally applies more to traditional computer programming languages like C++, or Visual Basic. However, the process of marking up text, either by hand or computer, can take place entirely absent any programming commands or instructions. If you mark a particular word to indicate it should be in bold type, for instance, no related supporting structure or instructions are required for that to occur. SGML, however, requires that in many instances programming take place beforehand to make possible the advanced text processing operations it is capable of. Currently, a technically skilled person is required to utilize those features of SGML. As mentioned above, though, the lines between what is and is not "technical" can be confusing.
In general, someone whose only experience has been with word processing, regardless of how skilled, would probably be in over their heads to try SGML programming without the necessary background and training. To handle the complexities and capabilities of SGML requires someone with a background programming in either text markup or some other applicable computer language. While over that point there is no confusion, it does leave many organizations with a dilemma when it comes to hiring someone. That's because the whole purpose of SGML is document creation and management, and, other than for the simplest documents, that often requires a writer.
The problem is that most writers aren't programmers and most programmers aren't writers. Good writers might not be good at learning SGML, and certainly most programmers aren't writers! So, until SGML tools become more like word processors, giving anyone able to use a word processor the ability to produce SGML-compliant documents, employers may face difficulty finding technical people with the necessary combination of skills. The wait may not be that long, though, since due to the increasing popularity of SGML more software companies are working on easier-to-use programs that operate more like word processors. However, even when word-processing-like ease of use becomes a reality for the day-to-day routine writing that can be handled by word processing employees or writers, technically skilled people will still be needed to set up, program, and maintain such systems.

A Simple Example of How SGML Works

Since most people reading this introduction probably aren't planning to start programming in SGML I debated over whether to include a programming example. But, since a picture is worth a thousand words, I think ending this introduction with an example is a good idea. Besides, even if you don't understand the first two parts of the example, that's OK, because the superiority of SGML over other approaches to handling text will be obvious enough that you won't miss it!

The main difference in actually using SGML is that it is what is known as a "descriptive" language, while most other languages and methods of processing text are "procedural." Procedural methods and languages include word processors like WordPerfect and Microsoft Word, and other text markup languages, like HTML. When writing code in a procedural language, as the name implies, you are providing the instructions (procedures) which tell the computer how to process almost every element of text (a sentence, heading, paragraph, special appearance such as bold or italics, etc.). It is up to you to type in the necessary codes (or press the right key combinations) to tell the program what to do with each element of text that requires a different appearance, and what processing needs to take place. In a procedural language the information and instructions needed for formatting and processing text are placed within each document. In other words, you are specifying each procedure necessary to describe what the document should look like.

In a descriptive language like SGML most of the instructions and code you type into a document are not concerned about what should be done to the text, or even what it is supposed to look like. Instead, the purpose is to describe and identify each of the different types of text within the document. The actual instructions, procedures, programming, etc., required to process the document are stored at the beginning of the file, and/or in an entirely separate file called a DTD (Document Type Definition). It is fairly straightforward for an employee accustomed to word processing, a writer, editor, etc., to learn how to mark up the text. And, as discussed earlier, software is available that makes marking up text with SGML very similar to and almost as easy as using a word processor. However, the writing of the actual procedures and, if needed, DTDs that instruct the program on what to do with the text after it is marked up is the part that usually requires a skilled technical person.
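The descriptive approach can be sketched in Python as follows. Again, this is only an illustration of the idea and not SGML itself; the element names and both rule tables are hypothetical. The document says only what each piece of text is, while the rules for what to do with it live somewhere else.

```python
# Descriptive markup: each piece of text is only labelled with what it IS.
LETTER = [
    ("company", "BESTGEN CORPORATION"),
    ("salutation", "Dear: Valued Customer,"),
    ("para", "Thank you for your recent purchase."),
]

# Processing rules kept apart from the document (hypothetical examples).
PRINT_RULES = {
    "company": lambda text: text.center(60),
    "salutation": lambda text: text,
    "para": lambda text: "    " + text,
}
HTML_RULES = {
    "company": lambda text: f"<H1>{text}</H1>",
    "salutation": lambda text: f"<P><B>{text}</B></P>",
    "para": lambda text: f"<P>{text}</P>",
}


def process(document, rules):
    """Apply whichever rule set is supplied; the document is never edited."""
    return "\n".join(rules[kind](text) for kind, text in document)


as_print = process(LETTER, PRINT_RULES)
as_html = process(LETTER, HTML_RULES)
print(as_html)
```

The same marked-up letter produces two different results simply by swapping the rule table, which is why the person marking up the text does not need to know how the rules are written.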

A Sample Letter Marked Up

For our example we'll use a short letter that is typical of a letter a company would want their customers to see first when they open a newly purchased product. We'll be looking at the same letter done in three different markup languages: WordPerfect, HTML, and SGML. Incidentally, as you may be aware ALL word processors perform text markup using their own proprietary markup language. In most cases the codes are simply hidden so as not to interfere with your view of the text you are working with. The markup language is always there right along with your text, you usually just don't see it. That's what word processors are for. I chose WordPerfect for this example because Microsoft Word does not contain features that make it easy to display all of the markup language codes hidden within the text. Note also that for clarity the margins have been widened for this example, so you may need to enlarge your browser's window if you've been browsing so far with your window minimized.

How The Letter Will Look When It Is Printed



BESTGEN CORPORATION



Dear: Valued Customer,

BestGen Corporation would like to thank you for your recent purchase of a quality, BestGen gas-powered generator.

Whether you plan to use your BestGen generator for home or business we want to welcome you to the family of thousands of satisfied BestGen customers and assure you that our commitment is to provide you with the best possible customer service and support.

Should you have any questions or have any problem unpacking and using your BestGen generator, please feel free to call our toll-free customer assistance line listed below. Our friendly customer service representatives are available to take your call Mon. thru Fri., 8:00 AM to 5:00 PM Eastern Time.

Customer Service

800-BestGen

1109 Best Drive

Power City, TX 30234

For your safety, and to avoid damage to your generator, please read and retain the enclosed instructions before attempting to operate the unit.

Sincerely,

Sue Forrestor

Vice President

Customer Service



Next we'll look at the actual markup language programming code necessary to produce the above letter using WordPerfect, followed by the coding needed to produce the letter in HTML (which is what the entire document you are currently viewing is programmed in).
Remember, the purpose of these examples is not to teach you how to program. Nor is it necessary for you to understand the workings of what you are looking at. The purpose is to let you graphically see how much more programming and formatting is required to produce the desired results, compared to creating the same document using SGML.



Code Required To Produce The Letter In WordPerfect

(You may view the "behind-the-scenes" code in WordPerfect using WP's "Reveal Codes" option)


[Center Pg][Just:Left][Center][BOLD][VRY LARGE]BESTGEN CORPORATION[vry large][bold][HRt]
[HRt]
[HRt]
[HRt]
[HRt]
[HRt]
[BOLD]Dear: Valued Customer[bold],[HRt]
[HRt]
[BOLD]BestGen Corporation[bold] would like to thank you for your recent purchase[SRt]
of a quality, [BOLD]BestGen[bold] gas-powered generator.[HRt]
[HRt]
Whether you plan to use your [BOLD]BestGen[bold] generator for home or business[SRt]
we want to welcome you to the family of thousands of satisfied [BOLD]BestGen[bold][SRt]
customers and assure you that our commitment is to provide you with[SRt]
the best possible customer service and support.[HRt]
[HRt]
Should you have any questions or have any problem unpacking and using[SRt]
your [BOLD]BestGen[bold] generator, please feel free to call our [ITALC]toll-free[italc][SRt]
customer assistance line listed below. Our friendly customer service [SRt]
representatives are available to take your call Mon. thru Fri., 8:00 [SRt]
AM to 5:00 PM Eastern Time.[HRt]
[HRt]
[->Indent]Customer Service[HRt]
[->Indent]800[-]BestGen[HRt]
[->Indent]1109 Best Drive[HRt]
[->Indent]Power City, TX 30234[HRt]
[HRt]
For your safety, and to avoid damage to your generator, please read[SRt]
and retain the enclosed instructions before attempting to operate the[SRt]
unit.[HRt]
[HRt]
Sincerely,[HRt]
[HRt]
Sue Forrestor[HRt]
Vice President[HRt]
Customer Service[HRt]
	



Code Required To Produce The Letter Using HTML

(Note: There are numerous ways to achieve the same output. Coding style (what a program looks like) is a matter of individual preference and varies from programmer to programmer. The appearance of this HTML code, and codes used, are likely different than those which might be used by another programmer.)


	

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
</HEAD>
<BODY>

<TABLE BORDER=0 WIDTH=600>	<!-- BEGIN TABLE -->
	<TR>
	<TD WIDTH=20><BR></TD>
	<TD WIDTH=460 ALIGN=CENTER>	<!-- Total = 400 -->
	<FONT SIZE=5>

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		<B>BESTGEN CORPORATION</B>
	</FONT>
	</TD>
	</TR>
</TABLE>	<!-- END TABLE -->



<TABLE BORDER=0 WIDTH=600>	<!-- BEGIN TABLE -->
	<TR>
	<TD WIDTH=20><BR></TD>
	<TD WIDTH=460>	<!-- Total = 400 -->
	<FONT SIZE=3>

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		<B>Dear: Valued Customer,</B>

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		<B>BestGen Corporation</B> would like to thank you for your recent
		purchase of a quality, <B>BestGen</B> gas-powered generator.

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Whether you plan to use your <B>BestGen</B> generator for home or
		business we want to welcome you to the family of thousands of
		satisfied <B>BestGen</B> customers and assure you that our
		commitment is to provide you with the best possible customer
		service and support.

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Should you have any questions or have any problem unpacking
		and using your <B>BestGen</B> generator, please feel free to
		call our <I>toll-free</I> customer assistance line listed below.
		Our friendly customer service representatives are available
		to take your call Mon. thru Fri., 8:00 AM to 5:00 PM Eastern Time.

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Customer Service
	<!-- New Paragraph -->
	<BR><IMG VSPACE=0 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		800-BestGen
	<!-- New Paragraph -->
	<BR><IMG VSPACE=0 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		1109 Best Drive
	<!-- New Paragraph -->
	<BR><IMG VSPACE=0 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Power City, TX 30234

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		For your safety, and to avoid damage to your generator, please read
		and retain the enclosed instructions before attempting to operate
		the unit.

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Sincerely,

	<!-- New Paragraph -->
	<BR><IMG VSPACE=10 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Sue Forrestor
	<!-- New Paragraph -->
	<BR><IMG VSPACE=0 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Vice President
	<!-- New Paragraph -->
	<BR><IMG VSPACE=0 SRC="../pac_sharedfiles/dot_clear.gif"><BR>
		Customer Service

	</FONT>
	</TD>
	</TR>
</TABLE>	<!-- END TABLE -->

</BODY>
</HTML>

	
	



Code Required To Produce The Letter Using SGML

(Note: The goal for the example below was to make it easy to follow; for instance, as you saw in the WordPerfect example, the spaces between lines aren't really necessary. Programming-related information required at the top of an SGML document and/or in a separate file is also not shown.)


	

<FORMLTR>

<LTRHEAD>&logo1;

<P>&greeting1;

<P>&coname2; would like to thank you for your recent purchase of a quality,
&coname1; gas-powered generator.

<P>Whether you plan to use your &coname1; generator for home or business we want
to welcome you to the family of thousands of satisfied &coname1; customers and
assure you that our commitment is to provide you with the best possible
customer service and support.

<P>Should you have any questions or have any problem unpacking and using your
&coname1; generator, please feel free to call our <I>toll-free</I> customer
assistance line listed below. Our friendly customer service representatives are
available to take your call &custsvc_hours;.

<INBLOCK1>
<P>&custsvc_title;
<P>&custsvc_phone;
<P>&custsvc_address;
<P>&custsvc_citystate;
</INBLOCK1>

<P>For your safety, and to avoid damage to your generator, please read and retain
the enclosed instructions before attempting to operate the unit.

<P>&closing1;

<P>&custsvc_headof;
<P>&custsvc_headoftitle;
<P>&custsvc_title;

<![ %ms_ca_notice1; [
<P>&ca_notice1;
]]>

</FORMLTR>
	
	



Understanding The Examples Above

Even if you're not a programmer, it probably didn't take you long to notice that the last example, the one in SGML, required less markup. Each bracketed item, "[]" in WordPerfect and "<>" in HTML and SGML, is referred to as a "tag." The terms "markup" and "tagging" have similar meanings, but "markup" can refer to a document overall, whereas "tagging" refers just to the process of placing the tags where needed to mark and identify each portion of text.

In the WordPerfect and HTML examples you'll notice that tags are actually required to instruct the computer as to how to process the text. In the WordPerfect example "[HRt]" indicates a carriage return (a new line), while "[BOLD]" and "[ITALC]" cause the bracketed text to appear in bold and italic print, respectively. The corresponding tags in the HTML example are "<BR>" (break to the next line), "<B>" for bold, and "<I>" for italic. When it comes to the SGML example, however, the structure is much simpler because the focus is not on arranging and formatting the text, but describing it.

Because the SGML example is really the one we're concerned with here, let's take a look at that example section by section.

<FORMLTR>

Most of the actual programming that provides information on the structure of an SGML document and how it should be formatted is stored in the separate file that we talked about earlier: the DTD (Document Type Definition). Storing the most critical programming information in the DTD means documents created using it (regardless of who creates them or where in an organization) will all be consistent to the degree determined by the rules stored in the DTD. The above "tag" tells SGML to use the DTD that has been created for form letters.
(Technically, not all of the codes we'll be talking about are called "tags," but we're keeping things simple here.) Numerous DTDs have already been created for the most common types of applications, with the most notable being HTML. (Remember, HTML is derived from SGML, so whenever a Web page is created it is essentially an SGML document using the HTML DTD!) In the example above, the fictitious "BestGen" company had its SGML staff create its own DTD to be used for all form letters.
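
For readers curious about what was left out of the letter, here is a rough sketch of what the omitted pieces might look like. It is only an illustration: the file name formltr.dtd and the content rules are assumptions made up for this example, not a real BestGen DTD, and it presumes an SGML setup that permits tag omission. It covers structure only.

```sgml
<!-- At the top of the letter (not shown in the example above): names the
     document type and points to the DTD. "formltr.dtd" is a hypothetical
     file name. -->
<!DOCTYPE FORMLTR SYSTEM "formltr.dtd">

<!-- Inside the hypothetical formltr.dtd: what a form letter may contain.
     The two characters after each element name say whether the start-tag
     and the end-tag may be left out: "-" = required, "O" = may be omitted. -->
<!ELEMENT FORMLTR  - -  (LTRHEAD, (P | INBLOCK1)+)>
<!ELEMENT LTRHEAD  - O  (#PCDATA)>
<!ELEMENT P        - O  (#PCDATA | I)*>
<!ELEMENT INBLOCK1 - -  (P+)>
<!ELEMENT I        - -  (#PCDATA)>
```

Under these assumed rules, the "O" on the P line is what allows the letter to leave out a closing tag after each paragraph.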

Ok, if you've stuck with me so far that's as technical as we're going to get!

<LTRHEAD>&logo1;

This is the tag BestGen uses to create a simple letterhead on plain paper by centering the company name at the top of the letter in big, bold, capital letters. The operator creating the letter doesn't have to worry about any of that. What the heading is supposed to look like and where it is supposed to go on the page have been taken care of by the SGML programmer. All the person writing the letter has to do is describe the text by tagging it with <LTRHEAD> so that SGML knows what to do with it. Also, instead of typing in the company name literally, a special "name tag," "&logo1;," is used in its place. This tag, and all of the others like it, has also already been set up by the programmer.

Why is this better than just typing the company name? Well, in the case of a long name it can save a lot of typing. But, more importantly, BestGen is considering merging with another company that manufactures small engines and lawnmowers. The details of what the name will be after the merger haven't been worked out. Imagine having to manually go in and change the company name on hundreds, even thousands, of documents! But, because BestGen's documents are written using SGML, the company's SGML programmer has only to make such a change once, and any new or existing documents using the "&logo1;" tag will automatically appear with the new name when they are viewed or printed out! You'll see similar tags throughout the remainder of the letter.
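
To make the "change it once" idea concrete, here is a minimal sketch of how the programmer might set up these name tags (formally, SGML calls them entities). The names match the letter above and the replacement text is taken from the finished letter, but where the declarations live and how they are written are assumptions for illustration.

```sgml
<!-- Hypothetical entity declarations, kept in one central place such as the
     DTD. Each one pairs a short name with the text that replaces it. -->
<!ENTITY logo1   "BESTGEN CORPORATION">
<!ENTITY coname1 "BestGen">
<!ENTITY coname2 "BestGen Corporation">

<!-- After the merger, only the quoted text on these three lines would need
     to change; every letter that uses the names picks up the new text. -->
```

Wherever the letter says &coname1;, SGML substitutes the quoted text when the document is viewed or printed, which is why no individual letter has to be retyped.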

<P>&greeting1;

You can already see that this tag simply replaces having to type out the greeting (and put it in bold) and makes it easy to change. The "<P>" is simply a standard SGML tag which identifies the text which follows as a paragraph. You're probably starting to see now why we aren't concerned about what the text looks like. What a paragraph is, what it looks like, spacing, etc., have already been programmed into SGML. We have only to tell SGML that the text is a paragraph; it does the rest. That's why, as pointed out earlier, it wouldn't make any difference if there were no spaces at all in the example. Each different type of text, due to its tag, will be spaced and formatted accordingly.

For instance, if you wanted every paragraph in your document to start with a drop cap, as this example paragraph does, you don't have to concern yourself at all with the formatting or programming required to do so. As long as the text is tagged as a paragraph, SGML can then be instructed to format every paragraph so that each begins with a drop cap. Or, as is more commonly the case, you could specify that only paragraphs at the beginning of a new chapter start with a drop cap, and so on.

Now that you're getting the hang of it, let's skip down to the third paragraph, where we see the only example of where the operator has actually provided instructions to SGML on what to do with the text, instead of describing what the text is. In this case, the operator has instructed that the phrase "toll-free" appear in italics by surrounding it with "<I>," and "</I>."
(When tagging, the first tag tells SGML where to start, while the second tag that starts with a "/" tells SGML where to stop. Some types of text that are obviously always the same, like regular paragraphs, which always end with a carriage return, don't need the end tag to tell SGML where they stop.)
It was pointed out earlier that this is just a hypothetical example. Both from a programming standpoint and from the standpoint of what might be most efficient for the company and staff there are many other ways this same letter could be set up. For instance, if the company changes to a customer service number that isn't toll-free, a staff person will have to update every document by searching for and removing the phrase "toll-free" just as would have to be done on a non-SGML word processing system. If a special tag like those used for the company name, etc., were used instead of "toll-free," then that information could also be updated organization-wide via one central change.
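
As a small sketch of that alternative, the phrase could be given its own name tag. The name custsvc_linetype is invented here purely for illustration; it does not appear in the original letter.

```sgml
<!-- Hypothetical declaration, stored centrally with the other name tags. -->
<!ENTITY custsvc_linetype "toll-free">

<!-- The sentence in the letter would then read: -->
<P>Should you have any questions or have any problem unpacking and using your
&coname1; generator, please feel free to call our <I>&custsvc_linetype;</I>
customer assistance line listed below.
```

If the number later stopped being toll-free, only the quoted text in the declaration would need to be changed.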

Coming next to the contact information for the customer service department, here also tags have been used so that the information can easily be updated. If after the merger the customer service department is moved under the roof of the other company's Illinois offices, that information can also be updated centrally without having to retype a single document. The tags surrounding the customer service information tell SGML how it should appear: indented, in this case.

Now we'll move to the section just before the last:

<![ %ms_ca_notice1; [

<P>&ca_notice1;

]]>

Remember from earlier that varying state regulations require that customers be notified of different things depending upon the state? Remember the emissions notice that was required just for the state of California? Well, the tagging above takes care of both. We don't want the text typed in by hand because someone could make an omission or other mistake. Also, if we did that, then a separate copy of the letter would have to be kept to accommodate each state that requires a different notice.
Speaking of notices, what if changing regulations, etc., mean the text has to be changed? Perhaps it becomes mandatory that the word "Notice:" appear at the beginning of the statement. Here again, the advantages of SGML are obvious. Not only do these three lines of SGML code handle this task for BestGen for the state of California, but, using the same SGML technique, notices, disclaimers, etc., for yet other states and situations can all be provided for in the same document. When the document is viewed or printed, only the added text needed for that state or situation will appear.
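
For the curious, here is a minimal sketch of how the on/off switch behind those three lines might be set up. INCLUDE and IGNORE are standard SGML keywords; the file name formltr.dtd, the placement of the declarations, and the placeholder notice text are assumptions made for illustration.

```sgml
<!-- In the shared (hypothetical) DTD: by default the notice is left out.
     The "%" marks ms_ca_notice1 as a switch used inside the markup itself. -->
<!ENTITY % ms_ca_notice1 "IGNORE">
<!ENTITY ca_notice1 "(text of the California notice goes here)">

<!-- At the top of a letter bound for California: a declaration placed in
     the document's own prolog is read first and takes precedence, so the
     notice is switched on for this copy only. -->
<!DOCTYPE FORMLTR SYSTEM "formltr.dtd" [
  <!ENTITY % ms_ca_notice1 "INCLUDE">
]>
```

A notice for another state could be handled the same way, with its own switch and its own block of three lines in the letter.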

Drawbacks

In accordance with the familiar saying, I suppose no discussion would be complete without at least touching on the negatives. Fortunately, when it comes to SGML, there are few. Many will discover one potential negative when, being sold on the benefits, they inquire into purchasing an SGML product for themselves or their company: price. Especially in light of SGML's mainframe heritage where software costs routinely ran into the thousands, today's packages (thanks to the popularity of the PC) are much more affordable. However, as touched on earlier, SGML packages still tend to be larger and more complex than word processors, and therefore are still more expensive.
Few full-blown authoring packages (those designed for the development of complete SGML applications) start at less than $1,000, and prices can still range well into the thousands. For a company purchasing for its technical staff for development purposes that's not so bad, but providing full-blown copies to large numbers of people creating or viewing documents throughout an organization can still be an expensive proposition. However, most employees wouldn't need software with every available SGML capability. Also, inexpensive viewers are available for those who need only to view, and not create, SGML documents. In addition, as with most computer-related technologies, prices gradually fall as more SGML software is sold. Lastly, it is probably only a matter of time before at least some SGML capabilities become a standard, built-in feature (or low-cost add-on) of the most popular software packages, starting with, of course, word processors.

Also, as pointed out earlier, many organizations may experience difficulty finding people with the technical background required to use the more powerful features of SGML. Likewise, the non-technical employees who are the most obvious candidates for training to develop SGML applications (writers, editors, etc.) may not always be interested in performing, or able to perform, the functions of a developer or programmer. However, this is really no different from the shortage organizations are experiencing in general when it comes to finding people experienced in, or good at learning, the latest technologies.

About the only other drawback of SGML is that, since it doesn't yet enjoy mass-market use, educational and learning resources (while certainly available to large organizations) aren't as commonly available either. For Microsoft Word, WordPerfect, Windows, Visual Basic, etc., you can turn to any good bookstore, computer store, or catalog for books, CDs, and other resources; the same is not true of SGML. Also, of those materials that are available, most tend to favor a more technical audience.

Conclusion

This introduction has only scratched the surface of the benefits and capabilities of SGML, and it may not be for everyone yet. But there is enough enthusiasm and momentum surrounding its benefits that, as happened with WordPerfect and then Microsoft Word, a point will likely come when any progressive, competitive organization or professional will want to have SGML to stay current and to facilitate the exchange of information with others. A number of links are provided below to resources that can help you learn more about SGML, as well as where to obtain SGML software.

About The Links Below

While there is introductory information on the Web about SGML, most is fairly technical and assumes at least some familiarity with text markup languages. Nonetheless, I've tried to provide you with a few of the very best SGML resources to serve as a launching pad for further information gathering. Some of the resources and primers which are non-technical are designed to assist the management- or administrative-level person in making decisions regarding the implementation of SGML within his or her organization. Robin Cover's SGML/XML Web Page offers the most comprehensive database of information and the largest index to other resources.
Since I have no way of knowing what platform you might be considering an SGML product for (Windows, Unix, etc.), links are provided to the home pages of the various SGML vendors and not directly to any specific product's page (except in the cases of Microsoft and WordPerfect). That way you can get an overall picture of each company's complete offerings and pursue those best suited to your needs.



SGML/XML Resources and Vendors


An Introduction to the Extensible Markup Language (XML)

A Gentle Introduction To SGML (U of M)

Robin Cover's XML Site

Adobe

ArborText

Bright Path Solutions (an outstanding organization!)

BroadVision (formerly Interleaf)

Corel (WordPerfect)

DocBook.org

Microsoft's XML Developer's Site

i4i (XML-Enables Microsoft Word)

OASIS

SoftQuad




Copyright © 1998 Phillip A. Covington