What is SGML?
SGML is an acronym for Standard Generalized Markup Language and first appeared as a published standard in the form of ISO 8879 in 1986. Importantly, SGML is not a computer program and does not stand for Slaphappy Geeks Markup Language.
To understand what SGML does and why it is important we will need to take a brief look at the history of electronic document publishing.
What is a ‘markup language’?
The term markup comes to us from the printing and publishing business, from the days before electronic documents became commonplace. When a document was sent to the printer, it would be marked up with notations telling the typesetter which words should be in bold or italic type, what font and ink color should be used, and how the pages should be laid out.
When electronic documents came into common use it was similarly necessary to include information telling how they should be displayed or printed. The term markup was also applied to the electronic counterpart of print markup notations.
But there was no standard markup method and most electronic documents were consequently limited to use on a single system or by a single software application.
What was needed in a standard
What was needed was not just another, equally arbitrary standard, but one that could express all the ‘primitive’ ideas of all the other markup languages. It would then be possible to re-express any existing document in the standard form, and from there in any other form.
The standard that met this need was SGML – the Standard Generalized Markup Language.
Now, you might think that SGML defined a set of tags or markup expressions to define fonts, colors, images and such. But that is exactly what SGML does not do!
The genius of SGML is that it only defines a very few primitive concepts that must be common to any markup language. As such it is a language for writing markup languages!
Key concept: Separation of structure from presentation
The key underlying concept behind SGML is that it only expresses the structure of documents and is never concerned with how a document may be rendered.
For example, SGML tells you how to express the ideas of headings or paragraphs, or any other element, but it says nothing about how those elements should be rendered – that is solely a function of any device that may use the document!
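The separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Python standard library has no SGML parser, so the sample document is written as XML, and the element names (article, heading, para) and both render functions are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# A purely structural document: it names a heading and a paragraph,
# and says nothing about fonts, colors or layout.
DOC = (
    "<article>"
    "<heading>What is SGML?</heading>"
    "<para>A language for writing markup languages.</para>"
    "</article>"
)

def render_plain(root):
    """One hypothetical device: a plain-text terminal."""
    lines = []
    for child in root:
        if child.tag == "heading":
            lines.append(child.text.upper())   # headings shown in capitals
        elif child.tag == "para":
            lines.append("  " + child.text)    # paragraphs shown indented
    return "\n".join(lines)

def render_html(root):
    """Another hypothetical device: a web browser."""
    html_tags = {"heading": "h1", "para": "p"}
    parts = []
    for child in root:
        tag = html_tags[child.tag]
        parts.append(f"<{tag}>{escape(child.text)}</{tag}>")
    return "".join(parts)

root = ET.fromstring(DOC)
plain = render_plain(root)
as_html = render_html(root)
print(plain)
print(as_html)
```

The document itself never changes. Each device decides for itself how a heading or a paragraph should look.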
This idea carries over into XML, which is a simplified subset of SGML, and into CSS, which is a method of expressing the rendering styles for structured documents.
What are the simple ideas behind SGML?
A study of any markup language expressed in SGML will show that only three basic concepts are necessary:
- The idea of a markup entity
- The idea of a markup element and associated attributes
- The idea of a document type
At the most basic level all text is simply composed of streams of symbols, characters, electronic data bytes or marks on a page. These symbols are the entities of SGML.
At the next higher level, texts are composed of groups of entities arranged according to some linguistic or functional definition. These are the elements of SGML and are the structural components of a document.
There are always rules, or a grammar, which define how these elements may be combined and where they may appear in a coherent document. This level is the document type.
It may be shown that these three concepts are sufficient to describe all the complexities of any marked up text of any kind and for any purpose!
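To make the three concepts concrete, here is a minimal sketch of a tiny document type with one entity, three elements and one attribute. It rests on several assumptions: the memo document type and the org entity are invented for illustration, the document is written as XML because Python's standard library has no SGML parser, and the parser used here expands entities but does not check the document against the declared rules.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE memo [
  <!-- Document type: which elements exist and how they may be combined -->
  <!ELEMENT memo (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!-- Attribute: extra information attached to an element -->
  <!ATTLIST memo priority (low | high) "low">
  <!-- Entity: a named piece of text that can be referenced as &org; -->
  <!ENTITY org "Example Publishing Ltd">
]>
<memo priority="high"><to>Editors</to><body>Welcome to &org;.</body></memo>"""

root = ET.fromstring(DOC)

root_tag = root.tag                       # the outermost element
priority = root.get("priority")           # the attribute on that element
children = [child.tag for child in root]  # the elements nested inside it
body_text = root.find("body").text        # text with the entity expanded

print(root_tag, priority, children)
print(body_text)
```

A validating parser would go one step further and reject a memo whose elements appear in the wrong order, which is exactly the job of the document type.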
How would I use SGML?
The truth is that unless you are a publishing or production printing professional, you will probably never encounter SGML directly!
But if you run a web site you will certainly use HTML, XHTML, XML or another markup language, usually alongside a style sheet language such as CSS. Those markup languages are derived from SGML, or closely modeled on it, and share the simple but powerful concepts on which it is based!
You will also use the syntax of SGML when using other markup languages: the <element attribute="value" /> notation we know as tags and the &entity; notation we usually call HTML entities are instances of the elements and entities of SGML. Now you know!
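As a small illustration of that shared syntax, the sketch below feeds one tag and one entity reference to the HTML parser in Python's standard library. The TagLogger class and the sample text are hypothetical, chosen only for this example, and an HTML parser is used here as a stand-in because the standard library does not include an SGML parser.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Records each start tag with its attributes, plus the text it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []     # (element name, attributes) pairs
        self.pieces = []   # chunks of character data

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))

    def handle_data(self, data):
        # Entity references such as &amp; arrive here already converted.
        self.pieces.append(data)

parser = TagLogger()
parser.feed('<element attribute="value">fish &amp; chips</element>')
parser.close()

text = "".join(parser.pieces)
print(parser.seen)
print(text)
```

The parser reports an element named element carrying an attribute named attribute, and the &amp; reference comes back as a plain ampersand.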
Credit where due: The ideas for this article, although not the text, were heavily influenced by the article “What is SGML and How Does It Help?” by Lou Burnard (1996), mirrored from http://sable.ox.ac.uk/ota/teiedw25/ but not available at the time of this writing.