Wednesday, August 19, 2015

Introduction

This web site provides information for working with S1000D XML using widely applied techniques and tools for automating production processes. Examples include XSLT transformations and other forms of code that can be applied to increase the capacity of the production organization. These are provided as-is without warranty and the user is responsible for ensuring that they work within the context of the production processes of their organization.

What Is XML?


If I had a dollar for every time I've given this speech, I might take up computer science :) What follows is the short story.

Take a look at HTML. It is a markup application language. Its parent language is SGML (Standard Generalized Markup Language). XML is a cut-down version of SGML, for reasons a bit too technical to describe here. XML is a meta-language. It defines the characters that may be used in a language, how they are arranged, and how a program called an XML parser behaves when passing those characters to an application, for example a formatting engine that does the actual layout of the page. This is important: XML does NOTHING. It is a set of rules. XML is applied by using it to declare an application language such as XHTML, the XML version of HTML 4.
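The idea that XML is only a set of rules enforced by a parser can be sketched in a few lines. This is a minimal illustration using the XML parser in Python's standard library. The `procedure` and `step` tags and the `parse_steps` function are made up for the example and do not come from any real application language.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags, invented for this illustration only.
WELL_FORMED = "<procedure><step>Remove the panel.</step><step>Inspect the seal.</step></procedure>"
MALFORMED = "<procedure><step>Remove the panel.</procedure>"  # <step> is never closed


def parse_steps(xml_text):
    """Return the text of each step, or None if the input breaks the XML rules."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser rejects input that is not well-formed XML.
        return None
    # The parser only hands over a tree of elements.
    # Deciding what to do with that tree is the application's job.
    return [step.text for step in root.iter("step")]


print(parse_steps(WELL_FORMED))
print(parse_steps(MALFORMED))
```

The parser checks the rules and passes the structure along. Everything visible to a reader, such as layout or numbering, has to come from the application that receives it.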

Right-click on this web page. You should see a menu item such as View Source. If you click on it, you should see the markup under the web page. It will look like computer code, and the part inside the script tags actually is code. Those scripts support the page controls such as menus. Originally HTML was much simpler, but that was more than twenty years ago, when the web was young.

Who decides what the tags are? Sometimes an application development shop, sometimes a standards organization. Usually it is an organization such as ATA, ISO or the W3C. The application language in most use locally is S1000D. It is a set of tags used to mark up a technical manual. A manual is defined as a set of data modules, for example description modules, procedure modules, parts modules, etc. The S1000D tags surround the text to be displayed, and other tags include by reference the locations of external items such as graphics. At this point, all you have is a heap of tagged text and links.
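To make the "heap of tagged text and links" concrete, here is a small sketch. The fragment is loosely modeled on S1000D procedural markup, but it is heavily simplified, it is not a valid data module, and the graphic identifier `ICN-EXAMPLE-0001` is invented for the example. The Python variable names are also only illustrative.

```python
import xml.etree.ElementTree as ET

# Simplified fragment loosely modeled on S1000D procedural markup.
# It is NOT schema-valid and the graphic identifier is made up.
FRAGMENT = """
<mainProcedure>
  <proceduralStep>
    <para>Remove the access panel.</para>
    <figure>
      <title>Access panel location</title>
      <graphic infoEntityIdent="ICN-EXAMPLE-0001"/>
    </figure>
  </proceduralStep>
  <proceduralStep>
    <para>Inspect the seal for damage.</para>
  </proceduralStep>
</mainProcedure>
"""

root = ET.fromstring(FRAGMENT.strip())

# The tagged text: what will eventually be displayed.
step_texts = [para.text for para in root.iter("para")]

# The links: references to external items such as graphics.
graphic_refs = [graphic.get("infoEntityIdent") for graphic in root.iter("graphic")]

print(step_texts)
print(graphic_refs)
```

Nothing in the fragment says how a step should look on a page. It only records what the text is and which external graphic belongs with it.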

To render this to a readable page, a formatting language such as XSL is applied to add the formatting information. This is called a style sheet language. It may or may not also be an XML language (XSL is, CSS is not). Again, as with S1000D, the formatting information is just data.

The magic happens when a rendering system takes both of these and combines them into a rendered image. This application is often a browser such as the one you are using right now to see this page. It may be a print application, an interactive web page or both.

The point is that an XML application language such as S1000D is a way to structure the data for processing by a computer program. That's it. Good practice separates the content, for example the steps in a repair procedure, from the formatting information that renders it as, say, a nested list of steps in a font such as Arial or Times Roman.

Why not put the formatting information in the same structures as the procedural steps, as is done in other languages such as RTF?

One reason is that the same information can be created once and delivered to different devices using different style sheets. Thus the same web page or technical manual page can be rendered to your web browser on a wide screen and to a handheld device such as your smartphone, and only the style sheet has to change. Another reason is that long-term preservation of the information is better and cheaper. For example, US Army missile systems such as Patriot can remain in service, and in fact have remained in service, decades longer than the systems originally used to print their manuals. Updates are also easier because information can be added, changes can be tracked and so on without worrying about the final rendering.
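The one-source, many-renderings idea can be sketched as follows. Note the substitution: two small Python functions stand in for style sheets here, where a production process would more likely use XSLT. The tag names and the function names `render_html` and `render_text` are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# One copy of the content, using hypothetical tags invented for this example.
CONTENT = "<procedure><step>Remove the panel.</step><step>Inspect the seal.</step></procedure>"


def render_html(root):
    """Stand-in for a wide-screen style sheet: an HTML ordered list."""
    items = "".join(f"<li>{escape(step.text)}</li>" for step in root.iter("step"))
    return f"<ol>{items}</ol>"


def render_text(root):
    """Stand-in for a small-screen style sheet: numbered plain-text lines."""
    steps = root.iter("step")
    return "\n".join(f"{number}. {step.text}" for number, step in enumerate(steps, start=1))


root = ET.fromstring(CONTENT)
html_page = render_html(root)
text_page = render_text(root)

print(html_page)
print(text_page)
```

The content string is never edited. Supporting a new device means writing another rendering function (or, in practice, another style sheet), which is the cost saving the paragraph above describes.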

It is important to understand these concepts when creating S1000D deliverables. While it is technically possible to use a document program such as Microsoft Word to create S1000D modules, it is a bad idea and a bad practice. It stores technical information in a file format that is proprietary and, worse, difficult to transform for use in a different document system. Thus reuse of the information for different mission objectives becomes more expensive. It requires the customer to keep that proprietary system at a version compatible with the one used to create the information, and this creates an unhealthy contract relationship between contractors and customers. It is called entrapment.

Entrapping a customer in a contract is an old, maligned, but often pursued goal in business. Caveat emptor.

Purpose

This information is being provided in response to observations at a number of US DoD contractor sites where the techniques applied to XML production for S1000D projects do not enable a cost-effective, rapid response to the mission objectives for that information.

Too many organizations rely on obsolete processes inherited from years of creating and managing technical information with older SGML-based products. Further, in too many cases the managers and planners for the projects have fallen behind and failed to master even the earliest production potentials of XML. Basic technologies such as XSLT are neglected, and instead these organizations rely on tagging armies and excessive man-months to do work that basic automation can do much more effectively and cheaply.

Overview

Several technologies and standards will be used as part of this tutorial including:
  • S1000D XML
  • MIL-STD-3031A
  • XSLT
  • IADS
  • Oxygen Author/Editor
  • XML Creator and Utilities for S1000D
  • Notepad++
  • Microsoft Excel
  • Microsoft Visual Basic
  • Microsoft Access
  • Microsoft SharePoint

The examples given here are intended to be self-contained and explanatory. Readers who wish to work through or apply the examples directly will in most cases require some or all of these technical implementations. However, it is not the intent of the author to market these, and the author has no financial relationship with the makers of these packages. All of these, in some form, are standard office desktop products.

While production systems dedicated to creating S1000D products are available on the market, this presentation is not intended to explain them. Many are expensive enterprise systems, and for large organizations managing multiple projects they are well worth the expense. This weblog is provided to teach nuts-and-bolts techniques that do not require these enterprise systems.

Also, some examples given can be improved with newer applications of the enabling standards such as XSLT. In such cases, readers who have alternatives that provably improve the examples are invited to cite where these can be obtained, with alternative example code that can be included in this web site. Code fixes for errors in the examples are also welcome as long as they are open source and unencumbered by intellectual property claims.

In the following article, the role of each of the technologies listed above is explained in the context of this presentation.
