YAWC Pro : One Click Publishing from Word
 


 

YAWC Pro Webmaster's Guide

The YAWC Pro Webmaster's Guide is intended for users who wish to customise YAWC for their own website. It assumes some knowledge of the HTML and XML markup standards. Because YAWC is intended to produce high-quality web pages that meet current public standards for accessibility for people with disabilities and for searchability by search engines, some knowledge of the WAI (Web Accessibility Initiative) Guidelines and the Dublin Core Metadata Element Set (DCMES) is also assumed. This guide describes only how to modify the supplied HTML configuration to suit your needs. Please consult the Administration Guide for a more detailed description of how YAWC works.

YAWC is installed with a generic configuration for converting MS-Word documents into HTML. On English editions of Windows, the files required for this configuration are located in the directory C:\Program Files\YAWC\HTML. On non-English systems, the local-language equivalent replaces "Program Files", e.g. "Programme" in German. This directory contains the following files.

YAWC configuration files for web pages

| Filename | Description |
|---|---|
| HTML-en.dot | A Word template used to reference the YAWC conversion engine and provide authoring assistance to users |
| yawcHTMLTemplate.htm | A template for web pages generated by YAWC, which is well-formed XHTML |
| yawcHTML.ini | The YAWC configuration file, used to define a mapping between Word styles and HTML elements, and other configuration details |
| gettingstarted.doc | An initial Word document for testing that YAWC is installed correctly |
| yawcHTML.xsl | An XSLT transformation script that transforms raw XML content from Word into a finished web page |
| xhtml1-strict.dtd | The XHTML 1.0 Strict DTD, against which the HTML template is validated |
| HTMLSample.doc | A sample Word document demonstrating the Word features that YAWC supports for conversion, including tables and images |
| smallcover.gif | Sample image linked to HTMLSample.doc |

Begin the customisation by making a copy of the generic HTML configuration directory and naming it after your own site, e.g. C:\Program Files\YAWC\YourSite. Then carry out the following steps on the files in this copy, rather than in the default HTML directory. Three main configuration tasks are required before YAWC can generate pages for your website:

  1. Word Template Configuration and Installation
  2. HTML Template Configuration
  3. YAWC Configuration File

Each of these tasks is described in detail below.

Word Template Configuration and Installation

The Word template HTML-en.dot contains some default metadata values that you should modify for your site, using the following steps.

  1. Start Microsoft Word, and open the template file using the File>Open command.
  2. Use the YAWC>Edit Metadata command to open the Metadata dialog box.
  3. In the Mandatory tab group, replace the Publisher field with the name of your organisation.
  4. In the Recommended tab group, replace the Language and Rights fields with values appropriate to your organisation.
  5. Click OK to close the dialog box, then choose File>Save to save the template and File>Close to close the file.

Before proceeding, you may wish to rename the template to reflect your local language, e.g. HTML-de.dot for German. Then create a shortcut to the template and place it in your Microsoft Office Templates directory (usually C:\Program Files\Microsoft Office\Templates), using the following steps.

  1. In Windows Explorer, select the YourSite directory to view all the files, and select the file HTML-en.dot.
  2. Choose File>Create Shortcut to create a shortcut to the file in this directory.
  3. Select the shortcut, and rename it to a suitable name, e.g. YourSite Web Page, using the command File>Rename.
  4. Drag the file to the Microsoft Office Templates directory and drop it there.
  5. In Microsoft Word, choose File>New and check that you can see the file in the list of templates available.

This completes the basic configuration required for Word. You may wish to carry out some more advanced configurations, and these are described below. Otherwise, proceed to the next section.

  • To share the YAWC configuration files among a number of users, copy the C:\Program Files\YAWC\HTML directory to a location on a shared drive, e.g. N:\YAWC\YourSite.
  • To enable users to access a common templates directory, open the Tools>Options dialog box, File Locations tab group, and set the "Workgroup templates" file type to a location on a shared drive, e.g. N:\Microsoft Office\Templates.
  • To localise the YAWC command menu to your own language, choose View>Toolbars>Customize (or its equivalent in your language edition of Word), click on the YAWC menu item, and right-click on the menu item you wish to translate. This brings up a further menu with an editable text field containing the command name. Feel free to send us any translations you make, so that we can share them with other speakers of your language.

HTML Template Configuration

This section describes how to set up your own HTML template, which will be used as the wrapper into which content from converted Word documents is placed. It usually contains the site navigation graphics and links, logos, banners, etc. - everything that is consistent across all pages on your site, or perhaps a section of your site. Your site probably already has a template defined for it, which you can modify. If not, copy a page from your existing site, and save it as yawcHTMLTemplate.xml. Edit this file using an XML-aware editor such as XML Writer (a cheap XML text editor) or XMetaL (a WYSIWYG XML editor). The template file must be modified to make it a well-formed XML document. Ideally, it should validate against the XHTML 1.0 Strict DTD (xhtml1-strict.dtd) as well. XMetaL can open HTML documents and save them as XHTML, which saves a bit of work.

Using your XML editor, fix any empty elements so that they are XML-compliant, e.g. <br> becomes <br/>. Other changes are also required.

  • Elements must be properly nested, e.g. <b><i>YAWC Pro</i></b>
  • Paragraph elements must be closed, e.g. <p> .... </p>
  • Image elements must be closed, e.g. <img src="images/banner.png" alt="banner image"/>
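
The well-formedness rules above can be checked with any XML parser before the template is used with YAWC. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name is_well_formed is purely illustrative and is not part of YAWC. Note that this parser may reject HTML named entities such as &nbsp; as undefined, so numeric character references are safer when testing this way.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Properly nested inline elements, closed empty elements.
good = ('<p><b><i>YAWC Pro</i></b><br/>'
        '<img src="images/banner.png" alt="banner image"/></p>')

# Overlapping <b>/<i> elements and an unclosed <br>.
bad = '<p><b><i>YAWC Pro</b></i><br></p>'

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A well-formed file is not necessarily valid; validating against xhtml1-strict.dtd still requires a validating parser or an XML editor.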

After making the file well-formed, try to make it valid: add the following lines to the top of the file, and then check it for validity.

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1-strict.dtd">

Validating against the Strict DTD requires that you remove all font elements and replace them with rules in a CSS (Cascading Style Sheet) file, or in a style element at the top of your file. This is worth doing, as it improves the accessibility of your web pages for people with disabilities, which is a requirement for public sector websites.

Once you have made the file as compliant as you can, you must add some special markup strings that will be used by YAWC. During the conversion process, YAWC replaces each of these strings with content from the Word file. These are shown in the table below.

YAWC markup strings

| YAWC markup string | Location | Replaced by |
|---|---|---|
| <?yawc insert-title?> | Inside the HTML title element. Example: <title> YourSite - <?yawc insert-title?></title> | The contents of the Title paragraph in the Word document. |
| <?yawc insert-meta?> | Inside the HTML head element. Example: <head> <?yawc insert-meta?> </head> | The contents of the metadata dialog box, as a series of meta-tags. |
| <?yawc insert-content?> | Inside the main body of the template. Example: <body> ...<td> <?yawc insert-content?> </td> | The contents of the entire Word document. |
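
Before using a template, it can be useful to confirm that all three markup strings are present. The sketch below is one way to do that in Python; the function name missing_markers and the sample template are hypothetical, and the check only looks for the strings themselves rather than verifying that each sits in the right element.

```python
import re

# The three markup strings described in the table above.
REQUIRED = ("insert-title", "insert-meta", "insert-content")


def missing_markers(template_text):
    """Return the required YAWC markup strings not found in the template."""
    found = set(re.findall(r"<\?yawc\s+([\w-]+)\s*\?>", template_text))
    return [name for name in REQUIRED if name not in found]


# Hypothetical template that forgets the content marker.
template = """<html><head><title>YourSite - <?yawc insert-title?></title>
<?yawc insert-meta?></head>
<body><p>navigation goes here</p></body></html>"""

print(missing_markers(template))  # ['insert-content']
```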

Finally, you should modify any links to images or other web pages in the template so that they use absolute rather than relative paths, e.g. <a href="/index.htm">Home</a> instead of <a href="index.htm">Home</a>. This ensures that the links still point to the correct location, even if your generated files are stored in directories at many different levels.
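
Relative links are easy to overlook in a large template. As a rough sketch, and assuming the template is already well-formed, the following Python function lists href and src values that are neither site-absolute (starting with /), fragment-only, nor full URLs. The name relative_links and the sample markup are illustrative only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def relative_links(xhtml):
    """Return href/src values that use a relative path."""
    root = ET.fromstring(xhtml)
    hits = []
    for element in root.iter():
        for attr in ("href", "src"):
            value = element.get(attr)
            if value is None:
                continue
            parts = urlsplit(value)
            # Skip full URLs (http:, mailto:, //host), /paths and #fragments.
            if parts.scheme or parts.netloc or value.startswith(("/", "#")):
                continue
            hits.append(value)
    return hits


sample = ('<div><a href="/index.htm">Home</a>'
          '<a href="index.htm">Home</a>'
          '<img src="images/banner.png" alt="banner image"/></div>')

print(relative_links(sample))  # ['index.htm', 'images/banner.png']
```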

Your template is now ready for use. Rename it as yawcHTMLTemplate.htm.

YAWC Configuration File

This section describes how to configure the YAWC Configuration file for your site.

Metadata

The metadata used is based on the Dublin Core Metadata Element Set (Dublin Core). This set is maintained by the Dublin Core Metadata Initiative, an open forum devoted to the development of "interoperable online metadata standards that support a broad range of purposes and business models".

Default values for metadata elements may be specified in the configuration file, inside the [METAInformation].....[/METAInformation] section. Some of these metadata elements are site-specific and should be modified by the webmaster. The user can override the default values in the "Edit Document Information" dialog box in Microsoft Word.

DC.Creator and DC.Type are both available as drop-down lists in the Word document. The default values for these lists, [Creator-1|Creator-2|.....] and [Type-1|Type-2|......] respectively, should be overwritten with site-specific values in the configuration file.

DC.Date.created and DC.Date.modified default to the ISO 8601 (W3CDTF) date format, YYYY-MM-DD.
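
If date values are entered by hand, a small check can catch values that do not follow the YYYY-MM-DD pattern. The Python sketch below is an illustration only; the function name is_w3cdtf_date is hypothetical, and it covers just the complete-date form of W3CDTF, not the variants with times or time zones.

```python
import re
from datetime import date


def is_w3cdtf_date(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if match is None:
        return False
    try:
        date(int(match.group(1)), int(match.group(2)), int(match.group(3)))
    except ValueError:
        # Pattern matched but the date does not exist, e.g. 30 February.
        return False
    return True


print(is_w3cdtf_date("2001-04-13"))  # True
print(is_w3cdtf_date("13/04/2001"))  # False
print(is_w3cdtf_date("2001-02-30"))  # False
```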

DC.Identifier, DC.Publisher and DC.Rights contain values specific to each site and should be defined by the webmaster in the configuration file. The default value for DC.Format should be changed only for non-HTML output, while DC.Language should be given the appropriate value for non-English documents.

XSLT Transformation

YAWC converts Word documents into XML, and then converts the XML into HTML using XSLT, the XML transformation language. The ConversionOptions section of the configuration file specifies the XSLT transformation options. These options are stored as a well-formed XML fragment, as shown below.

<yawcXSLTTransformation pass="1">
<yawcStylesheet>yawcHTML.xsl</yawcStylesheet>
<yawcProcessor>MSXML3</yawcProcessor>
<yawcOutputFileExtension>.htm</yawcOutputFileExtension>
<yawcStripHTMLContentType>yes</yawcStripHTMLContentType>
<yawcKeepTempFile>no</yawcKeepTempFile>
#<yawcFTPHost>ftp.yawcpro.com</yawcFTPHost>
#<yawcFTPRoot>/htdocs</yawcFTPRoot>
#<yawcFTPLogin>username</yawcFTPLogin>
#<yawcFTPPassword>password</yawcFTPPassword>
<param name="HTMLTemplate" select="'yawcHTMLTemplate.htm'" />
<param name="deleteEmptyParagraphs" select="'yes'" />
<param name="stripClassAttributeSpaces" select="'yes'" />
</yawcXSLTTransformation>

Three types of parameter are defined: global YAWC options, HTML parameters and FTP upload parameters. The use of this section is described in more detail in the Administration Guide. This guide describes only the parameters you need to change for HTML support.
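
To see at a glance which options are active in a given configuration, the fragment can be read with a few lines of Python. This is a sketch under stated assumptions, not a description of how YAWC itself reads the file: it assumes that lines beginning with # are simply ignored, and the function name read_options and the shortened sample fragment are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the fragment, with one FTP line still commented out.
FRAGMENT = """<yawcXSLTTransformation pass="1">
<yawcStylesheet>yawcHTML.xsl</yawcStylesheet>
<yawcOutputFileExtension>.htm</yawcOutputFileExtension>
#<yawcFTPHost>ftp.yawcpro.com</yawcFTPHost>
<param name="HTMLTemplate" select="'yawcHTMLTemplate.htm'" />
</yawcXSLTTransformation>"""


def read_options(fragment):
    """Return active options as a dict, skipping lines that start with #."""
    active = "\n".join(
        line for line in fragment.splitlines()
        if not line.lstrip().startswith("#")
    )
    root = ET.fromstring(active)
    options = {}
    for child in root:
        if child.tag == "param":
            # XSLT parameters carry their value as a quoted string literal.
            options[child.get("name")] = child.get("select").strip("'")
        else:
            options[child.tag] = (child.text or "").strip()
    return options


print(read_options(FRAGMENT))
```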

The global YAWC options and the HTML parameters need not be changed. However, you may wish to change the default file suffix for HTML files from .htm to .html. To troubleshoot problems during the configuration process, you can keep the initial XML file that YAWC generates by setting the yawcKeepTempFile parameter to yes. The temporary file is stored in the directory specified by your TEMP environment variable. If your Word file is named index.doc, the temporary file will be named yawcindex.xml.
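
The naming rule for the temporary file can be expressed as a short Python sketch, which may help when hunting for the file during troubleshooting. The function name temp_xml_path and the example directories C:\Docs and C:\Temp are hypothetical; the rule itself (the prefix yawc, the Word file's base name, and the extension .xml, in the TEMP directory) is taken from the paragraph above.

```python
import os
from pathlib import PureWindowsPath


def temp_xml_path(word_file, temp_dir=None):
    """Return the expected path of the XML file kept by yawcKeepTempFile."""
    if temp_dir is None:
        # Fall back to the current directory if TEMP is not set.
        temp_dir = os.environ.get("TEMP", ".")
    stem = PureWindowsPath(word_file).stem  # "index" for "index.doc"
    return str(PureWindowsPath(temp_dir) / ("yawc" + stem + ".xml"))


print(temp_xml_path(r"C:\Docs\index.doc", r"C:\Temp"))  # C:\Temp\yawcindex.xml
```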

The FTP upload parameters specify values for uploading files to a website. If you wish to enable YAWC's "one-click web publishing from Word" feature, these need to be modified as follows.

  1. Remove the hash (#) symbol from the beginning of each line.
  2. Replace each parameter value (the text between the opening and closing tags) with the correct value for your website.
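
When the same change has to be made in several copies of the configuration file, the two steps above can be scripted. The Python sketch below is one possible approach, not a YAWC tool: the function name enable_ftp and the sample values are hypothetical, and it assumes each FTP parameter sits on its own line exactly as in the fragment shown earlier.

```python
import re
from xml.sax.saxutils import escape

# Matches a yawcFTP* element on its own line, with or without a leading #.
FTP_LINE = re.compile(r"^#?\s*<(yawcFTP\w+)>(.*?)</\1>\s*$")


def enable_ftp(config_text, values):
    """Uncomment the FTP parameters named in values and set their values."""
    out = []
    for line in config_text.splitlines():
        match = FTP_LINE.match(line)
        if match and match.group(1) in values:
            name = match.group(1)
            # Escape &, < and > so the fragment stays well-formed.
            line = "<{0}>{1}</{0}>".format(name, escape(values[name]))
        out.append(line)
    return "\n".join(out)


sample = ("#<yawcFTPHost>ftp.yawcpro.com</yawcFTPHost>\n"
          "#<yawcFTPRoot>/htdocs</yawcFTPRoot>\n"
          "<yawcKeepTempFile>no</yawcKeepTempFile>")

print(enable_ftp(sample, {"yawcFTPHost": "ftp.yoursite.com",
                          "yawcFTPRoot": "/htdocs"}))
```

Parameters that are not listed in the values argument are left untouched, so a line stays commented out until a value is supplied for it.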

One-click Web publishing

This section describes how this feature works. YAWC uses the Identifier metadata field and the FTP upload parameters specified in the configuration file to decide the exact location to which a document is uploaded. For example, if the value of the Identifier field is http://www.yoursite.com/news/2001/0413.htm, the FTP Host is ftp.yoursite.com and the FTP Root is /htdocs, then YAWC replaces the scheme and hostname part of the URL with the FTP Host and Root parameters and retains the rest of the directory path, giving ftp.yoursite.com/htdocs/news/2001/0413.htm.
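
The mapping from Identifier to upload location can be sketched in a few lines of Python. This illustrates the rule as described in this guide and is not YAWC's actual implementation; the function name ftp_target is hypothetical.

```python
from urllib.parse import urlsplit


def ftp_target(identifier, ftp_host, ftp_root):
    """Replace the scheme and hostname of the Identifier URL with host + root."""
    path = urlsplit(identifier).path  # e.g. "/news/2001/0413.htm"
    return ftp_host + ftp_root.rstrip("/") + path


print(ftp_target("http://www.yoursite.com/news/2001/0413.htm",
                 "ftp.yoursite.com", "/htdocs"))
# ftp.yoursite.com/htdocs/news/2001/0413.htm
```

Running this against the Identifier values of a few documents is a simple way to check where they would be uploaded before enabling the feature.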

Warning: This feature depends on all your Word documents having the correct value for the Identifier field. We recommend that you only enable this feature when all the files on your website have been created with the correct metadata, and you wish to allow individual users to maintain them directly afterwards.

Security Warning: This feature requires that you store the username and unencrypted password for your FTP site in the configuration file, where they can be read by anyone with access to that file, whether it is on your own PC or on a shared drive.
