A Technical Writing Tip

As a technical writer, I often need to take a Word document and turn it into a FrameMaker or XML-DITA package.

This is not usually a huge problem, except when the document contains a lot of images (mostly screen captures). Word incorporates images by copying them right into the document (embedding). With FrameMaker it’s preferable to link to the image file, as you would on a web page, and with XML-DITA linking is the only way to include images. If you simply import a Word document into FrameMaker, the images arrive embedded, which limits how you can work with them. Scaling, in particular, does not seem to be supported for embedded images.

So you’ll want to get the images out of the Word doc and into separate image files. One way to do this is to go through the original file, right-click each image you come across, and save it as an image file. This can be really time-consuming (and boring).

There is, of course, a better way.

This is not common knowledge, but a .docx file is made up of several smaller files compressed together using the ubiquitous .zip format. (The older binary .doc format is not a zip archive.) So the trick to getting all the images at once is simply to unzip the Word document.
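If you want to confirm this for a particular file before renaming anything, a few lines of Python will do it. This is only a minimal sketch using the standard zipfile module; the function name list_docx_parts is made up for illustration.

```python
import zipfile


def list_docx_parts(path):
    """Return the names of the files packed inside a Word file.

    Returns an empty list when the file is not a zip archive,
    which is what happens with a legacy binary .doc file.
    """
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as archive:
        # Member names use forward slashes, e.g. "word/media/image1.png".
        return archive.namelist()
```

For a .docx file the returned list should include entries such as word/document.xml, plus one word/media/... entry per embedded image.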

Here’s how to do this:

  1. Find your Word document in Windows Explorer and change its extension from .docx to .zip. If the file is an older .doc, first open it in Word and save it as .docx.
  2. Extract the files to a new folder.
  3. Open the folder and navigate to the word\media subdirectory.

You should see all the document’s images there. They will have generic numbered names (such as image1.png, image2.png and so on), which is not very descriptive, but on the plus side the numbering usually makes it easy to figure out the order in which the images should be shown.
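If you do this often, the rename-and-extract steps can be scripted. The following is a rough sketch rather than a finished tool: it assumes a .docx file, uses only Python’s standard library, and the names extract_media and natural_key are invented for this example. It copies everything under word/media into a folder of your choice and returns the file names sorted so that image2 comes before image10.

```python
import re
import zipfile
from pathlib import Path


def natural_key(name):
    """Sort key that compares runs of digits as numbers, not as text."""
    return [int(part) if part.isdecimal() else part.lower()
            for part in re.split(r"(\d+)", name)]


def extract_media(docx_path, out_dir):
    """Copy every file under word/media/ in a .docx into out_dir.

    Returns the saved file names in natural (numeric) order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for member in archive.namelist():
            # Zip member names always use forward slashes, even on Windows.
            if member.startswith("word/media/") and not member.endswith("/"):
                target = out / Path(member).name
                target.write_bytes(archive.read(member))
                saved.append(target.name)
    return sorted(saved, key=natural_key)
```

Calling extract_media("manual.docx", "manual_images") (both names are placeholders) leaves the original document untouched, so there is no need to rename it to .zip first.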