Office automation: Converting doc to docx

With the advent of Office 2007, Microsoft switched over to its Office Open XML (OpenXML) format for Office documents, which is quite a subject in itself and one I will blog about sometime in the future.

This post, however, is about converting older Word documents to the new format. I've seen a few sites that offer a conversion service for a fee, and I wonder whether that's even legal, seeing as Microsoft provides a free tool (ofc.exe) as part of its Office Migration Planning Manager (OMPM), which is available as a download from Microsoft.

It's a rather funny utility, which works in conjunction with the Office 2007 compatibility pack.

The compatibility pack mainly lets us open OpenXML documents in older versions of Office, minus the new functionality in Office 2007. It is a separate download from Microsoft.

Getting back to the ofc tool: alongside it you will notice a file called ofc.ini, which contains a number of settings you will need to set, most notably the following options in the [ConversionOptions] and [FoldersToConvert] sections.

[ConversionOptions]
; FullUpgradeOnOpen: if set to 1, Word documents will be fully converted to the OpenXML format
;                    if set to 0 (default), Word documents will be saved in the OpenXML format in compatibility mode
; Not applicable to Excel or PowerPoint files.
FullUpgradeOnOpen=0
 
[FoldersToConvert]
; The Converter will attempt to convert all supported files in the specified folders
; (do not include if specifying FileListFolder)
;fldr=C:\Documents and Settings\Administrator\My Documents
fldr=c:\abc
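
For a full upgrade rather than compatibility mode, the same two sections might look like the sketch below. It only flips FullUpgradeOnOpen to 1, as described in the comment above, and it assumes the same example folder (c:\abc); all other settings in ofc.ini are left as shipped.

```ini
[ConversionOptions]
; 1 = fully convert Word documents to the OpenXML format
FullUpgradeOnOpen=1

[FoldersToConvert]
fldr=c:\abc
```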
 

Alternatively, if we want to do a bit more than merely convert documents to the new format, we can do this programmatically using the Office 2007 primary interop assemblies (PIAs), which are available as a download from Microsoft.

In this example, we're simply going to convert a folder of older .doc documents to the new .docx format:

 
using System;
using System.IO;
using System.Reflection;
using Word = Microsoft.Office.Interop.Word;

class Program
{
    static void Main(string[] args)
    {
        Word._Application application = new Word.Application();
        object missing = Missing.Value;
        object fileformat = Word.WdSaveFormat.wdFormatXMLDocument;
        object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;
        DirectoryInfo directory = new DirectoryInfo(@"c:\abc");
        try
        {
            foreach (FileInfo file in directory.GetFiles("*.doc", SearchOption.AllDirectories))
            {
                // "*.doc" can also match .docx/.docm files, so check the extension explicitly.
                if (!string.Equals(file.Extension, ".doc", StringComparison.OrdinalIgnoreCase))
                    continue;

                object filename = file.FullName;
                // Change only the extension; a string Replace would also alter any
                // ".doc" elsewhere in the path and lowercase the whole name.
                object newfilename = Path.ChangeExtension(file.FullName, ".docx");
                Word._Document document = application.Documents.Open(ref filename,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing);
                try
                {
                    // Fully upgrade the document, then save it as .docx.
                    document.Convert();
                    document.SaveAs(ref newfilename, ref fileformat, ref missing, ref missing,
                        ref missing, ref missing, ref missing, ref missing,
                        ref missing, ref missing, ref missing, ref missing,
                        ref missing, ref missing, ref missing, ref missing);
                }
                finally
                {
                    // The .docx has already been written, so close without a save prompt.
                    document.Close(ref doNotSave, ref missing, ref missing);
                }
            }
        }
        finally
        {
            // Always shut Word down, otherwise a hidden WINWORD.EXE keeps running.
            application.Quit(ref missing, ref missing, ref missing);
        }
    }
}
 

Notice the call to document.Convert(): it tells Word to fully upgrade the document to the new OpenXML format instead of saving it in compatibility mode. You might want to omit it if you're planning to support previous versions of Office using the compatibility pack.
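
One detail worth a closer look when converting a whole folder is how the .docx name is derived from the .doc path. A plain string replacement touches every ".doc" in the path, not just the extension. The sketch below compares the two approaches without needing Word at all; the class name DocxNaming, its two methods, and the sample path are made up for illustration.

```csharp
using System;
using System.IO;

static class DocxNaming
{
    // Naive approach: lowercases the whole path and replaces every ".doc" in it.
    public static string ByReplace(string docPath)
    {
        return docPath.ToLower().Replace(".doc", ".docx");
    }

    // Changes only the final extension and leaves the rest of the path alone.
    public static string ByChangeExtension(string docPath)
    {
        return Path.ChangeExtension(docPath, ".docx");
    }
}

class Demo
{
    static void Main()
    {
        // Hypothetical path with ".doc" inside a folder name.
        string path = @"c:\abc\my.docs\Report.DOC";

        Console.WriteLine(DocxNaming.ByReplace(path));         // c:\abc\my.docxs\report.docx
        Console.WriteLine(DocxNaming.ByChangeExtension(path)); // c:\abc\my.docs\Report.docx
    }
}
```

With the first approach the folder name is corrupted, so the save would target a folder that does not exist. Path.ChangeExtension avoids that and keeps the original casing of the file name.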





