30 AUG 2010

SP 2010: Developing with the Word Automation Services in SharePoint Server 2010


Posted by Tobias Zimmergren

http://www.zimmergren.net | http://www.tozit.com | @zimmergren 

Introduction

In SharePoint 2010, there is a new Service Application called Word Automation Services. This Service Application is used to convert documents from Word to different formats.

Word Automation Services can open roughly the same formats as Microsoft Word 2010 can:

  • Filetypes it can open include:
    .docx, .docm, .dotx, .dotm, .doc, .dot, .rtf, .mht, .mhtml, .xml (Word xml)

  • Filetypes it can save as include:
    .docx, .docm, .dotx, .dotm, .doc, .dot, .rtf, .mht, .mhtml, .xml (Word xml), PDF, XPS
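
If you want to skip files that the service can't open before you queue anything, a small filter can help. This is only a sketch: the class and method names are made up for illustration, and the extension list is simply copied from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the Word Automation Services API).
public static class WordConversionFormats
{
    // Input extensions copied from the list above.
    // Note that .xml only applies to Word XML documents; this check cannot tell those apart.
    private static readonly HashSet<string> InputExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".docx", ".docm", ".dotx", ".dotm", ".doc", ".dot",
            ".rtf", ".mht", ".mhtml", ".xml"
        };

    // Returns true when the file name has one of the listed input extensions.
    public static bool CanOpen(string fileName)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            return false;
        }

        return InputExtensions.Contains(Path.GetExtension(fileName));
    }
}
```

For example, WordConversionFormats.CanOpen("Report.DOCX") passes the check, while a .pdf or .xlsx file does not.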

In this article you’ll see an example of how you can use Word Automation Services to build a custom solution that converts documents (in any of the input formats listed above) into PDF documents.

Programmatically work with Word Automation Services in SP 2010

Note: You cannot deploy a solution that uses the Word Automation Services APIs as a Sandboxed Solution. Instead, you’ll need to deploy your application as a Farm Solution.

There’s not really a whole lot to it. Just follow along with these few steps and you’ll be fine!

In the API for Word Automation Services, the ConversionJob class gives you a few different ways to queue documents for conversion, including AddFile(), AddFolder() and AddLibrary().

In the following example I’ll demonstrate how to use the AddLibrary() method in order to convert the contents of an entire document library into PDF documents! (Yes, that is way awesome)

1. Add the Word Automation Services API reference to your project

The following reference needs to be added to your project:
Microsoft.Office.Word.Server

It can be found here:
<path>\14\ISAPI\Microsoft.Office.Word.Server.dll

2. Add the proper using-statements

 using Microsoft.Office.Word.Server.Service;
 using Microsoft.Office.Word.Server.Conversions;

3. Create a job to convert an entire Document Library to PDFs

First, fetch the WordServiceApplicationProxy object (so we don’t have to hard-code the service app name). This first line requires the Word Automation Services application to be added to the default proxy group.

Second, instantiate a new ConversionJob object and pass in the WordServiceApplicationProxy as a parameter:

 var wordAutomationProxy =
     (WordServiceApplicationProxy)
     SPServiceContext.Current.GetDefaultProxy(typeof(WordServiceApplicationProxy));

 var conversionJob = new ConversionJob(wordAutomationProxy);
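
If the service application is missing from the default proxy group, the cast above has nothing to work with. Here is a more defensive sketch of the same lookup. The class and method names are my own, and it assumes that GetDefaultProxy returns null when no matching proxy is found; either way the null check is harmless.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.Office.Word.Server.Service;

// Hypothetical helper; requires the SharePoint 2010 server assemblies.
public static class WordAutomationProxyLocator
{
    public static WordServiceApplicationProxy GetProxyOrThrow()
    {
        // SPServiceContext.Current is only available inside a SharePoint request.
        SPServiceContext context = SPServiceContext.Current;
        if (context == null)
        {
            throw new InvalidOperationException("No SPServiceContext is available.");
        }

        // 'as' yields null instead of throwing if the proxy is missing (assumption, see above).
        var proxy = context.GetDefaultProxy(typeof(WordServiceApplicationProxy))
            as WordServiceApplicationProxy;
        if (proxy == null)
        {
            throw new InvalidOperationException(
                "Word Automation Services was not found in the default proxy group.");
        }

        return proxy;
    }
}
```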

Next we need to specify a UserToken for the job, to tell the job under what credentials to run. We also need to specify a name for the job. Finally, you can add whatever Settings you want for your job. I’ve chosen to have my files output as PDFs.

 conversionJob.UserToken = SPContext.Current.Web.CurrentUser.UserToken;
 conversionJob.Name = "Zimmergren.SP2010.WordAutomationDemo Conversion Job";
 conversionJob.Settings.OutputFormat = SaveFormat.PDF;
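
OutputFormat isn’t the only setting available. As a rough sketch, this is how a couple of the other ConversionJobSettings properties could be set. The wrapper class is just for illustration, and whether you want to overwrite existing files or update fields depends on your scenario.

```csharp
using Microsoft.Office.Word.Server.Conversions;

// Illustrative wrapper; requires the SharePoint 2010 server assemblies.
public static class ConversionJobSettingsExample
{
    public static void ApplyPdfSettings(ConversionJob job)
    {
        // Output as PDF, same as in the article.
        job.Settings.OutputFormat = SaveFormat.PDF;

        // Replace an existing output file with the same name instead of failing on it.
        job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite;

        // Update fields in the document as part of the conversion.
        job.Settings.UpdateFields = true;
    }
}
```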

Next, specify the library where the original .doc or .docx files reside, point out the destination library, and start the job, which adds it to the timer job queue:

 conversionJob.AddLibrary(origLib, destinationLib);
 conversionJob.Start();
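
If you don’t want to convert a whole library, the same job object can queue single files or folders instead. The sketch below assumes a job that is already configured as above. The document, library and folder names are made up, and I have not verified whether the destination folder has to exist beforehand.

```csharp
using Microsoft.SharePoint;
using Microsoft.Office.Word.Server.Conversions;

// Illustrative helper; requires the SharePoint 2010 server assemblies.
public static class ConversionJobQueueExamples
{
    public static void QueueFileAndFolder(ConversionJob job, SPWeb web)
    {
        // AddFile takes the URLs of the input and output files.
        // "Shared Documents/Report.docx" and "PDFLibrary" are hypothetical names.
        string inputUrl = web.Url + "/Shared%20Documents/Report.docx";
        string outputUrl = web.Url + "/PDFLibrary/Report.pdf";
        job.AddFile(inputUrl, outputUrl);

        // AddFolder takes SPFolder objects; the last argument controls recursion
        // into subfolders. The folder paths are hypothetical as well.
        SPFolder sourceFolder = web.GetFolder("Shared Documents/Contracts");
        SPFolder destinationFolder = web.GetFolder("PDFLibrary/Contracts");
        job.AddFolder(sourceFolder, destinationFolder, true);

        // The caller still needs to call job.Start() to submit the job.
    }
}
```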

This is the sample code in one snippet:

 protected void btnConvert_Click(object sender, EventArgs e)
 {
     try
     {
         // Source library is picked in the drop-down; the destination library is fixed.
         SPList origLib = SPContext.Current.Web.Lists[ddlLibraries.SelectedValue];
         SPList destinationLib = SPContext.Current.Web.Lists["PDFLibrary"];

         // Get the Word Automation Services proxy from the default proxy group.
         var wordAutomationProxy =
             (WordServiceApplicationProxy)
             SPServiceContext.Current.GetDefaultProxy(typeof(WordServiceApplicationProxy));

         var conversionJob = new ConversionJob(wordAutomationProxy);

         // Run as the current user, name the job and ask for PDF output.
         conversionJob.UserToken = SPContext.Current.Web.CurrentUser.UserToken;
         conversionJob.Name = "Zimmergren.SP2010.WordAutomationDemo Conversion Job";
         conversionJob.Settings.OutputFormat = SaveFormat.PDF;

         // Queue every document in the source library and submit the job.
         conversionJob.AddLibrary(origLib, destinationLib);
         conversionJob.Start();

         Label1.Text = "Conversion job started!";
         Label1.Visible = true;
     }
     catch (Exception ex)
     {
         // Show the error message on the page.
         Label1.Visible = true;
         Label1.Text = "Error:<br/>" + ex.Message;
     }
 }

So what are the results?

Once you’ve started a ConversionJob, it is queued and processed by a SharePoint timer job. When the job has been run, it can look like this:

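Original Document Library, filled with some Word documents:
[Screenshot: the original document library with Word documents]

These files will then be converted to PDFs and put into my PDFLibrary like this:
[Screenshot: PDFLibrary with the converted PDF files]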
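
Since the conversion runs asynchronously, you might want to check how a job is doing instead of refreshing the library. Below is a rough sketch using the ConversionJobStatus class, assuming you saved conversionJob.JobId after calling Start(). The helper class is my own, and passing null as the partition id assumes a farm that isn’t multi-tenant.

```csharp
using System;
using System.Text;
using Microsoft.Office.Word.Server.Conversions;
using Microsoft.Office.Word.Server.Service;

// Illustrative helper; requires the SharePoint 2010 server assemblies.
public static class ConversionJobMonitor
{
    public static string Describe(WordServiceApplicationProxy proxy, Guid jobId)
    {
        // null partition id: assumes the farm is not partitioned (multi-tenant).
        var status = new ConversionJobStatus(proxy, jobId, null);

        var report = new StringBuilder();
        report.AppendFormat("Total: {0}, succeeded: {1}, failed: {2}, in progress: {3}, not started: {4}",
            status.Count, status.Succeeded, status.Failed, status.InProgress, status.NotStarted);
        report.AppendLine();

        // List the documents that could not be converted, with the reported error.
        foreach (ConversionItemInfo item in status.GetItems(ItemTypes.Failed))
        {
            report.AppendFormat("Failed: {0} ({1})", item.InputFile, item.ErrorMessage);
            report.AppendLine();
        }

        return report.ToString();
    }
}
```

The job can be treated as finished once Succeeded plus Failed equals Count.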

Summary

To get started with Word Automation Services, you don’t really need to do a lot of coding. Just add the assembly reference, get hold of your service application proxy, create a new ConversionJob and you’re up and running.

This article demonstrated how to convert an entire library of documents to PDFs with a single click.

Awesome. Enjoy!

  • Basel

    Hi,
    I am trying to create new branding for our website using custom master pages and page layouts, to bring our new look and feel to SharePoint 2010.
    Now I want to use Word Automation Services to convert a specific Word 2010 template (.dotx) file to a web page using our custom page layout.
    After configuring the document conversion launcher and load balancer services on the farm and enabling conversion on the web application where I am saving the document along with the new page layout, the default converter from the UI is now enabled and working fine…
    Now I need to create my own converter with our custom page layout.
    For that I created a new content type and added the correct template under Advanced settings, then clicked “Manage document conversion for this content type”, checked only the option to convert Word documents to web pages, configured it to use the new custom page layout and clicked OK. Up to this point everything is fine.

    After adding the content type to the Documents library, the library shows an error when viewing AllItems.aspx!

    Are there any steps I missed in this process?
    N.B.: even using the (Article Page Image on Right) page layout shows the same error!

    • http://www.zimmergren.net/ Tobias Zimmergren

      Basel, your first strike of luck would most likely come if you checked the ULS logs. Use a tool like ULSViewer or SPTraceView to review the ULS logs at runtime. It will aid you in finding out what might be going on.

      Regards,
      Tobias

      • Basel

        First, thanks for your reply. I installed SPTraceView, which has shown lots of information that I was not aware of. That’s nice!
        But for my issue the error was:
        System.NullReferenceException: Object reference not set to an instance of an object.
           at Microsoft.SharePoint.WebPartPages.ListViewWebPart.PrepareContentTypeFilter(SPList list, Hashtable[] excludedTransformers)
           at Microsoft.SharePoint.WebPartPages.ListViewWebPart.GenerateDocConvScriptBlock(SPWeb web, SPList list)
           at Microsoft.SharePoint.WebPartPages.XsltListViewWebPart.OnPreRender(EventArgs e)
           at Microsoft.SharePoint.WebPartPages.WebPartMobileAdapter.OnPreRender(EventArgs e)
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Control.PreRenderRecursiveInternal()
           at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)

        Any leads on how to get this error resolved?

        • http://www.zimmergren.net/ Tobias Zimmergren

          I would attach a debugger and step through the code.
          If the exceptions occur in assemblies other than your own, use Reflector Pro! to enable debugging on those assemblies.

          • Basel

            It works finally, by keeping the 4 options for configuring the conversion checked, without clicking the Apply button,
            since the ExcludedTransformers will then not be added to the XmlDocument element inside the XML schema for the content type.
            This avoids the error raised when the function
            PrepareContentTypeFilter(SPList list, Hashtable[] excludedTransformers) is called to loop through the unchecked options.

            It seems to me to be a bug from Microsoft when adding the ExcludedTransformers to the hashtable:
            excludedTransformers[index].Add(key, type.Id.ToString());

            Anyway, it is solved now, and I appreciate your leads.

          • http://www.zimmergren.net/ Tobias Zimmergren

            Thank you for posting the results of your investigations. Appreciate it.
            Cheers,
            Tobias.

  • RWESTLE

    We are having a problem converting .doc files with a positioned image to .docx. The image becomes reduced in size but is in the right location?

  • http://www.facebook.com/profile.php?id=100003329683296 Edson Catugy

    Amazing article!

    It’s very simple to understand, congratulations!

    • http://www.zimmergren.net/ Tobias Zimmergren

      Thank you, Edson.

  • mohdubaid

    Hi.. can you provide a similar solution for Office 365 environment?

  • http://ralph-bond.tumblr.com/ George Cantu

    The information in your post about SharePoint is very interesting, and it is useful for developers building SharePoint apps. Thanks for sharing this valuable info.

    • http://www.zimmergren.net/ Tobias Zimmergren

      You are most welcome, George.

  • http://www.zimmergren.net/ Tobias Zimmergren

    Not sure there’s a built-in method for it, but you could always use the APIs to open your documents, merge them together and save them as a single file before processing it as a PDF. Have a go at the Word Automation Services APIs first, and if it doesn’t exist there you could do something like I just wrote.