Monday, January 25, 2010

Microsoft Word Document Automation using XSL and XML

Here is my first technical blog!

First of all, what is document automation?

In simple terms, it means generating Word documents without working in Microsoft Office by hand.

Why is it required?

Although Microsoft Office is easy to use and very user friendly, many find it difficult to use (I am one of them), at least when the document involves complex formatting.

Consider the example of an invoice. It contains customer information, product information, the date, company information, etc. All invoices share some common things, like the same design and company information. Only the date, customer name, and product information differ. Now consider a business in which thousands of invoices are generated every day. If a user has to create a new Word document and enter the data each time, it will be a time-consuming task.

This is where document automation comes into the picture. What if the user had software in which they only had to enter data such as the customer name and product information, and the document was generated for them? That is very easy for the user. There are many real-world examples where document automation is very useful. Basically, we can use document automation wherever the format of the document does not change and only its content changes.


I have worked on document automation projects. In one of them, the requirement was to automate the course plan and MOM (Minutes of Meeting) documents. The course plan document contains all the course information, like the title of the course, prerequisites for the course, session details, the objective of each session, and the points to be covered in each session. The MOM document contains meeting information, like who the invitees are, the time of the meeting, and the agenda of the meeting. These documents contain complex formatting.

The first step was to create an application so that the user can enter data. I created an ASP.NET application where a user can enter data through a GUI. The data is stored in SQL Server so that the user can regenerate a document at any time. The main part of the project was to create a Word document from the data that the user has entered and stored in SQL Server. I decided to use XML and XSL.

Introduction to XML

XML stands for Extensible Markup Language. It is a markup language, like HTML. XML is designed for data communication. Unlike HTML, XML has no predefined tags; users define their own. It is self-descriptive. What does self-descriptive mean? The author of an XML document chooses tag names that describe the data they hold.

Consider the following example of an XML document.

<companyname>ABC Pvt. Ltd.</companyname>
<companyaddress>XYZ Road,Andheri,Mumbai</companyaddress>

From the above XML, anyone can tell that it holds the company information of ABC Pvt. Ltd., which is in Mumbai and has 20 employees.
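
A complete XML file also needs a single root element that wraps everything else. Here is a minimal sketch of how the company information could look as a full document. The company wrapper and the employees element are names I have assumed for illustration; only the name and address elements come from the fragment above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- "company" and "employees" are assumed names, used only for illustration -->
<company>
  <companyname>ABC Pvt. Ltd.</companyname>
  <companyaddress>XYZ Road,Andheri,Mumbai</companyaddress>
  <employees>20</employees>
</company>
```

Element names are case-sensitive, so companyname and CompanyName would be two different tags.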

XML simplifies data sharing. Nowadays many RDBMS systems are available. Sharing data between them is difficult because each RDBMS has its own format to store data; for example, SQL Server uses .mdf files. With XML, data storage is independent of any one system. It also makes data transport easy. Nowadays XML is used in many places, like RSS (my next topic), WSDL, Office Open XML, etc.

Introduction to XSL

XSL stands for Extensible Stylesheet Language. XSLT stands for XSL Transformations. XSLT is used to convert XML documents to other formats like HTML, Word, etc.
There are three major parts of XSL.

1) XSLT - to transform XML documents
2) XPath - to navigate through XML
3) XSL-FO - to format XML documents
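
To see how XSLT and XPath work together, here is a small sketch. The XSLT elements say what to do, and the XPath expressions inside the select attributes say which nodes to use. It assumes the company information is wrapped in a root element named company, which is my own assumption for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain text output keeps the focus on the XPath expressions -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Absolute path: start at the document root and walk down -->
    <xsl:value-of select="/company/companyname"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Descendant search: the first companyaddress element at any depth -->
    <xsl:value-of select="//companyaddress"/>
    <xsl:text>&#10;</xsl:text>
    <!-- XPath function: number of direct child elements of company -->
    <xsl:value-of select="count(/company/*)"/>
  </xsl:template>
</xsl:stylesheet>
```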

XSLT is the most important part for us. It is used to transform an XML document into another type of document, like HTML. It is supported by all major browsers.

Consider how CSS (Cascading Style Sheets) is used with an HTML document. It gives a look and feel to HTML tags. Consider the example below.

span { background-color: red; }

The above example gives a red background color to each span tag. XSLT plays a similar role for an XML document. Consider the company information XML above. If we view this XML file directly in a browser, it does not look good. We can use XSLT to transform it. For example, if I want to display the company name in bold and align it to the center, I can use the following XSLT code.

<xsl:template match="companyname">
<tr align="center"><td><b><xsl:value-of select="."/></b></td></tr>
</xsl:template>

It will display the company name in bold and align it to the center. We can add a reference to the XSL file in the XML file as follows.

<?xml-stylesheet type="text/xsl" href="exp.xsl"?>

Some useful XSL elements

1) <xsl:template>

It is used to define an XSL template. Its match attribute specifies which XML element the template applies to; in the above example, we matched the company name element. match="/" specifies the document root.

2) <xsl:apply-templates>

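It is used to apply templates to the selected nodes (or, without a select attribute, to the children of the current node). Each node is then formatted by the template that matches it, like the table row and cell in the above case.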

3) <xsl:value-of>

It is used to get the value of an XML element and use it in the transformation.

4) <xsl:for-each>

It is used to loop through XML elements.

5) <xsl:if>

It is used to specify a condition that must hold while transforming.

Many more elements are available. For more information, you can visit the w3schools website.
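
Here is a sketch of how the five elements above can work together in one stylesheet. The input document shown in the comment is my own assumption: the company wrapper, the products list, and the sample values are illustrative and not taken from my project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Assumed input (element names and values are illustrative only):
  <company>
    <companyname>ABC Pvt. Ltd.</companyname>
    <products>
      <product><name>Widget</name><price>150</price></product>
      <product><name>Gadget</name><price>80</price></product>
    </products>
  </company>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- xsl:template with match="/" runs once for the document root -->
  <xsl:template match="/">
    <html>
      <body>
        <table border="1">
          <!-- xsl:apply-templates hands companyname to its own template below -->
          <xsl:apply-templates select="company/companyname"/>
          <!-- xsl:for-each loops over every product element -->
          <xsl:for-each select="company/products/product">
            <tr>
              <!-- xsl:value-of copies the text of the selected element -->
              <td><xsl:value-of select="name"/></td>
              <td>
                <xsl:value-of select="price"/>
                <!-- xsl:if adds the marker only when the test is true -->
                <xsl:if test="price &gt; 100"> (premium)</xsl:if>
              </td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Formats the company name as a bold, centered table row -->
  <xsl:template match="companyname">
    <tr align="center">
      <td colspan="2"><b><xsl:value-of select="."/></b></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Keeping the company name in its own template means the formatting lives in one place, and any other part of the stylesheet can reuse it through xsl:apply-templates.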

So in my project I wrote an XSL file with XSL transformations to get the necessary formatting, and I created the XML file from the data stored in SQL Server.

The .NET Framework 2.0 provides a class for this in the System.Xml.Xsl namespace: XslCompiledTransform. Using it, you can transform an XML document as follows.

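Imports System.Xml.Xsl   ' goes at the top of the code-behind file

' Load and compile the stylesheet once, then reuse it for each output file.
Dim xslt As New XslCompiledTransform()
xslt.Load(Server.MapPath("~/exp.xsl"))
' Transform the same XML twice: once to an .html file and once to a .doc file.
xslt.Transform(Server.MapPath("~/testing.xml"), Server.MapPath("~/testing.html"))
xslt.Transform(Server.MapPath("~/testing.xml"), Server.MapPath("~/testing.doc"))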

The Load method loads the XSL style sheet. The Transform method takes two arguments: the first is the XML file to be transformed and the second is the output file.

This is how I implemented Word document automation. It has a limitation: we cannot create an Office 2007 Word file, that is, a .docx file, because in Office 2007 Microsoft introduced a totally new format, Office Open XML. I will discuss that in my future blogs.

Monday, January 4, 2010

Exam in Train


This is my first blog. Although I am a technical person, this is not a technical blog. I want to share one of my experiences. I participated in the Visual Studio ALM challenge from Microsoft. Luckily, I cleared the first two rounds. My slot for the third round was on 3rd January at 5 PM. On the same day, my train to Mumbai was at 11:15 PM. At 5:00 PM sharp, I was ready for the exam. Unfortunately, something went wrong and my exam could not be started. I got a message saying that I had already taken the test. I was so frustrated. After coming so far, how would you feel if you missed an opportunity just like that? After securing a place in the top 100 out of more than 10,000 participants, I did not want to let go of this opportunity. So I sent a mail to the Microsoft officials about it. I was not confident that I would get one more opportunity. I waited for some time for a reply, but none came. So I logged off my system and went shopping.

I came back around 8:00 PM. I checked my mail, and I was surprised. They had refreshed my time slot, and I had one more opportunity to take the test. And then the problem started. My train was at 11:15 PM. I had to leave home around 9:30, otherwise I would not get a bus or an auto to the railway station. The exam was one hour long. I booked the last slot, which was 11:00 PM. I thought I would leave home early, and if I could reach the station before 10:00 PM, I would change my slot to 10:00 PM. However, I could not make it on time; I reached the station at 10:30 PM.

Now there was only one last hope: if the train started late, I could complete the test. The train was at the platform. I quickly moved to my seat and started the test at 11:00 PM. However, the train started at 11:15 sharp. Now there was one more problem. I have a Reliance data card. If the network dropped in the middle of the test, there was nothing I could do. Fortunately, this time nothing went wrong. The Reliance network stayed with me the whole time. Thank you very much, Reliance.

I could complete the test. I was very happy. What an experience this was for me. At the time of writing this blog, I have not yet received the result of the test. Whatever the result may be, the experience was memorable.

This is just the start. I will write more blogs, and yes, those blogs will be technical.