
Import XML data to Google App Engine

Upload bulk data stored locally in XML files to Google App Engine's persistent object database
Skill Level: Intermediate Mr Joseph P. McCarthy (josemcca@ie.ibm.com) Software Developer IBM

07 Sep 2010

The Google App Engine launched in April 2008, and included a method to upload bulk data stored in CSV files using Python. Java language support was added a year later. To date, the App Engine lacks Java-native support for bulk uploads and CSV remains the only data storage medium supported by the bulk uploading tool. Explore the various methods to store data from XML documents on the App Engine persistent database in this article.

Background
Launched in April 2008, Google App Engine (GAE) (see Resources for a link) is a free web application hosting service from Google. It initially supported only applications developed in Python; Java language support was added in April 2009.

Frequently used acronyms
API: Application Programming Interface
CPU: Central processing unit
CSV: Comma-separated values
HTML: HyperText Markup Language
IDE: Integrated development environment
REST: REpresentational State Transfer

Import XML data to Google App Engine Copyright IBM Corporation 2010. All rights reserved.

Trademarks Page 1 of 25

developerWorks

ibm.com/developerWorks

SAX: Simple API for XML
SOAP: Simple Object Access Protocol
UI: User Interface
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
WSC: Web Services Connector
WSDL: Web Services Description Language
W3C: World Wide Web Consortium
XML: Extensible Markup Language

The supplied development environment for applications creates a local database to persist data during development, and the site itself allows data to be stored as persistent objects, or entities. These entities are created using Plain Old Java Objects (POJOs) annotated with Java Data Objects (JDO) annotations. However, the environment has no way to upload data directly between the two (local and deployed) databases.

The Python environment allows for bulk upload of data stored in CSV format. It does not officially support native bulk upload of data in the Java language. The recommended method is to upload the data using a Python version of the application and access the data using Java classes, but this requires a working knowledge of Python and depends on the ability of the data to be represented in CSV format.

XML is a flexible, text-based format. In recent years, online and offline applications increasingly store data as XML for use in multiple ways. Despite XML's ubiquity across the internet, GAE does not provide a bulk upload service for data stored in XML documents.

SAX is a serial access parser API for XML. When you write a SAX-based parser for an XML document, you override callback methods that the parser triggers as it encounters the various parts of the document (such as the beginning of the document, the beginning of an XML element, the end of an element, characters, and so on).
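To make the callback model concrete, the following minimal sketch records each SAX callback as it fires. The class name SaxEventDemo and its parse helper are hypothetical names used only for illustration; only standard JDK classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the name of each SAX callback in order.
class SaxEventDemo extends DefaultHandler {
    final List<String> events = new ArrayList<String>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Ignore whitespace-only text between elements.
        String text = new String(buf, offset, len).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses the given XML text and returns the recorded callback names.
    static List<String> parse(String xml) throws Exception {
        SaxEventDemo handler = new SaxEventDemo();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee id=\"1\"><firstName>Rickey</firstName></employee>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

Running the main method prints the callbacks in document order, which is the same order in which a handler for the employee data would receive them.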

Simple XML persistence


The simplest method to add the data from an XML document to the datastore on GAE is to upload the document as part of the application and use a custom SAX-based parser to create an object in the application for each entry in the document. Let's take a simple XML document containing the list of employees at an organisation, add it to a GAE project, create a class for the elements, and store them as entities.

The XML document employees.xml (see Listing 1) has a simple format. Each employee has a single attribute (id), and four elements: firstName, surName, emailAddress, and hireDate. The employees.xml file will be used throughout this article. You can download the file and all the other source files described in this article from Downloads.

Listing 1. employees.xml
<?xml version="1.0" encoding="UTF-8"?>
<employees>
    <employee id="1">
        <firstName>Rickey</firstName>
        <surName>Torres</surName>
        <emailAddress>rickey.torres@employer.com</emailAddress>
        <hireDate>1996-09-17</hireDate>
    </employee>
    <employee id="2">
        <firstName>Karisa</firstName>
        <surName>Moore</surName>
        <emailAddress>karisa.moore@employer.com</emailAddress>
        <hireDate>1996-04-08</hireDate>
    </employee>
    <employee id="3">
        <firstName>Aaron</firstName>
        <surName>Wilson</surName>
        <emailAddress>aaron.wilson@employer.com</emailAddress>
        <hireDate>2000-01-05</hireDate>
    </employee>
</employees>

Listing 2 shows how to represent each employee as a POJO class and then annotate it with JDO annotations (the appropriate Java imports and the set functions are assumed in the annotated class).

Listing 2. Employee.java

POJO Employee.java
public class Employee {
    private Long id;
    private String firstName;
    private String surName;
    private String emailAddress;
    private Date hireDate;

    public Employee() {
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public void setSurName(String surName) {
        this.surName = surName;
    }

    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }

    public void setHireDate(Date hireDate) {
        this.hireDate = hireDate;
    }

    public void setId(Long id) {
        this.id = id;
    }
}

Annotated Employee.java
@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class Employee {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Long id;

    @Persistent
    private String firstName;

    @Persistent
    private String surName;

    @Persistent
    private String emailAddress;

    @Persistent
    private Date hireDate;

    public Employee() {
    }
}

For more complex XML documents, you might also use Simple. See Resources for a link to the tutorial.

To create a SAX parser in the Java language, extend the org.xml.sax.helpers.DefaultHandler class and override the methods needed to parse the document, as in Listing 3.

Listing 3. EmployeeHandler.java
package com.xmlimport.employee;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Stack;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.jdo.PersistenceManager;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.xmlimport.XMLImportPersistenceManagerFactory;

public class EmployeeHandler extends DefaultHandler {
    private static final Logger log =
        Logger.getLogger(EmployeeHandler.class.getName());
    private static final SimpleDateFormat hireDateFormat =
        new SimpleDateFormat("yyyy-MM-dd");

    private Stack<Employee> employeeStack;
    private ArrayList<Employee> employees;
    private PersistenceManager pm = null;
    private String characters;

    public EmployeeHandler() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            pm = XMLImportPersistenceManagerFactory.get().getPersistenceManager();
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new InputSource("./employees.xml"), this);
            pm.makePersistentAll(employees);
        } catch (Throwable t) {
            t.printStackTrace();
        } finally {
            // pm is still null if getPersistenceManager() itself failed
            if (pm != null) {
                pm.close();
            }
        }
    }

    public void startDocument() throws SAXException {
        employeeStack = new Stack<Employee>();
        employees = new ArrayList<Employee>();
    }

    public void startElement(String namespaceURI, String localName,
            String qualifiedName, Attributes attributes) throws SAXException {
        if (qualifiedName.equals("employee")) {
            Employee employee = new Employee();
            employee.setId(Long.parseLong(attributes.getValue("id")));
            employeeStack.push(employee);
        }
    }

    public void endElement(String namespaceURI, String simpleName,
            String qualifiedName) throws SAXException {
        if (!employeeStack.isEmpty()) {
            if (qualifiedName.equals("employee")) {
                employees.add(employeeStack.pop());
            } else if (qualifiedName.equals("firstName")) {
                Employee employee = employeeStack.pop();
                employee.setFirstName(characters);
                employeeStack.push(employee);
            } else if (qualifiedName.equals("surName")) {
                Employee employee = employeeStack.pop();
                employee.setSurName(characters);
                employeeStack.push(employee);
            } else if (qualifiedName.equals("emailAddress")) {
                Employee employee = employeeStack.pop();
                employee.setEmailAddress(characters);
                employeeStack.push(employee);
            } else if (qualifiedName.equals("hireDate")) {
                Employee employee = employeeStack.pop();
                try {
                    employee.setHireDate(hireDateFormat.parse(characters));
                } catch (ParseException e) {
                    log.log(Level.FINE, "Could not parse date {0}", characters);
                }
                employeeStack.push(employee);
            }
        }
    }

    public void characters(char buf[], int offset, int len)
            throws SAXException {
        characters = new String(buf, offset, len);
    }
}
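One caveat about the characters callback in Listing 3: a SAX parser is allowed to deliver the text of a single element in several chunks, so keeping only the most recent chunk can truncate a value. The sketch below shows one common way to guard against this by accumulating text in a StringBuilder. The class name BufferingHandler and its map of values are hypothetical stand-ins for the Employee setters, chosen so the example runs with only the JDK.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each leaf element into a map.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final Map<String, String> values = new LinkedHashMap<String, String>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        // Discard any text (usually whitespace) seen before this element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Append rather than overwrite: this may be called more than once
        // for the text of a single element.
        text.append(buf, offset, len);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(qName, value);
        }
        text.setLength(0);
    }

    // Parses the given XML text and returns element name to text value.
    static Map<String, String> parse(String xml) throws Exception {
        BufferingHandler handler = new BufferingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<employee id=\"1\">"
                + "<firstName>Rickey</firstName>"
                + "<surName>Torres</surName></employee>"));
    }
}
```

The same idea could be applied to EmployeeHandler by replacing the String field with a StringBuilder and reading its contents in endElement.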

Each time you parse a new <employee> element, you create a new Employee object and add it to a Stack. When each of the other elements is parsed, you pop the Employee object off the stack, call the relevant set function, and push the object back on to the stack. When the employee close element is parsed, you pop the completed object from the stack, and add it to a List object. When the entire document is parsed, you persist each object in the List with the PersistenceManager.

According to the Using the DataStore with JDO guide, creating the PersistenceManagerFactory object is quite costly in terms of CPU time. The guide recommends you use a static final variable to create the object once at application startup and then get this object when needed. Listing 4 shows the PersistenceManagerFactory wrapper class. Note that this same class, with changes for package name and class name, can be used in each of your GAE projects.

Listing 4. XMLImportPersistenceManagerFactory.java
package com.xmlimport;

import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManagerFactory;

public final class XMLImportPersistenceManagerFactory {
    private static final PersistenceManagerFactory pmfInstance =
        JDOHelper.getPersistenceManagerFactory("transactions-optional");

    private XMLImportPersistenceManagerFactory() {}

    public static PersistenceManagerFactory get() {
        return pmfInstance;
    }
}

Finally, you need to create a servlet to call the EmployeeHandler and create the Employee objects. Then add the servlet definition to the web.xml file so that the /CreateEmployee URL maps to it. Listing 5 shows the web.xml file.

Listing 5. web.xml
<?xml version="1.0" encoding="utf-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
        http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
    version="2.5">
    <servlet>
        <servlet-name>CreateEmployeeServlet</servlet-name>
        <servlet-class>com.xmlimport.servlet.CreateEmployeeServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>CreateEmployeeServlet</servlet-name>
        <url-pattern>/CreateEmployee</url-pattern>
    </servlet-mapping>
</web-app>

Listing 6 shows the servlet.

Listing 6. CreateEmployeeServlet.java
package com.xmlimport.servlet;

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.xmlimport.employee.EmployeeHandler;

public class CreateEmployeeServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        // Constructing the handler parses employees.xml and persists the result
        new EmployeeHandler();
        PrintWriter out = response.getWriter();
        out.println("Employees Created");
    }
}


Add these files to a standard GAE Java project and launch the test server. Enter http://localhost:8080/CreateEmployee in a browser and, after a brief period, the message Employees Created will appear on screen. If you open the local datastore viewer at http://localhost:8080/_ah/admin/datastore, you can view the newly created Employee objects.

Persisting manually entered XML data


This solution is not very practical for a deployed site. To create a new set of Employee objects, you need to create the employees.xml file and deploy it to appspot.com every time. Let's change the behaviour of the handler slightly. Instead of parsing an existing file, change the handler to parse text input from a form in the servlet.

First change the servlet to open a JavaServer Page (JSP) with a form including a textarea input box and a submit button. Clicking the button sends the text entered in the input box to the EmployeeHandler. Exactly as before, the text will be parsed and each new Employee object will be made persistent by the PersistenceManager. Change the doGet method to forward to a JSP called createEmployee.jsp, as in Listing 7.

Listing 7. Refactored doGet method in CreateEmployeeServlet.java
public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException {
    RequestDispatcher view =
        request.getRequestDispatcher("/createEmployee.jsp");
    view.forward(request, response);
}

In the root folder of the war directory, create the createEmployee.jsp file (see Listing 8).

Listing 8. createEmployee.jsp
<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
    pageEncoding="ISO-8859-1"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    <title>Create all employee</title>
</head>
<body>
    <form action="/CreateEmployee" method="POST">
        <textarea name="employeeXML" cols="25" rows="25"></textarea>
        <input type="submit" value="Create Employee(s)"/>
    </form>
</body>
</html>

Since the form uses the POST method, you need to add the doPost() function to the CreateEmployeeServlet class (see Listing 9). It is called when a user clicks the Create Employee(s) button.

Listing 9. Add the doPost method to CreateEmployeeServlet.java
public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException {
    // Parse and persist the XML pasted into the textarea
    new EmployeeHandler(request.getParameter("employeeXML"));
    RequestDispatcher view =
        request.getRequestDispatcher("/createEmployee.jsp");
    view.forward(request, response);
}

In this method, you extract the text entered into the textarea on the form and send it to the EmployeeHandler. After the text is parsed and the Employee objects are created, you forward the user back to the same page with an empty form.

To parse the text, make a final change to the constructor of the EmployeeHandler so that it accepts the text from the textarea instead of reading the employees.xml file. As in Listing 10, you wrap the text in a StringReader object (which requires an import of java.io.StringReader), and parse this new object.

Listing 10. EmployeeHandler constructor using String parameter
public EmployeeHandler(String employeeXML) {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        pm = XMLImportPersistenceManagerFactory.get().getPersistenceManager();
        SAXParser saxParser = factory.newSAXParser();
        saxParser.parse(new InputSource(new StringReader(employeeXML)), this);
        pm.makePersistentAll(employees);
    } catch (Throwable t) {
        t.printStackTrace();
    } finally {
        // pm is still null if getPersistenceManager() itself failed
        if (pm != null) {
            pm.close();
        }
    }
}


Now, you can add any employee files to the web application without having to upload to appspot.com first.
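Because the XML now arrives as free text typed or pasted by a user, it can help to see how the parser treats that text in isolation. The following minimal sketch, which uses only the JDK, parses a String through a StringReader and counts the employee elements; malformed input is reported as a SAXException rather than silently ignored. The class name StringSourceDemo and the method countEmployees are hypothetical and are not part of the article's source files.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: parse XML held in a String, as the textarea form does.
class StringSourceDemo {

    // Returns the number of <employee> elements in the given XML text.
    // Throws org.xml.sax.SAXException if the text is not well-formed XML.
    static int countEmployees(String employeeXml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                if ("employee".equals(qName)) {
                    count[0]++;
                }
            }
        };
        // InputSource(Reader) reads the text itself; InputSource(String)
        // would instead treat the argument as the location of a document.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(employeeXml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee id=\"1\"/><employee id=\"2\"/></employees>";
        System.out.println(countEmployees(xml));
    }
}
```

Note that the constructor in Listing 10 catches every Throwable, so a user who submits malformed XML sees the empty form again with no indication that nothing was stored.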

Using web services to upload XML data


This solution is restricted by both the maximum number of characters you can enter into a textarea and the 30-second timeout enforced by Google on requests sent to GAE. If the document is not parsed and the objects made persistent within 30 seconds, the server throws an exception and the objects are not created.

SOAP is a protocol that allows XML messages to be sent and received over the internet. You will use a SOAP service running on GAE to create the employees one at a time. You can reuse the same handler class as before, but rather than run it on the server, you will use it as a client. Instead of adding the Employee object to a List, the handler sends the relevant information to a SOAP service, which creates the same object on GAE and persists it.

Spring is an open source application framework developed by SpringSource. It contains (among other modules) a remote access framework allowing RPC-style export and import of Java objects over networks supporting RMI, CORBA, and HTTP-based protocols including SOAP.

The Force.com Web Service Connector (WSC) is a high-performing web service client stack implemented using a streaming parser. WSC also makes it much easier to use the Force.com API (Web Services/SOAP or Asynchronous/REST API). WSC can be used to invoke any doc literal-wrapped web service. A version for use with GAE is available; see Resources for the link.

The cloudwhiz blog provides details about how to implement a SOAP web service on GAE (see Resources for a link to a three-part article), and the final part of this article describes how to use a SOAP service to create an object and persist it.

First, you need to define the web service using a Web Services Description Language (WSDL) file. This specifies the objects and operations that can be handled by the web service. Note the targetNamespace definition near the top of the file: http://xmlimport.appspot.com/. This will be used later as the namespace of the qualified name built by the custom marshaller.
For the purposes of this article, you need only one service, CreateEmployeeService, and one operation, createEmployee. If you examine the complexType createEmployeeRequest, it is basically an XML Schema Definition (XSD) of the entries in the employees XML file, except that the id is an element instead of an attribute. (See Listing 11.)

Listing 11. employeeService.wsdl


<?xml version="1.0"?>
<definitions name="CreateEmployeeService"
    targetNamespace="http://xmlimport.appspot.com/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://xmlimport.appspot.com/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
    <types>
        <schema targetNamespace="http://xmlimport.appspot.com/"
            xmlns="http://www.w3.org/2001/XMLSchema">
            <element name="createEmployeeRequest" type="tns:createEmployeeRequest"/>
            <element name="createEmployeeResponse" type="tns:createEmployeeResponse"/>
            <element name="getEmployeeRequest" type="tns:getEmployeeRequest"/>
            <element name="getEmployeeResponse" type="tns:getEmployeeResponse"/>
            <complexType name="createEmployeeRequest">
                <sequence>
                    <element name="firstName" type="string"/>
                    <element name="surName" type="string"/>
                    <element name="emailAddress" type="string"/>
                    <element name="hireDate" type="date"/>
                    <element name="id" type="int"/>
                </sequence>
            </complexType>
            <complexType name="createEmployeeResponse">
                <sequence>
                    <element name="success" type="boolean"/>
                </sequence>
            </complexType>
            <complexType name="getEmployeeRequest">
                <sequence>
                    <element name="successful" type="boolean"/>
                    <element name="firstName" type="string"/>
                    <element name="surName" type="string"/>
                    <element name="emailAddress" type="string"/>
                    <element name="hireDate" type="date"/>
                    <element name="id" type="long"/>
                </sequence>
            </complexType>
            <complexType name="getEmployeeResponse">
                <sequence>
                    <element name="id" type="long"/>
                </sequence>
            </complexType>
        </schema>
    </types>
    <message name="createEmployeeRequest">
        <part name="parameters" element="tns:createEmployeeRequest"/>
    </message>
    <message name="createEmployeeResponse">
        <part name="parameters" element="tns:createEmployeeResponse"/>
    </message>
    <message name="getEmployeeRequest">
        <part name="parameters" element="tns:getEmployeeRequest"/>
    </message>
    <message name="getEmployeeResponse">
        <part name="parameters" element="tns:getEmployeeResponse"/>
    </message>
    <portType name="EmployeeService">
        <operation name="createEmployee">
            <input message="tns:createEmployeeRequest"></input>
            <output message="tns:createEmployeeResponse"></output>
        </operation>
        <operation name="getEmployee">
            <input message="tns:getEmployeeRequest"></input>
            <output message="tns:getEmployeeResponse"></output>
        </operation>
    </portType>
    <binding name="EmployeeServicePortBinding" type="tns:EmployeeService">
        <soap:binding style="document"
            transport="http://schemas.xmlsoap.org/soap/http"/>
        <operation name="createEmployee">
            <soap:operation soapAction=""/>
            <input>
                <soap:body use="literal"></soap:body>
            </input>
            <output>
                <soap:body use="literal"></soap:body>
            </output>
        </operation>
        <operation name="getEmployee">
            <soap:operation soapAction=""/>
            <input>
                <soap:body use="literal"></soap:body>
            </input>
            <output>
                <soap:body use="literal"></soap:body>
            </output>
        </operation>
    </binding>
    <service name="CreateEmployeeService">
        <documentation>Create Employee Service</documentation>
        <port name="EmployeeServicePort" binding="tns:EmployeeServicePortBinding">
            <soap:address location="http://localhost:8080/soap/"/>
        </port>
    </service>
</definitions>
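To see what a message for the createEmployee operation in this WSDL might look like on the wire, the sketch below builds a SOAP 1.1 envelope by hand. It is an illustration only, not the code that WSC generates: the class SoapEnvelopeSketch and its methods are hypothetical, and it assumes the child elements are unqualified (the XML Schema default when elementFormDefault is not declared), which the generated WSC classes may or may not expect.

```java
// Hypothetical sketch: hand-built SOAP 1.1 request for createEmployee.
class SoapEnvelopeSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Must match the targetNamespace declared in employeeService.wsdl.
    static final String TNS = "http://xmlimport.appspot.com/";

    // Escapes the characters that are not allowed in XML element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the envelope; child elements follow the order of the
    // createEmployeeRequest sequence in the WSDL. hireDate is expected
    // in xsd:date form, for example 1996-09-17.
    static String createEmployeeRequest(long id, String firstName,
            String surName, String emailAddress, String hireDate) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\">"
                + "<soapenv:Body>"
                + "<tns:createEmployeeRequest xmlns:tns=\"" + TNS + "\">"
                + "<firstName>" + escape(firstName) + "</firstName>"
                + "<surName>" + escape(surName) + "</surName>"
                + "<emailAddress>" + escape(emailAddress) + "</emailAddress>"
                + "<hireDate>" + escape(hireDate) + "</hireDate>"
                + "<id>" + id + "</id>"
                + "</tns:createEmployeeRequest>"
                + "</soapenv:Body>"
                + "</soapenv:Envelope>";
    }

    public static void main(String[] args) {
        System.out.println(createEmployeeRequest(1L, "Rickey", "Torres",
                "rickey.torres@employer.com", "1996-09-17"));
    }
}
```

In practice the classes that WSC generates from the WSDL build and parse these messages for you, so a sketch like this is mainly useful for inspecting or debugging traffic to the /soap/ endpoint.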

Use the GAE version of the Force.com WSC to build a jar file from the WSDL (containing the classes needed to send and receive SOAP messages) with the following command:
java -classpath wsc-gae-16_0.jar com.sforce.ws.tools.wsdlc <WSDL input file> <JAR output file>

Add both the generated output jar and the wsc-gae.jar file to the lib folder of your project. Download the Spring framework jar files (see Resources for a link). Add the following jar files to the lib folder of your project:

org.springframework.aop.jar
org.springframework.asm.jar
org.springframework.aspects.jar
org.springframework.beans.jar
org.springframework.context.jar
org.springframework.context.support.jar
org.springframework.core.jar
org.springframework.expression.jar
org.springframework.instrument.jar
org.springframework.instrument.tomcat.jar
org.springframework.jdbc.jar
org.springframework.jms.jar
org.springframework.orm.jar
org.springframework.oxm.jar
org.springframework.test.jar
org.springframework.transaction.jar
org.springframework.web.jar
org.springframework.web.portlet.jar
org.springframework.web.servlet.jar
org.springframework.web.struts.jar
spring-oxm.jar
spring-oxm-tiger.jar
spring-ws-core.jar
spring-ws-core-tiger.jar
spring-ws-security.jar
spring-ws-support.jar
spring-xml.jar

While researching this article, I used the final release of SpringSource 3.0 and did not test any later versions.

The GAE does not allow writing to the local file system, which is a prerequisite of the Spring AxiomSoapMessageFactory class. Fortunately, a simple workaround is to extend the class and override the afterPropertiesSet() method to do nothing, as in Listing 12. This bypasses the Spring check for write permissions.


Listing 12. Custom AxiomSoapMessageFactory class


package com.xmlimport.service.soap;

import org.springframework.ws.soap.axiom.AxiomSoapMessageFactory;

public class XMLImportMessageFactory extends AxiomSoapMessageFactory {
    public void afterPropertiesSet() throws Exception {
        // Do nothing.
        // This is because the method checks for write access, which GAE does not allow
    }
}

To use the Force.com WSC, you need a custom marshaller and unmarshaller, as shown in Listing 13 (as in the original listing, the import statements are omitted).

Listing 13. Custom Marshaller class
package com.xmlimport.service.soap;

public class EmployeeServiceMarshaller extends TransformerObjectSupport
        implements Marshaller, Unmarshaller {

    public final void marshal(Object graph, Result result)
            throws XmlMappingException, IOException {
        try {
            XMLizable xmlObject = (XMLizable) graph;
            ByteArrayOutputStream xmlBuffer = new ByteArrayOutputStream();

            // Assumes all services under same name space at present.
            QName qName = new QName("http://xmlimport.appspot.com/",
                StringUtils.uncapitalize(xmlObject.getClass().getSimpleName()));

            // Use the Force.com WSC API to generate the XML from the given object.
            XmlOutputStream xout = new XmlOutputStream(xmlBuffer, true);
            xout.startDocument();
            xmlObject.write(qName, xout, new TypeMapper());
            xout.endDocument();
            xout.close();

            // Setup an XMLStreamReader to parse the generated XML buffer.
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLStreamReader xmlStreamReader = xmlInputFactory
                .createXMLStreamReader(new StringReader(xmlBuffer.toString()));

            // Copy the contents of the XMLStreamReader into the StaxResult.
            XMLStreamWriter xmlStreamWriter =
                ((StaxResult) result).getXMLStreamWriter();
            org.apache.axiom.om.util.CopyUtils
                .reader2writer(xmlStreamReader, xmlStreamWriter);
        } catch (XMLStreamException xse) {
            throw new MarshallingFailureException(
                "Failed to copy generated object XML into StaxResult.", xse);
        }
    }

    public final Object unmarshal(Source source)
            throws XmlMappingException, IOException {
        XMLizable xmlObject = null;
        if (source != null) {
            try {
                XMLStreamReader xmlStreamReader =
                    ((StaxSource) source).getXMLStreamReader();
                xmlStreamReader.next();

                // Use localName of top element to work out the name of the java class.
                StringBuilder className = new StringBuilder("com.sforce.soap.");
                className.append(
                    StringUtils.capitalize(xmlStreamReader.getLocalName()));

                // Create an instance of this class to bind to the XML.
                xmlObject = (XMLizable) Class.forName(className.toString())
                    .newInstance();

                // Transform the StaxSource into a StreamResult
                // so that we get the XML String.
                StringWriter out = new StringWriter();
                transform(source, new StreamResult(out));

                // Use the XML String with the Force.com WSC
                // to populate the properties of the object.
                XmlInputStream xin = new XmlInputStream();
                xin.setInput(
                    new ByteArrayInputStream(out.toString().getBytes()), "UTF-8");
                xmlObject.load(xin, new TypeMapper());
            } catch (ClassNotFoundException cnfe) {
                throw new UnmarshallingFailureException(
                    "A Force.com WSC generated class was not found that matches the XML message.",
                    cnfe);
            } catch (IllegalAccessException iae) {
                throw new UnmarshallingFailureException(
                    "Failed to instantiate instance of the Force.com WSC generated class.",
                    iae);
            } catch (InstantiationException ie) {
                throw new UnmarshallingFailureException(
                    "Failed to instantiate instance of the Force.com WSC generated class.",
                    ie);
            } catch (ConnectionException ce) {
                throw new UnmarshallingFailureException(
                    "Failed to parse XML String using Force.com pull parser.", ce);
            } catch (PullParserException ppe) {
                throw new UnmarshallingFailureException(
                    "Failed to parse XML String using Force.com pull parser.", ppe);
            } catch (TransformerException te) {
                throw new UnmarshallingFailureException(
                    "Failed to transform StaxSource to StreamResult.", te);
            } catch (XMLStreamException xse) {
                throw new UnmarshallingFailureException(
                    "Failed to parse top level element in message payload.", xse);
            }
        }
        return xmlObject;
    }

    /**
     * Assumes that all marshalling and unmarshalling is handled by this
     * implementation.
     */
    @SuppressWarnings("unchecked")
    public boolean supports(Class clazz) {
        return true;
    }
}
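The unmarshal method in Listing 13 derives a class name from the local name of the top-level XML element and then instantiates that class by reflection. The following sketch isolates just that step using only the JDK. The class ClassNameSketch and its helper methods are hypothetical, and java.util.ArrayList stands in for a WSC-generated class so that the example runs without the generated jar.

```java
// Hypothetical sketch of the name-to-class step used by the unmarshaller.
class ClassNameSketch {

    // Upper-cases the first character, as Spring's StringUtils.capitalize does.
    static String capitalize(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Joins a package prefix (ending in a dot) and an element local name.
    static String classNameFor(String packagePrefix, String localName) {
        return packagePrefix + capitalize(localName);
    }

    // Creates an instance through the public no-argument constructor.
    static Object instantiate(String className) throws Exception {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The name the unmarshaller would look up for the request element.
        System.out.println(classNameFor("com.sforce.soap.", "createEmployeeRequest"));

        // Stand-in lookup that works with JDK classes only.
        Object list = instantiate(classNameFor("java.util.", "arrayList"));
        System.out.println(list.getClass().getName());
    }
}
```

A consequence of this design is that every element name in the WSDL must map to a generated class with a no-argument constructor in the com.sforce.soap package; otherwise the unmarshaller fails with one of the exceptions handled in Listing 13.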

The marshal function converts the object returned by the web service into a javax.xml.transform.Result. Note that the namespace of the qName in the marshal function is the same as the targetNamespace from the WSDL. The unmarshal code does the reverse: it converts the javax.xml.transform.Source received over HTTP into a Java object to be used by the web service. The name of the class to be created is extracted from the XML stored in the source parameter. The default name of the package of any class created using the WSC code, com.sforce.soap, is prepended to this to create the fully qualified name of the class (for example, com.sforce.soap.CreateEmployeeRequest). Reflection is used to create an instance of this class, and finally its attributes are set using the remainder of the XML in the source. The supports function specifies that the service will handle all classes.

Create the Spring framework configuration file, ws-servlet.xml, as in Listing 14 and place it in the WEB-INF folder in the war directory of the GAE project, alongside the web.xml file. This file contains the class name for the service, the marshaller and unmarshaller that the class uses, the location of the WSDL for the service, and the custom MessageFactory you created above (see Listing 12). Ensure that the WSDL file is in the folder named in the constructor-arg element of the service bean (employeeService).

Listing 14. ws-servlet.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">
    <bean class="org.springframework.ws.server.endpoint.mapping.SimpleMethodEndpointMapping">
        <property name="endpoints" ref="createEmployeeService"></property>
    </bean>
    <bean id="createEmployeeService"
        class="com.xmlimport.service.soap.CreateEmployeeService">
        <property name="marshaller" ref="createEmployeeServiceMarshaller" />
        <property name="unmarshaller" ref="createEmployeeServiceUnmarshaller" />
    </bean>
    <bean id="createEmployeeServiceMarshaller"
        class="com.xmlimport.service.soap.EmployeeServiceMarshaller"></bean>
    <bean id="createEmployeeServiceUnmarshaller"
        class="com.xmlimport.service.soap.EmployeeServiceMarshaller"></bean>
    <bean id="employeeService"
        class="org.springframework.ws.wsdl.wsdl11.SimpleWsdl11Definition">
        <constructor-arg value="/WEB-INF/wsdl/employeeService.wsdl"/>
    </bean>
    <bean id="messageFactory"
        class="com.xmlimport.service.soap.XMLImportMessageFactory">
        <property name="payloadCaching" value="false"/>
        <property name="attachmentCaching" value="false"/>
    </bean>
</beans>

Add the servlet definition to web.xml as in Listing 15, so that all requests to the /soap/* URI are passed to the Spring framework and, from there, to the service class defined in the framework configuration. The servlet name is ws and, by convention, the framework configuration file is named <servlet-name>-servlet.xml, hence the name ws-servlet.xml in the previous step.

Listing 15. SOAP servlet definition in web.xml
    <servlet>
        <servlet-name>ws</servlet-name>
        <servlet-class>org.springframework.ws.transport.http.MessageDispatcherServlet</servlet-class>
        <init-param>
            <param-name>transformWsdlLocations</param-name>
            <param-value>true</param-value>
        </init-param>
    </servlet>
    <servlet-mapping>
        <servlet-name>ws</servlet-name>
        <url-pattern>/soap/*</url-pattern>
    </servlet-mapping>
</web-app>
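The two conventions at work in this servlet definition can be made concrete with a short sketch. It does not call any container or Spring API; it only mimics the naming rule for the configuration file and the way a path-prefix pattern such as /soap/* matches request paths. The class and method names are illustrative assumptions.

```java
final class ServletConventionSketch {

    /** Default Spring configuration file for a servlet with the given name. */
    static String defaultConfigLocation(String servletName) {
        return "/WEB-INF/" + servletName + "-servlet.xml";
    }

    /** Mimics servlet path-prefix matching for patterns that end in "/*". */
    static boolean matchesPrefixPattern(String urlPattern, String requestPath) {
        if (!urlPattern.endsWith("/*")) {
            return urlPattern.equals(requestPath);
        }
        String prefix = urlPattern.substring(0, urlPattern.length() - 2);
        // "/soap/*" matches "/soap" itself and anything below "/soap/".
        return requestPath.equals(prefix) || requestPath.startsWith(prefix + "/");
    }

    public static void main(String[] args) {
        System.out.println(defaultConfigLocation("ws"));
        System.out.println(matchesPrefixPattern("/soap/*", "/soap/employeeService.wsdl"));
        System.out.println(matchesPrefixPattern("/soap/*", "/soapbox"));
    }
}
```

If you rename the servlet in web.xml, the configuration file name has to change with it, unless a different location is configured explicitly.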

Finally, write the web service class itself, CreateEmployeeService (see Listing 16). As each request is received by the unmarshal function, the corresponding object is created and passed to the relevant function in the service. In the handlecreateEmployeeRequest function, you extract the necessary information to create an Employee object from the CreateEmployeeRequest object, and then make the new object persistent. Note that the name of the function is handlecreateEmployeeRequest (with a lowercase "c" for "create"), not handleCreateEmployeeRequest (with an uppercase "C", as is the convention in the Java language).

Listing 16. CreateEmployeeService.java
package com.xmlimport.service.soap;

import java.util.logging.Logger;

import javax.jdo.PersistenceManager;

import org.springframework.ws.server.endpoint.adapter.MarshallingMethodEndpointAdapter;

import com.sforce.soap.CreateEmployeeRequest;
import com.sforce.soap.CreateEmployeeResponse;
import com.xmlimport.XMLImportPersistenceManagerFactory;
import com.xmlimport.employee.Employee;

public class CreateEmployeeService extends MarshallingMethodEndpointAdapter {
    Logger log = Logger.getLogger(CreateEmployeeService.class.getName());

    // The method name deliberately keeps the lowercase "c" of createEmployeeRequest.
    public CreateEmployeeResponse handlecreateEmployeeRequest(
            CreateEmployeeRequest createEmployeeRequest) {
        // Copy the request fields into a new persistent Employee.
        Employee employee = new Employee();
        employee.setFirstName(createEmployeeRequest.getFirstName());
        employee.setSurName(createEmployeeRequest.getSurName());
        employee.setEmailAddress(createEmployeeRequest.getEmailAddress());
        employee.setId(createEmployeeRequest.getId());
        employee.setHireDate(createEmployeeRequest.getHireDate().getTime());

        PersistenceManager pm =
                XMLImportPersistenceManagerFactory.get().getPersistenceManager();
        try {
            pm.makePersistent(employee);
        } finally {
            // Always release the persistence manager, even if the write fails.
            pm.close();
        }

        CreateEmployeeResponse createEmployeeResponse = new CreateEmployeeResponse();
        createEmployeeResponse.setSuccess(true);
        return createEmployeeResponse;
    }
}
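The unusual capitalization follows from how the SimpleMethodEndpointMapping bean in ws-servlet.xml finds handler methods: as I understand it, it joins a method prefix (handle by default) to the local name of the request's payload root element. The sketch below reproduces that lookup with plain reflection. It assumes the root element is named createEmployeeRequest, and DemoEndpoint is a hypothetical stand-in for the service class, not Spring code.

```java
import java.lang.reflect.Method;

final class HandlerNameSketch {

    /** Default method prefix, as assumed for SimpleMethodEndpointMapping. */
    static final String METHOD_PREFIX = "handle";

    /** Hypothetical endpoint that uses the same method name as Listing 16. */
    public static class DemoEndpoint {
        public String handlecreateEmployeeRequest(String request) {
            return "created " + request;
        }
    }

    /** The handler name is the prefix plus the element's local name, unchanged. */
    static String handlerMethodName(String payloadRootLocalName) {
        return METHOD_PREFIX + payloadRootLocalName;
    }

    /** Returns the matching public method, or null if the endpoint has none. */
    static Method findHandler(Class<?> endpointClass, String payloadRootLocalName) {
        String wanted = handlerMethodName(payloadRootLocalName);
        for (Method method : endpointClass.getMethods()) {
            if (method.getName().equals(wanted)) {
                return method;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Found: the element name is used exactly as it appears in the message.
        System.out.println(findHandler(DemoEndpoint.class, "createEmployeeRequest") != null);
        // Not found: Java-style capitalization does not match the element name.
        System.out.println(findHandler(DemoEndpoint.class, "CreateEmployeeRequest") != null);
    }
}
```

Because the match is case-sensitive, a method named handleCreateEmployeeRequest would not be found for a createEmployeeRequest element.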

Compile and deploy the project to appspot.com. Confirm that you can access the WSDL for the application at <application name>.appspot.com/soap/wsdl/employeeServices.wsdl from a browser. I use the soapUI tool (see Resources for the download link) to test SOAP and REST services. Open the WSDL file using soapUI from the same URL as before, and soapUI automatically creates SOAP request messages with empty elements for firstName, surName, and so on. Enter some values for these and click the green arrow to submit the request. The response should have the success element set to true. From the GAE application dashboard, use the Data Viewer to confirm that the object was created.
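To give an idea of the kind of message soapUI generates, the following sketch assembles a SOAP 1.1 envelope as a string. Only the SOAP envelope namespace is standard. The service namespace urn:example:employee, the emp prefix, the element order, and the date format are assumptions for illustration; the real values come from your WSDL.

```java
final class SoapRequestSketch {

    static final String SOAP_11_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical namespace; use the targetNamespace from your own WSDL.
    static final String SERVICE_NS = "urn:example:employee";

    /** Escapes the characters that are not allowed in XML element text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static String element(String name, String value) {
        return "<emp:" + name + ">" + escape(value) + "</emp:" + name + ">";
    }

    /** Builds a createEmployeeRequest envelope with the fields used in Listing 16. */
    static String createEmployeeEnvelope(String firstName, String surName,
            String emailAddress, String hireDate, int id) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_11_NS + "\""
                + " xmlns:emp=\"" + SERVICE_NS + "\">"
                + "<soapenv:Header/>"
                + "<soapenv:Body>"
                + "<emp:createEmployeeRequest>"
                + element("firstName", firstName)
                + element("surName", surName)
                + element("emailAddress", emailAddress)
                + element("hireDate", hireDate)
                + element("id", Integer.toString(id))
                + "</emp:createEmployeeRequest>"
                + "</soapenv:Body>"
                + "</soapenv:Envelope>";
    }

    public static void main(String[] args) {
        System.out.println(createEmployeeEnvelope(
                "Ada", "Lovelace", "ada@example.com", "2010-09-07", 1));
    }
}
```

Building the message by hand like this is only useful for understanding or debugging; the WSC-generated client shown later creates the message for you.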

Bulk upload from an XML document


While soapUI is fine for creating single entries, to create each of the entries in the employees XML file, you need a client program to parse the file and call the service with each entry. This client uses the same jar file that was generated when you created the web service, but uses the client version of the WSC jar file instead of the GAE-specific version. Listing 17 is the new version of EmployeeHandler.

Listing 17. EmployeeHandler.java
package com.xmlimport.client;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Stack;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.sforce.soap.Connector;
import com.sforce.soap.SoapConnection;
import com.sforce.ws.ConnectionException;
import com.sforce.ws.ConnectorConfig;

public class EmployeeHandler extends DefaultHandler {
    private static final Logger log =
            Logger.getLogger(EmployeeHandler.class.getName());
    private static final SimpleDateFormat hireDateFormat =
            new SimpleDateFormat("yyyy-MM-dd");

    private Stack<Employee> employeeStack;
    // Text of the current element; the parser may deliver it in several chunks.
    private final StringBuilder characters = new StringBuilder();

    public EmployeeHandler() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new InputSource("./employees.xml"), this);
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }

    public void startDocument() throws SAXException {
        employeeStack = new Stack<Employee>();
    }

    public void startElement(String namespaceURI, String localName,
            String qualifiedName, Attributes attributes) throws SAXException {
        // Start collecting text afresh for every element.
        characters.setLength(0);
        if (qualifiedName.equals("employee")) {
            Employee employee = new Employee();
            employee.setEmployeeId(Integer.valueOf(attributes.getValue("id")));
            employeeStack.push(employee);
        }
    }

    public void endElement(String namespaceURI, String simpleName,
            String qualifiedName) throws SAXException {
        if (employeeStack.isEmpty()) {
            return;
        }
        String text = characters.toString();
        if (qualifiedName.equals("employee")) {
            // The employee is complete, so send it to the web service.
            sendEmployeeSOAPMessage(employeeStack.pop());
        } else if (qualifiedName.equals("firstName")) {
            employeeStack.peek().setFirstName(text);
        } else if (qualifiedName.equals("surName")) {
            employeeStack.peek().setSurName(text);
        } else if (qualifiedName.equals("emailAddress")) {
            employeeStack.peek().setEmailAddress(text);
        } else if (qualifiedName.equals("hireDate")) {
            try {
                employeeStack.peek().setHireDate(hireDateFormat.parse(text));
            } catch (ParseException e) {
                log.log(Level.FINE, "Could not parse date {0}", text);
            }
        }
    }

    public void characters(char buf[], int offset, int len) throws SAXException {
        characters.append(buf, offset, len);
    }

    private void sendEmployeeSOAPMessage(Employee employee) {
        ConnectorConfig config = new ConnectorConfig();
        config.setServiceEndpoint("http://xmlimport.appspot.com/soap/");
        Calendar hireDate = Calendar.getInstance();
        hireDate.setTime(employee.getHireDate());
        System.out.println("Creating employee " + employee.getFirstName()
                + " " + employee.getSurName());
        try {
            SoapConnection soapConnection = Connector.newConnection(config);
            boolean success = soapConnection.createEmployee(
                    employee.getFirstName(), employee.getSurName(),
                    employee.getEmailAddress(), hireDate,
                    employee.getEmployeeId());
            System.out.println(success ? "Success" : "Failure");
        } catch (ConnectionException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new EmployeeHandler();
    }
}
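Before pointing the client at the live service, it can help to dry-run the parsing step on its own. The following self-contained sketch applies the same SAX callbacks to an in-memory document and collects a summary of each employee instead of sending SOAP requests. The employees root element and the sample data are assumptions, and the text of each element is accumulated in a StringBuilder because a SAX parser is allowed to deliver it in more than one characters call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class EmployeeSaxSketch extends DefaultHandler {

    private final List<String> summaries = new ArrayList<String>();
    private final StringBuilder text = new StringBuilder();
    private String id;
    private String firstName;
    private String surName;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
        if (qName.equals("employee")) {
            id = attributes.getValue("id");
            firstName = null;
            surName = null;
        }
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("firstName")) {
            firstName = text.toString().trim();
        } else if (qName.equals("surName")) {
            surName = text.toString().trim();
        } else if (qName.equals("employee")) {
            // A real client would call the web service here.
            summaries.add(id + ":" + firstName + " " + surName);
        }
    }

    /** Parses the document and returns one "id:firstName surName" entry per employee. */
    static List<String> parse(String xml) {
        try {
            EmployeeSaxSketch handler = new EmployeeSaxSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.summaries;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse employees", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employees>"
                + "<employee id=\"1\"><firstName>Ada</firstName><surName>Lovelace</surName></employee>"
                + "<employee id=\"2\"><firstName>Alan</firstName><surName>Turing</surName></employee>"
                + "</employees>";
        for (String summary : parse(xml)) {
            System.out.println(summary);
        }
    }
}
```

Counting the collected entries also gives a rough idea of how many service calls a full upload will make.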

As each entry is parsed, the client prints a message to the standard output with the name of the employee being created and whether the creation succeeded or failed. This solution does not have the same timeout restrictions as the previous solutions, but if the client XML file is large enough, creating and persisting each object can exceed the daily quota of CPU time on the appspot.com servers.

Conclusion
In this article, I demonstrated various methods to create objects from the data in an XML document and persist them on the DataStore available to GAE developers. With the final, SOAP-based client and server method, a method to upload bulk XML data is now available to Java developers.


Downloads

Description                          Name                      Size  Download method
Application and service source code  XMLImport-GAEProject.zip  44KB  HTTP
SOAP client source code              XMLImportSOAPClient.zip   24KB  HTTP


Resources
Learn

Eclipse IDE: Access the home page for the Eclipse development environment.
Using the DataStore with JDO guide: See the guide and learn more about storing data in a scalable web application.
Google App Engine: Read more about how to build and host web apps on the same systems that power Google apps.
GAE Datastore API guide: Read the documentation for a schemaless object datastore, with a query engine and atomic transactions.
The Bulk Data Upload for Python: Learn about an alternative, non-Java bulk loader tool that can upload and download data to and from your application's datastore.
W3C XML definition: Learn about this simple, flexible text format that handles large-scale electronic publishing and the exchange of a wide variety of data on the Web and elsewhere.
Exposing SOAP Service on GAE: Explore this three-part article on how to implement a SOAP web service on GAE using Spring.
The SAX Project: Learn how XML parsers can pass information efficiently from XML documents to software apps.
Using Simple for XML serialization: Really does make it simple to go from Java objects to XML (Brian Carey, developerWorks, November 2009): Understand how to convert an XML document to POJO using Simple.
W3C SOAP Specification: Learn about SOAP and how to use its representative message structures and message exchange patterns.
Issue 5: wsc-gae-16.0.jar throws null pointer exception if no parent directory specified for jar output file: Read about a known issue with the procedure for creating the web service from the WSDL file.
W3C WSDL definition: Learn more about WSDL, an XML format that describes network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information.
XML area on developerWorks: Get the resources you need to advance your skills in the XML arena.
My developerWorks: Personalize your developerWorks experience.
IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks. Also, read more XML tips.
developerWorks technical events and webcasts: Stay current with technology in these sessions.
developerWorks on Twitter: Join today to follow developerWorks tweets.
developerWorks podcasts: Listen to interesting interviews and discussions for software developers.

Get products and technologies

JDK for GAE (including the Eclipse plugin): Download and start to build web apps using standard Java technologies and run them on Google's scalable infrastructure.
Spring Framework homepage: Download this platform to build and run enterprise Java applications.
Force.com Web Service Connector for GAE: Download the high performing web service client stack implemented using a streaming parser from Google.
wsc-gae-16_0.jar: Download the GAE version of the Force.com Web Service Connector (WSC) from Force.com.
The soapUI tool: Download this open source Functional Testing Tool, mainly used for web service testing.
wsc-17_0.jar: Download the standalone version of the WSC jar file.
IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2, Lotus, Rational, Tivoli, and WebSphere.

Discuss

XML zone discussion forums: Participate in any of several XML-related discussions.
developerWorks blogs: Check out these blogs and get involved.

About the author


Mr Joseph P. McCarthy


Joseph McCarthy is a Java Developer in the Dublin Software Lab. He joined IBM in July 2002 after graduating from the University of Limerick with a BSc in Computer Systems and a Graduate Diploma in Computer Engineering.

Trademarks
IBM, the IBM logo, ibm.com, DB2, developerWorks, Lotus, Rational, Tivoli, and WebSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. See the current list of IBM trademarks.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.
