
Endeca® Navigation Platform

Data Indexing API Guide


Copyright and Disclaimer
Product specifications are subject to change without notice and do not
represent a commitment on the part of Endeca Technologies, Inc. The
software described in this document is furnished under a license
agreement. The software may not be reverse assembled and may be
used or copied only in accordance with the terms of the license
agreement. It is against the law to copy the software on any medium
except as specifically allowed in the license agreement.

No part of this document may be reproduced or transmitted in any form
or by any means, electronic or mechanical, including photocopying and
recording, for any purpose without the express written permission of
Endeca Technologies, Inc.

Copyright © 2003-2005 Endeca Technologies, Inc. All rights reserved.


Printed in USA.
Corda PopChart® and Corda Builder™ Copyright 1996-2005 Corda
Technologies, Inc.

Outside In® SearchML © 1992-2005 Stellent Chicago, Inc. All rights
reserved.

Rosette® Globalization Platform Portions Copyright © Basis
Technology Corp. 2003-2005. All rights reserved.

Teragram Language Identification Software Portions Copyright ©
1997-2005 Teragram Corporation. All rights reserved.

Trademarks
Don't Stop At Search, Endeca, Endeca InFront, Endeca Navigation
Engine, Guided Navigation, and ProFind are registered trademarks, and
Endeca Data Foundry and Endeca Latitude are trademarks of Endeca
Technologies, Inc.

Basis Technology and Rosette are trademarks of Basis Technology Corp.

All other trademarks or registered trademarks contained herein are the
property of their respective owners.

Endeca Data Indexing API Guide • August 2005


Contents

Preface
Contacting Endeca Standard Customer Support

Chapter 1 Overview of the Data Indexing API
About the Data Indexing API
Data Indexing API Components
WSDL File for the Data Indexing API
Overview of Data Indexing Implementation Process

Chapter 2 System Setup
Installing the Data Indexing API
Starting and Stopping the Web Service
Changing the Web Service Permissions
Web Service Role
Web Service User
Creating the Update Pipelines
Creating the Partial Update Pipeline
Creating the Record Adapter
Creating the Record Manipulator
Creating the Update Adapter
Creating the Dimension Components
Provisioning the System

Chapter 3 Writing Java Client Programs
Java Client Requirements
Using the Java WSDP Software
Creating a Client Configuration File
Generating Client Stubs with the wscompile Tool
Modifying the Stub Source File
Using Apache Axis
Generating Client Stubs with WSDL2Java
Generating Client Stubs with an Ant Task
Writing the Java Client Application
Sample Java Application Program
Invoking the Data Indexing Web Service
Location of the Source Records
Format of the Source Records
Note on Formats of Input and Output Record Files
Creating Records
Queueing the Records
Starting the Update
Monitoring the Update Progress
Catching Data Indexing Exceptions
Clearing the Update Records

Chapter 4 Writing .NET Client Programs
.NET Client Requirements
Creating the DataIndexingService Library
Producing the Client Stub Class
Building the DataIndexingService Library
Writing the .NET Client Application
Adding Reference Libraries
Sample .NET Client Application
Connecting to the Data Indexing Web Service
Starting the Baseline Update
Monitoring and System Error Methods
Catching Exceptions

Chapter 5 Endeca Data Indexing API Reference
DataIndexing Interface
Methods
addContent(String handle, Record[] records)
clearContent(String[] handles)
getSystemStatus()
startBaselineUpdate()
startPartialUpdate()
stopBaselineUpdate()
PVal Class
Constructors
PVal(String name, String value)
PVal()
Methods
getName()
getValue()
setName(String name)
setValue(String value)
Record Class
Constructors
Record(PVal[] values)
Record()
Methods
getValues()
setValues(PVal[] pval)
Status Class
Methods
getSystemErrors()
getSystemState()
SystemError Class
Methods
getComponent()
getErrorMsg()
getRecordSpec()
getSeverity()
DIException Class
Methods
getMessage()
DIInvalidOperation Class
Methods
getMessage()
DIInvalidParameter Class
Methods
getMessage()
DISystemOperation Class
Methods
getMessage()

Appendix A Sample Java Client Code
Client.java Example
Client2.java Example

Index
Preface

The Endeca® Navigation Platform is the foundation for
building applications based on Endeca Navigation Engine®
technology. With the Endeca Navigation Platform, you can
build solutions that allow your users to quickly, precisely,
and easily search and navigate through large data sets,
avoiding all the traditional problems associated with
information overload and finding information online.
Endeca applications generate precise, relevant results with
sub-second response times, even across very large data sets.

The Endeca Navigation Platform allows you to build Guided
Navigation® functionality into your Web applications. The
Endeca Guided Navigation solution puts the results of all
search, navigation, and analytic queries in an organized
context that shows users precisely how to refine and
explore further. This helps solve the problems associated
with information overload by guiding users as they quickly,
precisely, and easily navigate through large data sets. The
Endeca Navigation Platform is based on technology that
makes it possible to scale to very large data sources and
user loads while running on low-cost hardware.

About This Guide


This guide describes the classes and methods of the
Endeca Data Indexing API, and how to use them to
implement baseline and partial updates for your Endeca
system.

Who Should Use This Guide


This guide is intended for developers who are building
applications using the Endeca Navigation Platform.

Symbols and Conventions


IMPORTANT: Text marked as important requires special attention.

Note: Notes provide related information, recommendations, and
suggestions.

The Endeca documentation set uses the following
symbols and conventions:
1. Numbered lists, when the order of the items is
important.
a. Alphabetical lists, when the order of secondary
items is important.
• Bulleted lists, when the order of the items is
unimportant.

Data Indexing API Guide Endeca Confidential



Italic text represents variables for which you should substitute a
value, such as:
C:\RootDirectory\MyDirectory\MyFile

Italic text may also indicate new terms that appear in the
Endeca Glossary.

Courier text indicates code snippets or commands that
you should enter exactly as they are written in the
documentation.

Endeca Documentation Set


Note: In addition to the documentation deliverables listed
below, you can find useful information, including the Endeca
Performance Tuning Guide, in the knowledge base on the
Endeca Customer Support site at https://customers.endeca.com.

The Endeca documentation set consists of the following:


• Endeca Installation Guide for UNIX and Endeca
Installation Guide for Windows describe how to install
Endeca software.
• Endeca Migration Guide provides information on
migrating from previous versions of Endeca software.
• Endeca Concepts Guide introduces the critical
concepts you should understand before learning how
to build an Endeca application. The information in this
guide is the foundation upon which all the other
Endeca documentation depends.


• Endeca Developer's Guide for Java, Endeca Developer's
Guide for COM, and Endeca Developer's Guide for
.NET provide an overview of the Endeca development
process as well as procedures and code snippets for
all non-advanced Endeca development tasks.
• Endeca Advanced Features Guide provides procedures
for implementing advanced Endeca features such as
the Content Acquisition System and partial updates.
• Endeca Administrator's Guide for UNIX and Endeca
Administrator's Guide for Windows provide
information on using Endeca's administrative and
logging tools to configure and manage your Endeca
implementation, and create logging reports.
• Endeca Tools Guide provides information on
configuring and administering Endeca tools, including
the Endeca Manager, Endeca Developer Studio, and
Endeca Web Studio.
• Endeca Developer Studio Help provides online
information for developing data pipelines using the
Endeca Developer Studio.
• Endeca Web Studio Help provides online information
for the administrative tasks, as well as search and
merchandising configuration, that you can do using
Endeca Web Studio.
• Endeca Javadocs provide online access to class and
method descriptions for the Java version of the
Presentation and Logging APIs.
• Endeca API Guide for COM, Endeca API Guide for
Perl, and Endeca API Guide for .NET provide class and
method descriptions for the COM, Perl, and .NET
versions of the Presentation and Logging APIs. The
Endeca API Guide for .NET is in an online format.
• Endeca Security Guide for Java and Endeca Security
Guide for .NET and COM describe how to implement
user authentication and how to structure your data to
limit access to only those users with the correct
permissions. The Java version of this guide also
provides information on using SSL certificates and
encryption to secure your Endeca application.
• Endeca Performance Tuning Guide provides
guidelines on monitoring and tuning the performance
of the Endeca Navigation Engine. It also contains tips
on resolving associated operational issues.
• Endeca Content Adapter Developer's Guide describes
the Content Adapter Development Kit (CADK), a
framework that provides developers with a flexible
and simple mechanism to extract data from a data
source and load it into Forge. The CADK is only
available from Endeca customer support.
• Endeca Data Indexing API Guide provides class and
method descriptions of the Data Indexing API and
describes how to use the API to move source data to
the Forge directory and run updates.
• Endeca Forge API Guide for Perl provides online
information for the class and method descriptions of
the Perl Manipulator component. You can use a Perl
manipulator within a data pipeline to perform record
manipulation.


• Endeca XML Reference provides detailed, online
reference information for the XML files used in a Data
Foundry pipeline.
• Endeca Glossary defines terms used in the Endeca
Navigation Platform documentation set.
• Release Announcement describes the major new
features and changes for the release.
• Release Notes detail the changes specific to the release,
including bug fixes and new features.
• Endeca Third-Party Software Usage and Licenses
provides copyright, license agreement, and/or
disclaimer of warranty information for the third-party
software packages that Endeca incorporates.

Contacting Endeca Standard Customer Support


You can contact Endeca Standard Customer Support
through the online Endeca Support Center
(https://customers.endeca.com).

The Endeca Support Center provides registered users
with important information regarding Endeca software,
implementation questions, product and solution help,
training and professional services consultation, as well as
overall news and updates from Endeca.



Chapter 1
Overview of the Data Indexing API

This chapter provides an overview of the Data Indexing
API, a framework that provides developers with a flexible
mechanism to move data from a data source to the Forge
incoming directory and to stop and start updates
programmatically.

The chapter contains the following sections:


• About the Data Indexing API
• Data Indexing API Components
• Overview of Data Indexing Implementation Process

IMPORTANT: This document assumes that you are already familiar with
Endeca components and terminology as discussed in the Endeca
Concepts Guide and are comfortable programming in languages that can
access Web services, such as Java and C#.

About the Data Indexing API


The Endeca Navigation Engine platform was designed
from the beginning to support rapid application
development and easy integration. To that end, the
platform is based on open standards. End-user
applications are easily built and integrated around
multiple well-defined APIs, such as the Data Indexing
API.

Among the open standards that Endeca supports are the
XML-based Web services standards, including
Simple Object Access Protocol (SOAP) and Web Services
Description Language (WSDL). Endeca’s Web services
support simplifies system-to-system integration,
enabling customers to build
distributed applications that can be shared between and
within enterprises in a way that is easy to maintain as
business processes change.

The Data Indexing API allows users to invoke the Endeca
Data Indexing Web service to programmatically modify
the content of an Endeca system, without going through
the overhead of a baseline update each time a change is
made. Because this Web service is automatically installed
with the Endeca Manager, users are spared the trouble of
setting up the service.


Among the major tasks that can be accomplished via the
Data Indexing API are the following:
• Adding source data records to the system so that they
can be processed by Forge and uploaded to the
Navigation Engine.
• Starting a partial update with the newly-added records.
The new records are automatically loaded in the
Navigation Engine by the update process.
• Deleting records from the Navigation Engine via a
partial update.
• Modifying records in the Navigation Engine via a
partial update.
• Starting a baseline update, using the existing source
records in the Forge incoming directory.
• Monitoring the progress of a baseline or partial
update, including the retrieval of detailed system error
information if the update fails.
• Retrieving system status from the Endeca Manager,
such as whether the system is idle or is processing an
update.

Because it is defined by a WSDL file, the Data Indexing
API is language-agnostic. That is, it can be used with any
programming language that has Web services support.

The API thus lets software developers choose their
preferred development environment (Java, Visual Studio
.NET, and so on) in which to write applications that
consume the data update functions as a Web service.


Samples of writing client applications in the Java and C#
languages are provided in Chapter 3 (“Writing Java Client
Programs”) and Chapter 4 (“Writing .NET Client
Programs”).

Data Indexing API Components


The Data Indexing API consists of the following components,
which are provided with the Endeca software:
• Endeca Data Indexing Web Service. This service,
which is installed as part of the Endeca Manager, runs
automatically under the Endeca Manager and does not
need to be configured.
• DataIndexing.wsdl file, which defines the Data
Indexing API. See the next section for more details.
• This guide, which describes the API and provides
instructions for writing programs to add or modify
data content for the system.

The tomcat-users.xml file lets you change the default
user permissions for the Endeca Data Indexing Web
service. See Chapter 2 for details on changing this file.

In addition, you use these tools to set up the
implementation components:
• Endeca Developer Studio allows you to create and
modify the baseline and partial update pipelines that
are used to perform updates.


• Endeca Web Studio lets you provision the Endeca
Manager with the components that are utilized by
Data Indexing applications.

WSDL File for the Data Indexing API

To create any kind of application that consumes the Data
Indexing Web service, you need the Data Indexing WSDL file, which
describes the API. The file is named DataIndexing.wsdl
and is located as follows:
• On UNIX: $ENDECA_ROOT/lib/services

• On Windows: %ENDECA_ROOT%\lib\services

The WSDL file specifies the value types, exceptions, and
available methods of a Web service in a machine-readable
form. Typically, a client developer uses a
tool that parses the WSDL file and generates client-side
stubs (also called proxy classes) and value types. These
generated files include all the code necessary to serialize
and deserialize SOAP messages, which makes the SOAP layer
transparent to the client developer.

The DataIndexing.wsdl file can be used with any
language that has Web services support.
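
As a rough illustration of the file locations given in this section, the following Java sketch builds the expected path of DataIndexing.wsdl from the ENDECA_ROOT environment variable and reports whether the file is present. The WsdlLocator class and its wsdlPath method are hypothetical helpers written for this example; they are not part of the Data Indexing API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: resolves where DataIndexing.wsdl is expected to be installed. */
class WsdlLocator {

    /** Builds <endecaRoot>/lib/services/DataIndexing.wsdl with the platform's separator. */
    static Path wsdlPath(String endecaRoot) {
        return Paths.get(endecaRoot, "lib", "services", "DataIndexing.wsdl");
    }

    public static void main(String[] args) {
        String root = System.getenv("ENDECA_ROOT");
        if (root == null) {
            System.err.println("ENDECA_ROOT is not set");
            return;
        }
        Path wsdl = wsdlPath(root);
        // Report the resolved path and whether a regular file exists there.
        System.out.println(wsdl + (Files.isRegularFile(wsdl) ? " (found)" : " (not found)"));
    }
}
```

The resolved path is what you would typically hand to a stub-generation tool as its WSDL input.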


Overview of Data Indexing Implementation Process


To use the Data Indexing API, you need to follow these
steps:
1. Install the Endeca Navigation Platform, making sure
that you select the “Endeca Manager and Web Studio”
feature.
2. Use Developer Studio to create the project for your
Endeca implementation, including baseline and partial
update pipelines. When you finish, send the instance
configuration to the Endeca Manager.
3. Use Web Studio to provision the system with the
location and configurations of your implementation.
4. Use Web Studio to run a baseline update. This ensures
that the baseline pipeline and the provisioned
components have been set up successfully.
Alternatively, you can wait until after Step 5 to run the
baseline update with the client application.
5. Write the client-side code, using the Data Indexing
classes and methods.
6. Invoke the API to queue records that will be added to
the data content of the system or otherwise modified.
7. Invoke the API to start a partial update, using the
queued records.
8. Invoke the API to query the system for the status of
the update, including any failed records.
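
Steps 6 through 8 can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the Endeca client code: UpdateService and FakeUpdateService are hypothetical stand-ins for the stub generated from DataIndexing.wsdl, records are simplified to strings, and the state names UPDATING and IDLE and the handle partial_data are invented for illustration. Only the method names addContent and startPartialUpdate come from the API; getSystemState belongs to the Status class in the API and is flattened onto the service here to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the stub generated from DataIndexing.wsdl (not the Endeca API). */
interface UpdateService {
    void addContent(String handle, List<String> records); // the API method takes Record[]
    void startPartialUpdate();
    String getSystemState(); // assumption: flattened from getSystemStatus().getSystemState()
}

/** In-memory fake so the sketch runs without an Endeca installation. */
class FakeUpdateService implements UpdateService {
    private final List<String> queued = new ArrayList<>();
    private int busyPolls = 0;

    @Override
    public void addContent(String handle, List<String> records) {
        queued.addAll(records);
    }

    @Override
    public void startPartialUpdate() {
        if (queued.isEmpty()) {
            throw new IllegalStateException("no records have been queued");
        }
        queued.clear();
        busyPolls = 2; // pretend the update stays busy for two status checks
    }

    @Override
    public String getSystemState() {
        if (busyPolls > 0) {
            busyPolls--;
            return "UPDATING"; // assumed state name
        }
        return "IDLE"; // assumed state name
    }
}

class UpdateWorkflow {
    /** Queues records, starts a partial update, and polls until idle; returns the poll count. */
    static int run(UpdateService service, List<String> records, int maxPolls) {
        service.addContent("partial_data", records); // step 6 (handle name is illustrative)
        service.startPartialUpdate();                // step 7
        for (int polls = 1; polls <= maxPolls; polls++) { // step 8
            if ("IDLE".equals(service.getSystemState())) {
                return polls;
            }
            // A real client would sleep between status checks here.
        }
        throw new IllegalStateException("update still running after " + maxPolls + " polls");
    }

    public static void main(String[] args) {
        int polls = run(new FakeUpdateService(), Arrays.asList("record-1", "record-2"), 10);
        System.out.println("Update finished after " + polls + " status checks");
    }
}
```

Bounding the number of status checks keeps a client from waiting forever if an update never completes.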

Chapter 2
System Setup

This chapter describes how to set up your Endeca
implementation so that you can write programs that utilize
the Data Indexing API methods. It contains the following
sections:
• Installing the Data Indexing API
• Starting and Stopping the Web Service
• Changing the Web Service Permissions
• Creating the Update Pipelines
• Provisioning the System

Installing the Data Indexing API


The Data Indexing API (including the Endeca Data Indexing
Web Service) is automatically installed as part of the Endeca
Manager package. The DataIndexing.wsdl file is installed in
the $ENDECA_ROOT/lib/services directory
(%ENDECA_ROOT%\lib\services on Windows).

After installation, you do not need to configure the Web
service in order for it to start up.

Starting and Stopping the Web Service


When you start the Endeca Manager, it automatically
starts the Data Indexing Web service. Therefore, this
service will always be running whenever the Endeca
Manager is running. Likewise, when you shut down the
Endeca Manager, it automatically shuts down the Web
service.

In other words, you cannot stop or start the Data
Indexing Web service programmatically or from Web
Studio.

Changing the Web Service Permissions


You set access to the Data Indexing Web service with the
tomcat-users.xml file, located in the $ENDECA_CONF/conf
directory (%ENDECA_CONF%\conf on Windows).

Web Service Role

The Web service uses the ewebservices role to determine
which users have access to it. This role is defined in the
tomcat-users.xml file as follows:

<!-- ewebservices : Controls access to webservices -->
<role rolename="ewebservices"/>

Do not change the name of this role because the Endeca
Manager expects this name.


Web Service User

By default, the tomcat-users.xml file assigns the
ewebservices role to the webservices user, with a
password of webservices.

<!-- webservices : User is permitted to access webservices -->
<user username="webservices" password="webservices" roles="ewebservices"/>

For security reasons, it is highly recommended that you
change this default password.

When you create an instance of the Web service in
your application, you set the username and
password as properties on the object.
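
The following Java sketch shows one way the credential properties might be assembled before they are set on the Web service object. It assumes a JAX-RPC style stub, for which the standard property names are the values of javax.xml.rpc.Stub.USERNAME_PROPERTY and javax.xml.rpc.Stub.PASSWORD_PROPERTY; whether your generated stub uses these names depends on the toolkit you choose. The StubCredentials class and the DI_USER and DI_PASSWORD environment variable names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects the credential properties for a Web service stub. */
class StubCredentials {
    // Standard JAX-RPC property names: the values of the constants
    // javax.xml.rpc.Stub.USERNAME_PROPERTY and javax.xml.rpc.Stub.PASSWORD_PROPERTY.
    static final String USERNAME_PROPERTY = "javax.xml.rpc.security.auth.username";
    static final String PASSWORD_PROPERTY = "javax.xml.rpc.security.auth.password";

    /** Returns the properties to set on the stub, rejecting missing credentials. */
    static Map<String, String> credentialProperties(String user, String password) {
        if (user == null || user.isEmpty() || password == null || password.isEmpty()) {
            throw new IllegalArgumentException("user name and password are required");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put(USERNAME_PROPERTY, user);
        props.put(PASSWORD_PROPERTY, password);
        return props;
    }

    public static void main(String[] args) {
        // DI_USER and DI_PASSWORD are hypothetical environment variable names.
        String user = System.getenv("DI_USER");
        String password = System.getenv("DI_PASSWORD");
        if (user == null || password == null) {
            System.err.println("Set DI_USER and DI_PASSWORD before running");
            return;
        }
        Map<String, String> props = credentialProperties(user, password);
        // With a generated JAX-RPC stub, each entry would be applied with
        // ((javax.xml.rpc.Stub) stub)._setProperty(name, value).
        for (String name : props.keySet()) {
            System.out.println("would set " + name);
        }
    }
}
```

Reading the credentials from the environment, rather than hard-coding them, makes it easier to follow the recommendation to change the default password.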

Creating the Update Pipelines


If you will be using your Data Indexing application to run
both baseline and partial updates, you must ensure that
the Endeca Developer Studio project has two pipelines: a
baseline pipeline and a partial update pipeline.

Creating the Partial Update Pipeline

Developer Studio allows you to create both pipelines in
the same project. If your project has only the baseline
pipeline, Developer Studio will open the project with an
empty Partial Pipeline Diagram. Use this diagram to
create the partial update pipeline.


An example of a partial update pipeline (as shown by
Developer Studio’s Partial Pipeline Diagram) is as follows:


In the example, the pipeline components are as follows:
• LoadUpdateData is a record adapter.
• PropDimMapper is a property mapper.
• UpdateManipulator is a record manipulator.
• UpdateAdapter is an update adapter.
• Dimensions and TypeDimension are dimension
adapters.
• DimensionServer is a dimension server.

The following sections provide more details on creating
these components. See the “Implementing Partial
Updates” chapter in the Endeca Advanced Features Guide
for more information on creating partial update pipelines.

Creating the Record Adapter

When you create the record adapter, the General tab of
the Record Adapter editor should have these settings:
• Direction – Must be Input.
• Format – Must be XML. Regardless of the format of the
source records, they are transformed into an XML
format by the Data Indexing API.
• URL – Enter an input URL as a path with the filename
being a pattern. For example, a URL pattern of
../incoming/updates/partial_data_*.xml means that
Forge will read any file in its updates directory whose
name begins with “partial_data_” and has the xml
suffix. Each file that matches the pattern will be read
in sequence.
Note that the DataIndexing.addContent() method, as
used in the client application, will use a handle that
maps to the file path specified by this URL.
• Multi File – Check this box to specify that Forge can
read data from more than one input file and that the
input URL is to be interpreted as a pattern.
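
To get a feel for which file names such a pattern selects, the following Java sketch applies the same wildcard with Java's standard glob matcher. It assumes that Forge treats * the way a Java glob does (any run of characters within a file name), which this guide does not confirm. The UpdateFilePattern class is a hypothetical helper, not part of the Data Indexing API.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Hypothetical helper: checks file names against the example input URL pattern. */
class UpdateFilePattern {
    // Same wildcard as the file name part of ../incoming/updates/partial_data_*.xml
    private static final PathMatcher MATCHER =
            FileSystems.getDefault().getPathMatcher("glob:partial_data_*.xml");

    /** Returns true if the bare file name matches the pattern. */
    static boolean matches(String fileName) {
        return MATCHER.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] names = {
            "partial_data_001.xml", "partial_data_002.xml",
            "baseline_data.xml", "partial_data_001.txt"
        };
        for (String name : names) {
            System.out.println(name + " -> " + matches(name));
        }
    }
}
```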

You can leave the other tabs (Sources, Record Index, and
so on) in their default state.

Creating the Record Manipulator

The “Implementing Partial Updates” chapter in the
Endeca Advanced Features Guide has full details on how
to create this component.

In particular, you must pay close attention to the
UPDATE_RECORD expressions in the record manipulator of
your partial update pipeline. The record keys (properties)
that these expressions expect must match the keys of the
records to be added, modified, or deleted by a partial update
operation. For example, to delete records, an expression
may be looking for a record key named “Remove” with a
value of 1. Any record to be deleted, therefore, must have
this key with a value of 1.
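
The deletion example can be sketched in Java as follows. The nested PVal and Record classes are simplified local stand-ins that mirror only the constructor and getter names listed in the API reference; they are not the classes generated from DataIndexing.wsdl. The “Id” key used to identify the record is hypothetical, and the “Remove” key applies only if your pipeline's UPDATE_RECORD expression looks for it.

```java
/** Sketch: building a record that a pipeline could interpret as a delete request. */
class DeleteRecordSketch {

    /** Local stand-in for the API's PVal (a name/value pair). */
    static final class PVal {
        private final String name;
        private final String value;

        PVal(String name, String value) {
            this.name = name;
            this.value = value;
        }

        String getName() { return name; }
        String getValue() { return value; }
    }

    /** Local stand-in for the API's Record (an array of PVal objects). */
    static final class Record {
        private final PVal[] values;

        Record(PVal[] values) { this.values = values.clone(); }

        PVal[] getValues() { return values.clone(); }
    }

    /** Builds a record carrying its identifying key plus the Remove=1 marker. */
    static Record deletionRecord(String idKey, String idValue) {
        return new Record(new PVal[] {
            new PVal(idKey, idValue),  // hypothetical record key, for example "Id"
            new PVal("Remove", "1")    // key and value the example expression expects
        });
    }

    /** Returns true if the record has a Remove key with a value of 1. */
    static boolean isMarkedForDeletion(Record record) {
        for (PVal pval : record.getValues()) {
            if ("Remove".equals(pval.getName()) && "1".equals(pval.getValue())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Record record = deletionRecord("Id", "1234");
        System.out.println("marked for deletion: " + isMarkedForDeletion(record));
    }
}
```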


Creating the Update Adapter

The update adapter is the component that writes out
partial update files that will be loaded into a running
Navigation Engine. The Update Adapter editor must have
at least these settings:
• Output URL (General tab) – Enter the directory to
which Forge writes the partial update files and
processed records. This will typically be the Dgraph
input directory, such as:
../partition0/dgraph_input/updates/

• Output prefix (General tab) – Enter the filename prefix
for the Forge output files. Use the same prefix as in
the indexer adapter for the baseline pipeline.
• Filter unknown properties (General tab) – Set this so it
matches the Filter Unknown Properties setting in the
indexer adapter of the baseline pipeline.
• Record source (Sources tab) – Enter the name of the
record manipulator.
• Dimension sources (Sources tab) – Enter the name of
the dimension server. You need a dimension source if
you are updating dimensions.
• Enable Agraph support (Agraph tab) – Set this so it
matches the Agraph tab settings in the indexer adapter
of the baseline update pipeline.


The following is an example of an update adapter for the
partial update pipeline:

Creating the Dimension Components

For information on creating dimension adapters and
dimension servers for the partial update pipeline, see the
“Implementing Partial Updates” chapter in the Endeca
Advanced Features Guide.


Provisioning the System


Use Web Studio’s Provisioning System page to provision
the Endeca Manager with the Endeca resources that will
be used to perform update operations on your Endeca
implementation.

Make sure that the “Incoming Directory” field of the
Hosts section points to the location of the source data for
baseline updates, not for partial updates. The location of
the source data for your partial updates will be specified
in the Data Indexing client application.

Refer to the Endeca Tools Guide for full details on
provisioning your system.

Chapter 3
Writing Java Client Programs

This chapter describes how you write, compile, and build a
program, using the methods in the Data Indexing API. It
contains the following sections:
• Java Client Requirements
• Using the Java WSDP Software
• Using Apache Axis
• Writing the Java Client Application

Java Client Requirements


This chapter describes how to write a Java application that
consumes the Endeca Data Indexing Web service. This
chapter assumes that you are writing the application using
one of two sets of Java Web services tools:
• Java WSDP. For starting information, see the section
“Using the Java WSDP Software” on page 30.
• Apache Axis. For starting information, see the section
“Using Apache Axis” on page 35.

After you have read one of these sections, you can continue
with the section “Writing the Java Client Application” on
page 36. The section describes a generic Java sample
program that should apply to both Java Web services
environments.

It is up to you to select the IDE with which you will
develop the application. For example, you can use the
Eclipse Platform available from the www.eclipse.org Web
site.

Using the Java WSDP Software


If you will be using the Java WSDP software, download
the Java Web Services Developer Pack (Java WSDP),
Version 1.4 or later. This integrated toolkit allows Java
developers to build and test XML applications, Web
services, and Web applications with the latest Web
services technologies and standards implementations.

Note: It is recommended that you create an environment
variable that refers to the directory in which the Java WSDP is
installed. Throughout this chapter, we will use JWSDP_HOME as
the name of the variable.

Make sure the JWSDP_HOME/jwsdp-shared/bin directory
has been added to the PATH variable. This allows you to
run Java WSDP programs without having to specify their
complete path.


Creating a Client Configuration File

You must create an XML configuration file that will be the
primary source of information for the wscompile tool. You
can name the file as you wish, so long as it has the .xml
extension.

Note that the JWSDP_HOME/jwsdp-shared/bin directory
contains a client-config.xml template file.

The following example of a client configuration file
(named config.xml) will work with the Data Indexing
API WSDL file:

<?xml version="1.0" encoding="UTF-8" ?>
<configuration xmlns="http://java.sun.com/xml/ns/jax-rpc/ri/config">
  <wsdl location="DataIndexing.wsdl"
        packageName="endeca"/>
</configuration>

The configuration file begins with the standard XML
prolog, using UTF-8 for the encoding.

All the elements must be within the main configuration
element. The meanings of the configuration subelements
are as follows:

Element/Attribute   Purpose
xmlns               Sets the namespace of the configuration file.
                    You should use the java.sun.com namespace
                    that is shown in the example.
wsdl                Defines the attributes of the WSDL file.
location            The location of the DataIndexing.wsdl file.
                    You can specify a path or a URL.
packageName         The name of the package that will contain the
                    generated stub classes.

Other configuration elements are described in the JWSDP
documentation in the JWSDP_HOME/jaxrpc/docs directory.
However, the above elements are all you will need to
create a Data Indexing Services client.

Generating Client Stubs with the wscompile Tool

The wscompile tool can generate stubs, ties, serializers,
and other files used in JAX-RPC clients and services. For
the Data Indexing API, the tool uses the above
configuration file as input and generates .class and
.java files based on the WSDL definitions.

The tool is named wscompile.sh on UNIX and
wscompile.bat on Windows, and is located in the
JWSDP_HOME/jaxrpc/bin directory.

You can display a usage list with the -help option:


wscompile -help


Of the various options, the two that you will use are the
following:

Option        Purpose
-gen:client   Generates client artifacts, such as stubs.
-keep         Keeps the source files (.java files) for the stubs
              and class files. This is necessary because you
              will modify one of the generated source files. If
              you do not use this option, only .class files will
              be produced.

The following is an example of using this tool with the
sample config.xml file as input:
wscompile -gen:client -keep config.xml

The tool will create a directory with the name that was
specified with the packageName attribute in the client
configuration file. The directory will contain a number of
files, including the following:
• The Stub class, named DataIndexing_Stub.class,
representing the Web service proxy and implementing
the DataIndexing interface.
• The Service implementation class, named
DataIndexingService_Impl.class.
• SOAP serializers and deserializers for every method in
the interface.

The next step is to modify the Stub generated source file.


Modifying the Stub Source File

Because multirefs are turned off in Axis (which is shipped
with the product), you will need to manually edit the
DataIndexing_Stub.java file to fix a problem that will
occur if an exception is thrown.

To modify the Stub source file:

1. Use an editor to open the DataIndexing_Stub.java
file.
2. Search for the _readBodyFaultElement method.
3. Add the following call before the switch statement:
deserializationContext.pushEncodingStyle(
"http://schemas.xmlsoap.org/soap/encoding/");

The resulting code should look like this:


Object faultInfo = null;
int opcode = state.getRequest().getOperationCode();
deserializationContext.pushEncodingStyle(
"http://schemas.xmlsoap.org/soap/encoding/");
switch (opcode) {
...

4. Add the following call to the default block of the
switch statement:

deserializationContext.popEncodingStyle();

The resulting code should look like this:


default: deserializationContext.popEncodingStyle();
return super._readBodyFaultElement(bodyReader,
deserializationContext, state);
...


5. Recompile the DataIndexing_Stub.java file.

It is important to keep in mind that the above procedure
modifies a generated file. This means that if you run
wscompile again, it will overwrite the Stub source file
unless you first moved it elsewhere. If the file is
overwritten, you will need to repeat the procedure.

Using Apache Axis


To create a client using Axis, you must download and
install the Apache Axis Java distribution. Make sure that
you put the JAR files into your classpath. If you choose to
use ant, install the axis-ant.jar appropriately by placing
it in <ant_home>/lib.

Generating Client Stubs with WSDL2Java

The Axis WSDL2Java tool can generate stubs based on
the Data Indexing WSDL file as input, as shown in the
following example of running the tool:

java org.apache.axis.wsdl.WSDL2Java -p endeca DataIndexing.wsdl

The -p option maps all namespaces in the WSDL file to
the same Java package name.

The tool will create a directory with the name that was
specified with the -p option.


Generating Client Stubs with an Ant Task

You can also use the axis-wsdl2java task to create Java
classes from the Data Indexing WSDL file. To do so, you
must build an Ant script that defines this task. Refer to the
Axis and Ant documentation for details.

Writing the Java Client Application


After you have generated the client-side class files, you
can write the client application, using the Data Indexing
API classes and methods described in Chapter 5, “Endeca
Data Indexing API Reference”.

Sample Java Application Program

The complete source code for the sample program used
in this chapter is in Appendix A of this guide. The
program adds new records to the system and then calls
the Endeca Manager to start a partial update.

It is assumed that the Endeca Manager is running and has
been provisioned with Web Studio. After the update is
begun, the application checks the system status and
displays a message when the update finishes.

Invoking the Data Indexing Web Service

Your application must connect to the Data Indexing Web
service. The sample program instantiates a Web service
object (DataIndexing object) by calling the initService
private method:
DataIndexing server = initService();

The initService method is defined as follows:

private static DataIndexing initService() throws Exception
{
    DataIndexingService_Impl locator = new DataIndexingService_Impl();
    DataIndexing server = locator.getDataIndexing();

    // set the address of our service
    ((Stub)server)._setProperty(Stub.ENDPOINT_ADDRESS_PROPERTY,
        "http://localhost:8888/services/DataIndexing");
    // set the user name and password for our service
    // (uses Tomcat container authentication)
    ((Stub)server)._setProperty(Stub.USERNAME_PROPERTY, "webservices");
    ((Stub)server)._setProperty(Stub.PASSWORD_PROPERTY, "K07YZ17MP1945Q");

    return server;
}

The javax.xml.rpc.Stub interface provides a property
mechanism for the dynamic configuration of a stub
instance for authentication purposes. The static constants
Stub.USERNAME_PROPERTY and Stub.PASSWORD_PROPERTY are
set with the Web service’s username and password that
are defined in the tomcat-users.xml file.

The location of the Web service is set with the
Stub.ENDPOINT_ADDRESS_PROPERTY constant. The location
begins with the machine name (which can be localhost
for a local machine) and port on which the Endeca
Manager is running (default port is 8888), plus the
/services/DataIndexing directory.
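
As a small illustration of how the endpoint address is put together, the following sketch builds the URL from a machine name and port. The class and method names (EndpointAddress, forHost) are hypothetical and are not part of the Data Indexing API; only the default port (8888) and the /services/DataIndexing path come from the description above.

```java
class EndpointAddress {
    // Default Endeca Manager port and service path, taken from the text above.
    static final int DEFAULT_PORT = 8888;
    static final String SERVICE_PATH = "/services/DataIndexing";

    /** Builds the endpoint URL for a host and port (hypothetical helper). */
    static String forHost(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "http://" + host + ":" + port + SERVICE_PATH;
    }

    public static void main(String[] args) {
        // Prints http://localhost:8888/services/DataIndexing
        System.out.println(forHost("localhost", DEFAULT_PORT));
    }
}
```

In a client, the returned string could be passed as the value of Stub.ENDPOINT_ADDRESS_PROPERTY, so that the host and port can be changed without editing the initService method.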


Note that the locator class may be different depending on
which Web services technology you are using. The above
sample client uses the DataIndexingService_Impl locator
class that was produced by the Java WSDP wscompile
tool. In contrast, an Axis client might use the
DataIndexingServiceLocator class produced by the Axis
WSDL2Java tool.

Location of the Source Records

The directory where the source data resides is specified
by a private String constant:
private static final String c_strPartialUpdateDataDir =
    "C:\\Projects\\partial_updates_data\\";

The name of the file that contains the source records,
prepended with its directory path, is also specified by a
private String constant:
private static final String c_strAddDataInput =
    c_strPartialUpdateDataDir+"mexico.txt";

The c_strAddDataInput value will then be passed to the
addContentHelper private method to read in the file.

Format of the Source Records

The sample program expects a Delimited format for the
source records to be added. The records are in a text file
named mexico.txt (which contains information about
airports in Mexico).

The file begins with a header row of property names (i.e.,
keys), with each property being delimited by the pipe (|)
character:
AirportCode|CityOrAirportName|Country|...|

Each source record is delimited by pipe/sharp (|#)
characters, with its property values delimited by the pipe
character:
|#ACA|ACAPULCO ALVRZ INTL|Mexico|...|
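
The delimited format described above can be parsed with a few lines of Java. The following is a minimal sketch, not the addContentHelper method from the sample program: the DelimitedRecordParser class and its parse method are hypothetical names, the sample data is cut down to three properties, and the code assumes that no property value begins with the # character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DelimitedRecordParser {
    /**
     * Splits delimited text into rows of fields. The first row is the header
     * (property names); each later row holds one record's property values.
     */
    static List<String[]> parse(String text) {
        // Remove line breaks so single-line and multi-line files parse alike.
        String flat = text.replace("\r", "").replace("\n", "");
        List<String[]> rows = new ArrayList<>();
        for (String segment : flat.split("\\|#")) {   // "|#" starts each record
            if (segment.isEmpty()) {
                continue;
            }
            if (segment.endsWith("|")) {               // drop one trailing pipe
                segment = segment.substring(0, segment.length() - 1);
            }
            rows.add(segment.split("\\|", -1));        // keep empty values
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample = "AirportCode|CityOrAirportName|Country|\n"
                      + "|#ACA|ACAPULCO ALVRZ INTL|Mexico|\n";
        for (String[] row : parse(sample)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The first array returned by parse can serve as the property keys, and each remaining array as the values for one record.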

Keep in mind that the source record format must conform
to what is expected by the UPDATE_RECORD expressions in
the partial update pipeline’s record manipulator. In our
sample project, the record manipulator:
• Deletes records that have a Remove key with a value
of 1.
• Updates records that have an Update key with a value
of 1.
• Adds records that do not have Remove or Update
keys.

The addContentHelper private method reads in and parses
the content of the source record text file. The method
uses a java.io.BufferedReader wrapped
around a java.io.FileReader to read in the stream of
characters from the file:
BufferedReader in = new BufferedReader(new FileReader(strInputfile));
String line;
int iCtr = 0;
String[] strKeys=null;
while ((line = in.readLine()) != null) {
...

The BufferedReader.readLine() method actually reads in
the file content as one line, into the line String variable.
This variable will then be parsed for the property values
for each record.

Note on Formats of Input and Output Record Files

You should keep in mind that there is a difference
between the format of a client program’s input file (which
contains the source data) and the resulting output file
produced by the Data Indexing API.

The input file in our sample client is a simple text file
(.txt extension) that contains delimited records. That is,
delimiter characters separate each record from the next
and also separate the property keys and values. However,
you can use other input formats, such as JDBC records or
XML files. In all cases, it is up to your application to read
and parse the source records.

When the records are transferred via the addContent()
method, the resulting output file will always be in an XML
format, regardless of the format of the input file. You do
not have to worry about this XML transformation because
it is done automatically by the Data Indexing API. The
resulting XML output is also why you must specify XML
as the format of the record adapter in the partial update
pipeline, as explained in “Creating the Update Adapter”
on page 25.

Creating Records

The heart of the application is the creation of the Record
objects that will be added to the system by the partial
update process. Each Record object will consist of an
array of multiple PVal objects. In turn, each PVal object
will consist of a property name (such as “AirportCode”)
and its corresponding value (such as “ACA”).

Each Record object is created as follows:

PVal[] keyvals = new PVal[strKeys.length];
for (int j=0; j<strKeys.length; j++) {
    keyvals[j] = new PVal();
    keyvals[j].setName(strKeys[j]);
    keyvals[j].setValue(strVals[j]);
}
...
Record r = new Record();
r.setValues(keyvals);
Record[] recs = {r};

The strKeys variable is a String array that contains the
names of the properties (keys), while strVals is a String
array that contains the property values.


The record-building procedure in the sample program is
as follows:
1. Create an array (named keyvals) of PVal objects:
PVal[] keyvals = new PVal[strKeys.length];

The size of the array is the number of elements in the
strKeys variable (that is, the number of properties).

2. Start a For loop that will execute once for each
property:
for (int j=0; j<strKeys.length; j++)

3. Construct an empty PVal object:
keyvals[j] = new PVal();

4. Use the PVal.setName() method to set the name of the
property (such as “CityOrAirportName”) in the PVal
object:
keyvals[j].setName(strKeys[j]);

and the PVal.setValue() method to set the value of
that property (such as “ISLA MUJERES”):
keyvals[j].setValue(strVals[j]);

5. Continue the For loop to construct the remaining PVal
objects.
6. After the For loop finishes, construct an empty Record
object (named r):
Record r = new Record();

7. Use the Record.setValues() method to set the array of
PVal objects in the Record object:
r.setValues(keyvals);


8. At this point, you have a populated Record object.
However, you cannot queue an individual record; you
must send an array of records. Therefore, you build
the array of records (named recs) as follows:
Record[] recs = {r};

Note that the above sample queues a Record array
consisting of only one record. The reason is to
concentrate on showing how to build Record and PVal
objects. However, a more efficient way is illustrated by
the Client2.java program (see “Appendix A”), which
inserts all the records into the recs array.
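
The idea of collecting all records into one array can be sketched as follows. This is not the Client2.java code from Appendix A. The PVal and Record classes below are simplified stand-ins for the generated classes, included only so that the example compiles on its own; their getter methods are assumptions, and the RecordBatchBuilder, toRecord, and toRecords names are hypothetical.

```java
import java.util.List;

class RecordBatchBuilder {
    // Stand-ins for the generated PVal and Record classes, included only so
    // that this sketch compiles on its own. The getters are assumptions.
    static class PVal {
        private String name;
        private String value;
        void setName(String name) { this.name = name; }
        void setValue(String value) { this.value = value; }
        String getName() { return name; }
        String getValue() { return value; }
    }

    static class Record {
        private PVal[] values;
        void setValues(PVal[] values) { this.values = values; }
        PVal[] getValues() { return values; }
    }

    /** Builds one Record from parallel arrays of property names and values. */
    static Record toRecord(String[] keys, String[] vals) {
        if (keys.length != vals.length) {
            throw new IllegalArgumentException(
                "Expected " + keys.length + " values but got " + vals.length);
        }
        PVal[] keyvals = new PVal[keys.length];
        for (int j = 0; j < keys.length; j++) {
            keyvals[j] = new PVal();
            keyvals[j].setName(keys[j]);
            keyvals[j].setValue(vals[j]);
        }
        Record r = new Record();
        r.setValues(keyvals);
        return r;
    }

    /** Collects every row into a single Record array for one addContent() call. */
    static Record[] toRecords(String[] keys, List<String[]> rows) {
        Record[] recs = new Record[rows.size()];
        for (int i = 0; i < recs.length; i++) {
            recs[i] = toRecord(keys, rows.get(i));
        }
        return recs;
    }
}
```

Checking that each row has as many values as there are keys is a design choice of this sketch; it reports a malformed source row before anything is queued.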

Queueing the Records

You use the DataIndexing.addContent() method to
queue the source records to the Endeca Manager for an
update, as in the example from the sample program:
server.addContent(strHandler, recs);

The method takes two parameters:


• An output file handle (strHandler in the example).
• An array of Record objects (recs in the example). The
previous section describes how to create this array.

The file handle value must map to the file path that is
specified in the URL field in the record adapter of the
partial update pipeline. For example, the record adapter in the
sample project used by this application has this URL field
setting:

The partial_data_*.xml part of the file path means that
any file handle that begins with partial_data_ will map
to this record adapter. The sample application defines
these three file handles:
private static final String c_strAddHandler = "partial_data_add_data";
private static final String c_strDelHandler = "partial_data_del_data";
private static final String c_strModHandler = "partial_data_mod_data";

All three will map to the partial_data_*.xml URL path of
the record adapter.
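
A client can check its file handles against the record adapter's URL setting before it queues any records. The following sketch only mirrors the rule stated above (a handle maps to the adapter if it begins with the part of the file name that precedes the wildcard). The FileHandleCheck class and mapsToAdapter method are hypothetical, the sketch assumes forward slashes in the path, and it does not describe how the Endeca Manager itself resolves handles.

```java
class FileHandleCheck {
    /**
     * Returns true if the handle begins with the fixed part of the file name
     * that precedes the '*' wildcard in the record adapter's URL setting.
     */
    static boolean mapsToAdapter(String handle, String adapterUrl) {
        int star = adapterUrl.indexOf('*');
        if (star < 0) {
            throw new IllegalArgumentException("URL has no wildcard: " + adapterUrl);
        }
        // Keep only the file-name portion before the wildcard.
        String beforeStar = adapterUrl.substring(0, star);
        String prefix = beforeStar.substring(beforeStar.lastIndexOf('/') + 1);
        return !prefix.isEmpty() && handle.startsWith(prefix);
    }

    public static void main(String[] args) {
        // Prints true
        System.out.println(mapsToAdapter("partial_data_add_data", "partial_data_*.xml"));
    }
}
```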


Starting the Update

Use the appropriate method to start the update:


• DataIndexing.startBaselineUpdate() will start a
baseline update.
• DataIndexing.startPartialUpdate() will start a partial
update.

Neither method takes any arguments.

To start the update, the sample program calls a
private method (named doUpdate), which starts a baseline
update when it is passed the string “baseline” and a
partial update otherwise:
doUpdate(server, "baseline");
...
private static void doUpdate(DataIndexing server, String strUpdate)
    throws Exception
{
    System.err.println("Starting "+strUpdate+" update");
    if (strUpdate.equals("baseline"))
        server.startBaselineUpdate();
    else
        server.startPartialUpdate();
}

After a baseline update begins, it can be stopped with the
DataIndexing.stopBaselineUpdate() method. Partial
updates, however, have no corresponding stop method.
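
Because doUpdate treats every string other than "baseline" as a request for a partial update, a misspelled argument silently starts a partial update. One possible alternative, shown here only as a sketch, is to select the update with an enum. The UpdateService interface is a stand-in for the two start methods of the generated DataIndexing interface (which may declare exceptions that are omitted here), and the UpdateLauncher and UpdateType names are hypothetical.

```java
import java.util.Locale;

class UpdateLauncher {
    /** Stand-in for the two start methods of the generated DataIndexing interface. */
    interface UpdateService {
        void startBaselineUpdate();
        void startPartialUpdate();
    }

    /** The two kinds of update that can be started. */
    enum UpdateType { BASELINE, PARTIAL }

    /** Starts the requested update; the compiler rejects any other value. */
    static void doUpdate(UpdateService server, UpdateType type) {
        System.err.println("Starting " + type.name().toLowerCase(Locale.ROOT) + " update");
        switch (type) {
            case BASELINE:
                server.startBaselineUpdate();
                break;
            case PARTIAL:
                server.startPartialUpdate();
                break;
        }
    }
}
```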


Monitoring the Update Progress

The following methods allow you to monitor the progress
of the update and to detect system errors:
• DataIndexing.getStatus() retrieves the system status
(as a Status object) from the Endeca Manager. The
returned information includes the status of the update
operation and any error messages relating to data that
is being updated in the system.
• Status.getSystemState() returns a string that
represents the state of the system (that is, what the
system is currently doing):
− “UPDATING” means a baseline or partial update is
in progress.
− “IDLE” means that the system is not performing an
update operation. The system, however, may be
performing other types of operations, such as
searches.
− “SYSTEM_ERROR” means the system is in an error
state caused by an update operation.
• Status.getSystemErrors() returns an array of
SystemError objects.
• SystemError class methods, such as getErrorMsg(),
allow you to get the information in a SystemError
object. This information includes the name of the
Endeca component that reported the error (such as
“FORGE”), the error message, the identifier of the
record that was in error, and the severity level of the
error (which is “ERROR”, “WARNING”, or “FATAL”).


Keep in mind that a system status of “IDLE” does not
mean that all records were successfully updated. That is,
SystemError objects may be returned even if the system
status is “IDLE” and not “SYSTEM_ERROR”. This happens
when an overall error does not occur, but some records
failed. For example, if a partial update operation
successfully adds 48 out of 50 records but fails to add two
records, the Status.getSystemState() method would return
a status of “IDLE” but Status.getSystemErrors() would
return two SystemError objects.
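
The distinction between the system state and the per-record errors can be captured in a small helper. The following sketch is an illustration only: the UpdateOutcome class, its Result values, and the classify method are hypothetical names, and the helper works on a state string and an error count rather than on the generated Status class.

```java
class UpdateOutcome {
    enum Result { STILL_RUNNING, SUCCEEDED, COMPLETED_WITH_RECORD_ERRORS, FAILED }

    /**
     * Interprets a system state together with the number of SystemError
     * objects that were returned with it.
     */
    static Result classify(String systemState, int systemErrorCount) {
        if ("UPDATING".equals(systemState)) {
            return Result.STILL_RUNNING;
        }
        if ("SYSTEM_ERROR".equals(systemState)) {
            return Result.FAILED;
        }
        if ("IDLE".equals(systemState)) {
            // IDLE alone does not guarantee that every record was updated.
            return systemErrorCount > 0
                ? Result.COMPLETED_WITH_RECORD_ERRORS
                : Result.SUCCEEDED;
        }
        throw new IllegalArgumentException("Unknown system state: " + systemState);
    }
}
```

For the example above (48 of 50 records added), a call such as classify("IDLE", 2) returns COMPLETED_WITH_RECORD_ERRORS rather than SUCCEEDED.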

The sample program uses a while statement to monitor
the update:
while (true) {
    System.err.println("Checking status...");
    status = server.getStatus();
    if (!"UPDATING".equals(status.getSystemState()))
        break;
    System.err.println("Status is "+status.getSystemState());
    Thread.sleep(3000);
}

The two important monitoring actions of the while
statement are:
1. The system status is retrieved with the
DataIndexing.getStatus() method.
2. The system state is retrieved (as a string) with the
Status.getSystemState() method and compared to
the string “UPDATING”. If the two strings are equal,
the loop continues and the system state is displayed; if
they are not equal, the control flow breaks to the next
statement.
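
The sample loop runs until the state changes, however long that takes. A variation with an upper time limit is sketched below. It is not taken from the sample program: the UpdatePoller class and waitForUpdate method are hypothetical, and the state is read through a java.util.function.Supplier so that the sketch does not depend on the generated classes.

```java
import java.util.function.Supplier;

class UpdatePoller {
    /**
     * Polls until the state is no longer "UPDATING" or the timeout elapses,
     * and returns the last state that was seen.
     */
    static String waitForUpdate(Supplier<String> stateSource,
                                long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        String state = stateSource.get();
        while ("UPDATING".equals(state) && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                break;                              // and stop waiting
            }
            state = stateSource.get();
        }
        return state;
    }
}
```

In a real client, the supplier would wrap a call to server.getStatus().getSystemState() and convert any checked exception that call declares. A return value of "UPDATING" then means that the time limit was reached before the update finished.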


The while statement breaks when the system state is
either “IDLE” (which signifies the completion of a
successful update) or “SYSTEM_ERROR” (which means
the update was unsuccessful).

However, because a system status of “IDLE” does not
mean that all records were successfully updated, the
program first prints out the update status and then prints
out the contents of SystemError objects, if any exist:
System.err.println("Update status: " + status.getSystemState());
SystemError[] errors = status.getSystemErrors();
for (int i=0; i<errors.length; i++) {
    System.err.println("Error msg: "+errors[i].getErrorMsg()+
        ", component: "+errors[i].getComponent()+
        ", record spec: "+errors[i].getRecordSpec()+
        ", severity: "+errors[i].getSeverity());
}

If an error did occur, the Status.getSystemErrors()
method will get the complete error state and the
SystemError methods will get the error details, such as
the component that reported the error. If there are no
SystemError objects, the For loop will not be executed.

By using these methods, you can quickly ascertain what
caused the update to fail.


Catching Data Indexing Exceptions

An exception represents a more serious problem than the
SYSTEM_ERROR status or SystemError objects. Therefore,
the Data Indexing API provides four Data Indexing
exception classes:
• DIException represents all exceptions thrown by Data
Indexing related classes. This class extends the Java
Exception class (java.lang.Exception). Note that the
DISystemException and DIInvalidOperation
exceptions inherit from DIException.
• DIInvalidParameter represents exceptions thrown
because of parameters that were not valid for a Data
Indexing method.
• DIInvalidOperation represents exceptions thrown
from operations that were not valid for the state
the system was in. For example, calling the
addContent() method when a partial update is in
progress will throw this exception. Typically, users can
programmatically recover from this exception (by
waiting for the update to finish, for example).
• DISystemException represents exceptions that are
more serious than DIInvalidOperation exceptions.
Typically, you cannot recover programmatically from
this exception, but instead must use some manual
intervention, such as reprovisioning the system.

Each class has a getMessage() method that retrieves a
String object that describes the exception.


See Chapter 5, “Endeca Data Indexing API Reference”, to
find out which exceptions are thrown by the Data
Indexing API methods.

The sample program uses a try block in its main
procedure and four catch clauses for the exceptions:
try {
    DataIndexing server = initService();
    ...
    doClearContent(server);
}
catch (DIInvalidOperation de) {
    System.err.println("DIInvalidOperation caught.");
    System.err.println(de.getMessage());
}
catch (DIInvalidParameter de) {
    System.err.println("DIInvalidParameter caught.");
    System.err.println(de.getMessage());
}
catch (DISystemException de) {
    System.err.println("DISystemException caught.");
    System.err.println(de.getMessage());
}
catch (Exception e) {
    System.err.println("Exception caught.");
    System.err.println(e.getMessage());
}

When the JVM confronts this sequence of multiple catch
clauses, it searches for the appropriate clause from the
sequence's top to its bottom. If it finds a match, that
clause executes.

Note that the final catch clause is for a standard Java
Exception object. This clause should catch exceptions
that are not Data Indexing specific exceptions.


Although the sample program just prints out the
exception’s message, you can code your application to
examine the exception and attempt to programmatically
recover from the error. As mentioned above, you might
be able to recover from a DIInvalidOperation exception,
such as attempting to start an update while one is already
in progress.
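
One way to recover programmatically is to wait and repeat the call a limited number of times. The following sketch shows that pattern in a generic form. The RetryOnRecoverable class and its call method are hypothetical, and the exception type to retry on is passed in as a parameter so that the sketch does not depend on the generated classes; a Data Indexing client would presumably pass DIInvalidOperation.class.

```java
import java.util.concurrent.Callable;

class RetryOnRecoverable {
    /**
     * Runs the operation, retrying after a pause when it throws an exception
     * of the recoverable type. Any other exception, or a recoverable one on
     * the last attempt, is rethrown to the caller.
     */
    static <T> T call(Callable<T> operation,
                      Class<? extends Exception> recoverable,
                      int maxAttempts, long waitMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (!recoverable.isInstance(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(waitMillis); // give the running update time to finish
            }
        }
    }
}
```

Limiting the number of attempts keeps the client from waiting indefinitely if the system never leaves the state that caused the exception.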

Clearing the Update Records

After an update ends successfully, it is recommended that
your application clear the records that were previously
enqueued by DataIndexing.addContent() methods.

Use the DataIndexing.clearContent() method to clear
the records. The record files are deleted locally (in the
Endeca Manager's update directory) as well as in the
Forge directory.

The sample application clears the records as follows:


server.clearContent(new String[] {c_strAddHandler,
c_strDelHandler, c_strModHandler});

Because the DataIndexing.clearContent() method takes
a String array as an argument, you can clear multiple file
handles, even those that were not used. The sample
program, for example, clears all three file handles, even
though only the c_strAddHandler was used.

Chapter 4
Writing .NET Client Programs

This chapter describes how you write, compile, and build a
program, using the C# language with the methods in the
Data Indexing API. It contains the following sections:
• .NET Client Requirements
• Creating the DataIndexingService Library
• Writing the .NET Client Application

.NET Client Requirements


This chapter describes how to write a .NET application that
consumes the Endeca Data Indexing Web service.

The chapter assumes that you will be writing your .NET
application in C# using Microsoft’s Visual Studio .NET
development tool. You must also install the Microsoft .NET
Framework and the Framework SDK.

Creating the DataIndexingService Library


Your first task is to create a DataIndexingService.dll file
that you will reference in your application project. The
creation of this DLL file involves two steps:
1. Produce a client proxy stub class.
2. Compile the class into the DLL file.

Producing the Client Stub Class

The Microsoft .NET Framework SDK includes the Web
Services Description Language Tool (wsdl.exe) that can
generate code for XML Web service clients from WSDL
contract files (.wsdl files).

You can display a usage list with the /? option:


wsdl /?

The tool’s /language option (abbreviated as /l) lets you
specify the language of the generated client stub:

Language Option   Purpose
/l:CS             Generates a C# client stub. This is the default.
/l:VB             Generates a Visual Basic client stub.
/l:JS             Generates a JScript client stub.

The following is an example of using this tool with the
DataIndexing.wsdl file as input:

C:\NetClient> wsdl /language:CS DataIndexing.wsdl
Microsoft (R) Web Services Description Language Utility
[Microsoft (R) .NET Framework, Version 1.0.3705.0]
Copyright (C) Microsoft Corporation 1998-2001. All rights reserved.

Writing file 'C:\NetClient\DataIndexingService.cs'.

C:\NetClient>

The above example will produce a client stub proxy class
named DataIndexingService.cs in the C# language. The
next step is to compile the class to produce a library.

Building the DataIndexingService Library

Use the Microsoft C# compiler (csc.exe) to build the
DataIndexingService DLL library from the C# source
code in the DataIndexingService.cs file.

The compiler’s /target:library option (abbreviated as
/t:library) builds a DLL library file. You can use the
/out option to specify the exact location where the DLL
should be stored.

The following is an example of using the compiler with
the DataIndexingService.cs file as input:

C:\NetClient>csc /t:library DataIndexingService.cs
Microsoft (R) Visual C# .NET Compiler version 7.00.9951
for Microsoft (R) .NET Framework version 1.0.3705
Copyright (C) Microsoft Corporation 2001. All rights reserved.

C:\NetClient>


The example will generate the DataIndexingService.dll
library in the current directory. You can then include this
library in your Data Indexing API client application.

Writing the .NET Client Application


The application example in these sections assumes that
you are using the Visual C# .NET development tool to
write and build your client application. Because this
Visual C# project is fairly simple, it uses the Console
Application template.

Adding Reference Libraries

Besides the DataIndexingService.dll library, you will
typically need to add some extra libraries to your
project in order for it to compile.

To add the libraries to your .NET project:

1. Use Visual C# .NET to open your project.
2. From the Project menu, select Add Reference. This
command opens the Add Reference dialog where you
can add library references to your project.
3. From the .NET tab of the Add Reference dialog, select
the System.Web.dll, System.Web.Services.dll, and
System.Web.RegularExpressions.dll, and click OK.
4. Use the Add Reference dialog to add the
DataIndexingService.dll library to the project.


Sample .NET Client Application

The C# code for the sample application is as follows.

using System;

namespace ConsoleApplication1
{
    /// <summary>
    /// Class to start a baseline update.
    /// </summary>
    class Class1
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            DataIndexingService di=new DataIndexingService();
            di.Url="http://localhost:8888/services/DataIndexing";
            di.PreAuthenticate=true;
            di.Credentials=new System.Net.NetworkCredential("webservices",
                "K07YZ17");

            try
            {
                di.startBaselineUpdate();
                Status s=di.getStatus();
                System.Console.WriteLine(s.systemState);
            }
            catch (System.Web.Services.Protocols.SoapException e)
            {
                System.Console.WriteLine(e.Detail.InnerXml);
            }
            System.Console.ReadLine();
        }
    }
}


Connecting to the Data Indexing Web Service

Your application must connect to the Data Indexing Web
service using code similar to the following example:

DataIndexingService di=new DataIndexingService();
di.Url="http://localhost:8888/services/DataIndexing";
di.PreAuthenticate=true;
di.Credentials=new System.Net.NetworkCredential("webservices", "K07YZ17");
...

In the DataIndexingService.dll library, the
DataIndexingService class is a descendant of the
Microsoft .NET Framework
System.Web.Services.Protocols.WebClientProtocol
class. This means that the DataIndexingService class
inherits the WebClientProtocol class members.

Three WebClientProtocol properties are set in the Web
service instantiation for identification and authentication
purposes:
• The Url property sets the base URL of the XML Web
service the client is requesting. The base URL begins
with the machine name (which can be localhost for a
local machine) and port on which the Endeca
Manager is running (default port is 8888), plus the
/services/DataIndexing directory.

• The PreAuthenticate property is set to true to enable
pre-authentication, which means that the client must
be authenticated (and subsequently authorized) in
order to access the XML Web service. If the client
cannot be authenticated (for example, if the password
for the Web service is incorrect), no service methods
can be executed.
• The Credentials property sets the security credentials
for Web service client authentication. Because the
Endeca Manager uses a password-based
authentication scheme, the credentials are set with an
instantiation of the System.Net.NetworkCredential
class. The username and password in the credentials
must match those set in the tomcat-users.xml file for
the Data Indexing Web service user.

For details on these properties, see the Microsoft .NET


Framework documentation.

Starting the Baseline Update

The sample program starts a baseline update. Unlike the
partial update started by the Java sample program, a
baseline update does not require you to enqueue records.
Instead, the Endeca Manager uses the source data from the
incoming directory that you provisioned in Web Studio.

The sample program starts the update with the
DataIndexingService.startBaselineUpdate() method:

di.startBaselineUpdate();

After a baseline update begins, it can be stopped with the
DataIndexingService.stopBaselineUpdate() method.

Monitoring and System Error Methods

The following methods and properties allow you to monitor
the progress of the update and to detect system errors:
• DataIndexingService.getStatus() retrieves the system
status (as a Status object) from the Endeca Manager.
• Status.systemState returns a string that represents the
state of the system:
− “UPDATING” means a baseline or partial update is
in progress.
− “IDLE” means that the system is not performing an
update operation. The system, however, may be
performing other types of operations.
− “SYSTEM_ERROR” means the system is in an error
state caused by an update operation.
• Status.systemErrors returns an array of SystemError
objects.
• Four SystemError class properties, such as
SystemError.errorMsg, return the information in a
SystemError object. The information includes the
name of the Endeca component that reported the
error, the error message, the identifier of the record
that was in error, and the severity level of the error.

Note that SystemError objects may be returned even if the
system status is “IDLE” and not “SYSTEM_ERROR”. This
happens when an overall update error does not occur,
but some records failed. For example, a partial update
operation may successfully add 48 out of 50 records but
fail to add two records. In this case, the status will be
“IDLE” but two SystemError objects will be returned.

The sample program uses the following code to get the
system status and print it to the console:

Status s=di.getStatus();
System.Console.WriteLine(s.systemState);

You could add a while loop that checks the system state
and breaks when either “IDLE” or “SYSTEM_ERROR” is
returned. If one or more SystemError objects are returned
(which, as noted above, can happen even with an “IDLE”
status), read them from the Status.systemErrors property
and use the SystemError properties to get the details of
each error.

Catching Exceptions

In a .NET client application that is calling XML Web
service methods over SOAP, use the .NET Framework
System.Web.Services.Protocols.SoapException class to
catch and handle exceptions. When the client accesses a
method over SOAP, the exception is caught on the server
and wrapped inside a new SoapException.

The SoapException object will contain a Data Indexing
API exception if one was caught on the server. These
exceptions (DIException, DIInvalidParameter,
DIInvalidOperation, and DISystemException) are
described in Chapter 5, “Endeca Data Indexing API
Reference”.

The sample program uses a try block and a catch clause
for the exceptions:

try {
    di.startBaselineUpdate();
    ...
}
catch (System.Web.Services.Protocols.SoapException e)
{
    System.Console.WriteLine(e.Detail.InnerXml);
}

The SoapException.Detail property gets a .NET
Framework System.Xml.XmlNode object that represents the
application-specific error information detail. The
XmlNode.InnerXml property gets the markup representing
only the child nodes of this node.

By examining the XML fields, you can find out error
information such as the name of the exception and the
error message returned by the Endeca Manager. For
example, if you start a partial update before a baseline
update has been run, the XML looks similar to this:

<ns1:DISystemException xsi:type="ns1:DISystemException"
    xmlns:ns1="urn:com.endeca.service.dataindexing"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <message xsi:type="xsd:string">Cannot 'start update' in state 'Needs Baseline Update'.</message>
</ns1:DISystemException>
<ns2:exceptionName xmlns:ns2="http://xml.apache.org/axis/">com.endeca.service.dataindexing.DISystemException</ns2:exceptionName>
<ns3:hostname xmlns:ns3="http://xml.apache.org/axis/">doc-004</ns3:hostname>

In this example, a DISystemException has been thrown
with the error message:

Cannot 'start update' in state 'Needs Baseline Update'.

The host name of the machine on which the Endeca
Manager is running (doc-004) is also given.
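
A client can also pull these fields out of the detail markup
programmatically instead of reading the raw XML. The
following is a minimal sketch, written in Java (the language
of the reference chapter and Appendix A) and using only the
standard JAXP DOM parser; a .NET client would apply the same
idea to the XmlNode returned by SoapException.Detail. The
FaultDetail class and its parse() method are hypothetical
helpers, not part of the Data Indexing API, and the element
names are assumed to match the sample output above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the Data Indexing API.
class FaultDetail {
    // Namespace of the exceptionName and hostname elements in the sample.
    static final String AXIS_NS = "http://xml.apache.org/axis/";

    final String exceptionName;
    final String message;
    final String hostname;

    private FaultDetail(String exceptionName, String message, String hostname) {
        this.exceptionName = exceptionName;
        this.message = message;
        this.hostname = hostname;
    }

    // Parses the fault detail markup, which is a list of sibling elements.
    static FaultDetail parse(String innerXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Wrap the siblings in one root element so that the text
            // becomes a well-formed XML document.
            String wrapped = "<detail>" + innerXml + "</detail>";
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));
            return new FaultDetail(
                    firstText(doc.getElementsByTagNameNS(AXIS_NS, "exceptionName")),
                    firstText(doc.getElementsByTagName("message")),
                    firstText(doc.getElementsByTagNameNS(AXIS_NS, "hostname")));
        } catch (Exception e) {
            throw new IllegalArgumentException("Unparseable fault detail", e);
        }
    }

    // Returns the trimmed text of the first node, or null if there is none.
    private static String firstText(NodeList nodes) {
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }
}
```

A field is null when its element is missing, so callers should
check for null before comparing values.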

By using other SoapException class members, you can
display more details, including a stack trace. For example:

try {
    di.startBaselineUpdate();
    ...
}
catch (System.Web.Services.Protocols.SoapException e)
{
    Console.WriteLine("SoapException occurred at " + DateTime.Now);
    Console.WriteLine("Source: " + e.Source);
    Console.WriteLine("Message: " + e.Message);
    Console.WriteLine("Code: " + e.Code);
    Console.WriteLine("Actor: " + e.Actor);
    Console.WriteLine("Detail: " + e.Detail.InnerXml);
    Console.WriteLine("StackTrace:");
    Console.WriteLine(e.StackTrace);

    if(e.InnerException != null)
    {
        Console.WriteLine("InnerException:");
        Console.WriteLine(e.InnerException.ToString());
    }
}

See the Microsoft .NET Framework documentation for
details on these properties.

Chapter 5
Endeca Data Indexing API Reference

This chapter describes the Endeca Data Indexing API
objects and methods.

In general, the syntax descriptions in this chapter follow
Java conventions. However, the exact syntax of a class
member depends on the output of the WSDL tool that you
are using. For example, a Java WSDL tool will generate a
public interface named DataIndexing while the Microsoft
.NET tool produces the DataIndexingService public class.

Likewise, the Java tool outputs a Status.getSystemState()
method, while the Microsoft .NET tool will produce a
Status.systemState property. Therefore, be sure to check
the client stub classes that are generated by your WSDL tool
for the exact syntax of the Data Indexing API class
members.

DataIndexing Interface
The DataIndexing interface provides for adding data
content to the Endeca implementation, retrieving status,
and running partial and baseline updates on the data.

Methods

addContent(String handle, Record[] records)

Queues source records for a partial update operation.
The records will be written out in XML format in a file
specified by the handle parameter. When the update runs,
the source records will be transformed into Endeca
records by Forge.

Parameters:
• handle – The handle to the record adapter that will
handle the data, as specified in the partial update
pipeline. The value must map to the file path specified
by the URL in the adapter (relative to Forge’s incoming
directory).
• records – An array of one or more records (Record
objects) to be passed into the system.

Throws:
• DIInvalidParameter if the handle or records
parameter is null or otherwise invalid.
• DIInvalidOperation if the Endeca Manager is currently
performing an update.
• DISystemException if the output XML file could not be
written or if some other error occurred.
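
As an illustration of the call, the following sketch builds
Record objects from rows of name/value pairs and queues them
with a single addContent() call. The nested PVal, Record, and
DataIndexing types are minimal stand-ins so that the example
compiles on its own; in a real client they would be the stub
classes generated by your WSDL tool, and their exact members
may differ. The AddContentSketch class and its toRecord() and
queueRows() methods are hypothetical names.

```java
import java.util.List;
import java.util.Map;

class AddContentSketch {
    // Minimal stand-ins for the stub classes that a WSDL tool generates.
    static class PVal {
        private String name;
        private String value;
        void setName(String name) { this.name = name; }
        void setValue(String value) { this.value = value; }
        String getName() { return name; }
        String getValue() { return value; }
    }

    static class Record {
        private PVal[] values;
        void setValues(PVal[] values) { this.values = values; }
        PVal[] getValues() { return values; }
    }

    // Stand-in for the service interface. The exceptions listed under
    // "Throws" are left out here to keep the sketch short.
    interface DataIndexing {
        void addContent(String handle, Record[] records);
    }

    // Converts one row of name/value pairs into a Record.
    static Record toRecord(Map<String, String> row) {
        PVal[] pvals = new PVal[row.size()];
        int i = 0;
        for (Map.Entry<String, String> entry : row.entrySet()) {
            PVal pval = new PVal();
            pval.setName(entry.getKey());
            pval.setValue(entry.getValue());
            pvals[i++] = pval;
        }
        Record record = new Record();
        record.setValues(pvals);
        return record;
    }

    // Queues all rows with one addContent() call and returns the count.
    static int queueRows(DataIndexing server, String handle,
                         List<Map<String, String>> rows) {
        if (rows.isEmpty()) {
            return 0; // addContent() expects one or more records
        }
        Record[] records = new Record[rows.size()];
        for (int i = 0; i < records.length; i++) {
            records[i] = toRecord(rows.get(i));
        }
        server.addContent(handle, records);
        return records.length;
    }
}
```

Passing an insertion-ordered map (such as a LinkedHashMap)
keeps the PVal objects in the same order as the source fields.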

clearContent(String[] handles)

Clears records that have been enqueued by one or more
addContent() calls. The record files are deleted
locally (in the Endeca Manager's update directory) as well
as in the Forge directory.

Parameters:
• handles – The handles whose data needs to be cleared.
Developers typically call this method after a successful
update and before they start adding more content.

Throws:
• DIInvalidParameter if the handles parameter is null or
otherwise invalid.
• DIInvalidOperation if the Endeca Manager is currently
performing an update.
• DISystemException if some of the files could not be
deleted or if some other error occurred.

getStatus()

Gets the system status from the Endeca Manager. Includes
status of the update operation and any error messages
relating to data updated in the system.

Returns:
Status—A Status object, indicating the state of the
system (whether it is in the middle of an update) and a
collection of errors reported by the components on
individual records in the system.

Throws:
• DISystemException if an error occurred in trying to
obtain the status from the Endeca Manager.

startBaselineUpdate()

Starts a baseline update. The call returns as soon as the
update begins.

Throws:
• DIInvalidOperation if the Endeca Manager is currently
performing an update.
• DISystemException if the baseline update cannot be
started or if some other error occurred.

startPartialUpdate()

Starts a partial update. The call returns as soon as the
update begins. Note that there is no method to stop a
partial update, because partial updates are typically
smaller and faster than baseline updates and because the
Navigation Engine cannot be asked to terminate a partial
update.

Throws:
• DIInvalidOperation if the Endeca Manager is currently
performing an update.
• DISystemException if the partial update cannot be
started or if some other error occurred.

stopBaselineUpdate()

Stops a running baseline update. The call returns as soon
as the update stops.

Throws:
• DIInvalidOperation if the Endeca Manager is not
currently performing an update.
• DISystemException if the baseline update cannot be
stopped or if some other error occurred.

PVal Class
A key/value pair, a collection of which constitutes a
Record object. The key is the name of an Endeca property
or dimension, and the value is the String value of that
key.

Constructors

PVal(String name, String value)

Constructs a PVal object from a name and a value.

Parameters:
• name – The name of the property or dimension.
• value – The value of the property or dimension.

PVal()

Constructs a PVal object with no data. Use the setName()
and setValue() methods to add a name and value.

Methods

getName()

Gets the name of this PVal object.

Returns:
String—The name of this PVal object.

getValue()

Gets the value of this PVal object.

Returns:
String—The value of this PVal object.

setName(String name)

Sets the name of this PVal object.

Parameters:
• name – The name assigned to this PVal object.

setValue(String value)

Sets the value of this PVal object.

Parameters:
• value – The value assigned to this PVal object.

Record Class
A source record, which is a collection of data in the form
of PVal objects (key/value pairs). The record is added to
the Endeca system with the DataIndexing.addContent()
method.

Constructors

Record(PVal[] values)

Constructs a new Record object from an array of PVal
objects.

Parameters:
• values – A collection of PVal objects.

Record()

Constructs a new Record object containing no data.

Methods

getValues()

Gets the entire collection of PVal objects from this record.

Returns:
PVal[]—The array of PVal objects.

setValues(PVal[] pval)

Adds a collection of PVal objects to this record.

Parameters:
• pval – The collection of PVal objects to be added to
the record.

Status Class
An object that represents the state of the system,
including failed records and associated error messages.

Methods

getSystemErrors()

Gets all the error messages.

Returns:
SystemError[]—The array of SystemError objects.

getSystemState()

Gets the state of the Endeca system, which is represented
by one of the following string messages:
• IDLE – The system is not performing an update
operation and none of the components are in error.
• UPDATING – The system is performing a baseline or
partial update.
• SYSTEM_ERROR – The system is in an error state caused
by an update operation. This state typically means that
the system was not configured or provisioned
correctly and the update failed completely (that is,
none of the update records were propagated through
the system). This is a severe state, and usually requires
manual intervention to fix the problem (for example,
using Web Studio’s Configuration page or Provisioning
System page).

Note that an update can finish successfully even though
some records may not have been added to the system. In
this case, the system status upon completion of the
update will be IDLE (not SYSTEM_ERROR). However, there
will be a SystemError object for each failed record.

Returns:
String—The status of the system, as indicated by one of
the above messages.
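
A client typically polls getSystemState() until the state is
no longer UPDATING. The waitForUpdate() method in
Appendix A does this without an upper bound; the sketch
below adds a maximum number of polls so that a stalled update
cannot block the client forever. The nested Status and
DataIndexing interfaces are stand-ins for the generated stub
classes, and UpdatePoller, pollMillis, and maxPolls are
hypothetical names chosen for this example.

```java
class UpdatePoller {
    // Stand-ins for the generated stub classes.
    interface Status {
        String getSystemState();
    }

    interface DataIndexing {
        Status getStatus();
    }

    // Polls until the state is no longer UPDATING and returns the final
    // state (IDLE or SYSTEM_ERROR). Throws IllegalStateException if the
    // system is still UPDATING after maxPolls checks.
    static String waitForUpdate(DataIndexing server, long pollMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            String state = server.getStatus().getSystemState();
            if (!"UPDATING".equals(state)) {
                return state;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for update", e);
            }
        }
        throw new IllegalStateException("Still UPDATING after " + maxPolls + " polls");
    }
}
```

A returned state of IDLE does not by itself mean that every
record was added, so check getSystemErrors() after the loop.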

SystemError Class
An object that represents a system error. Typically, the
error involves a record that failed to be added to the
system. However, the system error may be caused by a
condition that was not related to a failed Record object.

SystemError objects may be returned even if the system
status is IDLE and not SYSTEM_ERROR. This happens
when an overall error does not occur, but some records
failed. For example, if 98 out of 100 records succeeded in
being added, but two failed (because those records
already existed), Status.getSystemState() would
return IDLE but Status.getSystemErrors() would
return two SystemError objects.
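
The following sketch shows one way a client might separate
these two cases: an update that finished with some failed
records, and errors that are not tied to any record. The
nested Status and SystemError classes are simplified stand-ins
for the generated stub classes (the real ones have more
members), and UpdateOutcome, finishedWithErrors(), and
failedRecordSpecs() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class UpdateOutcome {
    // Simplified stand-ins for the generated stub classes.
    static class SystemError {
        private final String recordSpec;
        private final String errorMsg;

        SystemError(String recordSpec, String errorMsg) {
            this.recordSpec = recordSpec;
            this.errorMsg = errorMsg;
        }

        String getRecordSpec() { return recordSpec; }
        String getErrorMsg() { return errorMsg; }
    }

    static class Status {
        private final String systemState;
        private final SystemError[] systemErrors;

        Status(String systemState, SystemError[] systemErrors) {
            this.systemState = systemState;
            this.systemErrors = systemErrors;
        }

        String getSystemState() { return systemState; }
        SystemError[] getSystemErrors() { return systemErrors; }
    }

    // True when the state is IDLE but at least one error was reported.
    static boolean finishedWithErrors(Status status) {
        SystemError[] errors = status.getSystemErrors();
        return "IDLE".equals(status.getSystemState())
                && errors != null && errors.length > 0;
    }

    // Collects the record specifiers of errors tied to a particular
    // record; errors with a null record specifier are skipped.
    static List<String> failedRecordSpecs(Status status) {
        List<String> specs = new ArrayList<String>();
        SystemError[] errors = status.getSystemErrors();
        if (errors == null) {
            return specs;
        }
        for (SystemError error : errors) {
            if (error.getRecordSpec() != null) {
                specs.add(error.getRecordSpec());
            }
        }
        return specs;
    }
}
```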

Methods

getComponent()

Gets the name of the Endeca component that reported
the error.

Returns:
String—The name of the Endeca component, which is
one of the following:
• MANAGER (for the Endeca Manager)
• FORGE
• DGIDX
• DGRAPH (for the Endeca Navigation Engine)

getErrorMsg()

Gets the error message.

Returns:
String—The error message, as returned from the Endeca
Manager. If the error is due to a failed Record object, the
message may include the reason why the record could
not be added or updated.

getRecordSpec()

Gets the record specifier of the record that was not
added.

Returns:
String—The record specifier of the Record object that
was not added. Returns null if the SystemError is not
related to a particular record.

getSeverity()

Gets the severity level of the error.

Returns:
String—The severity level, which is one of the following:
• ERROR
• WARNING
• FATAL

DIException Class
DIException is the base class for all exceptions thrown by
the Data Indexing API classes. The DIInvalidOperation,
DIInvalidParameter, and DISystemException classes all
inherit from this class.

Methods

getMessage()

Gets the error message of the exception.

Returns:
String—The error message.

DIInvalidOperation Class
The DIInvalidOperation exception is thrown for
operations that are not valid for the current state of the
system. For example, calling the addContent()
method while a partial update is in progress throws
this exception. Typically, users can programmatically
recover from this exception (by waiting for the update to
finish, for example).
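
The recovery described above can be sketched as a small
retry helper. This is an illustration only: the nested
DIInvalidOperation class is a stand-in for the generated
exception class, and RetrySketch, Call, and callWithRetry()
are hypothetical names. A .NET client would catch
SoapException and inspect its detail instead.

```java
class RetrySketch {
    // Stand-in for the generated exception class.
    static class DIInvalidOperation extends Exception {
        DIInvalidOperation(String message) {
            super(message);
        }
    }

    // Any Data Indexing call that may be rejected while an update runs.
    interface Call {
        void run() throws DIInvalidOperation;
    }

    // Runs the call, waiting and retrying when DIInvalidOperation is
    // thrown. Returns the number of attempts used; throws
    // IllegalStateException once maxAttempts have all failed.
    static int callWithRetry(Call call, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                call.run();
                return attempt;
            } catch (DIInvalidOperation e) {
                if (attempt >= maxAttempts) {
                    throw new IllegalStateException(
                            "Gave up after " + attempt + " attempts: " + e.getMessage(), e);
                }
            }
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException ie) {
                // Restore the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", ie);
            }
        }
    }
}
```

Retrying only makes sense for calls that are rejected because
an update is in progress, such as addContent(). A
DIInvalidOperation from stopBaselineUpdate() means that no
update is running, so waiting will not help.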

Methods

getMessage()

Gets the error message of the exception.

Returns:
String—The error message.

DIInvalidParameter Class
The DIInvalidParameter exception is thrown for
parameters that were not valid for a method.

Methods

getMessage()

Gets the error message of the exception.

Returns:
String—The error message.

DISystemException Class
DISystemException indicates a much more serious
problem than DIInvalidOperation. Typically, users
cannot programmatically recover from this exception, but
instead must use manual intervention (such as
reprovisioning the system).

Methods

getMessage()

Gets the error message of the exception.

Returns:
String—The error message.

Appendix A
Sample Java Client Code

This appendix contains the code for the Java client
discussed in Chapter 3, “Writing Java Client Programs”.

Client.java Example
The Client.java sample code is a basic example of a partial
update application. It extracts source records from a text file
and constructs Data Indexing API Record objects. An array
of these records is added to the system with the
DataIndexing.addContent() method.

The records are used in a partial update that is performed
by the Endeca Manager. The partial update is started with
the DataIndexing.startPartialUpdate() method.

package com.endeca.service.dataindexing;

import javax.xml.rpc.Stub;
import java.io.*;
import endeca.*;

// client for the Data Indexing API
public class Client
{
    // output file handles
    private static final String c_strAddHandler = "partial_data_add_data";
    private static final String c_strDelHandler = "partial_data_del_data";
    private static final String c_strModHandler = "partial_data_mod_data";

    // this is the directory from which we parse data to send to the API
    private static final String c_strPartialUpdateDataDir =
        "C:\\Projects\\partial_updates_data\\";
    // this is the file that contains data we want to add
    private static final String c_strAddDataInput =
        c_strPartialUpdateDataDir+"mexico.txt";
    // this is the file that contains data we want to delete
    private static final String c_strDelDataInput =
        c_strPartialUpdateDataDir+"deletes.txt";
    // this is the file that contains data we want to modify
    private static final String c_strModDataInput =
        c_strPartialUpdateDataDir+"updates.txt";

    public static void main(String [] args)
    {
        try {
            DataIndexing server = initService();

            doClearContent(server);

            doAddContent(server);

            Status status = server.getStatus();
            String s = status.getSystemState();
            doUpdate(server, "partial");
            waitForUpdate(server);
            doClearContent(server);
        }
        catch (DIInvalidOperation de) {
            System.err.println("DIInvalidOperation caught.");
            System.err.println(de.getMessage());
        }
        catch (DIInvalidParameter de) {
            System.err.println("DIInvalidParameter caught.");
            System.err.println(de.getMessage());
        }
        catch (DISystemException de) {
            System.err.println("DISystemException caught.");
            System.err.println(de.getMessage());
        }
        catch (Exception e) {
            System.err.println("Exception caught.");
            System.err.println(e.getMessage());
        }
    }

    private static DataIndexing initService() throws Exception
    {
        DataIndexingService_Impl locator = new DataIndexingService_Impl();
        DataIndexing server = locator.getDataIndexing();

        // set the address of our service
        ((Stub)server)._setProperty(Stub.ENDPOINT_ADDRESS_PROPERTY,
            "http://localhost:8888/services/DataIndexing");
        // set the user name and password for our service
        // uses tomcat container authentication
        ((Stub)server)._setProperty(Stub.USERNAME_PROPERTY, "webservices");
        ((Stub)server)._setProperty(Stub.PASSWORD_PROPERTY, "K07YZ17MP1945Q");

        return server;
    }

    // add content to the system
    private static void doAddContent(DataIndexing server) throws Exception
    {
        // Data to be added.
        addContentHelper(c_strAddDataInput, c_strAddHandler, "Adding record ",
            server);
        // Data to be deleted
        //addContentHelper(c_strDelDataInput, c_strDelHandler, "Deleting record ",
        //    server);
        // Data to be modified
        //addContentHelper(c_strModDataInput, c_strModHandler,
        //    "Modifying record ", server);
    }

    // remove content from the system
    private static void doClearContent(DataIndexing server) throws Exception
    {
        System.err.println("Clearing contents");
        server.clearContent(new String[] {c_strAddHandler, c_strDelHandler,
            c_strModHandler});
    }

    private static void doUpdate(DataIndexing server, String strUpdate)
        throws Exception
    {
        System.err.println("Starting "+strUpdate+" update");
        if (strUpdate.equals("baseline"))
            server.startBaselineUpdate();
        else
            server.startPartialUpdate();
    }

    private static void stopUpdate(DataIndexing server) throws Exception
    {
        System.err.println("Stopping baseline update");
        server.stopBaselineUpdate();

        // make sure we're not updating
        System.err.println("Checking status...");
        Status status = server.getStatus();
        if ("UPDATING".equals(status.getSystemState()))
            throw new Exception(
                "Stop update unsuccessful. We're still in UPDATING state");
        System.err.println("Successfully stopped update.");
    }

    private static void waitForUpdate(DataIndexing server) throws Exception
    {
        Status status;
        while (true) {
            System.err.println("Checking status...");
            status = server.getStatus();
            if (!"UPDATING".equals(status.getSystemState()))
                break;
            System.err.println("Status is "+status.getSystemState());
            Thread.sleep(3000);
        }
        System.err.println("Update status: " + status.getSystemState());
        SystemError[] errors = status.getSystemErrors();
        for (int i=0; i<errors.length; i++) {
            System.err.println("Error msg: "+errors[i].getErrorMsg()
                +", component: "+errors[i].getComponent()
                +", record spec: "+errors[i].getRecordSpec()
                +", severity: "+errors[i].getSeverity());
        }
        System.err.println("Completed update");
    }

    private static void addContentHelper(String strInputfile, String strHandler,
        String strMsg, DataIndexing server) throws Exception
    {
        // read data from file and create records to be added
        BufferedReader in = new BufferedReader(new FileReader(strInputfile));
        String line;
        int iCtr = 0;
        String[] strKeys=null;

        while ((line = in.readLine()) != null) {
            // first line is the keys - separator is pipe character
            String[] lines=line.split("\\|#");
            for (int i=0; i<lines.length; ++i)
            {
                line=lines[i];
                if (0 == iCtr) {
                    strKeys = line.split("\\|");
                    System.err.println("strKeys "+strKeys.length);
                }
                else {
                    // each subsequent line represents a record
                    String[] strVals = line.split("\\|");
                    System.err.println("strVals "+strVals.length);
                    if (strVals.length != strKeys.length)
                        throw new Exception("Invalid input: number of vals "
                            +"not equal to number of keys.");
                    PVal[] keyvals = new PVal[strKeys.length];
                    for (int j=0; j<strKeys.length; j++) {
                        keyvals[j] = new PVal();
                        keyvals[j].setName(strKeys[j]);
                        keyvals[j].setValue(strVals[j]);
                    }
                    System.err.println(strMsg+iCtr);
                    Record r = new Record();
                    r.setValues(keyvals);
                    Record[] recs = {r};
                    server.addContent(strHandler, recs);
                }
                iCtr++;
            }
        }
        // release the input file once all records have been queued
        in.close();
    }
}

Client2.java Example
The Client2.java sample code is identical to Client.java
except that the addContentHelper() private method uses
the DataIndexing.addContent() method only once. That
method is listed below.

    private static void addContentHelper(String strInputfile, String strHandler,
        String strMsg, DataIndexing server) throws Exception
    {
        // read data from file and create records to be added
        BufferedReader in = new BufferedReader(new FileReader(strInputfile));
        String line;
        String[] strKeys=null;
        while ((line = in.readLine()) != null) {
            // first "record" is the list of keys
            // record delimiter = "|#"
            // key delimiter = "|"
            String[] lines=line.split("\\|#");
            // since first "record" is the list of keys
            // adjust the size of the record array accordingly
            Record[] recs = new Record[lines.length - 1];
            for (int i=0; i<lines.length; ++i)
            {
                line=lines[i];
                if (0 == i) {
                    strKeys = line.split("\\|");
                }
                else {
                    // each subsequent line represents a record
                    String[] strVals = line.split("\\|");
                    if (strVals.length != strKeys.length)
                        throw new Exception("Invalid input: number of vals "
                            +"not equal to number of keys.");
                    PVal[] keyvals = new PVal[strKeys.length];
                    for (int j=0; j<strKeys.length; j++) {
                        keyvals[j] = new PVal();
                        keyvals[j].setName(strKeys[j]);
                        keyvals[j].setValue(strVals[j]);
                    }
                    System.err.println(strMsg+i);
                    Record r = new Record();
                    r.setValues(keyvals);
                    // remember that the index i is 1 ahead of the index
                    // into the record array
                    recs[i-1]=r;
                }
            }
            server.addContent(strHandler, recs);
        }
        // release the input file once all records have been queued
        in.close();
    }
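
The parsing step can also be separated from the Web service
call, which makes it easier to test on its own. The sketch
below assumes the input format described in the comments
above (records delimited by "|#", fields delimited by "|",
and the first record holding the key names). SourceLineParser
and parseLine() are hypothetical names and are not part of
the sample clients.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SourceLineParser {
    // Parses one input line into a list of key/value maps, one per record.
    static List<Map<String, String>> parseLine(String line) {
        List<Map<String, String>> records = new ArrayList<Map<String, String>>();
        String[] chunks = line.split("\\|#");
        if (chunks.length == 0) {
            return records; // nothing but delimiters on this line
        }
        // The first chunk is the list of keys.
        String[] keys = chunks[0].split("\\|");
        for (int i = 1; i < chunks.length; i++) {
            // A limit of -1 keeps trailing empty values, so a record
            // whose last field is empty still has the right length.
            String[] vals = chunks[i].split("\\|", -1);
            if (vals.length != keys.length) {
                throw new IllegalArgumentException(
                        "Record " + i + ": number of vals not equal to number of keys.");
            }
            Map<String, String> record = new LinkedHashMap<String, String>();
            for (int j = 0; j < keys.length; j++) {
                record.put(keys[j], vals[j]);
            }
            records.add(record);
        }
        return records;
    }
}
```

Unlike the sample clients, this sketch calls split() with a
limit of -1, so a record that ends in an empty value is
accepted instead of failing the length check.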

