Kofax Integration

1 Introduction

Ascent Capture is a solution for scanning documents, extracting information from them, and releasing it to external modules for storage and further processing. It is developed by Kofax, a California-based company that is part of DICOM Group, a global leader in Electronic Document Exchange.

eXo Platform is a cutting-edge product line of portal-oriented solutions. One of them is eXo JCR ("Java Content Repository"), which can be regarded as a standards-compliant database for storing structured information such as enterprise documents. Another is eXo ECM ("Enterprise Content Management"), which is built on top of eXo JCR and handles the complete lifecycle of documents, including capture, storage, management, publishing and backup.

Combining Ascent Capture and eXo Platform yields a comprehensive chained solution that leverages the best of both components: extraction from Ascent Capture, storage and management from eXo Platform.

2 Concepts

In Ascent Capture, images and data are exported through a "Release Script", a Windows COM component that implements specific interfaces. eXo Platform, on the other hand, is developed in Java. A means of cross-platform invocation was therefore needed, and WebDAV ("Web-based Distributed Authoring and Versioning", sometimes abbreviated as "DAV") was selected.

WebDAV is an extension to the HTTP protocol that allows users to collaboratively edit and manage files on remote web servers. It is an open protocol standardized by a working group. Many clients support WebDAV, such as Windows, Linux, Mac OS X, MS Office and Adobe Acrobat. It is also the preferred protocol for remote access to the JCR in eXo Platform. This choice has two benefits. First, users can access documents released by Ascent Capture from any WebDAV client available on their machine; a typical example is Windows Explorer. Second, it should be possible to use the eXo Release Script to access any WebDAV repository.

kfx_bigpicture.jpg Big picture

The eXo Release Script releases a document in three steps:

  • First, a node is created in the JCR, at a location specified by the configuration. It represents the structured document. The name of this node is obtained from the document data. Its default type is kfx:document, a conventional node type in eXo Platform that identifies all documents released by Ascent Capture. Custom node types can also be defined to specify the document structure precisely and ensure the consistency of the repository.

  • Then, child files are added to this node. They consist of Ascent Capture artifacts, which can be PDF, TIFF or text files.

  • Finally, properties are set on the main node. They represent the extracted information, more precisely the values of the Ascent index and batch fields. The mapping between Ascent data fields and JCR properties is configured when setting up the script. Two sets of destination JCR properties are available:

    • The structural properties hold content data contained in the scanned document. Typical examples are "name", "address" or "customer identifier". Those properties are often prefixed with the "kfx:" namespace, which gives kfx:name, kfx:address or kfx:customer_identifier.

    • The metadata properties hold content that describes the document. Examples are "title", "date" or "description". Those properties are often part of the "Dublin Core" set (http://dublincore.org/), which is grouped into the "dc:" namespace, for example dc:title, dc:date or dc:description. eXo Platform allows you to define any additional set of metadata that best fits your business requirements.
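The three steps above can be sketched as a sequence of WebDAV requests. This is only an illustration, not the actual Release Script code: MKCOL is the method named in the error message quoted in the troubleshooting section, while the use of PUT for child files and PROPPATCH for properties is an assumption based on standard WebDAV usage. The "kfx" namespace URI, the function names and the sample URL are hypothetical.

```python
from xml.etree import ElementTree as ET

DAV_NS = "DAV:"
# The "dc" URI is the Dublin Core element set namespace.
# The "kfx" URI is a hypothetical placeholder; the real one depends on the repository.
NAMESPACES = {
    "kfx": "http://example.org/kfx/1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def proppatch_body(properties):
    """Build a PROPPATCH request body that sets prefixed properties such as kfx:name."""
    update = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_element = ET.SubElement(update, f"{{{DAV_NS}}}set")
    prop = ET.SubElement(set_element, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        prefix, local_name = name.split(":", 1)
        ET.SubElement(prop, f"{{{NAMESPACES[prefix]}}}{local_name}").text = value
    return ET.tostring(update, encoding="unicode")


def plan_release(base_url, node_name, files, properties):
    """Return the (method, url, body) requests for one released document."""
    node_url = f"{base_url.rstrip('/')}/{node_name}"
    # Step 1: create the node that represents the structured document.
    requests = [("MKCOL", node_url, None)]
    # Step 2: add the Ascent Capture artifacts as child files (assumed PUT).
    for file_name, content in files.items():
        requests.append(("PUT", f"{node_url}/{file_name}", content))
    # Step 3: set the extracted values as properties on the main node (assumed PROPPATCH).
    requests.append(("PROPPATCH", node_url, proppatch_body(properties)))
    return requests


if __name__ == "__main__":
    plan = plan_release(
        "http://localhost:8080/webdav/draft",  # hypothetical WebDAV URI
        "invoice-001",
        {"invoice.pdf": b"%PDF"},
        {"kfx:name": "ACME", "dc:title": "Invoice"},
    )
    for method, url, _ in plan:
        print(method, url)
```

The sketch only plans the requests; sending them would additionally require an HTTP client and the credentials configured for the repository.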

Once the documents are stored in the JCR, custom actions can be launched to modify or check them, or to start a Content Validation workflow. The documents can also be viewed or edited through templates. A generic template is provided. It lists each "kfx:"-prefixed property and allows the binary files to be downloaded. This template can be customized to faithfully reproduce the look of the original document.

kfx_template.jpg Generic template showing a sample document

3 Installation

The Release Script has been tested with Ascent Capture 7.0 Service Pack 3.

  • Install the Microsoft .NET Framework 2.0 Redistributable Package, available at http://www.microsoft.com/downloads/details.aspx?FamilyID=0856eacb-4362-4b0d-8edd-aab15c5e04f5&displaylang=en.

  • Download the eXo Release Script from the project forge at http://forge.objectweb.org/project/showfiles.php?group_id=151. Unzip the downloaded package into any directory.

  • Launch the Ascent Capture Administration module, select the "Release Script Manager" menu and click the "Add" button. Select the file called "eXo.inf" in the directory containing the extracted Release Script files.

If the installation is successful, the eXo Release Script appears in the list of available Release Scripts.

4 Configuration

The eXo Release Script setup dialog is divided into three tabs.

The Connection tab holds the information needed to access the JCR, such as the WebDAV URI, user name and password. In some eXo configurations, you might have to append the name of the portal to the end of the password, as in "exo@ecm".

kfx_setupconnection.jpg Connection tab
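As an illustration of how the connection settings could translate into an HTTP request header, here is a minimal sketch. It assumes that the WebDAV server uses HTTP Basic authentication, which the text does not state, and the function name and sample values are hypothetical. The optional portal suffix follows the "exo@ecm" convention described above.

```python
import base64


def basic_auth_header(user, password, portal=None):
    """Build an HTTP Basic Authorization header (assumed authentication scheme)."""
    if portal:
        # Some eXo configurations expect the portal name appended to the password.
        password = f"{password}@{portal}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(basic_auth_header("exo", "exo", portal="ecm"))
```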

The Destination tab specifies the JCR path where the Release Script stores the documents. By convention, the first element of the path is the name of the JCR Workspace, for instance /draft. It is also possible to specify the Node Type that will be used. Finally, the name of the nodes should be linked to an Ascent Capture data field.

kfx_setupdestination.jpg Destination tab
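The following sketch shows one plausible way the destination settings combine into the address of a released document: the WebDAV URI, then the JCR path (workspace first), then the node name taken from a data field. The function name, the sample URI and the percent-encoding of path segments are assumptions made for illustration; the actual Release Script may build the address differently.

```python
from urllib.parse import quote


def node_url(webdav_uri, jcr_path, node_name):
    """Join the WebDAV URI, the JCR destination path and the node name."""
    segments = [segment for segment in jcr_path.split("/") if segment]
    segments.append(node_name)
    # Percent-encode each segment so that spaces and special characters are safe in a URL.
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return f"{webdav_uri.rstrip('/')}/{encoded}"


if __name__ == "__main__":
    # Hypothetical WebDAV URI; "/draft" is the workspace, as in the text above.
    print(node_url("http://localhost:8080/webdav", "/draft/invoices", "Invoice 42"))
```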

The Mapping tab links Ascent Capture data fields to JCR properties. To add a mapping, first select a data field in the combo box. Index field and batch field items are prefixed with the index_fields and batch_fields keywords respectively. Then enter a JCR property name and click the "Add" button.

kfx_setupmapping.jpg Mapping tab
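Conceptually, the mapping is a lookup table from prefixed data fields to JCR property names. The sketch below shows how such a table could be applied to the values of one document. The dotted key format ("index_fields.Name"), the field names and the function name are hypothetical; only the index_fields and batch_fields keywords come from the setup dialog.

```python
def apply_mapping(mapping, index_fields, batch_fields):
    """Translate Ascent Capture field values into JCR properties using the mapping."""
    sources = {"index_fields": index_fields, "batch_fields": batch_fields}
    properties = {}
    for field_key, jcr_property in mapping.items():
        # Split "index_fields.Name" into the keyword and the field name (assumed format).
        group, _, field_name = field_key.partition(".")
        if group not in sources:
            raise ValueError(f"unknown field group: {group}")
        properties[jcr_property] = sources[group][field_name]
    return properties


if __name__ == "__main__":
    mapping = {
        "index_fields.Name": "kfx:name",
        "index_fields.Address": "kfx:address",
        "batch_fields.ScanDate": "dc:date",
    }
    print(apply_mapping(
        mapping,
        index_fields={"Name": "ACME", "Address": "1 Main Street"},
        batch_fields={"ScanDate": "2007-05-06"},
    ))
```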

5 Online demo

kfx_demo.jpg Online demonstration

6 Troubleshooting and support

The Release Script might fail with the following error message: "MKCOL returned a wrong status : 403 FORBIDDEN". This happens when a document with the same name already exists in the JCR. For safety reasons, the existing document cannot be overwritten. It should be renamed or removed beforehand.
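When diagnosing such failures, it can help to translate the MKCOL status code into a likely cause. The helper below is a hypothetical sketch, not part of the Release Script. The 403 case reflects the behaviour described above, and 201, 401 and 409 follow the usual HTTP and WebDAV meanings; a given server may report these conditions differently.

```python
def explain_mkcol_status(status):
    """Return (ok, message) describing the outcome of an MKCOL request."""
    if status == 201:
        return True, "node created"
    if status == 403:
        # Case described above: the server refuses to replace an existing document.
        return False, "a document with the same name probably exists; rename or remove it first"
    if status == 401:
        return False, "authentication failed; check the Connection tab settings"
    if status == 409:
        # In WebDAV, 409 means an intermediate collection is missing.
        return False, "the parent path does not exist; check the Destination tab path"
    return False, f"unexpected status {status}; see the Ascent Capture logs"


if __name__ == "__main__":
    for code in (201, 403, 409):
        print(code, explain_mkcol_status(code))
```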

More generally, the eXo Release Script appends information to the Ascent Capture logs, which can be helpful when diagnosing problems. For any question or feedback, please post a message to the mailing list (mailto:exoplatform@objectweb.org).


Creator: Gennady Azarenkov on 06/05/2007
Copyright (c) 2000-2009. All rights reserved - eXo platform SAS