FOCUS REPORT - DOCUMENT CAPTURE SOLUTIONS
Papyrus Capture. Scan. Recognize. Classify.
Document Capture Solutions
Papyrus Capture
Understanding your documents - an overview.
Energy utilities
Data capture of forms for customer self-reading of electricity supply meters.
Winterthur, Switzerland
Cost savings through fully automated data extraction from complex doctors' statements.
Telekom Austria
Centralization of incoming mail: classification and indexing of paper and fax for automated distribution.
Bank Austria Creditanstalt, Austria
Data capture at the highest level - processing of hundreds of thousands of checks and bank transaction forms.
Keba, Austria
Thousandfold OEM application for self-service banking and lottery terminals.
Papyrus Capture
Understanding your Documents
To make valuable, business-critical information on incoming business documents (order forms, application forms, invoices, money transfer forms, questionnaires, emails etc.) accessible to a company, conversion to electronic data is mandatory. Cost and time involved in manual data entry constitute the major part of document processing overall costs.
Successful business automation can only take place once the content (data) is electronically stored in and available to the IT system as coded and uncoded information.
Intelligent document capture with Papyrus Capture provides a wide range of capabilities that automate, speed up and streamline information capture and integrate accurate, validated data into any line-of-business application. Papyrus Capture is a flexible platform for efficient, high-end handling of all capture-related processes, including solutions requiring document classification and extraction of unstructured data, e.g. from invoices.
The individual steps of data capture include scanning or fax import, classification of business documents, recognition of data, validation and data export. These have been fully automated within Papyrus Capture, which employs advanced AI technologies such as machine learning and neural networks, together with sophisticated image processing, character recognition and content recognition technologies. The user and administrator functions can also be used on zero-install clients within the intranet or Internet via standard web browsers.
The case studies in this Focus Report present a selection of the varied and highly demanding business problems solved with Papyrus Capture.
Advantages
The efficient capture of large quantities of printed documents greatly reduces the costs of labor- and time-intensive manual data entry. Papyrus Capture allows for a typical system payback period of 6 to 12 months, based on the following advantages:
• Speed - substantial acceleration of the document transformation process; data becomes available faster
• Cost - drastic reduction in the cost of data capture and data entry
• Quality - improved data quality through automatic plausibility checks (fuzzy context validation)
• Versatility - access to data from archives and workflow systems
• Scalability - Papyrus Capture can easily be customized and grows with the requirements, from a standalone desktop system to high-performance forms processing solutions for several hundred thousand documents per day
• Additional benefits - a higher percentage of a company's stored data is available to the ongoing processes
Process
Technical prerequisites
Papyrus Capture products support current scanner models, run on industry-standard PC and server platforms and do not require any particular hardware.
Modules & Functionalities
• Papyrus Capture
• Papyrus Scan
• Papyrus Classify
• Papyrus FreeForm®
Applications
• Mailroom Processing
• Campaign Management
• Forms processing
• Invoice Data Capture
• Payment Transactions
Papyrus Capture for Energy Utilities
Customer self-reading of supply meter for electricity
The Challenge
Having utility company personnel read supply meters for electricity and other types of energy is both labor-intensive and costly. In rural areas the meters are often remotely situated, and in the city gaining access is often a problem.
The Solution: self-reading by the customers
Increasingly, utility companies are replacing meter reading by their own personnel with self-read returns reported directly by their customers, including via the Internet. The customer receives the meter reading card and fills in the figures shown on the meter. Whenever ownership changes, and in occasional random samples, a utility company representative reads the meter to ensure that the reported readings are reasonably consistent. The savings from self-reading and direct electronic reporting allow an automated capture solution to pay for itself within only a few months.
Scanning
The reading cards are scanned in the mailroom using a high-volume scanner.
Import and Recognition
The image files are automatically imported into the document capture system and the content of each pre-defined field is read. Read rates of greater than 97% are achieved on the handwritten numeric characters. The remaining questionable fields are marked for inquiry. To ensure that no false data gets into the database, numerous plausibility checks are integrated. For example, if the meter reading deviates beyond the estimated range of variance, the document is forwarded to the correction and verification workstation.
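The plausibility check described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual Papyrus Capture implementation: the `MeterReading` fields, the `route_reading` function, the 50% tolerance and the 0.9 confidence threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class MeterReading:
    customer_id: str
    previous_reading: int   # last validated meter value
    expected_usage: int     # estimated consumption since that value
    recognized_value: int   # value returned by handwriting recognition
    confidence: float       # recognition confidence, 0.0 to 1.0


def route_reading(reading, tolerance=0.5, min_confidence=0.9):
    """Return 'export' for plausible readings, otherwise 'verify'."""
    # Questionable character reads always go to an operator.
    if reading.confidence < min_confidence:
        return "verify"
    # Compare the implied consumption with the estimated range of variance.
    usage = reading.recognized_value - reading.previous_reading
    low = reading.expected_usage * (1 - tolerance)
    high = reading.expected_usage * (1 + tolerance)
    if usage < low or usage > high:
        return "verify"
    return "export"
```

With these assumed thresholds, a confidently read value that implies roughly the expected consumption is exported directly, while an implausible jump or an uncertain read is routed to the correction and verification workstation.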
Verification & Correction
Tariff group staff within the “Correction and Verification” section need only deal with rejected character reads or those failing the plausibility checks. The validation data and contextual information used for plausibility checks are held in SQL databases (e.g. Oracle, SQL Server).
Documents that cannot be corrected or verified immediately (e.g. because the customer needs to be contacted) are suspended and forwarded to an exception item workstation or to a manual process for ongoing processing.
Data export
Once the documents are validated, they are automatically forwarded for export and the export files are transferred to the host applications. The export files are prepared either for direct entry into a database or as “flat files” for pickup by the host application. At the same time, text and image files are prepared for archiving and other defined tasks.
Additional applications at energy suppliers
Deregulation of the electricity market and other utilities in Austria has caused energy suppliers to place more emphasis on marketing activities to retain existing customers and to acquire new ones. Knowing the capabilities of Papyrus Capture, many of these utility supply companies are extending their applications to include information capture for their marketing campaigns. These applications benefit from the analysis integrated into the capture processes of Papyrus Capture, which gives quicker access to the returned marketing information at significantly reduced capture costs.
References
Wienstrom GmbH
Energie AG Oberösterreich
EVN
KELAG
STEWEAG
Salzburg AG
Physician statements – Extraction of every detail
An average of 7,000 invoices daily from insurance clients and physicians represent a substantial volume of information. Only recently have the technology and tools been available to automate the process of information capture.
Papyrus FreeForm® technology extracts the key data of each invoice, such as insured client, date of treatment, amounts involved. Due to adaptive document understanding functions and precise recognition it is even possible to automatically capture every single service item position, plus additional service related information. The details provided by Papyrus Capture with FreeForm® allow for consistent and objective revision of the positions on the submitted documents.
The Requirements
The insurance company Winterthur is dealing with peaks of up to 10,000 invoices from physicians and laboratories daily. The requirement was to extract relevant information from the invoices in support of the insurance customer’s claim for reimbursement. Clearly, a capture power tool was required to provide a responsive service to its clients. However, each supplier tends to have a unique invoice format which makes searching its contents for the information of interest a significant challenge that could not be met by a conventional “bottom-up template-driven” approach. Furthermore, the information of interest is frequently printed in small font size and not within constrained areas on the document.
The Solution: Papyrus FreeForm® for Invoices
Drawing on previous experience, Winterthur selected its solution supplier on the basis of a benchmark run as a pilot project. Papyrus Capture clearly won the benchmark through its high flexibility, easy training and excellent recognition results.
Papyrus Capture with embedded FreeForm® technology searches for and extracts the relevant information from each invoice.
The savings from introducing the Papyrus Capture solution were realized immediately, and the cost of the document capture system was recovered within only a few months, especially because the solution facilitates detailed revision of invoice positions.
Functionalities
ISIS initially created logical definition libraries comprising document patterns with the basic document types and the information fields to be extracted from each. The expressions and descriptors required for the “Extractor definitions” were then generated by training on samples of each document type (document class) using a “learn by example” approach.
Each position is found automatically and then validated and transformed for consistency with the information held on a master database. This normalization of notational variance and uncertainties created within the text recognition is achieved using ‘Fuzzy-logic’ matching technology.
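The normalization step can be illustrated with a small Python sketch. It uses the similarity ratio from Python's standard-library `difflib` as a stand-in for the fuzzy matching technology, which this report does not describe in detail. The `MASTER_SERVICES` list, the `normalize_position` function and the 0.8 cutoff are hypothetical values chosen for illustration.

```python
import difflib

# Hypothetical master list of service descriptions (an assumption for this sketch).
MASTER_SERVICES = ["Consultation", "Blood test", "X-ray examination", "Physiotherapy"]


def normalize_position(ocr_text, candidates=MASTER_SERVICES, cutoff=0.8):
    """Map a recognized text to its closest master entry, or None if unsure."""
    # Compare case-insensitively but return the master spelling.
    by_lower = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        ocr_text.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


print(normalize_position("Consu1tation"))     # a typical misread of "l" as "1"
print(normalize_position("Dental cleaning"))  # no sufficiently close master entry
```

A recognized text that is close to a master entry is replaced by the master spelling. Anything below the cutoff returns `None` and would, in a real system, be passed to an operator.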
Production Process
Image and Data Capture
The incoming invoices are scanned in both sorted and unsorted batches using a high performance Kodak document scanner.
Images of the documents are transferred automatically within the system for classification into document type and the extraction of their contents of interest.
Verification, Correction & Export
Staff within the verification group deal only with the exception documents, e.g. uncertainties raised during the recognition process or non-compliance with the business rules.
The processed document batches are then exported automatically to the host system. During this stage the data for processing claim settlements are transferred via ODBC into a DB2 database and from there into the Winterthur system “Heureka plus”.
The information contained within the statements/invoices is analyzed and checked for plausibility, e.g. policy coverage, scope of benefits provided, etc.
Controlled by a workflow system, the document images are presented to the responsible official for finalization.
Wincare, the health insurance of Winterthur...
...is an enterprise within Credit Suisse Group, a leading worldwide bank and insurance corporation. It is a “Top Ten” company in the Swiss health insurance sector, with approximately 300,000 insured clients and an annual turnover in excess of 500 million CHF.
TELEKOM AUSTRIA
Automated distribution of centrally received mail and fax, using intelligent classification
The Customer
The Telekom Austria group is Austria’s leading provider of telecommunication services in both fixed and mobile networks.
In the Czech Republic, Croatia and in Slovenia Telekom Austria is a major provider within the Internet and mobile communication business. In 2000 the corporate group with 18,560 employees achieved revenue of EURO 3.9 billion. By the year-end of 2000 Telekom Austria had grown to more than 3.2 million telephone lines and 2.8 million customers within the mobile sector. The services offered by Telekom Austria include the data communication and Internet technology sectors.
The Requirements
In order to meet the demands of an increasingly competitive marketplace, Telekom Austria began an operational restructuring to expand its service offerings. This included centralizing its processing throughout Austria and increasing the automation of incoming mail. However, transactions relating to services such as installing new telephone lines, changing telephone numbers, issuing invoices or changing tariffs typically require information to be exchanged with customers on paper.
In changing markets such as the telecommunication sector, a proactive marketing strategy is essential. To be effective the marketing campaigns require that representative volumes of returned forms be processed rapidly.
In the case of Telekom Austria, the contents of more than 10,000 letter and fax documents daily needed to be distributed fast and reliably to the appropriate departments in the organization for ongoing processing. This demanded a highly integrated solution capable of high levels of automation, accurate distribution of documents, and the rapid introduction of new document types.
The Concept
Telekom Austria’s scan center in Vienna was chosen as the centralized distribution point for all the incoming mail. The entire process involving customer correspondence required re-structuring, from the digitalization of incoming data (e.g. customer mail and business documents) through to long-term archive of information with fast and transparent data access available, on demand, at the desktop.
Within the project TOM (Telekom Office Management) ISIS, using its Papyrus Capture product, offered a solution that created a transparent portal between the incoming paper documents and the digital information resident within the system’s domain. The implementation of Papyrus Capture will also provide the basis for the introduction of further workflow at Telekom Austria.
The Solution
The implementation of this concept was realized by ISIS using Papyrus Capture with Papyrus FreeForm® technology. Two high volume scanners process up to 10,000 documents daily with the document sorting and information extraction being performed by Papyrus Capture. Unstructured mail and documents are classified by the FreeForm® subsystem of Papyrus Capture and the index information is automatically extracted.
At commencement of the project, the system was trained to classify the incoming documents into 30 different document categories. However, the number of document types has more than doubled through subsequent additions by Telekom Austria’s trained personnel.
The Production Process
The scanners generate images of the paper documents (TIFF files), which form the basis for digitalization. Both single- and multi-page documents are handled as one business object.
The TIFF files are then imported into the Papyrus Capture database for classification into document type. Initially, an automated prime sort is performed based on image analysis and reading identified text. Those documents that cannot be identified within the prime sort are forwarded for further classification via FreeForm®, with its capability to classify document types from less structured documents.
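The two-stage classification described above, a fast prime sort followed by a more flexible fallback, can be sketched as a simple cascade. This Python sketch is purely illustrative: the keyword rules, the document type names and the word-overlap scoring in `freeform_classify` are assumptions standing in for the image analysis and the trained FreeForm® classifier, whose internals are not described here.

```python
def prime_sort(text):
    """Stage 1: cheap, rule-based sort for clearly identifiable documents."""
    rules = {
        "new telephone line": "line_order",
        "change of tariff": "tariff_change",
    }
    lowered = text.lower()
    for phrase, doc_type in rules.items():
        if phrase in lowered:
            return doc_type
    return None  # not identified in the prime sort


def freeform_classify(text):
    """Stage 2: stand-in for a trained classifier, using word overlap."""
    vocabulary = {
        "complaint": {"unhappy", "dissatisfied", "complain"},
        "cancellation": {"cancel", "terminate", "contract"},
    }
    words = set(text.lower().split())
    scores = {doc_type: len(words & vocab) for doc_type, vocab in vocabulary.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "manual_review"


def classify(text):
    """Try the prime sort first and fall back to the second stage."""
    return prime_sort(text) or freeform_classify(text)
```

The design point is that the inexpensive rules handle well-structured documents, so the more costly second stage only sees the less structured remainder.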
Following document classification, the index data is extracted from the contents of each document. Typically, this index information comprises customer number, area code, postal code (zip code) and telephone number. The extracted data is validated for integrity by matching it against the customer database (using ‘fuzzy logic’) and by checking compliance with the business rules. If the customer number does not correlate with a given telephone number, or if a data field is incomplete, the system either corrects the discrepancy or forwards the document for data editing at an operator workstation. Each operator workstation presents a highly intuitive user interface that focuses on the information under consideration.
The resulting information is exported as a background task, in XML format, to the long-term IXOS archive system together with the application server. Designated personnel within Telekom Austria may retrieve the information and/or documents from the archive using the customer relationship management (CRM) System Clarify.
Furthermore, the extracted contents of each document are exported as an individual business case, in text file format to Telekom’s host system for on-going processing.
Highlights
• Accepts all document types.
• Powerful design tools that simplify the definition of new document types, with extensive use of supervised “machine learn by example” techniques.
• Rapid migration with minimal disruption to the operation.
• From the initial 30 forms the system has expanded significantly and currently handles in excess of 70 document types.
• The mail is typically distributed to the designated personnel throughout Austria within 4 hours of being received.
• Approximately 30% of the mail received daily requires full content extraction using automated recognition assistance. Of those documents, 80% need no additional processing.
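As a rough worked example of the last highlight, the Python sketch below applies the two approximate percentages to an assumed daily volume of 10,000 documents, the upper figure quoted for the scan center. The function name and the rounding by integer division are choices made for this illustration, and actual daily volumes vary.

```python
def daily_workload(daily_documents=10_000, full_capture_pct=30, no_touch_pct=80):
    """Split a daily volume into full-capture, untouched and manual counts."""
    # Documents whose contents must be fully extracted.
    full_capture = daily_documents * full_capture_pct // 100
    # Of those, documents that need no additional processing.
    no_touch = full_capture * no_touch_pct // 100
    return {
        "full_capture": full_capture,
        "no_touch": no_touch,
        "manual": full_capture - no_touch,
    }


print(daily_workload())
# {'full_capture': 3000, 'no_touch': 2400, 'manual': 600}
```

Under these assumptions, only about 600 of 10,000 documents per day would need an operator to complete a full content capture.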
Decision Criteria
• Capable of processing large daily volumes with high accuracy
• Intuitive and productive user interfacing during the completion process (Verification Module)
• Improved performance benefits (number of documents per employee hour)
• A flexible and comprehensive design environment to readily adapt the system for new tasks
• Workflow integration
“Having learned to use the new tool and successfully integrated the capture system into our existing systems, we obtained a high level of transparency, which not only achieved new processing targets but also improved our customers’ perception and, hence, improved acceptance of our products.”
Klaus Ambros, Telekom Austria AG
Bank Austria Creditanstalt
Data Capture at the Highest Level
About Bank Austria Creditanstalt:
Bank Austria Creditanstalt AG (BA-CA), a member of the Bavarian Hypo Vereinsbank Group (HVB), is the largest bank in Austria with total assets of about 50 billion EURO and 4,000 branches. BA-CA also has an extensive bank network in Central and Eastern Europe. Bank Austria Creditanstalt services more than 1.8 million private and 120,000 corporate customers.
Initial Situation/Objectives
The requirements for the new system, which had to serve a very large, recently merged organization, were:
• Fast processing on the day of entry, with a guarantee of the highest recognition and validation quality
• Rich functionality, such as complete data capture and verification according to the “four-eyes principle”
• Flexible, temporary access for various employees, with verification, data completion and administration possible from any workstation via a web application
• Integration into the existing IT and scanner environment
The new centralized processing office of Bank Austria Creditanstalt in Vienna has to handle an average of 200,000 incoming documents per day, and up to 400,000 in peak seasons. The incoming documents can be records submitted by customers in paper form or transfer orders for BA-CA accounts submitted electronically by other banks. The data from both document sources should be handled by one document capture system. The standardized A6 and A4 forms for national and international payment transactions may be machine-printed or handwritten.
The Solution
Papyrus Capture ZV, ISIS Papyrus’ flexible high-performance capture platform, is a special product for the requirements of payment transactions. Its workflow can easily be adapted according to the differing needs of each banking establishment. The flexibility of Papyrus Capture ZV and its options for bank-specific balancing of account numbers and validating customer addresses were key decision factors for selecting the ISIS Papyrus solution.
Papyrus Capture enabled the implementation of all requirements efficiently and successfully:
Database matching of customer data and an account wording check using Fuzzy Matching technology reduce post-editing effort and make data capture more accurate.
The amounts read by OCR/ICR can be checked either by visual comparison or by entering the amounts manually. Any control parameter for release can be defined.
Local verification and validation with Papyrus Portal enables data completion via the familiar web browser environment. Data security is ensured by de-personalizing the records through “field scrambling”: only single fields are displayed, so no conclusions can be drawn about the complete document. Additionally, the connection to the central server is protected by Secure Sockets Layer (SSL).
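The idea behind field scrambling can be sketched in a few lines of Python. This is an assumption-laden illustration of the general principle, not the Papyrus Portal implementation: the `scramble` and `reassemble` functions, the task layout and the server-side `key` mapping are all hypothetical.

```python
import random


def scramble(documents, rng=None):
    """Split documents into single-field tasks in random order.

    Returns (tasks, key). The tasks are what operators see and carry no
    document identifier. The key maps task ids to documents and would
    stay on the server.
    """
    rng = rng or random.Random()
    items = [
        (doc_id, name, value)
        for doc_id, fields in documents.items()
        for name, value in fields.items()
    ]
    rng.shuffle(items)
    tasks, key = [], {}
    for task_id, (doc_id, name, value) in enumerate(items):
        tasks.append({"task_id": task_id, "field": name, "value": value})
        key[task_id] = doc_id
    return tasks, key


def reassemble(tasks, key):
    """Rebuild complete documents on the server from completed tasks."""
    documents = {}
    for task in tasks:
        doc_id = key[task["task_id"]]
        documents.setdefault(doc_id, {})[task["field"]] = task["value"]
    return documents
```

Because each task shows one field without its document context, an operator correcting an amount cannot tell which account or addressee it belongs to. Only the server, which holds the key, can put the record back together.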
The Process
Two Kleindienst H-Series and one Kleindienst SC 1660 are used for scanning. The scanning process generates Multi-Tiff images (grey level image, dropout (net) image, front and rear side). On single payment transaction forms account number, bank identification code, addressee, sender and the amount are registered. On records based on a cash transaction, addressee and amount are captured. The number of different document types is unlimited and can be expanded at any time. Completion of data is done either in a field-by-field way or on document level. Before the export, the data has to undergo a double verification control.
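The double verification before export, in the spirit of the “four-eyes principle” listed among the requirements, can be sketched as a comparison of two independent keying passes. The Python below is a minimal illustration under assumptions: the `double_verify` function and the layout of each pass (a dict with `operator` and `fields`) are hypothetical and not taken from Papyrus Capture ZV.

```python
def double_verify(first, second):
    """Compare two independent keying passes of the same document.

    Each pass is a dict with 'operator' and 'fields'. Returns a tuple
    (released, mismatches). The document is released only if two
    different operators produced identical field values.
    """
    if first["operator"] == second["operator"]:
        raise ValueError("four-eyes principle requires two different operators")
    a, b = first["fields"], second["fields"]
    # Collect every field that is missing or different in one of the passes.
    mismatches = sorted(
        name for name in a.keys() | b.keys() if a.get(name) != b.get(name)
    )
    return (not mismatches, mismatches)
```

A document whose two passes agree is released for export, while any mismatched field names are returned so that only those fields need to be re-examined.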
Highlights
A special highlight of this installation is the new method of data correction via web browser. Intranet-based correction with a thin client allows a flexible number of workstations, and corrections can be entered from any network PC with a web browser. If necessary, colleagues from other departments can collaborate once the system administrator has authorized them as users, and thus help at daily or monthly peaks without additional co-workers or delays. This new technology dramatically reduces maintenance and rollout effort as well as costs.
KEBA
Three Domains
One Cooperation
KEBA, an established company with headquarters in Linz, Austria, is a major world-wide provider in the fields of industry, bank and service automation.
ISIS Papyrus provides the intelligent underlying recognition technologies that empower various KEBA solutions.
KEBA, with its 560 employees, is internationally respected for its technical abilities and specializes in the fields of control engineering, communication technologies and production engineering. KEBA develops both products and solutions to meet the specific needs of its customers and employs only proven advanced technologies to maintain a competitive advantage. Strong emphasis is placed upon development methodologies as well as compliance with international regulatory standards.
RONDO
RONDO, a product suite of KEBA, covers all main domains of self-service banking.
The Rondo self-service program has a modular structure. It offers customized solutions which are optimized to each bank’s requirements. ISIS Papyrus recognition technology is employed within the Rondo program to automate services which are processed more efficiently and at lower costs in self service than at the cash desk, e.g. scanning of payment transaction slips.
Rondo self-serve transaction terminals
Currently there are about 3,000 Rondo self-serve terminals in operation. Each unit offers a high level of functionality and can process many types of transactions with the highest recognition accuracy. The modular approach protects investment through add-on functionality to meet future demands, following the strategy of upgrading rather than replacing equipment and making economical use of space.
Built to a robust design using proven advanced technologies and high-tech components the terminals are easy to use and employ a touch-screen user interface.
The ISIS “ÜBox”
As a natural continuation of the Papyrus back-office payment processing systems ISIS developed and patented the self-service scan-entry system called ÜBox. The ÜBox enables each customer to directly scan bank documents thereby reducing the “point-of-entry” process to an absolute minimum.
KeWin
KEWIN supplies a wide range of terminals from the entry-level models through to the fast multimedia lottery terminals. The modular approach of KEBA consoles makes them suitable for a wide range of applications including: interactive ticket validation, lottery ticket scan stations, etc. To handle the diversity in quality of the lottery ticket entries, e.g. faded or incomplete marks, Papyrus Capture was selected as the underlying recognition technology.
Customers of KEBA include:
Österreichische Lotterien GmbH
Österreichische Raiffeisenbanken
Österreichische Sparkassen
Sparkassen IT-Center IZB, Germany
Dubai Police
Telecom Asia, Thailand
Highlights of ISIS Technology at KEBA
• Robust and accurate OMR (mark reading)
• OCR/ICR Character Recognition of both machine-print and hand-print
• Intelligent Image Pre-processing to ensure optimal presentation to the recognition sub-system
• Distributed direct document scanning
KEBA Quick Facts
Location: Linz, Austria
Industry: Development and manufacturing of industry, bank and service automation equipment
Employees: 560+
Revenue: €70 million+
A Fully Integrated Business Document Solution Using State of the Art Technology
The Papyrus Document System is much more than a sum of software components. Its architecture follows a thoughtfully designed blueprint providing solutions to individual customer problems as well as a long-term concept for integrating new technologies naturally into your environment. The Papyrus components can be used as stand alone products. Combined in an integrated system, they cover the complete life cycle from development to the archive.
Papyrus Designer Suite
A complete package to fulfill all graphical development requirements on a Windows platform.
Papyrus DocEXEC
A high-speed formatting engine available on 12 platforms for high volume batch and interactive single document production.
Papyrus Client with the integrated DocEXEC formatter offers online and interactive document application production and printing.
Papyrus Desktop using a Plug-In Interface with Papyrus Client provides user authorization by a role and policy system.
Papyrus Capture incorporates incoming paper and email into the system by classifying them according to a sample document set or keywords. Text and data fields are extracted for further processing and archiving. Data is provided in formats that are ready for instant processing by other business applications.
Papyrus Postprocessing/PrintPool
Documents from any data source can be bundled, sorted and merged into one envelope. Storing documents in the PrintPool with document index information allows their output to be managed for different channels such as fax, e-mail, archive, Internet delivery and print.
Papyrus WebArchive fulfills customer care and Internet delivery requirements. The documents can be viewed and printed in AFP format as well as converted on the fly for viewing in native PDF, GIF and TIFF formats. An XML interface links the document index information to third party long-term archiving systems.
Papyrus WebRepository provides document and resource versioning and validation (from/to). User role management, print job management and automatic software distribution are standard features of the product.
Papyrus Host & Server and WebControl
Papyrus Host is a Functional Subsystem linking to JES2/3 on OS/390 for remote print management. Papyrus Server converts the independent AFP document format with highest fidelity to the required output formats for printing, faxing, mailing, web and the archive.
Papyrus WebControl offers the Print/Job/Queue Management for Papyrus Servers across the network using the Papyrus Objects Desktop.