Data Transformation in the WebSphere Business Integration Platform
Developing a comprehensive solution

The most basic yet challenging problem of business integration is reconciling differences between data and messages within the enterprise. Enterprises run multiple application systems, each holding data related to the data in other applications. The problem lies in how those data are expressed: the semantic (meaning) and syntactic (format) incompatibilities between the disparate data models and representations that applications use. In short, the data models across these systems are inconsistent with one another.

The role of transformation in a business integration strategy is to map these inconsistent data formats and data models into common objects that can be shared among the various applications. That role is crucial because transformation of the data is typically required at each step of a business process.

The ability to transform complex data is of particular importance today because the push for operational efficiency and the pressure of regulatory compliance have created the need for a new class of high-performance, mission-critical applications. These applications require accurate real-time data transformation between unstructured and semistructured document and message formats, such as nonstandard variants of EDI, legacy COBOL structures, XML, and proprietary business document types such as Microsoft Office and Adobe PDF.

Data transformation is often a major barrier to achieving business value and ROI from business integration projects, due to the diversity of formats used in the modern enterprise. Until now, data transformation has been accomplished with tools and technology that:

  • Fail to address diverse data types and often require costly custom development
  • Exacerbate technology proliferation issues
  • Drive up overall integration costs, complexity, and project timelines
CLASSES OF TRANSFORMATION PROBLEMS
From a business viewpoint, the most pressing problems associated with data transformation are:
  • The need to channel a large and diverse amount of information to end users and systems in support of customer self-service and real-time enterprise applications/initiatives such as B2B and other "extended enterprise" connections.
  • Regulatory compliance requirements such as Sarbanes-Oxley, HIPAA in healthcare, the USA PATRIOT Act, and the Basel II accord in financial services. Compliance imposes control, reporting, and data integration challenges.
And, of course, another significant problem in these cases is cost. On average, application integration implementation costs range from three to five times the cost of required software licenses. Data transformation accounts for a disproportionate share of that implementation cost, which grows even higher as the data integration requirements increase in complexity.

And the problems will not go away soon. Unstructured data accounts for as much as 75% of the information stored in enterprises today. Semistructured data, which includes a variety of B2B and legacy data formats, is prevalent in many business integration scenarios and business application initiatives, such as B2B transaction integration, collaborative document-oriented workflow, and customer self-service.

TECHNICAL REQUIREMENTS FOR TRANSFORMATION
Recent standards such as XML, Extensible Stylesheet Language (XSL), XSL Transformations (XSLT), and the J2EE Connector Architecture (J2CA or J2EE-CA) bring some order to the world of data transformation. However, they don't address the full set of data transformation requirements for the business integration challenges that enterprises need to solve today.

In addition to robust support for the conversion of multiple data formats to and from XML, which is rapidly becoming the lingua franca for business integration, a comprehensive solution for transformation must support the integration of complex data and should include:

  • Prebuilt functions for one-to-many and many-to-many transformations
  • Support for transformation of repeating groups
  • An intuitive graphical development environment for designing and maintaining transformation maps (because even standard data formats may change all too quickly)
  • Support for custom transformation functions
  • Support for processing sets of data (e.g., dealing with multiple input records and associated filtering, summarizing, and sorting)
  • Support for unstructured and semistructured documents
  • Automatic discovery/import of document metadata - the ability to create definitions from a format specification or an example document
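Two of the requirements above, transformation of repeating groups and processing of sets of data, can be illustrated with a minimal Python sketch. The message layout, field names, and function names here are hypothetical and chosen only for illustration; they do not come from any WebSphere product.

```python
from collections import defaultdict

# Hypothetical input: one order message containing a repeating group of lines.
order = {
    "order_id": "A100",
    "lines": [
        {"sku": "X1", "qty": 2, "price": 10.0, "status": "open"},
        {"sku": "X2", "qty": 1, "price": 99.5, "status": "cancelled"},
        {"sku": "X1", "qty": 3, "price": 10.0, "status": "open"},
    ],
}


def explode_lines(message):
    """One-to-many: emit one flat target record per repeating-group entry."""
    return [{"order_id": message["order_id"], **line} for line in message["lines"]]


def summarize_open_lines(records):
    """Set processing: filter, summarize, and sort a collection of records."""
    totals = defaultdict(float)
    for rec in records:
        if rec["status"] != "open":  # filtering
            continue
        totals[rec["sku"]] += rec["qty"] * rec["price"]  # summarizing
    # Sorting by SKU gives a stable, predictable output order.
    return [{"sku": sku, "amount": amount} for sku, amount in sorted(totals.items())]


flat = explode_lines(order)
summary = summarize_open_lines(flat)
```

Here `flat` holds one record per order line, and `summary` holds one total per SKU for the open lines only. A mapping tool with prebuilt functions would express the same steps declaratively rather than in hand-written code.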
DEVELOPMENT ACTIVITIES IN TRANSFORMATION
From the technical perspective, the transformation process involves the following activities:
  • Sourcing: finding and understanding the structure of the candidate data specification or data instance for transformation
  • Definition: specifying the information to be extracted from the source, which is difficult to automate for complex data without some form of data visualization
  • Mapping: the specification of the semantic relationship between source and target formats, which is difficult to automate for complex data that typically doesn't have the hierarchical structure required by most mapping tools
  • Translation: the implementation of syntactic changes applied to the data
  • Reconciliation: the validation of the transformation; deals with any inconsistencies in the data
Today, these activities are accomplished using inefficient tools and processes that rely heavily on custom programming. What's needed is a rational and comprehensive approach to data transformation that addresses the greatest inefficiency in the process: the programming - and maintenance - of intensive transformation activities.
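To make the activities listed above concrete, here is a minimal Python sketch of the kind of hand-written transformation code the article argues tooling should replace. The fixed-width layout, the field names, and the validation rules are all assumptions made for this example.

```python
# Sourcing: a hypothetical fixed-width record layout (field name, start, end).
SOURCE_LAYOUT = [("cust_no", 0, 6), ("name", 6, 26), ("balance", 26, 35)]

# Mapping: the semantic relationship between source and target field names.
FIELD_MAP = {
    "cust_no": "customerId",
    "name": "customerName",
    "balance": "balanceCents",
}


def define(record):
    """Definition: extract the fields of interest from the source record."""
    return {name: record[start:end] for name, start, end in SOURCE_LAYOUT}


def translate(fields):
    """Translation: apply syntactic changes (rename, trim, convert types)."""
    target = {FIELD_MAP[name]: value.strip() for name, value in fields.items()}
    target["balanceCents"] = int(target["balanceCents"])
    return target


def reconcile(target):
    """Reconciliation: validate the result and report any inconsistencies."""
    problems = []
    if not target["customerId"].isdigit():
        problems.append("customerId is not numeric")
    if not target["customerName"]:
        problems.append("customerName is empty")
    return problems


record = "000042" + "Acme Corp".ljust(20) + "000012550"
target = translate(define(record))
problems = reconcile(target)
```

Every change to the source layout or the target model means editing and retesting code like this, which is the maintenance burden the paragraph above describes.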

Data Transformation in the WebSphere Business Integration Platform

The IBM Business Integration Reference Architecture (BIRA) encompasses the key areas of integration capability required for comprehensive, enterprise-wide integration strategies and solutions. In the context of the BIRA, scenarios are presented in areas where the need for a robust data transformation solution is most apparent. These scenarios also mention the specific WebSphere Business Integration components that carry out complex data transformations, and those components are then described.

BUSINESS INTEGRATION REFERENCE ARCHITECTURE SCENARIOS
Figure 1 shows the IBM Business Integration Reference Architecture, including both the generic services and the IBM products that deliver the required generic capabilities.

When automating processes across the value chain, one inevitably encounters data that is not self-describing and is not defined with schemas the way XML documents are. To integrate across systems, such unstructured data must be accommodated. Below are several scenarios that map to various BIRA capabilities and the data transformation requirements they entail:

  • Partner Services: These are provided by WebSphere Business Integration Connect for the exchange of B2B transactions. These transactions may involve standards-based data, and custom implementations of those standards, or the exchange of binary documents.
  • Application and Data Access: For application and data access services, WebSphere Business Integration Adapters exchange transactions in their native application formats, enabling applications to participate in a service-oriented architecture (SOA). In many cases, transforming complex application formats will be required.
  • Enterprise Service Bus (ESB): Within the Enterprise Service Bus, which transports and mediates transactions across distributed systems in an SOA, ESB clients may exchange data in complex data formats. In this case, WebSphere MQ, Web Services Gateway, and the WebSphere Business Integration Message Broker are used.
  • Process Services: For process services, where solutions implement multistep business processes that span people and systems, complex data must also be accommodated. In this case, the WebSphere Business Integration Server and WebSphere Business Integration Server Foundation are used.
DATA TRANSFORMATION TOOLS IN WEBSPHERE BUSINESS INTEGRATION COMPONENTS
This section lists the WebSphere Business Integration Platform components and the tools they provide for data transformation:
  • WebSphere Business Integration Connect enables a business to extend integration beyond the enterprise - exchanging information with trading partners and managing trading communities. WBI Connect can use schema validations, XSLT transformations, and custom exits to call external transformation services on the transmitted documents.
  • WebSphere Business Integration Server provides process integration, workforce management, and enterprise application connectivity. It has the following sub-components:
    - WebSphere InterChange Server automates and synchronizes business activities executed across multiple applications as business processes. Its GUI-based Map Designer defines and generates code for transformation maps between application-specific and normalized data models. It specifies the transformation steps for each destination attribute to be transformed. Mapping is supported between data values and data entity relationships, and via access to third-party mapping products and databases.
    - WebSphere MQ Workflow enables the integration of all participants in the business process, including those external to the organization. Data mapping tools are provided via the WebSphere MQ Workflow Buildtime graphical environment.
    - WebSphere Business Integration Message Broker transforms and enriches in-flight information to provide a level of intermediation between applications that use different message structures and formats. It provides rich support for configuring message flows that transform messages from one format to another with the Message Flow Mapping editor in the Message Brokers Toolkit for WebSphere Studio.

    For messages that are predefined, meaning that their content and structure are both known and predictable, WebSphere Business Integration Message Broker can use the facilities provided by the Message Repository Manager (MRM). For messages that aren't predefined, a common approach is to include a Compute node in the message flow to create the output message with the required content. This task can vary greatly in complexity and effort, depending on the complexity of the format. For example, for each Compute node in a given flow or application, an ESQL module must be coded using ESQL statements and functions to tailor the behavior of the node.

    From a runtime perspective, WebSphere Business Integration Message Broker includes prebuilt parsers. Parsers are programs that interpret the bit streams of incoming messages and create internal representations of the message.
    - WebSphere Business Integration Server Foundation is a standards-based platform optimized for building and deploying composite applications by creating reusable services out of Web services, Java assets, back-end systems, and packaged applications. It includes WebSphere Process Choreographer, which is a business-process engine that allows for the efficient execution of business processes. With it, business process technology can be combined with any other service offered by the open J2EE architecture.
    - WebSphere Business Integration Server Foundation includes WebSphere Studio Application Developer Integration Edition, which provides a drag-and-drop integration development environment optimized for building composite applications. Studio provides a number of XML tools for building and validating DTDs, XML schemas, and XML files; for generating JavaBeans from a DTD or XML schema; and for defining transformation mappings between XML documents by generating XSLT scripts. In addition, Studio tools support the creation of an HTML document by applying an XSL style sheet to an XML document, and the definition of mappings between relational tables and DTD files or between SQL statements and DTD files.

  • WebSphere Business Integration Adapters enable an enterprise to create integrated processes that exchange information between ERP, HR, CRM, and supply chain systems. There are several different types of adapters as well as a toolkit for developing custom adapters:
    - Applications Adapters: extract data and transaction information from cross-industry and industry-specific packaged applications, and connect them to a central hub
    - Mainframe Adapters: provide access to application data in OS/390 systems and provide connectivity approaches to AS/400 systems
    - Technology Adapters: provide the connectivity for accessing data, technologies, and protocols that enhance integration infrastructure
WebSphere Business Integration Adapters use data handlers to perform data transformations and to manage interactions with both WebSphere Business Integration Platform components and applications. IBM provides several standard data handlers (e.g., Fixed-Width, Delimited, and Name-Value data handlers) and special data handlers (XML and EDI data handlers).
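The idea behind the delimited and name-value handlers can be sketched in a few lines of Python. This is not the IBM Data Handler API; the function names, the plain-dictionary stand-in for a business object, and the separator defaults are assumptions made for illustration.

```python
def from_delimited(text, field_names, delimiter="|"):
    """Convert one delimited record into a dict standing in for a business object."""
    values = text.split(delimiter)
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} fields, found {len(values)}"
        )
    return dict(zip(field_names, values))


def to_delimited(obj, field_names, delimiter="|"):
    """Reverse direction: serialize the object back into a delimited record."""
    return delimiter.join(str(obj[name]) for name in field_names)


def from_name_value(text, pair_sep="&", kv_sep="="):
    """Convert name=value pairs into the same dict representation.

    Assumes every non-empty pair contains the key/value separator.
    """
    pairs = (pair.split(kv_sep, 1) for pair in text.split(pair_sep) if pair)
    return {name: value for name, value in pairs}


fields = ["id", "name", "city"]
customer = from_delimited("42|Acme|NY", fields)
round_trip = to_delimited(customer, fields)
same_customer = from_name_value("id=42&name=Acme&city=NY")
```

Both input formats yield the same object here, which is the point of a data handler: the rest of the integration works with one representation regardless of the wire format.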

The WebSphere Business Integration Adapter Framework enables adapters to be used in many different Business Integration solutions, including those provided by WebSphere InterChange Server, WebSphere Business Integration Message Broker, or WebSphere Business Integration Server Foundation.

ADVANCED DATA TRANSFORMATION
For the integration of complex data, IBM provides the WebSphere Business Integration Data Handler for Complex Data. It allows for the bidirectional conversion of text and binary formats to and from WebSphere Business Integration business objects. By using this data handler, a user can integrate with standard and proprietary formats such as Microsoft Word, Microsoft Excel, Adobe PDF, COBOL Copybooks, and HL7.

This data handler is commonly used with a WebSphere Business Integration Technology Adapter, e.g., JText, HTTP, MQ, or e-mail. It can also be used with a custom-built adapter. The IBM data handler leverages the Itemfield ContentMaster product, which provides a visual development and test environment for constructing custom parsers, without requiring custom programming.

Custom Java data handlers can be created that manipulate the Data Handler API by using the Adapter Development Kit and writing custom Java code. However, when data integration complexity warrants coding of a custom data handler, the IBM/Itemfield solution offers an excellent alternative.

The WebSphere Business Integration Data Handler for Complex Data implementation leverages XML to exchange data between the data handlers and Itemfield ContentMaster.

DESIGN TIME TOOLING
Figure 2 shows a design time implementation in which ContentMaster Studio is used to generate the parsers and serializers needed to deal with the complex data stream. An XML schema is used to describe the XML exchange with the data handler.

The XML schema is then utilized by the WebSphere Business Integration Adapter tooling to generate business object definitions describing the data exchanged between the adapter and the integration server.

RUNTIME INFRASTRUCTURE
Figure 3 illustrates runtime configuration. Here, the adapter manages the exchange of business objects with the integration server, as well as the exchange of data streams with the application, using its native interface. Using the data handler, it translates between data streams and business objects. It does so by leveraging the embedded ContentMaster Engine with its serializers and parsers, as well as the XML data handler.
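The runtime flow described above (native data stream, parser, XML, XML data handler, business object, and the reverse path through a serializer) can be approximated with a small Python sketch. The "key: value" stream format, the `Record` element name, and all function names are hypothetical stand-ins; the actual ContentMaster Engine and WebSphere data handler interfaces are not shown in this article.

```python
import xml.etree.ElementTree as ET


def parse_to_xml(stream: bytes) -> str:
    """Stand-in for a generated parser: native 'key: value' lines to XML.

    Assumes each key is a valid XML element name.
    """
    root = ET.Element("Record")
    for line in stream.decode("ascii").splitlines():
        if not line.strip():
            continue
        key, value = line.split(":", 1)
        ET.SubElement(root, key.strip()).text = value.strip()
    return ET.tostring(root, encoding="unicode")


def xml_to_business_object(xml_text: str) -> dict:
    """Stand-in for the XML data handler: XML to a business-object-like dict."""
    root = ET.fromstring(xml_text)
    return {"type": root.tag, "attributes": {c.tag: c.text for c in root}}


def business_object_to_xml(bo: dict) -> str:
    """Reverse direction of the XML data handler."""
    root = ET.Element(bo["type"])
    for name, value in bo["attributes"].items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def serialize_from_xml(xml_text: str) -> bytes:
    """Stand-in for a generated serializer: XML back to the native stream."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{c.tag}: {c.text}" for c in root).encode("ascii")


stream = b"id: 42\nname: Doe"
xml_text = parse_to_xml(stream)
bo = xml_to_business_object(xml_text)
restored = serialize_from_xml(business_object_to_xml(bo))
```

XML acts as the neutral exchange format between the two halves, so the format-specific logic lives only in the parser and the serializer.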

Because the Data Handler for Complex Data can be plugged into a technology adapter, ContentMaster can be used in virtually all WebSphere Business Integration components. The adapter framework is accessible throughout the entire WebSphere Business Integration stack.

Conclusion

The WebSphere Business Integration Platform provides many options for transforming data, including off-the-shelf adapters and data handlers. However, dealing with complex data is at the heart of today's data transformation problem. Off-the-shelf components may not fully support custom transformation logic beyond what the integration platform provides out of the box, and they may not cope well with input formats that demand heavy maintenance.

For the integration of complex data, IBM offers a compelling solution and an attractive alternative to developing custom Java data handlers within adapters. Complex data transformations can be implemented rapidly, without custom programming, making them very flexible in the face of change. IBM's tight integration of Itemfield's next-generation data transformation solution extends the scope and power of the WebSphere Business Integration platform by automating the integration of complex data - the most challenging part of many business integration efforts.

About Joseph Schwartz
As vice president of marketing, Joe is responsible for Itemfield's product pricing, positioning, packaging and the formulation of the company's ongoing go-to-market strategy. Previously, he was the vice president of product management at IONA Technologies, Inc., a middleware company. Joe also founded and managed two successful technology startups, one of which went public on the NASDAQ.
