Sunday, September 28, 2014

JBoss Fuse and JBoss Data Virtualization Integration using OData2/Olingo2 Camel Component

I have recently contributed an OData2.0 component based on Apache Olingo2 library to Apache Camel. The component supports all OData operations listed below:

  1. Reading EDM, Service Document, Entity Feed
  2. Creating, reading, updating, merging, patching, deleting Entity Records
  3. Reading, updating complex and simple properties
  4. Reading, updating, deleting relationship links
  5. Performing all the above operations as a sequence in a single Batch operation
Thanks to the efforts of my Red Hat colleague Ted Jones, I was able to resolve some interoperability issues we found in the component when testing it against the JBoss Data Virtualization OData server. As a result, the component is now fully interoperable with JBoss DV.


This opens up another avenue for integrating JBoss Fuse and JBoss DV, in addition to existing methods like the Camel SQL component. I was able to put this component together so quickly thanks to the Apache Camel API Component Framework, which I blogged about earlier.

Unfortunately, some of the fixes from this effort did not make it into the Apache Camel 2.14.0 release, but they will be included in the JBoss Fuse 6.2 release coming soon.

I will be posting a demo soon that shows how the new Olingo2 component can be used to integrate JBoss Fuse and JBoss Data Virtualization to combine the awesome power of both platforms. 

Thursday, September 11, 2014

Apache Camel API Component Framework or: How I learned to stop worrying and love writing feature rich Camel Components

Prologue

Apache Camel is a wonderful routing and mediation integration framework with support for a large number of protocols, products and technologies. All Camel components expose functionality through Endpoint URIs for message producers and consumers to a route. Or in other words component functionality is mapped to and from URIs. Parts of URI path and options or arguments become parameters for invoking component behavior. 

Writing a Camel component is easily done using a Maven archetype:

mvn archetype:generate \
-DarchetypeGroupId=org.apache.camel.archetypes \
-DarchetypeArtifactId=camel-archetype-component \
-DarchetypeVersion=<camel-version> \
-DgroupId=myGroupId \
-DartifactId=myArtifactId

Most of the initial effort in writing the component boils down to implementing the component interfaces that hook into the Camel framework, so that the framework can invoke the component when message exchanges need to be processed by producer endpoints or produced from consumer endpoints. The rest of the work is component specific, and involves mapping URI parts to component functionality.

One quickly starts to realize that starting to write a component is the easy part. Making it feature rich is time consuming and sometimes hard. It involves mapping each and every feature and its parameters to URIs and invoking some functionality wrapped by the component, such as an underlying transport, library, SDK, etc. In the past an alternative to putting in a lot of upfront component development effort was to let user demand and contributions drive component maturity. But that left initial versions of components somewhat lacking. 

After having written a few components that wrapped some sort of third party client library/SDK I realized that there was a method to the madness that is writing such Camel components. The idea is at its core quite simple, which is what makes it as flexible and useful as it has turned out to be in writing a number of recently added Camel components. 

Madness

Every Camel component that wraps some kind of third party client API/SDK has to essentially do the same thing. Map URIs and arguments (optionally Exchange headers) to some Java method invocation implemented by some Client proxy. There may be multiple client proxy interfaces/classes and multiple methods involved. 
Of course depending upon how cleanly or amateurishly the client API was designed, not all the methods may be exposed, and not all the arguments may need to be exposed or have been aptly named. Some arguments may have values computed by the component, and may represent some sort of conversational state (such as an authorization token), so they may not be exposed as URIs. 
Despite all these special cases, there is still a fairly good amount of unskilled labor involved in mapping URIs to methods and arguments. 
For components with a large amount of functionality and hence a large number of APIs (interfaces/classes, methods, or both), this unskilled labor is precisely what takes up a huge amount of time in component development.

Method

As must be evident by now, the solution is basically runtime reflection and invocation of APIs, mapping URI paths and arguments to methods (Class and method names) and method arguments. Mapping paths to method names is easy, since method names are preserved at runtime in Java. However, argument names are removed, so any tool that wants to do the mapping from URI's named arguments and exchange name-value headers has no access to them. 
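The gap can be seen with plain JDK reflection. The following is a minimal sketch, not framework code: GreeterApi is a hypothetical stand-in for a client API. The method name is always available, while parameter names are only real if the class happened to be compiled with javac's -parameters flag; otherwise reflection reports synthetic names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

final class ReflectionNamesSketch {

    /** Hypothetical client API, standing in for a third-party proxy interface. */
    interface GreeterApi {
        String greet(String name, int times);
    }

    /** Describes each parameter of the named method as "type name (real|synthetic)". */
    static List<String> describeParameters(Class<?> api, String methodName) {
        List<String> result = new ArrayList<>();
        for (Method method : api.getMethods()) {
            if (method.getName().equals(methodName)) {
                for (Parameter p : method.getParameters()) {
                    // isNamePresent() is false unless the class was compiled with -parameters;
                    // in that case getName() only returns a synthetic placeholder name
                    String kind = p.isNamePresent() ? "real" : "synthetic";
                    result.add(p.getType().getSimpleName() + " " + p.getName() + " (" + kind + ")");
                }
                return result;
            }
        }
        throw new IllegalArgumentException("No method named " + methodName);
    }

    public static void main(String[] args) {
        describeParameters(GreeterApi.class, "greet").forEach(System.out::println);
    }
}
```

Third party client libraries are rarely compiled with -parameters, which is why another source of argument names is needed.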

That's where Javadoc or simple signature text files fill in the missing piece in the puzzle. Maven repositories have publicly available Javadoc for client APIs. Even if one writes a custom client API, it's trivial to add the Maven Javadoc plugin to the project to generate one. Since Javadoc HTML is standard (there are slight variations between SDK versions, but nothing that can't be accounted for in a parser) it can be easily parsed to pull parameter names. 
In situations where the Javadoc is not available or too complex, a simple text file with method signatures and parameter names can be provided. 
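To give a feel for how little is needed from a signature file, here is a rough sketch that pulls the method name and argument names out of one signature line. The line format (an optional public modifier, a return type, the method name, and a comma separated argument list) is an assumption for illustration, and the regex is deliberately naive: it does not handle generic types that contain commas or spaces. It is not the parser the framework uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SignatureLineSketch {

    // Assumed line shape: [public] returnType methodName(type1 name1, type2 name2);
    private static final Pattern SIGNATURE =
            Pattern.compile("^\\s*(?:public\\s+)?(\\S+)\\s+(\\w+)\\s*\\(([^)]*)\\)\\s*;?\\s*$");

    /** Returns the method name followed by the argument names, in declaration order. */
    static List<String> parse(String line) {
        Matcher m = SIGNATURE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a method signature: " + line);
        }
        List<String> names = new ArrayList<>();
        names.add(m.group(2)); // method name
        String args = m.group(3).trim();
        if (!args.isEmpty()) {
            for (String arg : args.split("\\s*,\\s*")) {
                // the argument name is the last whitespace separated token
                String[] tokens = arg.trim().split("\\s+");
                names.add(tokens[tokens.length - 1]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(parse("public String greetMe(String name, int times);"));
    }
}
```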

After this information has been parsed, a model can be generated that can be instantiated and used at runtime to map URIs to methods. 
The beauty of this method is that it's completely independent of the number of API classes and methods being invoked. Also, if the API changes, the model can be quickly regenerated and a new component version released.
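The core idea can be sketched in a few lines of plain Java. This is only an illustration under simplifying assumptions, not the generated model: Greeter is a hypothetical client proxy, the argument name list stands in for what would be recovered from Javadoc or a signature file, and argument types are matched exactly (no primitives, subtypes or overload resolution).

```java
import java.lang.reflect.Method;
import java.util.List;
import java.util.Map;

final class ReflectiveInvokeSketch {

    /** Hypothetical client proxy wrapped by the component. */
    public static final class Greeter {
        public String greet(String name, Integer times) {
            return ("Hello " + name + " ").repeat(times).trim();
        }
    }

    /** Invokes methodName on proxy, ordering option values by the known argument names. */
    static Object invoke(Object proxy, String methodName, List<String> argNames,
                         Map<String, Object> options) {
        Object[] args = new Object[argNames.size()];
        Class<?>[] types = new Class<?>[argNames.size()];
        for (int i = 0; i < args.length; i++) {
            Object value = options.get(argNames.get(i));
            if (value == null) {
                throw new IllegalArgumentException("Missing option " + argNames.get(i));
            }
            args[i] = value;
            types[i] = value.getClass(); // simplification: exact runtime types only
        }
        try {
            Method method = proxy.getClass().getMethod(methodName, types);
            return method.invoke(proxy, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object result = invoke(new Greeter(), "greet", List.of("name", "times"),
                Map.of("name", "Camel", "times", 2));
        System.out.println(result);
    }
}
```

Nothing in invoke() is specific to Greeter, which is the point: the same code serves any number of API classes and methods once their argument names are known.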

The model can also be used to automatically generate skeleton integration testing code, and even documentation, for the complex feature rich components developed with this framework.

All the component developer now needs to do is describe how URI paths map to API classes in a simple but powerful XML configuration and write a small amount of code to instantiate and manage client proxy objects that are used to invoke reflected methods. 

Framework Features

Tooling

The API component framework is used to generate a component using the Camel maven archetype:
mvn archetype:generate \
-DarchetypeGroupId=org.apache.camel.archetypes \
-DarchetypeArtifactId=camel-archetype-api-component \
-DarchetypeVersion=2.14-SNAPSHOT \
-DgroupId=org.apache.camel.component.example \
-DartifactId=camel-example \
-Dname=Example \
-Dscheme=example \
-Dversion=1.0-SNAPSHOT \
-DinteractiveMode=false

This generates a skeleton component project preconfigured with a couple of simple hello world style API classes. The generated project can actually be compiled, and produces integration tests that can be run using JUnit.
This archetype generated project is configured to use the camel-api-component-maven-plugin to generate API model code when the component classes are built. This plugin's configuration has to be edited to point to the user's API to be used. 

The skeleton project has the following structure:
./camel-example # root project pom
./camel-example/camel-example-api # sample API classes, configured to produce Javadoc referenced by component
./camel-example/camel-example-component # component classes and pom configured to use camel-api-component-maven-plugin

The api module can be dropped and the component module moved up to root level in cases where the API classes are provided by a third party or product wrapped by the Camel component. 

Component Configuration

The API framework maps URIs of the form:

scheme://endpoint-prefix/endpoint?optionName1=optionValue1...&optionNameN=optionValueN

Where, scheme comes from the archetype command, endpoint prefixes map to API names (classes or interfaces), endpoints map to method names, and optionNames map to method argument names from Javadoc or a simple text file with method signatures. 
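As a rough illustration of this URI shape, the sketch below splits an endpoint URI into scheme, API name, method name and options using only java.net.URI. The URI and the names hello, greetMe, name and times are hypothetical, and real Camel endpoint resolution is more involved than this; the sketch only shows which part of the URI plays which role.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class EndpointUriSketch {

    final String scheme;      // component scheme
    final String apiName;     // endpoint prefix, mapped to a proxy class
    final String methodName;  // endpoint, mapped to a method name
    final Map<String, String> options = new LinkedHashMap<>(); // mapped to method arguments

    private EndpointUriSketch(String scheme, String apiName, String methodName) {
        this.scheme = scheme;
        this.apiName = apiName;
        this.methodName = methodName;
    }

    static EndpointUriSketch parse(String endpointUri) {
        URI uri = URI.create(endpointUri);
        String path = uri.getPath() == null ? "" : uri.getPath();
        String methodName = path.startsWith("/") ? path.substring(1) : path;
        EndpointUriSketch parsed =
                new EndpointUriSketch(uri.getScheme(), uri.getAuthority(), methodName);
        String query = uri.getQuery();
        if (query != null && !query.isEmpty()) {
            for (String pair : query.split("&")) {
                String[] nameValue = pair.split("=", 2);
                parsed.options.put(nameValue[0], nameValue.length > 1 ? nameValue[1] : "");
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        EndpointUriSketch uri = parse("example://hello/greetMe?name=Camel&times=2");
        System.out.println(uri.apiName + "." + uri.methodName + uri.options);
    }
}
```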
API names are mapped to Java proxy classes (either classes or interfaces) using the fromApis goal in camel-example-component/pom.xml. 
An example is shown next:
<goals>
  <goal>fromApis</goal>
</goals>
<configuration>
  <apis>
    <api>
      <apiName>comments</apiName>
      <proxyClass>org.apache.camel.component.linkedin.api.CommentsResource</proxyClass>
      <fromJavadoc/>
    </api>
    <api>
      <apiName>companies</apiName>
      <proxyClass>org.apache.camel.component.linkedin.api.CompaniesResource</proxyClass>
      <fromJavadoc/>
    </api>
  </apis>
</configuration>

Prefixes are easily mapped to proxy classes. The fromJavadoc element tells the code generator to look for Javadoc in the Maven provided scope. Alternatively, the signatures may come from a text file whose path is given in a fromSignatureFile element.
In addition to the mapping from endpoint prefixes to proxy classes, the code generation can be controlled through a number of configuration options that are described later. 

Component Lifecycle

In addition to configuring the API mapping, the component developer has to create and manage instances of proxy classes in the ExampleEndpoint class. The component may require custom proxy creation parameters or properties, such as user credentials or connection properties like URLs for external resources. These properties must be added to camel-example/camel-example-component/src/main/java/org/apache/camel/component/example/ExampleConfiguration.java.
Endpoint configuration properties for method arguments are generated in classes that extend ExampleConfiguration. 
Sample code generated by the archetype can be easily modified to instantiate proxies in the afterConfigureProperties() method. The class should also override doStop() to do any resource cleanup, if required. 
Most of the Camel components I have written using this framework have also exploited the idea of creating a shared proxy instance in the component, which is shared across endpoints that use identical connection configuration in ExampleConfiguration. 
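One simple way to get this kind of sharing is a cache keyed by the connection configuration. The sketch below is a generic illustration using only the JDK, not code from any of the components mentioned here: ConnectionConfig and ClientProxy are hypothetical, and the key point is that the configuration class implements equals() and hashCode() so that identical settings resolve to the same proxy.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

final class SharedProxySketch {

    /** Hypothetical connection settings; equals/hashCode make it usable as a cache key. */
    static final class ConnectionConfig {
        final String url;
        final String user;

        ConnectionConfig(String url, String user) {
            this.url = url;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConnectionConfig)) {
                return false;
            }
            ConnectionConfig other = (ConnectionConfig) o;
            return url.equals(other.url) && user.equals(other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(url, user);
        }
    }

    /** Hypothetical client proxy that would hold the real connection. */
    static final class ClientProxy {
        final ConnectionConfig config;

        ClientProxy(ConnectionConfig config) {
            this.config = config;
        }
    }

    private final Map<ConnectionConfig, ClientProxy> proxies = new ConcurrentHashMap<>();

    /** Returns the proxy for this configuration, creating it on first use. */
    ClientProxy proxyFor(ConnectionConfig config) {
        return proxies.computeIfAbsent(config, ClientProxy::new);
    }

    public static void main(String[] args) {
        SharedProxySketch component = new SharedProxySketch();
        ClientProxy a = component.proxyFor(new ConnectionConfig("https://api.example.com", "alice"));
        ClientProxy b = component.proxyFor(new ConnectionConfig("https://api.example.com", "alice"));
        System.out.println(a == b);
    }
}
```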

API Configuration

The API mapping can be tweaked and enriched with a number of configuration options. These options can be specified at the global level under the apis element or per API under the api element. 

Parameter Substitution

Unless the API being wrapped has been cleanly designed, there can sometimes be conflicts in method argument names and types. Since an endpoint configuration class is generated per proxy class, with properties that map to argument names for all methods in that class, the class becomes a single namespace for all arguments across methods. So arguments that share a name across methods MUST have the same type. Some argument names may also be too similar to one another, or simply confusing. In short, there are a number of reasons why the component developer may want to rename or modify parameter names.
The framework supports parameter substitutions as shown next:
<fromSignatureFile>signatures/file-sig-api.txt</fromSignatureFile>
<!-- Use substitutions to manipulate parameter names and avoid name clashes -->
<substitutions>
<substitution>
<method>^.+$</method>
<argName>^(.+)$</argName>
<argType>java.lang.String</argType>
<replacement>$1Param</replacement>
<replaceWithType>false</replaceWithType>
</substitution>
</substitutions>
The method and argName elements are regular expressions and may match all or part of the method name and argument name respectively. The argType element is a regex matched against the argument's class name, and may be omitted. The replacement element is required and provides the text substituted for the argument name. If the replaceWithType flag is set to true (false by default), the replacement text uses $ variables from the argType regex instead of the argName regex.
With all these options, it's easy to substitute any ugly argument name with a pretty argument name as the component developer desires.
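The renaming itself is ordinary regex replacement. Here is a minimal sketch with java.util.regex, assuming the argName pattern and replacement text are applied the way Matcher.replaceAll() applies them. Note that in Java a $1 in the replacement refers to the first capture group, so the sketch uses ^(.+)$ as the argument name pattern. The method and argType matching, and the replaceWithType flag, are left out to keep it short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubstitutionSketch {

    /** Returns argName rewritten by the replacement if the regex matches, else unchanged. */
    static String substitute(String argName, String argNameRegex, String replacement) {
        Matcher matcher = Pattern.compile(argNameRegex).matcher(argName);
        // replaceAll() resets the matcher, so calling find() first is safe
        return matcher.find() ? matcher.replaceAll(replacement) : argName;
    }

    public static void main(String[] args) {
        System.out.println(substitute("name", "^(.+)$", "$1Param"));
    }
}
```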

Exclude Argument by Name

Arguments can be excluded by name using a regex for option excludeConfigNames. 

Exclude Argument by Type

Arguments can be excluded by type using a regex for option excludeConfigTypes. 
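Both exclusions amount to filtering the argument list with a regex. The sketch below shows the idea on a map of argument names to type names. It assumes the regex has to match the whole name or type, which may differ from how the plugin applies it, and the argument names and com.example.Callback are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class ExcludeArgsSketch {

    /** Keeps arguments whose name and type match neither regex; a null regex disables that filter. */
    static Map<String, String> filter(Map<String, String> argTypesByName,
                                      String excludeConfigNames, String excludeConfigTypes) {
        Pattern names = excludeConfigNames == null ? null : Pattern.compile(excludeConfigNames);
        Pattern types = excludeConfigTypes == null ? null : Pattern.compile(excludeConfigTypes);
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> arg : argTypesByName.entrySet()) {
            boolean byName = names != null && names.matcher(arg.getKey()).matches();
            boolean byType = types != null && types.matcher(arg.getValue()).matches();
            if (!byName && !byType) {
                kept.put(arg.getKey(), arg.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> arguments = new LinkedHashMap<>();
        arguments.put("name", "java.lang.String");
        arguments.put("accessToken", "java.lang.String");
        arguments.put("callback", "com.example.Callback");
        System.out.println(filter(arguments, "accessToken", ".*Callback"));
    }
}
```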

Custom/Extra Arguments

Custom endpoint URI options can be added to the API proxy mapping using the option extraOptions like so:
<!-- Add custom endpoint options to generated EndpointConfiguration class for this API -->
<extraOptions>
<extraOption>
<type>java.util.List&lt;String&gt;</type>
<name>customOption</name>
</extraOption>
</extraOptions>
This extra option will usually be intercepted by the component in the overridden method interceptProperties() in the endpoint, consumer and/or producer. These extra options may be used to compute/generate a complex value for another method argument excluded using the options mentioned earlier.
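As a rough idea of what such an interception might look like, the standalone sketch below takes the customOption list out of the property map and uses it to compute a value for another argument. The Map based method shape mirrors my understanding of interceptProperties() but is written here without any Camel dependency, and the derived argument name fields is purely hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InterceptPropertiesSketch {

    /** Replaces the customOption list with a comma joined "fields" argument (hypothetical). */
    static void interceptProperties(Map<String, Object> properties) {
        Object custom = properties.remove("customOption");
        if (custom instanceof List) {
            List<String> values = new ArrayList<>();
            for (Object value : (List<?>) custom) {
                values.add(String.valueOf(value));
            }
            // compute the excluded argument from the extra option
            properties.put("fields", String.join(",", values));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("customOption", List.of("id", "name"));
        interceptProperties(properties);
        System.out.println(properties);
    }
}
```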

Nullable Options

Some method arguments in APIs can be null. These argument names can be provided in a list of nullableOption elements as:

<!-- Add nullable endpoint options for this API -->
<nullableOptions>
<nullableOption>option1</nullableOption>
<nullableOption>option2</nullableOption>
</nullableOptions>

Method Aliases

APIs sometimes use common naming conventions for methods, which can be exploited to create shorthand aliases for endpoints. For example, the common get/set Java bean property method pattern can be used to generate aliases for endpoints as:

<!-- Use method aliases in endpoint URIs, e.g. support 'widget' as alias for getWidget or setWidget -->
<aliases>
<alias>
<methodPattern>[gs]et(.+)</methodPattern>
<methodAlias>$1</methodAlias>
</alias>
</aliases>
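The alias computation can be sketched with java.util.regex as follows. Lower-casing the first character, so that getWidget yields widget, is an assumption made here to match the example in the comment above; it is not taken from the plugin source.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MethodAliasSketch {

    /** Returns the alias for methodName if it fully matches methodPattern, else empty. */
    static Optional<String> alias(String methodName, String methodPattern, String methodAlias) {
        Matcher matcher = Pattern.compile(methodPattern).matcher(methodName);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        String alias = matcher.replaceAll(methodAlias);
        if (alias.isEmpty()) {
            return Optional.empty();
        }
        // assumption: aliases start with a lower-case letter, e.g. Widget -> widget
        return Optional.of(Character.toLowerCase(alias.charAt(0)) + alias.substring(1));
    }

    public static void main(String[] args) {
        System.out.println(alias("getWidget", "[gs]et(.+)", "$1"));
        System.out.println(alias("deleteWidget", "[gs]et(.+)", "$1"));
    }
}
```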

Javadoc Options

In addition to the options listed above, Javadoc parsing can also be customized using the following options:
  • excludePackages - regex for excluding API base classes by package name, the default value is javax?\.lang.*
  • excludeClasses - regex for excluding API base classes by class name, all classes from the hierarchy above the matching class are ignored for adding methods to generated model
  • includeMethods - regex for methods to be included in the URI mapping model
  • excludeMethods - regex for excluding method from the model
  • includeStaticMethods - should static methods be included in the model

Use Cases

The component framework has been used to model and handle a number of different APIs and styles so far. 

Paypal

The Paypal component, written by my colleague Paolo Antinori, demonstrates how the API component framework dramatically reduces the amount of code that has to be written and managed for a feature rich component. The component essentially exposes all Paypal APIs from their SDK with a minimal amount of hand written code.

Box.com

This component shows the third party Box.com Java SDK being modeled and invoked using the API framework. It also shows how the framework can be easily adapted to write a custom polling consumer that supports Box.com's long polling API.

LinkedIn

This component demonstrates how a simple WADL for the LinkedIn REST service and CXF wadltojava plugin can be used to generate an API that can be wrapped using the API component framework to produce a fully functional component that supports most, if not all, LinkedIn REST endpoints. 
This approach can be easily repeated to create a Camel component for any SaaS product or platform. 

Google Drive

This component demonstrates how the API component framework can even handle Method Object style Google APIs, where URI options are mapped to a method object, which is then invoked in an overridden doInvoke() method in consumer and producer. 
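The Method Object style can be illustrated without any Google dependency. In the sketch below, ListRequest is a hypothetical method object configured through setters and then executed; the invoke() helper applies each URI option through the matching setter before calling execute(). This only illustrates the pattern and is not the code in the Google Drive component's doInvoke().

```java
import java.lang.reflect.Method;
import java.util.Map;

final class MethodObjectSketch {

    /** Hypothetical method object: configured through setters, then executed. */
    public static final class ListRequest {
        private String query = "";
        private Integer maxResults = 10;

        public ListRequest setQuery(String query) {
            this.query = query;
            return this;
        }

        public ListRequest setMaxResults(Integer maxResults) {
            this.maxResults = maxResults;
            return this;
        }

        public String execute() {
            return "list q=" + query + " max=" + maxResults;
        }
    }

    /** Applies each option via the matching setXxx method, then executes the request. */
    static String invoke(ListRequest request, Map<String, Object> options) {
        try {
            for (Map.Entry<String, Object> option : options.entrySet()) {
                String name = option.getKey();
                String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                // simplification: the setter parameter type must equal the value's runtime type
                Method method = ListRequest.class.getMethod(setter, option.getValue().getClass());
                method.invoke(request, option.getValue());
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Unsupported option", e);
        }
        return request.execute();
    }

    public static void main(String[] args) {
        System.out.println(invoke(new ListRequest(), Map.of("query", "camel", "maxResults", 5)));
    }
}
```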

Olingo2

This component demonstrates how even a callback based Asynchronous API can be leveraged using the API component framework. This example shows how asynchronous processing can be pushed into underlying resources like HTTP NIO connections to make Camel endpoints more resource efficient. 
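The general shape of wrapping a callback based API can be sketched with the JDK alone. Here ResponseHandler and AsyncClient are hypothetical stand-ins for a callback style client, and CompletableFuture stands in for whatever completion mechanism the caller uses; the actual component integrates with Camel's own asynchronous processing rather than with futures. The calling thread never blocks while the client does its work.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncCallbackSketch {

    /** Hypothetical callback interface of an asynchronous client API. */
    interface ResponseHandler<T> {
        void onResponse(T response);

        void onException(Exception e);
    }

    /** Hypothetical asynchronous client; a real one would use non-blocking I/O. */
    static final class AsyncClient {
        void read(String resource, ResponseHandler<String> handler) {
            CompletableFuture.runAsync(() -> {
                if (resource.isEmpty()) {
                    handler.onException(new IllegalArgumentException("empty resource"));
                } else {
                    handler.onResponse("feed:" + resource);
                }
            });
        }
    }

    /** Adapts the callback style call to a future that completes when the callback fires. */
    static CompletableFuture<String> readAsync(AsyncClient client, String resource) {
        CompletableFuture<String> future = new CompletableFuture<>();
        client.read(resource, new ResponseHandler<String>() {
            @Override
            public void onResponse(String response) {
                future.complete(response);
            }

            @Override
            public void onException(Exception e) {
                future.completeExceptionally(e);
            }
        });
        return future;
    }

    public static void main(String[] args) {
        System.out.println(readAsync(new AsyncClient(), "Products").join());
    }
}
```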

Epilogue

As must be evident by now, the Camel API Component Framework is a powerful micro framework built on top of the Camel component classes. It allows component developers to develop components quickly and efficiently, sometimes in the order of one or two weeks, compared to the weeks on end previously spent painfully adding one API/feature at a time.

Another potential use case for the framework is for users of CXF based REST or SOAP services. Although the Camel CXF component is quite capable, it is unable to handle parameter names since CXF does not expose them in its Camel endpoints. This forces users to resort to Lists of un-named values for CXF endpoint parameters. 
By generating a CXF client proxy for an existing REST/SOAP endpoint, and then wrapping it using the API Component Framework, end users can quickly create custom components for invoking their services with named, type checked options. This not only makes endpoints more readable, it could also significantly cut down the amount of time wasted testing and fixing issues with unnamed parameter lists.

The API Component Framework will be available in Apache Camel 2.14.0, and the framework and components will be part of the JBoss Fuse 6.2 release coming soon. 

Hopefully, this post inspires you to take a look at this framework, and to contribute more feature-rich components to Apache Camel for the hundreds of APIs out there.