Sunday, September 28, 2014

JBoss Fuse and JBoss Data Virtualization Integration using OData2/Olingo2 Camel Component

I have recently contributed an OData 2.0 component based on the Apache Olingo2 library to Apache Camel. The component supports all OData operations listed below:

  1. Reading EDM, Service Document, Entity Feed
  2. Creating, reading, updating, merging, patching, deleting Entity Records
  3. Reading, updating complex and simple properties
  4. Reading, updating, deleting relationship links
  5. Performing all the above operations as a sequence in a single Batch operation
Thanks to the efforts of my Red Hat colleague Ted Jones, I was able to resolve some interoperability issues in the component we found when testing it against JBoss Data Virtualization OData server. As a result the component is now fully interoperable with JBoss DV. 


This opens up another avenue for integrating JBoss Fuse and JBoss DV in addition to existing methods like the Camel SQL component. I was able to put this component together so quickly thanks to the Apache Camel API Component Framework I have blogged about earlier.

Unfortunately, some of the fixes from this effort did not make it into the Apache Camel 2.14.0 release, but they will be included in the JBoss Fuse 6.2 release coming soon.

I will be posting a demo soon that shows how the new Olingo2 component can be used to integrate JBoss Fuse and JBoss Data Virtualization to combine the awesome power of both platforms. 

Thursday, September 11, 2014

Apache Camel API Component Framework or: How I learned to stop worrying and love writing feature rich Camel Components

Prologue

Apache Camel is a wonderful routing and mediation integration framework with support for a large number of protocols, products and technologies. All Camel components expose functionality through Endpoint URIs for message producers and consumers in a route. In other words, component functionality is mapped to and from URIs. Parts of the URI path and its options or arguments become parameters for invoking component behavior.

Writing a Camel component is easily done using a Maven archetype:

mvn archetype:generate \
-DarchetypeGroupId=org.apache.camel.archetypes \
-DarchetypeArtifactId=camel-archetype-component \
-DarchetypeVersion=<camel-version> \
-DgroupId=myGroupId \
-DartifactId=myArtifactId

Most of the initial effort in writing the component boils down to implementing the component interfaces that hook into the Camel framework, allowing it to be invoked by the framework when message exchanges need to be processed or produced from consumer endpoints. The rest of the work is component specific, and involves mapping URI parts to component functionality.

One quickly starts to realize that starting to write a component is the easy part. Making it feature rich is time consuming and sometimes hard. It involves mapping each and every feature and its parameters to URIs and invoking some functionality wrapped by the component, such as an underlying transport, library, SDK, etc. In the past an alternative to putting in a lot of upfront component development effort was to let user demand and contributions drive component maturity. But that left initial versions of components somewhat lacking. 

After having written a few components that wrapped some sort of third party client library/SDK I realized that there was a method to the madness that is writing such Camel components. The idea is at its core quite simple, which is what makes it as flexible and useful as it has turned out to be in writing a number of recently added Camel components. 

Madness

Every Camel component that wraps some kind of third party client API/SDK has to essentially do the same thing. Map URIs and arguments (optionally Exchange headers) to some Java method invocation implemented by some Client proxy. There may be multiple client proxy interfaces/classes and multiple methods involved. 
Of course depending upon how cleanly or amateurishly the client API was designed, not all the methods may be exposed, and not all the arguments may need to be exposed or have been aptly named. Some arguments may have values computed by the component, and may represent some sort of conversational state (such as an authorization token), so they may not be exposed as URIs. 
Despite all these special cases, there is still a fairly good amount of unskilled labor involved in mapping URIs to methods and arguments. 
For components with a large amount of functionality and hence many APIs (interfaces/classes, methods, or both), this unskilled labor is precisely what takes up a huge amount of time in component development.

Method

As must be evident by now, the solution is basically runtime reflection and invocation of APIs, mapping URI paths and arguments to methods (Class and method names) and method arguments. Mapping paths to method names is easy, since method names are preserved at runtime in Java. However, argument names are removed, so any tool that wants to do the mapping from URI's named arguments and exchange name-value headers has no access to them. 

That's where Javadoc or simple signature text files fill in the missing piece in the puzzle. Maven repositories have publicly available Javadoc for client APIs. Even if one writes a custom client API, it's trivial to add the Maven Javadoc plugin to the project to generate one. Since Javadoc HTML is standard (there are slight variations between SDK versions, but nothing that can't be accounted for in a parser) it can be easily parsed to pull parameter names. 
In situations where the Javadoc is not available or too complex, a simple text file with method signatures and parameter names can be provided. 

After this information has been parsed, a model can be generated that can be instantiated and used at runtime to map URIs to methods. 
The beauty of this method is that it's completely independent of the number of API classes and methods being invoked. Also, if the API changes, the model can be quickly regenerated and a new component version released.
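To make the idea concrete, here is a minimal sketch of reflective invocation driven by externally supplied argument names. It is not the framework's generated model: the HelloApi class, the ARG_NAMES table and the invoke() helper are all hypothetical names made up for illustration, and the argument names are hard-coded where the real tooling would read them from Javadoc or a signature file.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReflectiveInvokerSketch {

    /** Hypothetical API proxy standing in for a third party client class. */
    public static class HelloApi {
        public String greet(String name, int times) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < times; i++) {
                sb.append("Hello ").append(name).append(' ');
            }
            return sb.toString().trim();
        }
    }

    /** Argument names per method, as they might be recovered from Javadoc or a signature file. */
    static final Map<String, List<String>> ARG_NAMES = new LinkedHashMap<>();
    static {
        ARG_NAMES.put("greet", Arrays.asList("name", "times"));
    }

    /** Finds the method by name and fills its arguments, in order, from the named options. */
    public static Object invoke(Object proxy, String methodName, Map<String, Object> options)
            throws Exception {
        List<String> names = ARG_NAMES.get(methodName);
        if (names == null) {
            throw new IllegalArgumentException("Unknown method: " + methodName);
        }
        for (Method m : proxy.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == names.size()) {
                Object[] args = new Object[names.size()];
                for (int i = 0; i < args.length; i++) {
                    args[i] = options.get(names.get(i));
                }
                return m.invoke(proxy, args);
            }
        }
        throw new NoSuchMethodException(methodName);
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("name", "Camel");
        options.put("times", 2);
        System.out.println(invoke(new HelloApi(), "greet", options));
    }
}
```

The sketch ignores overloads with the same argument count and any type conversion, both of which a real model would have to handle.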

The model can also be used to automatically generate skeleton integration testing code and even documentation for the complex feature rich components being developed as a result of this framework.

All the component developer now needs to do is describe how URI paths map to API classes in a simple but powerful XML configuration and write a small amount of code to instantiate and manage client proxy objects that are used to invoke reflected methods. 

Framework Features

Tooling

The API component framework is used to generate a component using the Camel maven archetype:
mvn archetype:generate \
-DarchetypeGroupId=org.apache.camel.archetypes \
-DarchetypeArtifactId=camel-archetype-api-component \
-DarchetypeVersion=2.14-SNAPSHOT \
-DgroupId=org.apache.camel.component.example \
-DartifactId=camel-example \
-Dname=Example \
-Dscheme=example \
-Dversion=1.0-SNAPSHOT \
-DinteractiveMode=false

This generates a skeleton component project preconfigured with a couple of simple hello world style API classes. The generated project can actually be compiled, and produces integration tests that can be run using JUnit.
This archetype generated project is configured to use the camel-api-component-maven-plugin to generate API model code when the component classes are built. This plugin's configuration has to be edited to point to the user's API to be used. 

The skeleton project has the following structure:
./camel-example # root project pom
./camel-example/camel-example-api # sample API classes, configured to produce Javadoc referenced by the component
./camel-example/camel-example-component # component classes and pom configured to use camel-api-component-maven-plugin

The api module can be dropped and the component module moved up to root level in cases where the API classes are provided by a third party or product wrapped by the Camel component. 

Component Configuration

The API framework maps URIs of the form:

scheme://endpoint-prefix/endpoint?optionName1=optionValue1...&optionNameN=optionValueN

Here, scheme comes from the archetype command, endpoint prefixes map to API names (classes or interfaces), endpoints map to method names, and optionNames map to method argument names from Javadoc or a simple text file with method signatures.
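As a rough illustration of how such a URI breaks down, the following sketch splits an endpoint URI into scheme, API name, method name and options using only java.net.URI. The class and field names, and the example URI, are assumptions for illustration; Camel's own URI handling is more involved, and query values are not URL-decoded or type-converted here.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class EndpointUriSketch {

    /** Pieces of a URI of the form scheme://endpoint-prefix/endpoint?options. */
    public static final class Parsed {
        public final String scheme;
        public final String apiName;
        public final String methodName;
        public final Map<String, String> options;

        Parsed(String scheme, String apiName, String methodName, Map<String, String> options) {
            this.scheme = scheme;
            this.apiName = apiName;
            this.methodName = methodName;
            this.options = options;
        }
    }

    public static Parsed parse(String uri) {
        URI u = URI.create(uri);
        // The endpoint prefix sits in the authority position, the endpoint in the path.
        String path = u.getPath();
        String methodName = path.startsWith("/") ? path.substring(1) : path;

        Map<String, String> options = new LinkedHashMap<>();
        String query = u.getQuery();
        if (query != null && !query.isEmpty()) {
            for (String pair : query.split("&")) {
                String[] kv = pair.split("=", 2);
                options.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
        return new Parsed(u.getScheme(), u.getAuthority(), methodName, options);
    }

    public static void main(String[] args) {
        Parsed p = parse("example://hello/greet?name=Camel&times=2");
        System.out.println(p.scheme + " " + p.apiName + " " + p.methodName + " " + p.options);
    }
}
```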
API names are mapped to Java proxy classes (either classes or interfaces) using the fromApis goal in camel-example-component/pom.xml. 
An example is shown next:
<goals>
<goal>fromApis</goal>
</goals>
<configuration>
<apis>
<api>
<apiName>comments</apiName>
<proxyClass>org.apache.camel.component.linkedin.api.CommentsResource</proxyClass>
<fromJavadoc/>
</api>
<api>
<apiName>companies</apiName>
<proxyClass>org.apache.camel.component.linkedin.api.CompaniesResource</proxyClass>
<fromJavadoc/>
</api>
</apis>
</configuration>

Prefixes are easily mapped to proxy classes. The fromJavadoc element shows that the code generator should look for Javadoc to be available in the maven provided scope. Alternatively, the signature may come from a text file path provided in a fromSignatureFile element. 
In addition to the mapping from endpoint prefixes to proxy classes, the code generation can be controlled through a number of configuration options that are described later. 

Component Lifecycle

In addition to configuring the API mapping, the component developer has to create and manage instances of proxy classes in the ExampleEndpoint class. The component may require custom proxy creation parameters or properties, such as user credentials or connection properties like URLs for external resources. These properties must be added to camel-example/camel-example-component/src/main/java/org/apache/camel/component/example/ExampleConfiguration.java.
Endpoint configuration properties for method arguments are generated in classes that extend ExampleConfiguration. 
Sample code generated by the archetype can be easily modified to instantiate proxies in the afterConfigureProperties() method. The class should also override doStop() to do any resource cleanup, if required. 
Most of the Camel components I have written using this framework have also exploited the idea of creating a shared proxy instance in the component, which is shared across endpoints that use identical connection configuration in ExampleConfiguration. 
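The shared proxy idea can be sketched as a cache keyed by connection configuration. This is only an illustration of the pattern, not code from any of the components: ConnectionConfig, ClientProxy and proxyFor() are hypothetical names, and a real component would key on whatever connection properties its configuration class holds.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class SharedProxySketch {

    /** Hypothetical connection settings, standing in for fields of a configuration class. */
    public static final class ConnectionConfig {
        final String url;
        final String user;

        public ConnectionConfig(String url, String user) {
            this.url = url;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConnectionConfig)) {
                return false;
            }
            ConnectionConfig other = (ConnectionConfig) o;
            return Objects.equals(url, other.url) && Objects.equals(user, other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(url, user);
        }
    }

    /** Hypothetical client proxy that would wrap a connection to the remote service. */
    public static final class ClientProxy {
        final ConnectionConfig config;

        ClientProxy(ConnectionConfig config) {
            this.config = config;
        }
    }

    private final Map<ConnectionConfig, ClientProxy> proxies = new ConcurrentHashMap<>();

    /** Returns the shared proxy for this configuration, creating it on first use. */
    public ClientProxy proxyFor(ConnectionConfig config) {
        return proxies.computeIfAbsent(config, ClientProxy::new);
    }

    /** Drops all shared proxies; a real component would also release their resources. */
    public void stop() {
        proxies.clear();
    }
}
```

Endpoints with equal connection settings then receive the same proxy instance, while an endpoint with different credentials gets its own.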

API Configuration

The API mapping can be tweaked and enriched with a number of configuration options. These options can be specified at the global level under the apis element or per API under the api element. 

Parameter Substitution

Unless the API being wrapped has been cleanly designed, there can sometimes be conflicts in method argument names and types. Since an endpoint configuration class is generated per proxy class with properties that map to argument names for all methods in that class, the class becomes a namespace for all arguments across methods. So arguments that share a name across methods MUST have the same type, and argument names should not be confusingly similar. There may be other reasons, too, why the component developer wants to rename or modify parameter names.
The framework supports parameter substitutions as shown next:
<fromSignatureFile>signatures/file-sig-api.txt</fromSignatureFile>
<!-- Use substitutions to manipulate parameter names and avoid name clashes -->
<substitutions>
<substitution>
<method>^.+$</method>
<argName>^(.+)$</argName>
<argType>java.lang.String</argType>
<replacement>$1Param</replacement>
<replaceWithType>false</replaceWithType>
</substitution>
</substitutions>
The elements method and argName are regular expressions and may match all or part of the method name and argument name respectively. The argType is the class name regex to match, and may be omitted. The replacement text is the required text used to substitute the argument name. If the replaceWithType flag is set to true (false by default) the replacement text uses $ variables from the argType regex instead of the argName regex.
With all these options, it's easy to substitute any ugly argument name with a pretty argument name as the component developer desires.
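The renaming itself is plain java.util.regex substitution. The sketch below is a hypothetical rename() helper, not the plugin's code. It assumes the argName pattern carries a capture group, since $1 in the replacement refers to group 1 and Java throws an IndexOutOfBoundsException when the pattern has no such group. Whole-string matching for method and argType is also an assumption.

```java
import java.util.regex.Pattern;

public class SubstitutionSketch {

    // Values modeled on the substitution element; the capture group in ARG_NAME is required by $1.
    static final String METHOD = "^.+$";
    static final String ARG_NAME = "^(.+)$";
    static final String ARG_TYPE = "java.lang.String";
    static final String REPLACEMENT = "$1Param";

    /** Returns the substituted argument name, or the original name if the rule does not apply. */
    public static String rename(String methodName, String argName, String argType) {
        if (!Pattern.matches(METHOD, methodName)) {
            return argName;
        }
        if (!Pattern.matches(ARG_TYPE, argType)) {
            return argName;
        }
        return argName.replaceAll(ARG_NAME, REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(rename("greet", "name", "java.lang.String")); // nameParam
        System.out.println(rename("greet", "count", "int"));             // count
    }
}
```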

Exclude Argument by Name

Arguments can be excluded by name using a regex for option excludeConfigNames. 

Exclude Argument by Type

Arguments can be excluded by type using a regex for option excludeConfigTypes. 
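Both exclusions amount to filtering the argument list with a regex. Here is a small sketch of that filtering, with a hypothetical filter() helper and made-up argument names and types. Whether the plugin matches the whole string or only part of it is not stated above, so whole-string matching is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class ExcludeArgsSketch {

    /**
     * Keeps only the arguments whose name and type do not match the exclusion regexes.
     * Keys are argument names, values are fully qualified type names; either regex may be null.
     */
    public static Map<String, String> filter(Map<String, String> argTypesByName,
                                             String excludeConfigNames,
                                             String excludeConfigTypes) {
        Pattern names = excludeConfigNames == null ? null : Pattern.compile(excludeConfigNames);
        Pattern types = excludeConfigTypes == null ? null : Pattern.compile(excludeConfigTypes);

        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> arg : argTypesByName.entrySet()) {
            if (names != null && names.matcher(arg.getKey()).matches()) {
                continue; // excluded by name
            }
            if (types != null && types.matcher(arg.getValue()).matches()) {
                continue; // excluded by type
            }
            kept.put(arg.getKey(), arg.getValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> argTypes = new LinkedHashMap<>();
        argTypes.put("accessToken", "java.lang.String");
        argTypes.put("listener", "com.example.Callback");
        argTypes.put("name", "java.lang.String");
        System.out.println(filter(argTypes, "accessToken", "com\\.example\\..*"));
    }
}
```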

Custom/Extra Arguments

Custom endpoint URI options can be added to the API proxy mapping using the option extraOptions like so:
<!-- Add custom endpoint options to generated EndpointConfiguration class for this API -->
<extraOptions>
<extraOption>
<type>java.util.List&lt;String&gt;</type>
<name>customOption</name>
</extraOption>
</extraOptions>
This extra option will usually be intercepted by the component in the overridden method interceptProperties() in the endpoint, consumer and/or producer. These extra options may be used to compute/generate a complex value for another method argument excluded using the options mentioned earlier.

Nullable Options

Some method arguments in APIs can be null. These argument names can be provided in a list of nullableOption elements as:

<!-- Add nullable endpoint options for this API -->
<nullableOptions>
<nullableOption>option1</nullableOption>
<nullableOption>option2</nullableOption>
</nullableOptions>

Method Aliases

APIs sometimes use common naming conventions for methods, which can be exploited to create shorthand aliases for endpoints. For example, the common get/set Java bean property method pattern can be used to generate aliases for endpoints as:

<!-- Use method aliases in endpoint URIs, e.g. support 'widget' as alias for getWidget or setWidget -->
<aliases>
<alias>
<methodPattern>[gs]et(.+)</methodPattern>
<methodAlias>$1</methodAlias>
</alias>
</aliases>
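A short sketch of how such a pattern yields an alias follows, using a hypothetical alias() helper. Lowercasing the first letter is an assumption made here so that the result matches the 'widget' in the comment above; the pattern and the $1 replacement come from the configuration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MethodAliasSketch {

    // methodPattern from the alias configuration; methodAlias "$1" is capture group 1.
    private static final Pattern METHOD_PATTERN = Pattern.compile("[gs]et(.+)");

    /** Returns the alias for a method name, or null when the pattern does not match. */
    public static String alias(String methodName) {
        Matcher m = METHOD_PATTERN.matcher(methodName);
        if (!m.matches()) {
            return null;
        }
        String captured = m.group(1);
        // Assumption: the alias starts with a lowercase letter, e.g. Widget becomes widget.
        return Character.toLowerCase(captured.charAt(0)) + captured.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(alias("getWidget"));   // widget
        System.out.println(alias("setWidget"));   // widget
        System.out.println(alias("listWidgets")); // null
    }
}
```

Since getWidget and setWidget share one alias, the two methods would have to be told apart by the arguments supplied.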

Javadoc Options

In addition to the options listed above, Javadoc parsing can also be customized using the following options:
  • excludePackages - regex for excluding API base classes by package name, the default value is javax?\.lang.*
  • excludeClasses - regex for excluding API base classes by class name, all classes from the hierarchy above the matching class are ignored for adding methods to generated model
  • includeMethods - regex for methods to be included in the URI mapping model
  • excludeMethods - regex for excluding method from the model
  • includeStaticMethods - should static methods be included in the model

Use Cases

The component framework has been used to model and handle a number of different APIs and styles so far. 

Paypal

The Paypal component written by my colleague Paolo Antinori demonstrates how the API component framework dramatically reduces the amount of code that has to be written and managed for a feature rich component. The component essentially exposes all Paypal APIs from their SDK with a minimal amount of hand written code.

Box.com

This component shows the third party Box.com Java SDK being modeled and invoked using the API framework. It also shows how the framework can be easily adapted to write custom consumer polling to support Box.com's long polling API.

LinkedIn

This component demonstrates how a simple WADL for the LinkedIn REST service and CXF wadltojava plugin can be used to generate an API that can be wrapped using the API component framework to produce a fully functional component that supports most, if not all, LinkedIn REST endpoints. 
This approach can be easily repeated to create a Camel component for any SaaS product or platform. 

Google Drive

This component demonstrates how the API component framework can even handle Method Object style Google APIs, where URI options are mapped to a method object, which is then invoked in an overridden doInvoke() method in consumer and producer. 

Olingo2

This component demonstrates how even a callback based Asynchronous API can be leveraged using the API component framework. This example shows how asynchronous processing can be pushed into underlying resources like HTTP NIO connections to make Camel endpoints more resource efficient. 

Epilogue

As must be evident by now, the Camel API Component Framework is a powerful micro framework built on top of the Camel component classes. It allows component developers to develop components quickly and efficiently, sometimes in one or two weeks, compared to the many weeks previously spent painfully adding one API/feature at a time.

Another potential use case for the framework is for users of CXF based REST or SOAP services. Although the Camel CXF component is quite capable, it is unable to handle parameter names since CXF does not expose them in its Camel endpoints. This forces users to resort to Lists of un-named values for CXF endpoint parameters. 
By generating a CXF client proxy for an existing REST/SOAP endpoint, and then wrapping it using the API Component Framework, end users can quickly create custom components for invoking their services with named, type-checked options. This not only makes endpoints more readable, it could also significantly cut down the amount of time wasted testing and fixing issues with unnamed parameter lists.

The API Component Framework will be available in Apache Camel 2.14.0, and the framework and components will be part of the JBoss Fuse 6.2 release coming soon. 

Hopefully, this post inspires you to take a look at this framework, and contribute more feature-rich components to Apache Camel for the hundreds of APIs out there.

Sunday, May 18, 2014

Scalable IoT integration using Apache ActiveMQ and MQTT

I have been doing a lot of work on MQTT support in Apache ActiveMQ recently, starting with hardening and adding support for MQTT 3.1.1 in ActiveMQ for the MQTT Interop Day Event I mentioned in a previous post.

I like MQTT as a simple protocol for IoT. It's easy to implement in devices, and is not overly complicated as protocols go. However, as an experienced JMS architect, the first thing that struck me is that it uses the publish-subscribe model. And as I expected AT_LEAST_ONCE and EXACTLY_ONCE subscriptions in MQTT are mapped to durable subscriptions in ActiveMQ.

This means MQTT consumers are limited to creating a single subscription with a fixed client-id for those QoS levels if they don't want to deal with duplicates. Essentially, MQTT has the same limitation as JMS Topics when it comes to scaling consumers.

If you aren't already familiar with it, the ActiveMQ documentation describes the issue in more detail. The documentation there also describes the ActiveMQ Virtual Topics feature to solve this problem using logical Topics which are mapped to physical Queues. Messages on these Queues can then be load balanced across multiple connections and consumers without having to worry about duplicates.

Compared to durable subscriptions, Queues also make management and monitoring easier. For instance, monitoring tools can be used to raise an alert when Queue size becomes too large, signaling that messages are piling up in the Broker. This alert could also be used to create more consumer process instances, etc. Apache Camel JMS endpoints, and other JMS utilities such as JMS listeners in Spring Framework can automatically increase or decrease the number of consumers based on demand.

The same documentation page also describes ActiveMQ's Composite Destinations feature for routing messages, which can come in handy as a wire-tap for audit logs, etc.

I have recently submitted a couple of major fixes in AMQ-5160 and AMQ-5187. AMQ-5160 started as an issue with wildcard authorization in ActiveMQ, which I first fixed for non-retained messages. Dejan Bosanac's suggestion of using Subscription Recovery Policy for retained messages, together with my fix for AMQ-5187, now makes it possible to do what I wanted to be able to do early on when I started fixing issues in ActiveMQ MQTT transport, i.e. process MQTT messages using Virtual Topics.

Also, the fix for AMQ-5160 basically adds Retained Messages as a Broker level feature in ActiveMQ. So non-MQTT Topic clients can set the ActiveMQ.Retain boolean property to true to mark a message to be retained in the Topic, and the Broker sets the boolean property ActiveMQ.Retained to true to mark a message as having been recovered as a retained message in a Topic. Note that the Broker always uses RetainedMessageSubscriptionRecoveryPolicy and any user supplied policies are simply added to retained message recovery. So, the user doesn't have to do anything special in the configuration for retained message support.

Retained messages work for mapped JMS Queues by recovering the retained message from the Virtual Topic for the first Queue consumer, so there are no duplicate recovered messages. The retained message will have the property ActiveMQ.Retained set to true.

The patches are awaiting further testing and validation and should be applied to ActiveMQ trunk soon, to be included in the 5.10 release.

The highly scalable MQTT solution basically consists of MQTT producers sending messages using the MQTT protocol to ActiveMQ Virtual Topics, which are configured trivially using name patterns. These Virtual Topics are mapped to Queue names used by regular ActiveMQ Java JMS consumers. MQTT messages are mapped to JMS BytesMessages. Java developers should be happy to be able to use their favorite language on the server/consumer side.
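To show how the names line up, here is a small sketch of the naming involved. It assumes ActiveMQ's default Virtual Topic conventions (topics named VirtualTopic.>, consumer queues named Consumer.*.VirtualTopic.>) and the MQTT transport's translation of '/' to '.' and of the '+' and '#' wildcards to '*' and '>'. The helper names are hypothetical, and topic names containing literal dots are not handled.

```java
public class VirtualTopicNamesSketch {

    /** Translates an MQTT topic or filter into the equivalent ActiveMQ destination name. */
    public static String toActiveMqName(String mqttTopic) {
        StringBuilder sb = new StringBuilder(mqttTopic.length());
        for (char c : mqttTopic.toCharArray()) {
            switch (c) {
                case '/': sb.append('.'); break; // level separator
                case '+': sb.append('*'); break; // single level wildcard
                case '#': sb.append('>'); break; // multi level wildcard
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the queue name a JMS consumer group reads from, using the default naming. */
    public static String consumerQueue(String consumerName, String virtualTopic) {
        return "Consumer." + consumerName + "." + virtualTopic;
    }

    public static void main(String[] args) {
        // An MQTT producer publishes to VirtualTopic/sensors/temp ...
        String topic = toActiveMqName("VirtualTopic/sensors/temp");
        // ... and any number of JMS consumers share the load on this queue.
        System.out.println(consumerQueue("analytics", topic));
    }
}
```

Each distinct consumer name gets its own queue with a full copy of the messages, while consumers sharing a name compete for messages on the same queue.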

Although the ActiveMQ Broker completely manages the QoS flow with the MQTT producer, the JMS BytesMessage will have the property ActiveMQ.MQTT.QoS set to sender's QoS. The JMS consumer does not have to do anything special with it, besides the standard JMS message acknowledgement. This property can also be used by JMS producers as the MQTT QoS for MQTT consumers. Also, JMS consumers can use JMS transactions to include other transactional resources such as databases, either using Idempotent Consumers or in the worst case, XA transactions.

Hopefully, users will have as much fun using these new capabilities in Apache ActiveMQ as I have had developing them. Cheers and good luck with your super scalable MQTT deployments with ActiveMQ. 

Thursday, April 3, 2014

MQTT 3.1.1 support in JBoss A-MQ 6.1, Apache ActiveMQ 5.10-SNAPSHOT and Apache Camel 2.13.0

I recently had the good fortune of attending the MQTT Interoperability Test Day during EclipseCon in Burlingame, California on March 17th, 2014. The event was held by the Eclipse Foundation and the Eclipse IoT Working Group.

Its aim was to prove the maturity of the MQTT 3.1.1 spec by demonstrating industry adoption and interoperability among products that support it, and to potentially iron out any issues in the spec that might show up through the exercise. MQTT is a key protocol for the rapidly growing IoT approach. If you haven't heard of MQTT before you should definitely check out http://mqtt.org/.

At the event I was representing Red Hat Inc. and its JBoss A-MQ 6.1 (Early Access build 367) product, Apache ActiveMQ 5.10-SNAPSHOT, and Apache Camel 2.13.0. As part of the exercise Ian Craggs from the Eclipse Paho team had built a very useful mock client and server to check compliance with the draft MQTT 3.1.1 spec. That Python kit proved very valuable to the Fuse team and me in helping find and address several issues in Apache ActiveMQ's MQTT protocol implementation. More information on Ian's test kit can be found at https://wiki.eclipse.org/Paho/MQTT_Interop_Testing_Day.

As a result of all the testing and fixes, the MQTT implementation in ActiveMQ has improved by leaps and bounds. I also wrote test Java clients and server using the Fuse mqtt-client library, Apache Camel and the Apache ActiveMQ broker in JBoss A-MQ 6.1. The test client I wrote mirrors tests executed by Ian's Python client. It verified compliance with several key improvements in MQTT 3.1.1 listed below:
  1. Basic publish subscribe
  2. Retained messages
  3. Offline message queueing
  4. Will messages for client disconnects
  5. Overlapping subscriptions with MQTT wildcards
  6. Connection keep alive
  7. Redelivery of messages on reconnect
  8. Zero length client id (optional)
  9. Dollar topics (optional)
  10. Subscription failure (optional)
The improvements in ActiveMQ MQTT support were demonstrated when both the client and server passed all the above tests with flying colors when tested against the Python compliance test client and server as well as several other MQTT products tested for interoperability during the Interop Test Day. Since I implemented several fixes in the MQTT transport for ActiveMQ as well as some critical fixes in the Broker for supporting MQTT topics and wildcards, I can state that there is only one spec requirement (MQTT-3.1.4-2) that isn't supported at the moment. 

That requirement is questionable since it mandates that MQTT Brokers MUST disconnect an existing client connection when another connection sends a CONNECT packet with the same client id. This will cause issues in client libraries such as Fuse mqtt-client, which automatically reconnect when disconnected from the Broker. So ActiveMQ chooses to reject the new connection instead. 

The code for my Java JBoss A-MQ test clients and server configuration can be found at https://github.com/dhirajsb/jboss-fuse-mqtt-test. The instructions for running them are simple and found in the README.md files. 

All in all it was a very fruitful event, both for me personally, since I put in a lot of work to get all the MQTT issues fixed in Apache ActiveMQ, and for Red Hat, which can now proudly say that it supports the MQTT 3.1.1 spec in a soon to be GA product, JBoss Fuse A-MQ 6.1.

Of course this wouldn't have been possible without the hard work by Ian Skerrett from the Eclipse Foundation in organizing the Interop Test Day, and all the help and support I received from my colleagues in the Fuse engineering team at Red Hat.

I hope you take the time to check out the Java test clients and Broker and have lots of fun using MQTT in your applications.