
Tag Archives: Events

Publishing Apache Avro messages on an Apache Kafka topic

In earlier posts I played around with both Apache Avro and Apache Kafka. The next goal was naturally to combine both and start publishing binary Apache Avro data on an Apache Kafka topic.


Generating Java from the Avro schema

I use the Avro schema “location.avsc” from my earlier post.

$ java -jar avro-tools-1.8.1.jar compile schema location.avsc .

This results in Location.java for our project.

/**
* Autogenerated by Avro
*
* DO NOT EDIT DIRECTLY
*/
package nl.rubix.avro;

import org.apache.avro.specific.SpecificData;
// ... and more stuff

Make sure we have the Maven dependencies right in our pom.xml:

<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>0.10.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <version>1.8.1</version>
    </dependency>
</dependencies>
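A side note: avro-maven-plugin is really a build plugin rather than a runtime dependency. Declared under <dependencies> it does no harm, but it can alternatively be wired into the build section to generate the Java sources automatically at compile time, so the manual avro-tools step is no longer needed. A rough sketch (the src/main/avro source directory is an assumption for this project layout, not taken from the original post):

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro-maven-plugin</artifactId>
      <version>1.8.1</version>
      <executions>
        <execution>
          <phase>generate-sources</phase>
          <goals>
            <goal>schema</goal>
          </goals>
          <configuration>
            <!-- assumption: schemas live in src/main/avro -->
            <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

With this in place, mvn generate-sources should produce the same Location.java as the avro-tools command above.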

We can now use the Location object in Java to build our binary Avro message.

public ByteArrayOutputStream GenerateAvroStream() throws IOException
{
    // Schema: the generated Location class already carries its Avro schema
    Schema s = Location.getClassSchema();
    System.out.println("Schema: " + s);

    // Write an Avro container file to the stream; the container format
    // embeds the schema as metadata along with the binary data.
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    DatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(s);
    DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<GenericRecord>(writer);
    dataFileWriter.create(s, outputStream);

    // Build AVRO message
    Location location = new Location();
    location.setVehicleId(new org.apache.avro.util.Utf8("VHC-001"));
    location.setTimestamp(System.currentTimeMillis() / 1000L);
    location.setLatitude(51.687402);
    location.setLongtitude(5.307759);
    System.out.println("Message location " + location.toString());

    dataFileWriter.append(location);
    dataFileWriter.close();
    System.out.println("Encode outputStream: " + outputStream);

    return outputStream;
}
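The container-file approach above ships the full schema header with every message, which is convenient but adds per-message overhead. For comparison, a raw binary encoding writes only the record bytes, so producer and consumer must agree on the schema out of band. A rough sketch of that alternative; the class and method names here are my own for illustration, not from the original project:

```java
package nl.rubix.avro;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;

public class RawLocationEncoder {
    public ByteArrayOutputStream generateRawAvro(Location location) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Raw binary encoding: no container header, so the schema is NOT embedded
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        SpecificDatumWriter<Location> writer = new SpecificDatumWriter<Location>(Location.class);
        writer.write(location, encoder);
        encoder.flush();
        return out;
    }
}
```

The receiving side then needs the same Location schema (or a compatible one) to decode the bytes, which is exactly the trade-off the container format avoids.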

With our ByteArrayOutputStream in hand we can start publishing it on an Apache Kafka topic.

public void ProduceKafkaByte()
{
    try
    {
        // Get the Apache AVRO message
        ByteArrayOutputStream data = GenerateAvroStream();
        System.out.println("Here comes the data: " + data);

        // Start KAFKA publishing
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        KafkaProducer<String, byte[]> messageProducer = new KafkaProducer<String, byte[]>(props);
        ProducerRecord<String, byte[]> producerRecord =
                new ProducerRecord<String, byte[]>("test", "1", data.toByteArray());
        messageProducer.send(producerRecord);
        messageProducer.close();
    }
    catch(IOException ex)
    {
        System.out.println("Well this error happened: " + ex.toString());
    }
}
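On the other side of the topic we can decode the messages again. Since each message is a complete Avro container file, the reader can take the schema straight from the embedded metadata. A sketch of what a consumer could look like; the group.id and the topic name "test" (matching the producer above) are assumptions, and this uses the Kafka 0.10 consumer API:

```java
package nl.rubix.avro;

import java.io.ByteArrayInputStream;
import java.util.Arrays;
import java.util.Properties;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LocationConsumer {
    public void consumeKafkaBytes() throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "location-consumers"); // assumption: any free group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<String, byte[]>(props);
        consumer.subscribe(Arrays.asList("test"));
        ConsumerRecords<String, byte[]> records = consumer.poll(10000);
        for (ConsumerRecord<String, byte[]> record : records) {
            // Each value is an Avro container file: the schema travels with the data,
            // so a GenericDatumReader needs no schema up front.
            DataFileStream<GenericRecord> reader = new DataFileStream<GenericRecord>(
                    new ByteArrayInputStream(record.value()),
                    new GenericDatumReader<GenericRecord>());
            for (GenericRecord location : reader) {
                System.out.println("Decoded location: " + location);
            }
            reader.close();
        }
        consumer.close();
    }
}
```

This is the mirror image of ProduceKafkaByte: ByteArraySerializer on the way in, ByteArrayDeserializer plus DataFileStream on the way out.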

When we subscribe to our topic we can see the byte stream cruising by:

INFO Processed session termination for sessionid: 0x157d8bec7530002 (org.apache.zookeeper.server.PrepRequestProcessor)
Objavro.schema#####ype":"record","name":"Location","namespace":"nl.rubix.avro","fields":[{"name":"vehicle_id","type":"string","doc":"id of the vehicle"},{"name":"timestamp","type":"long","doc":"time in seconds"},{"name":"latitude","type":"double"},{"name":"longtitude","type":"double"}],"doc:":"A schema for vehicle movement events"}##<##O#P#######HC-001#ڲ#
=######@#####;@##<##O#P#######016-10-18 19:06:24,005] INFO Expiring session 0x157d8bec7530005, timeout of 30000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2016-10-18 19:06:24,005] INFO Processed session termination for sessionid: 0x157d8bec7530005 (org.apache.zookeeper.server.PrepRequestProcessor)

All code is available on GitHub.


Posted by on 19-10-2016 in Uncategorized

 


Playing around with Apache Avro

When entering the world of Apache Kafka, Apache Spark and data streams, sooner or later you will find mention of another Apache project: Apache Avro. So ….

What is Apache Avro ?


Avro is a remote procedure call and data serialization framework developed within Apache’s Hadoop project (source: Wikipedia). It is much the same as Apache Thrift and Google Protocol Buffers. Probably the main reason Avro is gaining popularity is that Hadoop-based big data platforms natively support serialization and deserialization of data in Avro format. Avro schemas are defined in JSON, and messages can be sent in both JSON and binary format. When binary is used, the schema is sent together with the actual data.

Playing with Apache Avro from the command line

So first let’s create an Avro schema “location.avsc” for the data records:

{"namespace": "nl.rubix.avro",
  "type": "record",
  "name": "Location",
  "fields": [
    {"name": "vehicle_id", "type": "string", "doc" : "id of the vehicle"},
    {"name": "timestamp", "type": "long", "doc" : "time in seconds"},
    {"name": "latitude", "type": "double"},
    {"name": "longtitude",  "type": "double"}
  ],
  "doc:" : "A schema for vehicle movement events"
}

And we have this example file “location1.json” with a valid data record:

{"vehicle_id": "1", "timestamp": 1476005672, "latitude": 51.687402, "longtitude": 5.307759}

Working with the Avro tools

Download the latest version of the Avro tools (currently 1.8.1) from the Avro Releases page.

$ java -jar avro-tools-1.8.1.jar
Version 1.8.1 of Apache Avro
Copyright 2010-2015 The Apache Software Foundation

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
----------------
Available tools:
          cat extracts samples from files
      compile Generates Java code for the given schema.
       concat Concatenates avro files without re-compressing.
   fragtojson Renders a binary-encoded Avro datum as JSON.
     fromjson Reads JSON records and writes an Avro data file.
     fromtext Imports a text file into an avro data file.
      getmeta Prints out the metadata of an Avro data file.
    getschema Prints out schema of an Avro data file.
          idl Generates a JSON schema from an Avro IDL file
 idl2schemata Extract JSON schemata of the types from an Avro IDL file
       induce Induce schema/protocol from Java class/interface via reflection.
   jsontofrag Renders a JSON-encoded Avro datum as binary.
       random Creates a file with randomly generated instances of a schema.
      recodec Alters the codec of a data file.
       repair Recovers data from a corrupt Avro Data file
  rpcprotocol Output the protocol of a RPC service
   rpcreceive Opens an RPC Server and listens for one message.
      rpcsend Sends a single RPC message.
       tether Run a tethered mapreduce job.
       tojson Dumps an Avro data file as JSON, record per line or pretty.
       totext Converts an Avro data file to a text file.
     totrevni Converts an Avro data file to a Trevni file.
  trevni_meta Dumps a Trevni file's metadata as JSON.
trevni_random Create a Trevni file filled with random instances of a schema.
trevni_tojson Dumps a Trevni file as JSON.

Generate data record from JSON to Avro

$ java -jar avro-tools-1.8.1.jar fromjson --schema-file location.avsc location1.json > location.avro

The result is an output file “location.avro” with the Avro binary. The interesting thing about Avro is that it encapsulates both the schema and the content in its binary message.

Objavro.schema#####ype":"record","name":"Location","namespace":"nl.rubix.avro","fields":[{"name":"vehicle_id","type":"string","doc":"id of the vehicle"},{"name":"timestamp","type":"long","doc":"time in seconds"},{"name":"latitude","type":"double"},{"name":"longtitude","type":"double"}],"doc:":"A schema for vehicle movement events"}avro.codenull~##5############.1м##

Retrieving the JSON message from Avro data

$ java -jar avro-tools-1.8.1.jar tojson location.avro > location_output.json

Retrieving the Avro schema from Avro data

And because the schema is present in the data, we can retrieve the schema as well.

$ java -jar avro-tools-1.8.1.jar getschema location.avro > location_output.avsc


 

Posted by on 18-10-2016 in Uncategorized

 


Fusion Middleware Partner Community Forum XX

Oracle Partner Community Forum

During the 1st week of March I had the pleasure of visiting the beautiful city of Budapest, location of the 20th(!!) Oracle Fusion Middleware Partner Community Forum (#ofmForum). More than 180 delegates from more than 30 countries visited the Boscolo Hotel Conference Center to share, learn and network with other Fusion Middleware minded professionals. Oracle offered a great package for staying in the Boscolo hotel where the conference was held; however, in this era of “digital disruption” I decided to book an Airbnb apartment just 1 block from the hotel.

Boscolo hotel


 

Day 1:

Oracle ACE briefing
Due to the number of Oracle ACEs in Budapest, Jürgen Kress organized a sort of EMEA Oracle ACE briefing session, where the ACE community had the opportunity to talk to both keynote speakers Andrew Sutherland and Amit Zavery. The topics were current and near-future Oracle Fusion Middleware products and PaaS Cloud Services.

Oracle Platform as a service (PaaS)

After Jürgen’s welcome speech it was time for Oracle’s Amit Zavery to start the community forum with a keynote. His session explained the Oracle Fusion Middleware strategy, which (no surprise here) is cloud, cloud and some more cloud. Or as Amit stated himself: “Cloud is pervasive across everything we do at Oracle these days”. “Cloud Platform for the digital business” was a phrase used multiple times.

Oracle Fusion Middleware Update
Andrew Sutherland (Senior Vice President of Technology and Systems) took us on a trip explaining the digital disruption taking place in numerous markets at the moment. Think of websites/services like Airbnb, Spotify and Uber that are immediate threats to their established counterparts. Board rooms should be scared of digital disruption. Digital business needs both speed and agility. Modern business is all about real time, and we should avoid processes that depend on batches.

SOA Suite 12c & cloud platform
The session explained important elements of service integration with both the current Oracle SOA Suite 12c and the future roadmap for Oracle towards the cloud, including a sneak peek at the soon-to-be-released Integration Cloud Service (ICS). The session also allowed for a great presentation (and live demo) about Stream Explorer 12c by Oracle ACE Director Lucas Jellema.

BPM Suite 12c & cloud platform
With the release of Oracle BPM 12c some new features were added to allow better (web-based) modelling on the one hand and improve developer productivity on the other. Not strange, since the same web interface will be used for the soon-to-be-released Process Cloud Service. The session also gave us a sneak peek at the new Process Cloud Service, a fully self-service PaaS offering on the Oracle Public Cloud that allows for complete life-cycle management of processes.

Social Event @ Spoon Budapest
The (official) part of day 1 ended with a dinner hosted by Oracle at the Spoon restaurant boat on the Danube river: meeting old friends (and making new ones) while enjoying a fantastic dinner with a marvelous view of the Buda Castle.

Spoon Restaurant @ Budapest


 

Day 2:

API Management (SOA track)
Robert van Molken & Yogesh Sontakke presented the Oracle vision on API management. API management is about managing, discovering, governing, monitoring and supporting your business APIs (services). The presenters first showed the high-level positioning of products like Oracle API Catalog, Oracle API Gateway, Oracle API Manager & Oracle Service Bus. Later Robert went in depth, showing us in detail how to use the Oracle API Catalog with a live demo (very brave), and Yogesh did the same with the Oracle API Manager.

Oracle Business Activity Monitoring 12c (BPM track)
Mark Simpson explained that BAM is now adopted by the BPM team, which shows the strategic focus from Oracle. Oracle BAM 12c is a completely redesigned product that provides real-time insight into processes. Oracle BAM 12c has new features such as geomaps, multi-browser support (no more IE, yeah!), mobile enablement & better role-based security options. The session showed different key features in BAM to support 4 important sections of information: Operational Analytics, Business Analytics, Operational Intelligence & Strategic Analytics. Mark also handed us some dashboard d

Oracle ACM Implementation Best Practices (BPM track)
Presented by both Danilo Schmiedel & Andrejus Baranovskis, sharing their experiences implementing ACM at 3 different (insurance) companies. Danilo first started with the characteristics of unstructured processes and the requirements that knowledge workers place on the IT solution. Early-on case UI design is also essential to achieve insight into business information requirements. Needed for dynamic case management is the definition of your activities, milestones, rules, events, data & case stakeholders. Danilo also mentioned the ProM Tools for process-mining research and the extraction of knowledge about a (business) process from its process execution logs. Surely need to look at that. After the session there was a very valuable talk with Dirk Janssen, Harrie van Oosten & Danilo Schmiedel to share experiences with the Oracle BPM/ACM implementations in both Germany and the Netherlands.

Speaker & ACE Dinner
Finalized my stay in Budapest with a great diner in the hills of Buda.

 

 

Posted by on 12-03-2015 in Common, Events

 


Oracle BPM / ADF integration Best Practices (#oow)

Last OOW I visited the presentation of Danilo Schmiedel (blog & twitter) and Andrejus Baranovskis (blog & twitter) covering the topic “Oracle BPM & Oracle ADF integration best practices”. With my experience in BEA AquaLogic BPM 6 and Oracle BPM 10, the ADF integration in 11g was all new to me, so having just played around with it I was very interested in the best practices for combining these 2 Oracle technologies for rich user interfaces.

So last month we started our Oracle BPM 11g project, and during “sprint 0” we came to the conclusion that we would have long-running instances with a lot of human interaction, and discussed the look-and-feel possibilities with our team’s user experience expert. A good time to look back upon the best practices from both gentlemen. So we used their presentation and advice to discuss the option of creating a generic ADF human task handler.

Check out Danilo’s blog post about their presentation yourself and of course the presentation on SlideShare.



 

Posted by on 30-10-2012 in Events

 


Oracle Open World 2012 – it’s a wrap !

After 5 days with many sessions, keynotes, hands-on labs and even more parties, we can call it a week. It was fantastic meeting old and new colleagues/friends/partners and more.

Oracle Open World 2012 ended with a blast! While the Blue Angels made a fly-by above Yerba Buena Gardens, the Swedish garage rock band The Hives showed the Oracle crowd why they won awards for being the best live act.


Tomorrow we will return home, hoping to see you all again next year!

 

Posted by on 05-10-2012 in Events, Oracle

 


Oracle Open World 2012 Wednesday

The 4th day of the Oracle Open World conference, and today the number of sessions I visited was clearly decreasing.

CON7344 – Anatomy of a Scalable Oracle BPM Suite 11g project

The session gave a lot of tips and best practices for running a BPM project in general (not purely Oracle BPM as a product).

Typical issues you may run into during your project:

  • Weak solution design: one super composite with 40+ components, no versioning and archiving strategy
  • Weak preparation of business: task unattended when someone leaves, no process monitoring
  • Weak operational: missing fault handling, no end-to-end monitoring

Mitigate that risk in your project:

  • Sit and think before you build: distinguish micro-process vs enterprise grade processes: Solution architecture & design, define guidelines
  • Engage early with business: demo simple prototype, play sandbox, ask the right questions
  • Engage early with infra: layout infra, control with enterprise manager 12c

Process definition:

  • Identify human task details: in/out documents, in/out business terms, assignment patterns, role and mode, escalation rules, out-of-office policy
  • Identify business rules task details: in/out payload, decision table, rule manageability
  • Identify service task details: service and operation, in/out payload

SOA reference architecture. Lessons learned. No super composite.

Consistent development environment: same JDeveloper, same JDK, same parameters, build and deploy tools (Maven)

Creating human tasks:

  • Make sure to always decouple a human task’s complex payload from the process: use keys in the process and pass them to the human task. Do NOT import complex models (large XSDs) as human task payload; implement proper validation and leverage the validation callback.
  • When assigning task to a performer: always use group assignments mapped against authorization repository, or leverage oracle entitlement server.
  • Always use a process owner, so unattended tasks can be rerouted to the owner
  • Leverage mapped attributes for business convenience

Fault handling:

  • Types of BPMN exception: business exception (user-defined exception)
  • Fault policy: catch system exceptions
  • How to define: composite based, component based, reference based

Oracle Open World youtube channel

The keynotes and interviews from OOW 2012 are available on the official Oracle YouTube channel. One session that I missed but watched on this channel myself was quite interesting; the movie is called ‘Oracle Fusion Middleware Strategies Driving Business Innovation’.

Oracle Partner Network @ OOW

Visited the Oracle Partner Network lounge in Moscone South and talked with some other Oracle partners who also visited Oracle Open World this year.

Appreciation Event @ Treasure Island

The highlight of this year’s Oracle Open World was obviously the appreciation event held on Wednesday evening. As always the event was arranged perfectly, with massive amounts of people traveling to and from Treasure Island, and of course the huge amount of food and beverage being offered to all the attendees.

Kings of Leon:

and of course legendary Pearl Jam:

Pearl Jam closing the night with a fabulous tribute to The Who with their song Baba O’Riley !!!

 

 

Posted by on 04-10-2012 in Events, Oracle

 


Oracle Open World 2012 Tuesday

Not many sessions today, but I watched 2 keynotes, walked the demo grounds and attended only 1 Oracle BPM/ADF session, which was very good.

CON2787 – Oracle BPM & ADF Integration Best Practices

Very good session by Danilo Schmiedel and Andrejus Baranovskis explaining the best practices for the integration points between Oracle BPM and Oracle ADF. By default you can use the BPM Workspace application, or create a custom ADF application into which you import the BPM Workspace JAR libraries. The session started with an explanation of the complexity challenges both gentlemen faced in their project: a project with long-running processes, each with a huge number of human tasks and a load of over 300,000 instances/year.

To prevent creating huge amounts of ADF artifacts for each human task, they created a common data model for the human tasks and used standardization. The generic ADF human task handler then uses routing on security groups and generates/renders the correct options (buttons) for that task based upon an external configuration. I found this very interesting and need to look into it further when back home.

During the session some administrative reminders were made about the complex Oracle BPM environment as it existed in their project. “Housekeeping” was the term used, reminding us to monitor the filesystem, configure Enterprise Manager correctly, arrange proper monitoring, use JRockit Mission Control for tuning and manage the database growth (MDS / SOAINFRA). Make sure you think about purging, and read the Oracle Fusion Middleware administration guide about managing database growth.

KEY10724 – Oracle OpenWorld Keynote: Oracle and Infosys

The keynotes on Tuesday are something to look forward to. Due to the perfect weather in San Francisco we enjoyed watching them from the Yerba Buena Gardens. The keynotes started with Infosys and friends, but finally the big man himself came on stage. You can watch his keynotes on YouTube:

Oracle Benelux Party @ Ruby Sky bar

Ended the night in style in one of the hottest clubs in San Francisco, the Ruby Skye. The event was open to Oracle customers, partners and employees from the Benelux region. The party was made possible by the Oracle partners VXCompany and Qualogy.

 

Posted by on 03-10-2012 in Events

 
