Hi! Welcome...

I am Ashish. I am a member of the Apache MINA PMC, an ASF committer and an avid code hacker. Contact me at paliwalashish at gmail dot com. I work for Terracotta [http://www.terracotta.org] as a Solution Architect. I contribute to jsmpp [http://code.google.com/p/jsmpp/] as well.

23 November 2009 ~ 1 Comment

Getting Started with Terracotta


Abstract

In this short post we shall explore Terracotta, a leading pure Java scalability platform. The discussion is based on the AtomicInteger example from the Terracotta site, which shows how to implement a cluster-wide id generator (actually it’s the Sequencer example, but to keep my steps simple I used AtomicInteger). I chose this example because I had to implement a very similar requirement in a clustered J2EE application.
Terracotta is well known and needs no introduction :-)

About the Problem Statement

I needed a simple solution that provides
• Cluster-wide unique ids
• Less frequent access to these ids
• Optional persistence

NOTE: The current example is a slightly modified version of the example from terracotta.org.

Pre-requisite

To run this example, you need to have the following installed
• Terracotta
• And a JDK, of course :-)
Let’s take a look at the sample code first

import java.util.concurrent.atomic.*;

public class IdGenerator
{
    // Needs to be unique cluster wide
    private AtomicInteger masterCounter = new AtomicInteger(0);

    // Local to each JVM
    private AtomicInteger slaveCounter = new AtomicInteger(1024);

    public int getNewMasterId()
    {
        int newId = masterCounter.getAndIncrement();
        System.out.println("Master Id is: " + newId);
        return newId;
    }

    public int getNewSlaveId()
    {
        int slaveId = slaveCounter.getAndIncrement();
        System.out.println("Slave Id is: " + slaveId);
        return slaveId;
    }

    public static void main(String[] args)
    {
        new IdGenerator().getNewMasterId();
    }
}

The code has two counter fields, master and slave, of which only the master id needs to be unique cluster wide. The code is pretty straightforward: each id is generated by a simple getAndIncrement() call, which returns the current value and increments the counter in one atomic step.
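To see what getAndIncrement() guarantees, here is a minimal single-JVM sketch: several threads draw ids from one shared AtomicInteger and no id is handed out twice. The class name UniqueIdDemo, the helper drawIds and the thread and iteration counts are made up for illustration. Terracotta's job is to extend the same guarantee across JVMs, which this sketch does not attempt to reproduce.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class UniqueIdDemo {

    // Draws ids from one shared counter on several threads and
    // returns the set of distinct ids that were handed out.
    static Set<Integer> drawIds(int threads, int idsPerThread) {
        final AtomicInteger counter = new AtomicInteger(0);
        final Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    seen.add(counter.getAndIncrement()); // atomic read-and-add
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> ids = drawIds(3, 1000);
        // 3 threads x 1000 ids: a duplicate would shrink the set below 3000
        System.out.println("Distinct ids: " + ids.size());
    }
}
```

If a plain int with ++ were used instead, two threads could read the same value and the set would end up smaller than the number of ids requested.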

Let’s see the tc-config.xml


<tc:tc-config xmlns:tc="http://www.terracotta.org/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd">

<application>
<dso>
<roots>
<root>
<field-name>IdGenerator.masterCounter</field-name>
</root>
</roots>
</dso>
</application>
</tc:tc-config>

Since only the master id needs to be unique in this case, the configuration declares only IdGenerator.masterCounter as a root.

Let’s look at the deployment architecture

Deployment Architecture


The application runs on 3 nodes (3 JVMs on the same machine) against a single Terracotta server.
Before we get into running this application, let’s take a look at some useful scripts provided by Terracotta
dso-java (bat|sh) – startup script that bootstraps the Terracotta libraries into your application
start-tc-server (bat|sh) – script to start the Terracotta Server. The Terracotta Server must be started before running any clients

Let’s run the application
1. Compile the application. There is no dependency on TC libraries.
2. Start the Terracotta Server using the start-tc-server script
3. Run 3 separate JVMs with the command “dso-java IdGenerator”
That’s it. Our first application is running. If you watch the console output, you can see that each invocation gets a unique master id.

References:

Source code for Terracotta examples - http://svn.terracotta.org/svn/forge/cookbook

29 September 2009 ~ 1 Comment

Executors in Action – Part 1 – The Life cycle


From the Javadoc

… interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc. An Executor is normally used instead of explicitly creating threads.

Indeed, Executors provide a robust and easy mechanism for decoupling the submission of a task from its execution.

In this series, we shall explore this concurrency construct in detail.

NOTE: I have tried to avoid explaining what is already available in the Javadoc.

Let’s first see the lifecycle of an Executor

ExecutorService extends the Executor interface, adding a bunch of life cycle methods.

Executor Life Cycle



Let’s create a simple program to demonstrate the same

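import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Class name is illustrative; the original listing showed only main()
public class ExecutorLifeCycle {

    public static void main(String[] args) {
        // Executor doesn't exist before this step
        // Executor created and moved to ready state
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Submitting a task moves the Executor into running state
        executor.execute(new Runnable() {
            public void run() {
                System.out.println("Running...");
            }
        });

        // Let's shut down: pending tasks complete, new ones are not accepted
        executor.shutdown();

        // isShutdown is true right away; isTerminated may still be false
        // here if the submitted task has not finished yet
        System.out.println(" isShutdown = " + executor.isShutdown());
        System.out.println(" isTerminated = " + executor.isTerminated());
    }
}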

The program depicts the complete lifecycle. Please see the inline comments in the code.

Initially the Executor doesn’t exist. When we create one using the Executors factory class, it is in the Ready state (ready to accept tasks for execution).
Once we submit a task for execution, the Executor moves to the Running state. Once it completes the task, it moves back to the Ready state.

Once we call shutdown() on the Executor, it begins its shutdown cycle. Once that completes, it moves to the Terminated state.

There are a couple of things to note after shutdown is initiated. An Executor may be executing a task, or may have a couple of tasks pending execution.

Calling shutdown() initiates an orderly shutdown: all pending tasks are completed, but new tasks are not accepted.

Calling shutdownNow() initiates a shutdown that tries to halt the tasks being processed by interrupting their threads. The method returns a list of the Runnables that were never executed. Programs can reschedule them later or discard them.
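The shutdownNow() behaviour can be sketched as follows. This is an illustration under assumptions: the class name ShutdownNowDemo, the helper pendingAfterShutdownNow and the 60 second sleep are made up for the demo. One long-running task occupies the only worker thread, a few more tasks wait in the queue, and shutdownNow() hands those waiting tasks back.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownNowDemo {

    // Submits one blocking task plus `queued` extra tasks to a single-thread
    // executor, then calls shutdownNow() and returns how many never started.
    static int pendingAfterShutdownNow(int queued) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        executor.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // keeps the only worker busy
            } catch (InterruptedException e) {
                // shutdownNow() interrupts the worker; just exit
            }
        });
        try {
            started.await(); // make sure the first task is running
            for (int i = 0; i < queued; i++) {
                executor.execute(() -> System.out.println("never printed"));
            }
            List<Runnable> pending = executor.shutdownNow();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Pending tasks: " + pendingAfterShutdownNow(3));
    }
}
```

With shutdown() in place of shutdownNow(), the sleeping task would not be interrupted and the queued tasks would still run before the Executor terminates.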

What will happen if we submit a task once shutdown has been initiated?
This situation is handled by a RejectedExecutionHandler. The default handler of ThreadPoolExecutor (AbortPolicy) throws a RejectedExecutionException; it does not silently discard the task. We can plug in custom logic to handle the tasks that couldn’t be executed.
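A small sketch of both cases, using ThreadPoolExecutor directly so that a handler can be passed to the constructor. The class name RejectionDemo and the counting policy are hypothetical, chosen only to make the rejection visible; a real handler might log the task or queue it elsewhere.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RejectionDemo {

    // With the default handler, submitting after shutdown() throws.
    static boolean defaultHandlerThrows() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        executor.shutdown();
        try {
            executor.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    // A custom handler (hypothetical policy: just count the rejections).
    static int countRejected(int submissions) {
        final AtomicInteger rejected = new AtomicInteger();
        RejectedExecutionHandler counting = (task, pool) -> rejected.incrementAndGet();
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(), counting);
        executor.shutdown();
        for (int i = 0; i < submissions; i++) {
            executor.execute(() -> { }); // handed to the handler, not run
        }
        return rejected.get();
    }

    public static void main(String[] args) {
        System.out.println("Default handler throws: " + defaultHandlerThrows());
        System.out.println("Rejected by custom handler: " + countRejected(3));
    }
}
```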

Upcoming topic – Exploring a little on ScheduledExecutorService

17 September 2009 ~ 0 Comments

MINA vs Netty Series – Closing down


Folks, after posting two articles in the series, I have decided to put my work on this on hold for a while. I am working on the new MINA User Guide, and will be busy over there.

However, this is not the end of the series. Maybe sometime in the future, when I have bandwidth available, I shall start working on it again.

You can check the new MINA User Guide at the following link http://mina.apache.org/user-guide.html

02 August 2009 ~ 29 Comments

If-else vs switch – Which is better?


"Use switch instead of if-else, it's more readable and has better performance." I have to admit that this was one of my favorite code review comments. Then one fine day, while hacking on Apache Sanselan's image format decoding function, I tried optimizing the code based on that same comment, and in the benchmark there was hardly any difference. I decided to investigate further. I found some interesting mail threads and thought about posting my findings.

To begin with, I decided to run some samples on the switch and if-else constructs and analyze further.

I wrote three functions
1. For if-else - an if-else ladder based on int comparisons
2. For switch - a switch with 21 cases (1 to 20, plus default)
3. For switch - a switch with sparse random values

The reason for choosing two functions for switch was this statement from the VM spec: "Compilation of switch statements uses the tableswitch and lookupswitch instructions."

Let's see the code of the functions

if-else

public static void testIfElse(int jumpLabel) {
if(1 == jumpLabel) {
    System.out.println("1");
} else if(2 == jumpLabel) {
    System.out.println("2");
} else if(3 == jumpLabel) {
    System.out.println("3");
} else if(4 == jumpLabel) {
    System.out.println("4");
}
// Removed for simplicity
else {
    System.out.println("default");
}
}

Let's see the switch functions

Finite switch version

public static void testSwitchFinite(int jumpLabel) {

switch (jumpLabel) {
case 1:
    System.out.println("1");
    break;

case 2:
    System.out.println("2");
    break;

case 3:
    System.out.println("3");
    break;

case 4:
    System.out.println("4");
    break;

case 5:
    System.out.println("5");
    break;

// Removed other cases for simplicity

default:
    System.out.println("default");
    break;
}
}

Sparse switch version

public static void testSwitchSparse(int jumpLabel) {

switch (jumpLabel) {
case 100:
    System.out.println("1");
    break;

case -1:
    System.out.println("2");
    break;

case 5000:
    System.out.println("3");
    break;

case -8:
    System.out.println("4");
    break;

case 1600:
    System.out.println("5");
    break;

case 250:
    System.out.println("250");
    break;

// Removed other cases for simplicity

default:
    System.out.println("default");
    break;
}
}

With the groundwork done, it's benchmarking time. The benchmarking strategy was simple: run these functions in a loop, with the iteration count ranging from 100 to 1000, and compare the results.

Let's look at one of the functions

public static void testSwitchPerf(int iteration) {
    long t1 = System.nanoTime();
    for (int i = 0; i < iteration; i++) {
        testSwitchFinite(i);
    }
    long t2 = System.nanoTime();
    System.out.println("Time Taken (switch) = "+(t2 - t1)/1000000);
}

That was the groundwork. After executing the three variants, here is the data

Iterations       100      1000
if-else          8 ms     69 ms
switch finite    3 ms     34 ms
switch sparse    7 ms     21 ms

NOTE: The numbers vary with the way the input data is provided, and a sample this small doesn't give precise results. However, since every function gets the same inputs, it serves its purpose for comparison.
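One caveat about the measurements: the functions above call System.out.println on every iteration, so console I/O likely accounts for most of the time measured, and 100 to 1000 iterations give the JIT little chance to warm up. Below is a hedged sketch of a variant that returns values instead of printing and adds a warm-up pass. The class BranchBench, its methods and the return values are hypothetical and only four cases are shown; it is not the code that produced the table, and a proper harness such as JMH would be more reliable still.

```java
class BranchBench {

    static int viaIfElse(int label) {
        if (label == 1) {
            return 10;
        } else if (label == 2) {
            return 20;
        } else if (label == 3) {
            return 30;
        } else if (label == 4) {
            return 40;
        } else {
            return -1;
        }
    }

    static int viaSwitch(int label) {
        switch (label) {
            case 1: return 10;
            case 2: return 20;
            case 3: return 30;
            case 4: return 40;
            default: return -1;
        }
    }

    // Sums the results so the JIT cannot discard the calls as dead code.
    static long run(boolean useSwitch, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            int label = i % 6; // cycles through 0..5, hitting default twice
            sum += useSwitch ? viaSwitch(label) : viaIfElse(label);
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = 10_000_000;
        run(true, iterations);   // warm-up passes, results ignored
        run(false, iterations);

        long t1 = System.nanoTime();
        long a = run(false, iterations);
        long t2 = System.nanoTime();
        long b = run(true, iterations);
        long t3 = System.nanoTime();

        System.out.println("if-else: " + (t2 - t1) / 1_000_000 + " ms (checksum " + a + ")");
        System.out.println("switch:  " + (t3 - t2) / 1_000_000 + " ms (checksum " + b + ")");
    }
}
```

The two checksums should match, which confirms that both variants did the same work before their timings are compared.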

Conclusion

1. There is no significant execution difference between if-else and switch. The difference observed is due to the sample space chosen.
2. If using if-else, it's always recommended to put the most frequently hit condition at the top of the if-else ladder
3. The finite switch statement was compiled to tableswitch and the sparse switch was compiled to lookupswitch

I would be interested to hear from folks about their experience with this. I would say I still prefer switch for readability. Henceforth, my review comment shall be modified to "Use switch instead of if-else, it's more readable".

Share your thoughts

I would be very keen to hear your opinions and experience on this topic

18 July 2009 ~ 0 Comments

[Apache Sanselan] Demystifying how Sanselan determines image format


Apache Sanselan is a pure Java library for reading and writing image formats. It has recently graduated from the Incubator and is now a proud member of Apache Commons proper.

In previous articles, we saw how to retrieve image metadata and information. In this post we shall see how Sanselan guesses the image format. We shall take it in two steps

  • First we shall look at the ImageFormat class
  • Then, we shall look into the implementation of the guessFormat() API

ImageFormat class

The org.apache.sanselan.ImageFormat class has three members: name, extension and actual. Name and extension hold the same value; however, I am not sure I understand the use of the actual variable.

The class holds the list of formats (including Unknown) supported by the library. They are all instances of the ImageFormat class. The list is

  1. IMAGE_FORMAT_UNKNOWN
  2. IMAGE_FORMAT_PNG
  3. IMAGE_FORMAT_GIF
  4. IMAGE_FORMAT_ICO
  5. IMAGE_FORMAT_TIFF
  6. IMAGE_FORMAT_JPEG
  7. IMAGE_FORMAT_BMP
  8. IMAGE_FORMAT_PSD
  9. IMAGE_FORMAT_PBM
  10. IMAGE_FORMAT_PGM
  11. IMAGE_FORMAT_PPM
  12. IMAGE_FORMAT_PNM
  13. IMAGE_FORMAT_TGA
  14. IMAGE_FORMAT_JBIG2

guessFormat API – Under the hood

The guessFormat() API looks at the initial 2 to 4 bytes, also known as magic numbers, to determine the image format.

The algorithm is as follows:

  • Read the first two bytes
  • Match them against the known set of magic numbers to determine the format
  • For JBig2, additionally use bytes 3 and 4 to confirm the format
The list of magic numbers for different formats is as follows

Format                                     Byte 1   Byte 2
IMAGE_FORMAT_GIF                           0x47     0x49
IMAGE_FORMAT_PNG                           0x89     0x50
IMAGE_FORMAT_JPEG                          0xff     0xd8
IMAGE_FORMAT_BMP                           0x42     0x4d
IMAGE_FORMAT_TIFF (Motorola byte order)    0x4D     0x4D
IMAGE_FORMAT_TIFF (Intel byte order)       0x49     0x49
IMAGE_FORMAT_PSD                           0x38     0x42
IMAGE_FORMAT_PBM                           0x50     0x31 or 0x34
IMAGE_FORMAT_PGM                           0x50     0x32 or 0x35
IMAGE_FORMAT_PPM                           0x50     0x33 or 0x36
IMAGE_FORMAT_JBIG2                         0x97     0x4A

In addition, for IMAGE_FORMAT_JBIG2, byte 3 must be equal to 0x42 and byte 4 must be equal to 0x32.

Based on this table, Sanselan recognizes the image format. If the magic numbers don’t match any entry in the table, it returns IMAGE_FORMAT_UNKNOWN
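The lookup described above can be sketched in a few lines. This is a sketch built only from the table in this post, not Sanselan’s actual implementation: the class MagicNumberSketch, the method guess and the plain String return values are hypothetical stand-ins for the real guessFormat() API and its ImageFormat constants.

```java
class MagicNumberSketch {

    // Hypothetical helper: returns a format name from the leading bytes,
    // following the magic number table in this post.
    static String guess(byte[] header) {
        if (header == null || header.length < 2) {
            return "UNKNOWN";
        }
        int b1 = header[0] & 0xff; // mask so bytes compare as unsigned values
        int b2 = header[1] & 0xff;

        if (b1 == 0x47 && b2 == 0x49) return "GIF";
        if (b1 == 0x89 && b2 == 0x50) return "PNG";
        if (b1 == 0xff && b2 == 0xd8) return "JPEG";
        if (b1 == 0x42 && b2 == 0x4d) return "BMP";
        if (b1 == 0x4d && b2 == 0x4d) return "TIFF"; // Motorola byte order
        if (b1 == 0x49 && b2 == 0x49) return "TIFF"; // Intel byte order
        if (b1 == 0x38 && b2 == 0x42) return "PSD";
        if (b1 == 0x50 && (b2 == 0x31 || b2 == 0x34)) return "PBM";
        if (b1 == 0x50 && (b2 == 0x32 || b2 == 0x35)) return "PGM";
        if (b1 == 0x50 && (b2 == 0x33 || b2 == 0x36)) return "PPM";
        if (b1 == 0x97 && b2 == 0x4a) {
            // JBIG2 additionally needs bytes 3 and 4 to match
            if (header.length >= 4
                    && (header[2] & 0xff) == 0x42
                    && (header[3] & 0xff) == 0x32) {
                return "JBIG2";
            }
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        byte[] png = { (byte) 0x89, 0x50, 0x4e, 0x47 };
        System.out.println(guess(png)); // prints "PNG"
    }
}
```

The & 0xff mask matters because Java bytes are signed: without it, a byte such as 0x89 would be read as a negative number and never match the table.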

References

Sanselan.java