Hi! Welcome...

I am Ashish. I am a member of the Apache MINA PMC, an ASF committer and an avid code hacker. Contact me at paliwalashish at gmail dot com. I work for Terracotta [http://www.terracotta.org] as a Solution Architect. I contribute to jsmpp [http://code.google.com/p/jsmpp/] as well.

09 February 2010 ~ 0 Comments

So you want Distributed, Scalable and Highly Available Cache?


Abstract

So you want a distributed, scalable and highly available cache? If yes, then this is the right place for you. The Terracotta Ehcache release has these features built in, and the Express Installation mode simplifies integrating Terracotta into your application.

How this post is organized
First we shall start with a simple standalone cache and then see how Terracotta can easily help us create a distributed, scalable and highly available cache with a few minor configuration changes.

What do you need to run the example?

  • Terracotta 3.2.0 - Download it from here


Sample Application
Let's design a sample application (a really simple one) that we shall use to demonstrate the features. Device monitoring is a common requirement in OSS systems, and caching the device information reduces the number of database hits. The essential components of our application are
- DeviceInfo - A POJO that stores device information
- CacheHandler - Class that handles cache initialization and other ops

Let's start with the DeviceInfo class

import java.io.Serializable;

public class DeviceInfo implements Serializable {

    private String deviceId;

    private String name;

    // .... other device information

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getDeviceId() {
        return deviceId;
    }

    public void setDeviceId(String deviceId) {
        this.deviceId = deviceId;
    }
}

The class is simple and contains information related to the device.

Let's see our CacheHandler class, which is the heart of our implementation

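import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class CacheHandler {

    public Cache deviceCache;

    public void initCache() {
        // Initialize the CacheManager; with no arguments it reads ehcache.xml from the classpath
        CacheManager cacheManager = CacheManager.create();
        // Get the Cache to store device information (must match the cache name in ehcache.xml)
        deviceCache = cacheManager.getCache("deviceCache");
    }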

    public void addToCache(DeviceInfo device) {
        Element el = new Element(device.getDeviceId(), device);
        deviceCache.put(el);
    }

    protected int getDeviceCacheSize() {
        return deviceCache.getSize();
    }

    public static void main(String[] args) {
        CacheHandler handler = new CacheHandler();
        handler.initCache();

        // Let's add 10 devices
        // just to keep life simple
        for(int i = 0; i < 10; i++) {
            DeviceInfo device = new DeviceInfo();
            device.setDeviceId("Device-"+i);
            handler.addToCache(device);
        }

        // Not recommended in production
        System.out.println("Cache Size = "+handler.getDeviceCacheSize());

    }
}

The class has two main methods
- initCache - initializes the cache
- addToCache - adds the device information to the device cache

Let's see our ehcache.xml

<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="ehcache.xsd"
             updateCheck="true" monitoring="autodetect">

   <defaultCache
            maxElementsInMemory="10000"
            eternal="false"
            timeToIdleSeconds="120"
            timeToLiveSeconds="120"
            overflowToDisk="true"
            diskSpoolBufferSizeMB="30"
            maxElementsOnDisk="10000000"
            diskPersistent="false"
            diskExpiryThreadIntervalSeconds="120"
            memoryStoreEvictionPolicy="LRU"
            />

    <cache name="deviceCache"
           maxElementsInMemory="1000000"
           maxElementsOnDisk="1000"
           eternal="false"
           overflowToDisk="false"
           diskSpoolBufferSizeMB="20"
           timeToIdleSeconds="100"
           timeToLiveSeconds="100"
           memoryStoreEvictionPolicy="LFU">
     </cache>

</ehcache>

In the configuration we have defined a cache named deviceCache, which we shall use to store device information.
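To make the two expiry attributes a little more concrete, here is a minimal plain-Java sketch of how a time-to-live and a time-to-idle limit are commonly combined. It is a simplified model written for illustration, not Ehcache's implementation; the class name ExpirySketch and its method are hypothetical, and the 100-second values simply mirror the deviceCache settings above.

```java
/** Simplified model of TTL/TTI expiry for illustration; this is not Ehcache code. */
final class ExpirySketch {
    private final long timeToLiveMillis;
    private final long timeToIdleMillis;

    ExpirySketch(long timeToLiveSeconds, long timeToIdleSeconds) {
        this.timeToLiveMillis = timeToLiveSeconds * 1000L;
        this.timeToIdleMillis = timeToIdleSeconds * 1000L;
    }

    /**
     * An entry is treated as expired once it has lived longer than the TTL,
     * or has not been accessed for longer than the TTI.
     */
    boolean isExpired(long createdAtMillis, long lastAccessedAtMillis, long nowMillis) {
        boolean tooOld = nowMillis - createdAtMillis > timeToLiveMillis;
        boolean tooIdle = nowMillis - lastAccessedAtMillis > timeToIdleMillis;
        return tooOld || tooIdle;
    }

    public static void main(String[] args) {
        // Same limits as deviceCache: 100 seconds to live, 100 seconds to idle
        ExpirySketch deviceCacheExpiry = new ExpirySketch(100, 100);
        // Created at t=0, last read at t=50s, checked at t=90s
        System.out.println(deviceCacheExpiry.isExpired(0L, 50_000L, 90_000L));  // false
        // Created at t=0, last read at t=90s, checked at t=101s
        System.out.println(deviceCacheExpiry.isExpired(0L, 90_000L, 101_000L)); // true
    }
}
```

In this model, with both limits set to 100, frequent reads cannot keep an entry alive beyond 100 seconds, because the time-to-live limit is always reached no later than the idle limit.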

The following jars are needed to run this app

  • ehcache-core-1.7.2.jar
  • slf4j-api-1.5.8.jar
  • slf4j-jdk14-1.5.8.jar

All these jars are available as part of the Terracotta installation, under the /distributed-cache directory.

You can run this example and see the same in action.


Making it Distributed....

Now let's assume that we want to make the cache fault tolerant, so that it survives JVM crashes, and available on several JVMs. It should also scale as we add more JVMs. With Terracotta, we can do this in a few steps without changing the code. The magic lies in the Express Installation mode, introduced in Terracotta 3.2.0.

Let's see the 3 steps that we need to perform

1. Updates to the ehcache.xml
Add following element to ehcache.xml

<terracottaConfig url="localhost:9510" />

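and for the deviceCache, we add the <terracotta /> element

Here is the updated ehcache.xml (note that the deviceCache entry no longer carries the overflowToDisk and diskSpoolBufferSizeMB attributes)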

<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="ehcache.xsd"
             updateCheck="true" monitoring="autodetect">

   <terracottaConfig url="localhost:9510" /> 

   <defaultCache
            maxElementsInMemory="10000"
            eternal="false"
            timeToIdleSeconds="120"
            timeToLiveSeconds="120"
            overflowToDisk="true"
            diskSpoolBufferSizeMB="30"
            maxElementsOnDisk="10000000"
            diskPersistent="false"
            diskExpiryThreadIntervalSeconds="120"
            memoryStoreEvictionPolicy="LRU"
            />

    <cache name="deviceCache"
           maxElementsInMemory="1000000"
           maxElementsOnDisk="1000"
           eternal="false"
           timeToIdleSeconds="100"
           timeToLiveSeconds="100"
           memoryStoreEvictionPolicy="LFU">

        <terracotta />
     </cache>

</ehcache>

There is no need to change the code :-)

2. Add ehcache-terracotta-1.8.0.jar to the classpath

3. Start the Terracotta Server

Go to Terracotta_Install_dir/bin and execute

$ ./start-tc-server.sh

This starts the Terracotta server

Now run the application on multiple JVMs. The device information is available on all client JVMs :-)

To monitor the cache, you can use the Terracotta Dev Console.

If you have further questions, please visit the Terracotta Forums.

What's coming up Next?
The next post shall touch on the write-behind feature in the latest Terracotta Darwin release. It's essentially a way to get the best of both worlds - caching and database offloading :-) Stay tuned..



25 January 2010 ~ 2 Comments

Exploring Terracotta tim-async - Scalable way to off-load DB updates from App Transaction


Let's start this post with a small conversation

Manager: What are the options we have to increase the TPS of our SMSC Server?

Architect: Can you elaborate a bit on this?

Manager: As of now we process a request and update the database in the same transaction. Essentially, we are operating our server at the TPS at which our DB can operate.

Architect: Well, we have to ensure that our data is persisted.

Manager: Yes, but can't we update the DB asynchronously, without losing data?

Architect: Yes, it can be done, but we have to create a lot of infrastructure code around this, so that we don't lose data on failed transactions, process crashes etc.

If you have been in such a situation, the post is meant for you.

Summary

Updating the database in the same transaction as the request limits throughput, yet the data still needs to be persisted. Moving the database update out of the request path has always been a typical way to increase the responsiveness of an application, because an application that works on data in memory is much faster than one that fetches and updates data from a database. Assume you have a solution that lets you work in memory and lets your application write the data to the datastore in a separate transaction, providing HA and scalability out of the box. If this is what you are looking for, the Terracotta async processor (tim-async) is the right fit for you.

We shall look at tim-async in brief, and then walk through a small POC that I built around it.

Pre-requisite

You should have a basic understanding of Terracotta 3.2 and its configuration.
You need to download tim-async using the tim-get tool.

What is tim-async?

tim stands for Terracotta Integration Module, and tim-async is the module for asynchronous processing. tim-async provides a scalable, high-performance way to asynchronously write business data to a data source (typically a database) while the application works on in-memory data structures. Decoupling the main processing from the underlying database decreases the write latency of domain objects, while high availability of the data is provided by Terracotta.

What features does it have?

  • Multithreaded write-behind to flush the processed data from the client VMs (L1s) to the data source
  • Data to be flushed remains highly available (HA) as it is shared with Terracotta
  • Every client takes responsibility for writing its own data set to the data source
  • Listeners to monitor work progress at bucket level

NOTE: I have pulled this information from the attachment here

Building Blocks of tim-async from User Perspective

From a user's POV, these three are the important building blocks

  • ItemProcessor
  • AsyncCoordinator
  • AsyncConfig

These are all classes/interfaces that are part of tim-async. Let's look at them in brief

ItemProcessor

It's an interface. The implementation class of this interface is where we write the actual processing logic for the domain object, such as persisting it to the DB.

public interface ItemProcessor<I> {
    public void process(I item) throws ProcessingException;
}

AsyncCoordinator

It's a core class and needs to be declared as a shared root in tc-config.xml. It allows work to be added and processed asynchronously.
The class has a start() method that needs to be called by every client node (Terracotta L1) that is to participate in processing.
An ItemProcessor instance is local to the node, so it's the right place to hold things like a DB connection.

AsyncConfig

It's an interface that can be implemented to tailor the configuration of async processing. If we don't define one, org.terracotta.modules.async.configs.DefaultAsyncConfig is used.

public interface AsyncConfig {
    public long getWorkDelay();

    public long getMaxAllowedFallBehind();

    public boolean isStealingEnabled();
}
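A custom configuration could look roughly like the sketch below. To keep it compilable on its own, the AsyncConfig interface is repeated locally as shown above; in a real project you would use the one shipped with tim-async instead. The class name EagerAsyncConfig and the chosen values are made up, and the comments on what each value means are only an interpretation of the method names, so check them against the tim-async documentation before relying on them.

```java
/** Local copy of the interface shown above, so that the sketch compiles standalone. */
interface AsyncConfig {
    long getWorkDelay();

    long getMaxAllowedFallBehind();

    boolean isStealingEnabled();
}

/** Hypothetical custom configuration; all values are illustrative only. */
final class EagerAsyncConfig implements AsyncConfig {

    @Override
    public long getWorkDelay() {
        // Assumption: the pause between two processing rounds
        return 500L;
    }

    @Override
    public long getMaxAllowedFallBehind() {
        // Assumption: how far processing may lag behind before work is pushed harder
        return 1000L;
    }

    @Override
    public boolean isStealingEnabled() {
        // Assumption: whether idle nodes may pick up work queued by other nodes
        return true;
    }

    public static void main(String[] args) {
        AsyncConfig config = new EagerAsyncConfig();
        System.out.println("workDelay=" + config.getWorkDelay()
                + ", maxAllowedFallBehind=" + config.getMaxAllowedFallBehind()
                + ", stealingEnabled=" + config.isStealingEnabled());
    }
}
```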

Now let's see how we can use tim-async in our application (we shall see the sample application next)

- Implement ItemProcessor to do the custom processing, for example writing the item to the DB
- Initialize AsyncCoordinator and call its start() API, passing the ItemProcessor from the previous step as a parameter
- Create tc-config.xml (pick the sample from the tim-async example) and add the AsyncCoordinator field as the root
- Start the TC Server and the client JVMs with the tc-config.xml created above
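The shape of these steps can be tried out without Terracotta using the plain-Java stand-in below. It is only a single-JVM sketch of the write-behind idea (a queue drained by one worker thread): it has no clustering and no HA, and it is not the tim-async API. The names SketchItemProcessor, LocalAsyncCoordinator and WriteBehindDemo are hypothetical, and a list takes the place of the database.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Stand-in for the processor callback; it mirrors the shape of ItemProcessor. */
interface SketchItemProcessor<I> {
    void process(I item) throws Exception;
}

/** Single-JVM stand-in for the coordinator: a queue plus one worker thread. */
final class LocalAsyncCoordinator<I> {
    private final BlockingQueue<I> pending = new LinkedBlockingQueue<>();
    private volatile boolean running;
    private Thread worker;

    void start(SketchItemProcessor<I> processor) {
        running = true;
        worker = new Thread(() -> {
            // Keep going until stopAndDrain() was called and the queue is empty
            while (running || !pending.isEmpty()) {
                try {
                    I item = pending.poll(50, TimeUnit.MILLISECONDS);
                    if (item != null) {
                        processor.process(item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                } catch (Exception e) {
                    // A real implementation would retry or record the failure
                    System.err.println("Processing failed: " + e);
                }
            }
        });
        worker.start();
    }

    /** Returns immediately; the item is processed later by the worker thread. */
    void add(I item) {
        pending.add(item);
    }

    /** Signals the worker to finish and waits until the queued items are processed. */
    void stopAndDrain() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

final class WriteBehindDemo {
    /** Queues the given number of items and returns what the "database" received. */
    static List<String> run(int count) {
        List<String> stored = new CopyOnWriteArrayList<>(); // stands in for the DB
        LocalAsyncCoordinator<String> coordinator = new LocalAsyncCoordinator<>();
        coordinator.start(item -> stored.add(item));
        for (int i = 0; i < count; i++) {
            coordinator.add("SMS-" + i); // the request path only enqueues
        }
        coordinator.stopAndDrain();
        return stored;
    }

    public static void main(String[] args) {
        System.out.println("Stored items: " + run(10).size());
    }
}
```

The request path only pays for the add() call, while the slow write happens on the worker thread. What tim-async adds on top of this, according to the feature list above, is that the pending data is shared with Terracotta and therefore survives a crash of the node.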


Let's see these steps in action.

Sample Application

I had written an SMSC simulator based on jsmpp, which received the SubmitSM request from the client, persisted it in the DB and returned the id to the client. There is a specific reason why I chose this application: I already had it running, and the SubmitSM just needed to be persisted in the DB for subsequent processing by another process. The business case was that the server must be very responsive to the client, while ensuring that the data is persisted.

Let's walk through a simple processing sequence

  • Client sends a SubmitSM request
  • Server translates it into an internal SMS POJO and assigns a unique ID to the SMS
  • Server persists the SMS in the database
  • Server sends the generated ID as part of the SubmitSM response

The application uses Hibernate to save the POJO in a MySQL database.

Since the domain object is the least significant part here, I won't discuss the internals of the SMS POJO.

Let's see how we enhanced the application to update the DB asynchronously

Implementing the ItemProcessor

public class AsyncSMSProcessor implements ItemProcessor<SMS> {
    public void process(final SMS sms) throws ProcessingException {
        // Reuse one session for the whole unit of work
        Session hibernateSession = DAO.getSession();
        hibernateSession.getTransaction().begin();
        hibernateSession.save(sms);
        hibernateSession.getTransaction().commit();
    }
}

Essentially, we have to write the logic that saves the SMS object to the database. This is the object that Terracotta is going to pass back to us.
What we do here is save the SMS in a Hibernate transaction. This completes our ItemProcessor code.

Initialize AsyncCoordinator

public class SMSProcessor {

    public AsyncCoordinator asyncUpdator;

    public SMSProcessor() {
        asyncUpdator = new AsyncCoordinator();
        // Use the default configuration (no AsyncConfig passed)
        asyncUpdator.start(new AsyncSMSProcessor());
    }

    public long processIncomingSMS(SubmitSm submitSm,
            SMPPServerSession source, ...) {

        // some logic here
        // give the data to AsyncCoordinator
        asyncUpdator.add(sms);
        // some logic there

    }

}

This is the change that we have done here. Earlier we used to save the SMS in DB here. We have now given the data to AsyncCoordinator for subsequent processing.

Let's look at the tc-config.xml

This is what we need to add to the tc-config.xml

<application>
  <dso>
    <!-- Declaring a field of a class a root will make it available for all instances
         of our app that runs via DSO -->
    <roots>
      <!-- XXX: -->
      <root>
        <field-name>com.tc.timasync.demo.smsAsyncCoordinator</field-name>
      </root>
    </roots>
  </dso>
</application>

We declared the AsyncCoordinator variable as DSO root here, which makes it shared across cluster.

That's it.

We are ready. Now we start Terracotta server with our tc-config file and start our SMSC Simulator as TC Client.

How does it all fit together

As we saw, instead of updating the DB in the same transaction, we handed the data over to Terracotta. Essentially, this means that our data has gone to the server and won't be lost in a crash. The application operates at memory speed and responds to the client. Based on its configuration, tim-async then invokes the ItemProcessor on a client node in a separate transaction. We can have multiple nodes updating the DB asynchronously, or arrange it some other way, depending on the use case.

Let's run both the implementations


Note on Comparison of both the implementations

I shall run both implementations on the same hardware/OS, which is my laptop with 2GB of RAM, running Windows and MySQL 5.x. The comparison is relative between the two implementation approaches and should not be read as absolute figures. The figures would be much better if the samples were run on higher-end machines, with the Terracotta Server (L2) on a dedicated machine and the clients (L1) on different machines. The idea here is just to see the TPS of the server.

Case I: DB updates in App transaction

For 10 runs of sending 1000 SubmitSM requests, the client took the following times

Total Time for 1000 SMS = 28891
Total Time for 1000 SMS = 34234
Total Time for 1000 SMS = 27812
Total Time for 1000 SMS = 34547
Total Time for 1000 SMS = 29953
Total Time for 1000 SMS = 34344
Total Time for 1000 SMS = 33985
Total Time for 1000 SMS = 35328
Total Time for 1000 SMS = 36968

Case II: DB updates based on tim-async

Total Time for 1000 SMS = 1359
Total Time for 1000 SMS = 1125
Total Time for 1000 SMS = 781
Total Time for 1000 SMS = 813
Total Time for 1000 SMS = 750
Total Time for 1000 SMS = 719
Total Time for 1000 SMS = 703
Total Time for 1000 SMS = 719
Total Time for 1000 SMS = 718
Total Time for 1000 SMS = 735

There was a significant decrease in the processing time of request :-)
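To put a number on "significant", the small program below averages the figures listed above and prints the ratio between the two cases. The class name TimingSummary is made up for this example. Note that only nine values are listed for Case I, and that the post does not state the unit of the timings, so the program reports plain averages and a ratio only.

```java
/** Averages the timings listed in the post and compares the two cases. */
final class TimingSummary {
    // Case I: DB updates in the application transaction (nine values are listed)
    static final long[] SYNC_RUNS = {
        28891, 34234, 27812, 34547, 29953, 34344, 33985, 35328, 36968
    };
    // Case II: DB updates through tim-async
    static final long[] ASYNC_RUNS = {
        1359, 1125, 781, 813, 750, 719, 703, 719, 718, 735
    };

    static double mean(long[] values) {
        long sum = 0;
        for (long value : values) {
            sum += value;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        double syncMean = mean(SYNC_RUNS);
        double asyncMean = mean(ASYNC_RUNS);
        System.out.printf("Case I mean:  %.1f%n", syncMean);
        System.out.printf("Case II mean: %.1f%n", asyncMean);
        System.out.printf("Ratio:        %.1f%n", syncMean / asyncMean);
    }
}
```

The averages come out at roughly 32,896 for Case I and 842 for Case II, so the client finished a batch of 1000 requests about 39 times faster with tim-async. If the timings are in milliseconds, which is an assumption, that corresponds to roughly 30 versus 1,187 requests per second as seen by the client.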

What's next

Download Terracotta from here and try it yourself. You can also register for the upcoming Darwin release, which has many interesting new features.

Resources

  • http://blog.terracottatech.com/2009/02/offloading_a_db_even_when_upda.html - A wonderful post by Ari
  • http://www.slideshare.net/sbtourist/real-terracotta-presentation
  • http://forums.terracotta.org/forums/forums/list.page



07 January 2010 ~ 1 Comment

Joined Terracotta


Finally, after a sprint of 9+ years in the Hughes family, I have started a new career. I started my work life with Hughes Software Systems in 2000, then joined Hughes Systique in 2006.

Year 2009 was a year of happenings for me. As I got more involved with Open Source software, my thoughts on software development changed drastically. While involved in the Apache MINA mailing list, I was introduced to Terracotta. Well, this was the beginning, and later on I got hired :-) (of course I applied for it).

Well, now the easy life is gone. I have much more challenging tasks ahead, and at Terracotta I will be working with some of the finest talent in the industry. A long road ahead for me. The biggest reason for joining Terracotta: passion. This little word was common between me and most of the folks I interacted with at Terracotta. It's very close to my punch line "Great passions can defy Destiny".

With my joining Terracotta as Solution Architect, I am very hopeful that my contribution towards the Open Source community will increase significantly.

Before I close this post, a lot of credit for this success goes to Emmanuel Lecharny. He initiated me into Open Source and has been mentoring me all the way.

Wish you a very Happy New Year !


26 November 2009 ~ 0 Comments

Essential Components of Terracotta based Application


Abstract

What does a Terracotta application look like? What are the essential components that make up the application? These were the questions I had after writing my first post “Getting Started with Terracotta”. Here I try to take a quick look at these questions from a user’s point of view. Please note that this post doesn’t try to dive deep into Terracotta’s inner workings.


Components of Terracotta based Application


To keep life simple, I have considered a general application architecture, which essentially can be any Terracotta app. From the figure we can see that four different parts contribute to a complete Terracotta app.

User Application – These are applications that want to use Terracotta to scale. They can either use Terracotta transparently or use explicit Terracotta APIs.

Terracotta Server – It’s the nerve center. Terracotta applications can’t run without it. To run any TC application, at least one TC Server must be running.

Terracotta config file – Also known as tc-config.xml (the default name). It provides the detailed configuration for the server as well as the client. It contains information like the server IPs, persistence mode, classes to instrument etc.

Terracotta Libraries – These are essentially part of the client or user applications. They are the workhorses of the running application and enable Terracotta integration with the user application. For example, the dso-java.bat script adds the Terracotta bootjars to the classpath, enabling Terracotta features for the application.

So how does it all fit together at runtime?

  • Start Terracotta Server, providing a configuration file with desired configuration
  • Start User Application using dso-java.(bat|sh) instead of normal java, along with a configuration file
  • That’s it :-)



23 November 2009 ~ 1 Comment

Getting Started with Terracotta


Abstract

In this small post we shall explore Terracotta, a leading pure Java scalability platform. The discussion is based on the AtomicInteger example from the Terracotta site, which shows how to implement a cluster-wide id generator (actually it’s the Sequencer example, but to keep my steps simple I used AtomicInteger). I chose this example because of a very similar requirement that I had to implement in a clustered J2EE application.
Terracotta is well known and needs no introduction :-)

About the problem statement

Well, I needed a simple solution to have
• Cluster wide unique id’s
• Less frequent access to these id’s
• Optional Persistence

NOTE: The current example is a slightly modified version of the example from terracotta.org.

Pre-requisite

To run this example, you need to have the following installed
• Terracotta
• And a JDK, of course :-)

Let’s take a look at the sample code first

import java.util.concurrent.atomic.*;

public class IdGenerator
{

    private AtomicInteger masterCounter = new AtomicInteger(0);

    private AtomicInteger slaveCounter = new AtomicInteger(1024);

    public int getNewMasterId()
    {
        int newId = masterCounter.getAndIncrement();
        System.out.println("Master Id is: " + newId);
        return newId;
    }

    public int getNewSlaveId()
    {
        int slaveId = slaveCounter.getAndIncrement();
        System.out.println("Slave Id is: " + slaveId);
        return slaveId;
    }

    public static void main(String[] args)
    {
        new IdGenerator().getNewMasterId();
    }
}

The code has two id fields, master and slave, of which only the master id needs to be unique cluster wide. The code is pretty straightforward and there is little to explain: the ids are generated by the simple getAndIncrement() API.
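The property that makes this work is that getAndIncrement() is atomic, so no two callers ever receive the same value. The sketch below shows this with several threads inside one JVM; the threads merely stand in for the separate JVMs of the cluster, and the class name UniqueIdSketch is made up for this example. Sharing the counter across JVMs is what the Terracotta root configuration is for.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Several threads draw ids from one AtomicInteger; every id is handed out once. */
final class UniqueIdSketch {

    static Set<Integer> generate(int threads, int idsPerThread) {
        final AtomicInteger counter = new AtomicInteger(0);
        final Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    // Atomic read-and-increment: no two callers get the same value
                    seen.add(counter.getAndIncrement());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // A duplicate id would make the set smaller than the number of ids drawn
        System.out.println("Distinct ids: " + generate(3, 1000).size());
    }
}
```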

Let’s see the tc-config.xml


<tc:tc-config xmlns:tc="http://www.terracotta.org/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd">

  <application>
    <dso>
      <roots>
        <root>
          <field-name>IdGenerator.masterCounter</field-name>
        </root>
      </roots>
    </dso>
  </application>
</tc:tc-config>

Since only the master id needs to be unique in this case, only masterCounter is declared as a root.

Let’s look at the deployment architecture

Deployment Architecture


The application was to run on 3 nodes (3 JVMs on the same machine) with one Terracotta server running.
Before we get into running this application, let’s take a look at some useful scripts provided by Terracotta
• dso-java (bat|sh) – startup script that bootstraps the Terracotta libraries into your application
• start-tc-server (bat|sh) – script to start the Terracotta Server. It is mandatory to start the Terracotta Server before running clients

Let’s run the application
1. Compile the application. There is no dependency on TC libraries.
2. Start the Terracotta Server using the start-tc-server script
3. Run 3 separate JVMs with the command “dso-java IdGenerator”
That’s it. Our first application is running. If we observe the console output, we can see a unique id for each invocation.

References:

Source code for Terracotta examples - http://svn.terracotta.org/svn/forge/cookbook
