Wednesday, March 19, 2014

DateFormat in a Multithreading Environment


This is the first of a series of articles concerning proposed practices while working with the Java programming language.

All discussed topics are based on use cases derived from the development of mission critical, ultra high performance production systems for the telecommunication industry.

Prior to reading each section of this article, it is highly recommended that you consult the relevant Java API documentation for detailed information and code samples.

All tests were performed on a Sony Vaio with the following characteristics :

System : openSUSE 11.1 (x86_64)

Processor (CPU) : Intel(R) Core(TM)2 Duo CPU T6670 @ 2.20GHz

Processor Speed : 1,200.00 MHz

Total memory (RAM) : 2.8 GB

Java : OpenJDK 1.6.0_0 64-Bit

The following test configuration is applied :
Concurrent worker Threads : 200

Test repeats per worker Thread : 1000

Overall test runs : 100


Using DateFormat in a multithreading environment

Working with DateFormat in a multithreading environment can be tricky. The Java API documentation clearly states :

Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.

A typical scenario is converting a Date to its String representation, or vice versa, using a predefined format. Creating a new DateFormat instance for every conversion is very inefficient. Keep in mind that static factory methods such as “getDateInstance(..)” also create a new DateFormat instance on every call. What most developers do instead is construct a single DateFormat instance from an implementation class (e.g. SimpleDateFormat) and assign it to a class variable, which is then used for all Date parsing and formatting. This approach, although very efficient, can cause problems when multiple threads access the same instance concurrently, because the DateFormat class is not synchronized. Typical exceptions thrown when parsing to create a Date object are :
  • java.lang.NumberFormatException
  • java.lang.ArrayIndexOutOfBoundsException
You may also see malformed String representations of a Date when formatting is performed.
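The failure mode described above can be reproduced with a small sketch like the one below. The class name `SharedFormatRace`, the thread and repeat counts, and the fixed `Locale.US` are assumptions made for illustration, and the lambda syntax assumes Java 8 or later. Because the failure is a race condition, the number of broken parses varies from run to run and may occasionally be zero.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the name and the counts used below are illustrative only.
final class SharedFormatRace {

    // Deliberately unsafe: a single instance shared by every thread with no locking.
    private static final DateFormat SHARED = new SimpleDateFormat("yyyy MM dd", Locale.US);

    /** Returns {correct, broken}: parses that matched the expected Date vs. those that threw or were wrong. */
    static int[] run(int threadCount, int repeatsPerThread) {
        Date expected = new GregorianCalendar(2014, Calendar.MARCH, 19).getTime();
        AtomicInteger correct = new AtomicInteger();
        AtomicInteger broken = new AtomicInteger();

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int r = 0; r < repeatsPerThread; r++) {
                    try {
                        Date parsed = SHARED.parse("2014 03 19");
                        if (expected.equals(parsed)) {
                            correct.incrementAndGet();
                        } else {
                            broken.incrementAndGet(); // silently wrong result
                        }
                    } catch (Exception e) {
                        broken.incrementAndGet(); // e.g. NumberFormatException
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new int[] { correct.get(), broken.get() };
    }

    public static void main(String[] args) {
        int[] counts = run(8, 1000);
        System.out.println("correct=" + counts[0] + ", broken=" + counts[1]);
    }
}
```

Any non-zero "broken" count is enough to show that the shared instance cannot be trusted; a zero count on one run does not prove safety.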

To properly handle the aforementioned issues, it is vital to clarify the architecture of your multithreading environment. The Java Virtual Machine allows an application to have multiple threads of execution running concurrently. Typically, in a multithreading environment (either a container inside the JVM or the JVM itself), Thread pooling should be performed: worker threads are constructed and initialized at startup and then reused to execute your tasks. For example, a Web container constructs a pool of worker threads to serve all incoming traffic. Thread pooling is generally the most efficient use of system resources, mainly because Thread creation and initialization is an expensive task for the Java Virtual Machine. Nevertheless, application parallelism can also be achieved by simply creating a new Thread of execution for every piece of code you want to be executed concurrently.
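As a rough illustration of the pooling described above, the sketch below submits many small tasks to a fixed-size pool and records which worker threads actually ran them. It uses the standard `java.util.concurrent` executors; the class name `PoolReuseDemo` and the pool and task sizes are assumptions for the example, and the lambda syntax assumes Java 8 or later.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the name and sizes are illustrative only.
final class PoolReuseDemo {

    /** Runs taskCount trivial tasks on a fixed pool and returns how many distinct threads executed them. */
    static int distinctWorkers(int poolSize, int taskCount) {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            // Each task only records the name of the thread it happens to run on.
            pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // no new tasks accepted; already submitted tasks still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        }
        return workerNames.size();
    }

    public static void main(String[] args) {
        // 1000 tasks, but never more than 4 worker threads behind them.
        System.out.println("distinct workers: " + distinctWorkers(4, 1000));
    }
}
```

The point of the count is that the same few threads serve all the tasks, which is what makes per-thread state worth caching.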

Concerning class scoped DateFormat instances :
  • If you have established that NO Thread pools are used in your environment, then only newly created Thread instances access your DateFormat instance concurrently. In this case it is recommended to synchronize that DateFormat instance externally
  • If Thread pools are used, a limited number of Thread instances can access your DateFormat instance concurrently. In this case it is recommended to create a separate DateFormat instance for each thread using the ThreadLocal approach

Below are examples of the “getDateInstance(..)”, “synchronization” and ThreadLocal approaches :


package com.javacodegeeks.test;

import java.text.DateFormat;
import java.text.ParseException;
import java.util.Date;

public class ConcurrentDateFormatAccess {

    // Thread-safe but slow: the factory method builds a new DateFormat on every call.
    public Date convertStringToDate(String dateString) throws ParseException {
        return DateFormat.getDateInstance(DateFormat.MEDIUM).parse(dateString);
    }

}




package com.javacodegeeks.test;

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ConcurrentDateFormatAccess {

    // Single instance shared by all threads; every access must hold its monitor.
    private final DateFormat df = new SimpleDateFormat("yyyy MM dd");

    public Date convertStringToDate(String dateString) throws ParseException {
        synchronized (df) {
            return df.parse(dateString);
        }
    }

}
Things to notice here :
  • Every individual Thread executing the “convertStringToDate” operation must acquire the monitor lock on the DateFormat object before it can call “parse” on it. If another Thread is holding the lock, the current Thread waits until the lock is released. That way only one Thread is accessing the DateFormat instance at a time




package com.javacodegeeks.test;

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ConcurrentDateFormatAccess {

    // Each thread lazily gets its own DateFormat on its first call to df.get().
    private final ThreadLocal<DateFormat> df = new ThreadLocal<DateFormat>() {

        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyy MM dd");
        }

    };

    public Date convertStringToDate(String dateString) throws ParseException {
        return df.get().parse(dateString);
    }

}
Things to notice here :
  • Every individual Thread executing the “convertStringToDate” operation invokes “df.get()”, which either initializes that Thread's own DateFormat instance or retrieves the one it has already initialized
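The same per-thread instance can serve both conversion directions mentioned earlier. The sketch below is one possible variant, assuming Java 8 or later for `ThreadLocal.withInitial`; the class name `ThreadLocalDateConverter`, the method names, and the fixed `Locale.US` (chosen only to make the output predictable) are assumptions, not part of the original listing.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper; the class name and the fixed Locale.US are illustrative assumptions.
final class ThreadLocalDateConverter {

    // One SimpleDateFormat per thread, created lazily on that thread's first call to get().
    private static final ThreadLocal<DateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy MM dd", Locale.US));

    /** String to Date, using the calling thread's own formatter. */
    static Date toDate(String text) throws ParseException {
        return FORMAT.get().parse(text);
    }

    /** Date to String, using the calling thread's own formatter. */
    static String toText(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) throws ParseException {
        Date date = new GregorianCalendar(2014, Calendar.MARCH, 19).getTime();
        String text = toText(date);
        System.out.println(text);                      // prints "2014 03 19"
        System.out.println(toDate(text).equals(date)); // prints "true"
    }
}
```

No locking is needed in either method, because no formatter instance is ever visible to more than one thread.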

Below we present a performance comparison chart of the three aforementioned approaches. Note that we tested the parsing functionality of the DateFormat utility class: we convert a String representation of a date to its Date object equivalent, according to a specific date format.

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_25k3us1fwg-GbOL_QJv-D_CxSL6Q6p9Jhub-wg5rB8Y09Vio9sqbGMnT_zryJz7H9o-dX3-Ere54ajPUyqclHUiQULxyMln1v5fHkurVvZLdR6q9ioQte99BhDBU9z4NRTIcQvoW8nM/s640/chart.png

The horizontal axis represents the test runs and the vertical axis the average transactions per second (TPS) for each test run, so higher values are better. As you can see, using Thread pools together with the ThreadLocal approach achieves superior performance compared to the “synchronization” and “getDateInstance(..)” approaches.
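For readers who want to take comparable measurements on their own hardware, a throughput figure of this kind could be gathered with a harness along the following lines. This is a hypothetical sketch and not the code behind the chart: the class name `ThroughputProbe` and its method are assumptions, it assumes Java 8 or later, and it does no warm-up, so thread start-up cost is included in the measured time.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical harness; not the benchmark used to produce the chart.
final class ThroughputProbe {

    /** Runs op repeatsPerWorker times on each of workerCount pooled threads; returns operations per second. */
    static double measureTps(int workerCount, int repeatsPerWorker, Runnable op) {
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        long start = System.nanoTime();
        for (int w = 0; w < workerCount; w++) {
            pool.execute(() -> {
                for (int r = 0; r < repeatsPerWorker; r++) {
                    op.run(); // an exception here ends this worker's loop early
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return (double) workerCount * repeatsPerWorker * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        ThreadLocal<DateFormat> df = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy MM dd"));
        // 200 workers x 1000 repeats mirrors the test configuration listed at the top of the article.
        double tps = measureTps(200, 1000, () -> {
            try {
                df.get().parse("2014 03 19");
            } catch (ParseException e) {
                throw new IllegalStateException(e);
            }
        });
        System.out.println("TPS: " + tps);
    }
}
```

Swapping the body of the lambda for the “synchronization” or “getDateInstance(..)” variant gives the other two data series; absolute numbers will differ from the chart depending on hardware and JVM.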

Lastly, let me point out that using the ThreadLocal approach without Thread pools is equivalent to using the “getDateInstance(..)” approach: every new Thread has to initialize its local DateFormat instance before using it, so a new DateFormat instance is created with every single execution.
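This last point can be checked by counting how often the ThreadLocal initializer runs. The sketch below is a minimal illustration, assuming Java 8 or later; the class name `FormatInstanceCounter`, its methods, and the task counts are hypothetical.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names and counts are illustrative only.
final class FormatInstanceCounter {

    private final AtomicInteger created = new AtomicInteger();

    // Counts how many times a thread had to build its own formatter.
    private final ThreadLocal<DateFormat> df = ThreadLocal.withInitial(() -> {
        created.incrementAndGet();
        return new SimpleDateFormat("yyyy MM dd");
    });

    private void touch() {
        df.get(); // stands in for a parse or format call
    }

    /** Formatter instances created when tasks run on a fixed pool of reused threads. */
    static int createdWithPool(int poolSize, int tasks) {
        FormatInstanceCounter counter = new FormatInstanceCounter();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(counter::touch);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        }
        return counter.created.get();
    }

    /** Formatter instances created when every task gets a brand-new thread. */
    static int createdWithNewThreads(int tasks) {
        FormatInstanceCounter counter = new FormatInstanceCounter();
        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            threads[i] = new Thread(counter::touch);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.created.get();
    }

    public static void main(String[] args) {
        System.out.println("pool of 4, 100 tasks: " + createdWithPool(4, 100));    // at most 4
        System.out.println("100 new threads:      " + createdWithNewThreads(100)); // prints 100
    }
}
```

With a pool, the number of formatters is bounded by the pool size no matter how many tasks run; with one thread per task, it grows with the number of tasks.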



Happy Coding!
