Friday, April 17, 2009

Parsing with JavaCC

I am currently working on deciphering a message with a fixed format and filling in some DTOs (data transfer objects), which are then sent to be stored in the database. At first I tried Java regex for the task, but it quickly got too complicated and the code was hard to read, which would cause maintenance problems in the future. After searching the net for a few minutes I stumbled upon JavaCC, a library originally developed by Sun, which is basically a lexical analyser and parser generator. After going through some samples I was able to figure out the expressions it uses, decipher the message, and store the relevant data in the respective DTO attributes to be sent to the DB.
 
Below is the sample grammar I used in the early stages to try out a HelloWorld kind of scenario.
 

options {
    // Generate instance (non-static) parser methods so a new parser can be created per input.
    STATIC = false;
}

PARSER_BEGIN(FLW)

import java.io.*;

public class FLW {

    public static void main(String[] args) {
        // Format: 4-letter first name, space, address (2 digits, comma, 10 letters), space, 11-letter last name.
        CustomerDTO dtos = getDTO("xxxx 40,dddddddddd eeeeeeeeeer");
        System.out.println("First Name: " + dtos.getFName());
        System.out.println("Last Name: " + dtos.getLName());
        System.out.println("Address : " + dtos.getAddress());
    }

    // Parses the message and returns the populated DTO, or null if parsing fails.
    static CustomerDTO getDTO(String inString) {
        Reader reader = new StringReader(inString);
        FLW parser = new FLW(reader);
        try {
            return parser.parse();
        } catch (Exception e) {
            System.out.println("exception");
            e.printStackTrace();
        }
        return null;
    }
}

PARSER_END(FLW)

// Tokens marked with # are private helpers: other tokens can use them, but the parser never sees them.
TOKEN : { < SPACE : " " > }
TOKEN : { < #COMMA : "," > }
TOKEN : { < FIRST_NAME : (<LETTER>){4} > }
TOKEN : { < ADDRESS : (<NUMBER>){2} (<COMMA>){1} (<LETTER>){10} > }
TOKEN : { < LAST_NAME : (<LETTER>){11} > }
TOKEN : { < #LETTER : ["a"-"z", "A"-"Z"] > }
TOKEN : { < #NUMBER : ["0"-"9"] > }

// Matches FIRST_NAME SPACE ADDRESS SPACE LAST_NAME and copies each token's text into the DTO.
CustomerDTO parse() :
{
    Token fName;
    Token lName;
    Token address;
    CustomerDTO cusDTO = new CustomerDTO();
}
{
    fName = <FIRST_NAME>
    { cusDTO.setFName(fName.image); }
    <SPACE>
    address = <ADDRESS>
    { cusDTO.setAddress(address.image); }
    <SPACE>
    lName = <LAST_NAME>
    { cusDTO.setLName(lName.image); }
    { return cusDTO; }
}

 

Note that you first have to install JavaCC, create the CustomerDTO class (in the example above it lives in the default package, so no import is needed), and save the grammar in a file named xxx.jj. What the above code basically does is break down the string message passed in from the main method and put the relevant data in the relevant attributes of the DTO. Note that JavaCC predefines the EOF (end of file) token, so you do not have to declare it yourself. Hope this helps anyone who is looking at how to use JavaCC for such a scenario.
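
The CustomerDTO class itself is not shown in this post. As a rough sketch, assuming it is nothing more than a plain bean with three String fields, something like the following would satisfy the calls the grammar makes (setFName, setAddress, setLName and the matching getters). The field names are my own guess; only the accessor names come from the grammar above.

```java
// Minimal sketch of the DTO used by the grammar (an assumption, not the original class).
// Package-private is enough here because the generated parser is in the same default package.
class CustomerDTO {

    // Hypothetical field names; only the accessor names are taken from the grammar.
    private String fName;
    private String lName;
    private String address;

    public String getFName() {
        return fName;
    }

    public void setFName(String fName) {
        this.fName = fName;
    }

    public String getLName() {
        return lName;
    }

    public void setLName(String lName) {
        this.lName = lName;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}
```

A real DTO would probably carry more attributes and split the address into separate fields, but this is enough to compile and run the HelloWorld grammar.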