Sunday, May 13, 2007

Glassfish and Audit Logging

Sarbanes-Oxley aside, one of the concerns of a well-written enterprise application is the ability to track changes, including the who, what, and when of a transaction. In the past, developers used database triggers, as well as separate service interfaces, to generate the bread crumbs of audit logging records. Triggers are wonderful in that they fire automatically, but they have one drawback: they have difficulty identifying the user committing the transaction. In most cases, a servlet engine or application server logs into the database with a single account, and as far as the database is concerned, that account is the user of record for every transaction. A separate service interface for audit logging is undesirable for two reasons. First, it introduces redundant code. Second, it introduces coupling to a layer that leads to maintenance nightmares if it ever changes.

With current ORM technologies, however, developers have a centralized API to help them out. Specifically, I'll detail one approach for the Glassfish application server, which uses Oracle's open source implementation of JPA: Toplink Essentials. This approach leverages listeners in the EntityManager architecture, which are triggered on events such as insert, update, and delete. These callbacks can be assigned to arbitrary @Entity types, allowing the developer to write a single class that responds to these events and issues the desired logging statements.


First, we start with the domain objects, or entity beans. With entity beans, we basically have two choices: to inherit, or not to inherit. Simple entity-per-table models require more configuration effort, while an inheritance approach may affect performance. For the latter, the specific concern is with InheritanceType.JOINED, where the base table holds a row for every entity in the system. If the number of entities grows large (> 10^8), you will find that database performance suffers. So if you're tackling data domains with large numbers of records, think carefully about how you tread.

For my example, I'm going with a JOINED inheritance hierarchy. This will enable OOD throughout the application and, as we'll see, simplify Toplink configuration.

To start off, here's the base class @Entity. I've incorporated all of the attributes that will be common throughout the hierarchy, which you'll want to augment as needed.


/**
* FileName: BaseEntity.java
* Author: Julian Klappenbach
* Date Created: May 4, 2007
* Purpose: Base class for entity model
*
*/
package com.yourorg.jpa;

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.DiscriminatorColumn;
import javax.persistence.DiscriminatorType;
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.JoinColumn;
import javax.persistence.OneToOne;

@Entity
@Inheritance(strategy = InheritanceType.JOINED)
@DiscriminatorColumn(name="RESOURCETYPE", discriminatorType = DiscriminatorType.STRING, length = 20)
@DiscriminatorValue(value = "BaseEntity")
public class BaseEntity implements Serializable
{
private static final long serialVersionUID = 1L;
protected Integer id;
protected String title;
protected String description;
protected BaseEntity modifiedBy;

@Id
@GeneratedValue
public Integer getId()
{
return id;
}
public void setId(Integer id)
{
this.id = id;
}

@Column
public String getTitle()
{
return title;
}
public void setTitle(String title)
{
this.title = title;
}

@Column
public String getDescription()
{
return description;
}
public void setDescription(String description)
{
this.description = description;
}

@OneToOne
@JoinColumn(name = "MODIFIEDBYID")
public BaseEntity getModifiedBy()
{
return modifiedBy;
}
/**
* @param modifiedBy the modifiedBy to set
*/
public void setModifiedBy(BaseEntity modifiedBy)
{
this.modifiedBy = modifiedBy;
}

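/**
* Set hashCode to entity's ID
*/
public int hashCode()
{
// Guard against a null ID on entities that have not been persisted yet
return id == null ? 0 : id.intValue();
}

/**
* Assign equivalence based on entity's ID
*/
public boolean equals(Object obj)
{
if (obj instanceof BaseEntity)
{
// Compare Integer IDs by value; == would compare object references
Integer otherId = ((BaseEntity) obj).getId();
return id != null && id.equals(otherId);
}
return false;
}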
}

To build out the example, I'm going to add User to the hierarchy:


/**
* FileName: User.java
* Author: Julian Klappenbach
* Date Created: May 4, 2007
* Purpose: class defining User entity
*
*/
package com.yourorg.jpa.user;

import java.util.HashSet;
import java.util.Collection;

import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.ManyToMany;

import com.yourorg.jpa.BaseEntity;

@Entity
@DiscriminatorValue("User")
public class User extends BaseEntity
{
private static final long serialVersionUID = 1L;
protected String email = "";
protected String username = "";
protected String firstName = "";
protected String lastName = "";
protected String password = "";

@Column
public String getEmail()
{
return email;
}
public void setEmail(String email)
{
this.email = email;
}

@Column
public String getFirstName()
{
return firstName;
}
public void setFirstName(String firstName)
{
this.firstName = firstName;
}

@Column
public String getLastName()
{
return lastName;
}
public void setLastName(String lastName)
{
this.lastName = lastName;
}

@Column
public String getUserName()
{
return username;
}
public void setUserName(String username)
{
this.username = username;
}

@Column
public String getPassword()
{
return password;
}
public void setPassword(String password)
{
this.password = password;
}

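public boolean equals(Object o)
{
if (o instanceof User)
{
// Compare Integer IDs by value; == would compare object references
User u = (User) o;
return id != null && id.equals(u.getId());
}
return false;
}

public int hashCode()
{
// Guard against a null ID on entities that have not been persisted yet
return id == null ? 0 : id.intValue();
}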

public String toString()
{
return firstName + " " + lastName;
}
}


As a note: I won't go into the construction of the session beans managing these entities. See the Toplink JPA documentation if you have questions. We'll next define how we'll store information about each transaction. We have three operation types that we care about, INSERT, UPDATE, and DELETE. We also need to analyze the use cases for audit information. These range all the way from version interrogation and control, to simply tracking the actors involved in an operation. For this example, we're only concerned with the simple case of tracking. To store the information, we'll use an @Entity again:


/**
* FileName: AuditEntry.java
* Author: Julian Klappenbach
* Date Created: May 4, 2007
* Purpose: The definition of the AuditEntry entity
*
*/
package com.yourorg.jpa.audit;

import java.io.Serializable;
import java.util.Collection;
import java.util.Date;

import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;

@Entity
public class AuditEntry implements Serializable
{
private static final long serialVersionUID = 1L;
public static final String UPDATE_OPERATION = "UPDATE";
public static final String INSERT_OPERATION = "INSERT";
public static final String DELETE_OPERATION = "DELETE";

protected Integer id;
protected Integer resourceEntityId;
protected String operation;
protected Date operationTime;
protected Collection<AuditField> fields;

@Id
@GeneratedValue
public Integer getId()
{
return id;
}
public void setId(Integer id)
{
this.id = id;
}

@Column
public Integer getResourceEntityId()
{
return resourceEntityId;
}
public void setResourceEntityId(Integer resourceEntityId)
{
this.resourceEntityId = resourceEntityId;
}

@Column
public String getOperation()
{
return operation;
}
public void setOperation(String action)
{
this.operation = action;
}

@Column
// TIMESTAMP keeps both date and time; TIME would drop the date portion
@Temporal(value = TemporalType.TIMESTAMP)
public Date getOperationTime()
{
return operationTime;
}
public void setOperationTime(Date operationTime)
{
this.operationTime = operationTime;
}

@OneToMany(cascade = { CascadeType.ALL }, fetch = FetchType.EAGER, mappedBy = "auditEntry")
public Collection<AuditField> getFields()
{
return fields;
}
public void setFields(Collection<AuditField> fields)
{
this.fields = fields;
}

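/**
* Set hashCode to entity's ID
*/
public int hashCode()
{
// Guard against a null ID on entries that have not been persisted yet
return id == null ? 0 : id.intValue();
}

/**
* Assign equivalence based on entity's ID
*/
public boolean equals(Object obj)
{
if (obj instanceof AuditEntry)
{
// Compare Integer IDs by value; == would compare object references
Integer otherId = ((AuditEntry) obj).getId();
return id != null && id.equals(otherId);
}
return false;
}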
}
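
The three operation types are stored above as plain String constants. As a minimal sketch of an alternative, assuming you would rather have the compiler reject an unknown operation, an enum can carry the same three values. The names AuditOperation and fromStored are hypothetical and not part of the original code; the enum's name() values match the strings that AuditEntry stores.

```java
import java.util.Locale;

// Hypothetical enum; the article itself uses the String constants on AuditEntry.
enum AuditOperation {
    INSERT, UPDATE, DELETE;

    /** Parses a stored operation string such as "UPDATE", ignoring case and padding. */
    static AuditOperation fromStored(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}

class AuditOperationDemo {
    public static void main(String[] args) {
        // name() yields the same text as AuditEntry.UPDATE_OPERATION
        AuditOperation op = AuditOperation.fromStored(" update ");
        System.out.println(op.name());
    }
}
```

The entity could keep its String column and convert at the edges with name() and fromStored, so the table layout would not need to change.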

In addition to knowing general information about the audit event, we'll also want to store information about the entity attributes that were changed during a transaction. For that, I've defined an AuditField entity:


/**
* FileName: AuditField.java
* Author: julian
* Date Created: May 4, 2007
* Purpose: Store information about fields changed in a transaction
*/
package com.yourorg.jpa.audit;

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;

@Entity
public class AuditField implements Serializable
{
private static final long serialVersionUID = 1L;
private Integer id;
private String fieldName;
private String fieldValue;
private AuditEntry auditEntry;

@Id
@GeneratedValue
public Integer getId()
{
return id;
}
public void setId(Integer id)
{
this.id = id;
}

@Column
public String getFieldName()
{
return fieldName;
}
public void setFieldName(String fieldName)
{
this.fieldName = fieldName;
}

@Column
public String getFieldValue()
{
return fieldValue;
}
public void setFieldValue(String fieldValue)
{
this.fieldValue = fieldValue;
}

@ManyToOne
@JoinColumn(name = "AUDITENTRYID")
public AuditEntry getAuditEntry()
{
return auditEntry;
}
public void setAuditEntry(AuditEntry auditEntry)
{
this.auditEntry = auditEntry;
}
}
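
To make the idea of "changed attributes" concrete, here is a small plain-Java sketch that compares two snapshots of an entity's attributes and produces one name/value pair per changed attribute, the same shape that AuditField stores. This is an illustration only: Toplink supplies the change set itself, and the FieldChangeSketch, Change, and diff names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class FieldChangeSketch {
    /** Hypothetical value holder mirroring AuditField's fieldName/fieldValue pair. */
    static final class Change {
        final String fieldName;
        final String fieldValue;

        Change(String fieldName, String fieldValue) {
            this.fieldName = fieldName;
            this.fieldValue = fieldValue;
        }
    }

    /** Returns one Change per attribute in 'after' whose value differs from 'before'. */
    static List<Change> diff(Map<String, Object> before, Map<String, Object> after) {
        List<Change> changes = new ArrayList<Change>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            Object oldValue = before.get(e.getKey());
            if (!Objects.equals(oldValue, e.getValue())) {
                // String.valueOf turns a null new value into the text "null"
                changes.add(new Change(e.getKey(), String.valueOf(e.getValue())));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Object> before = new LinkedHashMap<String, Object>();
        before.put("email", "old@example.com");
        before.put("lastName", "Smith");
        Map<String, Object> after = new LinkedHashMap<String, Object>(before);
        after.put("email", "new@example.com");
        for (Change c : diff(before, after)) {
            System.out.println(c.fieldName + " -> " + c.fieldValue);
        }
    }
}
```

Only the new value is recorded here, which matches the AuditField entity above; recording the old value as well would need an extra column.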

Now that I have the entities defined, let's take a look at the listener implementation that we'll use to intercept EntityManager events:


/**
* FileName: AuditHandler.java
* Author: Julian Klappenbach
* Date Created: May 4, 2007
* Purpose: The listener implementation for auditing on EntityManager callbacks
*/
package com.yourorg.jpa.audit;

import java.util.Collection;
import java.util.Date;
import java.util.LinkedList;
import java.util.Vector;

import oracle.toplink.essentials.changesets.DirectToFieldChangeRecord;
import oracle.toplink.essentials.descriptors.ClassDescriptor;
import oracle.toplink.essentials.descriptors.DescriptorEvent;
import oracle.toplink.essentials.descriptors.DescriptorEventAdapter;
import oracle.toplink.essentials.queryframework.InsertObjectQuery;
import oracle.toplink.essentials.queryframework.WriteObjectQuery;
import oracle.toplink.essentials.tools.sessionconfiguration.DescriptorCustomizer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class AuditHandler extends DescriptorEventAdapter implements
DescriptorCustomizer
{
private static final Log log = LogFactory.getLog(AuditHandler.class);

/**
* This method "customizes" the DescriptorEventManager, adding this listener to be invoked
* for events
*/
public void customize(ClassDescriptor classDescriptor) throws Exception
{
classDescriptor.getDescriptorEventManager().addListener(this);
}

/** (non-Javadoc)
* @see oracle.toplink.essentials.descriptors.DescriptorEventAdapter#postDelete(oracle.toplink.essentials.descriptors.DescriptorEvent)
*/
@Override
public void postDelete(DescriptorEvent event)
{
AuditEntry entry = new AuditEntry();
entry.setOperation(AuditEntry.DELETE_OPERATION);
entry.setOperationTime(new Date());
entry.setResourceEntityId(event.getSource().hashCode());
InsertObjectQuery insertQuery = new InsertObjectQuery(entry);
event.getSession().executeQuery(insertQuery);
}

/** (non-Javadoc)
* @see oracle.toplink.essentials.descriptors.DescriptorEventAdapter#postInsert(oracle.toplink.essentials.descriptors.DescriptorEvent)
*/
@Override
public void postInsert(DescriptorEvent event)
{
processWriteEvent(event);
}

/** (non-Javadoc)
* @see oracle.toplink.essentials.descriptors.DescriptorEventAdapter#postUpdate(oracle.toplink.essentials.descriptors.DescriptorEvent)
*/
@Override
public void postUpdate(DescriptorEvent event)
{
processWriteEvent(event);
}

/**
* Common method to handle both Update and Insert events. Fortunately, the Toplink libs
* are amenable to this.
* @param event The DescriptorEvent to process
*/
protected void processWriteEvent(DescriptorEvent event)
{
AuditEntry entry = new AuditEntry();
// Event code 7 is treated as the post-update event; anything else reaching here is an insert
entry.setOperation(event.getEventCode() == 7 ? AuditEntry.UPDATE_OPERATION :
AuditEntry.INSERT_OPERATION);
entry.setOperationTime(new Date());
entry.setResourceEntityId(event.getSource().hashCode());

// Typed collection so the for-each loop below compiles
Collection<AuditField> fields = new LinkedList<AuditField>();
WriteObjectQuery query = (WriteObjectQuery) event.getQuery();
Vector changes = query.getObjectChangeSet().getChanges();
for (int i = 0; i < changes.size(); i++)
{
if (changes.elementAt(i) instanceof DirectToFieldChangeRecord)
{
DirectToFieldChangeRecord fieldChange = (DirectToFieldChangeRecord) changes.elementAt(i);
AuditField field = new AuditField();
field.setAuditEntry(entry);
field.setFieldName(fieldChange.getAttribute());
// String.valueOf avoids a NullPointerException when a field is set to null
field.setFieldValue(String.valueOf(fieldChange.getNewValue()));
fields.add(field);
}
}
entry.setFields(fields);

// Persist the entry first, then each field explicitly (see the notes below)
InsertObjectQuery insertQuery = new InsertObjectQuery(entry);
event.getSession().executeQuery(insertQuery);

for (AuditField field : fields)
{
insertQuery = new InsertObjectQuery(field);
event.getSession().executeQuery(insertQuery);
}
}
}
}

A few notes here. First, the implementation only handles insert, update, and delete events. It's capable of much more, and all that is required is the @Override of the remaining DescriptorEventAdapter methods. I've chosen post* events, as this ensures that records have already been written before the audit entry is processed.

Second, the implementation relies on the override of the Entity's hashCode method to produce the ID attribute. This override (as well as equals) is a good idea in general, and essential for presentation frameworks like Wicket, which have components that rely on the equals contract to assess the equivalence of detached domain objects.

Third, even though I've gone through the effort of defining the annotations to map the one-to-many relationship of AuditEntry to AuditField, the low-level persistence API that I'm using to execute queries does not observe the relationship. Therefore, I'm explicitly making calls to persist the AuditField records. These annotations will come in handy, however, if I later choose to code up a user interface for AuditEntry records in my application.
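
Since the listener reads the entity ID through hashCode, the override has to behave well for IDs outside the small-integer cache and for entities without an ID yet. The following is a minimal sketch of a null-safe, value-based version, assuming an Integer ID as in the entities above. The class name IdentifiedSketch is hypothetical and stands in for any entity.

```java
class IdentifiedSketch {
    private final Integer id;

    IdentifiedSketch(Integer id) {
        this.id = id;
    }

    Integer getId() {
        return id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof IdentifiedSketch)) {
            return false;
        }
        // equals() compares Integer values; == would compare object references
        Integer otherId = ((IdentifiedSketch) obj).getId();
        return id != null && id.equals(otherId);
    }

    @Override
    public int hashCode() {
        // A missing ID maps to 0 instead of throwing on auto-unboxing
        return id == null ? 0 : id.intValue();
    }

    public static void main(String[] args) {
        IdentifiedSketch a = new IdentifiedSketch(1000);
        IdentifiedSketch b = new IdentifiedSketch(1000);
        System.out.println(a.equals(b));
        System.out.println(a.hashCode());
    }
}
```

In this sketch, two distinct instances without an ID are never equal to each other, which is one common convention for unsaved entities; whether that suits a given application is a design decision.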

Now that we have our listener implementation, we're almost there. We just need to tell TopLink where to find it. We do this through the persistence.xml descriptor, where we'll include property elements that associate our entities with the listener.


The property takes the following form:


<property name="toplink.descriptor.customizer.*" value="[Listener Implementation Classname]"/>


The name of the property follows a naming convention, where "*" is the simple name for the @Entity you wish to have mapped. Because I've used inheritance, I only need to create a mapping for my base class. Otherwise, I would need to create an explicit mapping for each entity I would want to have audited. The following is the full persistence.xml definition:


<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="1.0">
    <persistence-unit name="PrototypePU" transaction-type="JTA">
        <jta-data-source>jdbc/PrototypeDS</jta-data-source>
        <properties>
            <property name="toplink.logging.level" value="FINE" />
            <property name="toplink.target-database" value="MySQL4" />
            <property name="toplink.ddl-generation" value="none" />
            <property name="toplink.descriptor.customizer.BaseEntity" value="com.yourorg.jpa.audit.AuditHandler" />
        </properties>
    </persistence-unit>
</persistence>
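
The property naming convention described above can be sketched as a small helper that derives the property name from an entity class. This is only an illustration of the convention as stated in this article (prefix plus the entity's simple name); the CustomizerPropertySketch class and its propertyName method are hypothetical, and the nested BaseEntity is an empty stand-in for the real entity.

```java
class CustomizerPropertySketch {
    // Prefix taken from the persistence.xml shown above
    static final String PREFIX = "toplink.descriptor.customizer.";

    /** Empty stand-in for the real com.yourorg.jpa.BaseEntity. */
    static class BaseEntity {
    }

    /** Builds the property name for an entity, using its simple class name. */
    static String propertyName(Class<?> entityClass) {
        return PREFIX + entityClass.getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(propertyName(BaseEntity.class));
    }
}
```

A helper like this is mostly useful when the properties are assembled programmatically, for example in a test that builds the persistence unit from a map rather than from persistence.xml.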

In this article, I have covered a complete solution to audit logging for Glassfish using the default Toplink JPA architecture. Though the approach adds some overhead, it is an efficient one: the listener API enables the detection of dirty fields on entity objects, efficient access to the persistence data source, and fine-grained control over the entities to be audited. Some concerns remain unaddressed, such as batch operations, but this is a clear choice for JPA applications.

5 comments:

Wandrey said...

Great work.
Tanks a lot.

lorenzo said...

Thanks for that great post.

I'm looking for something similar that uses JPA with Hibernate. Any help would be greatly appreciated.

I'm thinking that if I use JPA only, I would end up creating @PostXXX entity listeners. But I would like to have access to the list of changed field values, similar to what you've presented (using Oracle Toplink's DescriptorEventAdapter).

Cheers

Anonymous said...

lorenzo,

i've been looking for a hibernate auditing listener as well ... any luck since your post?

Stig said...

I have not used it myself, but I guess Hibernate's event system could be used for this:

http://www.hibernate.org/hib_docs/reference/en/html/events.html

Anonymous said...

Thanks for the post.

Have you considered a generic way to wire-in a User entity that was responsible for the event?

i.e. some businesses may require to audit the logged in user that have initiated the event ...

Rares
