Thursday, April 16, 2015

Hibernate Interview Questions

What are Caching types for Hibernate ?

First-Level Cache: First-Level Cache is associated with session and always works within Session scope. Hibernate uses First-Level Cache by default and on a per transaction basis (within a single transaction boundary).
When multiple updates are issued on an object, Hibernate delays the update as long as possible to reduce the number of UPDATE SQL statements issued. That is, instead of updating after every modification done in the transaction, it updates the object only once at the end of the transaction. When the session is closed, all the cached objects are lost, after being persisted or updated to the database.

Second-Level Cache: The second-level cache is always associated with the session factory and works within SessionFactory scope. Hence it is available to all sessions created using that particular session factory. The second-level cache is an optional cache, and the first-level cache will always be consulted before any attempt is made to locate an object in the second-level cache. When an entity is not present in the first-level cache but is found in the second-level cache, it is stored in the first-level cache for the next invocation. If the entity is not found in either the first-level or the second-level cache, then a database query is executed and the entity is stored in both cache levels before being returned as the response of the load() method. The second-level cache can be configured on a per-class and per-collection basis and is mainly responsible for caching objects across sessions. An org.hibernate.cache.CacheProvider interface is provided, which is implemented to plug a third-party cache implementation into Hibernate. Second-level cache configuration requires a concurrency strategy (Transactional, Read-write, Nonstrict-read-write, Read-only), cache expiration and physical cache attributes, using the cache provider e.g. EHCache.
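As a sketch, the second-level cache can be enabled per class with annotations (the Employee entity and the EHCache region factory class here are illustrative assumptions):

```java
// hibernate.cfg.xml must also enable the cache, e.g.:
//   hibernate.cache.use_second_level_cache=true
//   hibernate.cache.region.factory_class=org.hibernate.cache.ehcache.EhCacheRegionFactory
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE) // concurrency strategy for this class
public class Employee {
    @Id
    private Long id;
    private String name;
    // getters and setters omitted
}
```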

Query-Level Cache: Hibernate implements a cache for query resultsets that integrates closely with the second-level cache. This is optional and requires two additional physical cache regions that hold the cached query results and the timestamps when a table was last updated. It is useful for queries that are run frequently with the same parameters. It is activated by using the hibernate.cache.use_query_cache="true" property in the configuration file.
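For illustration, a query opts into the query cache by being marked cacheable (the HQL and the Employee entity are assumed for the example):

```java
List<Employee> employees = session
        .createQuery("from Employee e where e.department = :dept")
        .setParameter("dept", "sales")
        .setCacheable(true) // consult and populate the query cache region
        .list();
```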

What is lazy loading ?
Whenever a relationship is defined in Hibernate, the fetch type parameter decides whether to load all of the relationships of a particular object/table as soon as the object/table is initially fetched. Lazy loading is enabled by default, hence child objects are not loaded while loading the parent object, unless they are explicitly invoked in the application by calling the getChild() method on the parent. When lazy loading is set to false, hibernate loads the children while loading the parent from the database. Lazy loading is configured in the respective hibernate mapping file of the parent class. Hibernate loads a child object automatically when it is accessed from the parent object. Lazy loading helps to improve performance significantly by preventing the loading of child objects when they are not needed. The default behavior is to load property values eagerly and to load collections lazily. Every relationship that ends with @Many (@OneToMany and @ManyToMany) is lazily loaded by default, while every relationship that ends with @One (@ManyToOne and @OneToOne) and every basic field (e.g. String, int, double) is eagerly loaded by default.
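The defaults described above can be sketched with annotations (Parent, Child and Company are hypothetical names):

```java
import java.util.Set;
import javax.persistence.*;

@Entity
public class Parent {
    @Id
    private Long id;

    // collection associations (@OneToMany, @ManyToMany) are LAZY by default
    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY)
    private Set<Child> children;

    // single-valued associations (@ManyToOne, @OneToOne) are EAGER by default
    @ManyToOne(fetch = FetchType.EAGER)
    private Company company;
}
```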

What is n+1 select problem ?
Hibernate by default will not load all the children while accessing the collection, as lazy loading is set to true. Instead, it loads each child individually with an individual query. While iterating over the collection, this causes a query to be executed for every child. The problem is that each query has quite a bit of overhead: it is much faster to issue 1 query which returns 100 results than to issue 100 queries which each return 1 result. The N+1 query problem can be fixed by batching the queries, or by fetching the collection eagerly with a join so that all the data is loaded before iterating through it.
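One hedged way to avoid the extra queries is to fetch the collection in the same select with a join fetch (the HQL and entity names are illustrative):

```java
// a single query loads the parents together with their children,
// instead of 1 query for the parents plus N queries for the collections
List<Parent> parents = session
        .createQuery("select distinct p from Parent p join fetch p.children")
        .list();
```

Alternatively, a batch-size setting on the collection lets Hibernate load the children of several parents per query instead of one parent at a time.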

When is an exception thrown during the lazy loading ?
Whenever a lazily loaded collection (or property) is accessed after the session has been closed, a lazy initialization exception occurs. The exception occurs due to the lack of an open database connection. When SessionFactory.getCurrentSession() is used with the default ThreadLocalSessionContext, it defaults to closing the session on commit.

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: test.account.UserAccount.mailAccounts, no session or session was closed

The easiest and fastest way to bring a lazy list when the object is loaded is by using the "@OneToMany(fetch = FetchType.EAGER)" annotation or setting lazy to false. This approach, though, impacts server performance.

If the session life cycle is managed with sessionFactory.openSession() and session.close(), the lazy load works fine within the session life cycle, outside the transaction boundaries. The problem of detached or closed session with the collection object can be solved by re-attaching the object to the current session using session.update(object), before loading the collection.
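A minimal sketch of this re-attachment (the Parent entity and its identifier are assumed):

```java
Session session = sessionFactory.openSession();
session.update(parent);        // re-attach the detached instance to the new session
parent.getChildren().size();   // the lazy collection now loads within the open session
session.close();
```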

Another approach is to leave the database connection open until the end of the user request. Such an approach involves adding an (OpenSessionInView) filter which receives all the user requests and opens/closes the database connection. The major issue of this approach is the N+1 effect.

Difference between Session methods clear(), evict() and close() ?

Session.evict(): It removes the given instance from the session cache. Changes to the instance will not be synchronized with the database. This operation cascades to associated instances if the association is mapped with cascade="evict". It doesn't evict related entities if the relationship is configured without CascadeType.DETACH or CascadeType.ALL.

Session.clear(): It evicts all loaded instances and cancels all pending saves, updates and deletions. Invoking session.clear() detaches all entity objects from the session and is similar to calling evict() on every object associated with the session.

Session.close(): Ends the session by releasing the JDBC connection and cleaning up. It is not strictly necessary to close the session but you must at least disconnect() it.

Session.flush(): Synchronizes the underlying persistent store with the persistable state held in memory. Must be called at the end of a unit of work, before committing the transaction and closing the session (depending on flush-mode, Transaction.commit() calls this method).

Difference between Session methods save(), saveOrUpdate() and persist() ?

Session.save(): Save does an INSERT and will fail if the primary key is already persistent. It persists the given transient instance, first assigning a generated identifier (or using the current value of the identifier property if the assigned generator is used). This operation cascades to associated instances if the association is mapped with cascade="save-update". In summary, the save() method saves the record into the database with an INSERT SQL query, generates a new identifier and returns the Serializable identifier back.

Session.saveOrUpdate(): saveOrUpdate() does a SELECT first to determine whether it needs to do an INSERT or an UPDATE: it inserts the data if the primary key does not exist, otherwise it updates the data. Hence the saveOrUpdate() method either INSERTs or UPDATEs based upon the existence of the object in the database: if the persistent object already exists in the database then an UPDATE SQL is executed, else an INSERT is executed.

Session.persist(): Does the same as save(), but while save() returns the generated identifier (a Serializable object), persist() returns void.
The persist() method doesn't guarantee that the identifier value will be assigned to the persistent instance immediately; the assignment might happen at flush time.
The persist() method guarantees that it will not execute an INSERT statement if it is called outside of transaction boundaries. The save() method does not guarantee the same: it returns an identifier, and if an INSERT has to be executed to get the identifier, this INSERT happens immediately, no matter whether it is inside or outside of a transaction.
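The contrast can be sketched as follows (the employee objects are hypothetical):

```java
// save(): returns the generated identifier; the INSERT may run immediately
Serializable id = session.save(employee);

// persist(): returns void; the INSERT is deferred until flush and is
// guaranteed not to run outside of transaction boundaries
session.persist(anotherEmployee);
```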

Difference between Session methods get() and load() ?
Session.get(): Returns the persistent instance of the given entity class with the given identifier, or null if there is no such persistent instance.

Session.load(): Returns the persistent instance of the given entity class with the given identifier, assuming that the instance exists. It might return a proxied instance that is initialized on demand, when a non-identifier method is accessed.

Some of the differences between get and load methods are as follows:
  • The get method returns null if the object is not found in the cache as well as the database, while the load() method throws ObjectNotFoundException if the object is not found, but never returns null.
  • The get method always hits the database (unless the object is already cached), while the load() method may not hit the database at all, since it can return a proxy holding just the identifier.
  • The get method never returns a proxy; it either returns null or a fully initialized object. The load() method may return a proxy, which is an object with the ID set but with its other properties uninitialized until lazily loaded. If the returned object is only needed for creating a relationship and only the Id is required, then load() is the way to go.
  • The get method will return a completely initialized object if the object is not in the cache but exists in the database, which may involve multiple round-trips to the database based upon the object relational mappings, while the load() method can return a proxy which is initialized on demand (lazy initialization) when a non-identifier method is accessed. For this reason, use of the load method results in slightly better performance, but there is a caveat: the proxy object will throw ObjectNotFoundException later if the corresponding row doesn't exist in the database, instead of failing immediately, so it is not fail-fast behavior.
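A short sketch of the behavioral difference (the Employee entity and identifiers are assumed):

```java
Employee e1 = (Employee) session.get(Employee.class, 1L);  // hits the database; null if no row
Employee e2 = (Employee) session.load(Employee.class, 2L); // may return an uninitialized proxy
Long id = e2.getId();   // typically no database hit: the identifier is held by the proxy
e2.getName();           // triggers initialization; throws ObjectNotFoundException if no row
```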

What is the difference between getCurrentSession() and openSession() ?

The SessionFactory.openSession() method always opens a new session which is not bound to any context; the session returned from openSession() is expected to be flushed and closed by the user. The SessionFactory.getCurrentSession() method returns the session bound to the current context (e.g. the current thread when a thread-based session context is configured), opening a new one if none exists. The Session obtained from getCurrentSession() is flushed and closed automatically when the transaction ends.

What is the difference between update() and merge() ? How to reattach a detached entity object ?

Session.update(): Updates the persistent instance with the identifier of the given detached instance. If there is a persistent instance with the same identifier, an exception is thrown. It is used to modify the persistent object from inside the session only when the session doesn't contain an already persistent state with the same id. 

Session.merge(): Copies the state of the given object onto the persistent object with the same identifier. If there is no persistent instance currently associated with the session, it will be loaded. If the given instance is unsaved, a copy of the instance is saved and returned as a newly persistent instance. Merge does not associate the given instance with the session. merge() should be used if the state of the session is unknown and we want to make modifications at any time.

After closing a session, its corresponding entities go into the detached state. A detached entity is no longer in the session cache, and calling the update() method on it in a new session that has already loaded an instance with the same identifier will throw an error. In order to reattach the entity, open another session and load the same entity instance (say e2) in it; then, on calling merge() with the detached object e1 in that session, the changes of e1 will be merged into e2.
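The scenario can be sketched as below (the Employee entity is an assumption):

```java
Session s1 = sessionFactory.openSession();
Employee e1 = (Employee) s1.get(Employee.class, 1L);
s1.close();                        // e1 is now detached

e1.setName("updated");

Session s2 = sessionFactory.openSession();
Employee e2 = (Employee) s2.get(Employee.class, 1L);  // s2 already holds an instance
// s2.update(e1) would fail here with NonUniqueObjectException
Employee merged = (Employee) s2.merge(e1); // copies e1's state onto e2 and returns it
```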

When is the Session.lock() method used ? How does it differ with Session.update() method ?
The lock method obtains the specified lock level based upon the given object. This may be used to perform a version check (LockMode.READ), to upgrade to a pessimistic lock (LockMode.PESSIMISTIC_WRITE), or to simply reassociate a transient instance with a session (LockMode.NONE). The lock method is deprecated and instead buildLockRequest(LockMode).lock(object) method should be called.

Both the update() and the lock() methods can be used to reattach a detached object. The session.lock() method simply reattaches the object to the session without checking or updating the database, on the assumption that the database is in sync with the detached object. It is best practice to use Session.update(..); Session.lock() should be used only if we are absolutely sure that the detached object is in sync with the database, or if it does not matter because we will be overwriting all the columns that could have changed later on within the same transaction.

Note: When reattaching detached objects, it should be made sure that the dependent objects are reattached as well.

What is the difference between inverse and cascade ?
Inverse decides which side is the relationship owner that manages the relationship (insert or update of the foreign key column), i.e. it determines which side will update the foreign key. When inverse is true, the relationship owner is the child table; otherwise it is the parent. By default inverse is false, which means the parent is the relationship owner. The inverse keyword is always declared in one-to-many and many-to-many relationships (many-to-one doesn't have the inverse keyword). Since inverse defines which side is the owner that maintains the relationship, it can also be thought of as "relationship_owner".
Consider the entities Stock and StockRecords with a one-to-many relationship. Now, if a save or update operation is performed on a "Stock" object, should it update the "StockRecords" relationship ? The answer to this question depends on the inverse value.
<class name="com.common.Stock" table="stock" ...>
    <set name="stockRecords" table="stock_record" inverse="{true/false}" fetch="select">
        <key>
            <column name="STOCK_ID" not-null="true" />
        </key>
        <one-to-many class="com.common.StockRecord" />
    </set>
</class>
In the above example, when inverse="true", it means "StockRecord" is the relationship owner, so Stock will NOT update the relationship. On the other hand, when inverse="false" (the default), it means "Stock" is the relationship owner, and Stock will update the relationship.

Cascade decides whether, after a save, update or delete operation is performed on an entity, the same operation needs to be called on the associated entities. When the cascade value is save-update, hibernate saves a new instance or changes to a persistent parent object and also its associated object(s). When the cascade value is delete, hibernate also deletes the related entity or entities when the parent is deleted. When cascade is delete-orphan, hibernate deletes a persistent object that has been removed from the association, for example a one-to-many collection. With annotations, the values CascadeType.SAVE_UPDATE (save, update) and CascadeType.REMOVE (delete) are declared in the @Cascade annotation.

What is the difference between optimistic and pessimistic locking ?
Optimistic locking assumes that multiple transactions can complete without affecting each other, and hence the transactions proceed without any locking of the data resources. Before committing, each transaction verifies that no other transaction has modified its data. If the check reveals conflicting modifications, the committing transaction will roll back.
     When long conversations span several database transactions, versioning data can be stored to ensure that if the same entity is updated by two conversations, the later commit is informed of the conflict. This scales well for applications where reads are frequent but writes are occasional.
Hibernate provides two different mechanisms for storing versioning information, a dedicated version number or a timestamp. JPA/Hibernate optimistic locking works by using some field to store the last modified version and then comparing the version of the entity in the session with the entity in the database to see if the change can be saved. Usually a field with the @Version annotation is added to the entity.
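A minimal sketch of a versioned entity (the Account entity is illustrative):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Account {
    @Id
    private Long id;

    @Version             // incremented by Hibernate on every update; a stale
    private int version; // value causes StaleObjectStateException at flush time

    private double balance;
}
```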

Pessimistic locking assumes that concurrent transactions will conflict with each other, and requires resources to be locked after they are read and only unlocked after the application has finished using the data. Database locking is handled by specifying an isolation level for the JDBC connections or by acquiring exclusive pessimistic locks.
      The LockMode class defines the different lock levels as below that can be acquired by Hibernate to obtain exclusive pessimistic locks.
  • LockMode.WRITE is acquired automatically when Hibernate updates or inserts a row.
  • LockMode.UPGRADE can be acquired upon explicit user request using SELECT ... FOR UPDATE on databases which support that syntax.
  • LockMode.UPGRADE_NOWAIT can be acquired upon explicit user request using a SELECT ... FOR UPDATE NOWAIT under Oracle.
  • LockMode.READ is acquired automatically when Hibernate reads data under Repeatable Read or Serializable isolation level. It can be re-acquired by explicit user request.
  • LockMode.NONE represents the absence of a lock. All objects switch to this lock mode at the end of a Transaction. Objects associated with the session via a call to update() or saveOrUpdate() also start out in this lock mode.

What is the difference between hibernate and jdbc ?
  • Hibernate is data base independent, hence hibernate queries will work for many databases including ORACLE, MySQL, SQLServer etc. JDBC queries on the other hand must be database specific.
  • As Hibernate is set of Objects, there is no need to learn SQL language, only Java knowledge is needed. JDBC though does require knowledge of SQL for corresponding database.
  • Query tuning is not needed in case of Hibernate. When Criteria queries are used, hibernate automatically tunes the queries and returns the best result with good performance. JDBC needs manual query tuning but provides more control over the execution of the queries.
  • Hibernate provides benefit of caching, as it supports two levels of cache, First level and 2nd level cache, thus providing better performance. JDBC on the other hand does not provide built in caching and needs third party or custom implementation for caching.
  • Hibernate supports Query cache and It will provide the statistics about the query and database status. JDBC does not provide any statistics.
  • Development is faster with Hibernate, since writing queries can be eliminated by loading/saving entity objects, which is not the case for JDBC.
  • Hibernate supports variety of connection pooling mechanisms including hibernate's built in c3p0, Apache DBCP and Proxool. JDBC on the other hand requires custom implementation or third party integration of connection pool.
  • Hibernate provides all the relationships one-many, many-many etc between tables in Entity classes using annotations or in HBM files for easy readability. JDBC has no such view of relationships which are visible only at the actual database level.
  • Hibernate supports lazy loading of objects, enabling faster initial loading, as well as eager pre-loading of all objects. JDBC does not have such support.
  • Hibernate provides optimistic concurrency control with versioning, where a version number is added to the entity to check, while updating the persistent object, that the row has not been updated since it was last retrieved. JDBC does not provide such a feature, and concurrency has to be handled by a custom implementation.

What are the states of object in hibernate ?
There are 3 states of object (instance) in hibernate.
  • Transient: The object is in the transient state if it has just been created but has no primary key (identifier) and is not associated with a session. Transient objects can be turned into persistent objects by calling the save method of the session, or by adding a reference from a persistent object to this object.
  • Persistent: The object is in the persistent state if the session is open and the object has just been saved to, or retrieved from, the database. A persistent object has a primary key value set regardless of whether it has been saved into the database yet. Calling the delete method of the session on a persistent object removes it from the database, making it transient again.
  • Detached: The object is in the detached state when it no longer has a connection to a session object, usually because the session was closed. The data of such an object may be stale due to changes made since it was last synchronised with the database. A detached object comes back to the persistent state if lock() or update() is called on it.
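The three states can be traced in code (the Employee entity is assumed):

```java
Employee e = new Employee();     // transient: no identifier, not associated with a session
Session session = sessionFactory.openSession();
session.beginTransaction();
session.save(e);                 // persistent: identifier assigned, tracked by the session
session.getTransaction().commit();
session.close();                 // detached: identifier set, but no longer tracked
```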

What are the core interfaces of hibernate ?
The core interfaces of Hibernate framework are:
  • Configuration
  • SessionFactory
  • Session
  • Query
  • Criteria
  • Transaction

What is proxy in hibernate and how is it configured ?

Proxies are the mechanism that allows Hibernate to break up the interconnected cloud of objects in the database into smaller chunks that can easily fit in memory. In simple terms, a proxy is just a way to avoid retrieving an object until it is needed, loading only the required instances.
Example: Consider a table PERSON with columns ID as Numeric, NAME as Varchar and BOSS as Numeric. Here BOSS is a foreign key referring to another row of the PERSON table. The corresponding Person class contains Id as Integer, Name as String and Boss as a Person reference. While mapping a table row to a Person object, hibernate needs to assign a Person object to the Boss field, but the BOSS column only holds the numeric foreign key. Instead of loading the data from the associated row and storing it in the fields of a newly created Person object, hibernate creates a special Person object and sets only its id property to the value found in the BOSS column. Such an object loads the associated data only if necessary: when a method is invoked on it, hibernate fetches the data from the associated row and populates the object. This is called the proxy mechanism. The proxy object is a subclass of the Person class.
When a proxy is loaded, the data is not stored in the proxy object itself, but in a "target" object. The fields of the proxy object remain null forever (or whatever value they were initialized to), since all methods invoked on the proxy are delegated to the target object. The proxy is not of the actual implementation type but of an interface or superclass type, hence the type of the proxy differs from the type of the actual object.
Proxy objects are enabled by enabling lazy loading. To convert a loaded proxy into a real object, either evict the object from Hibernate's cache and reload it, or call getHibernateLazyInitializer().getImplementation() on the HibernateProxy.
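A hedged sketch of unwrapping a proxy into the real instance (the Employee entity is assumed):

```java
import org.hibernate.proxy.HibernateProxy;

Employee maybeProxy = (Employee) session.load(Employee.class, 1L);
if (maybeProxy instanceof HibernateProxy) {
    // forces initialization and returns the underlying target instance
    Employee real = (Employee) ((HibernateProxy) maybeProxy)
            .getHibernateLazyInitializer()
            .getImplementation();
}
```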

How to access the lazily loaded collections in hibernate ?

Use Hibernate.initialize() within a transaction to initialize lazy collections as below (the Parent entity and identifier are assumed):

 session.beginTransaction();                  //start Transaction
 Parent parent = (Parent) session.get(Parent.class, id);
 Hibernate.initialize(parent.getChildren());  //forces loading of the lazy collection
 session.getTransaction().commit();           //end Transaction

Outside the transaction, lazy objects can be accessed (while the session is still open) by calling the size() method on the children collection as below:

 parent.getChildren().size();                 //touching the collection initializes it

Define ORM and benefits of ORM with Hibernate ?

ORM (object relational mapping) is a library approach that provides a framework to map an object oriented domain model to a relational database. It solves the mismatch issues while persisting objects into the database and handles dependencies between tables, by replacing direct database persistence with high-level object handling functions. ORM also gives programmers the flexibility to easily write applications whose data outlives the application process.
Benefits of ORM with Hibernate are as follows:

Extensibility: Hibernate is highly extensible and configurable
Reliable: Great stability and good quality, proven by its acceptance and use by many thousands of Java programmers.
High Performance: Compared to the conventional method of writing JDBC code to connect to the database and persist objects, Hibernate offers excellent performance, both in terms of run-time performance of the application and programmer productivity. Hibernate supports many fetching strategies, locking, time stamping, automatic versioning and lazy initialization. It does not require any special database tables or fields, and generates much of the SQL at system initialization time instead of at run time.
Scalability: Hibernate has been designed to fit in an application server cluster environment and a scalable architecture. It fits well in all environments.
JPA Provider: Hibernate has its own native API and is also an implementation of the Java Persistence API (JPA) specification. Because of the JPA specification implementation, it can be used straightforwardly in any environment which supports JPA, including Java SE applications, Java EE application servers, enterprise applications etc.
Idiomatic persistence: Hibernate fully supports object oriented concepts including aggregation, composition, inheritance, polymorphism, association and the Java collection framework. Hibernate does not need any base classes or interfaces for persistent classes, and it enables any class or data structure to be persistent.
Layered Architecture: Hibernate has a layered architecture, so that we are not obligated to use everything provided by Hibernate. We can just use those features which are sufficient for the project.

Below are some disadvantages of using Hibernate ORM:
Overhead: Some overhead is involved while using ORM. When the database is accessed directly the developers have more control over the executed SQL queries and they could fine tune the queries based on use case and indexes to improve its performance.
Learning: There is a learning curve involved in understanding ORM library.

How to prevent concurrent updates in Hibernate ?
In Hibernate, each interaction with the database occurs in a new Session, hence multiple threads cannot access the same session for concurrent transactions. The developer is responsible for reloading all persistent instances from the database before manipulating them, in order to get the updated values. The application is forced to carry out its own version checking to ensure conversation transaction isolation. The version property is mapped using <version>, and Hibernate automatically increments it during flush if the entity is dirty. Manual checking of the version is used to ensure that the entity is not dirty before carrying out an update on it.

Hibernate offers automatic version checking with either an extended Session or detached instances as the design paradigm. Hibernate checks instance versions at flush time, throwing an exception if concurrent modification is detected. The exception can be caught and handled by either merging the changes or restarting the business conversation with non-stale data. The Session is disconnected when waiting for user interaction. The application does not version check or reattach detached instances, nor does it have to reload instances in every database transaction.

How to represent inheritance hierarchy in Hibernate ?
Hibernate supports three basic inheritance mapping strategies, along with implicit polymorphism:
  • table per class hierarchy: create a table with a row for all the fields in entire class hierarchy and a discriminator column to hold type information.
  • table per subclass: convert the object oriented “is a” relationship into a relational “has a” relationship using foreign keys.
  • table per concrete class: mapping ignores the inheritance relationship and maps one table per concrete class.
  • implicit polymorphism
It is possible to use different mapping strategies for different branches of the same inheritance hierarchy. Hibernate, however, does not support mixing <subclass>, <joined-subclass> and <union-subclass> mappings under the same root <class> element. It is possible to mix together the table per hierarchy and table per subclass strategies under the same <class> element, by combining the <subclass> and <join> elements.

Table per class hierarchy: In the one table per class hierarchy scheme, the whole class hierarchy is stored in (or mapped to) a single SQL table. An extra column (known as the discriminator column) is created in the table to identify the class; the discriminator value is the key that uniquely identifies the concrete type of each record. This scheme offers the best performance, even for deep hierarchies, since a single select may suffice. A limitation of this scheme is that columns declared by the subclasses, such as CCTYPE, cannot have NOT NULL constraints. The @Inheritance(strategy=InheritanceType.SINGLE_TABLE), @DiscriminatorColumn and @DiscriminatorValue annotations are used for mapping the table per hierarchy strategy.

<class name="Payment" table="PAYMENT">
    <id name="id" column="PAYMENT_ID" type="long">
        <generator class="native"/>
    </id>
    <discriminator column="PAYMENT_TYPE" type="string"/>
    <property name="amount" column="AMOUNT"/>
    <subclass name="CreditCardPayment" discriminator-value="CREDIT">
        <property name="creditCardType" column="CCTYPE"/>
    </subclass>
    <subclass name="CashPayment" discriminator-value="CASH"/>
    <subclass name="ChequePayment" discriminator-value="CHEQUE"/>
</class>
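The same hierarchy can be sketched with annotations (field details are abbreviated):

```java
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(name = "PAYMENT_TYPE", discriminatorType = DiscriminatorType.STRING)
public class Payment {
    @Id
    private Long id;
    private double amount;
}

@Entity
@DiscriminatorValue("CREDIT")
public class CreditCardPayment extends Payment {
    private String creditCardType; // cannot carry a NOT NULL constraint in this strategy
}
```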

Table per subclass hierarchy: Inheritance relationships are represented as relational foreign key associations. Every class, including abstract classes, has its own table. The primary key of the superclass becomes both primary key and foreign key in the subclass. The tables are created per class and related by foreign keys, hence there are no duplicate columns. The <joined-subclass> subelement of <class> specifies the subclass and maps it to the parent using the primary key and foreign key relation. The <key> subelement of <joined-subclass> generates the foreign key in the subclass-mapped table; this foreign key is associated with the primary key of the parent-class-mapped table. Using annotations, we specify @Inheritance(strategy=InheritanceType.JOINED) in the parent class and the @PrimaryKeyJoinColumn annotation in the subclasses.

<class name="Payment" table="PAYMENT">
    <id name="id" column="PAYMENT_ID" type="long">
        <generator class="native"/>
    </id>
    <joined-subclass name="CreditCardPayment" table="CREDIT_PAYMENT">
        <key column="PAYMENT_ID"/>
        <property name="creditCardType" column="CCTYPE"/>
    </joined-subclass>
    <joined-subclass name="CashPayment" table="CASH_PAYMENT">
        <key column="PAYMENT_ID"/>
    </joined-subclass>
    <joined-subclass name="ChequePayment" table="CHEQUE_PAYMENT">
        <key column="PAYMENT_ID"/>
    </joined-subclass>
</class>

Hibernate's implementation of table per subclass does not require a discriminator column. Although a discriminator column can be used with the table per subclass strategy, by nesting the <join> inside the <subclass>.

Table per concrete class: In this strategy, tables are created per class. A table is created for each non-abstract class, and all the properties of the class, including inherited properties, are mapped to columns of that table. Duplicate columns are thus added in the subclass tables. There are two ways to map tables with the table per concrete class strategy.
  • By union-subclass element
  • By Self creating the table for each class
The union-subclass subelement of class specifies the subclass. It adds the columns of the parent class into the subclass table, thus working as a union. Each table defines columns for all properties of the class, including inherited properties; this creates duplicate columns in the subclass tables. Since tables are created per class, there are no nullable columns for subclass-specific properties. A limitation of this approach is that if a property is mapped onto the superclass, the column name must be the same on all subclass tables. Also, the primary key seed has to be shared across all unioned subclasses of a hierarchy. If the superclass is abstract, it is mapped with abstract="true"; otherwise an additional table is needed to hold instances of the superclass.
The @Inheritance(strategy = InheritanceType.TABLE_PER_CLASS) annotation on the parent class selects the table per concrete class strategy. The @AttributeOverrides annotation is used in subclasses to redefine the column mappings of inherited attributes. In the table structure, the parent class's columns are repeated in each subclass table.
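The annotation form can be sketched as follows (a fragment; the overridden column name is illustrative):

```java
@Entity
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
class Payment {
    @Id
    Long id;            // repeated, together with amount, in every concrete-class table
    double amount;
}

@Entity
@AttributeOverrides({
    @AttributeOverride(name = "amount", column = @Column(name = "CC_AMOUNT"))
})
class CreditCardPayment extends Payment {
    String creditCardType;
}
```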

<class name="Payment" table="PAYMENT">
    <id column="PAYMENT_ID" name="id" type="long">
        <generator class="native"/>
    </id>
    <property name="amount" column="AMOUNT"/>
    <union-subclass name="CreditCardPayment" table="CREDIT_PAYMENT">
        <property name="creditCardType" column="CCTYPE"/>
    </union-subclass>
    <union-subclass name="CashPayment" table="CASH_PAYMENT"/>
    <union-subclass name="ChequePayment" table="CHEQUE_PAYMENT"/>
</class>

In the second approach a table is created for each class, and all the inherited properties are mapped again in each subclass, without any inheritance mapping element relating the tables.

Implicit Polymorphism: Implicit polymorphism means that if a class or interface is used in HQL, criteria or named queries, hibernate fetches the records from the table mapped to that class along with all the tables mapped to its subclasses, at any hierarchy level. For such a query hibernate performs the following operations to fetch the results:
  • Identify all the subclasses of the entity, which are mapped to any table
  • Fire individual query to all of the tables
  • Combine the result of all the queries into list and return from hibernate layer 
A disadvantage of using implicit polymorphism is that the user might fire a query against a very generalized class, causing hibernate to extract large amounts of data, resulting in an out-of-memory error.

What is Component and a Composite element in Hibernate ?
A component is a contained object that is persisted as a value type and not as an entity reference. For example, a Name object (with fields initial, first, last) can be persisted as a component of Person, defining getter and setter methods for its persistent properties. Components do not support shared references.
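As a sketch, the Name component of Person could be mapped like this (table and column names are assumptions):

```xml
<class name="Person" table="PERSON">
    <id name="id" column="PERSON_ID">
        <generator class="native"/>
    </id>
    <!-- Name is persisted inline as columns of PERSON, not as a separate entity -->
    <component name="name" class="Name">
        <property name="initial" column="INITIAL"/>
        <property name="first" column="FIRST_NAME"/>
        <property name="last" column="LAST_NAME"/>
    </component>
</class>
```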

Below are the details regarding Composite element:
  • Collections of components, such as an array of Name objects, are represented using a component collection with the <composite-element> tag. 
  • Composite elements can in turn contain components using the <nested-composite-element> tag, but should not contain collections. There is no separate primary key column in the composite element table: Hibernate uses each column's value to identify a record, so only not-null properties should be used in a composite element declared inside a <list>, <map>, <bag> or <idbag>.
  • The <composite-map-key> element allows to map a component class as the key of a Map.
  • A component can be used as the identifier of an entity class. The component class must be Serializable and must implement equals() and hashCode() consistently with the database's notion of composite key equality. Composite keys cannot be produced by an IdentifierGenerator; the application must assign the identifier values itself. The <composite-id> tag, with nested <key-property> elements, is used in place of the usual <id> declaration. 
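As an illustration (class and field names are hypothetical), a composite identifier component is Serializable and defines equals()/hashCode() over exactly the key columns:

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite key component: both properties together form the
// primary key, so equals()/hashCode() must be based on exactly these fields.
public class OrderLineId implements Serializable {
    private long orderId;
    private int lineNumber;

    public OrderLineId() { }   // Hibernate needs a no-arg constructor

    public OrderLineId(long orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderLineId)) return false;
        OrderLineId other = (OrderLineId) o;
        return orderId == other.orderId && lineNumber == other.lineNumber;
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderId, lineNumber);
    }
}
```

Such a class would then be referenced from a <composite-id> declaration with one <key-property> per field.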

What are the different types of collections supported by hibernate ?
Hibernate supports the basic java collections types namely List, Set, Array, Map and also provides support to other collections, such as bag and idbag.

Persisting Lists: Hibernate maintains a separate table to store the list of values apart from the main table. The mapping of the separate table for the list has a foreign key to the main table and an index column which holds the list indexes. The index column is populated with each list element's position and is used later to reconstruct the list elements in their original positions. Below is the mapping, where the Showroom entity contains a list of cars.

<hibernate-mapping package="com.hibernate.collections.list">
  <!-- Showroom class mapping definition -->
  <class name="Showroom" table="SHOWROOM_LIST">
    <id column="SHOWROOM_ID" name="id">
      <generator class="native"/>
    </id>
    <list name="cars" cascade="all" table="CARS_LIST">
      <key column="SHOWROOM_ID"/>
      <index column="CAR_INDEX"/>
      <one-to-many class="Car"/>
    </list>
  </class>

  <!-- Car class mapping definition -->
  <class name="Car" table="CARS_LIST">
    <id column="CAR_ID" name="id">
      <generator class="native"/>
    </id>
    <property column="NAME" name="name"/>
    <property column="COLOR" name="color"/>
  </class>
</hibernate-mapping>

With annotations, a foreign key declared via @JoinColumn is used to represent the list:

  @OneToMany(cascade = CascadeType.ALL)
  @JoinColumn(name = "SHOWROOM_ID")  // foreign key in the car table
  private List<Car> cars;

Persisting Sets: The set implementation is similar to the list implementation in hibernate. The datatype of the collection in the entity class is changed to java.util.Set. In the hibernate mapping, the <list> element is replaced by a <set> element and the index column tag is removed. Also, the class of the collection item must implement equals() and hashCode().

With annotations, a join table strategy can be used: the @JoinTable annotation indicates that an intermediary table with the specified name holds the primary keys from both tables (the join table name below is illustrative):

  @OneToMany(cascade = CascadeType.ALL)
  @JoinTable(name = "SHOWROOM_CARS",
             joinColumns = @JoinColumn(name = "SHOWROOM_ID"))
  private Set<Car> cars;

Persisting Maps: The map implementation is also similar to the list implementation, with the collection datatype being Map<String, Car>. The hibernate mapping contains a <map> element replacing the <list> element; it holds the foreign key of the main table and a <map-key> element which defines the key of the map.

Persisting Arrays: The array implementation is similar to the list implementation, where the datatype is an array, e.g. String[]. In the hibernate mapping, the <list> element is replaced by an <array> element which contains the foreign key and index column attributes, similar to the list.

Persisting Bags: A bag is a special type of collection which is unordered and has no indexing of elements; it also allows duplicate elements. The java code uses the List data type in the entity class, but the hibernate mapping uses a <bag> element instead of a <list> element. The <bag> element contains no index element.

Persisting Idbags: An idbag is a collection that provides a mechanism to have a surrogate key on the persisted collection itself, unlike a bag, where no key exists. The <collection-id> element creates a primary key on the join table. In addition to its own primary key, the join table also carries the primary keys of the two associated tables.
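A sketch of an idbag mapping (table and column names are assumptions):

```xml
<idbag name="cars" table="SHOWROOM_CARS">
    <!-- surrogate primary key for each row of the join table -->
    <collection-id column="ENTRY_ID" type="long">
        <generator class="sequence"/>
    </collection-id>
    <key column="SHOWROOM_ID"/>
    <many-to-many column="CAR_ID" class="Car"/>
</idbag>
```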

Below is the mapping, where the Showroom entity contains a map of cars keyed by customer name.

<hibernate-mapping package="com.hibernate.collections.list">
  <!-- Showroom mapping definition with cars variable
  mapped to CARS_MAP table using map tag -->
  <class name="Showroom" table="SHOWROOM_MAP">
    <id column="SHOWROOM_ID" name="id">
      <generator class="native"/>
    </id>
    <property column="MANAGER" name="manager"/>
    <map name="cars" cascade="all" table="CARS_MAP">
      <key column="SHOWROOM_ID"/>
      <map-key column="CUST_NAME" type="string"/>
      <one-to-many class="Car"/>
    </map>
  </class>

  <!-- Simple Car class-table mapping -->
  <class name="Car" table="CARS_MAP">
    <id column="CAR_ID" name="id">
      <generator class="native"/>
    </id>
    <property name="name" column="CAR_NAME"/>
    <property name="color" column="COLOR"/>
  </class>
</hibernate-mapping>

How to improve performance using Hibernate ?
1) Reduce primary key generation overhead: In processes that are insert-intensive, the choice of primary key generation strategy matters a lot. A common way to generate ids is to use database sequences together with Hibernate's optimized generators, which reduce the number of network round-trips made to the database to obtain ids. Prefer the explicit SEQUENCE strategy over the AUTO key generation strategy.
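As a sketch, an optimized sequence generator can be declared like this (generator and sequence names are illustrative):

```java
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "payment_seq")
@SequenceGenerator(name = "payment_seq", sequenceName = "PAYMENT_SEQ",
                   allocationSize = 50)  // one DB round-trip yields a pool of 50 ids
private Long id;
```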

2) Use JDBC batch updates: For batch programs, JDBC drivers usually provide an optimization for reducing network round-trips named JDBC batch inserts/updates. When these are used, inserts/updates are queued at the driver level before being sent to the database. When a threshold is reached, the whole batch of queued statements is sent to the database in one go. This prevents the driver from sending the statements one by one, which would waste multiple network round-trips. Batching is activated via the hibernate.jdbc.batch_size property in the entity manager factory configuration.
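The relevant configuration properties look like this (the batch size value is illustrative):

```properties
hibernate.jdbc.batch_size=50
# ordering groups statements against the same table so more of them share a batch
hibernate.order_inserts=true
hibernate.order_updates=true
```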

3) Periodically flush and clear the Hibernate session: When adding/modifying data in the database, Hibernate keeps in the session a version of the entities already persisted, in case they are modified again before the session is closed. Most of these entities can be safely discarded once the corresponding inserts were done in the database. This releases memory in the Java client process, preventing performance problems caused by long-running Hibernate sessions. The entity manager's flush() and clear() methods persist any pending new entities and release the entities held in the session.
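The pattern can be sketched as follows; since a real EntityManager needs a database, a minimal stand-in interface is used here so the flush/clear rhythm can be shown on its own (the batch size is an assumption and should match hibernate.jdbc.batch_size):

```java
// Stand-in for the persist/flush/clear subset of an EntityManager or Session.
interface BatchSession {
    void persist(Object entity);
    void flush();
    void clear();
}

public class BatchInsertDemo {
    static final int BATCH_SIZE = 50;  // assumed; keep equal to hibernate.jdbc.batch_size

    // Persists all entities, flushing and clearing the session every BATCH_SIZE
    // inserts; returns the total number of flushes performed.
    static int insertAll(BatchSession session, java.util.List<?> entities) {
        int flushes = 0;
        for (int i = 0; i < entities.size(); i++) {
            session.persist(entities.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                session.flush();   // push the queued inserts to the database
                session.clear();   // detach entities, releasing first-level cache memory
                flushes++;
            }
        }
        session.flush();           // flush the final partial batch
        session.clear();
        return flushes + 1;
    }
}
```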

4) Reduce Hibernate dirty-checking overhead: Hibernate internally uses a mechanism called dirty-checking to keep track of modified entities. Hibernate tries to keep the cost of dirty-checking to a minimum, but it still adds overhead, especially for tables with a large number of columns. For read-only work this overhead can be avoided by marking transactions with the @Transactional(readOnly=true) annotation.

5) Search for 'bad' query plans: Find bad queries which carry out full table scans or full cartesian joins, using query execution plans and modify them.

6) Check for wrong commit intervals: In batch processing, the commit interval makes a big difference in the performance results. Hence check that the commit interval has the expected value, usually around 100-1000.

7) Use the second-level and query caches: If some data is identified as being eligible for caching, then setup the Hibernate Second-Level / Query Caches for storing the frequently accessed data.

When to use orphan remove in Hibernate ? When to use CascadeType.PERSIST ?
Suppose we have a Parent entity with a collection of child entities in hibernate, and when we remove a child entity from the parent's collection we want the child row deleted from the database as well. Hibernate will not delete the child on its own: CascadeType.ALL on the OneToMany relationship only cascades operations performed on the parent to its children. In such a case we need to add "orphanRemoval = true" to the OneToMany relationship mapping.
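A sketch of such a mapping on the parent entity (entity and field names assumed):

```java
@OneToMany(mappedBy = "parent", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Child> children = new ArrayList<>();  // removing a Child here deletes its row
```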

Imagine now we want to remove the specified child entity from its parent and add the same child entity to another parent. When we add the child entity to the new parent's collection and remove the same child entity from its old parent, we get the exception "deleted entity passed to persist". The issue occurs because "orphanRemoval = true" on the OneToMany mapping in the Parent entity marks the removed child for deletion. The alternative to orphan removal is to call the explicit remove method only for children that really must be deleted, and otherwise simply move the child between the parents' collections:
      parentExEntity = findOrCreateParentEntity();
This works fine when the parent entity already exists in the database. But when we add the existing child entity to a new, not yet persisted parent entity, hibernate throws the below "unsaved transient instance" exception.

Caused by: org.hibernate.TransientObjectException: object references an unsaved transient instance - save the transient instance before flushing:
                at org.hibernate.engine.CascadingAction$9.noCascade(
                at org.hibernate.engine.Cascade.cascade(

The reason behind this exception is that hibernate has the child entity in its context but the new parent entity is not present in the context. Because the OneToMany relationship is mapped from Parent to Child (and not the inverse), hibernate cannot cascade the save of the new parent from the child. So when we set this new parent (which is not present in the hibernate context) on the existing child, hibernate throws "save the transient instance before flushing". The solution is to add "cascade = CascadeType.PERSIST" to the Child-to-Parent mapping (the ManyToOne relationship) on the child entity.
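On the child entity, the fix looks like this (entity and column names assumed):

```java
@ManyToOne(cascade = CascadeType.PERSIST)  // persisting the child now cascades to a new Parent
@JoinColumn(name = "PARENT_ID")
private Parent parent;
```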

Note: In JPA-Hibernate, the order of the JPA operations performed within a method generally does not matter, as hibernate applies the net effect of all the operations within a single transaction at flush time.
