Problems with ORMs – Obsidian Scheduler

Object Relational Mapping tools like Hibernate have helped developers make huge productivity gains in dealing with relational databases in the past several years. ORMs free developers to focus on application logic and avoid writing a lot of boilerplate SQL for simple tasks like inserts or queries.

However, the well-documented problems with object-relational impedance mismatch inevitably cause headaches for developers. Relational databases are a specialized technology built on sound concepts, but they don’t necessarily line up to the object-oriented world. There are a few approaches and styles to using ORMs which have various pros and cons.

One of the fundamental choices to using an ORM is deciding whether the you will generate your ORM mappings from the database schema or the other way around, generating your database schema off of some ORM definition (possibly an XML configuration file, annotations or something like XDoclet).

The former approach of generating your ORM layer from your database schema means you have to deal with the database in its own language and terms, whether you deal with DDL specific to the database or have some abstraction layer, but nevertheless you are forced to treat the database as what it is. Unfortunately, it means you need expertise in the technology and it may take more work than allowing your schema to be generated. However, this forces developers to understand and deal with the RDBMS properly – it is dangerous and harmful to treat a DBMS as a simple datastore. Developers need to consider the impact of keys, indexes, etc. when designing applications, and shielding them from the realities of relational databases can be dangerous, and in my experience this always turns out badly. A related issue is the use of POJOs, which end up being manipulated by the ORM framework. While this sounds nice in theory, in practice you can hit various kinds of issues and it can be tempting to mix application logic with what should really amount to data access objects. Developers and architects like to praise the separation of concerns by using Spring and other frameworks, and there’s no real reason why the same concept shouldn’t be applied here. One other minor issue is the need to maintain the POJOs and mapping definition, but this is usually not too much work.

The second approach of generating your ORM mappings and code from your schema is my preferred approach. In my experience of using both approaches, having beans generated off of your schema allows your beans to be intelligently designed and be only as complicated as required, while making available fetch by PK, by index, etc. all for free. It also becomes easier to manage things like lazy collections and referenced objects since it is all managed within the persistent class itself. This approach also avoids the need for writing boilerplate POJOs and forces you to treat your data access objects separately from your domain objects and business logic. In my experience with generating data access beans off your schema, the beans end up being richer, more usable, perform better, and once you have your infrastructure in place, maintenance costs are lower. One may think that you end up needing additional data-wrapper classes, but in practice the need for separate bean classes is independent of what is going on in your data access layer. One issue here is the availability of frameworks to do this generation work for you – in the past, I have worked with custom-built solutions that worked well and paid off, but required initial up-front work. On smaller projects there may not be enough pay-off to be worth investing in such an effort. At the same time, there are ORMs out there that take this approach and generate persistent entity classes, such as jooq, but I have to try them.

Hibernate is the most popular ORM in the Java world and while it is a step up from dealing with writing copious amounts of SQL, it has its problems. Generally the approach is to define your mappings and POJOs and then let Hibernate manage your SQL generation. The problem with this is your defined schema is often less than ideal and things like proper indexing end up overlooked. Hibernate also forces you to end up using their transaction and query mechanisms, though the extent you choose to use their proprietary stuff is up to you. This isn’t necessarily a bad thing in all cases, but personally I have a distaste for the HQL language which is frequently used, since it introduces a familiar-yet-different language to developers which others will later have to maintain and try to figure out. There are also issues with query optimization that can crop up, and having done significant work on performance tuning in the past, access to actual queries for tuning is a must for me. I also believe that trying to implement inheritance in persistence classes is just a bad idea – it’s not worth trying to force a concept onto a technology which naturally does not accommodate it. Hibernate tempts developers to implement inheritance in the database through support for table-per-hierarchy and table-per-class mechanisms, but it is a mistake in my mind since you end up with a poor data model and problems later managing and extending the hierarchy. I also do not like to populate what should be a clean relational model – you can’t pretend relational DBs are object-oriented datastores.

If you take one thing away from this post, it should be to not ignore the actual technologies you are using. Treat an RDBMS for what it is, and learn to use it. Do the same for object-oriented systems. By all means, try to make your life easier by using ORMs to avoid writing boilerplate code and unnecessary SQL, but don’t think you can avoid dealing with some kind of translation or code to deal with the natural mismatch that occurs. Do not expect a framework or tool to solve the problem for you. Developers are paid to think and discerning the best road to take, so we shouldn’t be afraid to deal with problems as they come up and solve them intelligently. As with many things, the 80-20 rule seems to apply here. Use ORMs to take care of 80% of the work for you, and be prepared to write SQL and some persistence code for the other 20%. Don’t expect too much or you will end up with several types of problems – strange bugs, performance issues, a poorly designed object-oriented model, etc.

I’d love to hear your experience and thoughts on ORMs in any language and the issues you’ve faced, along with how you dealt with them. This is one of those topics where experience is incredibly valuable, so please share your thoughts.

3 thoughts on “Problems with ORMs”

Lukas Eder

September 2, 2011 at 7:12 am

Nicely put. The Object-Relational Impedance mismatch is not going to go away, as the two paradigms will continue to coexist and JPA/CriteriaQuery just copies off of Hibernate (including its problems), instead of going into a similar direction as .NET’s LINQ. Truly, ignoring the fact that the two worlds are different will always be the most important source of problems in any software dealing with an RDBMS.

While many applications don’t really need SQL, some do, extensively. It makes a big difference whether you use IN (…) or EXISTS (…), whether you use OR, or UNION’s. And many times, grouping is better/worse than using window functions. As you said, this requires lots of expertise, which should not be neglected in larger projects. See also this article:

http://lukaseder.wordpress.com/2011/09/02/oracle-scalar-subquery-caching/

Thanks anyway for linking to jOOQ. I’d be curious about your feedback, should you decide to try it for yourself!

Cheers
Lukas
Pingback: Problems with ORMs Part 2 – Queries | Carfey Software Blog
Pingback: Java Enterprise Software Versus What it Should Be | Carfey Software Blog

Comments are closed.