Object-oriented databases complement relational databases in
important ways, says Anat Gafni,
VP of Engineering at db4objects, the company behind the open-source object
database db4o. In this interview with Artima, Gafni explains how OO databases support agile development,
and how they co-exist with relational databases in an enterprise.
Although object-oriented databases are a topic of curiosity
for many developers, few enterprise developers deploy their applications using
an OO database. The initial over-excitement about OO databases has
long since abated and given way to a strong dose of skepticism about OO database
technology.
In this interview from JavaOne
2008, Anat Gafni, CTO of
db4objects, explains that a second generation of object databases addresses
important problems in application development and maintenance. She also talks
about some of the features of db4o, the open-source object-oriented database
for Java and C#:
db4o
is an object database, which is not what developers typically turn to when they
think about persisting information. Although most developers program in objects
and object-oriented languages, they tend to use relational databases, and may
not look beyond relational database technology.
Object databases
have been around for a long time. And, in fact, we are now at a stage when a
second generation of object-oriented databases has started to spring up that
actually provides very useful properties for developers as well as for
enterprises. The way object-oriented databases work, and the way in which they
differ from relational databases, requires that developers think of persistence
a bit differently.
With the first
generation of OO databases, people got overly excited, and thought that object
databases would replace relational ones. OO databases, however, are
complementary to relational databases, something that was not understood
until relatively recently. OO databases have strengths and weaknesses, and they
can live side-by-side with relational databases.
How do you know
when an OO database is more useful than a relational one? You want to identify areas
where the data and the data schema are modified frequently, where you
have complex relationships, where you need to store your in-memory objects very
fast, or where you need those objects to be retrieved in the same form your
program uses. An OO database is also useful if you have multimedia data.
On the other hand,
areas that are pretty standard, where your fields and schema are static and
where you need a lot of ad-hoc reporting with standard queries, are
candidates for a relational database.
You also need to
consider who is in charge of the data schema. If the developer is in charge,
and if there is no central administrator, then an OO database is a good fit. On
the other hand, if the company has a central data repository,
and in fact a central administrator owns the data, and if that data is outside
the developer's purview, then you'd want to go with a relational database.
db4o
goes beyond what even the second-generation object-oriented databases provide. db4o set itself the goal of preserving the
object-oriented nature of an environment by using an interface that is natural
to the programming language, and by allowing as much flexibility and
transparency in the schema and the object model as possible.
Although
object-relational technology has come a long way, it still does not solve the
object/relational impedance mismatch. Some developers have come to believe that
the latest O/R mapping technologies bridge the gap between objects and relational
databases seamlessly. On the surface, that may appear so. But you need to look
a bit deeper, and realize that object systems and relational databases have
different purposes and foundations.
Object-oriented
languages became popular because they facilitate change well. All the
object-oriented constructs, such as inheritance, encapsulation, abstraction,
polymorphism, or interfaces, and many of the OO patterns, shine in comparison
with other approaches to program design when it comes time to change your data
model or object model and, in general, when you need to make changes to your
application. And changes happen all the time to many types of applications. The
more complex your objects and your data model, the more changes are likely to
occur.
By definition,
relational databases have a schema. The schema is something that's supposed to
be set, and is something that doesn't change very often. As soon as you break
your objects into tables, those tables are constrained by the schema, and you
lose many of the advantages OO programming languages and systems give you. That's
true even if you can relatively easily map your objects to tables with an O/R
mapping technology: object systems and relational schemas differ in their
notions of change and flexibility.
In db4o, you can
modify your objects, and on the fly db4o will adjust to those changes. You
don't need to incur any overhead for that. There are no special steps to take,
no schema migration.
When you want to
add a field to an object, or change the type of a field, for instance, in db4o
you can just make those changes to your objects and classes. db4o
will identify the changes and manage them behind the scenes, automatically.
When you retrieve an instance that was stored with the old version of a class,
where a property had a different type, for example, the property in the new
object will either hold a default value, or you can give db4o a method that the database will
use to assign the new field value. All of that is done transparently. db4o upgrades your schema dynamically, as you go. That's of
great value when you're making changes to your program. You can even work with
the old and the new version of an object together. When the new version has a
field that the old one didn't, that field in the new object will be null.
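To make the behavior described above concrete, here is a minimal Java sketch of what happens when an object stored under an older class version is read back under a newer one. It is a toy model, not db4o's implementation or API: `Person`, `SchemaSketch`, `materialize`, and the `fillers` map are hypothetical names, and a stored object is modeled as a plain map of field names to values. The sketch covers only the added-field case, not a change of field type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.lang.reflect.Field;

// Hypothetical "new" version of a class: nickname was added after
// some instances had already been stored.
class Person {
    String name;
    String nickname; // absent from records written by the old version
}

// Toy stand-in for an object store; NOT db4o's API or implementation.
class SchemaSketch {
    // Rebuilds an object from stored field values. A field missing from the
    // record keeps its Java default (null for references) unless a filler
    // function is registered for it.
    static <T> T materialize(Class<T> type, Map<String, Object> record,
            Map<String, Function<Map<String, Object>, Object>> fillers) {
        try {
            T obj = type.getDeclaredConstructor().newInstance();
            for (Field f : type.getDeclaredFields()) {
                f.setAccessible(true);
                String fieldName = f.getName();
                if (record.containsKey(fieldName)) {
                    f.set(obj, record.get(fieldName));
                } else if (fillers.containsKey(fieldName)) {
                    f.set(obj, fillers.get(fieldName).apply(record));
                }
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A record written before the nickname field existed.
        Map<String, Object> oldRecord = new HashMap<>();
        oldRecord.put("name", "Ada");

        Person p = materialize(Person.class, oldRecord, new HashMap<>());
        System.out.println(p.name + " / " + p.nickname);
    }
}
```

Passing a filler for `nickname` instead of an empty map corresponds to the case in which the developer supplies a method for computing the new field's value.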
db4o
provides a simple, programming language-based query system, and you can persist
your objects with a single line of code. We support Java and C# currently.
Objects you persist in Java can be accessed from C#, for example.
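The idea of storing an object with one call and querying it with ordinary Java code can be sketched with a small in-memory stand-in. This is only an illustration of the programming model, under the assumption that a query is a type plus a predicate written in the host language. `ToyContainer`, `store`, `query`, and `Pilot` are hypothetical names chosen for this example and are not db4o's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical domain class used only for this example.
class Pilot {
    final String name;
    final int points;

    Pilot(String name, int points) {
        this.name = name;
        this.points = points;
    }
}

// Hypothetical in-memory stand-in for an object container; not db4o's API.
class ToyContainer {
    private final List<Object> objects = new ArrayList<>();

    // "Persisting" is a single call that takes the object as it is.
    void store(Object o) {
        objects.add(o);
    }

    // A query is a type plus an ordinary Java predicate over that type.
    <T> List<T> query(Class<T> type, Predicate<T> condition) {
        List<T> result = new ArrayList<>();
        for (Object o : objects) {
            if (type.isInstance(o) && condition.test(type.cast(o))) {
                result.add(type.cast(o));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyContainer db = new ToyContainer();
        db.store(new Pilot("Ada", 120)); // one line to store an object
        db.store(new Pilot("Bob", 80));

        for (Pilot p : db.query(Pilot.class, x -> x.points > 100)) {
            System.out.println(p.name);
        }
    }
}
```

Because the condition is plain Java, the compiler type-checks it and refactoring tools can rename the fields it touches, which is the main attraction of a language-based query over a query string.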
db4o
also goes beyond many other OO databases in making it easy to use OO and
relational databases side-by-side. To that end, we have replication
capabilities that can replicate the object data store to a relational store and
vice versa. The way that works now is that you provide Hibernate-based mappings
between a db4o data model and a relational model. We also have something called
the db4o Replication System, or DRS, that facilitates db4o-to-db4o replication.
It queries one database to determine how it differs from another database,
makes a list of the objects that need to be synchronized, and performs the
synchronization in both directions. That allows db4o to live happily with
relational databases, and makes it possible for the developer to use each
technology as it fits best.
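The two replication steps described above, finding the differences and then synchronizing in both directions, can be sketched in plain Java. This is a simplified model, not the DRS implementation. It assumes each object carries an id and a version counter that grows on every change, and it ignores conflicts in which both sides changed the same object. `Versioned`, `ReplicationSketch`, `changedIds`, and `synchronize` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// One stored object state: a value plus a counter bumped on every change.
class Versioned {
    final String value;
    final long version;

    Versioned(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Toy model of bidirectional replication; NOT the DRS API.
class ReplicationSketch {
    // Step 1: list the ids that are missing on one side or differ in version.
    static List<String> changedIds(Map<String, Versioned> a, Map<String, Versioned> b) {
        Set<String> ids = new TreeSet<>(a.keySet());
        ids.addAll(b.keySet());
        List<String> changed = new ArrayList<>();
        for (String id : ids) {
            Versioned va = a.get(id);
            Versioned vb = b.get(id);
            if (va == null || vb == null || va.version != vb.version) {
                changed.add(id);
            }
        }
        return changed;
    }

    // Step 2: for each listed id, copy the newer state to the other side.
    static void synchronize(Map<String, Versioned> a, Map<String, Versioned> b) {
        for (String id : changedIds(a, b)) {
            Versioned va = a.get(id);
            Versioned vb = b.get(id);
            if (vb == null || (va != null && va.version > vb.version)) {
                b.put(id, va); // a holds the only or the newer copy
            } else {
                a.put(id, vb); // b holds the only or the newer copy
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Versioned> a = new HashMap<>();
        Map<String, Versioned> b = new HashMap<>();
        a.put("x", new Versioned("x-new", 2));
        b.put("x", new Versioned("x-old", 1));
        b.put("y", new Versioned("y-only-in-b", 1));

        System.out.println("to synchronize: " + changedIds(a, b));
        synchronize(a, b);
        System.out.println("remaining differences: " + changedIds(a, b));
    }
}
```

In the db4o-to-relational case described in the interview, one of the two sides would be reached through the Hibernate-based mappings rather than being a second object store.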