VAPOR: Usage Examples of Implemented Features

(c) Oliver M. Bolzer, 2003
$Id: example.html 278 2003-11-22 08:26:58Z bolzer $

Before running VAPOR

In order to use VAPOR, several Ruby libraries need to be installed on the target system. Additionally a PostgreSQL database is required as the backend storage. The following libraries are required:

Making a Class Persistence-capable

Before an object can be stored using VAPOR, a database that will hold the repository needs to be created and the repository must be initialized. How the database is created depends on the backend RDBMS used, but it is usually done by issuing a CREATE DATABASE statement or by using a specific tool. Read your RDBMS's documentation on how to create a new database. The database is then initialized as a Repository by using the vaporadmin tool.

The following example creates a new PostgreSQL database with its encoding set to Unicode (see PostgreSQL Administrator's Guide, Ch. 7) and initializes it for use with Vapor.


  foo@bar: ~> psql -h host -U user template1
  Password:  password

  Welcome to psql 7.3.2, the PostgreSQL interactive terminal.
  template1=> CREATE DATABASE database_name WITH ENCODING = 'UNICODE';
  CREATE DATABASE
  template1=> \q

  foo@bar: ~> vaporadmin user@host/database_name init
  Password: password
  foo@bar: ~>

Next, information about the classes to be stored in the Repository must be made known to the Repository. Metadata that cannot be deduced from the class's definition in Ruby, namely which attributes are to be stored persistently, is written into an XML file. Such an XML file (tutorial.xml) with two simple classes looks like:


<vapor>
  <class name="Person">
    <attribute name="first_name" type="String" />
    <attribute name="last_name" type="String" />
    <attribute name="inhabits" type="Reference" />
  </class>
  <class name="City">
    <attribute name="name" type="String" />
    <attribute name="altitude" type="Integer" />
  </class>
</vapor>

The information from this XML file is imported into the Repository by again using the vaporadmin tool. Any number of XML files can be specified on the command line. vaporadmin will abort with an error if a class's supposed superclass is not already registered with the Repository.


  foo@bar: ~>vaporadmin user@host/database_name add tutorial.xml
  Password: password
  Attempting to add classes to repository:
    Person
    City
  Added 2 new classes to Repository.
  foo@bar: ~>

Now that the database is prepared, the actual Ruby classes need to be made aware of their persistence capability. This is done by including the Vapor::Persistable module. No other change is needed to the class's code itself. The only requirement VAPOR makes is that the class must have a constructor that can be called without any arguments. Otherwise VAPOR will not be able to instantiate an "empty" object to which it feeds its attributes when reinstantiating the object from the Repository.

  require 'vapor'

  class City
    include Vapor::Persistable

    def initialize( n = nil , a = nil )
      @name = n
      @altitude = a
    end

    attr_reader :name, :altitude

  end
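
The need for an argument-free constructor can be pictured with a small stand-alone sketch. It does not use Vapor; the rebuild helper and the ReloadableCity class are hypothetical names chosen for illustration, and Vapor's real reinstantiation code may well differ.

```ruby
# Stand-in class for this sketch; it does not include Vapor::Persistable.
class ReloadableCity
  attr_reader :name, :altitude

  def initialize(n = nil, a = nil)   # callable without arguments
    @name = n
    @altitude = a
  end
end

# Hypothetical helper (not Vapor API): create an "empty" object first,
# then feed it the stored attribute values.
def rebuild(klass, attributes)
  object = klass.new                 # fails if the constructor needs arguments
  attributes.each do |attr_name, value|
    object.instance_variable_set("@#{attr_name}", value)
  end
  object
end

city = rebuild(ReloadableCity, 'name' => 'Tokyo', 'altitude' => 0)
puts city.name                       # prints "Tokyo"
```

A class whose constructor insists on arguments would make the klass.new call raise an ArgumentError, which is why default values such as n = nil are used in the City class above.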

Preparing the PersistenceManager

Before any operations on the storage can be executed, a PersistenceManager, the frontend for all operations on the Repository, must be instantiated with the proper access credentials for the backend Datastore. The credentials are passed in the form of a Hash-like object that responds to [] with the proper keys and returns nil for unknown keys. Most often, it will be a real Hash.
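
As an illustration of what "Hash-like" can mean here, the following sketch layers site-specific settings over defaults. LayeredProperties is a hypothetical class, not part of Vapor; the only contract it relies on is the one stated above, namely answering [] and returning nil for unknown keys.

```ruby
# Hypothetical Hash-like properties object (not part of Vapor).
# Looks a key up in several layers, first match wins.
class LayeredProperties
  def initialize(*layers)
    @layers = layers
  end

  def [](key)
    @layers.each do |layer|
      value = layer[key]
      return value unless value.nil?   # false is a valid value, nil is "unknown"
    end
    nil                                # unknown keys yield nil
  end
end

defaults = { 'Vapor.Datastore.Port' => 5432, 'Vapor.Autocommit' => false }
site     = { 'Vapor.Datastore.Name' => 'tutorial' }
props    = LayeredProperties.new(site, defaults)

p props['Vapor.Datastore.Name']   # prints "tutorial"
p props['Vapor.Autocommit']       # prints false
p props['No.Such.Key']            # prints nil
```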

The following code creates a PersistenceManager with the specified access credentials and keeps it in the variable @pmgr. For this example, Autocommit-Mode is turned off, to better illustrate state changes of persistable objects.

  require 'vapor'

  properties = Hash.new
  properties[ 'Vapor.Datastore.Name'     ] = 'bolzer_vapor_test'
  properties[ 'Vapor.Datastore.Host'     ] = 'db'
  properties[ 'Vapor.Datastore.Port'     ] = 5432
  properties[ 'Vapor.Datastore.Username' ] = 'bolzer'
  properties[ 'Vapor.Datastore.Password' ] = 'foo02bar'
  properties[ 'Vapor.Autocommit'         ] = false

  @pmgr = Vapor::PersistenceManager.new( properties )

Populating the Storage

Once the classes are marked as Persistable and the DB-schema has been created, instances of Persistable classes can be instantiated and added to the storage.

The following code instantiates two City objects and marks them as persistent inside a transaction. In Vapor, all changes to the state of persistent objects must occur in a transaction. Only when the transaction is committed are the changes written to the Datastore.

  tokyo = City.new( 'Tokyo', 0 )
  berlin = City.new( 'Berlin', 34 )
  
  tokyo.oid                  # => nil
  tokyo.state                # => Vapor::Persistable::TRANSIENT

  tokyo_oid = nil           # declared here so it stays visible after the block
  @pmgr.transaction{|tx|    # start transaction
    tokyo.make_persistent
    berlin.make_persistent
    tokyo_oid = tokyo.oid   # => 64212
    tokyo.state             # => Vapor::Persistable::NEW
  }                         # commit transaction, writing changes to Datastore

  tokyo.state               # => Vapor::Persistable::PERSISTENT

Retrieving a single Object from the Storage

A single object can be retrieved from storage directly if its OID is known. The following code retrieves the City object for Tokyo, saved above:

  tokyo = @pmgr.get_object( tokyo_oid )

  tokyo.name             # => 'Tokyo'
  tokyo.altitude         # => 0

  tokyo.oid == tokyo_oid  # => true

Retrieving all instances of a class

Knowing the OIDs of stored objects beforehand is often not practical. If they have to be stored elsewhere, it defeats half the purpose of having a convenient object repository.

Very often, all instances of a specific class have to be processed. For this purpose, Vapor can retrieve all instances of a class at once. Instantiating all of them in memory at once would be very heavyweight, so Vapor returns a special Enumerable object (Extent) that contains information about multiple Persistables, but actually instantiates them only when they are needed via the Extent#each iterator.

By default, persistent instances of persistent subclasses are also returned. If only instances from the class itself should be returned, specify false as the second argument to get_extent().

The following code reinstantiates the City objects saved earlier:

  cities = @pmgr.get_extent( City )

  cities.class         # => Vapor::Extent
  cities.empty?        # => false
  cities.size          # => 2

  cities[0].name       # => 'Tokyo'
  cities[1].name       # => 'Berlin'
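
The idea of instantiating objects only when they are iterated over can be sketched without Vapor. LazyExtentSketch below is a hypothetical stand-in for Extent, and the loader block is an assumption standing in for the actual Datastore access; the sketch only shows the lazy-loading technique, not Vapor's implementation.

```ruby
# Hypothetical stand-in for Vapor::Extent: it knows the OIDs up front,
# but builds each object only when #each reaches it.
class LazyExtentSketch
  include Enumerable

  attr_reader :loads                 # how many objects were built so far

  def initialize(oids, &loader)
    @oids = oids
    @loader = loader
    @loads = 0
  end

  def size
    @oids.size                       # answered without loading anything
  end

  def empty?
    @oids.empty?
  end

  def each
    return enum_for(:each) unless block_given?
    @oids.each do |oid|
      @loads += 1
      yield @loader.call(oid)        # object is created here, on demand
    end
  end
end

rows = { 1 => 'Tokyo', 2 => 'Berlin' }
extent = LazyExtentSketch.new(rows.keys) { |oid| rows.fetch(oid) }

puts extent.size    # prints 2
puts extent.loads   # prints 0
puts extent.first   # prints "Tokyo"
puts extent.loads   # prints 1
```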

Updating a persistent Object

To modify a Persistable's content, it first has to be loaded, then modified in the usual ways using the object's methods, and then flushed back to the repository. Unfortunately, there is no easy way in Ruby to detect that an object has changed, short of hooking all methods. An object must therefore be explicitly marked as "dirty", using the Persistable#mark_dirty method, before the PersistenceManager considers it for flushing. The object should mark itself dirty after changing one of its persistent attributes, so that the user of the object does not have to call mark_dirty.

  class City    # extend class defined above
    def altitude=( x )
      @altitude = x
      self.mark_dirty
    end
  end

  @pmgr.transaction{|tx|                   # begin transaction
    tokyo = @pmgr.get_object( tokyo_oid )  # load object
    tokyo.altitude = 100                   # object marked as dirty in here
  }                                        # commit changes 
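
When a class has many persistent attributes, writing each setter by hand becomes repetitive. One possible way to generate them is sketched below. The dirty_writer macro and the TrackedCity class are hypothetical and not part of Vapor, and mark_dirty is replaced by a stand-in so that the sketch runs on its own.

```ruby
# Hypothetical class-level macro (not part of Vapor): defines setters
# that store the value and then call mark_dirty.
module DirtyWriters
  def dirty_writer(*names)
    names.each do |name|
      define_method("#{name}=") do |value|
        instance_variable_set("@#{name}", value)
        mark_dirty
      end
    end
  end
end

class TrackedCity
  extend DirtyWriters

  attr_reader :name, :altitude
  dirty_writer :name, :altitude

  def initialize(n = nil, a = nil)
    @name = n
    @altitude = a
    @dirty = false
  end

  # Stand-in for Persistable#mark_dirty so the sketch runs without Vapor.
  def mark_dirty
    @dirty = true
  end

  def dirty?
    @dirty
  end
end

city = TrackedCity.new('Tokyo', 0)
city.altitude = 100
puts city.dirty?     # prints true
```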

Getting rid of a persistent Object

Objects that are persistently stored in the repository can be deleted. Deleted objects become transient again and only exist in-memory and not in the repository anymore, losing all persistent identity. If a deleted object is made persistent again, it will have a different OID than before.

  tokyo = @pmgr.get_object( tokyo_oid )

  @pmgr.transaction{|tx|     # begin transaction
    tokyo.delete_persistent
    tokyo.state              # => Vapor::Persistable::DELETED
    tokyo.oid == tokyo_oid   # => true 
  }                          # commit changes 

  tokyo.state                # => Vapor::Persistable::TRANSIENT
  tokyo.oid                  # => nil

  @pmgr.transaction{|tx|     # start another transaction
    tokyo.make_persistent
    tokyo.state              # => Vapor::Persistable::NEW
    tokyo.oid == tokyo_oid   # => false
  } 

Retrieve a collection of Objects by search

Very often, only the subset of a class's instances that matches certain search criteria needs to be retrieved. For this purpose, the PersistenceManager#query() method exists. The following code retrieves the City object whose name is 'Tokyo' and whose altitude is 0.

By default, persistent instances of persistent subclasses that match the query are returned too. Should only persistent instances of the class itself be returned, specify false as the third argument to query().

  name = "Tokyo"
  altitude =  0

  cities = @pmgr.query( City, "name = ? AND altitude = ?", [ name, altitude ] )

  cities.class           # => Vapor::Extent
  cities.candidate_class # => City
  cities.size            # => 1
  
  tokyo = cities[0]

The query string is made up of pairs of attribute-name and place-holders for their values. The values are specified in an array that contains them in their order of appearance. Several pairs can be specified with 'AND' and only objects matching all pairs are returned.

Currently supported comparison operators are "exact match" (=) for all types of attributes and "similar to" (~) for Strings, where a question mark (?) matches any single character and an asterisk (*) matches any string of zero or more characters. The "similar to" operator always covers the entire attribute value and is case-sensitive. To match a pattern anywhere within the attribute value, the pattern must therefore start and end with an asterisk. If the search includes literal asterisks or question marks, they need to be escaped with a backslash (\). Other operators will be supported upon request.
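
To make the matching rules of the "similar to" operator concrete, the following sketch translates such a pattern into a plain Ruby Regexp. It only mirrors the rules as described above; similar_to_regexp is a hypothetical helper, and how Vapor actually evaluates the operator in the Datastore is not shown here.

```ruby
# Hypothetical helper (not Vapor API): translate a "similar to" pattern
# into an anchored, case-sensitive Ruby Regexp.
def similar_to_regexp(pattern)
  source = pattern.scan(/\\.|./m).map do |token|
    if token.length == 2             # backslash escape: next character is literal
      Regexp.escape(token[1])
    elsif token == '?'
      '.'                            # any single character
    elsif token == '*'
      '.*'                           # any string of zero or more characters
    else
      Regexp.escape(token)
    end
  end.join
  # \A and \z anchor the pattern to the entire attribute value.
  Regexp.new("\\A#{source}\\z", Regexp::MULTILINE)
end

puts similar_to_regexp('*heim*').match?('Mannheim')   # prints true
puts similar_to_regexp('heim').match?('Mannheim')     # prints false
puts similar_to_regexp('t?kyo').match?('Tokyo')       # prints false
```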

Examples:

Match any City or subclass with an altitude of zero:
@pmgr.query( City, "altitude = ?", [ 0 ] )

Same as above, but only match instances of City:
@pmgr.query( City, "altitude = ?", [ 0 ], false )

Match all City and subclass instances that have "heim" anywhere in their name:
@pmgr.query( City, "name ~ ?", [ "*heim*" ] )

The query language expressed in a BNF-like syntax:

   QUERY_STRING    := <SEARCH_CRITERIA> [ AND <SEARCH_CRITERIA> ]
   SEARCH_CRITERIA := <ATTRIBUTE_NAME> <COMP_OP> ?
   ATTRIBUTE_NAME  := [A-Za-z_]+
   COMP_OP         := = | ~

Future plans include the ability to support literal values where variables do not need to be bound to the query.
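
An application that builds query strings dynamically may want to check them before handing them to query(). A minimal sketch of such a check follows. It assumes that several criteria may be chained with AND, as the prose above states, and valid_query_string? is a hypothetical helper, not something Vapor provides.

```ruby
# One criterion: attribute name, comparison operator, placeholder.
CRITERION = /[A-Za-z_]+\s*[=~]\s*\?/

# Assumption: any number of criteria may be joined with AND.
QUERY_STRING = /\A\s*#{CRITERION}(?:\s+AND\s+#{CRITERION})*\s*\z/

# Hypothetical helper (not Vapor API).
def valid_query_string?(query)
  QUERY_STRING.match?(query)
end

puts valid_query_string?('name = ? AND altitude = ?')   # prints true
puts valid_query_string?('name ~ ?')                    # prints true
puts valid_query_string?('altitude > ?')                # prints false
```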

Improving search performance with Index hints

If some attributes are searched very often, it might be useful to hint that indexes be created for these attributes to speed up search queries. For single-attribute indexes, just add an index="true" attribute to the <attribute /> tag in the metadata file. For multi-attribute indexes, use a special <index> tag inside the class's <class> tag. A multi-attribute index for the attributes A and B is useful when searching for attribute A alone or for A and B together. It will not be used for searches involving only B. When queries for A only, for B only and for A and B together are executed equally often, consider creating a single-attribute index for each of them. They will also be used for the A and B case.
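
The rule of thumb from the paragraph above can be written down as a tiny predicate. This is only a simplified restatement of that rule, not a model of how the backend's query planner really decides; index_usable? is a hypothetical name.

```ruby
# Simplified rule (assumption, taken from the text above): a multi-attribute
# index helps only if its leading attribute is among the searched ones.
def index_usable?(index_attributes, searched_attributes)
  searched_attributes.include?(index_attributes.first)
end

index = %w[last_name first_name]

puts index_usable?(index, %w[last_name])              # prints true
puts index_usable?(index, %w[last_name first_name])   # prints true
puts index_usable?(index, %w[first_name])             # prints false
```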

Let's imagine that, for the Person class defined above, most of the search queries are either for the inhabits attribute only or for both first_name and last_name. To improve performance, we create a single-attribute index for inhabits and a multi-attribute index over last_name and first_name, assuming that we occasionally also search for last_name alone. The XML metadata definition would now look like:

  <class name="Person">
    <attribute name="first_name" type="String" />
    <attribute name="last_name" type="String" />
    <attribute name="inhabits" type="Reference" index="true" />
    <index>
      <attribute name="last_name" />
      <attribute name="first_name" />
    </index>
  </class>

Note: Index hints from superclasses are NOT inherited. If inherited attributes are to be searched for often, too, create indexes for them using <index>. Multi-attribute indexes over attributes from the class itself and attributes from superclasses are also possible.

Adding uniqueness constraints to a class

Limiting the valid range of values for an attribute is the job of the class's methods. But sometimes a constraint has to be satisfied by all instances together, for example that a specific attribute's value must be unique over all instances of the same class. E.g. the student ID should be unique for all students, or no two cities shall have the same name. Of course, the class could search for duplicates in the Repository before setting attribute values, but that would contradict our goal that the classes should know as little as possible about Vapor.

Vapor has support for uniqueness constraints for single or multiple attributes of a class. If an object is newly made persistent and another already persistent object has the same value for a unique attribute, a UniquenessError is raised.

Again using our Person and City classes, we add the requirement that no two cities can have the same name and that no two persons can have the same first and last name.

  <class name="Person">
    <attribute name="first_name" type="String" />
    <attribute name="last_name" type="String" />
    <attribute name="inhabits" type="Reference" index="true" />
    <index>
      <attribute name="last_name" />
      <attribute name="first_name" />
    </index>
    <unique>
      <attribute name="last_name" />
      <attribute name="first_name" />
    </unique>
  </class>
  <class name="City">
    <attribute name="name" type="String" unique="true"/>
    <attribute name="altitude" type="Integer" />
  </class>

An index is automatically created for the attributes of each uniqueness constraint.

Now, based on a Repository using the above metadata, let's try to save two Cities with the same name. When committing the second city, a UniquenessError will be raised.

   tokyo = City.new( 'Tokyo', 0 )
   tokyo2 = City.new( 'Tokyo', 8332 )

   @pmgr.transaction{|tx| 
     tokyo.make_persistent
   }             #  tokyo in Repository 

   @pmgr.transaction{|tx| 
     tokyo2.make_persistent
     tokyo2.state           # => Persistable::NEW
   }                        # => Vapor::Exceptions::UniquenessError, "uniqueness constraint violated"
                            # automatic rollback
   tokyo2.state             # => Persistable::TRANSIENT
   

Note: Uniqueness constraints are NOT inherited. If fields that are defined in a superclass should have a uniqueness constraint, specify it using the <unique> tag. Of course, multi-attribute constraints over attributes from the class itself and attributes from superclasses are possible.

Note 2: Currently, attributes that are Arrays can't be part of a uniqueness constraint due to a restriction of PostgreSQL, which enforces the constraints (no unique index can be created on arrays).
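
What a multi-attribute uniqueness constraint means can be pictured with a small in-memory sketch. In Vapor the constraint is enforced by the Datastore, so the classes below (UniqueConstraintSketch, SketchUniquenessError) are purely illustrative assumptions, and plain Hashes stand in for persistent objects.

```ruby
require 'set'

# Stand-in for Vapor's UniquenessError so the sketch runs on its own.
class SketchUniquenessError < StandardError; end

# Hypothetical in-memory picture of a constraint over several attributes:
# the combination of values must be unique, not each value by itself.
class UniqueConstraintSketch
  def initialize(*attributes)
    @attributes = attributes
    @seen = Set.new
  end

  def check!(object)
    key = @attributes.map { |attribute| object.fetch(attribute) }
    # Set#add? returns nil when the key is already present.
    raise SketchUniquenessError, 'uniqueness constraint violated' unless @seen.add?(key)
    key
  end
end

constraint = UniqueConstraintSketch.new(:last_name, :first_name)
constraint.check!(last_name: 'Doe', first_name: 'Jane')
constraint.check!(last_name: 'Doe', first_name: 'John')    # same last name is fine

begin
  constraint.check!(last_name: 'Doe', first_name: 'Jane')  # same combination
rescue SketchUniquenessError => e
  puts e.message     # prints "uniqueness constraint violated"
end
```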

Refreshing Objects

Vapor caches loaded persistent objects to preserve the identity of in-memory objects and to avoid accessing the Datastore unless necessary. While an object is loaded in-memory by one PersistenceManager instance, another PersistenceManager might have changed it in the Datastore. This is detected at commit, but an object can also be refreshed manually to the current data in the Datastore using the Persistable#refresh_persistent method. This sets the values of the object's persistent attributes to those currently in the Datastore and sets the object's state to PERSISTENT. All uncommitted changes are discarded.

However, be aware that right after the object is refreshed, it can be changed again in the Datastore by another PersistenceManager.

  munich.altitude            # => 550
  sleep 300                  # somebody does something while we rest

  munich.refresh_persistent  # retrieve that work

  munich.altitude            # => 560

Transactions

Regular Transactions

Multiple PersistenceManagers could be accessing the same Repository at the same time. In order to avoid inconsistencies such as lost changes, Vapor implements transactions that guarantee that objects don't change in the repository without the application noticing.

Under normal circumstances, all changes to persistent Persistables must occur inside a transaction. A transaction is started by acquiring it from the PersistenceManager using PersistenceManager#transaction() and finished either by committing it, which means that all changes are written to the Datastore, or by rolling it back, whereby all changes are discarded and the state before the transaction is restored.

Vapor uses optimistic locking to prevent concurrent PersistenceManager instances from overwriting each other's changes. When a transaction is about to be committed, Vapor checks whether any object that is going to be changed has been modified in the Repository by another PersistenceManager instance since it was loaded or last refreshed. Should this be the case, a StaleObjectError (when the object has changed) or a DeletedObjectError (when the object has been deleted) is raised.
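
The general technique of optimistic locking can be sketched in a few lines of stand-alone Ruby. The sketch assumes that each stored object carries a revision counter that is compared at commit time; whether Vapor detects changes exactly this way is not stated here, and all class names are hypothetical.

```ruby
# Stand-ins for Vapor's errors so the sketch runs on its own.
class SketchStaleObjectError < StandardError; end
class SketchDeletedObjectError < StandardError; end

# Hypothetical in-memory store illustrating optimistic locking.
class OptimisticStoreSketch
  Record = Struct.new(:revision, :value)

  def initialize
    @rows = {}
  end

  def insert(oid, value)
    @rows[oid] = Record.new(1, value)
  end

  def delete(oid)
    @rows.delete(oid)
  end

  # Returns the revision and value as seen at load time.
  def load(oid)
    row = @rows.fetch(oid)
    [row.revision, row.value]
  end

  # No lock is held between load and commit; the check happens only now.
  def commit(oid, loaded_revision, new_value)
    row = @rows[oid]
    raise SketchDeletedObjectError, "object #{oid} was deleted" if row.nil?
    raise SketchStaleObjectError, "object #{oid} has changed" unless row.revision == loaded_revision
    row.revision += 1
    row.value = new_value
    row.revision
  end
end

store = OptimisticStoreSketch.new
store.insert(1, 530)
rev_a, = store.load(1)               # first reader
rev_b, = store.load(1)               # second reader, same revision
store.commit(1, rev_a, 550)          # first writer wins
begin
  store.commit(1, rev_b, 560)        # second writer holds an outdated revision
rescue SketchStaleObjectError => e
  puts e.message                     # prints "object 1 has changed"
end
```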

When an error such as a UniquenessError, DeletedObjectError or StaleObjectError occurs during commit, the transaction is automatically rolled back before the exception is raised, to prevent Repository inconsistencies. The object that caused the error can be determined by calling the causing_object() method on one of these errors.

There are two ways to begin and then commit or roll back a transaction. One way is to call the appropriate begin(), commit() or rollback() methods on the PersistenceManager's Transaction-object, which can be obtained through PersistenceManager#transaction. Alternatively, PersistenceManager#transaction() can be called with a block to which the Transaction-object is passed. The transaction is automatically started before the block and automatically committed when the block terminates. If the block raises any exception, the transaction is automatically rolled back.

   munich = City.new( 'Munich', 530 )

   munich.make_persistent     # => Vapor::NotInTransactionError
   munich.state               # => Vapor::Persistable::TRANSIENT

   # using a Transaction-object
   t = @pmgr.transaction
   t.begin
     munich.make_persistent
     munich.state             # => Vapor::Persistable::NEW
   t.commit
   t.commit                   # => Vapor::StaleTransactionError

   munich.state               # => Vapor::Persistable::PERSISTENT

   # Transaction as a block
   @pmgr.transaction{|t|
     munich.altitude          # => 530
     munich.altitude = 550
     munich.altitude          # => 550
     munich.state             # => Vapor::Persistable::DIRTY
     t.rollback
   }
     
   munich.altitude            # => 530
   munich.state               # => Vapor::Persistable::PERSISTENT

   # nested Transactions
   @pmgr.transaction{|t|
     t2 = @pmgr.transaction    # => Vapor::NestedTransactionError
   }                           # automatic rollback
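
The block form described above follows a common Ruby pattern: start, yield, commit on normal exit, roll back on an exception. The following stand-alone sketch shows that pattern. BlockTransactionSketch and its run method are hypothetical names, and the log array merely records what would happen; none of this is Vapor's actual code.

```ruby
# Hypothetical transaction object illustrating the block pattern.
class BlockTransactionSketch
  attr_reader :log

  def initialize
    @log = []
    @active = false
  end

  def begin
    @active = true
    @log << :begin
  end

  def commit
    @active = false
    @log << :commit
  end

  def rollback
    @active = false
    @log << :rollback
  end

  # Start before the block, commit after it, roll back if it raises.
  def run
    self.begin
    yield self
    commit if @active          # skipped if the block already rolled back
  rescue Exception
    rollback if @active
    raise                      # the caller still sees the original exception
  end
end

tx = BlockTransactionSketch.new
tx.run { |t| :work }
p tx.log                       # prints [:begin, :commit]
```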

Autocommit-Mode

By default, all changes to persistent objects (Persistable#make_persistent, Persistable#delete_persistent, Persistable#mark_dirty) are instantly saved back to the Repository. This might have a negative impact on performance and does not guard against Repository inconsistency if the application crashes or gets interrupted without having made all intended changes.

This Autocommit-Mode can be turned off by setting PersistenceManager#autocommit= to false. If Autocommit-Mode is off, changes to persistent objects are only saved when a transaction is committed. All changes will be lost if the transaction aborts or the application terminates without committing its last transaction. Errors like UniquenessError will be raised during commit.

Current status of Autocommit-Mode can be determined through PersistenceManager#autocommit?. When Autocommit-Mode is turned on again, after being turned off, the current transaction will be committed.

Autocommit-Mode can also be enabled/disabled by default by setting the Vapor.Autocommit property to true or false (actual boolean value, a String will be interpreted as true) when creating the PersistenceManager.

  @pmgr.autocommit?            # => false; disabled above
  @pmgr.autocommit = true      # => enable Autocommit-Mode, commit transaction 
                               #    up until now
  madrid = City.new( 'Madrid', 650)

  madrid.make_persistent       # immediately saved to Repository
  madrid.state                 # => Persistable::PERSISTENT (not NEW)
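
The remark that a String is interpreted as true follows from Ruby's truthiness rules, where only nil and false count as false. The sketch below shows one way such a property could be evaluated. autocommit_enabled? is a hypothetical helper, and treating a missing key as "on" is an assumption based on the default described above.

```ruby
# Hypothetical helper (not Vapor API) showing Ruby truthiness at work.
def autocommit_enabled?(properties)
  value = properties['Vapor.Autocommit']
  return true if value.nil?    # assumption: unset means the default, which is on
  value ? true : false         # any object other than false or nil counts as true
end

puts autocommit_enabled?({})                              # prints true
puts autocommit_enabled?('Vapor.Autocommit' => false)     # prints false
puts autocommit_enabled?('Vapor.Autocommit' => 'false')   # prints true
```

The last line is the pitfall: the String 'false' is a truthy object, so a value read from a configuration file should be converted to a real boolean first.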

Logging committer and commit message

Each time a transaction is committed, you have the chance to record the committer (who or what initiated the commit) and a message describing the commit, by setting them via the Transaction#committer= and Transaction#message= methods. The committer, once set, is recorded for all subsequent transactions until the value is changed or the PersistenceManager instance is discarded. The commit message is cleared each time a transaction is committed, because it is supposed to describe the current transaction specifically.

  @pmgr.transaction{ |t|
    t.committer = $0
    t.message = "something"
    ....
  }
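
The different lifetimes of committer and message can be illustrated with a stand-alone sketch. CommitLogSketch is a hypothetical class that only mimics the behaviour described above: the committer sticks, the message is cleared after each commit.

```ruby
# Hypothetical stand-in for a Transaction's logging behaviour.
class CommitLogSketch
  attr_accessor :committer, :message
  attr_reader :entries

  def initialize
    @entries = []
  end

  def commit
    @entries << { committer: committer, message: message }
    self.message = nil         # the message describes one transaction only
  end
end

log = CommitLogSketch.new
log.committer = 'importer'
log.message = 'initial load'
log.commit
log.commit                     # no new message set for this commit

p log.entries.last             # prints {:committer=>"importer", :message=>nil} (exact format depends on the Ruby version)
```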

Reading the changelog of an object

The TransactionLog-object associated with the transaction that last changed a Persistable object, thereby creating the current version of the object (see Version Management below), can be accessed through the Persistable#vapor_changelog() method. The TransactionLog object contains information such as the time of the commit, the list of objects modified by the transaction, the committer and the commit message. The last two might be empty if the user didn't supply them. The commit message will be empty if the transaction was triggered by an autocommit.

  last_change = munich.vapor_changelog
  last_change.class            # => Vapor::TransactionLog

  last_change.committer        # => "Somebody"
  last_change.message          # => "Modification Test"
  last_change.date             # => Thu Oct 30 16:36:38 CET 2003
  last_change.modified_objects # => [munich]
     

Version Management

Vapor supports Version Control of objects stored in the Repository. When an object is changed in the Repository by a transaction commit, the old state of the object is not thrown away, but archived in the Repository. Each state of the object at the end of a modifying transaction is called a "version" and is given a "revision number" that is unique among all versions of the same object. The most recent version, which has not (yet) been modified, is the "current version" of the object. This is the version which applications load by default and which references point to.

An important property of Version Management is stability. Once created, the content of an object version should not change for the entire lifetime of the Repository. A change always creates a new version.

Vapor also keeps information about the transaction that created an object version by modifying the previous version, such as the time and date when the transaction was committed, the name of the user making the changes and an optional message explaining the transaction.

Determining the current revision number of a persistent object

The current revision number of a persistent Persistable object is returned by calling the revision() method on the Persistable object. Transient objects return nil from this method.

Non-current versions of persistent objects

Versions of persistent objects other than the current version can be loaded into memory using the various methods described below. The loaded objects are normal instances of their class, with two important differences from in-memory instances of current versions.

  1. The object's persistence state is READONLY. All operations on the object that change the object's persistence state will raise a PersistableReadOnlyError. This includes operations that would change the object's persistence state to DIRTY, e.g. setter methods that properly call Persistable#mark_dirty().
  2. The mapping between a non-current object version in the Repository and in-memory is 1:m. If the same non-current version of an object is requested multiple times from the PersistenceManager, different in-memory objects with the same attribute values will be returned. Because non-current versions are never modified, there is no need for all references to a specific non-current version to point to the same in-memory object for consistency. This behaviour might change in the future if the need for caching and consistency arises.

A reference from a non-current version to another persistent object points to the current version of that persistent object, not to the version that was current when the version holding the reference became non-current. In the future, stronger references that point to a specific version of an object might be implemented, should the need arise to support them.
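
The properties described in this section (old states are archived, versions are stable, and the same old version may be handed out as several distinct in-memory objects) can be pictured with a small stand-alone sketch. VersionHistorySketch is a hypothetical class operating on plain Hashes; frozen objects stand in for the READONLY state, and numbering revisions from 1 is an assumption made for the sketch only.

```ruby
# Hypothetical in-memory picture of an object's version history.
class VersionHistorySketch
  def initialize(attributes)
    @versions = [attributes.dup.freeze]
  end

  # Assumption: revisions are numbered from 1.
  def revision
    @versions.length
  end

  # A change never touches an old version, it appends a new one.
  def update(changes)
    @versions << @versions.last.merge(changes).freeze
    revision
  end

  # Returns a fresh read-only copy on every call, or nil if no such version exists.
  def version(number)
    return nil unless number.between?(1, @versions.length)
    @versions[number - 1].dup.freeze
  end
end

munich = VersionHistorySketch.new('name' => 'Munich', 'altitude' => 530)
munich.update('altitude' => 550)

puts munich.revision                # prints 2
puts munich.version(1)['altitude']  # prints 530
puts munich.version(1).frozen?      # prints true
puts munich.version(1).equal?(munich.version(1))   # prints false
```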

Accessing specific versions of a persistent object

Specific revisions of a persistent object can be loaded and accessed in two different ways. In both cases, nil is returned if a version with the specific revision number does not exist.

Repository Management

Administration of the Repository itself, like adding or removing classes is done using the vaporadmin command.

Removing classes

Persistent classes that are not needed anymore can be deleted from the Repository using the remove command.


  foo@bar: ~> $ bin/vaporadmin help remove
  remove: Remove a class and all it's insstances from the
          Repository.
  usage: vaporadmin REPOSITORY remove [-r|-f|-rf] KLASSNAME

  -r    delete recursivly, including all subclasses
  -f    don't ask for conformation

Caution: All instances (the actual data) of the class will be permanently removed.

By default, classes that have subclasses registered with the Repository will not be deleted. By using the -r option, all subclasses will be deleted recursively, too.

Usage Example:


  foo@bar: ~>vaporadmin user@host/database_name remove City
  Password: password
  Attempting to remove class(es) from repository:
  Really remove `City' including all instances and subclasses from the Repository? (y/N)
  y
    City
  foo@bar: ~>

Changing the definition of a class

During the development and evolution of an application that uses Vapor, changes to a class's definition will become necessary. Most of the time this will be the introduction of new attributes. Instead of reinitializing the Repository and losing all the instances stored in it, the metadata definition can be modified using the update command of vaporadmin. It takes the names of the same XML files as arguments as the add command does.

Updating class definitions is (currently) basically limited to adding new attributes. In order to preserve instances, type redefinition is not possible, and attributes that no longer exist in the updated definition will not be deleted.