This document is a guide to installing and configuring the Ontopia Topic Maps Engine RDBMS Backend Connector. It assumes that the reader is familiar with relational databases and has some knowledge of how to configure them.
5.1 2011-04-10
Note This is a short summary of the installation process for the impatient. If you want more detail, jump to the introduction and read on from there.
In order to use the RDBMS backend you need to:
1. Create the database, using the scripts in rdbms/setup.
2. Put the JDBC driver in apache-tomcat/common/lib.
3. Create a database properties file (examples can be found in rdbms/config). Put it in apache-tomcat/common/classes.
4. Refer to the properties file from the apache-tomcat/common/classes/tm-sources.xml file. You'll find examples commented out at the bottom of the file.
That's it!
The RDBMS Backend Connector adds relational database persistence support to the Ontopia Topic Maps Engine. The persistence is transparent, and users of the topic map interfaces don't have to take any additional steps in order to persist topic maps. Applications that use the RDBMS Backend Connector must properly demarcate transactions and follow the general rules of a transactional system. This includes managing the life-cycle of topic map stores and transactions.
Before you can install and start using the RDBMS Backend Connector you must have an RDBMS server installed, and you must have created a database inside it. (The database can have any name since Ontopia finds the database name from the JDBC URL.) You also need a JDBC 2.0 driver, and to have set everything up so that you can connect to the database through the JDBC driver. If you want to use batch writing or connection pooling you will also have to install the JDBC 2.0 optional package. Once this is done, you are ready to start.
A JDBC 2.0 driver for your database must be available on your CLASSPATH. Your database vendor should be able to provide you with an appropriate JDBC driver.
If you want to use connection pooling you need the JDBC 2.0 Optional Package API (the javax.sql packages) on your CLASSPATH. This API can be downloaded from Sun Microsystems.
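A quick way to confirm that a driver jar is actually visible to the JVM is a Class.forName probe. The sketch below is an illustration only; the vendor class name shown is just an example (PostgreSQL's driver class), and your database vendor's documentation gives the correct name for your driver.

```java
// Probe the classpath for a JDBC driver class before starting Ontopia.
public class DriverCheck {
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.sql ships with the JDK, so this always succeeds
        System.out.println(isOnClasspath("java.sql.DriverManager")); // true
        // A vendor driver class must be added to the classpath by you;
        // this prints false until the driver jar is present
        System.out.println(isOnClasspath("org.postgresql.Driver"));
    }
}
```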
Ontopia fully supports Unicode, but for the RDBMS Connector to support Unicode the database must support storing Unicode text. Most databases do not enable this by default, and require special options to be set when creating the database. This section documents these options for various databases. Note that you need only do this if you actually need to store Unicode characters in your database.
If you don’t do this, the tests in the test suite that verify Unicode support will fail. This will give failure reports like the ones shown below.
There were 4 failures:
1) testFile(net.ontopia.topicmaps.xml.test.AbstractCanonicalTests$CanonicalTestCase) "test file unicode.xtm canonicalized wrongly"
2) testFile(net.ontopia.topicmaps.xml.test.AbstractCanonicalTests$CanonicalTestCase) "test file unicode.iso canonicalized wrongly"
3) testExport(net.ontopia.topicmaps.xml.test.AbstractCanonicalExporterTests$CanonicalTestCase) "test file unicode.xtm canonicalized wrongly"
4) testExport(net.ontopia.topicmaps.xml.test.AbstractCanonicalExporterTests$CanonicalTestCase) "test file unicode.iso canonicalized wrongly"
To create a database which stores text in UTF-8, give the -E UNICODE option to createdb, as in createdb -E UNICODE topicmaps. More information can be found on the PostgreSQL website.
To create a database which stores text in UTF-8 and uses case-sensitive (binary) collation, use the following DDL command:
CREATE DATABASE topicmaps DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_bin;
To modify a database to store text in UTF-8, execute the following commands. More information is available on Oracle technet.
STARTUP MOUNT;
ALTER SYSTEM ENABLE RESTRICTED SESSION;
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
ALTER DATABASE OPEN;
ALTER DATABASE CHARACTER SET UTF8;
SHUTDOWN NORMAL;
The next step is to create the topic map database schema in the database. This is done by executing the SQL scripts provided in one of the *.create.sql files, found in the $ONTOPIA_HOME/rdbms/setup directory. If there is no script for the database you are using, use the generic.create.sql script.
There are also scripts for dropping the database schema in the same directory.
These scripts create the tables with primary keys and indexes in the database. They also initialize the TM_ADMIN_SEQUENCE table by inserting a single row. The TM_ADMIN_SEQUENCE table contains the sequence counters used to generate object identities.
In later versions we will provide a command-line tool which initializes the database for you automatically. In the meantime, if you run into problems when initializing the database, please send an email to support@ontopia.net.
The next step is to create a properties file that contains the information necessary to connect to the database.
Only SQL Server 2005 or newer versions are supported. Older versions may work, but there will be contention in the database if used in a concurrent environment.
In order to enable the snapshot isolation and row versioning features (both needed by Ontopia) the following two statements need to be executed in the database. Replace the topicmaps name with the name of your database. Note that this will not work with the master database, as the read_committed_snapshot option is not supported in that database. Use a custom database instead.
alter database topicmaps set allow_snapshot_isolation on;
alter database topicmaps set read_committed_snapshot on;
More information about these database features can be found in the SQL Server documentation.
Before we can connect to the database we need to tell the backend how to connect. This is done by writing a Java properties file that provides the necessary information. The properties that need to be defined are listed below. You can put the properties file anywhere you like, but it is recommended to keep it together with your other configuration files. For Ontopia to find this file you have to point to it from the tm-sources.xml file. (See the Navigator Configuration guide for details; by default this file is located in apache-tomcat/common/classes.)
The default properties file for use in applications can be designated by setting the net.ontopia.topicmaps.impl.rdbms.PropertyFile system property to the property filename. Note that the default can always be overridden through the APIs.
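For instance, an application can designate its default properties file programmatically before any stores are created (the equivalent of passing -Dnet.ontopia.topicmaps.impl.rdbms.PropertyFile=... on the command line). A minimal sketch; the file path below is a placeholder, not a real installation path:

```java
// Designate the default RDBMS properties file via the system property
// documented above. Ontopia reads this property when no file is given
// explicitly through the APIs.
public class PropertyFileDefault {
    static final String KEY = "net.ontopia.topicmaps.impl.rdbms.PropertyFile";

    static void setDefault(String path) {
        System.setProperty(KEY, path);
    }

    public static void main(String[] args) {
        setDefault("/opt/ontopia/db.postgresql.props"); // placeholder path
        System.out.println(System.getProperty(KEY));
    }
}
```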
A collection of sample database properties files can be found in the $ONTOPIA_HOME/rdbms/config directory.
The net.ontopia.topicmaps.impl.rdbms.Database property identifies the database type; the recognized values are oracle8, oracle9i, oracle10g, postgresql, mysql, h2, sqlserver and generic. If you don't find your database in this list, try using the generic setting.

If the batch updates property is set to true, all database modifications will be performed through JDBC batch updates.

If multiple instances of the RDBMS Backend Connector are to access the same topic map(s), the clustering feature must be enabled to prevent inconsistencies. As each instance holds an internal data cache, any modifications to the data need to be replicated across all nodes in the cluster. This must be done to prevent any of the nodes from holding stale data.
Warning Note that if your application uses the Web Editor framework then another clustering configuration needs to be set up. See the Web Editor Framework Developer's Guide for more information.
The clustering property value must be prefixed with jgroups:, e.g. jgroups:cluster:my-topicmaps. JGroups is currently the only clustering implementation for cache invalidation.

Note that there is a critical bug in pre-1.6 JVMs related to IPv6 network support that prevents them from being used with JGroups out of the box. To make JGroups use IPv4, make sure that you pass the following system property to the JVM on startup:
-Djava.net.preferIPv4Stack=true
The RDBMS Backend Connector retrieves all its data from the backing relational database. The communication with the database is relatively fast, but can introduce a big performance hit when network traffic is high. In order to offer high performance, Ontopia provides a number of client caches that make it possible to avoid repeated reads from the database.
Warning Do not disable the shared cache when there are concurrent transactions updating the same topic map as this would mean that the transactions might not see each other’s committed changes.
One cache holds the results of TopicIF.getRolesByType(TopicIF roleType); another holds the results of TopicIF.getRolesByType(TopicIF roleType, TopicIF associationType). Both caches are also used by the tolog query engine to improve association traversal performance.

The RDBMS Backend Connector uses JDBC connections to access the relational database. Being able to reuse connections across interactions with the database is important because creating connections is expensive. The RDBMS Backend Connector supports different JDBC connection pool implementations. Use the following properties to enable and configure connection pooling:
- Connection pooling: if set to true, or to any of the named connection pool implementations, connection pooling will be used. The default connection pool implementation is 'dbcp'. It is recommended to use connection pooling if you need to open more than a single topic map transaction.
- Idle timeout: when a connection has been idle for this long and there are more than MaximumIdle connections in the pool, the connection is closed. Setting the timeout to -1 means idle connections are never closed.
- WhenExhaustedAction (one of grow, block, fail; default: block). When set to grow: the pool will temporarily grow (see SoftMaximum). When set to fail: requesting a connection when the pool is exhausted will lead to an exception. When set to block: the request for a connection will block until a connection is available. Note: overrides the SoftMaximum setting.
- Maximum wait: if SoftMaximum is set to false, or WhenExhaustedAction is set to block, this is the maximum time that the connection request will block. If exceeded, an exception is thrown. If set to -1, the request will block indefinitely.

Ontopia supports pooling of topic map stores in the same way as you can pool JDBC connections with the provided connection pool implementations. The topic map store pooling is currently only available via the topic map reference implementation in the RDBMS Backend Connector. When using the RDBMS Backend Connector with the Navigator and Web Editor frameworks, topic map store pooling is always enabled. Note that when using the Topic Maps Engine API one must access the topic map stores through one of the topic map source implementations in the RDBMS Backend to make use of store pooling. These topic map source implementations always use store pooling.
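The exhaustion policies described above can be illustrated with a toy object pool. This is a single-threaded sketch of the grow/fail semantics only, not Ontopia's or DBCP's actual implementation (a real connection pool is thread-safe and the block policy waits for a release):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy pool illustrating the grow/fail exhaustion policies.
public class ToyPool {
    enum WhenExhausted { GROW, FAIL }

    private final Deque<Object> idle = new ArrayDeque<>();
    private final int maximum;          // cap on handed-out objects
    private final WhenExhausted policy;
    private int active = 0;

    ToyPool(int maximum, WhenExhausted policy) {
        this.maximum = maximum;
        this.policy = policy;
    }

    Object borrow() {
        if (!idle.isEmpty()) { active++; return idle.pop(); }
        if (active < maximum || policy == WhenExhausted.GROW) {
            active++;
            return new Object();        // stand-in for opening a connection
        }
        // FAIL: requesting a connection from an exhausted pool throws
        throw new IllegalStateException("pool exhausted");
    }

    void release(Object o) { active--; idle.push(o); }
    int activeCount() { return active; }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(2, WhenExhausted.FAIL);
        Object c = pool.borrow();
        pool.release(c);                // returned to the idle deque
        System.out.println(pool.activeCount()); // 0
    }
}
```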
The main reason for using topic map store pooling is that it greatly improves performance in transactional environments. This is possible because topic map objects, topic map data, cached information and other state information can be reused across transaction boundaries.
You can override the store pool defaults by setting the following properties:
The RDBMS Backend Connector has its own tolog implementation, which it will use if it finds that the tolog query can be translated into SQL. In most cases the entire tolog query can be translated into a single SQL query, but for some more complex tolog queries only parts, or none, of the query can be translated. Sometimes it makes sense for a topic map application not to use the RDBMS tolog implementation. Such applications can change the default tolog implementation by setting the following property:
The tolog query language, with its value-like predicate, uses an underlying full-text integration. At the moment there are two options. The first is 'generic', which means that the search term is turned into a SQL expression using the LIKE operator with wildcards on both sides. The second is 'oracle_text', which translates the search into an Oracle Text search expression. It is possible to specify exactly which full-text integration should be used:
Note that this property will also have an effect with direct use of the net.ontopia.infoset.fulltext.impl.rdbms.RDBMSSearcher class.
Sample properties file (Oracle)
net.ontopia.topicmaps.impl.rdbms.Database=oracle10g
net.ontopia.topicmaps.impl.rdbms.ConnectionString=jdbc:oracle:thin:@127.0.0.1:1521:TOPICMAPS
net.ontopia.topicmaps.impl.rdbms.DriverClass=oracle.jdbc.driver.OracleDriver
net.ontopia.topicmaps.impl.rdbms.UserName=scott
net.ontopia.topicmaps.impl.rdbms.Password=tiger
Warning Please note that the backslash character (\) is used as an escape character in Java property files. If you need to write a backslash (for example in DOS paths), make sure you write it as \\. Lines ending in a single backslash will cause the following property to be ignored.
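One way to see the escaping rule in action is to let java.util.Properties write the entry for you: store() emits the doubled backslashes that load() later collapses back to single ones. A minimal round-trip sketch (the key is the property name documented above; the DOS path is a placeholder):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Demonstrate backslash escaping in Java properties files: a value with
// single backslashes is written out escaped (\\) and read back intact.
public class BackslashDemo {
    static String roundTrip(String value) {
        try {
            Properties out = new Properties();
            out.setProperty("net.ontopia.topicmaps.impl.rdbms.PropertyFile", value);
            StringWriter w = new StringWriter();
            out.store(w, null);          // writes the backslashes as \\
            Properties in = new Properties();
            in.load(new StringReader(w.toString()));
            return in.getProperty("net.ontopia.topicmaps.impl.rdbms.PropertyFile");
        } catch (IOException e) {        // cannot happen with string I/O
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String dosPath = "C:\\ontopia\\db.props";   // placeholder path
        System.out.println(roundTrip(dosPath).equals(dosPath)); // true
    }
}
```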
Instead of using the Lucene integration with the RDBMS Backend Connector one can take advantage of the native full-text support that the database may have. This support can range from simple LIKE operator queries to advanced full-text indexing support.
In the net.ontopia.infoset.fulltext.impl.rdbms package there is a class called RDBMSSearcher that implements the SearcherIF interface. Pass your RDBMS TopicMapIF instance to the single constructor of this class and you have a searcher instance that lets you search for topic names, variant names, and occurrences using the native pattern language.
The actual queries that are issued, and the syntax that is supported, are usually database-specific, but the default behaviour is to use the LIKE operator. This means that you can use the '%' symbol to specify wildcards in your queries. Most relational databases support the LIKE operator, and therefore the default should be fine for most databases. Note that the actual query will make use of the lower function to make the query case-insensitive.
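The LIKE-based strategy described above can be sketched as follows. This is an illustration of the general technique, not the exact SQL Ontopia emits, and a real implementation would bind the term as a statement parameter rather than splicing it into the string:

```java
// Sketch of the generic full-text strategy: wrap the search term in
// wildcards on both sides and lower-case both operands so that the
// comparison is case-insensitive.
public class GenericFulltext {
    static String likeClause(String column, String term) {
        return "lower(" + column + ") like '%" + term.toLowerCase() + "%'";
    }

    public static void main(String[] args) {
        System.out.println(likeClause("content", "Topic"));
        // prints: lower(content) like '%topic%'
    }
}
```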
In addition to the generic RDBMS full-text support Ontopia comes with support for several database-specific full-text implementations. They are all described below.
Note that enabling native RDBMS full-text through the net.ontopia.infoset.fulltext.impl.rdbms.RDBMSSearcher.type database property will also enable it for the value-like tolog predicate.
Recent versions of Oracle come with full-text support included; the component is called Oracle Text. In older versions, Oracle Text was an optional component that had to be purchased and installed separately.
To enable use of Oracle Text in Ontopia you will have to set the net.ontopia.infoset.fulltext.impl.rdbms.RDBMSSearcher.type database property to oracle_text. You will also have to make sure that the columns TM_TOPIC_NAME.content, TM_VARIANT_NAME.content, and TM_OCCURRENCE.content are set up to be indexed by Oracle Text.
Here are some sample statements to set this up for Oracle. Note that they require the Oracle Context Option, which for Oracle 8.x needs to be installed separately, but for Oracle 9i and later it should be installed by default.
create index TEXT_TOPIC_NAME on TM_TOPIC_NAME(content)
indextype is ctxsys.context
parameters ('sync(on commit) datastore ctxsys.default_datastore filter ctxsys.null_filter section group ctxsys.auto_section_group');
create index TEXT_OCCURRENCE on TM_OCCURRENCE(content)
indextype is ctxsys.context
parameters ('sync(on commit) datastore ctxsys.default_datastore filter ctxsys.null_filter section group ctxsys.auto_section_group');
create index TEXT_VARIANT_NAME on TM_VARIANT_NAME(content)
indextype is ctxsys.context
parameters ('sync(on commit) datastore ctxsys.default_datastore filter ctxsys.null_filter section group ctxsys.auto_section_group');
These indexes will be kept up to date on every commit. If you don’t need to have them synced that often, consider changing the sync parameter.
The Oracle RDBMS will issue an error message if you attempt an Oracle Text query against columns that have not been full-text indexed. If this happens, run the create index statements above, or similar ones.
Please consult your database documentation for more information on Oracle Text.
The full-text support in PostgreSQL is implemented through a module called tsearch2. This module will usually have to be installed manually before it can be used. This is typically done by logging in as the PostgreSQL administrator user and running the ${PSQL_LIB}/contrib/tsearch2.sql script in the database, e.g. using the psql command line utility. If you later wish to uninstall tsearch2, that can be done with the ${PSQL_LIB}/contrib/untsearch2.sql script.
The next step is to grant access to the tsearch2 tables to the database user that is used by Ontopia. Replace myuser with the name of your database user below.
grant all on pg_ts_cfg to myuser;
grant all on pg_ts_cfgmap to myuser;
grant all on pg_ts_dict to myuser;
grant all on pg_ts_parser to myuser;
tsearch2 must then be told which of its configurations to use. Ontopia uses a configuration called default. The configurations are mapped to the server encoding. Issue the following statement to set the locale for the default configuration.
update pg_ts_cfg set locale='en_US.UTF-8' where ts_name='default';
If you don’t know what the server encoding is then the following query will tell you:
select setting from pg_settings where name='lc_ctype';
The last step is to add a tsearch2 column and an index to each of the three tables that are to be searched. To keep the index up-to-date a set of triggers will also have to be added.
/* add columns */
alter table TM_TOPIC_NAME add column tsearch2 tsvector;
alter table TM_VARIANT_NAME add column tsearch2 tsvector;
alter table TM_OCCURRENCE add column tsearch2 tsvector;
/* create indexes */
create index tsearch2_index_tbn on TM_TOPIC_NAME using gist(tsearch2);
create index tsearch2_index_tvn on TM_VARIANT_NAME using gist(tsearch2);
create index tsearch2_index_to on TM_OCCURRENCE using gist(tsearch2);
/* populate indexed columns */
update TM_TOPIC_NAME set tsearch2=to_tsvector('default', coalesce(content,''));
update TM_VARIANT_NAME set tsearch2=to_tsvector('default', coalesce(content, ''));
update TM_OCCURRENCE set tsearch2=to_tsvector('default', coalesce(content, ''));
/* add triggers */
create trigger tsearch2_update_tbn before update or insert on TM_TOPIC_NAME for each row execute procedure tsearch2(tsearch2, content);
create trigger tsearch2_update_tvn before update or insert on TM_VARIANT_NAME for each row execute procedure tsearch2(tsearch2, content);
create trigger tsearch2_update_to before update or insert on TM_OCCURRENCE for each row execute procedure tsearch2(tsearch2, content);
To enable use of tsearch2 in Ontopia you will have to set the net.ontopia.infoset.fulltext.impl.rdbms.RDBMSSearcher.type database property to postgresql or tsearch2.
The full-text support in SQL Server is not enabled by default, so it will have to be enabled:
EXEC sp_fulltext_database 'enable';
Then a full-text catalog has to be created and a full-text index added to each of the three tables that are to be indexed. The database server will keep the full-text indexes automatically up to date as changes occur.
create fulltext catalog okscatalog as default;
create fulltext index on TM_TOPIC_NAME(content) key index TM_TOPIC_NAME_pkey;
create fulltext index on TM_VARIANT_NAME(content) key index TM_VARIANT_NAME_pkey;
create fulltext index on TM_OCCURRENCE(content) key index TM_OCCURRENCE_pkey;
alter fulltext catalog okscatalog reorganize;
Note that the TM_TOPIC_NAME_pkey, TM_VARIANT_NAME_pkey and TM_OCCURRENCE_pkey indexes referenced above are primary key index names, but they may not actually have those names in your database. You will have to look up the actual names of those three indexes if the given names don't work.
To enable use of Microsoft SQL Server full-text in Ontopia you will have to set the net.ontopia.infoset.fulltext.impl.rdbms.RDBMSSearcher.type database property to sqlserver.
In order for the Java Virtual Machine to be able to load the RDBMS Backend Connector classes you need to add the ontopia.jar file to your CLASSPATH environment variable. See the Ontopia Installation and getting started document for more information.

Note that you also need to put your selected JDBC driver classes on the CLASSPATH.
If you are using batch writing or connection pooling, you must make sure that the JDBC 2.0 optional package is available on the CLASSPATH.
Now that you have an empty database, you are ready to verify that the system actually works by running the test suite against the database. If you don't want to run the test suite, feel free to skip this section.
Before you can run the RDBMS Backend Connector test suite, make sure that you have the Ontopia test environment correctly set up. This includes having the ontopia.jar and ontopia-test.jar files on the CLASSPATH. See the Ontopia Installation and getting started document for more information.
You can now run the test suite by following the procedure below:
On Windows: go to the ${basedir} directory. Set the environment variable ONTOPIA_HOME to ${basedir}, using the command set ONTOPIA_HOME=${basedir}. Then run the tests\runtests-rdbms.bat script, passing the property file as an argument.

On Unix: go to the ${basedir} directory. Then run the tests/runtests-rdbms.sh script, passing the property file as an argument.

The result should be similar to what is shown below. If it is not, please send an email to support@ontopia.net with the output of the test run as well as details about your platform and the database you are using.
Ontopia RDBMS Backend Connector [version]
Success: All required classes found.
Running tests
.........................................
.........................................
.........................................
.........................................
.........................................
.........................................
.........................................
..............................
Time: [xxx.yyy]
OK ([zzz] tests)
Asserts: [nnn]
Congratulations. Now that you’ve gotten this far it is time to start developing software using the RDBMS Backend Connector. See the Ontopia RDBMS Backend Connector, Developer’s Guide for information about how to develop applications using the backend connector.