Note: All documentation should be considered to be in-progress, working documents unless otherwise noted.
This document provides instructions for installing and configuring an instance of RMap. Suggestions for refinements and improvements to the configuration or the documentation are welcome.
For convenience and simplicity, all components described will be installed or at least symlinked to the folder /rmap/ and, where possible, run under a user called “rmap”. Local installations may vary according to local requirements or practices. None of the internal features of RMap are dependent on applying a specific path or username.
The following installation instructions are based on Red Hat Enterprise Linux OS.
Later versions of these tools may work but have not been tested.
- Linux OS
- Tomcat 8
- Java 8
- GraphDB 6.6.3+. Other triplestores that use the Sesame Framework API (recently renamed to Eclipse RDF4J) should work. Early iterations of RMap used the Native Java Store option of Sesame, which is fine for smaller datasets (<500,000 DiSCOs).
- MySQL 5.5. Other versions will probably work. Only basic MySQL functionality is used.
- Apache Server (optional). Apache Server can be used for port redirects, permissions, and SSL. It is possible to replace this by configuring TCP port access through Tomcat and iptables.
Server requirements will vary. One server or multiple servers can be used. As a benchmark, our hosted RMap uses 2 AWS servers and should have stable performance up to an estimated 20 million DiSCOs or between 600 million and 1 billion statements:
RMap Web Services server: AWS m3.large (2 vCPU @2.5GHz, 7.5G RAM, 32 GB SSD)
GraphDB server: AWS m4.xlarge (4 vCPU @2.4GHz, 15G RAM, 256 GB SSD)
The rmap user runs the Tomcat instance and owns all /rmap directories. If multiple servers are used, e.g. one for the RMap web services and another for the triplestore/graph database, then the steps in this section will need to be carried out on both servers.
To configure the user on a new server, the following steps can be used:
Create user and user group
First, log into an account that has sudoer privileges, and run the following commands.
When you create the user, a group of the same name may be created automatically and the user added to that group. If not, run the following:
Make rmap user a sudoer
For convenience so that you don’t need to continually switch from an admin user and back, you may wish to make rmap a sudoer while performing the installation.
Then add this line to the end:
Switching to the rmap user
During installation of RMap, sudoer access will be required at certain points. If you are in the process of installing RMap, do not switch to the rmap user until specifically instructed later in the document.
Once RMap is installed, however, it will be necessary to switch to the rmap user to configure the .bashrc file and to start or stop Tomcat. If you need to switch to the rmap user, use the following command:
To support direct downloads from 3rd party sites throughout the install, you may need to install wget. First check to see if you already have it installed:
If you get a command not found error, run the following to install:
Enter wget -h again to ensure you now get a list of options.
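The check-and-install step above might look like the following sketch. The have_cmd helper is an illustrative name, not part of RMap, and the suggested yum command assumes the Red Hat environment described earlier.

```shell
# Helper (hypothetical name): succeeds if the named command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd wget; then
  echo "wget is already installed"
else
  # Assumes a yum-based system such as RHEL; run the install from a sudoer account.
  echo "wget not found; install it with: sudo yum install wget"
fi
```

The script only reports what it finds, so it can be run safely before deciding whether to install anything.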
Create the rmap directories, make the rmap user the owner, and ensure the rmap user has appropriate rights to modify that directory. The commands for this are:
Check that the rights have been applied:
Switch to the rmap user and create some other directories as follows:
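As a rough sketch of the directory setup, the helper below creates a few sub-directories under a given root. The list of folders is an assumption based on the server locations named elsewhere in this guide (/rmap/oracle, /rmap/apache, /rmap/graphdb/data), and make_rmap_dirs is a hypothetical name; adjust both to your own layout.

```shell
# As a sudoer, the top-level folder would first be created and handed to rmap:
#   sudo mkdir /rmap
#   sudo chown rmap:rmap /rmap
#   ls -ld /rmap          # check owner and permissions

# Then, as the rmap user, create the sub-directories (assumed list).
make_rmap_dirs() (
  root="${1:-/rmap}"
  mkdir -p "$root/oracle" "$root/apache" "$root/graphdb/data"
  # Show the result so ownership and permissions can be checked by eye.
  ls -ld "$root" "$root"/*
)
```

Calling make_rmap_dirs with no argument targets /rmap; passing another path is useful for a trial run.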
Server location: /rmap/oracle
Next we will install and configure Java 8 on the server. If multiple servers are used, Java should be installed on all instances. Described below is a standard install, using Java 8 from the Oracle downloads site.
Download and unzip Java
First download and unzip the Java Development Kit to the /rmap/oracle folder:
In this section, we will set the JAVA_HOME and JAVA_OPTS environment variables in the rmap user’s .bashrc file. Before making the following changes, ensure that the current shell user is the rmap user. If not, run the “su rmap” command. First open the .bashrc file:
Ensure the file contains the following information in .bashrc:
Note that the path may vary depending on the exact Java release used. JAVA_OPTS memory allocations will need to be adjusted based on the server’s available RAM.
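For illustration, the .bashrc entries might resemble the sketch below. The JDK folder name and the heap sizes are placeholders chosen for this example, not values taken from a tested RMap installation.

```shell
# Illustrative .bashrc entries (placeholder values, adjust before use).
# JDK folder name is an assumption; use the folder created when unzipping Java.
export JAVA_HOME=/rmap/oracle/jdk1.8.0_xx
# Initial and maximum JVM heap; size these to the server's available RAM.
export JAVA_OPTS="-Xms1g -Xmx4g"
```

After reopening the shell, echoing $JAVA_HOME and $JAVA_OPTS should print the values set here.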
Use “:w” to save the changes and “:q” to exit.
At this point close the shell and reopen it, then do the following to test the environment variables have been applied:
Set server’s default Java version [optional]
In some cases it might also be useful to set this newly installed version of Java as the server default. To do this, the user needs to be a sudoer:
In the example above, only one version of Java is installed, but where there are other versions this allows you to unambiguously define the default.
Server location: /rmap/apache/tomcat8
In this section we will perform a standard Tomcat 8 install and add some configuration. This should all be done as the rmap user. Note that if there are multiple web servers, this will need to be repeated on each. Installation is described below, but if needed, additional installation instructions can be found here: http://tomcat.apache.org/tomcat-8.0-doc/appdev/installation.html
Download and unzip Tomcat
First download and unzip Tomcat 8 to the /rmap/apache folder:
Tomcat is now installed with default settings - it will use the JAVA_HOME and JAVA_OPTS settings in the .bashrc to start up.
Create and configure a setenv.sh file in Tomcat [optional]
Create the file /rmap/apache/tomcat8/bin/setenv.sh to give you a place to set additional Tomcat environment variables. This file is run automatically each time you start or stop the Tomcat server, so you can use it to output some information about the server when you start Tomcat.
Add the following to output JAVA_OPTS to the screen when starting or stopping Tomcat:
CATALINA_BASE and CATALINA_HOME can also be set here if required.
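A minimal setenv.sh along these lines could be generated as follows. The write_setenv helper is a hypothetical name used only for this sketch; it writes a one-line script into the bin folder of the Tomcat home passed to it.

```shell
# Writes a minimal setenv.sh into <tomcat home>/bin (helper name is hypothetical).
write_setenv() (
  tomcat_home="${1:-/rmap/apache/tomcat8}"
  cat > "$tomcat_home/bin/setenv.sh" <<'EOF'
#!/bin/sh
# Sourced by catalina.sh on every start and stop.
echo "Using JAVA_OPTS: $JAVA_OPTS"
EOF
  chmod +x "$tomcat_home/bin/setenv.sh"
)
```

Exports for CATALINA_BASE or CATALINA_HOME could be added to the generated file in the same way if they are needed.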
Test Tomcat Startup
At this point it may be worth verifying that things are working. Some of these tests will require remote access to the server. Enter the startup command, and then watch Tomcat start. Ensure the paths are as expected:
You can also ensure the process is running using the following command, which will output the process information:
Finally, if remote access is available to your server, you can load the default Tomcat page in the browser by accessing http://[serverip]:8080. 8080 is Tomcat’s default port.
To continue with the installation, shut down Tomcat and verify the process has ended:
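The start, check, and stop steps in this section could be wrapped in small helpers such as the ones sketched here. The function names are illustrative; startup.sh and shutdown.sh are the standard scripts shipped in Tomcat's bin folder, and the default path assumes the install location used in this guide.

```shell
# Assumed install location; override CATALINA_HOME if Tomcat lives elsewhere.
CATALINA_HOME="${CATALINA_HOME:-/rmap/apache/tomcat8}"

tomcat_start() { "$CATALINA_HOME/bin/startup.sh"; }
tomcat_stop()  { "$CATALINA_HOME/bin/shutdown.sh"; }
# Lists matching processes; the grep command itself typically appears as well.
tomcat_ps()    { ps -ef | grep tomcat; }
```

Running tomcat_ps after tomcat_stop should no longer show the Tomcat Java process.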
Many of the web service paths in RMap accept encoded URIs as part of the request path e.g.
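To make the idea concrete, a URI has to be percent-encoded before it can be placed inside a request path. The sketch below encodes only the characters relevant to this example; encode_uri is a hypothetical helper and the example.org URI is made up.

```shell
# Percent-encodes the characters that matter for this example (%, :, /).
# This is not a general-purpose URL encoder.
encode_uri() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/:/%3A/g' -e 's|/|%2F|g'
}

encode_uri 'http://example.org/article/1'
# prints http%3A%2F%2Fexample.org%2Farticle%2F1
```

The encoded value, including its %2F sequences, is what would appear as the final segment of the request path.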
By default the server will reject this path because the URL-encoded “http://” contains encoded slashes. To ensure the server accepts requests of this nature, do the following:
Add the following line to the end of the file so that encoded HTTP URIs can be accepted by the server:
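The exact line is not reproduced here. As an assumption, the sketch below uses Tomcat 8's standard system property for accepting encoded slashes and appends it to conf/catalina.properties; both the property choice and the file are inferred rather than confirmed by this guide. The allow_encoded_slash helper name is hypothetical.

```shell
# Appends the encoded-slash property to catalina.properties if it is missing.
allow_encoded_slash() (
  props="${1:-/rmap/apache/tomcat8/conf/catalina.properties}"
  line='org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true'
  # Append only when the exact line is not already present.
  grep -qxF "$line" "$props" || printf '%s\n' "$line" >> "$props"
)
```

Because the helper checks before appending, running it more than once leaves a single copy of the line in the file.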
Enter “:w” to save and “:q” to exit vi. If Tomcat is running, a restart will be required for this to be applied. Tomcat can be stopped and started using the following commands.
Once running, Tomcat will automatically deploy any *.war files placed in the /tomcat8/webapps folder.
Secure Tomcat [optional]
Depending on whether your server will be publicly accessible, it may be a good idea to apply the following security configurations for Tomcat 8.
Configuring server to use port :80 / :443 [optional]
Configuring the application to use port 80 will allow you to remove the port number from the URL. This will allow you to enter the URL in a browser without defining the port number, since :80 is applied by default.
iptables is a tool for firewall configuration on Linux, but it can also be used to filter and re-route requests. The instructions here show how to redirect HTTP/HTTPS requests on ports 80/443 to Tomcat’s ports 8080/8443. This supports clean URL requests without the port number appended. To configure this, run the following commands:
Save the settings using this command, so that they will be recreated on reboot:
If the save does not work, you may need to install the iptables services, then try again:
Finally verify the settings:
At this point, requests on ports 80 and 443 will automatically be forwarded to the corresponding Tomcat ports, so the port number no longer needs to appear in the URL.
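The redirect rules described in this section can be sketched as below. To keep the example safe to run, the redirect_rule helper (a hypothetical name) only prints each iptables command for review instead of executing it.

```shell
# Prints (does not run) one NAT redirect rule so it can be reviewed first.
redirect_rule() {
  printf 'iptables -t nat -A PREROUTING -p tcp --dport %s -j REDIRECT --to-ports %s\n' "$1" "$2"
}

redirect_rule 80 8080    # HTTP  -> Tomcat HTTP port
redirect_rule 443 8443   # HTTPS -> Tomcat HTTPS port
```

Once the printed commands look right, they would be run with sudo, and "sudo iptables -t nat -L -n" lists the NAT table so the rules can be verified.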
For smaller databases or to test out the system, Eclipse RDF4J (previously OpenRDF Sesame) has a Native Java Store that can be hosted as a web service. For larger databases, the free version of GraphDB is a good start. The test instance of RMap currently uses GraphDB and holds 4.2 million DiSCOs. Several paid versions are available if additional scaling across multiple servers is required.
The triplestore / graph database can be installed on the same server as other web services, or a separate one. In general the machine containing the triplestore uses more memory than the server holding the RMap web services. When there is heavy read and write activity, faster storage can significantly improve triplestore performance.
Web-based admin tool: /rmap/apache/tomcat8/webapps/graphdb
GraphDB is a graph database that implements the Sesame Framework API. There is a free version that works well; you can register to download a copy at http://info.ontotext.com/graphdb-free-ontotext. An enterprise edition is available for a fee if further scaling is required.
Once registered, Ontotext will send a link to download the software. You can download this file to your local machine and move the war file to the server, or follow these steps.
As the admin user:
If unzip is not installed:
Set up a directory:
Carry out the following steps as the rmap user. Note that your “zipFileName” will be unique:
Make sure Tomcat is running
This should output two processes (described in the Test Tomcat section). If not, start up Tomcat:
Then do the following (you should still be at the path /rmap/graphdb/graphdb-free-6.6.4):
If Tomcat is running, within a few seconds a new graphdb folder will have automatically deployed:
Confirm that GraphDB is running by accessing the GUI through a web browser:
The next steps require that you can access GraphDB through a browser.
Create and configure first GraphDB repository
By default there is no login for the admin tool, and data will be stored in your /home/rmap directory. To configure this take the following steps:
To start, load the newly installed GraphDB administrative tool: http://[servername]:8080/graphdb
Go to the Admin menu and select “Locations and Repositories”
Click the “Attach Location” button, bottom-right.
Enter the file path “/rmap/graphdb/data” and click Add. This will be added as an “Inactive location”.
Use the slider to the right of the new path to activate the location. You will see the path at the top of the page be replaced with this new path.
You will now see the original /home/rmap/ path in the inactive location. This can be removed by clicking the “x” to the right of it.
Click the “Create repository” button. The settings here may vary depending on your server specification and database size. For now, change the following settings; the rest can stay at their default values. You can tweak these later as needed. Note that the GraphDB developers offered the following rule of thumb:
On an average dataset the "Total cache memory" value should be between 10-15% of the GraphDB's storage folder.
This configuration assumes you have at least around 4GB of RAM available to the JVM.
Entity index size
10000000 (can’t change this)
Total cache memory
Tuple index memory
Use predicate indices
Predicate index memory
Use context index
Throw exception on query time-out
Click the Create button. The new repository will now be listed on the Locations and Repositories page.
The GraphDB site includes clear instructions for configuring the server and the team are very helpful if you have any questions. http://graphdb.ontotext.com/documentation/6.6/free/toc.html
Create and configure GraphDB user
By default the GraphDB GUI allows public access to all databases. To configure the database accounts, take the following steps:
Go to the Admin menu and click “Users and Access”. You will see a single user account for “admin”. The default password is “root”. Change this by clicking the “Edit User” icon to the right of the account name. Enter a new password and press save.
Back in the Users and Access screen, click the “Create new user” button.
Enter user name “rmap” and a password, then assign read and write access to the new rmap database. The screen should look like this:
Click the “Create” button. The new account will appear in the Users and Access window.
Enable security by clicking the button in the top left of the Users and Access window:
When you click this, GraphDB will log you out and you will need to log back in. This gives you the opportunity to test your new account login.
Sesame Native Java Triplestore
Note: This is an alternative to GraphDB. If you have installed GraphDB, this section can be skipped.
Web-based admin tool: /rmap/apache/tomcat8/webapps/openrdf-workbench
Web-based API path: /rmap/apache/tomcat8/webapps/openrdf-sesame
GraphDB has proven to be much more stable in our tests and is easier to administer, but if you prefer to avoid the registration process, we experienced few problems using this option up to around 500,000 DiSCOs. Note that OpenRDF has recently morphed into this: http://rdf4j.org/, but there is not yet a full release available. In the meantime the following version has been successfully used for RMap:
Configure database storage folder
Before installing Sesame, it is best to set the base directory for the data. The “basedir” property in JAVA_OPTS overrides the default storage path for the triplestore.
To set this, first ensure you are logged in as the rmap user, then create the directory for the data:
Next edit the .bashrc file, as described in the Java 8 configuration section (vi ~/.bashrc). Modify the line that sets the JAVA_OPTS variable to now include the base directory property for Sesame as follows:
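As an illustration, the modified JAVA_OPTS line might look like the sketch below. In OpenRDF Sesame releases the data directory is set with the info.aduna.platform.appdata.basedir system property; the heap sizes and the /rmap/sesame/data folder are assumptions for this example, so substitute the directory created in the previous step.

```shell
# Illustrative .bashrc line (placeholder heap sizes and data folder).
export JAVA_OPTS="-Xms1g -Xmx4g -Dinfo.aduna.platform.appdata.basedir=/rmap/sesame/data"
```

Setting the whole line at once, instead of appending to the existing value, avoids the property accumulating each time .bashrc is sourced.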
Configure database storage folder
To complete the installation, continue to perform these steps as the rmap user:
Make sure Tomcat is running
This should output two processes (described in the Test Tomcat section). If not, start up Tomcat:
Then do the following:
If Tomcat is running, within a few seconds a new openrdf-sesame and openrdf-workbench folder will have automatically deployed in webapps:
You can verify this is running by accessing http://[yourserveraddress]:8080/openrdf-sesame. You should see a “Welcome” screen.
Configure Sesame users
The Sesame database utilizes Tomcat users to restrict access. Here is the current user config, which can be adapted per requirements. More information can be found here (note: you have to view source to see the embedded code!):
Add the following users to the <tomcat-users> tag in /rmap/apache/tomcat8/conf/tomcat-users.xml file
Next add the following sections to /rmap/apache/tomcat8/webapps/openrdf-sesame/WEB-INF/web.xml within the <web-app> tags.
(full file not shown, just parts for user config)
Create Sesame triplestore
Since the settings have been changed, you will need to close and re-open your terminal. Then do the following:
Once Tomcat has started back up, visit http://[yourservername]:8080/openrdf-workbench. The first time you access this, it may take a few minutes to load the page. To create your repository, take the following steps:
Log in as the administrator using the login box:
The server path should be the path you loaded earlier to see the welcome page - http://[yourServerName]:8080/openrdf-sesame and the user name and password will be the ones you set in the configuration file earlier. Once again there will be a lag the first time you log in.
Next click the link on the left for “New repository”.
Fill in the form and click Next:
The next page will give you the option to add indexes. Enter the following:
then click “Create”. You now have a new triplestore.
MySQL Server is currently being used for the user database. There are numerous methods for installing MySQL, and it may be best to use the MySQL website to find the one best suited to your environment. In institutions that use other MySQL databases, it may be easiest to create an RMap database on the existing database server and just configure the MySQL users and connection properties in the RMap web service. With that said, here is one method for installing MySQL on the same server as everything else.
If the yum install does not run, you may need to add the MySQL yum repository to your server. This link has instructions: http://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/. Once the install has been run, do the following to start the service:
For MySQL 5.7, a temporary password will be generated and placed in the MySQL log file. You will need this to log in. Run the following command to retrieve the temporary password:
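One way to pull the temporary password out of the log is sketched below. The default path /var/log/mysqld.log is the usual location for yum-based MySQL installs but may differ on your server, and mysql_temp_password is a hypothetical helper name.

```shell
# Prints the most recent temporary root password found in the MySQL log.
mysql_temp_password() {
  # The password is the last field of the "temporary password" log line.
  grep 'temporary password' "${1:-/var/log/mysqld.log}" | tail -n 1 | awk '{print $NF}'
}
```

Reading the log usually requires elevated rights, so in practice the equivalent one-off command would be "sudo grep 'temporary password' /var/log/mysqld.log".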
Next, secure the installation:
Log in and create the initial database and users. Note that you may wish to add access rights for another IP that will be used to administer the database remotely; if so, the second GRANT command can do this:
Finally, you will need to create the database tables. First make sure you are using the new database:
Next copy and paste the table creation scripts, which can be found here:
Verify that the tables were created successfully:
RMap web services
Each RMap web application is available as a Java web archive (.war) file. There are two main applications for RMap:
RMap account manager and visual navigator
Though the applications are installed as separate services, both are required in order to use RMap. The web-based account manager tool allows the user to log in using Google, Twitter, or ORCID then manage API keys that can be used to access the RMap API.
Before either service will work, both services will need access to a web service that can mint unique IDs. Instructions to configure a NOID minter for these services are provided here.
RMap includes a generic ID minting function that can be configured to work with an HTTP web service. The service supports basic authentication, some simple validation and customization of a prefix. If you do not have access to such a service, for convenience, below are instructions for setting up a local NOID minter that can be used to mint RMap IDs. It can also be configured to mint ARK IDs if your institution has an ARK NAAN assigned. By default, this service will mint IDs in the format of e.g. rmap:akd2kjw6f. The service does not include authentication features, so you will need to limit access to the service, either by configuring Tomcat to include a remote access filter or, if using Apache Server over Tomcat, by configuring Apache Server to restrict access.
Start by setting up a directory for the NOID installation:
Next, install Perl / CPAN and the various Perl module prerequisites. Note: if the Fcntl install fails, it may already be installed and this step can be skipped.
Install Berkeley DB, which will keep track of the IDs already used:
Install the NOID minter:
Edit the noid file (noidminter/noid) and remove the "T" in -Tw from the first line.
Copy the NOID perl library to the perl5 folder
Set up the noid minter database
Test the minter is working
Install the web service for minting noids by doing the following:
Edit the noid.sh and make sure the paths are pointing to the correct directories. The first path should point to the minter script, the second to the noid database folder. Also, replace the code after the "mint" command with the argument $1 so that your file is similar to the one below.
Add environment variables to define the path for the noid script. The web service will use this to mint IDs.
Add the following lines to the file (note the Perl paths may vary depending on your operating system).
Log out of shell and back in to set the environment variables, then test the service:
If you configure the NOID minter in this way, the default settings in the rmapcore.properties should work for API and Web application. An example is shown in Appendix A.
Note that during performance testing, the NOID minter was found to be a bottleneck since repeated round trips to the minter were required for every ID. The minter described here supports a request for multiple IDs at once. These are held in memory until used. For high volume use it may be worth requesting batches of e.g. 100 IDs at once to improve performance.
The RMap API supports management of DiSCOs and navigation of RDF data through various API paths. The various API paths and available parameters are fully documented on the RMap wiki. Read only access is available without an API key. To manage DiSCOs, the user must create an account through the RMap account manager application.
The RMap API can be deployed by dropping the .war file into the webapps folder of a running tomcat (/rmap/apache/tomcat8/webapps). Here are the step by step instructions for installation to be completed as the rmap user:
Before accessing the API, you will need to modify the configuration files. To do this:
Follow the instructions in the properties file to define your application properties. Do the same for the files rmapauth.properties and rmapcore.properties. Appendix A has examples of how your properties files might look after being configured. Once you have configured all 3 properties files, back up the settings for future reference and restart Tomcat.
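Backing up the configured properties files could be done with a small helper like the one sketched here. The backup_properties name and the timestamped folder layout are assumptions for illustration; the source folder is whichever classes folder holds the webapp's properties files.

```shell
# Copies every .properties file from a source folder into a new timestamped
# folder under the backup root, then prints the folder that was created.
backup_properties() (
  src="$1"                              # e.g. a webapp's WEB-INF/classes folder
  dest="$2/$(date +%Y%m%d-%H%M%S)"      # timestamped backup folder
  mkdir -p "$dest" && cp "$src"/*.properties "$dest"/ && echo "$dest"
)
```

Keeping each backup in its own dated folder makes it easy to compare settings after a later redeploy overwrites the webapp directory.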
It may take a few minutes for the application to start up for the first time. Once the service has started you should be able to see a message through the following URL:
RMap account manager and visual navigator
Installation of this service is similar to the API. In this instance we will deploy the webapp as the /app/ directory, but you can deploy it under any name provided you update the properties as needed.
As with the API web service, before accessing the website, you will need to modify the configuration files. To do this:
Follow the instructions in the properties file to define your application properties. Note that you will need to visit each of 3 OAuth providers (Twitter, ORCID, and Google) and sign up for the OAuth Key/Secret.
Next, configure the properties in rmapauth.properties and rmapcore.properties. As a shortcut you can just copy these from /rmap/rmap as follows:
Appendix A has examples of how your properties files might look after being configured.
In addition to the 3 main properties files, there are 3 others in the classes folder. Each of these additional properties files is used to generate the RMap visualizations and contains some default values. These may be appropriate for initial use, but you may choose to update them over time:
ontologyprefixes.properties - Allows you to define display prefixes for specific ontology paths e.g. http://purl.org/dc/elements/1.1/ can be replaced with “dc:”. Any unidentified ontology paths will display as “x:”
nodetypes.properties - Allows you to configure the node types that are distinguishable in the visualization and what these will look like (color and shape).
typemappings.properties - Allows you to configure which rdf:type URIs map to the various node types defined.
Once you have configured all properties files, back them up for future reference. Finally, restart Tomcat so that the new settings are applied:
You should now be able to see the webapp at http://[yourservername]:8080/app.
You now have a basic working copy of the RMap system.
Note about Apache Server
Some institutions prefer to put an Apache Server instance over Tomcat for a variety of reasons. Typically the Apache Server runs on port 80 and forwards requests for Tomcat web services to port 8080. Port 8080 can then be closed for public access. If this is the case in your environment, there are certain properties that need to be set in Apache Server’s httpd.conf file. Below is an example of the virtual host configuration for an RMap instance with SSL applied. Note the ProxyPreserveHost and AllowEncodedSlashes settings - these are vital for RMap to function correctly:
If all of the instructions were followed successfully, you should have a working copy of RMap. Further configuration in a production environment may vary widely according to institutional standards. Work with a server configuration specialist to determine the best approaches to securing the server using SSL and constraining port access through a firewall.
Appendix A - Properties files examples
The following are examples of how your properties files might look if you followed the instructions in this document. Note that comments have been removed for brevity.