Instantiating an OpenStreetMap Database

Table of Contents

Introduction

In this document we focus on the configuration and instantiation of PostgreSQL / PostGIS database dedicated to OpenStreetMap data, for analyzing purposes (not for mirroring). We use here the Osmosis tool with the PGSnapshot schema. Instantiating an OSM database for mirroring (with regular updates) is another topic: it also requires Osmosis, but another schema: the API DB schema, which apparently requires the Rails Port, as described here.

Creation and configuration of the database

User creation

Connection to the administration database on the PostgreSQL server:

[sylvain@msnordlys]~ $ psql -U <admin-username> -h <server> -d <admin-base>
Password for user bouveres: 
psql (9.1.11, server 9.3.2)
WARNING: psql version 9.1, server version 9.3.
	 Some psql features might not work.
Type "help" for help.

admin-base=#

User creation and password change:

admin-base=# CREATE USER osm;
CREATE ROLE
admin-base=# ALTER USER osm ENCRYPTED PASSWORD '********';
ALTER ROLE

Database creation

Creating the database and setting the right owner:

admin-base=# CREATE DATABASE osm;
CREATE DATABASE
admin-base=# ALTER DATABASE osm OWNER TO osm;
ALTER DATABASE
admin-base=# \q

(logout)

Enabling PostGIS functions

To enable PostGIS functions, we can use the extension system provided by PostgreSQL 9.1+. For that, we must connect to the osm database as a superuser (otherwise PostgreSQL refuses to load the extensions).

[sylvain@msnordlys]~ $ psql -U <admin-username> -h <server> -d osm
Password for user osm: 
psql (9.1.11, server 9.3.2)
WARNING: psql version 9.1, server version 9.3.
	 Some psql features might not work.
Type "help" for help.

osm=# CREATE EXTENSION postgis;
CREATE EXTENSION
osm=# CREATE EXTENSION postgis_topology;
CREATE EXTENSION

Enabling HStore extension

The HStore extension is useful to manage key/value pairs like in the OpenStreetMap database (actually, this extension is required for the PGSnapshot version of the database). Some documentation about this extension can be found here: http://www.postgresql.org/docs/current/static/hstore.html.

osm=# CREATE EXTENSION hstore;
CREATE EXTENSION

Data downloading

We can download some OSM data extracts here: http://download.geofabrik.de/. For the sake of demonstration, we will use the data extract for the Rhône-Alpes region, to be found here: http://download.geofabrik.de/europe/france/rhone-alpes.html (241MB at the time of downloading – 2013-12-29T20:55:02Z).

Database instantiation

Now it is time to populate the database with OpenStreetMap data. We use the Osmosis tool to read the data file and populate the PostgreSQL / PostGIS database. Here we will use the PGSnapshot schema which is suitable for data analysis (not for mirroring), and loss less.

Scheme creation

Before loading the data, it is necessary to create the database scheme. For that, we use the SQL script files provided with the Osmosis distribution. Under Ubuntu GNU / Linux 13.10, these files are in /usr/share/osmosis/osmosis/osmosis/package/script/.

Here is the command for the bare PGSnapshot schema:

[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6.sql          14:41
Password for user osm: 
psql:/usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6.sql:4: NOTICE:  la table « actions » n'existe pas, poursuite du traitement
DROP TABLE
(...)
CREATE FUNCTION
CREATE FUNCTION

It is possible to load the following optional files:

  • pgsnapshot_schema_0.6_action.sql - Adds the optional "action" table which allows derivative tables to be kept up to date when diffs are applied.
  • pgsnapshot_schema_0.6_bbox.sql - Adds the optional bbox column to the way table.
  • pgsnapshot_schema_0.6_linestring.sql - Adds the optional linestring column to the way table.
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_action.sql   14:42
Password for user osm: 
CREATE TABLE
ALTER TABLE
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_bbox.sql     14:50
Password for user osm: 
		addgeometrycolumn                 
--------------------------------------------------
 public.ways.bbox SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

CREATE INDEX
CLUSTER
CREATE FUNCTION
CREATE AGGREGATE
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_linestring.sql 
Password for user osm: 
		   addgeometrycolumn                    
--------------------------------------------------------
 public.ways.linestring SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

CREATE INDEX
CLUSTER

Database population

Note: This process is quite demanding in terms of time and space required. By default, Osmosis (actually, the Java Virtual Machine) puts the necessary temporary files into /tmp, but if can be necessary to tell the JVM to put them elsewhere, where there is more space left:

[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ export JAVACMD_OPTIONS="-Djava.io.tmpdir=/home/sylvain/tmp"
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ osmosis --read-pbf file=./rhone-alpes-latest.osm.pbf --write-pgsql host="<server>" database="osm" user="osm" password="**********"
janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.40.1
janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
log4j:WARN No appenders could be found for logger (org.springframework.jdbc.datasource.DataSourceUtils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
janv. 06, 2014 9:10:14 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
janv. 06, 2014 9:10:14 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 3186148 milliseconds.

Testing

List of tables:

osm=> \d+
			   List of relations
 Schema |       Name        | Type  |  Owner   |  Size   | Description 
--------+-------------------+-------+----------+---------+-------------
 public | actions           | table | osm      | 0 bytes | 
 public | geography_columns | view  | bouveres | 0 bytes | 
 public | geometry_columns  | view  | bouveres | 0 bytes | 
 public | nodes             | table | osm      | 2812 MB | 
 public | raster_columns    | view  | bouveres | 0 bytes | 
 public | raster_overviews  | view  | bouveres | 0 bytes | 
 public | relation_members  | table | osm      | 17 MB   | 
 public | relations         | table | osm      | 5728 kB | 
 public | schema_info       | table | osm      | 40 kB   | 
 public | spatial_ref_sys   | table | bouveres | 3224 kB | 
 public | users             | table | osm      | 248 kB  | 
 public | way_nodes         | table | osm      | 1862 MB | 
 public | ways              | table | osm      | 2266 MB | 
(13 rows)

Detail of table "ways":

osm=> \d+ ways;
			       Table "public.ways"
    Column    |            Type             | Modifiers | Storage  | Description 
--------------+-----------------------------+-----------+----------+-------------
 id           | bigint                      | not null  | plain    | 
 version      | integer                     | not null  | plain    | 
 user_id      | integer                     | not null  | plain    | 
 tstamp       | timestamp without time zone | not null  | plain    | 
 changeset_id | bigint                      | not null  | plain    | 
 tags         | hstore                      |           | extended | 
 nodes        | bigint[]                    |           | extended | 
 bbox         | geometry(Geometry,4326)     |           | main     | 
 linestring   | geometry(Geometry,4326)     |           | main     | 
Indexes:
    "pk_ways" PRIMARY KEY, btree (id)
    "idx_ways_bbox" gist (bbox)
    "idx_ways_linestring" gist (linestring)
Has OIDs: no

Looking for relation "Saint-Martin-d'Hères" and its members (this is an example of querying via the hstore extension):

osm=> SELECT relation_id, member_id, member_type, member_role, sequence_id FROM relations JOIN relation_members ON id = relation_id WHERE tags->'name' LIKE '%Hères%' AND tags @> '"type"=>"boundary", "boundary"=>"administrative"' ORDER BY sequence_id ASC;
 relation_id | member_id  | member_type | member_role  | sequence_id 
-------------+------------+-------------+--------------+-------------
       80349 |   32403074 | W           | outer        |           0
       80349 |   31340292 | W           | outer        |           1
       80349 |  200605145 | W           | outer        |           2
       80349 |   28390954 | W           | outer        |           3
       80349 |  119237572 | W           | outer        |           4
       80349 |   31341467 | W           | outer        |           5
       80349 |  188747852 | W           | outer        |           6
       80349 |   31357779 | W           | outer        |           7
       80349 |  188747855 | W           | outer        |           8
       80349 |   32000425 | W           | outer        |           9
       80349 | 1704314483 | N           | admin_centre |          10
(11 rows)

Date: <2013-11-28 jeu.>

Author: Sylvain B.

Created: 2014-01-06 lun. 10:33

Emacs 23.4.1 (Org mode 8.0.6)

Validate XHTML 1.0