Instantiating an OpenStreetMap Database
Table of Contents
Introduction
In this document we focus on the configuration and instantiation of PostgreSQL / PostGIS database dedicated to OpenStreetMap data, for analyzing purposes (not for mirroring). We use here the Osmosis tool with the PGSnapshot schema. Instantiating an OSM database for mirroring (with regular updates) is another topic: it also requires Osmosis, but another schema: the API DB schema, which apparently requires the Rails Port, as described here.
Creation and configuration of the database
User creation
Connection to the administration database on the PostgreSQL server:
[sylvain@msnordlys]~ $ psql -U <admin-username> -h <server> -d <admin-base> Password for user bouveres: psql (9.1.11, server 9.3.2) WARNING: psql version 9.1, server version 9.3. Some psql features might not work. Type "help" for help. admin-base=#
User creation and password change:
admin-base=# CREATE USER osm; CREATE ROLE admin-base=# ALTER USER osm ENCRYPTED PASSWORD '********'; ALTER ROLE
Database creation
Creating the database and setting the right owner:
admin-base=# CREATE DATABASE osm; CREATE DATABASE admin-base=# ALTER DATABASE osm OWNER TO osm; ALTER DATABASE admin-base=# \q
(logout)
Enabling PostGIS functions
To enable PostGIS functions, we can use the extension system provided by PostgreSQL 9.1+. For that, we must connect to the osm database as a superuser (otherwise PostgreSQL refuses to load the extensions).
[sylvain@msnordlys]~ $ psql -U <admin-username> -h <server> -d osm Password for user osm: psql (9.1.11, server 9.3.2) WARNING: psql version 9.1, server version 9.3. Some psql features might not work. Type "help" for help. osm=# CREATE EXTENSION postgis; CREATE EXTENSION osm=# CREATE EXTENSION postgis_topology; CREATE EXTENSION
Enabling HStore extension
The HStore extension is useful to manage key/value pairs like in the OpenStreetMap database (actually, this extension is required for the PGSnapshot version of the database). Some documentation about this extension can be found here: http://www.postgresql.org/docs/current/static/hstore.html.
osm=# CREATE EXTENSION hstore; CREATE EXTENSION
Data downloading
We can download some OSM data extracts here: http://download.geofabrik.de/. For the sake of demonstration, we will use the data extract for the Rhône-Alpes region, to be found here: http://download.geofabrik.de/europe/france/rhone-alpes.html (241MB at the time of downloading – 2013-12-29T20:55:02Z).
Database instantiation
Now it is time to populate the database with OpenStreetMap data. We use the Osmosis tool to read the data file and populate the PostgreSQL / PostGIS database. Here we will use the PGSnapshot schema which is suitable for data analysis (not for mirroring), and loss less.
Scheme creation
Before loading the data, it is necessary to create the database scheme. For that, we use the SQL script files provided with the Osmosis distribution. Under Ubuntu GNU / Linux 13.10, these files are in /usr/share/osmosis/osmosis/osmosis/package/script/.
Here is the command for the bare PGSnapshot schema:
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6.sql 14:41 Password for user osm: psql:/usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6.sql:4: NOTICE: la table « actions » n'existe pas, poursuite du traitement DROP TABLE (...) CREATE FUNCTION CREATE FUNCTION
It is possible to load the following optional files:
- pgsnapshot_schema_0.6_action.sql- Adds the optional "action" table which allows derivative tables to be kept up to date when diffs are applied.
- pgsnapshot_schema_0.6_bbox.sql- Adds the optional bbox column to the way table.
- pgsnapshot_schema_0.6_linestring.sql- Adds the optional linestring column to the way table.
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_action.sql 14:42 Password for user osm: CREATE TABLE ALTER TABLE [sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_bbox.sql 14:50 Password for user osm: addgeometrycolumn -------------------------------------------------- public.ways.bbox SRID:4326 TYPE:GEOMETRY DIMS:2 (1 row) CREATE INDEX CLUSTER CREATE FUNCTION CREATE AGGREGATE [sylvain@msnordlys]~/Enseignement/SIG/TP/data $ psql -U osm -d osm -h ensibd.imag.fr -f /usr/share/osmosis/osmosis/osmosis/package/script/pgsnapshot_schema_0.6_linestring.sql Password for user osm: addgeometrycolumn -------------------------------------------------------- public.ways.linestring SRID:4326 TYPE:GEOMETRY DIMS:2 (1 row) CREATE INDEX CLUSTER
Database population
Note: This process is quite demanding in terms of time and space required. By default, Osmosis (actually, the Java Virtual Machine) puts the necessary temporary files into /tmp, but if can be necessary to tell the JVM to put them elsewhere, where there is more space left:
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ export JAVACMD_OPTIONS="-Djava.io.tmpdir=/home/sylvain/tmp"
[sylvain@msnordlys]~/Enseignement/SIG/TP/data $ osmosis --read-pbf file=./rhone-alpes-latest.osm.pbf --write-pgsql host="<server>" database="osm" user="osm" password="**********" janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Osmosis Version 0.40.1 janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Preparing pipeline. janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Launching pipeline execution. janv. 06, 2014 8:17:08 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Pipeline executing, waiting for completion. log4j:WARN No appenders could be found for logger (org.springframework.jdbc.datasource.DataSourceUtils). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. janv. 06, 2014 9:10:14 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Pipeline complete. janv. 06, 2014 9:10:14 AM org.openstreetmap.osmosis.core.Osmosis run INFO: Total execution time: 3186148 milliseconds.
Testing
List of tables:
osm=> \d+ List of relations Schema | Name | Type | Owner | Size | Description --------+-------------------+-------+----------+---------+------------- public | actions | table | osm | 0 bytes | public | geography_columns | view | bouveres | 0 bytes | public | geometry_columns | view | bouveres | 0 bytes | public | nodes | table | osm | 2812 MB | public | raster_columns | view | bouveres | 0 bytes | public | raster_overviews | view | bouveres | 0 bytes | public | relation_members | table | osm | 17 MB | public | relations | table | osm | 5728 kB | public | schema_info | table | osm | 40 kB | public | spatial_ref_sys | table | bouveres | 3224 kB | public | users | table | osm | 248 kB | public | way_nodes | table | osm | 1862 MB | public | ways | table | osm | 2266 MB | (13 rows)
Detail of table "ways":
osm=> \d+ ways;
			       Table "public.ways"
    Column    |            Type             | Modifiers | Storage  | Description 
--------------+-----------------------------+-----------+----------+-------------
 id           | bigint                      | not null  | plain    | 
 version      | integer                     | not null  | plain    | 
 user_id      | integer                     | not null  | plain    | 
 tstamp       | timestamp without time zone | not null  | plain    | 
 changeset_id | bigint                      | not null  | plain    | 
 tags         | hstore                      |           | extended | 
 nodes        | bigint[]                    |           | extended | 
 bbox         | geometry(Geometry,4326)     |           | main     | 
 linestring   | geometry(Geometry,4326)     |           | main     | 
Indexes:
    "pk_ways" PRIMARY KEY, btree (id)
    "idx_ways_bbox" gist (bbox)
    "idx_ways_linestring" gist (linestring)
Has OIDs: no
Looking for relation "Saint-Martin-d'Hères" and its members (this is an example of querying via the hstore extension):
osm=> SELECT relation_id, member_id, member_type, member_role, sequence_id FROM relations JOIN relation_members ON id = relation_id WHERE tags->'name' LIKE '%Hères%' AND tags @> '"type"=>"boundary", "boundary"=>"administrative"' ORDER BY sequence_id ASC;
 relation_id | member_id  | member_type | member_role  | sequence_id 
-------------+------------+-------------+--------------+-------------
       80349 |   32403074 | W           | outer        |           0
       80349 |   31340292 | W           | outer        |           1
       80349 |  200605145 | W           | outer        |           2
       80349 |   28390954 | W           | outer        |           3
       80349 |  119237572 | W           | outer        |           4
       80349 |   31341467 | W           | outer        |           5
       80349 |  188747852 | W           | outer        |           6
       80349 |   31357779 | W           | outer        |           7
       80349 |  188747855 | W           | outer        |           8
       80349 |   32000425 | W           | outer        |           9
       80349 | 1704314483 | N           | admin_centre |          10
(11 rows)