Kevin McMahon

Chicago based mobile developer.

Importing CSV data with location information into PostGIS

Sometimes location data isn't neatly packaged in a SHP file, and it requires some massaging before it can be used in spatial queries. In this post I'll take a look at importing latitude and longitude data from a CSV file into our PostGIS database and transforming those coordinates into geometry data that can be spatially queried against.

Step 0 - Initial Starting Point

  • Get a CSV dataset. For this example I'll be using the Individual Landmarks dataset exported as CSV. You can get a copy of the CSV file that I am using for the example here. We already have the latitude and longitude and the extra location field isn't something that we'll use so I removed the location column prior to exporting the dataset. You can hide the column by selecting the arrows on the column heading in the table (shown here) or via the manage screen and deselecting the location column from the list of available columns (shown here).

  • A PostgreSQL database with PostGIS installed.

Step 1 - Create the table in the database

Use the following SQL to create a table to hold the landmark data.

Step 2 - Copy the CSV data into the database

To run the copy command, login to the database and execute the following command :

There are a couple of things to note with the copy command:

  1. The first argument passed to the copy command specifies the tablename (landmarks) and the columns which the CSV files map to. Our table includes a column gid that is an auto-incremented id which is used as a primary key as well as the_geom geometry column. The CSV does not include either of these columns in the data we are copying. If we don't specify the specific columns that the fields in the CSV map to, errors will occur when attempting to copy the data.
  2. The DELIMITERS argument specifies your delimiter.
  3. The CSV argument at the end lets PostgreSQL know that the file is CSV.
  4. The HEADER argument lets PostgreSQL know that the file includes the headers on the first line.

Here is a look at our data after the import. Note that the_geom has no data. We'll take care of that in the next step.

A look at the newly imported landmark data sans geometry.

Step 3 - Translate latitude and longitude into POINT geometry

Now that we have the data imported, we need to convert the latitude and longitude columns to a PostGIS POINT geometry. The following SQL will update the_geom column in the landmarks table with a POINT geometry created from the row's latitude and longitude columns. In order to create the POINT, the ST_GeomFromText method is used. This method allows us to create a POINT, which is a PostGIS geometry type, textually with the latitude and longitude values being specified in the construction of the type.

Here is a look at the data with the the_geom column populated with POINT data.

Geometry populated landmarks table

Step 4 - Run some queries

Now that all the POINT geometries have been created for our landmark rows we can write some spatial queries against our data. Here are just a few of the types of queries we can run.

Using the ward geometry we have from a previous example, we can run a query that returns all the landmarks in the 43rd Ward.

This query returns the 5 closest landmarks to a given latitude and longitude :

Tagged: gis / csv / postgis / postgresql / gdal / opendata / gov / opengov / gov20 / chicago

blog comments powered by Disqus