Finding Ley-Lines with PostGIS

I’ve recently been playing with PostGIS. This post summarises a simple attempt to manipulate data and draw it on top of OpenStreetMap. I wanted to produce a Brighton version of Steven Kay’s Pub Ley Lines. As you will see, the outcome was less interesting than the process.

What follows is, basically, a How-to-Draw-An-Owl tutorial. It summarises the steps I usually have to look up and is intended to be shared with a few specific people. If you find yourself here via Google and want more information, leave a comment and I will add more detail.

1 – Download data from OSM

The first thing I needed was the data from OpenStreetMap, which contains pub locations among the points-of-interest. There are a number of ways to get it. Downloading direct from OSM failed when I last tried it, but I had an older version of the data available.

2 – Set up PostGIS and osm2pgsql

I’d previously installed PostgreSQL and PostGIS on my laptop but somehow the installation has broken and won’t be easily repaired or uninstalled. I should fix this, but I wanted to get on with this experiment. I’ve been meaning to set up AWS for some time, and using a micro instance on Amazon allowed me to get a version of PostGIS running very quickly.

I created a new EC2 micro instance based on the basic Ubuntu image, ran ‘sudo apt-get update’ followed by ‘sudo apt-get install osm2pgsql’ and I had everything I needed.

AWS is awesome, and I love being able to run up an instance for a small task and throw it away once I’m done.

3 – Create PostGIS user and table

Setting up a new user on Postgres is certainly less of a hassle than doing it on MySQL, but there are a few gotchas, such as needing to add a line to the /etc/postgresql/9.3/main/pg_hba.conf file. Also, when creating a new database, remember to enable PostGIS with the command “CREATE EXTENSION POSTGIS”.

4 – Load the OSM data in Postgres

Having transferred my PBF file to the AWS instance using scp, I could then load the data. The command here is a little different to the one I used previously because of this server’s limitations:

osm2pgsql -U ley -d ley  --slim --cache-strategy sparse --number-processes 4 brighton.pbf

5 – Create a table containing all of the ley lines

I pretty much followed the recipe given by Steven Kay here.
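The recipe itself is behind the link. Purely to illustrate the general idea (this is not Kay’s SQL, and it is not what I ran), here is a small Python sketch: take every pair of points and count how many points lie within a tolerance of the straight line through the pair. The function names, the tolerance and the sample coordinates are all made up, and it assumes planar coordinates in metres rather than long/lat.

```python
from itertools import combinations
from math import hypot


def distance_to_line(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = hypot(bx - ax, by - ay)
    if length == 0:
        # a and b coincide, so fall back to plain point-to-point distance
        return hypot(px - ax, py - ay)
    # |cross product| divided by the base length gives the triangle's height
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length


def find_leys(points, tolerance, min_count):
    """Return (a, b, count) for each pair whose line passes near at least min_count points."""
    leys = []
    for a, b in combinations(points, 2):
        count = sum(1 for p in points if distance_to_line(p, a, b) <= tolerance)
        if count >= min_count:
            leys.append((a, b, count))
    return leys


# Hypothetical pub locations: four roughly on the line y = x, plus one outlier
pubs = [(0, 0), (100, 101), (200, 199), (300, 300), (50, 250)]
leys = find_leys(pubs, tolerance=5, min_count=4)
```

Note that this naive version reports the same alignment once for every pair of points on it, so a real version would want to de-duplicate. It also checks every pair against every point, which is fine for one town’s pubs but would not scale far.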

6 – Extract the data

This time, rather than a tab-separated format, I selected the WKT (well-known text) format for all the ley-lines with more than 8 pubs. I’m sure there are better ways to extract this. The query used was:

psql -U ley -w ley -c "COPY (select st_astext(st_transform(geom,4674)) from leys where ct> 8) TO STDOUT WITH CSV" > lines.txt

This may not be the most portable format for the data but I can bully it into something OpenLayers can use.
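As an example of the sort of bullying I mean, here is a rough Python sketch that reads the one-column CSV of LINESTRING WKT and turns it into GeoJSON. It is not the code used for the page. The function names and the sample rows are invented, and it only handles simple two-dimensional LINESTRING values.

```python
import csv
import io
import json
import re


def parse_linestring(wkt):
    """Turn 'LINESTRING(x y,x y,...)' into a list of [x, y] float pairs."""
    match = re.fullmatch(r"\s*LINESTRING\s*\((.*)\)\s*", wkt)
    if match is None:
        raise ValueError(f"not a LINESTRING: {wkt!r}")
    return [[float(n) for n in pair.split()] for pair in match.group(1).split(",")]


def csv_to_geojson(csv_text):
    """Convert one-column CSV of LINESTRING WKT into a GeoJSON FeatureCollection dict."""
    features = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        features.append({
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "LineString", "coordinates": parse_linestring(row[0])},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up rows in the quoted shape that CSV output uses for values containing commas
sample = '"LINESTRING(-0.14 50.82,-0.13 50.83)"\n"LINESTRING(-0.2 50.8,-0.1 50.9)"\n'
collection = csv_to_geojson(sample)
geojson_text = json.dumps(collection)
```

The csv module takes care of the quoting, so the only hand-rolled parsing is the regular expression that strips the LINESTRING wrapper.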

7 – Create a page to display the data

Since the data is constant and the page will be low-traffic, I am using static files to display the results on top of OSM. The linestring WKT representations of the leys have been copied into the JavaScript file rather than being loaded from a file. All quick-and-dirty, but it works. The source is on GitHub and the results are online.

An Enterprise Java Hello World

The ‘hello world’ program is of great importance to developers. It’s usually the first thing written when using a new language or framework, pretty much the simplest thing you can do: output 12 characters (assuming a newline). Writing the hello world program in C, the first time I’d written a compiled program, was an incredible moment for me: I could make the computer do something. This simple idea has an entry on wikipedia and a list of examples.

Hello World is supposed to be simple and there are a few jokes about Java EE hello world programs, mocking the framework for being long-winded and unwieldy – see, for example, item 8 in The top Java viral jokes of 2014. In its defence, the contents page shown in the article covers a lot more than code, and is intended to get someone up and running from a bare-bones structure. But Java definitely doesn’t have the concision of Python’s print “Hello, world!”

One of the proud boasts for Spring Boot was how simple it was, with an example application that fitted into a single tweet (the link includes instructions for running it):

spring-boot

As easy as it is to get a Spring Boot application working, that is only one part of the development life-cycle. By focussing on the output, it’s easy to miss a lot of the non-functional aspects of an application. You can deploy a new piece of software to a server without making it easy for developers to work with. This is particularly dangerous with non-technical stakeholders, who only see these non-functional requirements indirectly, which makes it hard to prioritise infrastructure against features and bug-fixes. These requirements are also incredibly difficult to fit into an application after it goes live.

I’ve recently set up a new application. The initial project was produced using Spring Initializr but, rather than start cutting code, I’ve been thinking about what else I need for a basic application. The essentials include:

  • Source control – and, preferably, some sort of branching and versioning strategy
  • Continuous integration and related process to make sure that new commits don’t break tests
  • A deployment process allowing the same binary to be deployed to immutable servers – preferably using some sort of container or virtualisation, and with the binary produced once and stored in a centralised location.
  • A means of externalising configuration for different environments
  • Some sort of monitoring and log management

It’s easy to drop a Spring Boot jar file onto a server and run it, but that’s not going to work in the long term, and the easiest time to put these infrastructural items in place is at a project’s start. The more complicated things become, the harder it is to add them in.

In short: an application isn’t just the software you are writing: it’s also the infrastructure that you put around it. In order to release and maintain an application you need to do a certain amount of work beforehand. Your hello-world application isn’t ready for production until these things are done.

Brighton Java – Continuous Deployment

[Photo: I had a break this month as Mr. Stanier hosted the meeting]

Last Wednesday we had Brighton Java’s March event. It was another good turn-out, with about 35 people turning up to hear Jose Baena talk about his experience of continuous delivery.

Hearing about other people’s experiences with introducing a technology is incredibly valuable. The talk was followed by a discussion, chaired by James Stanier. We’ve not often used this format but it drew out some interesting discussion points.

Jose’s presentation was great (especially the hypnotic footage of an apple-slicing machine), with some useful suggestions on how to get Maven, Nexus, Ansible and Jenkins working together – with Jenkins acting as the driving force. There was also a detailed explanation of the importance of versioning.

The discussion underlined something I’ve been thinking about for a while – that things like continuous delivery need to be put in place early on, because this sort of infrastructure is hard to retrofit. But that’s a story for another post.

Dan Chalmers has also posted a response to the meet-up: Continuous Deployment and Developers on-call. Dan does a good job of explaining the issues around making developers responsible for their code. I still think this is important but making it work in practice is a subtle, difficult problem.

Getting Open Streetmap working on Ubuntu

It’s taken me quite a bit of time but I’ve finally got an OpenStreetMap tile server working on my laptop. The main issue appears to have been that I was following the manual Ubuntu set-up instructions rather than building a tile server from packages. Which isn’t to say that using packages was plain sailing. This post will outline some of the problems I ran into and how I solved them.

(Bear in mind that I’d already tried to install OSM using the manual method on the laptop and fragments of these installs were still on the system. Installation might be a lot easier on a clean OS. However, it’s worth documenting the problems in case anyone runs into the same ones and is googling for them.)

The main point of this post is to say that I have managed the OpenStreetMap set-up and now have a map of Brighton working on my laptop:

[Screenshot from 2015-03-03 08:11:32: I can see my house from here!]

The installed slippymap.html page is great as the radio buttons on the top right allow you to compare a remote instance against the local one – useful for finding out where you are when things aren’t working, or you only have a small area set up. Now I’ve fixed this, I can get on with actually using the server.

The main problems I had to solve with the packaging method were:

Mod-tile dpkg installation hangs

The command to install mod-tile hung. Googling around, I found a patch for this  – a line was missing from the libapache2-mod-tile.postinst script. Editing the script and re-running the installation fixed this issue.

osm2pgsql didn’t work

When I ran the command osm2pgsql I received an error:

Osm2pgsql failed due to ERROR: SELECT AddGeometryColumn('planet_osm_point', 'way', 900913, 'POINT', 2 ) failed: ERROR:  AddGeometryColumns() - invalid SRID

This seemed to have been caused by the spatial_ref_sys table not having been populated. To check this, see whether the following SQL query returns any rows:

SELECT * FROM spatial_ref_sys;

If not, then this is easily fixed:

psql --username=postgres --dbname=gis --file=/usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql

After running this script, the osm2pgsql command worked. Note that this problem may have been caused by my incorrectly removing previous installation attempts.

Tiles not rendering

This is more of a generic issue. Basically, when the slippymap page is working, but the images are not being generated, a lot of missing images will be displayed:

[Screenshot: pink tiles]

You can open the blank tiles in a new page and see where they are being generated from. Some of the help online suggested that it could be down to the slippymap.html pointing to the wrong OSM server (it defaults to localhost). In my case it was a permissions issue:

Database permissions not set correctly

There seems to be a minor issue with the installation scripts. This was fixed by explicitly setting up the postgis user.

I was able to diagnose this issue by running renderd as a foreground process (i.e. renderd -f) and seeing permissions errors such as the following:

renderd[26466]: An error occurred while loading the map layer 'default': :
FATAL: role "www-data" does not exist (encountered during parsing of layer 'landcover' in map '/etc/mapnik-osm-data/osm.xml')
renderd[26466]: An error occurred while loading the map layer 'default': :
FATAL: role "www-data" does not exist (encountered during parsing of layer 'landcover' in map '/etc/mapnik-osm-data/osm.xml')

Book Review: Release It!

Michael T. Nygard’s Release It! was referred to several times at MuCon and keeps coming up when people discuss microservices. Despite being released in 2007, the book feels up-to-date. There is a lot of useful advice in here and, even when the material was familiar, it was still entertaining. I wish I’d read this years ago – it would have saved a few mistakes.

The book is a good mix of instruction and case studies. Seeing how Nygard coped with real-life outages is both instructive and fascinating. The advice ranges from high-level (an excellent overview of networking) to low-level (when doing load tests, make sure some connections don’t log out).

Some of the sections I found particularly useful were those on third-party SLAs and how to deal with them; QA vs production (one of those issues that keeps returning for me); and the need to consider data purging from the start of a platform. There is also the discussion of the circuit breaker pattern, which seems to have been one of the most influential parts of the book.
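For anyone who hasn’t met the circuit breaker pattern, the rough idea is to stop calling a dependency that keeps failing, and only try it again after a cooling-off period. What follows is my own minimal Python sketch of that idea, not code from the book. The class name, the thresholds and the injectable clock are all illustrative choices.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still open: fail fast without touching the dependency
                raise CircuitOpenError("circuit open; call skipped")
            # Cooling-off period has passed: let one trial call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a remote call is then a matter of something like breaker.call(fetch_rates), where fetch_rates stands in for whatever flaky integration point you are protecting. A production version would also need to think about thread safety and about which exceptions should count as failures.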

As my responsibility has increased, I’ve needed to think more about the terrible things that might happen to a system. A lot of junior developers are perhaps too optimistic, and books like this one are good for giving an idea of how subtly brittle systems can be and the need to develop a certain cynicism. For example, in a section on testing, Nygard writes:

“A good test harness should be devious. It should be as nasty and vicious as real-world systems will be. The test harness should leave scars on the system under test. Its job is to make the system under test cynical”

Parts of Release It! feel like a horror novel for software developers. It will open your eyes to places where your software is vulnerable and how bad things might get. The book is also funny – laugh out loud funny in a few places. The best endorsement I can give is that this is one of those books you want to force everyone around you to read.

Brighton Java – Agile Testing and Spring

We had a packed session at Brighton Java last Wednesday, with Kim Knup from Crunch starting by discussing Agile Testing. The testing community is absolutely incredible and they’ve done so much to define their role within software development, moving it away from the unsophisticated idea of simply catching bugs.

The second talk was Luke Whiting on Micro services, micro effort. The slides and source code are now online.  I’m looking forward to playing with the tools that Brandwatch have been working with in this area.

The next Brighton Java meeting is on March 4th, with speakers to be announced nearer the time. I don’t know who they are yet, as I’m not organising this one. I’m looking forward to being able to relax a little more on the night.

2015 – Brighton Java

Weblogs are generally quietest when there are a lot of things going on in the writer’s life. I have notes on several posts but haven’t finished any yet. I’ve been researching a lot of topics such as RabbitMQ and Docker, thinking about microservices, and have handed in my notice at Crunch. There’s certainly a lot to talk about.

I also need to write up some notes on my talk at Brighton Java in January. For various reasons (some of them out of my control) the talk was less successful than I’d hoped, but I think there were some useful points in there.

The February Brighton Java event is on the 4th, and there are only a few spots left. Kim Knup is talking about Agile Testing and Luke Whiting is returning to talk about developing services in Spring. I also need to find speakers for the March event.

First steps with OpenStreetMap

(3/3/15 EDIT – it turns out that installing OSM is simpler than I thought. I’ve published an update to this post which details the steps taken to a successful installation)

Over the past few weeks I’ve spent a lot of time wrestling with installing an OpenStreetMap server. In theory this should be simple. The tutorial for the version of Ubuntu that I use makes it look easy. Instead this is the most difficult install I’ve worked with since the days when Oracle released their shrinkwrap products with incorrect documentation.

The OSM install involves five different products, some of which need to be compiled from source, and one of which doesn’t work in the documented version. Fortunately, a lot of the problems I faced could be resolved via Google, although using make only increased my love for Maven. I don’t miss old-school build systems.

I’ve had multiple stabs at installing OSM, which seems about par for the course. I’ve not succeeded yet, but I seem to be making progress. I thought I would document what I’ve done here – partly for my own benefit and partly for anyone who comes by on Google and is as lost as I am. I’m determined to succeed at this – it’s a challenge and a puzzle and will probably seem quite simple once I’ve cracked it.

The Stack

The OpenStreetMap stack consists of four elements:

  1. mod_tile – an Apache module that serves and caches map tiles.
  2. renderd – a “priority queueing system” to manage the load from mod_tile’s rendering requests.
  3. mapnik – the software library that does the actual rendering of the map tiles.
  4. postgresql/postgis – the datastore containing the mapping elements.

OSM also needs osm2pgsql to load the OSM data into postgis. So far, I have mapnik working and mod_tile passing requests to renderd, but I cannot link renderd to mapnik. The only thing I’ve served from mod_tile is this world map, which is produced outside the main mapnik process:

[Image: world map served by mod_tile]

Installation

The OSM installation is fiddly, which means Googling for the issues. There are also lots of files where a small mistake can need debugging, for example replacing variables in DTD entities. Things like these would be easier if the installation files were opinionated – this works well in Spring Boot and Maven.

The main issue I’ve had is with versions of software (which is where I think I’m currently stuck). One bug had to be fixed by using a version of mod_tile from github.com:springmeyer/mod_tile.git. However, I am not convinced this is playing nicely with some versions of mapnik.

I had decided to use a limited area of the world, rather than the full-on 29GB download. Working out which tile co-ordinates mod_tile needs requires a mapping from long/lat to the OSM tile co-ordinates (I used the python scripts on that page). These co-ordinates can be checked against the OSM public servers. For example, Brighton is http://a.tile.openstreetmap.org/12/2046/1374.png.
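For reference, the conversion is short. This is a sketch of the standard slippy-map tile formula in Python rather than a copy of the scripts I used. The tile_url helper is a convenience I have added for illustration, and the latitude and longitude for central Brighton are approximate values I have assumed.

```python
import math


def deg2num(lat_deg, lon_deg, zoom):
    """Return the (x, y) slippy-map tile containing the given latitude/longitude."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_url(lat_deg, lon_deg, zoom, host="a.tile.openstreetmap.org"):
    """Build the URL of the tile covering a point (hypothetical helper)."""
    x, y = deg2num(lat_deg, lon_deg, zoom)
    return f"http://{host}/{zoom}/{x}/{y}.png"


# Approximate centre of Brighton (assumed coordinates, not taken from the data)
url = tile_url(50.82, -0.137, 12)
```

With those assumed coordinates the function lands on tile 2046/1374 at zoom 12, which matches the Brighton tile URL quoted above.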

As I said above, I still can’t get renderd to talk to mapnik, although mapnik is working, shown by this attempt to render Brighton:

[Image: mapnik’s attempt to render Brighton]

Apparently, the East Sussex data I had didn’t include Brighton. But a query in East Sussex produced a decent map:

[Image: mapnik rendering of part of East Sussex]

What Next?

Despite my issues, I’m still excited about OSM and have a couple of ideas for using it. And I’m convinced that I can figure out what’s going wrong, although I will need to sit down and actually read the mod_tile documentation rather than trusting to installation guides and online discussions.

I regret not attempting the original installation on a Vagrant VM, as this would have enabled me to start from scratch. I may have to do a little work to remove the different versions of libraries on my laptop before I can continue the installation.

Hopefully, another few hours’ work should see me with a working version of OSM. Once I’ve done that, I will write up a step-by-step guide to the installation.

Speaking at Brighton Java on January 7th

I’m speaking at the next Brighton Java event on January 7th about ‘microservices for monoliths’. I’ve not started writing the talk yet but, as it’s three weeks away, I need to get started soon.

I work on a monolithic piece of software. Moving to a microservice architecture is some way off but, even now, there are incredibly useful insights and techniques from the microservice world. The methods needed to support hundreds of different servers have spin-offs that can help when you have just one. I also hope to look at how treating your software as a monolith is a dangerous abstraction, as well as giving quick demonstrations of Wiremock and Hystrix.

Also speaking is my colleague Danielle Ashley, expanding a talk she gave as a lightning session at the LJC Openconf about learning Scala: “What happens when you start your first exploration of Scala by picking one of the most unlikely, unsuitable applications for a language of that type – real-time number crunching – and still end up with kind-of-usable functional code?”

Brandwatch are providing pizza and beer. We kick off at 7pm at the Skiff. Sign up if you’d like to join us.

Book Review: The Leprechauns of Software Engineering

Last Saturday, while waiting to be picked up by my friends Joh and Simon, I was reading the original paper on Conway’s Law. It is one of the reference points of microservices and I wondered exactly what it said. How had Conway proved his Law? Talking about this in the car, Joh and Simon said I should read Laurent Bossavit‘s Leprechauns of Software Engineering.

In this small book (which is a good thing – too many software books are much larger than they should be) Bossavit investigates the common-sense things that ‘everybody knows’ about software development and finds that many of them are based on hearsay and poor citation. The book is a sort of Bad Science for computer programmers, what Bossavit refers to as “epistemic hygiene”. Some of the myths examined are the idea of the 10x programmer, the idea that bugs cost more the later that they are found, and the problems of the waterfall method.

I love this sort of research. I spent an hour during my MA investigating the often-quoted story that Tristan Tzara caused a riot by reading random poetry. Since my essay was on William Burroughs and detournement, I barely managed to fit the research into a footnote, but the work was fascinating.

Bossavit wants to train software developers to be more sceptical, and outlines his method. Of course, the big flaw with a book like this is that it demands a lot from the reader. Without going back and reading the original sources for myself, I can’t be sure that Bossavit’s claims aren’t hearsay themselves. But, even without doing that work, there is value in the critical doubt that it stirs up.

For me, the most interesting part of the book was the discussion of waterfall, and the suggestion that the attacks on waterfall are based on a straw-man. While I love agile, and feel that it produces more humane projects, a well-run traditional project can be more effective than a poorly run agile one. Indeed, one of the problems with agile is that failed projects are often dismissed as ‘not being properly agile’. As Bossavit writes: “Software engineering is a social process, not a naturally occurring one – it therefore has the property that what we believe about software engineering has causal impacts on what is real about software engineering.”