First steps with OpenStreetMap

(3/3/15 EDIT – it turns out that installing OSM is simpler than I thought. I’ve published an update to this post which details the steps taken to a successful installation)

Over the past few weeks I’ve spent a lot of time wrestling with installing an OpenStreetMap server. In theory this should be simple. The tutorial for the version of Ubuntu that I use makes it look easy. Instead, this is the most difficult install I have worked with since the days when Oracle released their shrinkwrap products with incorrect documentation.

The OSM install involves five different products, some of which need to be compiled from source, and one of which doesn’t work in the documented version. Fortunately, a lot of the problems I faced could be resolved via Google, although using make only increases my love for Maven. I don’t miss old-school build systems.

I’ve had multiple stabs at installing OSM, which seems about par for the course. I’ve not succeeded yet, but I seem to be making progress. I thought I would document what I’ve done here – partly for my own benefit and partly for anyone who comes by on Google and is as lost as I am. I’m determined to succeed at this – it’s a challenge and a puzzle and will probably seem quite simple once I’ve cracked it.

The Stack

The OpenStreetMap stack consists of four elements:

  1. mod_tile – an Apache module that serves and caches map tiles.
  2. renderd – a “priority queueing system” to manage the load from mod_tile’s rendering requests.
  3. mapnik – the software library that does the actual rendering of the map tiles.
  4. postgresql/postgis – the datastore containing the mapping elements.

OSM also needs osm2pgsql to load the OSM data into postgis. So far, I have mapnik working and mod_tile passing requests to renderd, but I cannot link renderd to mapnik. The only thing I’ve served from mod_tile is this world map, which is produced outside the main mapnik process:

[Image: world map tile served from mod_tile]

Installation

The OSM installation is fiddly, which means a lot of Googling for issues. There are also lots of files where a small mistake can need debugging, for example when replacing variables in DTD entities. Things like these would be easier if the installation files were opinionated – this works well in Spring Boot and Maven.

The main issue I’ve had is with versions of software (which is where I think I’m currently stuck). One bug had to be fixed by using a version of mod_tile from github.com:springmeyer/mod_tile.git. However, I am not convinced this is playing nicely with some versions of mapnik.

I had decided to use a limited area of the world, rather than the full-on 29GB download. Working out which tile co-ordinates mod_tile needs requires a mapping from longitude/latitude to the OSM tile co-ordinates (I used the python scripts on that page). These co-ordinates can be checked against the OSM public servers. For example, Brighton is http://a.tile.openstreetmap.org/12/2046/1374.png.
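As a rough sketch of that mapping, the function below uses the standard slippy-map formula (Web Mercator tiles, with 2^zoom tiles along each axis). The function names and the latitude/longitude chosen for central Brighton are my own assumptions for illustration, not taken from the scripts mentioned above.

```python
import math


def deg2num(lat_deg, lon_deg, zoom):
    """Convert latitude/longitude in degrees to tile (x, y) at a zoom level."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    xtile = int((lon_deg + 180.0) / 360.0 * n)
    ytile = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return xtile, ytile


def tile_url(lat_deg, lon_deg, zoom, host="a.tile.openstreetmap.org"):
    """Build the zoom/x/y.png URL for the tile containing the given point."""
    x, y = deg2num(lat_deg, lon_deg, zoom)
    return "http://%s/%d/%d/%d.png" % (host, zoom, x, y)


# Approximate co-ordinates for central Brighton (assumed for this example).
print(tile_url(50.8225, -0.1372, 12))
# prints http://a.tile.openstreetmap.org/12/2046/1374.png
```

The same function can be pointed at a local mod_tile server by passing a different host, which makes it easy to compare a local tile with the public one.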

As I said above, I still can’t get renderd to talk to mapnik, although mapnik is working, shown by this attempt to render Brighton:

[Image: mapnik’s attempt to render Brighton]

Apparently, the East Sussex data I had didn’t include Brighton. But a query in East Sussex produced a decent map:

[Image: map rendered by mapnik from the East Sussex data]

What Next?

Despite my issues, I’m still excited about OSM and have a couple of ideas for using it. And I’m convinced that I can figure out what’s going wrong, although I will need to sit down and actually read the mod_tile documentation rather than trusting to installation guides and online discussions.

I regret not attempting the original installation on a Vagrant VM, as this would have enabled me to start from scratch. I may have to do a little work to remove the different versions of libraries on my laptop before I can continue the installation.

Hopefully, another few hours’ work should see me with a working version of OSM. Once I’ve done that, I will write up a step-by-step guide to the installation.

Speaking at Brighton Java on January 7th

I’m speaking at the next Brighton Java event on January 7th about ‘microservices for monoliths’. I’ve not started writing the talk yet but, as it’s three weeks away, I need to get started soon.

I work on a monolithic piece of software. Moving to a microservice architecture is some way off but, even now, there are incredibly useful insights and techniques from the microservice world. The methods needed to support hundreds of different servers have spin-offs that can help when you have just one. I also hope to look at how treating your software as a monolith is a dangerous abstraction, as well as giving quick demonstrations of Wiremock and Hystrix.

Also speaking is my colleague Danielle Ashley, expanding a talk she gave as a lightning session at the LJC Openconf about learning Scala: “What happens when you start your first exploration of Scala by picking one of the most unlikely, unsuitable applications for a language of that type – real-time number crunching – and still end up with kind-of-usable functional code?”

Brandwatch are providing pizza and beer. We kick off at 7pm at the Skiff. Sign up if you’d like to join us.

Book Review: The Leprechauns of Software Engineering

Last Saturday, while waiting to be picked up by my friends Joh and Simon, I was reading the original paper on Conway’s Law. It is one of the reference points of microservices and I wondered exactly what it said. How had Conway proved his Law? Talking about this in the car, Joh and Simon said I should read Laurent Bossavit’s Leprechauns of Software Engineering.

In this small book (which is a good thing – too many software books are much larger than they should be) Bossavit investigates the common-sense things that ‘everybody knows’ about software development and finds that many of them are based on hearsay and poor citation. The book is a sort of Bad Science for computer programmers, what Bossavit refers to as “epistemic hygiene”. Some of the myths examined are the idea of the 10x programmer, the idea that bugs cost more the later that they are found, and the problems of the waterfall method.

I love this sort of research. I spent an hour during my MA investigating the often-quoted story that Tristan Tzara caused a riot by reading random poetry. Since my essay was on William Burroughs and detournement, I barely managed to fit the research into a footnote, but the work was fascinating.

Bossavit wants to train software developers to be more sceptical, and outlines his method. Of course, the big flaw with a book like this is that it demands a lot from the reader. Without going back and reading the original sources for myself, I can’t be sure that Bossavit’s claims aren’t hearsay themselves. But, even without doing that work, there is value in the critical doubt that it stirs up.

For me, the most interesting part of the book was the discussion of waterfall, and the suggestion that the attacks on waterfall are based on a straw man. While I love agile, and feel that it produces more humane projects, a well-run traditional project can be more effective than a poorly run agile one. Indeed, one of the problems with agile is that failed projects are often dismissed as ‘not being properly agile’. As Bossavit writes: “Software engineering is a social process, not a naturally occurring one – it therefore has the property that what we believe about software engineering has causal impacts on what is real about software engineering.”

Brighton Java: 2014 to 2015

Brighton Java has settled into its traditional Christmas break. And it’s great that we’re established enough to have a tradition! When we started in 2012 it was difficult to get going but this year things have taken off. We have 260 people in the group and our last session was full with very little promotion.

A lot of this is due to the sponsorship we’ve had. The Skiff provide us with a great venue, and we’ve had a great deal of support from Brandwatch – the promise of pizza and drinks definitely draws people in. I’ve also had some help from James Stanier and Luke Whiting with organisation and planning.

It’s now time to start planning the 2015 events. Next year I’d like to try to have an event every month. We had hoped to have a Hack Day as part of the Brighton Science Festival. That wasn’t possible, but I hope we can arrange something similar later in the year. I’m also hoping to bring in more students from the universities, as well as some academic speakers. I’d also like to see some smaller, more technical workshop events.

This is an exciting time for Java, and the claims that it was dying or “21st century COBOL” are quietening down, replaced by excitement over new JVM languages, microservices and the possibility of finally getting some long-promised features in Java 9. I’m very excited about the talks and events to come from Brighton Java in 2015.

So, thank you to the Skiff, Brandwatch, everyone who came to the talks and of course to all the speakers, who for 2014 were:

I’ve enjoyed the sessions, and am grateful to all the speakers – I think we’ve provided a varied and up-to-date range of talks. If you’d like to be involved in 2015, please get in touch.

The next session will be on Wednesday January 7th 2015, at the Skiff. Details and signup will be via the meetup group. I will be speaking about applying microservice techniques to monoliths and my colleague Danielle Ashley will discuss an inappropriate Scala project.

(PS – I’ve set up a Linked-In group for people who like that sort of thing)

How not to do terrible things with scheduled jobs

Every system I’ve worked on has had scheduled jobs. Regular tasks need to be automated, particularly if they need to run in the middle of the night. Often these jobs are used for billing. For example, a job might need to:

  • Find all customers who need subscription renewals
  • Bill each customer
  • Update that customer’s records, allowing them to continue accessing a service
  • Email a billing confirmation

Writing a job like that is easy. The problem comes with remembering all the different things that might go wrong. Some developers are good at this, but others are optimistic, happy-go-lucky souls that never consider all the terrible things that might happen. I thought it would be useful to make a list of the sort of things I ask myself when thinking about a scheduled job:

  • What happens if it fails to run? How do I find out it has failed (i.e. where do we see the effects of the job not running)?
  • What happens if there is a problem? Who gets notified? How do they know what to do next?
  • If there is a problem with processing one of the records, does the job continue? What if every record is failing? Do we give up or keep going?
  • If a run of the job doesn’t happen, does the job run as normal on the next execution? Does it catch up on the previously missed work? Should it?
  • Can I run the job manually if I need to? How should this be done? Who should be allowed to do it? How do they know when they should do this?
  • What happens if the job runs more often than it should? What if it’s running once a minute rather than once a day?
  • What happens if the job is taking too long? How can I tell if the job has failed, is paused, or is just taking a really long time?
  • What if an execution of the job is still going on when it’s time for the next execution to begin? Can the different instances of the job interact safely? Does the job check whether it is already running?
  • What happens if two instances of the job are running simultaneously?
  • What happens if the job fails halfway? Can it be restarted safely?
  • What happens if the data used by the job is changed by another process? For example if a user cancels their subscription after the job has started?
  • What happens if one of the steps fails? Email and billing are often third-party systems. In which order should the events happen to ensure the safest failure?
  • What is the worst possible thing this job could do if it were to go wrong?
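
A few of these questions (does one bad record stop the job, what if every record fails, can it be restarted safely) can be answered directly in the shape of the main loop. The sketch below is only an illustration: `run_renewals`, `bill`, `already_billed` and `max_failures` are names I have made up, and in a real system the set of completed customers would need to live somewhere durable, such as a database table, rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class JobResult:
    succeeded: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    skipped: list = field(default_factory=list)
    aborted: bool = False


def run_renewals(customer_ids, bill, already_billed, max_failures=3):
    """Process each renewal independently, giving up if too many fail.

    bill(customer_id) charges one customer and raises on failure.
    already_billed holds ids completed by an earlier (possibly
    interrupted) run, so a restart does not charge anyone twice.
    """
    result = JobResult()
    for cid in customer_ids:
        if cid in already_billed:
            result.skipped.append(cid)  # done in a previous run
            continue
        try:
            bill(cid)
        except Exception:
            result.failed.append(cid)  # one bad record does not stop the job
            if len(result.failed) >= max_failures:
                result.aborted = True  # probably systemic: stop and alert
                break
            continue
        already_billed.add(cid)  # record success before moving on
        result.succeeded.append(cid)
    return result


def flaky_bill(cid):
    # Stand-in for a third-party billing call that fails for one customer.
    if cid == "c2":
        raise RuntimeError("card declined")


done = {"c1"}  # pretend an earlier run billed c1 before it was interrupted
result = run_renewals(["c1", "c2", "c3"], flaky_bill, done)
print(result.succeeded, result.failed, result.skipped, result.aborted)
# prints ['c3'] ['c2'] ['c1'] False
```

Returning a result object rather than just logging also helps with the “who gets notified?” question, since whatever calls the job can decide how to report failures.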

Not all these steps are relevant to all jobs. Writing code to handle every eventuality can sometimes be more expensive than clearing up the mess when things go wrong. And, obviously, a lot of the issues are handled by frameworks. But it’s worth running through these sorts of questions before writing any code. Thinking about terrible things and knowing what will happen produces more robust code.
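
As an example of one of the questions above, whether the job checks that it is already running, here is a minimal sketch of a lock-file guard. The names `single_instance` and `AlreadyRunning` and the lock path are hypothetical. It relies on `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists. It only protects jobs on a single machine, and a process that is killed outright will leave a stale lock behind, so treat it as a starting point rather than a complete answer.

```python
import os
import tempfile
from contextlib import contextmanager


class AlreadyRunning(Exception):
    """Raised when another instance of the job holds the lock."""


@contextmanager
def single_instance(lock_path):
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(lock_path)
    try:
        os.write(fd, str(os.getpid()).encode())  # record who holds the lock
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock, even if the job raised


if __name__ == "__main__":
    # Hypothetical lock location; a real job would use a fixed, agreed path.
    lock = os.path.join(tempfile.mkdtemp(), "renewals-job.lock")
    with single_instance(lock):
        try:
            with single_instance(lock):
                print("second instance started")
        except AlreadyRunning:
            print("second instance refused to start")
```

Writing the process id into the file means that someone investigating a stuck job can at least see which process claimed the lock.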