Java Infrastructure Part 1 – Version Control

If your code isn’t in version control, then it doesn’t really exist. It’s too easy to lose the code on a single computer – a hard drive failure, or maybe a mistaken rm command. And it’s also easy to make a change that breaks something and not be able to get back to a working version. At those times a stored copy is a lifesaver. The basic requirement of a Version Control System is to keep code safe. But, over the years it’s become much more than that.

I’ve used a lot of different VCS – sourcesafe, RCS, CVS, SVN and git. At my first job, we used a series of network folders, with a lotus notes DB to keep track of who was working on each file. Different versions of the folders identified the different development environments, from development through to live. Promoting the site  would involve copying code from one set of folders to the next.

At first, version control is about making sure the software is safe and providing a history. This in itself makes it invaluable. But tools like SVN and git make collaboration easier. Of all the tools I’ve used, git is the first one that I’ve loved.

A lot of the tutorials on git treat it like a distributed SVN. This may be helpful in getting started, but soon leads to confusion. My favourite tutorial is Git from the Bottom Up, which discusses git in terms of the objects it uses internally. That makes it much easier to understand: git is a time machine, allowing you to open up alternative timestreams, recombine them, and do it all without opening up paradoxes. As long as you understand what you’re doing.

Git adds a lot of great features. git bisect is great for finding where bugs were introduced. Git stash is great for when you need to change what you’re working on. Git detects moved files better than svn does. But the best thing about git is the branching model. Rather than have branching being something difficult, as it can be in SVN, git is based treats branching as something that should be commonplace.

The big problem with version control systems is how you fit them into the company’s working methods. Git enables people to collaborate effectively but also provides challenges. This is a topic that I owe a whole post on its own. The Death of Continuous Integration is an excellent talk on the topic by Steve Smith.

So, before we write any code for our new project, we need to set up a repository. Github is a convenient place to host these archives, and that’s where I’ll be putting the java-infrastructure code.

Screenshot - 160216 - 18:37:27

Our repository is created with a Java gitignore file and a readme. To that I am going to add a single file, a TODO. The Readme provides a quick overview of what this project is for. The TODO is a simple reference to track things that need to be added to this project. The first item in this file is a note to add a better issue tracking system.

It’s not much of a project yet, but at least we know anything we add is safe in a repository. The current state can be found as commit 94d34c6.

Next up: writing some Java code.

Java Infrastructure Part 0 – The Long Way

I’ve been programming for a long time and worked with a lot of different companies. I’ve seen a range of architectures, organisations and processes. I started coding before the Agile Manifesto was signed, so I’m old enough to remember that projects were sometimes still successful under waterfall – but that’s another story.

Writing classes and putting applications online are easy enough. Most companies are working on well-understood problems. Despite this, two issues tend to emerge. The first is maintenance. A lot is written about refactoring and managing software, but it rarely works in practise. No matter how clever the devs, code tends to end up more complicated than it needs to be and change becomes difficult.

The second issue is linked to the first, and that is infrastructure. It’s easy enough to write a new piece of code and put it live. It’s so easy that a lot of people focus on writing features for a new application. Deployment tends to be figured out in the closing weeks of the project. After all, the first deployment is relatively straightforward. The problem comes as things grow more complicated.

Once a piece of software is live and has users, it’s hard to switch the deployment strategies.The Internet is now sufficiently established that it’s not  appropriate to shut down the system every time you need to make a change. The first few deployments are simple, quick. As the system grows, it takes longer and longer to redeploy by which time there are a lot of other things competing for attention.

Adding infrastructure to a large project is a challenge. One doesn’t want to risk breaking those obscure sections of config files, placed there to handle one specific situation. Obsolete sections are left in the config because nobody is quite sure if a line does something or not. In the end, only one or two long-established developers are able to change the infrastructure. After they leave things become even more difficult.

What I want to do with this series – both on my blog and on github – is to build up a generic piece of software with simple Java code, but to build a rich infrastructure around it. I think there is a lot to learn from this – and at the end I’ll have a good base to work from with my own future projects. I hope to learn about making infrastructure flexible, which, as I’ve said above, it a rare thing.

Java is easy, but being a professional developer requires much more: version control, continuous delivery/deployment, build management, monitoring, IDEs, logging frameworks, email management etc, etc. This is what I’m going to focus on.

Some years back, I studied deconstruction for my MA. At the start of the course, the Professor read a short Kafa story, the Next Village:

My grandfather used to say: Life is astoundingly short. To me, looking back over it, life seems so foreshortened that I scarcely understand, for instance, how a young man can decide to ride over to the next village without being afraid that -not to mention accidents- even the span of a normal happy life may fall far short of the time needed for such a journey.”

After reading those sixty-six words, the Professor sighed. “We could spend all ten weeks on that piece”. I’m not planning to be quite that meticulous, but this is going to be quite detailed. Based on the notes I’ve made so far, no code gets compiled until Part 3.

Deploying sites quickly on Spring Boot

According to the author Thomas Mann, “A writer is someone for whom writing is more difficult than it is for other people”. And, working as a tech-lead, I found myself turning into someone for whom writing Java was more difficult than for other people. When a piece of software might need supporting for years there are a lot of things to consider.

One of the attractions of Spring Boot is that  it offers a way to get sites live very quickly. In practise, a production environment offers so many potential issues that a Spring Boot deployment still requires a significant amount of work. A simple hobby site… should, in theory, be very quick.

So, I’m going to try to put a Spring Boot site live and see how long it takes. The site will be simple – a map of Brighton where users can add points of interest. I’m going to use heroku to simplify deployment and cobble together some Javascript.

I’m not racing to get this live, but taking this at a leisurely pace – and the hours won’t be contiguous. Let’s see how it goes.

Hour 1

  • Set up a new git repo.
  • 0c7b180 – Created a basic Spring Boot project, using Accessing JPA Data with REST as a template. Rather than storing Person objects, I used PlaceOfInterest objects. Within 15 minutes I could write and read from an in-memory JPA datastore.
  • 2efcede Added some static pages to the Spring Application, which I had lying around already.
  • 1cad492 Next was setting up Heroku (which meant retrieving an ancient login). The free service should suffice for the basics. Heroku have a guide to deploying Spring Boot applications. This didn’t mention the Procfile, which is described in a post by Nicholas Paul Smith.  Testing the app through curl, it is persisting data, albeit to an in-memory database.

So, in the first hour, I have a Spring Boot project running on heroku. The next step is to add a persistent postgres database.

Hour 2

Hour 2 was spent in Emporium Cafe, struggling to connect the Spring Boot application through hibernate to a local PSQL database. Since I was trying to do things fast, this was something of a hack-and-slash effort – for example, resorting to creating a hibernate_sequence object manually. Setting up postgres was a bit of a drag as I’m used to MySQL.

The database URL was passed via a JDBC_URL environment variable as I’ll need this for Heroku. At the end of the hour I had a local application writing to the postgres database. The next session will start with tidying up the source code that was added and committing it.

  • 179356f – Added a basic SQL file. This was generated by Hibernate with the addition of a sequence. This needs to be tidied up.

Hour 3 (and a bit)

  • 30b5346 Starting by committing the database work from the day before
  • 4fdc2c7 Added a new file to keep track of what needs doing for this site.
  • 2a217b0 I added a new layer to the map, which loaded in a tab-separated data from a new controller. We’re using Openlayers 2 for expediency, as this simple format is not available in Openlayers 3.  This was a rather frustrating process: the markers were not displaying but there was no error. This commit has a single marker appearing about 400 miles South of Ghana.

The next session will start with working out why Openlayers is not displaying the marker where I expected,  in Sussex, England. This was annoying enough to keep me looking into it for another 15 minutes, so hour 3 was a little longer than planned. As usual with mapping bugs, the problem is not setting the projection.

Hour 4

A slightly shorter hour, to make up for overrunning the previous night.

  • ba69d68 Added the projection to the POI layer, and generated the TSV file from the database. This means that the markers on the map are now connected to Postgres.
  • 74b718d Set the app up on Heroku – which required changing the database url value; and, for some reason, having to reinstall  some heroku command line plugins, which slowed things down. I also had to add database objects  to the external database.
  • 1a436bc Cosmetic changes, including a title for the page and removing the default favicon – although this is still appearing for some reason.

At the end of the fourth hour, I have a site running on a remote server that can read/update a database to display locations. Given my slowness setting up postgres and fixing the Openlayers issue, it looks like a basic CRUD site could be put live in a couple of hours. Next up: adding a form to the page for new data.

Hour 5

At this point, I’m into playing with Javascript, setting up a form to submit data. Clicking on the map now updates a form, and submitting the form adds entries to the database. It’s a little clunky – the new object doesn’t appear on the map without refreshing; the form looks ugly; and there is an error handler firing on the AJAX request. A slightly slow hour, getting to grips with Javascript, but the main user-facing functionality is now there.

  • 4452c5d Added a form to the site so that new places-of-interest can be submitted.

To Be Continued?

There are a few obvious things to do next. The TSV file should only show visible points, the page needs tidying and I’d like to use a proper domain in place of the Heroku-supplied one. I’m very pleased with the speed and simplicity of progress so far.

Finding Ley-Lines with PostGIS

I’ve recently been playing with PostGIS. This post will summarise a simple attempt to manipulate data and draw it on top of Open Streetmap. I wanted to produce a Brighton version of Steven Kay’s Pub Ley Lines. As you will see, the outcome was less interesting than the process.

What follows is, basically, a How-to-Draw-An-Owl tutorial. It summarises the steps I usually look up and is intended to share with a few specific people. If you find yourself here via Google and want more information, leave a comment and I will add more detail.

1 – Download data from OSM

The first thing I needed was the data from Open Streetmap, which contains pub locations among the points-of-interest. There are a number of options. Downloading direct from OSM failed when I last tried it, but I had an older version of the data available.

2 – Set up Postgis and osm2pgsql

I’d previously installed Postgresql and Postgis on my laptop but somehow the installation has become broken and won’t be easily repaired or uninstalled. I should fix this, but I wanted to get on with this experiment. I’ve been meaning to set up AWS for some time, and using a micro instance on Amazon allowed me to get a version of PostGIS running very quickly.

I created a new EC2 micro instance based on the basic ubuntu instance, ran ‘sudo apt-get update’ followed by ‘sudo apt-get install osm2pgsql’ and I had everything I needed.

AWS is awesome, and I love being able to run up an instance for a small task and throw it away once I’m done.

3 – Create postGIS user and table

Setting up a new user on postgres is certainly less of a hassle than doing it on MySQL, but there are a few gotchas – such as needing to add a line to the /etc/postgresql/9.3/main/pg_hba.conf file. Also, when creating a new database, remember to enable PostGIS with the command “CREATE EXTENSION POSTGIS”

4 – Load the OSM data in Postgres

Having transferred my PBF file to the AWS instance using scp, I could then load the data. The command here is a little different to the one I used previously because of this server’s limitations:

osm2pgsql -U ley -d ley  --slim --cache-strategy sparse --number-processes 4 brighton.pbf

5 – Create a table containing all of the ley lines

I pretty-much followed the recipe given by Stephen Kay here.

6 -Extract the data

This time, rather than a tab-separated format, I selected the WKT (well-known text) format for all the ley-lines with more than 8 pubs. I’m sure there are better ways to extract this. The query used was:

psql -U ley -w ley -c "COPY (select st_astext(st_transform(geom,4674)) from leys where ct> 8) TO STDOUT WITH CSV" > lines.txt

This may not be the most portable format for the data but I can bully it into something Openlayers can use.

7 – Create a page to display the data

Since the data is constant and will be low-traffic, I am using OSM via static files to display the results.  The linestring WKT representations of the leys have been copied into the javascript file rather than being loaded from a file. All quick-and-dirty, but it has worked. The source is on github and the results are online.

An Enterprise Java Hello World

The ‘hello world’ program is of great importance to developers. It’s usually the first thing written when using a new language or framework, pretty much the simplest thing you can do: output 12 characters (assuming a newline). Writing the hello world program in C, the first time I’d written a compiled program, was an incredible moment for me: I could make the computer do something. This simple idea has an entry on wikipedia and a list of examples.

Hello World is supposed to be simple and there are a few jokes about Java EE hello world programs, mocking the framework for being long-winded and unwieldy – see for example, item 8 in The top Java viral jokes of 2014. In its defence, the contents page shown in the article covers a lot more than code, and is intended to get someone up and running from a bare-bones structure. But Java definitely doesn’t have the concision of Python’s print “Hello, world!”

One of the proud boasts for Spring Boot was how simple it was, with an example application that fitted into a single tweet (the link includes instructions for running it):

spring-boot

As easy as it is to get a Spring Boot application working, this is only part of the task of the development life-cycle. By focussing on the output, it’s easy to miss a lot of the non-functional aspects of an application. You can deploy a new piece of software to a server without making it easy for developers to work with. This is particularly dangerous with non-technical stakeholders who only see these non-functional requirements indirectly. It’s hard to prioritise infrastructure against features and bug-fixes. They are also incredibly difficult to fit into an application after it goes live.

I’ve recently set up a new application. The initial project was produced using Spring Initializr but, rather than start cutting code, I’ve been thinking about what else I need for a basic application. The essentials include:

  • Source control – and, preferably, some sort of branching and versioning strategy
  • Continuous integration and related process to make sure that new commits don’t break tests
  •  A deployment process allowing the same binary to deployed to immutable servers – preferably using some sort of container or virtualisation – preferably with the binary produced once and stored in a centralised location.
  • A means of externalising configuration for different environments
  • Some sort of monitoring and log management

It’s easy to drop a Spring Boot jar file onto a server and run it, but that’s not going to work in the long term and the easiest time to sort these infrastructural items in place is at a project’s start. The more complicated things become, the harder it is to add them in.

In short: an application isn’t just the software you are writing: it’s also the infrastructure that you put around it. In order to release and maintain an application you need to do a certain amount of work beforehand. Your hello-world application isn’t ready for production until these things are done.

Brighton Java – Continuous Deployment

IMG_20150304_195142
I had a break this month as Mr. Stanier hosted the meeting

 

Last Wednesday we had Brighton Java’s March event. It was another good turn-out, with about 35 people turning up to hear Jose Baena talk about his experience of continuous delivery.

Hearing about other people’s experiences with introducing a technology is incredibly valuable. The talk was followed by a discussion, chaired by James Stanier. We’ve not often used this format but it drew out some interesting discussion points.

Jose’s presentation was great (especially the hypnotic footage of an apple-slicing machine), with some useful suggestions on how to get Maven, Nexus, Ansible and Jenkins working together – with Jenkins acting as the driving force. There was also a detailed explanation of the importance of versioning.

The discussion underlined something I’ve been thinking about for a while – that things like continuous delivery need to be put in place early on, that these sort of infrastructural things are hard to retrofit. But that’s a story for another post.

Dan Chalmers has also posted a response to the meet-up: Continuous Deployment and Developers on-call. Dan does a good job of explaining the issues around making developers responsible for their code. I still think this is important but making it work in practise is a subtle, difficult problem.

Getting Open Streetmap working on Ubuntu

It’s taken me quite a bit of time but I’ve finally got an open streetmap tileserver working on my laptop. The main issue appears to have been that I was following the manual Ubuntu set-up instructions rather than building a tile server from packages. Which isn’t to say that using packages was plain sailing. This post will outline some of the problems I ran into and how I solved them.

(Bear in mind that I’d already tried to install OSM using the manual method on the laptop and fragments of these installs were still on the system. Installation might be a lot easier on a clean OS. However, it’s worth documenting the issues in case anyone runs into the same issues and is googling for them)

The main point of this post is to say that I have managed the open streetmap set-up and now have a map of Brighton working on my laptop:

Screenshot from 2015-03-03 08:11:32
I can see my house from here!

The installed slippymap.html page is great as the radio buttons on the top right allow you to compare a remote instance against the local one – useful for finding out where you are when things aren’t working, or you only have a small area set up. Now I’ve fixed this, I can get on with actually using the server.

The main problems I had to solve with the packaging method were:

Mod-tile dpkg installtion hangs

The command to install mod-tile hung. Googling around, I found a patch for this  – a line was missing from the libapache2-mod-tile.postinst script. Editing the script and re-running the installation fixed this issue.

osm2psql didn’t work

When I ran the command osm2pgsql I received an error:

Osm2pgsql failed due to ERROR: SELECT AddGeometryColumn('planet_osm_point', 'way', 900913, 'POINT', 2 ) failed: ERROR:  AddGeometryColumns() - invalid SRID

This seemed to have been caused by not having set up data in the spatial_ref_sys table. To check this see if there are any lines from the SQL query

SELECT * FROM spatial_ref_sys;

If not, then this is easily fixed:

psql --username=postgres --dbname=gis --file=/usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql

After re-running this script, the osm2psql script worked. Note that this may have been a problem due to me incorrectly removing previous installation attempts.

Tiles not rendering

This is more of a generic issue. Basically, when the slippymap page is working, but the images are not being generated, a lot of missing images will be displayed:

Pink tiles

You can open the blank tiles in a new page and see there where they are being generated from. Some of the help online suggested that it could be down to the slippymap.html pointing to the wrong OSM server (it defaults to localhost). In my case it was a permissions issue:

Database permissions not set correctly

There seems to be a minor issue with the installation scripts. This was fixed by explicitly setting up the postgis user.

I was able to diagnose this issue by running renderd as a foreground process (ie renderd -f) and seeing permissions errors such as the following:

renderd[26466]: An error occurred while loading the map layer 'default': :
FATAL: role "www-data" does not exist (encountered during parsing of layer 'landcover' in map '/etc/mapnik-osm-data/osm.xml')
renderd[26466]: An error occurred while loading the map layer 'default': :
FATAL: role "www-data" does not exist (encountered during parsing of layer 'landcover' in map '/etc/mapnik-osm-data/osm.xml')

Book Review: Release It!

Michael T. Nygard’s Release It! was referred to several times at MuCon and keeps coming up when people discuss microservices. Despite being released in 2007 the book feels up-to-date. There was a lot of useful advice in here and, even when material was familiar, it was still entertaining. I wish I’d read this years ago – it  would have saved a few mistakes.

The book is a good mix of instruction and case studies. Seeing how Nygard coped with real-life outages is both instructive and fascinating. The advice ranges from high-level (an excellent overview of networking) to low-level (when doing load tests, make sure some connections don’t log out).

Some of the sections I found particularly useful were the one on third party SLA’s and how to deal with them; QA vs production (one of those issues that keeps returning for me); the need to consider data purging from the start of a platform. There is also the discussion of the circuit breaker pattern, which seems to have been one of the most influential parts of the book.

As my responsibility has increased, I’ve needed to think more about the terrible things that might happen to a system. A lot of junior developers are perhaps too optimistic and books like this one are good for giving an idea of how subtly brittle systems can be and the need to develop a certain cynicism. For example, in a section on testing, Nygaard writes:

“A good test harness should be devious. It should be as nasty and vicious as real-world systems will be. The test harness should leave scars on the system under test. Its job is to make the system under test cynical”

Parts of Release It! feel like a horror novel for software developers. It will open your eyes to places where your software is vulnerable and how bad things might get. The book is also funny – laugh out loud funny in a few places. The best endorsement I can give is that this is one of those books you want to force everyone around you to read.

Brighton Java – Agile Testing and Spring

IMG_20150204_195616

We had a packed session at Brighton Java last Wednesday, with Kim Knup from Crunch starting by discussing Agile Testing. The testing community is absolutely incredible and they’ve done so much to define their role within software development, moving it away from the unsophisticated idea of simply catching bugs.

The second talk was Luke Whiting on Micro services, micro effort. The slides and source code are now online.  I’m looking forward to playing with the tools that Brandwatch have been working with in this area.

The next Brighton Java meeting is on March 4th, with speakers to be announced nearer the time.  I don’t know who it is yet, as I’m not organising this one. I’m looking forward to being able to relax a little more on the night.

2015 – Brighton Java

Weblogs are generally quietest when there are a lot of things going on in the writer’s life. I have notes on several posts but haven’t finished any yet. I’ve been researching a lot of topics such as RabbitMQ and Docker, thinking about microservices and handed in my notice at Crunch. There’s certainly a lot to talk about.

I also need to write up some notes on my talk at Brighton Java in January. For various reasons (some of them out of my control) the talk was less successful than I’d hoped, but I think there were some useful points in there.

The February Brighton Java event is on the 4th, and there are only a few spots left. Kim Knup is talking about Agile Testing and Luke Whiting is returning to talk about developing services in Spring. I also need to find speakers for the March event.