Java Peaked With Version 7

Next week sees the launch of Java 17. This is the 8th release since Java moved to its six-monthly release cycle. This means nine releases have taken place in the same time that it took to go from version 6 to 7.

But, personally, I think the best version of Java is Java 7. The first time I said this I was joking, having just wrestled with some very complicated code that someone had written using streams. But the more I think about this, the more certain I am that this is right.

I can understand the pressure to keep adding features to improve Java. I remember the noughties, when everyone said Java was going to be replaced by newer, cooler languages. Java might be verbose and clunky but it’s also consistent. Pre-Java 8, there tended to be only one or two ways of doing things, and code would look relatively similar between different companies. Being a simple language, it was easy for developers to follow what was happening. There was less space for clever code that junior developers couldn’t understand. 

Java 8 was exciting, providing new paradigms for Java. But it’s made code more idiomatic and it’s easier to write obfuscated code. Compare it with perl, a language that was intentionally designed to be expressive. It’s not used in many large-scale systems. 

I can understand the pressure to add new features. Sometimes Java is frustrating. But Kotlin and Scala have set out their stalls as advanced languages on the JVM. They’re compatible with Java too, so there’s a strong argument for keeping Java as the dull boring option. I might want to use Scala or Kotlin in my own projects, but where I’m collaborating with multinational teams working agilely (which, in practise, means no documentation) I like my Java code as simple as it can be.


What Do We Mean By Technical Debt?

The cat is a metaphor for technical debt

Recently, I’ve been thinking a lot about technical debt. It’s a great metaphor for how time is lost in programming projects but, like all metaphors, you have to be aware of where it stops working.

For example, when people compare national debts to household debts, this ignores how national banks control the supply of money, and that countries run into different problems to households when they’re over-extended.

Technical debt is, basically, time saved now which costs more later – taking tactical short cuts. It’s usefully compared to any other corporate debt, providing leverage for growth that would be impossible otherwise. The problem is that people don’t seem to track how much technical debt they’ve accrued and need to ‘pay back’. I’ve worked at a number of places where dealing with inefficient processes that were set up in a hurry took up a substantial amount of day-to-day work.

Having seem companies struggling with debt, I wonder if technical debt should be treated more like personal or household debt. Advice to people who are struggling with debt generally focusses on three points:

  1. Reduce outgoings to focus on paying off the debt
  2. Pay off the highest-interest debts first
  3. Do not save or invest anything until any costly debts are dealt with.

The software equivalent of expensive credit card debt is any inefficiencies or manual steps in repeated processes. If your regular work is taking longer than it should, there’s less time available for new things. So, like a household in debt, don’t try to invest in new things before sorting these existing inefficiencies. Fix the automated tests so no-one needs to reduce the manual checks on a release. Automate the entire release process. Improve testing to prevent disruptive production bugs.

Technical debt is a good thing when it’s used to provide leverage for growth. But, if you’re not tracking this debt accurately then maybe you should treat technical debt more like household debt, and prioritise reducing it to a manageable level. There’s no point investing in new features when you’re paying high interest on broken processes.


First steps in servlerless

I’m starting a new job next month where I’ll be using AWS Lambda. In preparation, I’ve been cramming on the topic. The main resource I’ve used is O’Reilly’s Programming AWS Lambda, and I’m enjoying learning from an actual physical book with an animal on the cover.

Here’s a quick summary of some of the other sources I’ve been looking at:

  • Mike Robert’s Serverless Architectures post is massive, and full of really useful discussion. This includes: a comparison between serverless and stored procedures (vendor locking, difficulty testing and versioning); the value of reduced time to market; environmental benefits of serverless; and the challenges of integration testing.
  • Gunnar Morling produced a good infoQ talk, Serverless Search for my blog, which discusses AWS Lambda used for a Lucene-based blog search. Morling uses Quarkus to avoid lock-in, and also suggests this gets around the cold-start problem. He also suggested Funqy as an vendor independent abstraction for serverless code. Morling points out that serverless has a smaller attack service, but looked in detail at dealing with a ‘denial of wallet’ attack.
  • Bruce Schneier discussed The Misaligned Incentives for Cloud Security, warning that it has a few large providers making technical decisions for millions of users; and that security problems such as data breaches affect their customers more than it affects them.
  • Guy Podjarny talks about the security issues in greater detail in Serverless security: What’s left to protect. He points out that one still needs to consider dependency vulnerabilities. While security permissions in serverless can be very granular, there is also a risk of this sprawling. Podjarny makes a number of suggestions including having critical and non-critical functionality in different accounts or regions.
  • Serverless and Chatbots: A Match Made in the Cloud by Gillian Armstrong was focussed on chatbots, but had a good overview of a lambda-based platform in production. Armstrong also noted that while lambdas scale every quickly, other parts of an infrastructure such as datastores might not.
  • A 2020 article Why the Serverless Revolution Has Stalled takes a more cynical approach, looking at four potential issues: limited programming languages; vendor lock; performance; inability to replace monolithic applications. Some of these issues have been solved by some teams, but all these points are worth considering.
  • Cloud study by the writer Robin Sloan discusses his use of cloud functions to provide simple support for running his newsletter. His solution to the cold start problem is, he admits, not best practise, but works for him: “Instead of deploying each of my functions as Actually Different cloud functions, I’ve rolled them up into one “mega function”—really almost a tiny app.” This solves a lot of issues for this small piece of functionality, not least that it fails fast: “if something isn’t working, nothing is working
  • Another post on cold starts suggested reducing the artefact size and had a good discussion of using pings to keep services live.
  • Operational Best Practices #serverless talked about how serverless limits the amount of code an enterprise needed, and that BaaS, FaaS and BaaS can all help speed up dev, particularly early in the process “You get to rent engineers from Google, AWS, Pagerduty, Pingdom, Heroku, etc for much cheaper than if you hired them in-house — if you could even get them, which you probably can’t because talent is scarce.”
  • That piece also contains a stern warning: “there is no such thing as having the luxury of not having to understand how your storage systems work. Queries will get slow, and you’ll need to be able to figure out why and fix them. You’ll hit scaling cliffs where suddenly a perfectly-usable app just starts timing everything out because of that extra second of latency coming from … The more you understand about your storage system (and the more you stay in the lane of how it was intended to be used), the happier you’ll be.

Using serverless for hobby projects does look attractive. But, having tried to get S3 and IAM working on AWS, I’d be reluctant to suggest that to anyone – particularly given the financial perils of AWS.


Using postgis on docker to Find Ley-Lines

Back in 2015, I wrote a blog post about finding ley-lines with postgis, based on some work by Steven Kay.

I described the outline of how to produce as leylines as a How-to-Draw-An-Owl tutorial, since I skipped over a lot of the complex points. I was setting up my temporary postgis instances on AWS and there was a certain amount of faff involved in sorting out database access and structure. This post is an improvement on those methods.

(There are still bits I’m not explaining in full – basically, how to display the ley line – but if you need more details, please let a comment and I’ll get in touch. But, basically, you need a little bit of HTML/Javascript to take this all the way through)

I’ve returned to this project for some recent writing, and the technical aspects are much easier using docker. In fact, the process can be broken down to seven steps, which I’ve listed below. You’ll need to be comfortable with a command line, have docker installed, and be willing to play about a little. But the process is relatively straightforward:

  1. Start up a postgis instance from a computer with docker installed:
docker run --name some-postgis -e POSTGRES_PASSWORD=mysecret -d postgis/postgis

2. Open a terminal in the docker instance:

docker exec -it some-postgis /bin/bash

3. From inside the docker instance, install the OSM tools

apt update
apt install osm2pgsql

4. From the host computer, download some OSM data and copy the file into the container. There are various options for downloading this data but I tend to use the Geofabrik server.

docker cp ~/east-sussex-latest.osm.pbf some-postgis:postgres

Inside the container, the OSM file can then be loaded into postGIS using the command

osm2pgsql merseyside-latest.osm.pbf -U postgres

5. Back to our terminal inside the host, start up postgres

psql -U postgres

6. Now, here is the bit where I send you to another tutorial to get the ley lines SQL. These have been made available as a github gist by Steven Kay. Cutting and pasting the commands will get a fairly decent result, but I recommend having a play to get the best results.

7. You can now use another query to get the KML (Keyhole Markup Representation) format of the lines. This format is fairly portable, and I tend to plonk it in a simple HTML page. The larger the “ct” value, the more items are connected by the ley.

SELECT (ST_AsKML(geom)) as geom from leys where ct = 7;

8. And this is the hand-waving step, which is to display your KML lines on an actual map. Basically, you need to amend a simple OpenLayers example. Basically, if that leaves you stuck, leave a comment and I’ll get back to you (commenter email addresses are not be visible on the actual site).

The script from Steven Kay is great for producing candidate lines. The only problem is that it tends to follow clusters of pubs on the same street. The restriction that the lines must begin and end at a site while being at least 20km long prevents short local lines being plotted. But it would also be good to add an extra predicate to judge lines by how spaced-out the locations are. That’s something I need to work on in future.


The Story Of Silicon Roundabout

I’ve written before about the failure to create a Silicon Brighton, despite both a vibrant community and significant investment. While the whole country has tried for years to create technology innovation clusters, the most successful example is “Silicon Roundabout”.

The name Silicon Roundabout came from a jokey tweet by Matt Biddulph (who formerly worked at Dopplr and Moo), back in July 2008: “Silicon Roundabout: the ever-growing community of fun startups in London’s Old Street area“. This off-hand tweet led to a small flurry of press, as well as an article in Wired UK.

Enough time has passed that there are in depth reports on Silicon Roundabout (from the LSE’s Center for Economic Performance), and Wired UK has even published an obituary.

As Max Nathan, one of the LSE report writers, summarised on Twitter, “David Cameron launched the Tech City programme in 2010 to ‘accelerate’ the Shoreditch cluster – using place branding, business support, networking, tax breaks and a new delivery agency. By 2014, Mayor of London Boris Johnson was hailing it a huge success.

That Tech City programme occurred significantly after the Silicon Roundabout Cluster had been identified. Nathan questioned whether the Cameron intervention had any significant effect: “Instead of catalysing the cluster, policy generally rode the wave.” (ref). The discussion on twitter concludes:

Where does this leave us? None of this takes away from the real and impressive growth in London’s overall tech ecosystem… However ~ this work challenges the backstory. Many of the big claims made for the Tech City programme in East London don’t really stack up. And in the paper I find some evidence of negative side effects – for example, spatial disruption and movement within cluster space…. On the one hand, it’s impressive that a tactical, light-touch, cheap programme like this had any impact. Though I’m not sure this approach would work outside London.

Wired’s obituary also looks at the way in which the Silicon Roundabout ended up as “a real estate nightmare”. Low rents were one fo the attractions of the area, but the branding had an effect on this. The article admits that :

the Silicon Roundabout cluster is still there – and it is bigger than ever. Ironically fulfilling Cameron’s prophecy, it has crawled all the way east, as people ran away from rising property prices. In merely spatial terms – how big the cluster has grown – the policy has been a success


With many more firms scrambling to move near the famed Old Street roundabout and rents duly skyrocketing – the initial benefits of the cluster vanished. Many companies, especially pure tech startups ended up being overwhelmed and closing shop; others left very quickly; the environment overheated and churn accelerated, destroying any semblance of a stable network of cooperation. “For digital tech firm productivity, I find suggestive evidence of a negative policy effect,” Nathan’s paper reads. “This suggests that policy weakened the net benefits of cluster location.”

What does this mean for Brighton? We’ve had several Silicon Initiatives and the Digital catapult. Why are we not a ‘Tech City’? The story of Silicon Roundabout suggests that policy interventions are not going to help as much as we think.


An Introduction to FXGL

Last week, Brighton Java had a session from Almas Baimagambetov of Brighton University, who introduced the FXGL game library. The session featured three introductory videos and some discussions . I’ve embedded the three videos at the bottom of this post.

What was most exciting to me was how quickly results could be produced. One of the problems with modern Java is how hard it is for beginners to produce tangible outputs. There are also a host of examples available that can be customised.

The format, with three videos and discussion in between worked well. And, while it wasn’t the focus of the talk, it was also useful to see some of the native packaging options for Java.

I’m passionate about the importance of Brighton Java as a local group, enabling people to communicate with nearby colleagues. However, moving online has enabled people from further afield to attend, and we had guests from Honduras and Senegal. Even as we move back to in-person events, I hope we can maintain access to the group for people outside Brighton.

Almas is scheduled to give a longer version of this talk at at FOSDEM21


The Seven States of a Boolean Object

Everyone understands Boolean variables – they’re either true or they’re false. Right? Except, in languages, where the Boolean variable might be null, which makes comparing two Boolean values a little little trickier. And the DailyWTF once gave an example of a Boolean ENUM that could be true, false or file-not-found.

Long ago, when I worked for Tom Hume at Future Platforms, one of the developers suggested something more ambitious. After all, a three-state Boolean leaves too much room for ambiguity. By the time he was done, Thom Hopper had suggested 6 or 7 states for a Boolean variable. Tom Hume and I recently tried to remember as many as we could, but only got 5 or 6.

We tweeted Thom Hopper to see if he remembered, but the states of this variable are lost in time, like tears in rain. But, for the sake of posterity, I want to record some of the values that a proper Boolean type might hold:

• true
• false (obviously)
• null
• uninitialised (similar null – but this means a true/false has never been set)
• unknown (we’re currently not certain of the value)
• indeterminate (there’s no way of knowing this value)

I’ve love to be able to declare that this is stupid, but I don’t know. I mean, if you accept null as a Boolean state, where do you stop? Maybe Java needs a ‘true’ boolean that can take all of these states?

infrastructure programming

Living Without Pre-Production Environments

Would we be better off without test environments?

I try not to recommend too many talks, but I loved this one from Nicky Wrightson of Skyscanner about living without pre-production environments. It provides an interesting solution to a lot of problems with performance environments I’ve been thinking through. Trying to accurately reproduce live environments seems like a wasteful, quixotic endeavour and I kept wondering about was just not using them. This video is an encouragement to that thinking.

Talks are often very different to the reality in companies, but there are some great questions for anyone about what test environments are for. “Historically, we had lower environments that were like production so that we could test releases that we couldn’t confidently reason about the effect of the changes.”

The only performance test that actually matters is how the live environment. ”At the end of the day, just make it a lot easier to track issues in production with really well instrumented code rather than try to replicate those issues in lower environments”.

There will always be inherent differences between environments, which limits the ability to draw conclusions from them. Yes, H2 is great for integration testing, but it has subtle differences with Oracle and MySQL which need managing. In performance tests, replicating load is hard – data and traffic shapes are as important as infrastructure and code. And then there are all the environment specific configs to manage, and the bugs from that.

Obviously, doing away with testing environments is the sort of decision that gets people fired. But! Just imagine how much things would improve if it worked!

infrastructure programming

What’s So Special About Cloud Software?

One of the more interesting interview question I’ve been asked in the past few years was about the differences between cloud development and monoliths. I don’t think I gave the expected answer when I said that they’re not all that different

Yes, cloud environments are complicated but good cloud development relies on the sort of fundamentals that sometimes go wrong in monolith development.

Co-ordinating subscription renewals, credit card billing and confirmation emails is an example of working with distributed systems on a monolith. Do this in the wrong way and you billed a customer multiple times, or spammed their email. Trade-offs, failures and retries needed to be carefully considered.

A lot of applications take external systems for granted, rather than considering that they do sometimes go wrong. One place I worked coupled their login process to Salesforce. When Salesforce had an outage, their monolith shared it. When local filesystems have issues, applications will fail dramatically.

There are obvious differences between cloud and monolith development (not least the operational complexity), but both require an attention to the principles of software development. You don’t need a large system to be hit by the fallacies of distributed computing.


My Most Useful Technical Interview Question

One interview question that I’ve been using for about ten years seems to filter out more candidates than any other. It’s not a trick, and I still don’t understand how come it catches so many people. Sometimes I worry that there is something wrong with what I’m asking.

The question is this – using a text editor and not an IDE, write a simple method to take an integer as an argument and return the factorial. I make sure to explain what a factorial is and wait.

The people I’m interviewing are rarely novices. I’ve asked this from people with years of banking experience. Some of them had exciting CVs, with successful projects and all the skills I was looking for; they could talk fluently about complex technology. Yet they did not seem that familiar with code. I’ve senior developers struggle with writing a simple loop.

I ask a lot of other things in interviews. I try to be open-minded, searching for strengths rather than weaknesses. I don’t bother with tricky algorithm questions that people rehearse for interviews and forget once they get an offer. With everything else I ask, if a candidate doesn’t know the initial answer, I follow-up to find what they do know.

But I expect anyone going for a technical position to be comfortable writing a simple piece of code, to be familiar with what code looks like. Can you write a loop and check it? I try to account for the fact that the candidate might feel nervous, and might find the lack of an IDE challenging. Sometimes, I tell them not to worry too much about syntax, to use pseudocode if they like.

I’ve been interviewing developers for years, and that question is essential. The piece of code I ask for is trivial. I’ve heard of interviewers getting the same results with FizzBuzz. The example that I use is listed across the web as an interview cliche, something a prepared candidate would expect. A good candidate disposes of this quickly and moves to the next question; but some people struggle. The question shouldn’t work, but it does.