Categories
weeknotes

Weeknotes: 2025-30/29

  • I drafted some notes last week, but didn’t press publish, so these notes are two weeks’ worth.
  • A client colleague prompted me to make more use of Copilot in Teams. It’s hugely useful, but there’s a gap between reading and writing in all these tools – it’s too easy to copy and paste the application summary rather than edit it (particularly if you have another meeting to get to, since the context disappears when you move away). It’s going to be interesting to see how helpful this proves in the long run.
  • I wonder if remote working is increasing the number of meetings as it is so easy to book them – and cameras off means that there are people multi-tasking, rather than looking bored in the room. There’s no feedback to prompt people to push back against the calls.
  • I’ve been playing with Amazon Q. The UX is an atrocity, but the tool itself is impressive and compelling. There are, however, a lot of subtleties about how this would work as a development workflow, and how it will scale up to use in large organisations. I’m using the Nilenso piece on AI-coding as a guideline. I made a post about my initial response to Q and another one about my second week.

Links

  • I’ve been catching up on Sean Goedecke’s excellent writing. In Do Not Yell at the Language Model he talks about how berating a language model for mistakes might create a negative context, producing worse results.
  • Peter Hilton describes an amazing lightning talk, where Chris Oldwood told programming jokes for 5 minutes. Hilton goes on to imagine a book of 97 Jokes Every Programmer Should Know, suggesting that such jokes are a good way to learn some aspects of programming. “There are 10 kinds of programmers: those who understand binary, those who don’t, and those who weren’t expecting a base 3 joke.”
  • Charity Majors wrote an interesting piece, On How Long it Takes to Know if a Job is Right for You or Not, in which she talks about the need for alignment between a manager’s values and the company they work for.
  • The striking thing about Bo Frese’s The 13 Ways We Kill High-Performing Agile Teams was how often these occur, despite going against well-known best practice. Also interesting to see that the Scrum Guide had removed ‘the three questions’ as a stand-up practice.
  • Good retros are hard, and Who Needs Action Items by Daniel Cooper is a good piece on this. “Eventually, people stop bringing anything that actually matters and it’ll all be fluff. No one wants to accidentally become the owner of ‘improve emotional tone in retros (Q3 OKR)’.”

Books

I completed a re-read of Kent Beck’s Extreme Programming Explained, which I last read back around 2001. I have a lot of notes to reflect on, but the biggest surprise was how little empirical evidence Beck had for his theories. Which is not to say I think Beck is wrong per se, rather that his insights are based on a particular set of experiences. There were also some provocative thoughts about documentation which go against what I think, and are worth interrogating.

Categories
GenAI

Summer of Q: Week 2

My overall impression, after more time working with Amazon Q, is that it will take some work for a coding agent to make me faster and more effective. Q definitely removes some of the boring bits of coding (it’s great at Maven dependencies) but it’s more wayward on complicated tasks. There’s a lot to learn here.

At the end of last weekend, I’d settled on a method: writing a specification for an area of my application, having Q produce a BDD feature file outlining the behaviour, and then getting Q to fill in the testing code and, after that, the implementation. This soon ran into problems as I’d still set Q too wide a brief, and the code produced quickly sprawled. There were many minor issues, such as Q producing unfocussed Cucumber step files. Along with the pages of code, some chunks of functionality were left out to ‘fill in later’.

It’s tricky to find a regular working pattern with good DevEx. I didn’t want to put Q into ‘trust’ mode, choosing rather to review each change as it was prepared. I did this so I could interrupt Q when it went off the rails, and also to reduce the amount of generated code I needed to review. This meant a lot of time waiting while Q was ‘thinking’. One colleague talked about their passion for writing code and how reviewing generated things is not the same. In their current form, these tools don’t have the responsiveness of working directly with code.

The production of the code also produced a strange effect around ownership. Hand-writing code (or whatever we call the ‘old’ ways of programming) meant taking care with each method. It was a good way to get inside the code, producing ‘mechanical sympathy’. Here, I started with a simple outline of my application in 275 words. Q produced over 10,000 words of feature files (including some useful functionality that was not asked for, such as sanitising inputs). This is a lot of reading! Assuming a reading rate of 400 words per minute, that is 25 minutes’ work – setting aside the deeper understanding needed here, and any editing required.
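
A minimal sketch of that reading-time arithmetic, in Python. The function and variable names are purely illustrative, the 400 words-per-minute rate is the assumption stated above, and the word counts are the approximate figures from this post.

```python
def reading_minutes(word_count: int, words_per_minute: int = 400) -> float:
    """Return the minutes needed to read word_count words at a steady rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute


# Approximate figures from the post (illustrative names).
spec_words = 275         # the outline I wrote
feature_words = 10_000   # lower bound on the generated feature files

# Time to skim the generated feature files at the assumed rate.
print(reading_minutes(feature_words))        # 25.0

# How much the output grew relative to the input.
print(round(feature_words / spec_words, 1))  # 36.4
```

The second figure is the one that worries me more: roughly 36 words to review for every word I wrote, before any careful reading or editing.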

Q also proved to be better at some things than others. When asked to generate some test data, Q created a program to populate the DB on start-up. I had to suggest using Liquibase. Being able to get the best out of this tool requires the operator to have a clear idea of what they would expect.

I’m still convinced that these tools will be part of a regular toolkit, but I don’t think they will offer the sort of incredible gains some have suggested – although they will be essential for prototyping. Cal Newport produced a great summary of the competing claims about productivity. My prediction is that, in the long run, we’ll see significant gains, but we won’t be relying solely on the agents.

Categories
GenAI

First Impressions of AmazonQ

My employer has organised a ‘Summer of Q’, where a number of us have signed up to play with Amazon Q. This weekend was the first time I could work with Q in depth. The main result – I ‘built’ a quiz application in 30 minutes (while also doing some chores) and it looked and worked better than what I’d have produced solo. But there are a lot of subtleties and caveats to add to this.

  • A major argument against GenAI putting developers out of work is how poor the tooling and signup flows for Q are. The signup is terrible and confuses a lot of people. Q failed to help, and kept hallucinating links to help pages that didn’t exist. The IntelliJ plugin is awful and locks the IDE, so I’ve had to use the command-line version instead.
  • Q is great at producing code. Producing the quiz example was a trivial task, so I’m now working on a much more complicated example. Straight away, I can see Q making me more effective. Personal tools I’ve wanted to make, that I decided against investing time in, now look easy.
  • The quiz app that Q produced looked and played better than what I could have produced by myself. I’m very impressed by this.
  • The model’s reasoning is clever and spooky – it makes mistakes sometimes, but then works to fix those. Interesting behaviour – although I expect there to be fewer mistakes in the generated code over time.
  • One of the challenges of coding agents is getting used to the new workflow. There’s a fair bit of waiting involved while Q thinks about each file that needs creating. It’s very different to using a GenAI coding assistant, and I need to figure out the best new workflow.
  • An ongoing problem with GenAI is that it involves a lot more reading than writing. I figure almost no-one is reading Copilot meeting summaries, and I worry that not everyone will closely read the impressive amount of code that Q generates.
  • At present, I’m reviewing each action Q takes, rather than trusting it for the session. It’s going to be interesting to see how other people are working. There’s a lot of boring waiting this way, but a lot less reading to do in one go.
  • Being able to produce decent (albeit not perfect) code so quickly will change the nature of programming. The coding part is going to get much easier. The development part – making sure the right thing is produced – will become more important, and maybe more difficult. I’m currently using feature tests as a way of validating what is being made.
  • Something I’ve noticed with GenAI in a number of areas is the importance of taste. The tools produce things (image/text/code) incredibly fast, and require an operator with strong opinions about this output.
  • Q responded to my initial, naive prompts by producing ornate additional features. For example, I asked it to generate some BDD feature files and it’s adding some complicated accessibility tests. I’m looking forward to watching it try to fill those out! I also spotted some subtle divergences from the spec that I need to edit. The quiz code I initially generated also included a lot of useful but unasked-for features. They were improvements, for sure, but it was definitely not an MVP. It will be interesting to see how easy it is to work with Q on my more complicated application.
Categories
weeknotes

Weeknotes: 2025-28

  • I’ve been working this week on MongoDB replica sets and I’m very impressed with their resilience, particularly the use of an intelligent client in the driver to handle failover etc.
  • As part of an initiative at work, I started playing with Amazon Q, initially asking it to generate some basic arcade games. My first impression was that the simple examples it produced were impressive, while being aware of the challenge in getting precise results from a coding agent. Something I need to spend more time on.

Links

  • An excellent post from Sean Goedecke, AI Interpretability is further along than I thought, talks about internals of language models – it was a useful reminder of why telling a chatbot that it’s an expert works.
  • AI-assisted coding for teams that can’t get away with vibes (via Simon Willison) was a useful primer on large-scale coding with GenAI. A useful rule here was ‘what helps the human helps the AI’, including linting, CI/CD, documentation and clearly defined features. Some good examples around prompting, and how AIs are used to build the prompts to code from. The most interesting bit, and something I’d like to go back to, is the claim that the DRY principle is less useful when working with LLMs. This is a living document being maintained by nilenso, which I will have to keep an eye on.
  • Could HTTP 402 be the Future of the Web was a good speculative article about the need for micropayments and how charging AI crawlers could lead to that.
  • Some excellent words of wisdom from Everything is Prioritization: “If you’re remote and still feel frazzled, you’re not doing remote wrong. You’re just prioritizing availability over impact.” The article talks about the need to avoid tempting distractions: “The best teams aren’t full of geniuses. They’re full of people who keep their focus and say ‘no’ without having a breakdown”.
  • I’ve long disliked the cargo cult metaphor, and this is deconstructed in The origin of the cargo cult metaphor, which points out a lot of the errors and miscomprehension in the popular understanding of actual cargo cults. “The cargo cult metaphor is best avoided”.
  • Simon Willison’s Identify, solve, verify is a short piece on the role of the programmer in the era of GenAI. “The more time I spend using LLMs for code, the less I worry about my career”.
  • The Elegance Question: What Makes Some Systems Just Work? set out some simple principles for building ‘elegant’ systems. This was thought-provoking, particularly around the question of why so many systems go against these principles.

Books

No time for reading this week – and I’ve been distracted by a non-tech book.

Categories
weeknotes

Weeknotes: 2025-27

  • I’m going to try writing a few weeknotes to see how they feel. I need some way to consolidate everything I’m reading and thinking about, but longer blog posts are not coming together. These weeknotes will help me track my technical interests – and hopefully help me find interesting blog posts when I need to refer back to them.
  • Last Sunday, I had an interesting conversation with Laurence where I found myself asking whether agile is too hard for most teams. Laurence pointed out that the core of agile is simple, but it does place a lot of demand on developers. I think the widely perceived failures of agile need much more consideration.
  • In another discussion with Laurence, I realised how vital GenAI skills will be for technical managers – there is a huge change in software development coming and staying current will require understanding those skills – not least to be able to support and unblock those who use them most.
  • Something I’ve not blogged about over the past few weeks is the decline of Stack Overflow. It’s been interesting to see how the references for learning technical skills have changed over the years.
  • One of the things I like most about working in a large consultancy is the number of talks and activities going on. An ‘unadvent of code’ group has started to look at the Advent of Code puzzles from 2018. This has got me playing with Go as a coding activity, which I’m enjoying.

Links

  • I watched the video Java for AI by Java Library Architect Paul Sandoz – another example of the Java platform’s strength as a combination of JVM and libraries. It will be good to see Java become a first-class platform for AI.
  • AI Is Poised to Re-write History is an interesting article looking at GenAI as a reading machine rather than a writing machine. It also interviews Mark Humphries, who was discussed in the excellent Feb 2024 Verge article How AI can make history.

Reading

Writing for Developers

I started reading this book, a recommendation from my colleague Matt. The book could probably be titled ‘Blogging for Developers’, and it’s interesting to see someone writing such a book in 2025. I like the book, but I definitely have philosophical differences with it, in that it focusses on blogging as a way to go viral, sometimes neglecting the more personal uses of blogging (such as weeknotes). A good counterpoint occurs in Simon Willison’s piece on keeping a link blog.