weeknotes

Weeknotes: 2025-28

I’ve been working this week on MongoDB replica sets and I’m very impressed with their resilience, particularly the way the driver acts as an intelligent client to handle failover.
As part of an initiative at work, I started playing with Amazon Q, initially asking it to generate some basic arcade games. My first impression was positive: the simple examples it produced were good, although I’m aware of the challenge in getting precise results from a coding agent. It’s something I need to spend more time on.

Links

An excellent post from Sean Goedecke, AI Interpretability is further along than I thought, talks about internals of language models – it was a useful reminder of why telling a chatbot that it’s an expert works.
AI-assisted coding for teams that can’t get away with vibes (via Simon Willison) was a useful primer on large-scale coding with GenAI. A useful rule here was ‘what helps the human helps the AI’, including linting, CI/CD, documentation and clearly defined features. There were some good examples around prompting, and how AIs are used to build the prompts to code from. The most interesting bit, and something I’d like to go back to, is the claim that the DRY principle is less useful when working with LLMs. This is a living document being maintained by nilenso, which I will have to keep an eye on.
Could HTTP 402 be the Future of the Web was a good speculative article about the need for micropayments and how charging AI crawlers could lead to that.
Some excellent words of wisdom from Everything is Prioritization: “If you’re remote and still feel frazzled, you’re not doing remote wrong. You’re just prioritizing availability over impact.” The article talks about the need to avoid tempting distractions: “The best teams aren’t full of geniuses. They’re full of people who keep their focus and say ‘no’ without having a breakdown”.
I’ve long disliked the cargo cult metaphor, and this is deconstructed in The origin of the cargo cult metaphor, which points out a lot of the errors and miscomprehension in the popular understanding of actual cargo cults. “The cargo cult metaphor is best avoided”.
Simon Willison’s Identify, solve, verify is a short piece on the role of the programmer in the era of GenAI. “The more time I spend using LLMs for code, the less I worry about my career”.
The Elegance Question: What Makes Some Systems Just Work? set out some simple principles for building ‘elegant’ systems. This was thought-provoking, particularly around the question of why so many systems go against these principles.

Books

No time for reading this week – and I’ve been distracted by a non-tech book.

weeknotes

Weeknotes: 2025-27

I’m going to try writing a few weeknotes to see how they feel. I need some way to consolidate everything I’m reading and thinking about, but longer blog posts are not coming together. These weeknotes will help me track my technical interests – and hopefully help me find interesting blog posts when I need to refer back to them.
Last Sunday, I had an interesting conversation with Laurence where I found myself asking whether agile is too hard for most teams. Laurence pointed out that the core of agile is simple, but it does place a lot of demand on developers. I think the widely perceived failures of agile need much more consideration.
In another discussion with Laurence, I realised how vital GenAI skills will be for technical managers – there is a huge change in software development coming and staying current will require understanding those skills – not least to be able to support and unblock those who use them most.
Something I’ve not blogged about over the past few weeks is the decline of Stack Overflow. It’s been interesting to see how the references for learning technical skills have changed over the years.
One of the things I like most about working in a large consultancy is the number of talks and activities going on. An ‘unadvent of code’ group has started to look at the Advent of Code puzzles from 2018. This has got me playing with Go as a coding activity, which I’m enjoying.

Reading

Writing for Developers

I started reading this book, a recommendation from my colleague Matt. The book could probably be titled ‘Blogging for Developers’, and it’s interesting to see someone writing such a book in 2025. I like the book, but I definitely have philosophical differences with it, in that it focusses on blogging as a way to go viral, sometimes neglecting the more personal uses of blogging (such as weeknotes). A good counterpoint occurs in Simon Willison’s piece on keeping a link blog.

GenAI

Two notes on vibe coding

From Ashley Willis:

“[A mentor] pointed out that debugging AI generated code is a lot like onboarding into a legacy codebase, making sense of decisions you didn’t make, finding where things break, and learning to trust (or rewrite) what’s already there. That’s the kind of work a lot of developers end up doing anyway”

From Sean Goedecke:

Being good at debugging is more useful than being good at writing code – you only write a piece of code once, but you may end up debugging it hundreds of times. As programmers use more AI-written code, debugging may end up being the only remaining programming skill.

GenAI

Notes on RAG – Part 2

Continuing my research on RAG (part 1 here)

The week before last I worked on a simple example of RAG using Spring. This involved a lot of yak shaving, in part because Spring had updated their package structures since I last worked on my RAG demo. I also wanted to set up a local vector DB without using Docker, finally settling on MariaDB. In the end I had a simple example that took a CSV, inserted the rows as embeddings into a vector database and could run simple queries against it. I still need to tidy this up and upload it to GitHub.
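
The CSV-to-query flow described above can be sketched in a few lines. This is a minimal illustration in Python, not the Spring and MariaDB code itself: the `embed` function is a toy hashed bag-of-words stand-in for a real embedding model, the in-memory list stands in for the vector database, and the sample CSV and every name here are hypothetical.

```python
import csv
import hashlib
import io
import math

DIMENSIONS = 64  # toy vector size; real embedding models use far more dimensions


def embed(text):
    """Toy embedding: hash each word into a bucket, then normalise to unit length.

    A stand-in for a real embedding model, used only to show the flow.
    """
    vector = [0.0] * DIMENSIONS
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIMENSIONS
        vector[bucket] += 1.0
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector] if length else vector


def load_rows(csv_text):
    """Index step: embed every CSV row and keep (row, vector) pairs in memory."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row, embed(" ".join(row.values()))) for row in reader]


def search(store, query, top_k=2):
    """Query step: rank stored rows by cosine similarity to the query."""
    query_vector = embed(query)
    scored = [
        (sum(a * b for a, b in zip(query_vector, vector)), row)
        for row, vector in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:top_k]]


# Hypothetical data, standing in for the real CSV.
SAMPLE_CSV = """title,summary
Replica sets,how mongodb handles failover between nodes
Vector search,finding similar documents using embeddings
Serverless Java,start-up costs of the jvm on lambda
"""

store = load_rows(SAMPLE_CSV)
results = search(store, "similar documents embeddings", top_k=1)
print(results[0]["title"])
```

Swapping `embed` for a call to a real model and the list for a database table changes the plumbing but not the shape: embed on insert, embed the query, rank by similarity.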

Interesting Links

I need to look into using vector databases for recommendation engines.
“If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves.” – link
Simon Willison gave a great talk about embeddings. This introduced the concept, showed how to do ‘vibes-based search’ against SQLite using llm, and talked about 2D visualisations.
- “Being able to spin up this kind of ultra-specific search engine in a few hours is exactly the kind of trick that excites me about having embeddings as a tool in my toolbox.”
- “A fascinating thing about RAG is that it has so many different knobs that you can tweak. You can try different distance functions, different embedding models, different prompting strategies and different LLMs. There’s a lot of scope for experimentation here.”
Adding semantic search to datasette
Lovely visualisation of 40 million Hacker News posts. This uses Uniform Manifold Approximation and Projection (UMAP) to reduce the vectors to a 2D geographic map (among other things). Some interesting applications of a massive dataset.
The llm tool can be used for image search using CLIP
This leads to someone searching images of faucets by image and phrase (what taps best represent the idea of ‘Bond villain?’)
According to ChatGPT, people have played with using vector databases to analyse recipes. There is a paper on this, ‘Learning Cross-modal Embeddings for Cooking Recipes and Food Images‘, but I can’t find any details on applications with it, experimental or otherwise.
Interesting discussion of search in RAG (look for the RAG section)
Text Embedding Models Contain Bias. Here’s Why That Matters – interesting Google paper from 2018
Using a vector database, SkyCLIP, and Leaflet to create a searchable aerial photograph
“This is why Retrieval-Augmented Generation (RAG) is not going anywhere. RAG is basically the practice of telling the LLM what it needs to know and then immediately asking it for that information back in condensed form. LLMs are great at it, which is why RAG is so popular.” link
Spring AI documentation on RAG
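
One of the ‘knobs’ in the Simon Willison quote above is the distance function. As a small sketch of why that choice matters, the Python below ranks two made-up two-dimensional vectors against a query using cosine similarity and Euclidean distance; the vectors and labels are invented for illustration, and real embeddings have far more dimensions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Higher is more similar; ignores vector length, compares direction only."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Lower is more similar; sensitive to vector length as well as direction."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 1.0]
documents = {
    "same direction, much longer": [10.0, 10.0],
    "nearby, slightly off-axis": [1.0, 0.5],
}

best_by_cosine = max(documents, key=lambda name: cosine_similarity(query, documents[name]))
best_by_euclidean = min(documents, key=lambda name: euclidean_distance(query, documents[name]))

print("cosine picks:", best_by_cosine)
print("euclidean picks:", best_by_euclidean)
```

The two measures disagree here: cosine prefers the vector pointing the same way as the query, while Euclidean distance prefers the one that is physically closest. When vectors are normalised to unit length the two rankings agree, which is one reason many pipelines normalise before storing.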

GenAI

Notes on Retrieval-Augmented Generation (part 1)

Post author By admin
Post date April 7, 2025

I’ve been thinking recently about Retrieval-Augmented Generation (RAG) and vector databases, and I wanted to gather some notes towards a talk.

The question

Most of the applications I’m seeing for Generative AI seem to involve RAG. But I feel that the vector database is doing most of the interesting work here – and, in a lot of cases, an LLM-generated response is not the best ‘view’ for the data that is returned. I want to dig a little more into vector databases, how they work, and what can be done with them.

Panda Smith wrote, “If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves”; I want to learn more about this search part of RAG.

Previous posts

Playing with embeddings was a post I wrote in September ’24 looking at using vector databases for ‘vibes-based search’
GenAI is already useful for historians discussed an article about a historian using GenAI to find diary entries relevant to their research
Retrieval-augmented generation using SpringAI was a Spring AI RAG demo I built for a previous talk

Notes from Wikipedia

Pulling out some notes from the Wikipedia article:

“[RAG] modifies interactions with a large language model so that the model responds to user queries with reference to a specified set of documents, using this information to supplement information from its pre-existing training data. This allows LLMs to use domain-specific and/or updated information.”
There are two phases – information retrieval and response generation
RAG was first proposed in the paper ‘Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks’ in 2020
The process involves the following stages:
- Indexing – the documents to be searched are stored – usually by converting the data into vector representations
- Retrieval – given a user query, a document retriever finds the most relevant documents for the query
- Augmentation – the documents are put in a prompt for an LLM
- Generation – the LLM generates a response based upon the prompt
The ‘chunking’ of the documents (how they are divided up into pieces to be stored) affects how good the responses are.
Risks of RAG
- While RAG reduces hallucinations in the responses, it cannot eliminate them.
- There is a danger of losing important context in the chunking phase.
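
The chunking and augmentation stages above can be sketched as follows. This is a minimal Python illustration under stated assumptions: chunks are fixed-size word windows with a small overlap (one common way to reduce the risk of losing context at chunk boundaries), the prompt wording is made up, and the retrieval and generation stages are left out because they depend on a vector store and an LLM.

```python
def chunk_words(text, size=8, overlap=2):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reached the end of the text
    return chunks


def build_prompt(question, retrieved_chunks):
    """Augmentation step: place the retrieved chunks ahead of the question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


document = "one two three four five six seven eight nine ten eleven twelve"
chunks = chunk_words(document, size=8, overlap=2)
# A real pipeline would pick chunks with a retriever; here the first chunk is used directly.
prompt = build_prompt("What comes after six?", chunks[:1])

print(chunks)
print(prompt)
```

With these settings the twelve-word document becomes two chunks, and the words ‘seven eight’ appear in both, so a sentence that straddles the boundary is not cut off from its neighbours.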

Interesting links

Twitter thread by Jo Kristian Bergum on ‘The rise and fall of the vector database infrastructure category’
The Best Way to Use Text Embeddings Portably is With Parquet and Polars – fascinating discussion of vector databases, using Magic: The Gathering cards as a dataset
Embeddings: What they are and why they matter Text of a talk by Simon Willison: “Embeddings are a really neat trick that often come wrapped in a pile of intimidating jargon”

java

Does Java have a future?

I’ve been working with Java for 25 years. After a shaky time in the noughties, the platform has thrived, and a huge number of applications have been built on it. But I’m starting to wonder about Java’s future, given three things: new features in the language, the JVM itself, and the rise of generative AI.

When Java started it was an exciting prospect – an object-oriented language that could run on multiple platforms, but without the complexities of C++. During the 00s it had looked like newer languages would take its place, but with Spring Boot it’s become a popular language on the cloud.

It’s been a long time since Java was cool, but CTOs definitely like it and there is a massive sunk cost invested in the ecosystem. While I’ve worked with other languages, I expected to be working primarily with Java for the remainder of my career. Some people have mocked Java as the new COBOL, but I see a steady supply of work as a positive thing.

One of Java’s strengths is its simplicity. A couple of years back, I published an article for my previous consultancy about Why Java Still Matters. I concluded that Java’s strength was its readability compared to other languages, and that helped with collaboration: “[Java’s] lack of sophistication forces developers to produce more straightforward code [and] we write code for other developers, not the machine.”

Since 2017, Java has committed to twice-yearly releases, which has led to a large number of new features in the language. I have argued (mostly in jest) that Java peaked with version 7, before the introduction of functional programming features. The new features in Java are undoubtedly expressive, but at the cost of some consistency. For a long time, Java applications tended to look very similar between workplaces. This will be less true as the language becomes richer.

If Java stops being a simple language it becomes less compelling as a choice. Indeed Java’s backward compatibility has produced some strange versions of modern features. Why not pick a language that was designed from scratch to include these things?

Java also brings with it the deadweight of the JVM. While Java is fast once it’s up-and-running, there are significant start-up costs that are particularly punishing on serverless. Yes, there are workarounds but these bring their own problems. GraalVM’s incredibly fast start-up comes at the cost of slower build times and significant differences between production software and development versions. Over the past few months, my workplace Java user group has been discussing the problems of Java on serverless and the outcomes are frustrating. It’s hard not to feel like we’re making excuses for the platform we work on.

Despite both of the above issues, a lot of money was invested in Java, and companies were unlikely to switch. But, with the rise in generative AI, it’s easier than ever for developers to get working with unfamiliar languages. And it’s also going to get easier to convert existing applications to new platforms. The first tools to do this are imperfect, but they will improve.

Java has always been a clunky language to work with. The boilerplate made hacking on new ideas unrewarding (although tools such as JHipster helped massively). GenAI supports me in setting things up in a new language, and it’s a great tutor, able to hone its examples to support the specific thing I’m trying to build.

I’ve had a lot of fun working with Java over the years, but I’m starting to feel that, long-term, my future lies with other languages. It’s time to explore some alternatives.

programming

The Importance of Blogging for Programmers

Post author By admin
Post date January 3, 2025

I started this weblog in November 2014 and have published 104 posts – a little under once a month. It has a very small readership. I still find it useful for two reasons. First, there are the reference posts that are useful documentation (for example recipes for GIS or a checklist of scheduling issues). Then there are the posts that help develop my thinking about things.

This latter type of post is a form of rubber ducking, and is useful even if nobody reads it. As EM Forster asked¹, “How do I know what I think until I see what I say?” Writing about a subject is a useful form of deliberate practice that helps develop insights and skills.

The problem is that these posts take a lot of work to write, and I’ve abandoned dozens over the years – some of which would have been helpful for tracking my development on topics. I’d love to look back at how my thoughts on Generative AI have changed.

Over the next year, I want to write more about programming and my experience of it. But an important first stage of this is reducing the effort required to publish something useful.

I’ve been inspired by a recent post on this topic by Hamel Husain, Building an Audience Through Technical Writing: Strategies and Mistakes. There’s a lot of good advice in this post, but what stood out to me immediately was the idea of a voice-to-content pipeline. While I’ve used AI to transcribe written notes, I’d not actually made direct use of speech-to-text. Dictating the first part of this post has sped things up for me significantly.

Husain also discusses using AI models to help with generating the text, and I certainly want to explore creating prompts to help me with editing and proofreading (something Simon Willison discussed here).

An obvious question is why write public blog posts rather than keeping a private list? First, I think that preparing thoughts for public consumption produces better summaries. Also, I think there’s value in having a public archive where others can respond to your thoughts. This might not happen often, but it is good to make space for this.

One of the biggest challenges I face with blogging is that I want every post to be as perfectly written as those by people like Charity Majors or Joel Spolsky. But I do think there is a space for smaller, more personal posts and link posts – some of which might eventually provide a basis for deeper essays.

There’s only a tiny audience for what I write here, but the most important part of this audience is me. Over the coming year I plan to post more. GenAI is a revolutionary technology for software development, and I want to follow this closely. I also want to think more about my experiences as a software developer and improving as a programmer².

¹ Although he did not apparently originate the quote.
² I also want to think about the difference between being a programmer and a software developer. One seems to be more at the level of individual functions, and I think I’m better at the latter than the former.

GenAI

What I believe about GenAI (and what I’m doing about it)

Post author By admin
Post date December 19, 2024

I woke up on Sunday morning with the following question: what do I believe about GenAI – and what should I be doing in response? Based on what I’ve been reading, here is what I currently think:

GenAI is a revolution – cynics have dismissed GenAI as ‘fancy autocomplete’, but that ignores the magic of LLMs – both their ability to produce plausible text and their performance with previously difficult and imprecise tasks.
GenAI is also overhyped – a lot of the problem with GenAI is that some companies are over-promising. LLMs are not going to lead to AGI and are not going to replace skilled people in most situations.
The main benefit of LLMs is efficiency – LLMs are very good at some previously complicated tasks, and this will make those tasks much cheaper. I’m expecting this to produce a boom in programming as previously-expensive projects become feasible – similar to how Excel has produced a boom in accountancy.
There is a correction coming – there’s a huge amount of money invested in GenAI and I think it will be some time before this pays off. I’m expecting to see a crash come before long-term growth. But the same thing happened with the 2000 dotcom crash.
RAG is boring – using RAG to find relevant data and interpret it rarely feels like a good user experience. In most cases, a decent search engine is faster and more practical.
There are exciting surprises coming – I suspect that the large-scale models from people like OpenAI have peaked in their effectiveness, but smaller-scale models promise some interesting applications.

I am going to spend some time over Christmas coding with GenAI tools. I’m already sold on ChatGPT as a tool for teaching new technology and thinking through debugging, but there are many more tools out there.

I’m also going to do some personal research on how people are using Llama and other small open-source models. There must be more to GenAI than coding assistants and RAG.

NaNoGenMo

Thoughts on NaNoGenMo 2024

I spent about 25 hours in November producing a novel via an LLM for NaNoGenMo 2024. It was an interesting experiment, although the book produced was not particularly engaging. There’s a flatness to LLM-generated prose which I didn’t overcome, despite the potential of the oral history format. I do think that generated novels can be compelling, even moving, so I will have another try next year.

Some things I learned from this:

I hadn’t realised how long and detailed prompts can be. My initial ones did not make full use of the context. Using gpt-4o-mini was cheap enough that I could essentially pass it prompts containing much of the work produced so far.
For drafting prompts, the ChatGPT web interface was more effective, because it maintains the full conversation as a state. Once I used this for experimenting with prompts, things moved much faster.
Evaluating the output is incredibly hard here. In a matter of minutes I can create a text that takes hours to read. Most of my reviews were done by random sampling, and I didn’t have time to properly examine the text’s wider structure.
It was also tricky to get consistent layouts from the LLM. Using JSON formats helped somewhat here, but at the cost of reducing the size of LLM responses.
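
The random-sampling review mentioned above can be sketched as below. This is a hypothetical helper in Python, not the code used for the project: it pulls a few fixed-length passages from random positions in a long generated text, with an optional seed so that a sample can be reproduced.

```python
import random


def sample_passages(text, count=3, passage_words=40, seed=None):
    """Pick `count` random passages of `passage_words` words each from a long text.

    Passages are returned in reading order and may overlap.
    """
    words = text.split()
    if len(words) <= passage_words:
        return [" ".join(words)]
    rng = random.Random(seed)
    last_start = len(words) - passage_words
    starts = sorted(rng.sample(range(last_start + 1), min(count, last_start + 1)))
    return [" ".join(words[s:s + passage_words]) for s in starts]


# Stand-in for a generated novel: 1,000 numbered placeholder words.
novel = " ".join(f"word{i}" for i in range(1000))
passages = sample_passages(novel, count=3, passage_words=40, seed=1)

for passage in passages:
    print(passage[:60], "...")
```

Sampling like this catches sentence-level style problems cheaply, but it cannot reveal structural issues such as repeated scenes, which need a read across the whole text.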

22 books were completed this year and I’m looking forward to reviewing them. I have an idea for a different approach next year and will do some research in the meantime (starting with Lillian-Yvonne Bertram and Nick Montfort’s Output Anthology).

NaNoGenMo

NaNoGenMo Updates

I’m now halfway through NaNoGenMo 2024. I’ve been working on my project every day this month and wanted to share some initial thoughts.

Having a software project to tinker with is fun, particularly with NaNoGenMo’s time limit to keep me focussed.
My tinkering has been distracted by working on refactorings rather than the GenAI-specific code. Adding design patterns into the codebase has been a useful opportunity to think about refactoring, and I should be playing with coding projects like this more often.
Working with the LLM fills me with awe. These things can produce coherent text far faster than I can read it.
The output is readable without much work. I asked ChatGPT4 to produce a Fitzgerald pastiche (Gatsby vs Kong – about kaiju threatening a golden age) and it’s an interesting text to scan through.
The question of testing is particularly tricky here. I’m producing novels which would take about 3-4 hours to read. I’ve been randomly sampling passages, picking out style issues, but structural ones/weird repetitions on a larger scale will be harder to fix.
My overall plan is to produce a novel made of oral histories. Getting these to sound varied in tone is a challenge, and one I will dig into over the last two weeks. My pre-NaNoGenMo experiments suggested that LLMs were good at first person accounts – but getting an enjoyable novel out of them is difficult.
I’m relying on ChatGPT’s structured JSON outputs to get consistent formatting, as this gives me a little more control.
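
As a rough sketch of how JSON replies help with consistent formatting, the Python below parses one reply, checks it has the expected fields, and renders it with a fixed layout. The field names (`speaker`, `account`) and the sample reply are hypothetical, and no API call is made; the reply is a hard-coded string.

```python
import json

REQUIRED_FIELDS = ("speaker", "account")  # hypothetical schema for one oral history


def parse_account(raw_response):
    """Parse one LLM reply and check it has the fields the layout needs.

    Raises ValueError so the caller can decide whether to retry the prompt.
    """
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError as error:
        raise ValueError(f"reply was not valid JSON: {error}") from error
    if not isinstance(data, dict):
        raise ValueError("reply was JSON but not an object")
    missing = [field for field in REQUIRED_FIELDS if not isinstance(data.get(field), str)]
    if missing:
        raise ValueError(f"reply is missing text fields: {missing}")
    return data


def render(account):
    """Lay out a parsed account the same way every time."""
    return f"{account['speaker'].upper()}\n\n{account['account'].strip()}\n"


# Hard-coded stand-in for a model reply.
reply = '{"speaker": "Harbour pilot", "account": "I saw it rise out of the bay. "}'
page = render(parse_account(reply))
print(page)
```

Because the layout lives in `render` rather than in the prompt, the model only has to fill in fields, and a malformed reply fails loudly instead of silently breaking the book’s formatting.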

Technically, I’ve completed NaNoGenMo as my project has used a fairly basic technique to generate 50,000 words of Godzilla vs Kong. But, ultimately, the question is whether ChatGPT can produce an enjoyable novel. I thought previous entrant All the Minutes was a genuinely exciting piece of literature. That is the bar I want to aim at.
