Categories
GenAI programming

The potential of ChatGPT for programmers

I’ve been meaning to post for some time about my first experiences of programming with ChatGPT, back in January. Ethan Mollick often suggests that people should try doing their job with ChatGPT for at least 10 hours to get a feel for its potential. Playing with ChatGPT for a short time has converted me from an AI cynic to an enthusiast.

Simon Willison wrote about his experiences coding with ChatGPT, concluding that AI-enhanced development makes him more ambitious with his projects.

Shortly after I read that post, I had a silly question related to watching movies. I order my watchlist at Letterboxd by the average rating on the site. But I began to wonder whether this was a good way to watch movies. Did my taste actually correlate with the overall site? Or would I be better off finding a different way to order the watch list?

The obvious way to check this is by writing a bit of code to do the analysis, but that seemed like a chore. I decided to put a few prompts into ChatGPT to see whether that helped. Within two minutes, I had a working Python program. There was a little bit of playing around to get the right page element to scrape, but essentially ChatGPT wrote me a piece of code that could load up a CSV file, use data in the CSV file to download a webpage, grab an item from the page and then generate another CSV file with the output.

I started with a simple initial prompt and asked for a series of improvements.

Can you show me an example of how to scrape a webpage using python, please? I need to find the content of an element with an id of “tooltip display-rating”, which is online. I also want to set the user agent to that of a browser.

(I also asked for a random pause of between 1 and 2 minutes between each request to the website, to be polite. I’m not supposed to scrape Letterboxd, but I figured it was OK as this was for personal use and I am a paid member.)
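The script ChatGPT wrote was Python and isn't reproduced here. For comparison with the Java examples elsewhere on this blog, here is a rough Java sketch of the two "polite scraping" ideas mentioned above: a browser-style user agent and a random pause of one to two minutes. The user agent string and the URL are illustrative assumptions, and nothing is actually fetched.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

class PoliteFetch {

    // Assumption: an illustrative browser user agent string, not the one used in the original script.
    static final String BROWSER_USER_AGENT =
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

    // Builds a GET request that identifies itself as a browser. Nothing is sent here.
    static HttpRequest browserRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", BROWSER_USER_AGENT)
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    // Picks a random pause of between 60 and 120 seconds (inclusive).
    static Duration randomPause() {
        long seconds = ThreadLocalRandom.current().nextLong(60, 121);
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        // Hypothetical URL, used only to show the request being built.
        HttpRequest request = browserRequest("https://example.com/film/some-film/");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("User-Agent: " + request.headers().firstValue("User-Agent").orElse(""));
        System.out.println("Next request in " + randomPause().toSeconds() + " seconds");
        // A real run would send the request with java.net.http.HttpClient and then
        // call Thread.sleep(randomPause().toMillis()) before the next one.
    }
}
```

Keeping the request building separate from the sending makes it easy to check the headers before anything touches the site.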

This all went pretty well, and ChatGPT also talked me through installing Python on my new Mac. The prompts I used were hesitant at first because I didn’t really know how far this was going to go. ChatGPT was also there to talk me through some Python-specific errors.

When I run this script, I get an error: “ModuleNotFoundError: No module named ‘requests'” What do I need to do to import this module

I get a warning when I run this command: “NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the ‘ssl’ module is compiled with ‘LibreSSL 2.8.3’.” Is this something I need to fix? What should I do?

Before long I’d got a complete, working piece of code and checked my hypothesis. It turns out there’s not a strong enough correlation between my ratings and the site average to say anything either way.
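The post doesn't say which measure of correlation the script used. As a minimal sketch, assuming Pearson's correlation coefficient (a common default), the check could look like this in Java. The ratings in main are made up purely for illustration.

```java
class RatingCorrelation {

    // Pearson correlation coefficient between two samples of equal size.
    // Returns a value between -1 and 1, or NaN if either sample has no variation.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal size with at least 2 values");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double covariance = 0, varianceX = 0, varianceY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Made-up example data: my ratings and the site averages for the same films.
        double[] myRatings = {4.0, 3.5, 2.0, 5.0, 3.0};
        double[] siteAverages = {3.8, 3.9, 3.1, 4.2, 2.9};
        System.out.println("Pearson r = " + pearson(myRatings, siteAverages));
    }
}
```

A value near 1 would suggest the site average is a good guide to my taste, while a value near 0 would suggest it tells me very little.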

While the example itself is trivial and the output inconclusive, it showed me that it was very possible to write decent quality code very quickly. I rarely use Python, but ChatGPT provided useful assistance in an unfamiliar language. Writing this code from scratch, even in Java, even using Stack Overflow, would have taken more time than it was worth. As Simon Willison says:

AI-enhanced development doesn’t just make me more productive: it lowers my bar for when a project is worth investing time in at all. Which means I’m building all sorts of weird and interesting little things that previously I wouldn’t have invested the time in.

My immediate takeaway is that AI tooling has the potential to revolutionise programming. It’s not going to replace programmers; rather, it’s going to lower the threshold for a project to be viable and unlock a lot of work. Tim Harford made the same point recently, looking at the history of the spreadsheet. This is an exciting time, and I’m expecting to be very busy in the next few years. I’m also impressed at how effective a tutor ChatGPT is, breaking down its examples into straightforward steps.

It has taken me far longer to write this post than it did to produce the code.

Categories
GenAI SpringAI

Spring AI Image generation example

Spring AI’s 0.8.0-SNAPSHOT release includes support for Image Generation using Dall-E 2/3 or Stability. This was added on January 24th and has not yet been documented, but a video by Craig Walls describes how to use the new functionality.

I thought that an interesting example to try would be to combine ChatGPT with Dall-E. This way I can take a restricted range of parameters for an image (i.e. mood, animal, activity) and ask ChatGPT to expand this into a detailed prompt, which I can then use to generate an image. The idea here is to take user input for the prompt but to restrict what they can specify, maybe through some dropdowns. Another way of doing this would be to use ChatGPT to check freeform user input, but restricting the input seems simpler.

The example was pretty easy to put together. I used the org.springframework.ai.openai.OpenAiChatClient class to communicate with ChatGPT, followed by the org.springframework.ai.image.ImageClient class to generate the image using Dall-E 3. A simple Controller took some GET parameters and placed them into a prompt template:

I want to generate amusing images.
These images should feature an animal. The animal chosen is {animal}.
The animal in question should be {activity}.
The picture should make the user feel {mood}.

This template prompt could be changed to further restrict or specify the sort of image being produced through prompt engineering.
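To make the "restrict what they can specify" idea concrete, here is a plain-Java sketch with no Spring involved. It mimics the placeholder substitution that PromptTemplate performs and checks each parameter against an allow-list first. The allowed values are hypothetical stand-ins for dropdown options, and this is only one way the restriction might be done.

```java
import java.util.Map;
import java.util.Set;

class RestrictedPrompt {

    // The same template as above, with {name} placeholders.
    static final String TEMPLATE =
            "I want to generate amusing images.\n"
            + "These images should feature an animal. The animal chosen is {animal}.\n"
            + "The animal in question should be {activity}.\n"
            + "The picture should make the user feel {mood}.";

    // Hypothetical allow-lists, standing in for the options in each dropdown.
    static final Map<String, Set<String>> ALLOWED = Map.of(
            "animal", Set.of("alligator", "cat", "penguin"),
            "activity", Set.of("rollerblading", "baking", "surfing"),
            "mood", Set.of("joyful", "calm", "curious"));

    // Fills in the template, rejecting any missing value or value that is not on its allow-list.
    static String render(Map<String, String> params) {
        String result = TEMPLATE;
        for (Map.Entry<String, Set<String>> slot : ALLOWED.entrySet()) {
            String value = params.get(slot.getKey());
            if (value == null || !slot.getValue().contains(value)) {
                throw new IllegalArgumentException("Unsupported value for " + slot.getKey() + ": " + value);
            }
            result = result.replace("{" + slot.getKey() + "}", value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(render(Map.of(
                "animal", "alligator", "activity", "rollerblading", "mood", "joyful")));
    }
}
```

Because only allow-listed words ever reach the template, free text from the user never ends up in the prompt sent to ChatGPT.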

There’s a fair amount of Spring magic tying things together – in particular a @Configuration class that sets up the OpenAiImageClient, since auto-configuration is not yet available. The Controller method is as follows:

@GetMapping("safeimagegen")
public String restrictedImageGeneration(
        @RequestParam(name = "animal") String animal,
        @RequestParam(name = "activity") String activity,
        @RequestParam(name = "mood") String mood) {

    // Fill the template (held in the imagePrompt field) with the request parameters
    PromptTemplate promptTemplate = new PromptTemplate(imagePrompt);
    Message message = promptTemplate.createMessage(Map.of("animal", animal, "activity", activity, "mood", mood));

    Prompt prompt = new Prompt(List.of(message));

    // Ask ChatGPT to expand the short template into a detailed image description
    logger.info(prompt.toString());
    ChatResponse response = chatClient.call(prompt);
    // getOutput().getContent() returns just the reply text rather than the whole result object
    String generatedImagePrompt = response.getResult().getOutput().getContent();
    logger.info("AI responded: " + generatedImagePrompt);

    // Pass the generated description to Dall-E 3
    ImageOptions imageOptions = ImageOptionsBuilder.builder().withModel("dall-e-3")
            .build();

    // Named dallePrompt so that it does not shadow the imagePrompt template field
    ImagePrompt dallePrompt = new ImagePrompt(generatedImagePrompt, imageOptions);
    ImageResponse imageResponse = imageClient.call(dallePrompt);
    String imageUrl = imageResponse.getResult().getOutput().getUrl();
    return "redirect:" + imageUrl;
}

This is not a particularly sophisticated piece of code, but it does show how simple it is to get SpringAI examples working.

I submitted a request for a picture of an alligator rollerblading, and set the mood as “joyful”. ChatGPT then generated a detailed prompt:

The image features a cheerful green gator. He’s wearing a pair of shiny, multicolored rollerblades that sparkle as they catch the light. His eyes are wide with excitement, and his mouth is stretched in a wide, friendly grin, revealing his white teeth. He’s standing in a beautiful park with green trees and flowers in the background, and there’s a clear blue sky overhead. He’s waving at the viewer as if inviting them to join him in his rollerblading adventure, adding to the joyful and playful vibe of the image.

And then the browser was redirected to the image:

Categories
GenAI SpringAI

Retrieval-augmented generation using SpringAI

On Tuesday, after a long day working in Leeds, I came home and decided to play with SpringAI, trying to see if I could set up a retrieval-augmented generation example. It took me just over an hour to get something running.

The documentation for SpringAI feels a little shinier and more solid than that for LangChain4j. Both projects have similar aims, providing abstractions for working with common AI tools, and both are explicitly inspired by the LangChain project.

As with LangChain4j, there were issues caused by rapid changes in the project’s APIs. I started work with an example built against Azure OpenAI. It was simple enough to switch this to working against OpenAI, requiring just a change in Spring dependencies and a few properties – Spring magic did the rest. The main problem was updating the code from 0.2.0-SNAPSHOT to 0.8.0-SNAPSHOT (I’d not realised how old the example I’d started with was).

The actual code itself is, once again, very simple. When the application receives a query, it uses the SpringAI org.springframework.ai.reader.JsonReader class to load a document – in this case one about bikes from the original project – and divides it into chunks. Each of these chunks is run through an org.springframework.ai.embedding.EmbeddingClient, which produces a vector describing that chunk, and these are placed in an org.springframework.ai.vectorstore.SimpleVectorStore. Once I’d found the updated classes, the APIs were all very straightforward to work with.

An incoming query is then compared against the document database to find likely matches – these are then compiled into a SystemQuery template, which contains a natural-language prompt explaining the LLM’s role in this application (You’re assisting with questions about products in a bicycle catalog). The SystemQuery is sent by the application alongside the specific UserQuery, which contains the user’s submitted question.
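The retrieval step can be sketched in plain Java without any SpringAI classes. This is a toy stand-in, not how SimpleVectorStore is implemented: it assumes cosine similarity as the measure, uses hand-written three-number "embeddings" where a real application would call the embedding model, and the chunk texts and the second line of the system prompt are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TinyVectorStore {

    // One stored chunk: its text and its embedding vector.
    static final class Entry {
        final String text;
        final double[] embedding;

        Entry(String text, double[] embedding) {
            this.text = text;
            this.embedding = embedding;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String text, double[] embedding) {
        entries.add(new Entry(text, embedding));
    }

    // Cosine similarity of two vectors of the same length: 1.0 means they point the same way.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the text of the k chunks most similar to the query vector, best match first.
    List<String> topK(double[] query, int k) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingDouble((Entry e) -> cosine(query, e.embedding)).reversed());
        List<String> result = new ArrayList<>();
        for (Entry e : sorted.subList(0, Math.min(k, sorted.size()))) {
            result.add(e.text);
        }
        return result;
    }

    // Builds a system prompt from the role description plus the retrieved chunks.
    static String systemPrompt(List<String> chunks) {
        return "You're assisting with questions about products in a bicycle catalog.\n"
                + "Use the following information to answer:\n\n" // hypothetical wording
                + String.join("\n---\n", chunks);
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore();
        // Made-up chunks with made-up vectors; real embeddings have hundreds of dimensions.
        store.add("Sizing chart for a hybrid bike", new double[] {1.0, 0.0, 0.0});
        store.add("Brake maintenance instructions", new double[] {0.0, 1.0, 0.0});
        store.add("Warranty terms", new double[] {0.0, 0.0, 1.0});

        double[] queryEmbedding = {0.9, 0.1, 0.0}; // pretend embedding of a question about sizes
        System.out.println(systemPrompt(store.topK(queryEmbedding, 2)));
    }
}
```

The system prompt built here would be sent alongside the user's question, which is also why each query carries so many tokens.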

The responses from the GPT-4 model combined the user query with the document, producing obviously relevant responses in natural language. For example:

The SwiftRide Hybrid’s largest size (L) is suitable for riders with a height of 175 – 186 cm (5’9″ – 6’1″). If the person is taller than 6’1″, this bike may not be the best fit.

Playing around with this was not cheap – the RAG method sends a lot of data to OpenAI, and was burning through $0.10-$0.16 worth of tokens in each query. I also managed to hit my account’s rate limit of 10000 per minute playing with this. I’m not sure how feasible using the OpenAI model in production would be.
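To put the per-query figure in perspective, here is a small back-of-the-envelope sketch. The $0.10 and $0.16 come from the numbers above; the volume of 1,000 queries a day is a hypothetical, chosen only to show how quickly the cost scales.

```java
class QueryCost {

    // Total cost in cents; multiplyExact guards against silent overflow.
    static long totalCents(long queries, long centsPerQuery) {
        return Math.multiplyExact(queries, centsPerQuery);
    }

    // Formats a non-negative number of cents as dollars, for example 1234 becomes "$12.34".
    static String dollars(long cents) {
        long remainder = cents % 100;
        return "$" + (cents / 100) + "." + (remainder < 10 ? "0" : "") + remainder;
    }

    public static void main(String[] args) {
        long queriesPerDay = 1000; // hypothetical volume
        long lowCents = totalCents(queriesPerDay, 10);  // $0.10 per query
        long highCents = totalCents(queriesPerDay, 16); // $0.16 per query
        // prints "1000 queries a day: $100.00 to $160.00"
        System.out.println(queriesPerDay + " queries a day: " + dollars(lowCents) + " to " + dollars(highCents));
    }
}
```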

Notes and follow-ups

  • I need to put some of the code into github to share.
  • I’m fascinated by how part of the application is a natural-language prompt to tell ChatGPT how to respond. Programming LLMs is spooky, very close to asking a person to pretend they’re doing a role.
  • In production, this sort of application would require a lot of protection – some of which would use natural language instructions, but there are also models specifically for this role.
  • The obvious improvement here is to use a local model and see how effective that is.
Categories
GenAI LangChain4j

LangChain4j and local models

A colleague told me about Ollama, which allows you to get LLMs working on a local machine. I was so excited about this that I downloaded the orca-mini model. Due to terrible hotel wifi I used my mobile internet and blew out the limit on that. Oops.

Anyway, it is very easy to get Ollama working. Just download and install the software, then run ollama run llama2 (or ollama run orca-mini for the model used in the examples here). It has a simple REST interface:

curl -X POST http://localhost:11434/api/generate -d '{
    "model": "orca-mini",
    "prompt": "tell me a joke"
}'
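The same call can be made from Java using only the JDK's built-in HTTP client. This is a minimal sketch that assumes Ollama is running on its default port with the orca-mini model already downloaded. The JSON body is assembled by hand to keep the example dependency-free, which is fine for a fixed prompt but not something to rely on for arbitrary input.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class OllamaGenerate {

    // Escapes backslashes and double quotes only; a JSON library would be safer for real input.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same JSON body as the curl example above.
    static String jsonBody(String model, String prompt) {
        return "{\"model\":\"" + escape(model) + "\",\"prompt\":\"" + escape(prompt) + "\"}";
    }

    // Builds the POST request without sending it.
    static HttpRequest generateRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = generateRequest("http://localhost:11434", "orca-mini", "tell me a joke");
        try {
            HttpResponse<String> response =
                    HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            // By default the reply is streamed as a series of JSON lines, printed here as-is.
            System.out.println(response.body());
        } catch (IOException e) {
            System.out.println("Could not reach Ollama, is it running? " + e);
        }
    }
}
```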

It was easy enough to get this working with LangChain4j, although the APIs were not quite the same as for the OpenAI models.


import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

public class App
{
    public static void main( String[] args )
    {
        // The model must already be available in the local Ollama install
        String modelName = "orca-mini";
        // Address of the local Ollama server
        String localUrl = "http://localhost:11434/";

        ChatLanguageModel model =
                OllamaChatModel.builder().baseUrl(localUrl).modelName(modelName).build();

        // generate(String) sends a single user message and returns the reply text
        String answer = model.generate("tell me a joke about space");

        System.out.println("Reply\n" + answer);
    }
}

While these local models are less powerful than OpenAI’s, they seem fairly decent on a first examination. They are also a much cheaper way to work with an LLM, and I am going to use this to set up a simple RAG (retrieval-augmented generation) example in LangChain4j.

Categories
GenAI LangChain4j

First steps with LangChain4j

I found myself with some free time this week when train problems forced me to travel from Manchester to Sheffield via Leeds. I used that delay to set up a basic ‘Hello World’ example using LangChain4j. This proved a touch harder than expected.

The example on https://langchain4j.github.io/langchain4j/docs/get-started/ used a generate method on ChatLanguageModel that didn’t work for the latest versions of the libraries (0.26.1 at the time of writing).

Not a helpful example…

I soon cobbled together some working code using the latest version of the langchain4j-core and langchain4j libraries as well as a langchain4j-open-ai dependency. I originally used a couple of hello world queries, which produced boring responses, so I decided to ask OpenAI to tell me a joke.

package com.orbific;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.output.Response;

public class App
{
    public static void main( String[] args )
    {
        // ApiKeys is a separate class holding the OpenAI API key (not shown here)
        ChatLanguageModel model = OpenAiChatModel.builder()
                .apiKey(ApiKeys.OPENAI_API_KEY)
                .build();

        String message = "Tell me a joke.";
        // The question is sent as a user message; the model's reply comes back as an AiMessage
        UserMessage mine = UserMessage.from(message);
        Response<AiMessage> answer = model.generate(mine);
        System.out.println("Reply\n" + answer.content().text());
    }
}

The response made me smile:

Why don’t scientists trust atoms?

Because they make up everything!

What’s weird is that I kept getting the same joke, even when setting a higher temperature in the model or rephrasing the query. But requesting a joke about cats produced a pun about cheetahs. And asking repeatedly for jokes about underwater creatures brings back different responses. There’s obviously something here that I’m missing.
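One thing that may be relevant is what temperature actually does. The sketch below is a toy, assuming the usual softmax-with-temperature description of sampling, and the scores are made up. It shows that when one option is scored far above the rest, a modestly higher temperature still leaves it overwhelmingly likely. It doesn't prove that this is why the same joke kept coming back.

```java
class TemperatureSketch {

    // Converts raw scores into probabilities, dividing each score by the temperature first.
    // Higher temperatures flatten the distribution; lower ones sharpen it.
    static double[] softmax(double[] scores, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be positive");
        }
        // Subtract the largest scaled score before exponentiating, for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double score : scores) {
            max = Math.max(max, score / temperature);
        }
        double[] probabilities = new double[scores.length];
        double sum = 0;
        for (int i = 0; i < scores.length; i++) {
            probabilities[i] = Math.exp(scores[i] / temperature - max);
            sum += probabilities[i];
        }
        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }

    public static void main(String[] args) {
        // Made-up scores: one strongly favoured option and two also-rans.
        double[] scores = {8.0, 2.0, 1.0};
        for (double temperature : new double[] {0.7, 1.0, 1.5}) {
            double favourite = softmax(scores, temperature)[0];
            System.out.println("temperature " + temperature + ": favourite chosen with probability " + favourite);
        }
    }
}
```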

I set up a paid ChatGPT account, but that did not seem to grant me access to the API, and I also had to top up some credits. I’m not entirely sure whether I needed the paid account, so I will look into that before the subscription renews.

There’s an interesting question as to whether it would have been faster for me to read the documentation rather than flail around for a solution, but that’s the whole point of a quickstart, right? Although my flailing wasn’t helped much by tiredness and a dodgy mobile internet connection.

I have a genuine excitement about getting this working. It’s not much, but it opens up some exciting possibilities. Now to go and read some documentation.