AI image generation - John Walley

AI is everywhere at the moment. I've dabbled with ChatGPT and its peers, and had a brief play about with Midjourney last year.

I wanted to check back in on the state of the art in image generation, so I signed up for Midjourney's basic plan and spent a train journey into London experimenting. There are plenty of good models about, such as DALL·E 2 and Stable Diffusion, but I went with what I knew.

A big part of getting good results is the prompt you give the tool. The Midjourney documentation describes it like this:

A Prompt is a short text phrase that the Midjourney Bot interprets to produce an image. The Midjourney Bot breaks down the words and phrases in a prompt into smaller pieces, called tokens, that can be compared to its training data and then used to generate an image. A well-crafted prompt can help make unique and exciting images.
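To make the idea of tokens a bit more concrete, here is a rough Python sketch. It is not how Midjourney actually does it: I'm assuming a simple split into lowercase words, whereas real image models use learned subword tokenizers, and `naive_tokens` is just a name I made up for illustration. It only shows the kind of small pieces a prompt gets broken into.

```python
import re


def naive_tokens(prompt: str) -> list[str]:
    """Split a prompt into lowercase word-like pieces.

    Illustrative assumption only: this is a plain word split, not the
    tokenizer Midjourney (or any real image model) uses.
    """
    # Keep runs of letters and digits; commas and spaces act as separators.
    return re.findall(r"[a-z0-9]+", prompt.lower())


if __name__ == "__main__":
    prompt = "astronaut black cat in space pointing up, 8K, UHD"
    print(naive_tokens(prompt))
```

Even in this toy version you can see that every word and modifier ("8K", "UHD") becomes its own piece, which is presumably why tacking extra descriptors onto a prompt can nudge the result.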

It's something of a rabbit hole, so to get started quickly I based my prompts on examples from the community showcase.

I chose my two cats as subject matter and here are two of the more striking results (with prompts).

astronaut black cat in space pointing up, 8K, UHD, detailed, photo realistic, full body composition,

I really like this one. I mean, yes, it has given the cat a hand but otherwise it's quite striking. Actually, the hand thing is important because when I last used the tool it struggled to generate hands at all (and eyes if I remember correctly).

black and white short hair cat dressed as Elizabeth the first Tudor stylised oil painting

This one is in honour of a recent visit to the King Richard III Centre in Leicester.

All in all, colour me impressed. It can take some effort (there are at least ten substandard images for each one I've shown here), but just a few years ago I wouldn't have predicted we'd have access to tools this powerful.