



I’ve been working with AI-generated imagery for some time now. With nearly 80,000 images behind me, I can safely say I have a reasonably good idea of how things work and a realistic picture of the nature of the ‘wolf at the door’.
It all came to a head for me some time back when an Adobe executive announced, seriously, that AI was the new digital camera. My thoughts in response were not charitable, to say the least.
Prior to this was my exposure to Stephen Fry reading a letter from Nick Cave about ChatGPT. No pun intended, but it struck a chord with the view that was already developing in my picture of the world of the ‘Artificially Intelligent’.
Outside of the many issues that come up around the use of AI, one stands alone amongst all others, and it’s simply this: the use of AI algorithms across a range of problem-solving and creative endeavours stupefies and arrests the development and refinement of the related skill sets, and of the creative process as a whole. ‘Use it or lose it’ was a phrase that rang in my ears throughout my childhood and teen years. I could argue this out until the cows come home, and in the end it would be a long, rambling, albeit coherent post that I would be subjecting the audience to. But know, dear reader, that what follows is not the result of breaking text down into tokens, feeding it to the billions of dogs in the LLM compound and awaiting the result of the feeding frenzy.
“AI Photography” is a misnomer, to say the least. Reducing the ‘photographic eye’ to the role of ‘content producer’ devolves the whole process across literature, music, art, design and photography to mere content production, and sadly, spearheading this narrative are organisations like Spotify and Adobe.
No camera, no photography. It’s that simple. Despite the existence of hyper-real image-creation algorithms, faux photographic images just don’t cut it, simply because they are neither representations of the vista before the photographer nor the result of discriminating intent. As objects of curiosity, however, they are another debate.


The image above (a photograph of objects, Fig 1) was the reference used to make the generated image on the right (Fig 2).



The above images show pixels from the original photograph (Fig 1) at 12,800% magnification. Each pixel has a co-ordinate and a colour value (RGB, which can equally be expressed as HSL). This information is written to storage and used to reconstruct the image when exported for printing or loaded for display. Digitally, ‘images’ only exist as data, and it is this file data, not a picture, that is converted to ‘tokens’ for use in LLMs. For example, a 12 MB file holds 12 million bytes, or 96 million bits (8 bits to the byte), and it is this stream of data that is tokenised when the file is ‘scraped’ for use in an LLM (Large Language Model). The job of the neural network (as I understand it) is to calculate the relationships between tokens within a very complex hierarchy of mathematical dimensions and to reassemble or regenerate a pixel grid, ‘guided’ by the tokens generated from the prompt.


The above are pixel-level crops from the generated image (Fig 2).
At the token level no image exists, so it can’t be said that images are stolen or appropriated in any literal form, because, as previously mentioned, in the digital realm only the data used to reconstruct an ‘image’ exists.
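To make the ‘image as data’ point concrete, here is a minimal sketch in Python using the Pillow library; the filename and the sample co-ordinate are hypothetical stand-ins for Fig 1.

from PIL import Image
import colorsys

# Open the reference photograph (hypothetical filename standing in for Fig 1)
# and normalise it to RGB.
img = Image.open("fig1_reference.jpg").convert("RGB")
width, height = img.size

# Every pixel is just a co-ordinate plus a colour value.
x, y = 100, 200                       # an arbitrary sample co-ordinate
r, g, b = img.getpixel((x, y))
print(f"pixel ({x}, {y}) -> R={r} G={g} B={b}")

# The same colour expressed as HSL (colorsys works in 0-1 floats and
# returns the components in hue, lightness, saturation order).
h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
print(f"as HSL: H={h * 360:.0f} S={s:.0%} L={l:.0%}")

# The whole 'image' is nothing more than width x height of these values:
# three bytes per pixel once flattened to raw data.
raw = img.tobytes()
print(f"{width} x {height} pixels = {len(raw):,} bytes of data")

Run against any photograph, this prints nothing but numbers, which is exactly the point: at the file level there is no picture, only data.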
It’s been a while, a whole year nearly since posting to this site. A lot has happened some of which is worth talking about. It’s been the year of generative image making and ChatGPT. Myths and legends abound and for me it was a year to do some myth busting.
ChatGPT was my entry point. Through extensive dialogue with educators who had concerns about student usage, and the supposed advantages thereof, I found myself doing some system testing to see the extent of the application of generative text and to better understand the process. Leaving aside the standard “write me an essay about…”, I went straight for the throat and entered questions directly from HSC exam papers (relevant only to teachers in NSW, Australia). What I discovered was that ChatGPT struggled to contextualise the nature of the question and the response. Most exam or essay questions are not questions at all but instructions: questions generally start with what, when, where, why, who or how, not with verbs such as discuss, investigate, analyse, compare, describe, assess, clarify, evaluate, examine, identify or outline. The responses were consistently very average; if you were to scale them, most would fall into the ‘C’ range. So: no advantage to be had here. This then led to ‘why does this happen’, and to understand that, I realised I had to look at ‘how’ it all happens. That led to looking at decoding and encoding text.
Machines, so it appears, do not ‘read’ text as we do. Whilst our eyes and brain decode and encode letter shapes and their relationships, machine learning involves encoding text something like this:
‘\x49\x74\x27\x73\x20\x62\x65\x65\x6e\x20\x61\x20\x77\x68\x69\x6c\x65\x2c\x20\x61\x20\x77\x68\x6f\x6c\x65\x20\x79\x65\x61\x72\x20\x6e\x65\x61\x72\x6c\x79\x20\x73\x69\x6e\x63\x65\x20\x70\x6f\x73\x74\x69\x6e\x67\x20\x74\x6f\x20\x74\x68\x69\x73\x20\x73\x69\x74\x65\x2e\x20\x41\x20\x6c\x6f\x74\x20\x68\x61\x73\x20\x68\x61\x70\x70\x65\x6e\x65\x64\x20\x73\x6f\x6d\x65\x20\x6f\x66\x20\x77\x68\x69\x63\x68\x20\x69\x73\x20\x77\x6f\x72\x74\x68\x20\x74\x61\x6c\x6b\x69\x6e\x67\x20\x61\x62\x6f\x75\x74\x2e\x20\x49\x74\x27\x73\x20\x62\x65\x65\x6e\x20\x74\x68\x65\x20\x79\x65\x61\x72\x20\x6f\x66\x20\x67\x65\x6e\x65\x72\x61\x74\x69\x76\x65\x20\x69\x6d\x61\x67\x65\x20\x6d\x61\x6b\x69\x6e\x67\x20\x61\x6e\x64\x20\x43\x68\x61\x74\x47\x50\x54\x2e\x20\x4d\x79\x74\x68\x73\x20\x61\x6e\x64\x20\x6c\x65\x67\x65\x6e\x64\x73\x20\x61\x62\x6f\x75\x6e\x64\x20\x61\x6e\x64\x20\x66\x6f\x72\x20\x6d\x65\x20\x69\x74\x20\x77\x61\x73\x20\x61\x20\x79\x65\x61\x72\x20\x74\x6f\x20\x64\x6f\x20\x73\x6f\x6d\x65\x20\x6d\x79\x74\x68\x20\x62\x75\x73\x74\x69\x6e\x67\x2e\x20’
The above is the paragraph beginning ‘It’s been a while…’ encoded in UTF-8 and shown as hex escape codes.
Encoded in UTF-32 it looks like this: u+00000049u+00000074u+00000027u+00000073u+00000020u+00000062u+00000065u+00000065u+0000006eu+00000020u+00000061u+00000020u+00000077u+00000068u+00000069u+0000006cu+00000065u+0000002cu+00000020u+00000061u+00000020u+00000077u+00000068u+0000006fu+0000006cu+00000065u+00000020u+00000079u+00000065u+00000061u+00000072u+00000020u+0000006eu+00000065u+00000061u+00000072u+0000006cu+00000079u+00000020u+00000073u+00000069u+0000006eu+00000063u+00000065u+00000020u+00000070u+0000006fu+00000073u+00000074u+00000069u+0000006eu+00000067u+00000020u+00000074u+0000006fu+00000020u+00000074u+00000068u+00000069u+00000073u+00000020u+00000073u+00000069u+00000074u+00000065u+0000002eu+00000020u+00000041u+00000020u+0000006cu+0000006fu+00000074u+00000020u+00000068u+00000061u+00000073u+00000020u+00000068u+00000061u+00000070u+00000070u+00000065u+0000006eu+00000065u+00000064u+00000020u+00000073u+0000006fu+0000006du+00000065u+00000020u+0000006fu+00000066u+00000020u+00000077u+00000068u+00000069u+00000063u+00000068u+00000020u+00000069u+00000073u+00000020u+00000077u+0000006fu+00000072u+00000074u+00000068u+00000020u+00000074u+00000061u+0000006cu+0000006bu+00000069u+0000006eu+00000067u+00000020u+00000061u+00000062u+0000006fu+00000075u+00000074u+0000002eu+00000020u+00000049u+00000074u+00000027u+00000073u+00000020u+00000062u+00000065u+00000065u+00…………etc, etc
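Both renderings can be reproduced with a few lines of Python; a minimal sketch, assuming the opening sentence of that paragraph as input.

# Reproduce the two encodings shown above.
text = "It's been a while, a whole year nearly since posting to this site."

# UTF-8: each character becomes one or more bytes, written as \xNN hex escapes.
utf8_hex = "".join(f"\\x{b:02x}" for b in text.encode("utf-8"))
print(utf8_hex)   # \x49\x74\x27\x73\x20\x62\x65\x65\x6e ...

# UTF-32 style: every character becomes one fixed-width 32-bit code point.
utf32 = "".join(f"u+{ord(c):08x}" for c in text)
print(utf32)      # u+00000049u+00000074u+00000027 ...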
So when a prompt is entered into ChatGPT it is first encoded so that the machine can read it. The response is likewise generated in machine-readable form and then decoded back into text. How does it work?
It is primarily a predictive model: it predicts sequences based on what it has learned from ‘other encodings’, because encodings are all the neural network ever reads. When this is understood, a lot of the ‘myth-understandings’ about machine learning dissolve to some degree. The better the quality of the language structure of the texts a neural network is trained on, the better the likelihood of a cohesive, albeit somewhat standardised, response (bearing in mind the encoding and decoding sequence at either end).
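To make that encode-predict-decode loop concrete, here is a rough sketch using the tiktoken library (cl100k_base being the tokeniser associated with recent ChatGPT-era models); the sample prompt is the sentence from earlier, and the model itself only ever sees the integer IDs.

import tiktoken

# The tokeniser used by recent ChatGPT-era models.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "It's been a while, a whole year nearly since posting to this site."
ids = enc.encode(prompt)
print(ids)              # a list of integers: no letters, no words

# The network's whole job: given these IDs, assign a probability to every
# possible next ID; a likely ID is sampled, appended, and the loop repeats.
# Decoding turns the chosen IDs back into readable text.
print(enc.decode(ids))  # the original sentence, reconstructed from IDs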
to be continued @ STAGESIX