Whispering at AI

I’m on a journey to become an AI Whisperer, mostly out of frustration. I know AI is the future, so I’ve tried it out, but I’ve had little success up until now. I know it’s very clever, and I’m not overly dim, so why am I getting such poor results from it?

It turns out that AI whispering is important.

I’ve tried the new Bing AI search engine, but I find it annoying that I don’t get the answers I want, and I know it’s because I’m not asking the questions properly. Hence my foray into the world of AI Whispering, which simply means using specific terms that the AI can understand, so that it answers your questions or gives you what you asked for. It sounds easy, but it takes some time to find out how to do that. Search engines don’t bring me the results I want, so I know the only way I’ll learn is by doing. Here is my doing:

I started with the Stable Diffusion image generator at Stable Diffusion Online (stablediffusionweb.com), using basic terms that I found online to create images:

A girl, makoto shinkai style, anime, japan animation’s background, 200mm camera lens, wide angle, night view, city, building, edge, high detail

It’s not bad but… there’s no girl, it’s not what I expected, and it doesn’t really justify the hype that we keep hearing about AI.

I went to DALL-E, which is made by OpenAI and which you can find in the sidebar of Microsoft’s Edge browser (if you don’t have it, go and get it, it is the best browser by far) or here: DALL·E 2 (openai.com). I pasted in the same term as above and got this:

I got three other images too, but this is the one I liked best.

The next term was:

Anthropomorphic majestic eagle knight, portrait, finely detailed armor, cinematic lighting, intricate filigree metal design, 4k, unreal engine, octane

Stable Diffusion:

Pretty nice, I thought, but here is the same term in DALL-E:

The AIs have different ideas about what an Eagle Knight is, but I really like the latter.

Next term:

An airport with modern landscape architectural design for industrialpunk, water in the middle, dramatic lighting and composition, octane render, unreal engine 5 

Stable Diffusion:

Kinda cool but not as good as DALL-E:

This one brings some of my favourite results. First, Stable Diffusion:

beautiful Pikachu, pencil art, ultra realistic 

I liked this one until I used the same term in DALL-E and got these. I’ll post two as I like them so much:

On the next one I used a prompt that I thought would yield better results. It’s more descriptive and quite detailed, but what I got wasn’t very impressive. First, from Stable Diffusion:

A scribe, copying an exotic spell from one spell book into another. Perhaps a cat or a small magical animal is looking on. The library is dark, with an ornate and expensive candle stick lighting the room. He is surrounded by books – both on high shelves and piled on the desk around them, some of the books laying opened on the table. 

DALL-E:

Neither of them is much to look at, despite the amount of detail that was given. There is more to the whispering than descriptive detail.

This next one gets good results from both AIs. First, Stable Diffusion:

A colorful panda playing in forest, oil painting by Leonardo da Vinci, highly detailed 

When the AI is given an artist’s style to follow, it gives a better result.

DALL-E gave me four images of the same panda doing different things, all of the same high quality:
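Looking back at the prompts so far, most of them seem to share a shape: a subject, then a style, then a few extra terms, all separated by commas. Here is a minimal sketch of that shape in Python. The function name `build_prompt` and its parameters are my own invention for illustration; neither Stable Diffusion nor DALL-E requires prompts to be built this way.

```python
def build_prompt(subject, style=None, modifiers=()):
    """Join a subject, an optional style, and extra terms into one comma-separated prompt.

    Hypothetical helper for illustration only; it just builds a string.
    """
    parts = [subject.strip()]
    if style:
        parts.append(style.strip())
    # Skip any empty terms so we never end up with ", ," in the prompt.
    parts.extend(m.strip() for m in modifiers if m.strip())
    return ", ".join(parts)


# Rebuild the panda prompt from its three pieces.
prompt = build_prompt(
    "A colorful panda playing in forest",
    style="oil painting by Leonardo da Vinci",
    modifiers=["highly detailed"],
)
print(prompt)
```

Keeping the pieces separate like this makes it easy to swap one style for another while leaving the subject alone, which is handy when comparing what each AI does with the same idea.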

The next one is a bit of fun for those of us who enjoy anime. First, Stable Diffusion:

Uzumaki Naruto vs Goku, anime fight, ultra realistic, 8k 

At this point I’m thinking that DALL-E is a much better option for everyone because the quality of the images is so much higher, but this isn’t the case. There is a reason why there are jobs with high salaries for AI Whisperers. I will get to it, but enjoy the images for now:

DALL-E:

Stable Diffusion:

A monochrome forest of ebony trees, octane rendered 

DALL-E:

For those who are not familiar with rendering, Octane is a render engine that makes images from 3D models, and Unreal is a game engine, used to make games for PlayStation and Xbox, which also creates (renders) images from 3D models.

I had high hopes for this next one as Studio Ghibli was used in the prompt, but I didn’t find either result particularly impressive. First, Stable Diffusion:

A contemporary house in the woods, anime, oil painting, high resolution, ghibli inspired, 4k 

DALL-E:

I used a few more prompts and got the same results each time: Stable Diffusion gave me (mostly) unimpressive results and DALL-E gave great results. I then scrolled through the Stable Diffusion site to the “Wanna create better prompt?” section. I did wanna create a better prompt, so I clicked on it and found, after a chat with the Bing search engine, that I could indeed create a better prompt.

Because of the images, this is already a long post, so I’ll continue in the next post, which will be up in the next day or so. I will be giving the prompts which yield good results along with the images themselves.

I will then be getting into Microsoft Designer, which seems very impressive, and the new Canva AI, both of which can be used for free.

I’m also looking forward to seeing what kind of literature a good prompt can yield. I’ll be a whisperer in no time!

I am very much enjoying this and look forward to the next one. Join me then.

Cheers,

Old Man

