It’s a day of experimenting with prompts and artists’ names. The plan is to create a deck of Tarot cards, with characters and scenes that share a particular style throughout.
I ask Bing to describe a beautiful tall Queen dressed in white, in detail and Bing kindly does. I change some of the details and add Michelangelo as the artist. The result is very impressive:
I like the second image so I ask for variations of that:
I like the second one again and ask for an upscaled version:
I like these images so much, I find it hard to believe they were created with just a few words from an old man. That is the image of a beautiful woman; asked for and received.
I know by now that Midjourney can produce extraordinary images, but I do not know if it will generate consistently similar styles over time. I try a prompt with a description of the Fool card, giving the same settings as before and specifying Michelangelo. The result is not even close to the prior images:
There’s not much wrong with them but they are too different to be images for the same deck of cards. I have to find out what’s going on here.
I go and study to find out why, and there are so many reasons that I would bore you to tears if I told you. Midjourney is entirely different from the web-based AIs that I have used up to this point, which work primarily using keywords, as a search engine does. Midjourney takes extremely specific instructions.
I study a bit more and realize that not only are the words important, but the order of the words is as important as the words themselves. I put more detail into the description of the Fool than into that of the Queen, and this is why there is such a big difference.
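For readers who like to script things, here is a minimal Python sketch of that idea: keep the word order and the style settings fixed, give each card the same amount of detail, and change only the description. The function name `build_prompt`, the style terms and the extra details are hypothetical placeholders, not the prompts used for these images. Nothing here talks to Midjourney; it only builds the text to paste in.

```python
def build_prompt(subject, details, style_terms, artist):
    """Join prompt parts in a fixed order: subject, details, style, artist."""
    parts = [subject.strip()]
    parts.extend(d.strip() for d in details)
    parts.extend(s.strip() for s in style_terms)
    parts.append(f"by {artist.strip()}")
    # Drop empty pieces so stray commas never reach the prompt.
    return ", ".join(p for p in parts if p)


# Hypothetical shared settings for the whole deck (assumptions, not the real ones).
DECK_STYLE = ["fresco painting", "soft light"]
DECK_ARTIST = "Michelangelo"

# Each card gets one subject and the same number of extra details.
queen = build_prompt(
    "a beautiful tall Queen dressed in white", ["standing"], DECK_STYLE, DECK_ARTIST
)
fool = build_prompt(
    "the Fool facing away from us", ["walking"], DECK_STYLE, DECK_ARTIST
)

print(queen)
print(fool)
```

Whether matching the order and the level of detail is enough to make Midjourney keep a consistent style is something I have only observed, not proven, so treat this as a starting point.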
I try a new prompt, then more still, changing the order of the words and changing the artist names. I get some nice results:
The final image is the closest I’ve got to the Fool facing away from us, which I specified over and over. It’s not perfect but I’ll take what I can get today.
The servers got really slow after this one so I left things there. Progress is slow but Rome wasn’t built in a day.
I am glad that I said the next post would be 3-5 days because it has been a fun-filled little while.
There is more to the AI whispering than I first thought. I do not mean it is more difficult; there are just things that need to be considered that could not be anticipated. I will get to them as I update you on my endeavours.
Angela wants a Tarot Deck for her customers, and I said I would design one. As always, I asked the Bing AI how I should go about it and it sent me to Midjourney.com. If you want to sell the images you make, this is the only option; you cannot sell DALL-E images, as it is in the terms that you agree to. It is not clear for Stable Diffusion, but I think those images remain Public Domain. Midjourney it is then.
To use Midjourney you need Discord, so I downloaded and installed it. I have meant to get Discord for a long time, but I am old and things slip my mind. The trial period only gives you 25 minutes of GPU time, and because I did not want to waste any of that time, I decided to research as much as I could beforehand.
Researching and typing were all I did for most of the day. I have lists of every emotion that exists. I know every type of art style, all the lighting terms, all the renderers, all the aspect ratios and I have seen and studied hundreds of prompts.
Finally ready to try my hand at Midjourney, I asked Bing for a detailed description of a city street. It gave me this: “The kaleidoscope of shimmering lights flicker in the distance as the starry sky sweeps over the city that never sleeps. Hazy clouds envelope the moon so it was in its own realm of perpetual darkness. The wet, desolate streets of the city rested in silence as the starry black sky wept over it.”
I pasted it into Midjourney and it said: “Due to extreme demand we can’t provide a free trial right now. Please /subscribe or try again tomorrow.”
Marvellous, isn’t it? With some quiet muttering under my breath, I signed out and left it for the day.
Not wanting to face another “due to extreme demand” message, I bought a subscription to Midjourney the next day. The Bing description of a city street produced a rather good image without any input from me. Here it is:
Remembering the Tarot Deck, I asked Bing for a description of the Fool Card, because I like fools. I took part of that description and added some details myself. I won’t give you the prompts for these because they are for someone else, but I started by adding the artist Monet to the description and then one of my favorites, Turner. The results are below.
Monet;
They are quite representative of Monet but a bit boring so I tried Turner, whose landscapes I like very much;
I think that if you want a deck of cards designed, you could do worse than include Turner as an influence but I’m not through with experimenting and I really want to see what including Michelangelo will do. That will be in another blog.
For now I feel like playing around, so I expand on the fool reference and go for a Court Jester. This is my favorite image so far:
I’ll be opening an Instagram account soon to post more images like this and I’ll let you know when it’s live.
When using Stable Diffusion with basic prompts, my images were far inferior to those of DALL-E but looking at the images of other users it is obvious that better images can be made.
Clicking on the “Wanna create better prompt?” section brings me to the Prompt Search Engine. Unfortunately, I have no idea what I ought to be searching for. I know very well how to use a search engine, but I do not know how to search for prompts.
I have an irritable chat with the Bing Search Engine which tells me to use the prompt search engine to search for prompts. This just annoys me because I do not know what I am searching for. Eventually I ask the question properly and ask for an example of a search term and it tells me to search for “A beautiful sunset over the ocean”
It is not a database of just prompts, though I had the right to assume it was, given the name. It is a database of images, each with the prompt that generated it attached.
Still annoyed, I search for “a beautiful sunset over the ocean” and get this:
with the prompt underneath:
A beautiful photograph of a detailed ornate steampunk airship flying over a majestic mediterranian port city filled with tiny glowing lanterns with a view of the ocean at sunset, by David Noren, jordan grimmer, tyler edlin, featured on cgsociety
My irritation dissipates at once. This is the kind of thing I expected from the start; dreamy, fantasy images that I want to keep looking at.
I’m getting the idea that giving the AI the names of artists to mimic brings better results, but once I change some of the details and see the images, I know there is more to it than just that.
I try: A beautiful photograph of a detailed witch on a broomstick flying over a majestic mountain range with a village below, with tiny glowing lanterns with a view of the ocean in the distance, at sunset, by David Noren, Jordan Grimmer, Tyler Edlin, featured on cgsociety and get these:
I don’t know who the named artists are but I don’t think that they paint Witches. The landscapes are awesome but the witches are an abomination.
Rather than heading over to CGSociety to find an artist or two with an interest in Witches, I just change the term Witch to Golden Eagle and get these:
Those images are awesome. I am getting the hang of this, slowly. Some artists are obviously better for landscapes, but it seems that if I want a detailed character, I’ll have to add to the list an artist who paints that particular character well.
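Swapping one term while leaving the rest of the prompt alone can be sketched in a few lines of Python. This is only an illustration: the template is the witch prompt from above with the subject pulled out into a `{subject}` placeholder, and the exact wording of the eagle prompt it produces is an assumption, since only the witch version is written out in full in this post.

```python
# The prompt from this post, with the subject replaced by a placeholder.
TEMPLATE = (
    "A beautiful photograph of a detailed {subject} flying over a majestic "
    "mountain range with a village below, with tiny glowing lanterns with a "
    "view of the ocean in the distance, at sunset, by David Noren, "
    "Jordan Grimmer, Tyler Edlin, featured on cgsociety"
)

# Subjects to try; the landscape and artist list stay the same for each.
subjects = ["witch on a broomstick", "Golden Eagle"]

# One finished prompt per subject, ready to paste into the image generator.
prompts = {subject: TEMPLATE.format(subject=subject) for subject in subjects}

for subject, prompt in prompts.items():
    print(f"{subject}:\n{prompt}\n")
```

Keeping the landscape and artist list identical makes it easier to see that the subject alone is what changes the quality of the character.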
Next, I searched “Pixar Characters” and saw a lovely image of a Chinese woman with this prompt:
an epic fantasy comic book style full body portrait painting of a beautiful Chinese woman, long fire hair, cute, character design by Mark Ryden and Pixar and Hayao Miyazaki, unreal 5, DAZ, hyperrealistic, octane render, cosplay, RPG portrait, dynamic lighting, intricate detail, summer vibrancy, cinematic
I really liked it and only changed a few words. I have a soft spot for Koreans because I love K-pop, so I put:
an epic fantasy comic book style full body portrait painting of a beautiful South Korean singer, long fire hair, cute, character design by Mark Ryden and Pixar and Hayao Miyazaki, unreal 5, hyper realistic, octane render, cosplay, RPG portrait, dynamic lighting, intricate detail, summer vibrancy, cinematic
I liked two of the results more than the original:
These are my favorites so far. I really like them, and I’m wondering if I can improve them by adding a background like the one for the Witches, so I merge the terms into this:
an epic fantasy comic book style full body portrait painting of a beautiful South Korean singer, long fire hair, cute, character design by Mark Ryden and Pixar and Hayao Miyazaki, unreal 5, standing on a majestic mountain range with a village below, with tiny glowing lanterns with a view of the ocean in the distance, at sunset, by David Noren, Jordan Grimmer, Tyler Edlin, featured on cgsociety
And get something a bit weird but also pretty cool:
I think I need to be more careful with my grammar. I think the weird-looking half man is very likely Hayao Miyazaki, from the prompt. The way it’s worded can be interpreted as me asking for him and Unreal to be standing on a majestic mountain range. I’ll have to be more careful in the future.
I like these images as much as I like those from DALL-E, but the DALL-E image quality can be achieved with fewer prompt terms. Before I used the Prompt database, I really thought that Microsoft had the edge, and perhaps it does for the casual user, but I’ll be spending more time playing with Stable Diffusion than DALL-E for the moment.
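One way to be more careful is to keep the scene glued to the subject and push all the credits to the end, so that “standing on a majestic mountain range” cannot be read as belonging to an artist. Here is a rough Python sketch of that reordering. The helper name `assemble` and the way the pieces are split are my own assumptions, and I have not checked whether the reworded prompt actually removes the half man.

```python
def assemble(subject, scene, extras, credits):
    """Build a prompt with the scene attached directly to the subject."""
    # Subject and scene form one clause, so "standing on ..." describes the subject.
    clause = f"{subject} {scene}".strip()
    # Descriptive extras come next; artist credits always go last.
    return ", ".join([clause, *extras, *credits])


prompt = assemble(
    subject=(
        "an epic fantasy comic book style full body portrait painting "
        "of a beautiful South Korean singer"
    ),
    scene="standing on a majestic mountain range with a village below",
    extras=["long fire hair", "cute", "at sunset"],
    credits=[
        "character design by Mark Ryden and Pixar and Hayao Miyazaki",
        "by David Noren, Jordan Grimmer, Tyler Edlin",
        "featured on cgsociety",
    ],
)

print(prompt)
```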
My next post will be about Microsoft Designer and perhaps Canva, both of which use AI. I will be using them to design the website oldmanversusai.blog. As it’s a new domain and completely empty apart from two posts, it will be 3 to 5 days before the next one goes up.
I have never used MS Designer or Canva before so it’ll be a learning curve for sure and I’ll have plenty to write about.
I’m on a journey to become an AI Whisperer, from frustration mostly. I know AI is the future, so I’ve tried it out but had little success up until now. I know it’s very clever, and I’m not overly dim, so why am I getting such poor results from it?
It turns out that AI whispering is important.
I’ve tried to use the new Bing AI search engine, but I find it quite annoying that I don’t get the answers I want, and I know it’s because I’m not asking the questions properly. Hence my foray into the world of AI Whispering, which is simply a phrase meaning the proper use of specific terms that AI can understand, so that it answers your questions or provides you with what you’ve asked for. It sounds easy but it takes some time to find out how to do that. Search engines don’t bring me the results that I want, so I know that the only way I’ll learn is by doing. Here is my doing:
A girl, makoto shinkai style, anime, japan animation’s background, 200mm camera lens, wide angle, night view, city, building, edge, high detail
It’s not bad but… there’s no girl and it’s not what I expected and doesn’t really justify the hype that we keep hearing about AI.
I went to Microsoft’s DALL-E which you can find in the sidebar of the Edge Browser (if you don’t have it, go and get it, it is the best browser by far) or here: DALL·E 2 (openai.com) and pasted in the same term as above and got this:
I got three other images too but this is the one I liked best.
Pretty nice I thought but using the same term in DALL-E:
The AIs have different ideas about what an Eagle Knight is but I really like the latter.
Next term:
An airport with modern landscape architectural design for industrialpunk, water in the middle, dramatic lighting and composition, octane render, unreal engine 5
DALL-E:
Kinda cool but not as good as DALL-E:
This one brings some of my favourite results, first Stable Diffusion;
beautiful Pikachu, pencil art, ultra realistic
I liked this one until I used the same term in DALL-E and got these, I’ll post two as I like them so much;
On the next one I used a prompt that I thought would yield better results. It’s more descriptive and quite detailed, but what I got wasn’t very impressive. First from Stable Diffusion:
A scribe, copying an exotic spell from one spell book into another. Perhaps a cat or a small magical animal is looking on. The library is dark, with an ornate and expensive candle stick lighting the room. He is surrounded by books – both on high shelves and piled on the desk around them, some of the books laying opened on the table.
DALL-E;
Neither of them is much to look at, despite the amount of detail that was given. There is more to the whispering than descriptive detail.
This next one gets good results from both AIs. First Stable Diffusion:
A colorful panda playing in forest, oil painting by Leonardo da Vinci, highly detailed
When it’s given an artist style to follow, it gives a better result.
DALL-E gave me four of the same Panda doing different things, all of the same high quality:
The next one is a bit of fun for those of us who enjoy Anime, first Stable Diffusion;
Uzumaki Naruto vs Goku, anime fight, ultra realistic, 8k
At this point I’m thinking that DALL-E is a much better option for everyone because the quality of the images is so much higher, but this isn’t the case. There is a reason why there are jobs with high salaries for AI Whisperers. I will get to it, but enjoy the images for now:
DALL-E;
Stable Diffusion;
A monochrome forest of ebony trees, octane rendered
DALL-E;
For those who are not familiar with rendering, Octane is a render engine that makes images from 3D models, and Unreal is a game engine, used to make games for PlayStation and Xbox, which also creates (renders) images from 3D models.
I had high hopes for this next one as Studio Ghibli was used in the prompt but I didn’t find either of them particularly impressive. First Stable Diffusion;
A contemporary house in the woods, anime, oil painting, high resolution, ghibli inspired, 4k
DALL-E;
I used a few more prompts and got the same results each time: Stable Diffusion gave me (mostly) unimpressive results and DALL-E gave great results. I then scrolled through the Stable Diffusion site to the “Wanna create better prompt?” section. I did wanna create a better prompt, so I clicked on it and found, after a chat with the Bing search engine, that I could indeed create a better prompt.
Because of the images, this is already a long post so I’ll continue this in the next post which will be up in the next day or so. I will be giving the prompts which yield good results along with the images themselves.
I will then be getting into Microsoft Designer, which seems very impressive and the new Canva AI, both of which can be used for free.
I’m also looking forward to seeing what kind of literature a good prompt can yield. I’ll be a whisperer in no time!
I am very much enjoying this and look forward to the next one. Join me then.