Images Generator

A nice way to use AI

Jun 24, 2023

People enjoying a glass of wine together, happy character, warm light, captured with Sony Lens 20mm Kodak Vision3 200D --ar 16:

AI is being talked about everywhere, by some with excitement and by others with dread, imagining a future where Skynet or the Matrix will rule the Earth. In both cases the AI everyone is talking about is a textual AI, like ChatGPT or Bard. You write a phrase (a ‘prompt’) and the system gives you the answer: a post for your blog, a screenplay for a new movie, a lawyer’s defense arguments.

There are also other AIs, like Midjourney or Stable Diffusion: text-to-image models that create images when the user inputs a prompt. As with textual AIs, the more accurate your prompt, the more accurate the answer and the more beautiful the image will be.

The image you see above is the answer to this question:

Travel around vinery

As you can see, it is a really simple sentence, banal I would say. It was one of my first experiments with Midjourney, but in my opinion it’s a nice picture.

So, the prompt is what you write when interacting with an AI, whether textual, such as ChatGPT or Bard, or visual, such as Midjourney or DALL-E.

A new profession is emerging: prompt engineer. It's like the early IBM mainframes of the last century, when you could interact with the machine only through punch cards. Punching the cards was a job, yes. My first boss did it.

This image is also the answer to a simple question, or rather, a simple prompt:

Wine tasting on Mars surface

But what happens if more complex prompts are constructed?

In the case of the image above, the prompt used is far more articulate:

A woman and a man sit on a table with food, outdoors on top of a hill, woman is speaking and the man listen smiling. a boy and a girl are playing near their parents. back the man there is a little home, in the background there is another hill with an italian country, in the valley there're vineyards. On the table there is a bottle of white wine and a bottle of red wine. man and woman have glass of wine in their hands. in the sky only one little cloud. captured with sony lens 20mm kodak vision3 200D

Depending on how you ask the questions, you get different answers, which still seems rather ... human.

Midjourney works through Discord, a platform created in 2015 and widely used, especially by gamers, to chat and exchange files while playing. You can read more about it here.

With practice, experience, and several experiments, you can get very realistic images, but prompt engineering is still an empirical science, a bit like SEO: it's about understanding how the software works by entering inputs and studying how the outputs change. For those who like to play with computer science, this is a lot of fun. The image above, for example, was obtained from this prompt:

A woman and a man sit on a table with food, outdoors on top of a hill, woman is speaking and the man listen smiling. a boy and a girl are playing near their parents. back the man there is a little home, in the background there is another hill with an italian country, in the valley there're vineyards. On the table there is a bottle of white wine and a bottle of red wine. man and woman have glass of wine in their hands. in the sky only one little cloud. captured with sony lens 20mm kodak vision3 200D.

The same prompt, with the addition of the 'hyperrealistic' parameter, produced this image:

The AI probably interprets the wording of the request in a particular way; I don’t know why the images it produces all look so 1950s.
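The kind of experiment described above, changing one element of a prompt at a time and comparing the results, can be organized with a small script. What follows is only a minimal sketch in Python: it does not talk to Midjourney at all, it just prints prompt variants that you can paste into Discord by hand. The function names, the shortened base scene, and the idea of appending 'hyperrealistic' after a comma are my assumptions for illustration, not a description of how Midjourney expects prompts to be written.

```python
from itertools import product

# Base scene, shortened from the prompt used in this post.
BASE_SCENE = (
    "A woman and a man sit on a table with food, "
    "outdoors on top of a hill"
)

# Hypothetical lists of elements to try; edit them freely.
CAMERAS = [
    "captured with sony lens 20mm kodak vision3 200D",
    "captured with Sony a7R IV camera, Meike 85mm F1.8 lens",
]
STYLES = ["", "hyperrealistic"]  # empty string = no extra style keyword


def build_prompt(scene, camera, style=""):
    """Join the non-empty parts of a prompt with commas."""
    parts = [scene.strip().rstrip("."), camera.strip(), style.strip()]
    return ", ".join(part for part in parts if part)


def prompt_variants(scene, cameras, styles):
    """Return one prompt for every camera/style combination."""
    return [
        build_prompt(scene, camera, style)
        for camera, style in product(cameras, styles)
    ]


if __name__ == "__main__":
    variants = prompt_variants(BASE_SCENE, CAMERAS, STYLES)
    for number, prompt in enumerate(variants, start=1):
        print(f"{number}. {prompt}")
```

Keeping the scene fixed and changing only the camera or the style keyword makes it easier to see which element is responsible for a difference between two images.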

Here, instead, are a couple of examples of images that could be mistaken, at first glance and perhaps even at second glance, for real photographs, and used, for example, on a winery's website. The prompt for this one is:

Create a realistic image of a woman sitting on a wineyard table in autumn drinking a glass of a red wine. Use a Hasselblad camera with a 85mm lens at F 1. 2 aperture setting to blur the background and isolate the subject. On the table should be a bottle of wine and a plate of chees.The wineyard should have colorful autumn leaves and vines in the background, with soft sunlight falling on the subject’s face. Use a warm and inviting lighting effect to create a cozy and inviting image. Use the Midjourney v5 with photorealism mode turned on to capture the woman’s natural beauty and grace.

For the one below, however, I used the following formulation:

A vibrant and ultra - realistic stock photo portraying a confident woman dressed in stylish smart- casual journey attire, standing in a cellar space bathed in soft, natural light. The woman, who is the focal point of the composition, directly engages the viewer with her genuine, warm smile, and double thumbs - up, exuding positivity and success. The woman has a glass of red wine in her hand. The Nikon D850 camera and Nikkor 70 - 200mm f/ 2. 8 lens capture her radiant expression and dynamic pose in exquisite detail, utilizing an aperture of f/ 4, ISO 400, and a shutter speed of 1/ 200 sec. Behind her one sommelier stands pouring wine in a glass. The surrounding cellar space is full of shelf with bottles of wine. With its clear subject, natural lighting, and positive energy, this high - resolution image is perfectly suited for use in a wide variety of wine marketing contexts, encapsulating a sense of collective achievement and satisfaction

And for the one below, the prompt used is much simpler than the previous ones, but the result is equally realistic:

A sommelier girl on the beach serving drink to people with surf table captured with Sony a7R IV camera, Meike 85mm F1. 8 lens

As you can see, these are still images that can be used as stock images for a website or even an advertisement. The next step is to have the AI create similar images and then build a story, even if it is just a strip of four or five frames.

Compared with textual AIs, visual ones can give better results and are more comfortable to use, because their errors are on the whole minor or can even be passed off as artistic license.

Contrary to rumors, I believe that this innovation will not wipe out current professions, and thus will not put tens of thousands of people out of work, but will rather prompt many to change or adapt their current job. In any case, as always, the changes will not be immediate; they will happen over the next five to ten years, and those who are doing a job today will not see them anytime soon. Artists, photographers, painters, draftsmen, even screenwriters will learn to use the new tools, just as they learned to use the computer or photo editing programs.

Meanwhile, I am trying to learn the art of prompt engineering and how to use it for the wine industry. If you’re using AI, textual or visual, in your company or in your job, let me know.

The Digital Wine
