Yle Beta exists to try out improbable projects. This article is the post-mortem of one that I still don't completely understand.
In November I was given the green light to try out a new approach to artificial intelligence (AI) generated content. More specifically, I wanted to try out the latest text- and image-generating machine learning (ML) algorithms with experts from Silo AI.
Because my understanding of these algorithms was based on random Reddit stories, I was not able to draw a clear picture of what we should be aiming for. However, the concept was juicy enough to get everyone inspired: chat with a ghost who recommends content for you in a personalized and very personal way!
Before explaining what we ended up building, let's envision a future where everyone has their very own personal AI assistant. Some think of these as voice-controlled, but I like the chat interface, because it can use images as well as text.
When it comes to personalization, the assistant should know you as well as you do, preferably even better. I come from a part of Yle where we focus on storytelling. That's why I can look at the world from an angle where personal assistants are not limited to humans or bots: supernatural beings are acceptable too!
What about an assistant that is actually you, but from the future, after you have died? When I think about opening a chat with ghost-me, I'd certainly like to know more. This opens up an interesting question about a personal assistant's personality - would I be interested in it, even if I knew it doesn't really exist?
With this project we aimed to peek into how this kind of future will be built and what sort of challenges there will be in making it the present.
As expected, the starting point aimed too high. Unfortunately, many of the early ideas had to be shot down because of practical issues. Most importantly, we had neither the time to optimize the algorithms nor the computing power to generate any content in real time. Because of this, personalization was not possible. So we focused on personality.
With these sorts of new-technology-meets-storytelling experiments, I like to build something that could be the real thing. See the proof-of-concept demo of how the chat could work.
The story is always the same, and the user can't affect it in any way.
- A customer care chat appears.
- Once the user opens it, an ML-generated person (not based on a photograph) appears. It has a ghost-like appearance created with WebGL artistry and ML.
- The chat agent recommends a particular piece of content. The recommendation comes from the ghost's hopes, which are generated using ML. It makes sense only if the user wants to believe it does - this goes for all of the texts.
- Next the chat agent gives a short description of him/herself and shows a StreetView image of the user's location and an ML manipulation of it. The image is an upside-down-world version of our reality, an attempt to mimic Stranger Things using ML.
- Finally the chat agent disappears with an ML-generated "regrets" piece of content.
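Because the story is fixed and nothing is generated in real time, the flow above can be thought of as an ordered script that simply plays back pre-generated material. The following is only a minimal sketch of that idea, not the code used in the demo: the type names, the `playScript` function, the image identifiers, and the placeholder texts are all assumptions made for illustration.

```typescript
// Hypothetical sketch: every name and placeholder text here is an
// illustrative assumption, not taken from the actual demo.
type Step =
  | { kind: "open"; label: string }
  | { kind: "avatar"; imageId: string }
  | { kind: "message"; text: string }
  | { kind: "images"; text: string; imageIds: string[] };

// All ML output is assumed to be produced offline; the script only
// refers to stored results, so every user sees the same story.
const script: readonly Step[] = [
  { kind: "open", label: "Customer care chat" },
  { kind: "avatar", imageId: "ghost-portrait" },
  { kind: "message", text: "<pre-generated recommendation from the ghost's hopes>" },
  {
    kind: "images",
    text: "<pre-generated self-description>",
    imageIds: ["street-view", "upside-down-street-view"],
  },
  { kind: "message", text: "<pre-generated regrets>" },
];

// Turn one step into a line of a plain-text transcript.
function render(step: Step): string {
  switch (step.kind) {
    case "open":
      return `[chat opens] ${step.label}`;
    case "avatar":
      return `[avatar appears] ${step.imageId}`;
    case "message":
      return step.text;
    case "images":
      return `${step.text} [images: ${step.imageIds.join(", ")}]`;
  }
}

// Play the steps strictly in order; there is no user input to branch on.
function playScript(steps: readonly Step[]): string[] {
  return steps.map(render);
}

const transcript = playScript(script);
console.log(transcript.join("\n"));
```

A structure like this makes the trade-off visible: the personality lives entirely in the stored texts and images, while personalization would require replacing those placeholders with content generated per user.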
At this point some questions may have arisen. First of all, I am writing this in English, as some of the stakeholders were English. The end result was also in English, because of the limitations of the ML models we used.
What's the deal with the dead? We've used horror as the content base for experimental tech projects many times before. Everyone is familiar with the supernatural properties of the horror genre. I like it because a supernatural reality may have glitches. If AI is producing text that is not quite on the level of a professional journalist, it may be on the level of a person who died a hundred years ago!
And finally, why spend time and effort on something that is clearly not going to be used anywhere? Yle Areena doesn't even have a chat. The point was to test the idea of personalizing straightforward recommendations in a new and personal way. Also, as usual, there was originally another use case that we could not use in the end because of reasons. In the demo, Yle Areena is basically just a background image representing a wealth of content.
Chat Beyond The Grave is a proof of concept (POC) combining web 3D artistry and cutting-edge ML models with a bit of man-made storytelling, packaged in the form of a chat. The end result is a huge compromise in which most of the ideas had to be cut down. However, it still has the logic originally envisioned.
WebGL (3D graphics on the web) is used to render ghosts and AI is used to create their personality. In the early stages of the project, Joonas Kallioinen, the WebGL artist, came up with beautiful concepts that were definitely more interesting than what AI could produce. Lighting effects and animated 3D shapes bring the supernatural magic, but most of those had to be cut down in order to make the chat function as expected - to mimic what AI could produce.
The first version of the AI did not bring much to the table, so it seemed pretty dumb to throw away the only functioning part of the project. But as the project evolved, the focus shifted to the surreal beauty of ML content, not so much the presentation. For me this part of the process was the most interesting - seeing how I started to get more emotions from something that clearly doesn't make any sense than from something that works beautifully.
Both text- and image-generating algorithms are used. The ghost's personality is created using text generation, while its appearance and the look of the ghost realm are created using image generation. For more details about how these were built, please see Silo AI's marketing post.
Back to The Future
I'm undecided whether the end result gives any indication of whether these sorts of agents will be the future. AI-based content generators are dumb as hammers; they have no intelligence. But it seems that if they mimic human behavior well enough, the intelligence opens up in the user's head! The user reads between the non-intelligent lines, and that's very interesting, as we who designed the system know there really is nothing there.
Juuso Pekkinen interviewed Michael Laakasuo and Aku Visala on his podcast about this same observation. The podcast starts with a relevant discussion of "anthropomorphization", but all in all it has a much wider scope than just believing in AI ghosts, so it comes with a strong listening recommendation (in Finnish).
ML is a very expensive way to produce non-intelligent content. This project was fun to make and produced interesting observations, but after looking critically at the end result, the lack of contextual tailoring (personalization) is a showstopper. A chat requires content to be generated in (almost) real time, but some other form of content delivery might not, so maybe there is a sweet spot somewhere waiting to be found.