Conversational game techniques for AI in video
Conversational Techniques and the Advancement of AI in Video Games

Sean Kevin Farrell (Author)
University of Limerick, Video Games Development, Limerick, Ireland
[email protected]

Marco Hellmich (Author)
University of Limerick, Video Games Development, Limerick, Ireland

Abstract – An investigation into the tools and techniques used in modern artificial intelligence for conversations inside video games.

I. INTRODUCTION

I am obsessed with virtual worlds. Everything about them seems magical to me: a recreated living world that I can walk around in and interact with. We have reached the stage where our environment designs are good enough that players feel a huge amount of immersion, and exploration is encouraged. The one area that has not grown alongside world detail is NPC interaction. It is stale, built on old mechanics and in need of innovation. How do we change this, and how do we create meaningful conversations in video games?

II. THE BEGINNING

Right now, the player talks to an NPC and a list of options is presented for questioning. Once you have moved through a decision tree and asked "Have you heard any rumours?" ten times, conversations become a tedious task of coaxing information out of dead souls that aimlessly wander the world for no reason. What if the conversations and interactions between NPCs and the player were lifelike? What if, when you asked the same question more than once, they called you out on it?

Over the years there have been many iterations of communication in games. In its most basic form, a communication AI was set to trigger when the player performed an arbitrary task, such as reaching a point in a level or activating an item in the game. In very early games, with 8-bit sound and graphics, the AI's communication would be as simple as a flashing arrow or text telling the player which direction to go once they had reached a certain point.
As games developed and technology became more advanced, developers were able to use more complete methods of communicating with the player, adopting sound and even video. While this was a new experience for the player, there was little difference in how the AI worked.

Another alternative is conversation options. The player is presented with a number of speech options to select from, each leading to a different response. Factors such as the player's speech skill, alignment or an in-game character's disposition towards the player can either reveal new speech options or give a percentage chance of picking between different outcomes. Having a specific item in the player's inventory can also add new speech options. These are common AI mechanics for RPG-styled games and dating simulators.

III. MOST USED CONCEPT

The next step in AI communication was passive communication. This is where characters present in the world talk to the player when the player's character comes within a predetermined proximity. These comments are normally greetings or recognition of the player's actions, status, disposition or alignment, or even an acknowledgement of an item in the player's inventory, such as referring to a hat the player is wearing. As an example of how an NPC can react to the player: if the player's alignment in an RPG is evil, the NPC could run away or deliver specific dialogue. NPCs will also react to each other and start conversations among themselves.

At this point you may notice that none of these are actually that different. They are all based on a simple trigger system: has the player done [action]? If yes, respond with [action]!

IV. CLEVERBOT

So now let's look at something else: a program called Cleverbot. While not exactly a game, Cleverbot is a fantastic example of a conversation AI. The image below is an example of a conversation I had with Cleverbot. This AI is, to a degree, capable of reproducing conversation.
It picks key words from what is said to it and tries to create an appropriate response. Cleverbot is even capable of recording new data, increasing its vocabulary and its understanding of responses. For more about Cleverbot, see this article: http://singularityhub.com/2010/01/13/cleverbot-chat-engine-is-learning-from-the-internet-to-talk-like-a-human/

Cleverbot was given the Turing test and was judged to be about 59% human, although when I questioned the AI, it stated that it did not know what the Turing test was and then began to accuse me of being an AI. More about Cleverbot and the Turing test can be found here: http://www.cleverbot.com/human

However, this is a report about game AI, and Cleverbot is not exactly a game. Can the elements Cleverbot uses be seen in a game environment? Yes, to an extent. Let's talk about a game called Façade. The idea behind this game is to interact with two NPCs in a house by typing to them on the keyboard. In a way this game is similar to Cleverbot, but it can be more accurate because it has the advantage of context. Cleverbot must come up with an answer to whatever the user types and has no frame of reference for what the user may be talking about; Façade, on the other hand, can use the current conversation, the player's actions and the scenario setting to drive how it interprets the player's input. The way this game uses player input to communicate, while inaccurate or limited at times, is a huge step in how AI communication may be improved in the future, if this type of technology is developed further.

V. WHERE ARE WE NOW?

The implementation of learning and conversation machines between a computer and a user has been slowly improving ever since the first Zork game was created. The highest level of this, and the largest innovation, is a little further from games than I would hope, but it is innovation nonetheless: Watson, IBM's computer that plays the game of Jeopardy! [1].
The program takes the clue as a text-file input, then calculates what the answer to that clue is. If its confidence is lower than the risk threshold, the program will not answer the question.

What implications does this have for video games? What if you could type whatever you want to ask an NPC in the world, and it responded correctly to what you wrote? Some games can already do this. One such genre is dating games. 'Catherine is a puzzle-platformer adventure game in which players control Vincent Brooks, who begins having strange nightmares after his girlfriend, Katherine, begins to talk about marriage and commitment. This matter becomes more complicated for him when he meets a girl named Catherine, and begins an affair with her, and the nightmares get more and more intense.' [2]

Awkward gameplay mechanics and story aside, Catherine has a text-messaging system in which the player can choose from a few predetermined lines of text and compose their own reply to a message. A morality meter gives feedback on how well or how poorly the response was received. (See 28:29 for an in-game demonstration and explanation: [3].) This works in a lot of ways. It still plays off the logic/decision-tree system that basically everyone uses for game logic wherever an NPC interacts with a player, but it also has a strange human element to it. How many times have you written something in a text message and then deleted it for something else? It's as if you want to say how you really feel, but you know that in the long run it won't work out for you. That is how this feels with the added morality system: you find yourself being nice even though you don't really want to be.

How amazing would a detective game be if you could talk to a character and interrogate them fully? Of course this takes a lot of resources and facial modelling to show true emotion, but we are very close to that fidelity.
At the Hebrew University of Jerusalem, Israel, Oren Tsur and others have created an AI that understands sarcasm. 'The inherently ambiguous nature of sarcasm sometimes makes it hard even for humans to decide whether an utterance is sarcastic or not. In this paper we present a novel algorithm for automatic identification of sarcastic sentences in product reviews.' (Source: http://staff.science.uva.nl/~otsur/papers/sarcasmAmazonICWSM10.pdf) With a data set of 66,000 reviews from various sources, they achieved 83.1% on an evaluation set containing newly discovered sarcastic sentences. Being able to define a key characteristic of human interaction and create an algorithm that understands it is another step towards realistic AI in video games. Being able to walk up to the ugliest person on the street, tell them they are beautiful, and have them understand that you are being sarcastic and react accordingly would be a huge leap towards more human-like programming.

Now, wouldn't it be interesting if you didn't have to type to your NPC companion at all? Integrating software like SpeechFX (Source: http://www.speechfxinc.com/video-games.html) might help bridge this gap. 'SpeechFX VoiceGaming SDK is our award-winning software solution for video game voice command and control. SFX VoiceIn is the industry's most memory-efficient voice interface. Game developers worldwide can now build games that use a common API set across all game platforms.' It has been used in many well-known games, such as Tom Clancy's Rainbow Six Vegas 1 and 2, as well as EndWar and HAWX. What if you could walk up to a guard in Skyrim and ask him where the nearest blacksmith is, and he understood and gave you the answer? What are the implications for immersion then? Other programs, like Tazti (Source: http://www.tazti.com/demo_video.php), are a step in the same direction, aiming to make voice control possible for all games.

How do we solve the issue that decision and logic trees merely sidestep?
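As an aside, the flavour of the sarcasm work above can be illustrated with a toy flagger. The real Tsur et al. system learns its patterns from data; the hard-coded cues below are invented and only show the idea of matching lexical patterns against review sentences.

```python
# A toy surface-pattern sarcasm flagger (illustrative only; not the
# algorithm from the cited paper, which learns its patterns from reviews).

import re

# Hypothetical sarcasm cues, loosely in the style of learned patterns.
SARCASM_PATTERNS = [
    r"\bjust what i (always )?(needed|wanted)\b",
    r"\boh,? (great|wonderful|fantastic)\b",
    r"\bworks great\b.*\bbroke\b",
]

def looks_sarcastic(sentence: str) -> bool:
    s = sentence.lower()
    return any(re.search(p, s) for p in SARCASM_PATTERNS)

print(looks_sarcastic("Oh great, another update that deletes my saves."))  # True
print(looks_sarcastic("The battery lasts about ten hours."))               # False
```

An NPC with even this crude a filter could react differently to "you're beautiful" delivered straight versus delivered with an obvious sarcastic cue.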
The fact of the matter is: if we give the player the power to ask anything, how do we fix the "I have limited responses, you must ask the right questions" conundrum?

One way of looking at this problem is Markov decision processes, or Markov chains. A Markov chain is a mathematical system that transitions from one state to another through a finite or countable number of possible states. It is a random process in which the next state depends only on the current state, not on the sequence of events that preceded it, unlike a decision or logic tree. Markov chains are widely used in statistical models of real-world processes, and they can handle the large number of variables that determine outcomes based on player behaviour. For speech, this could mean keeping a large bank of responses to quest-related items. Each question might have three or four correct answers that give a helpful response, while anything else pulls from a random bank and gives a global answer instead of a heavily generic one. With a morality system on the back end or front end, we can also weight the Markov chain's responses according to whether the NPC likes you or not.

Markov chains are already being tested for co-op play between a player and an NPC. A study by Truong-Huy Dinh Nguyen, David Hsu, Wee-Sun Lee and Tze-Yun Leong of the National University of Singapore has been looking at the complex problem of human and machine interaction. (Source: http://arxiv.org/pdf/1206.5928.pdf) 'In this paper, we study the problem of creating NPCs that are able to help players play collaborative games. The main difficulties in creating NPC helpers are to understand the intention of the human player and to work out how to assist the player.'

'Traditionally, the behaviour of Non-Player Characters (NPCs) in games is hand-crafted by programmers using techniques such as Hierarchical Finite State Machines (HFSMs) and Behaviour Trees (Champandard 2007).
These techniques sometimes suffer from poor behaviour in scenarios that have not been anticipated by the programmer during game construction. In contrast, techniques such as Hierarchical Task Networks (HTNs) or Goal-Oriented Action Planners (GOAP) (Orkin 2004) specify goals for the NPCs and use planning techniques to search for appropriate actions, alleviating some of the difficulties of having to anticipate all possible scenarios.'

The study's conclusion was near-human-level performance with MDPs. The main drawback was the size of the state space: as game engines become more complex, so do the number of variables and calculations needed to produce the correct result. The best application I can see for this study would be integration with Portal 2's co-op robot mode. If a machine could help a human solve complex level problems and understood the humanised 'ping' system already in place, that would be a real leap forward in realistic AI for video games.

It's not all doom and gloom, though. There are games out there that have tackled this problem in quite interesting ways. Façade is a game based around a dinner conversation with a troubled couple, and your interactions, through your own writing, have a large effect on the ending. (Source: http://www.interactivestory.net/) The characters show real facial emotion. It is an AI-based art/research experiment that attempts to move beyond traditional branching or hyperlinked narrative and creatively drive a fully realised one-act interactive drama. The game has won massive applause from all over the world as the next step in video games. (Playthrough: http://www.youtube.com/watch?v=WkPVcb6KMF4) As a game it has its problems, but as an experiment it hits all the bases, a great step forward showing other games that it can be done. Another game that took this concept and improved on it is 221B.
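Stepping back to the Markov-chain idea for dialogue: below is a minimal sketch, with invented states and transition weights, of picking the next response state based only on the current state, biased by the NPC's disposition towards the player, as proposed earlier.

```python
# A sketch of a Markov-chain dialogue state machine. States, probabilities
# and the disposition bonus are invented for illustration; the Markov
# property is that next_state() looks only at the current state.

import random

# state -> list of (next_state, base_probability)
TRANSITIONS = {
    "greeting":       [("helpful_answer", 0.5), ("generic_answer", 0.5)],
    "helpful_answer": [("helpful_answer", 0.6), ("farewell", 0.4)],
    "generic_answer": [("generic_answer", 0.7), ("farewell", 0.3)],
    "farewell":       [("farewell", 1.0)],
}

def next_state(state: str, npc_likes_player: bool, rng: random.Random) -> str:
    choices = TRANSITIONS[state]
    weights = []
    for nxt, p in choices:
        # Weight helpful responses up when the NPC's disposition is good,
        # mirroring the morality-system weighting described in the text.
        bonus = 0.3 if (npc_likes_player and nxt == "helpful_answer") else 0.0
        weights.append(p + bonus)
    return rng.choices([nxt for nxt, _ in choices], weights=weights, k=1)[0]

rng = random.Random(42)
state = "greeting"
for _ in range(4):
    state = next_state(state, npc_likes_player=True, rng=rng)
    print(state)
```

Because only the current state matters, the response bank can grow without the combinatorial explosion of a fully hand-authored tree; the disposition bonus shows one place a morality meter could plug in.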
221B is a Sherlock Holmes game that was meant to come out alongside the last film but was dropped at the final stage of development. The game was designed by Rollo Carpenter, a two-time winner of the Loebner Prize, a competition that challenges computer scientists to build programs capable of convincingly human conversation. (Source: http://news.bbc.co.uk/2/hi/8426523.stm) The player interrogates an NPC in his own words, and the AI comes back with a response. Rather than using full lists of responses, the game uses fuzzy-logic interpretation. In the background, the player is kept on the right path by limiting random responses and steering them towards more productive questioning.

Another game on that path is Scribblenauts. In this game, the player can type in anything they want and the game will create it. Want a flying pretty blue pig? It has it. The player solves in-game problems with crazy combinations of items, created by typing out their descriptions. The range of items that can be created, and the fidelity with which they can be created, is staggering. 'The game has 22,000-plus words and has attempted to implement all the possible interactions. Put Death up against God, for example, and you get an interesting surprise.'

So is it all worth it? If we had an NPC with humanised conversational skills that could interpret movement and player engagement, would the development process be worth it? Bertil Jeppsson, a computer science major, wrote a paper on AI-controlled life in role-playing games: 'Will more realistic behaviour among non-playing characters (NPCs) in a role-playing game (RPG) improve the overall feeling of the game for the player? Would players notice the enhanced life of a NPC in a role-playing game, or is the time spent in cities and villages insufficient to notice any difference at all? There are plenty of best-selling RPGs with simplistic, repetitive NPC behaviour on the market.
Does that mean that smarter NPCs are not necessary, and that improving them wouldn't benefit the players' impression of the game? Or would some of these well-recognised games get even better with a more evolved AI?' His conclusion is that yes, it is worth it, but the definition of 'smart AI' and of realism in games differs from person to person.

There are advances in AI for games. More games are shipping programs with better understanding that work for the player's benefit. Hopefully, in the next few years, these techniques can all come together and produce a more realistic, humanised living world.

Is the Turing test going to be a viable test once these methods have been added to video games? I personally feel that the standard for evaluation will need to change to accommodate high-fidelity modelling. With larger polygon counts than ever before, and the added motion capture of facial distinctions, will we have to add rule sets for the most common indicators of human emotion? Nvidia's facial modelling and real-time rendering of expression (Source: http://www.youtube.com/watch?v=CvaGd4KqlvQ) have come such a long way that we can now show, in real time, a large range of emotions for NPCs in video games. How would the Turing test encompass this humanoid behaviour?

Sources:

[1]. IBM's Watson AI playing Jeopardy! Extract from the first show. http://www.youtube.com/watch?v=seNkjYyG3gI
[2]. Wikipedia description of the game Catherine.
[3]. YouTube playthrough of Catherine, with a demonstration of the texting mechanic.
(28:36 — see [3])

Nvidia's face model, real-time render: http://www.youtube.com/watch?v=CvaGd4KqlvQ
http://staff.science.uva.nl/~otsur/papers/sarcasmAmazonICWSM10.pdf
http://news.bbc.co.uk/2/hi/8426523.stm
http://www.bth.se/fou/cuppsats.nsf/all/91f4ce512fb90b71c12574650047f273/$file/bachelor_thesis_revised_bertil_jeppsson.pdf
http://www.speechfxinc.com/video-games.html
http://www.tazti.com/speech-recognition-software-for-pc-games.html
http://arxiv.org/pdf/1206.5928.pdf
http://www.youtube.com/watch?v=WkPVcb6KMF4
Catherine
http://stackoverflow.com/questions/2824143/determining-what-action-an-npc-will-take-when-it-is-partially-random-but-influe
http://en.wikipedia.org/wiki/Markov_chain
http://www.gamedev.net/topic/18584-rpg-npc-ai-scheduling-and-needs/