Conversational Techniques and the
Advancement of AI in Video Games.
Sean Kevin Farrell (Author)
University of Limerick
Video Games Development
Limerick, Ireland
[email protected]

Marco Hellmich (Author)
University of Limerick
Video Games Development
Limerick, Ireland
Abstract – An investigation into the tools and
techniques used in modern artificial intelligence
for conversations inside video games.
I. INTRODUCTION
I am obsessed with virtual worlds. Everything about them seems magical to me, as if a living world has been recreated that I can walk around in and interact with. We have reached the stage where our environment designs are good enough that players feel huge amounts of immersion and exploration is encouraged. The one area that has not grown alongside world detail is NPC interaction. It is stale, built on old mechanics and in need of innovation. How do we change this, and how do we create meaningful conversations in video games?
II. THE BEGINNING
Right now, the player talks to an NPC and a list of options becomes available for questioning. Once you have moved through a decision tree and asked "Have you heard any rumours?" ten times, conversations become a tedious task of coaxing information out of dead souls that aimlessly walk the world for no reason. What if the conversations and interactions between the NPCs and the player were lifelike? Where, if you asked the same question more than once, they would call you out on it?
Over the years there have been many iterations of communication in games. In its most basic form, a communication AI was set to trigger when the user/player performed an arbitrary task, such as reaching a point in a level or activating an item in game. In very early games, with 8-bit sound and graphics, the AI's communication would be as simple as a flashing arrow or text telling the player which direction to go once they had reached a certain point.
As games developed and technology became more advanced, they were able to use more complex methods to communicate with the player, adopting sound and even video. While this was a new experience for the player, there was little difference in how the AI worked.
Another alternative to this is conversation options. The player is presented with a number of speech options to select from, each leading to a different response. Factors such as the player's speech skill, their alignment or an in-game character's disposition towards the player can either unlock new speech options or give a percentage chance of picking between different outcomes. Having a specific item in the player's inventory can also add new speech options. These are common AI mechanics in RPG-styled games and dating simulators.
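This gating logic can be sketched in a few lines of Python. The stat names, thresholds and dialogue lines below are invented for illustration; no specific game works exactly this way.

```python
import random

# One dialogue option: the text shown plus the conditions that unlock it.
# Every field name, stat and threshold here is invented for illustration.
OPTIONS = [
    {"text": "Have you heard any rumours?", "requires": {}},
    {"text": "[Persuade] Surely you can tell me more.", "requires": {"speech": 50}},
    {"text": "[Show medallion] Do you recognise this?", "requires": {"item": "medallion"}},
]

def available_options(player):
    """Return the option texts this player is allowed to see."""
    visible = []
    for opt in OPTIONS:
        req = opt["requires"]
        if "speech" in req and player["speech"] < req["speech"]:
            continue
        if "item" in req and req["item"] not in player["inventory"]:
            continue
        visible.append(opt["text"])
    return visible

def persuasion_succeeds(player, npc_disposition, rng=random.random):
    """A 'percentage chance' outcome: speech skill and disposition shift the odds."""
    chance = 0.25 + player["speech"] / 200 + npc_disposition / 200
    return rng() < chance

print(available_options({"speech": 60, "inventory": ["medallion"]}))
```

The same check pattern covers all the factors listed above: skill gates, item gates, and a weighted dice roll for disposition-dependent outcomes.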
III. MOST USED CONCEPT
The next step in AI communication was passive communication. This is where game characters present in the world talk to the player when the player's character comes within a predetermined proximity. These comments are normally greetings, or recognition of the player's actions, status, disposition or alignment, or even an acknowledgement of an item in the player's inventory, such as referring to a hat the player is wearing. For example, if a player's alignment in an RPG is evil, an NPC could run away or deliver specific dialogue. NPCs will also react to each other and start conversations among themselves. At this point you may notice that none of these are actually that different. They are all based on a simple trigger system: has the player done [action]? [yes/no] Respond with [action]!
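That trigger pattern reduces to a lookup from world events to canned lines, plus a proximity check. A minimal sketch, with event names and dialogue invented for illustration:

```python
import math

# A minimal trigger system: each world event maps straight to a canned response.
# Trigger names and lines are invented for illustration.
TRIGGERS = {
    "entered_town": "Welcome, traveller!",
    "wearing_hat": "Nice hat you have there.",
    "alignment_evil": "S-stay away from me!",
}

def npc_react(events):
    """Fire a canned line for every event the player has triggered."""
    return [TRIGGERS[e] for e in events if e in TRIGGERS]

def in_proximity(npc_pos, player_pos, radius=3.0):
    """Passive communication fires only when the player is close enough."""
    return math.dist(npc_pos, player_pos) <= radius

print(npc_react(["entered_town", "wearing_hat", "jumped"]))
```

Everything the section above describes, from greetings to hat comments, fits this shape: a condition check followed by a fixed response.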
IV. CLEVERBOT
So now let's look at something else: a program called Cleverbot. While not exactly a game, Cleverbot is a fantastic example of a communication AI. The image below shows an example of a conversation I had with Cleverbot. This AI is, to a degree, capable of reproducing conversation. It picks key words from what is said to it and tries to create an appropriate response. Cleverbot is even capable of recording new data, increasing its vocabulary and its understanding of responses.
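The keyword-picking and learning behaviour described here can be caricatured in a few lines. This is only a toy sketch of the idea, not Cleverbot's actual engine, and the stored exchanges are invented:

```python
# A toy keyword-matching chatbot in the spirit of Cleverbot: it scores stored
# utterances by word overlap with the input and replies with whatever followed
# the best match. Illustrative only; Cleverbot's real engine is far larger.
memory = [
    ("hello there", "Hi! How are you?"),
    ("do you like games", "I love video games."),
    ("what is your name", "People call me Bot."),
]

def words(text):
    return set(text.lower().split())

def respond(user_input):
    best_reply, best_score = "Tell me more.", 0
    for heard, reply in memory:
        score = len(words(user_input) & words(heard))
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply

def learn(heard, reply):
    """Record a new exchange, growing the vocabulary over time."""
    memory.append((heard, reply))

print(respond("do you like video games"))
```

The `learn` function is the crude analogue of Cleverbot recording new data: every conversation it has becomes material for future replies.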
For more about Cleverbot, see this article:
http://singularityhub.com/2010/01/13/cleverbot-chat-engine-is-learning-from-the-internet-to-talk-like-a-human/
Cleverbot was given the Turing test and was judged to be about 59% human, although when I questioned the AI, it stated that it did not know what the Turing test was and then began to accuse me of being an AI. More about Cleverbot and the Turing test can be found here:
http://www.cleverbot.com/human
However, this is a report about game AI, and Cleverbot is not exactly a game, but can we see elements of what Cleverbot does in a game environment? Yes, to an extent. Let's talk about a game called Façade. The idea behind this game is to interact with two NPCs in a house, using the keyboard to type to them. In a way this game is similar to Cleverbot, but it can be more accurate because it has the advantage of context. Cleverbot is required to come up with an answer to whatever the user types and has no frame of reference as to what the user may be talking about; Façade, on the other hand, can use the current conversation, the player's actions and the scenario setting to drive how it interprets the player's input. The way this game uses player input to communicate, while inaccurate or limited at times, is a huge step in how AI communication may be improved in the future, if this type of technology is developed further.
V. WHERE ARE WE NOW?
The implementation of learning and conversation machines between a computer and a user has been slowly improving ever since the first Zork game was created. The highest level of this, and the largest innovation, is a little further from games than I would hope, but it is innovation nonetheless: Watson, IBM's computer that plays the game of Jeopardy! [1]. The program takes a text input of the clue and calculates the question that answers it. If its confidence is lower than its risk threshold, the program will not answer.
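Watson's answer-or-stay-silent behaviour can be sketched as a confidence threshold over ranked candidates. The scoring below is a stand-in; Watson's real pipeline is vastly more complex:

```python
# Sketch of the answer/decline behaviour described above: the system ranks
# candidate answers by a confidence score and stays silent below a threshold.
# The candidates and scores here are invented placeholders.
def answer(candidates, threshold=0.6):
    """candidates: list of (answer_text, confidence) pairs."""
    if not candidates:
        return None
    best, confidence = max(candidates, key=lambda pair: pair[1])
    if confidence < threshold:
        return None  # too risky: do not buzz in
    return best

print(answer([("What is Toronto?", 0.14), ("What is Chicago?", 0.72)]))
```

For an NPC, returning `None` would map to an honest "I don't know" rather than a wrong or generic reply.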
What implications does this have for video games? What if you could type what you want to ask an NPC in the world, and it responded correctly to what you wrote? Some games can already do this. One such genre is dating games. 'Catherine is a puzzle-platformer adventure game in which players control Vincent Brooks, who begins having strange nightmares after his girlfriend, Katherine, begins to talk about marriage and commitment. This matter becomes more complicated for him when he meets a girl named Catherine, and begins an affair with her, and the nightmares get more and more intense.' [2] Awkward gameplay mechanics and story aside, Catherine has a system of text messaging where the player can choose from a few predetermined lines of text and compose their own reply to the message. A morality meter gives feedback on how well or how poorly your response was received. (See 28:29 for a demonstration and explanation in game: [3])
This works in a lot of ways. It still plays off the logic/decision tree system that basically everyone uses for game logic in any game where an NPC interacts with a player, but it also has a strangely human element. How many times have you written something in a text message and then deleted it for something else? It's as if you want to say how you really feel, but you know that in the long run it won't work out for you. That is how this feels with the added morality system: you find yourself being nice even though you don't really want to be.
How amazing would a detective game be if you could talk to a character and interrogate them fully? Of course this takes a lot of resources and facial modelling to show true emotion, but we are very close to that fidelity. At the Hebrew University of Jerusalem, Israel, Oren Tsur and others have created an AI that recognises sarcasm. 'The inherently ambiguous nature of sarcasm sometimes makes it hard even for humans to decide whether an utterance is sarcastic or not. In this paper we present a novel algorithm for automatic identification of sarcastic sentences in product reviews.' (Source: http://staff.science.uva.nl/~otsur/papers/sarcasmAmazonICWSM10.pdf) With a data set of 66,000 reviews from various sources, they achieved 83.1% on an evaluation set containing newly discovered sarcastic sentences.
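To make the idea concrete, here is a deliberately crude pattern-scoring detector. It is a toy illustration only; the cited paper's semi-supervised algorithm learns its patterns from data rather than using a hand-written cue list like this one:

```python
# Toy sarcasm scoring by surface patterns. The cue list and threshold are
# invented for illustration and bear no relation to the cited paper's model.
SARCASM_CUES = [
    "yeah right", "great, just", "oh wonderful", "best product ever",
    "thanks a lot", "just what i needed",
]

def sarcasm_score(sentence):
    s = sentence.lower()
    hits = sum(cue in s for cue in SARCASM_CUES)
    if "!" in s and hits:  # exaggerated punctuation strengthens an existing hit
        hits += 1
    return hits

def is_sarcastic(sentence, threshold=1):
    return sarcasm_score(sentence) >= threshold

print(is_sarcastic("Great, just what I needed: another dead battery!"))
```

An NPC with even this much could react differently to an insincere compliment than to a sincere one, which is the leap discussed next.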
Being able to define a key characteristic of human interaction and create an algorithm that understands it is just another step in the direction of realistic AI in video games. Being able to walk up to the ugliest person on the street, tell them that they are beautiful, and have them understand that you are being sarcastic and react to it would be a huge leap in more humanoid programming.
Now, wouldn't it be interesting if you didn't have to type to your NPC companion at all? Integrating software like SpeechFX (Source: http://www.speechfxinc.com/video-games.html) might help bridge this gap. 'SpeechFX VoiceGaming SDK is our award-winning software solution for video game voice command and control. SFX VoiceIn is the industry's most memory-efficient voice interface. Game developers worldwide can now build games that use a common API set across all game platforms.' It has been used in many well-known games such as Tom Clancy's Rainbow Six Vegas 1 and 2, as well as EndWar and HAWX. What if you could walk up to a guard in Skyrim, ask him where the nearest blacksmith is, and he understood and gave you the answer back? What are the implications for immersion then? Other programs like Tazti (Source: http://www.tazti.com/demo_video.php) are a step in the same direction, making this possible for all games.
How do we solve the issue that decision and logic trees merely walk around? The fact of the matter is, if we give the player the power to ask anything, how do we fix the "I have limited responses, you must ask the right questions" conundrum? One way of looking at this problem is through Markov decision processes, or a Markov chain. A Markov chain is a mathematical system that transitions from one state to another along a finite or countable number of possible states. It is a random process, and the next state always depends only on the current state, not on the sequence of events that preceded it, unlike a decision or logic tree. Markov chains are widely used in statistical models of real-world processes. They can handle the large number of variables that determine an outcome based on player behaviour. In speech, this could mean that we simply have a large bank of responses to quest-related items. Each question might have three or four correct answers that give a helpful response, while anything else could pull from a random bank and give a generic global answer. Now, if you have a morality system on the back end or front end, we can weight the responses according to whether or not the NPC likes you.
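A minimal sketch of that weighted response bank: the reply depends only on the current question and the NPC's disposition, not on conversation history, which is the memoryless (Markov) property. All questions, answers and weights are invented for illustration:

```python
import random

# Map each known question to a few helpful answers; everything else falls
# back to a global bank. Disposition reweights the choice.
HELPFUL = {
    "where is the blacksmith": ["Past the well, on your left.",
                                "Follow the hammering sounds."],
}
GLOBAL_BANK = ["Hmm, couldn't say.", "Ask around the market.", "No idea, friend."]

def respond(question, likes_player=True, rng=random):
    answers = HELPFUL.get(question.lower())
    if answers is None:
        return rng.choice(GLOBAL_BANK)
    if likes_player:
        return rng.choice(answers)  # friendly NPCs give the helpful answers
    # Disliked players mostly get brushed off: weight the brush-offs 3:1.
    pool = answers + GLOBAL_BANK
    weights = [1] * len(answers) + [3] * len(GLOBAL_BANK)
    return rng.choices(pool, weights=weights, k=1)[0]

print(respond("Where is the blacksmith"))
```

A fuller design would add conversation state (so repeated questions shift the weights), but the core selection step stays this simple.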
Markov chains are already being tested for co-op play between a player and an NPC. A study by Truong-Huy Dinh Nguyen, David Hsu, Wee-Sun Lee and Tze-Yun Leong of the National University of Singapore has been looking at the complex problem of human and machine interaction. (Source: http://arxiv.org/pdf/1206.5928.pdf) 'In this paper, we study the problem of creating NPCs that are able to help players play collaborative games. The main difficulties in creating NPC helpers are to understand the intention of the human player and to work out how to assist the player.'
‘Traditionally, the behaviour of Non-Player
Characters (NPCs) in games is hand-crafted by
programmers using techniques such as
Hierarchical Finite State Machines (HFSMs) and
Behaviour Trees (Champandard 2007). These
techniques sometimes suffer from poor behaviour
in scenarios that have not been anticipated by the
programmer during game construction. In contrast,
techniques such as Hierarchical Task Networks
(HTNs) or Goal-Oriented Action Planner (GOAP)
(Orkin 2004) specify goals for the NPCs and use
planning techniques to search for appropriate
actions, alleviating some of the difficulties of
having to anticipate all possible scenarios.’
The conclusion was near-human-level performance with MDPs. The main drawback was the state-space size: as game engines become more complex, so do the number of variables and calculations needed to produce the correct result.
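The planning side of that study can be hinted at with a toy MDP solved by value iteration. The states, actions, transition probabilities and rewards below are invented; note that even this three-state helper needs a full transition table, which is exactly the state-space growth problem described above.

```python
# Tiny helper-NPC MDP solved by value iteration. All states, actions,
# probabilities and rewards are invented for illustration.
STATES = ["player_stuck", "player_moving", "puzzle_solved"]
ACTIONS = ["assist", "wait"]
GAMMA = 0.9  # discount factor

# P[s][a] = list of (probability, next_state, reward)
P = {
    "player_stuck":  {"assist": [(0.8, "player_moving", 1.0), (0.2, "player_stuck", 0.0)],
                      "wait":   [(1.0, "player_stuck", 0.0)]},
    "player_moving": {"assist": [(0.5, "puzzle_solved", 5.0), (0.5, "player_moving", 0.0)],
                      "wait":   [(0.3, "puzzle_solved", 5.0), (0.7, "player_moving", 0.0)]},
    "puzzle_solved": {"assist": [(1.0, "puzzle_solved", 0.0)],
                      "wait":   [(1.0, "puzzle_solved", 0.0)]},
}

def value_iteration(eps=1e-6):
    """Iterate Bellman backups until the value function stops changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])
                       for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

def best_action(s, V):
    """Greedy policy: pick the action with the highest expected value."""
    return max(ACTIONS, key=lambda a: sum(p * (r + GAMMA * V[s2])
                                          for p, s2, r in P[s][a]))

V = value_iteration()
print(best_action("player_stuck", V))
```

With three states this converges instantly; a real game world would multiply the table by every relevant variable, which is the scaling wall the study ran into.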
The best application I can see for this study would be integration with Portal 2's co-op robot mode. If a machine can help a human solve complex level problems and understands the humanised 'ping' system already in place, then this would be a real leap forward in realistic AI behaviour in video games.
It's not all doom and gloom, though. There are games out there that have tackled this problem in quite interesting ways. Façade is a game based around a dinner conversation with a troubled couple. Your interactions, through your own writing, have a large effect on the ending. (Source: http://www.interactivestory.net/) The characters show real facial emotion. It is an AI-based art/research experiment in an attempt to move beyond traditional branching or hyperlinked narrative and creatively drive a fully realised one-act interactive drama. The game has won massive applause from all over the world as the next step in video games. (Playthrough: http://www.youtube.com/watch?v=WkPVcb6KMF4) As a game, it has its problems, but as an experiment it hits all the bases, showing other games that it can be done.
Another game that took this concept and improved on it is '221b', a Sherlock Holmes game that was meant to come out alongside the last movie but was dropped at the last moment of development. The game's AI is designed by Rollo Carpenter, a two-time winner of the Loebner Prize, a competition that challenges computer scientists to build programmes capable of convincingly human conversations. (Source: http://news.bbc.co.uk/2/hi/8426523.stm) The player interrogates an NPC using his own words, and the AI comes back with a response. Rather than building exhaustive lists of responses, the game uses fuzzy-logic interpretation. In the background, the player is kept on the right path by limiting random responses and steering them towards more correct questioning.
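One way to approximate that fuzzy interpretation is string-similarity matching against known questions, with a steering fallback. This is an illustrative sketch only, not 221b's actual engine, and all lines and thresholds are invented:

```python
from difflib import SequenceMatcher

# Free-form player input is matched to the closest known question; input
# below the similarity threshold gets a steering line that nudges the
# player back towards questions the NPC can actually answer.
KNOWN_QUESTIONS = {
    "where were you on the night of the murder": "At the opera, I swear it!",
    "did you know the victim": "We had... business dealings.",
}
STEER = "Perhaps ask me where I was that night."

def interrogate(player_input, threshold=0.6):
    best_q, best_ratio = None, 0.0
    for q in KNOWN_QUESTIONS:
        ratio = SequenceMatcher(None, player_input.lower(), q).ratio()
        if ratio > best_ratio:
            best_q, best_ratio = q, ratio
    if best_ratio < threshold:
        return STEER  # keep the player on the right path
    return KNOWN_QUESTIONS[best_q]

print(interrogate("Where were you on the night of the murder?"))
```

The steering fallback is the key trick: the player can phrase things freely, but off-script questions are redirected rather than answered randomly.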
Another game on that path is Scribblenauts. In this game, the player can type in almost anything they want and the game will create it. Want a pretty blue flying pig? It has it. The player solves in-game problems with crazy combinations of items that they create by typing out descriptions. The range of items that can be created, and the fidelity with which they are created, is staggering. "The game has 22,000 plus words and has attempted to implement all the possible interactions. Put Death up against God, for example, and you get an interesting surprise."
So is it all worth it? If we had an NPC with humanised conversational skills that could interpret movement and player engagement, is the development process worth it? Bertil Jeppsson, a computer science major, wrote a paper on AI-controlled life in role-playing games. 'Will more realistic behaviour among non-playing characters (NPCs) in a role-playing game (RPG) improve the overall feeling of the game for the player? Would players notice the enhanced life of a NPC in a role-playing game, or is the time spent in cities and villages insufficient to notice any difference at all? There are plenty of best-selling RPGs with simplistic, repetitive NPC behaviour on the market. Does that mean that smarter NPCs is not necessary and that an improvement of them wouldn't benefit the players' impression of it? Or would some of these well recognised games get even better with a more evolved AI?'
The conclusion is that yes, it is worth it, but the definitions of "smart AI" and realism in games vary from person to person. There are advances in AI for games: more games are producing programs with better understanding that work for the player's benefit. Hopefully, in the next few years, these techniques can all come together and produce a more realistic, humanised living world.
Is the Turing test going to be a viable test once these methods have been added to video games? I personally feel that the standard of evaluation will need to change to accommodate high-fidelity modelling. With larger polygon counts than ever before and the added motion capture of facial distinction, will we have to add rule sets for the most common indications of human emotion? Nvidia's facial modelling and real-time rendering of expression (Source: http://www.youtube.com/watch?v=CvaGd4KqlvQ) have come such a long way that we can now show, in real time, a large range of emotions for NPCs in video games. How would the Turing test encompass this humanoid behaviour?
Sources:
[1]. IBM's Watson AI playing Jeopardy! Extract from the first show. http://www.youtube.com/watch?v=seNkjYyG3gI
[2]. Wiki description of the game Catherine.
[3]. YouTube playthrough of Catherine with a demonstration of the texting mechanic. (28:36)
Nvidia's facial model real-time render: http://www.youtube.com/watch?v=CvaGd4KqlvQ
http://staff.science.uva.nl/~otsur/papers/sarcasmAmazonICWSM10.pdf
http://news.bbc.co.uk/2/hi/8426523.stm
http://www.bth.se/fou/cuppsats.nsf/all/91f4ce512fb90b71c12574650047f273/$file/bachelor_thesis_revised_bertil_jeppsson.pdf
http://www.speechfxinc.com/video-games.html
http://www.tazti.com/speech-recognition-software-for-pc-games.html
http://arxiv.org/pdf/1206.5928.pdf
http://www.youtube.com/watch?v=WkPVcb6KMF4
http://stackoverflow.com/questions/2824143/determining-what-action-an-npc-will-take-when-it-is-partially-random-but-influe
http://en.wikipedia.org/wiki/Markov_chain
http://www.gamedev.net/topic/18584-rpg-npc-ai-scheduling-and-needs/