Robot interception using Theory of Mind reasoning over BDI Model

Title: Robot interception using Theory of Mind reasoning over BDI Model
Status: finished
Master: Computational Intelligence and Selforganisation
Student name: G.C.M. Al
Start date: 2010/05/01
End date: 2011/01/24
Supervisor: Mark Hoogendoorn
Company: VU University
Poster: Media:Poster_Ard.jpg

In a room of a building (e.g. a floor of a museum), a surveillance robot patrols to protect valuable objects (e.g. artifacts), each of which has its own value (e.g. the amount of money the artifact is worth). A burglar enters this room and sets off a silent alarm, which is noticed only by the robot. The alarm only indicates where the thief entered the building; it does not track him afterwards. The task of the robot is to intercept the thief before he can escape. To do so, it determines a path from its current location to where the thief could go, using a path planning algorithm. The thief has only a limited number of ways to escape: a door and a few windows (the total number of escape possibilities does not have to be specified yet). The task of the thief is to collect as many valuable objects as possible, preferably those with the highest value. The robot’s task can therefore be stated more precisely: to prevent the thief from collecting the objects and escaping by intercepting him as soon as possible.
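The project description does not name a specific path planning algorithm. As a minimal sketch, assuming the room is represented as a small grid in which `#` marks an obstacle and the robot moves one cell per time step, a breadth-first search returns a shortest path. The function name `shortest_path` and the grid encoding are illustrative assumptions, not part of the project.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid; '#' marks an obstacle.

    Returns the list of cells from start to goal (inclusive),
    or None when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```

With one cell per time step, `len(path) - 1` gives the number of time steps the robot needs to reach a location, which is the quantity the timing arguments in this description rely on.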

The task of the surveillance robot is to intercept the thief before he can escape the building, preferably as early as possible. When the situation in the room is stable, i.e. there is no thief in the building, the robot patrols the room. For this purpose the robot has a map of the room containing the free spaces, the obstacles, the valuable objects and the escape possibilities. A silent alarm goes off when the thief breaks into the room, and it is noticed only by the robot. The thief does not notice the alarm, and the alarm only indicates the location where the thief entered the room; the robot therefore only knows that the thief has entered the room through, e.g., a particular window. It is then up to the robot to plan its route so that it intercepts the thief as soon as possible. It does this using the location where the thief entered the room (taking into account that the thief does not run through the building, so there is a high chance that he is still near the location of the alarm), the location and weight (value) of the objects, and reasoning about the internal model of the thief (e.g. a BDI model). The first two inputs are straightforward; the last one is elaborated next. The thief does not know the entire map by heart, but he knows roughly where the valuable objects are. This means that the thief has to react to the observations he makes in the room. Since he is a thief, he will not have a very extensive or complex internal model; he wants to finish as fast as possible. The thief makes certain observations and, based on these observations and the beliefs, desires and intentions attached to them, performs one or more actions.
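The thief's internal model described above can be sketched as a very small BDI structure. This is only an illustration under stated assumptions: the class name `ThiefModel`, its fields, and the rule "desire the most valuable remaining object, otherwise escape" are hypothetical simplifications and are not taken from the project itself.

```python
from dataclasses import dataclass, field


@dataclass
class ThiefModel:
    """Hypothetical, deliberately simple BDI model of the thief."""

    # Beliefs: object name -> (believed location, believed value).
    beliefs: dict
    position: tuple
    carried: list = field(default_factory=list)

    def observe(self, name, location, value):
        """Update the beliefs with a new observation made in the room."""
        self.beliefs[name] = (location, value)

    def desires(self):
        """Objects not yet stolen, most valuable first (assumed preference)."""
        remaining = [n for n in self.beliefs if n not in self.carried]
        return sorted(remaining, key=lambda n: self.beliefs[n][1], reverse=True)

    def intention(self):
        """Commit to the most desired object, or to escaping when none is left."""
        wanted = self.desires()
        return wanted[0] if wanted else "escape"
```

For example, a thief who initially only believes in `object 1` intends to steal it, but switches to `object 2` after observing that it is worth more. The robot can run the same model on its own estimate of the thief's beliefs to predict that switch.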

By using the internal model of the thief, the robot can apply a ‘theory of mind’ strategy to this model in order to reason about the possible series of actions the thief can perform. Just like the thief, the robot has to update its observations after each action it performs, so that its path planning algorithm is always based on the newest information about the world. It is not certain that the thief will perform these actions, because a probability is attached to each action. For instance, there is no guarantee that the thief will steal a certain object (object 1) first; he could decide to go to another object (object 2) first because it has a higher value than the other one. So the action of the thief going to object 1 could have a probability of 0.6, while the action of first going to the more valuable object 2 has a probability of 0.4. Adding such probabilities to the thief's possible actions makes it more interesting to see how the robot uses the algorithm to set up its own plan to intercept the thief as quickly as possible. But that is not all. The robot also has to take time into account in its path planning. Suppose the thief enters the building (at time step 0) and the robot, having reasoned about the thief’s possible series of actions, has determined (according to the probabilities, weights, etc.) that the thief will steal object 2 first (at time step 1), then object 1 (at time step 3) and finally the third object (at time step 6). The robot then has to make sure that it does not arrive too late at the locations of the objects, because then it would be impossible to intercept the thief.
To make this more precise: if the robot determines that the thief will steal object 2 first, and the thief is 1 time step away from this object while the robot itself is, e.g., 2 time steps away from it, then the robot must not go to this object, because it would arrive too late to intercept the thief. The robot has to reason about this instead of simply driving to the object. For instance, if object 1, which according to the robot will be stolen second, is also 2 time steps away from the robot, then it is more intelligent to go to object 1, because the thief needs 3 time steps to get there (compared with 2 for the robot). The robot would then intercept the thief, preventing him from stealing more objects and escaping.
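The timing argument above can be written down as a short selection rule. The sketch below assumes the robot already has a predicted arrival time of the thief at each object and its own travel time to each object; the function name `pick_interception` is hypothetical, and the robot's travel time to object 3 is an assumed value, since the description does not give one.

```python
def pick_interception(thief_arrival, robot_travel):
    """Return the first object on the thief's predicted route that the
    robot can reach strictly before the thief, or None if there is none."""
    # Consider the objects in the order the thief is predicted to visit them.
    for obj, t_thief in sorted(thief_arrival.items(), key=lambda kv: kv[1]):
        if robot_travel[obj] < t_thief:
            return obj
    return None


# Predicted thief schedule from the example: object 2 at step 1,
# object 1 at step 3, object 3 at step 6.
thief_arrival = {"object 2": 1, "object 1": 3, "object 3": 6}
# Robot travel times; the value for object 3 is an assumption.
robot_travel = {"object 2": 2, "object 1": 2, "object 3": 4}

target = pick_interception(thief_arrival, robot_travel)
```

Here `target` is `"object 1"`: object 2 is skipped because the robot would arrive at step 2 while the thief is there at step 1, whereas the robot reaches object 1 at step 2, before the thief arrives at step 3. The case of equal arrival times is deliberately excluded by the strict comparison.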

There could be a situation in which the robot's planned arrival time at an object equals the thief's arrival time at that object. For instance, the robot reasons that they will both arrive at an object at time step 3, which means that the robot will be either just in time or just too late to intercept the thief at the object (assuming the thief indeed goes to the object). If the robot arrives and the object is still present, it will intercept the thief at an early time step. What is more interesting is how the robot responds when the object has already been stolen, which means that it has just missed the thief. The robot then has to determine quickly what the thief's next action(s) were or will be, in order to lose as little time as possible. First it needs to update its own observations, beliefs and knowledge. The robot ‘knows’ that the thief has stolen the object at the location where it currently stands, and it must assume that it cannot intercept the thief by chasing him, because the thief could move faster than the robot. Therefore the robot must plan another path, based on the theory of mind strategy, to cross the thief’s path and intercept him. Seeing how the robot determines a plan and how it adapts to new information (it had already planned a path to intercept the thief, but it can discard that plan and plan a new one) could be a contribution to the research fields of artificial intelligence and robotics.
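The replanning step described above can be sketched as follows. This is a simplified illustration, not the project's method: the function `replan`, its parameters, and the travel time between object 1 and object 3 are all assumptions made for the example. The thief's schedule is the one used earlier in this description.

```python
def replan(thief_arrival, stolen, robot_location, now, travel_time):
    """After finding an object already stolen, pick the next predicted
    target that the robot can still reach strictly before the thief.

    thief_arrival: object -> predicted time step of the thief's arrival
    stolen:        objects the robot now believes are gone
    travel_time:   callable (from_location, to_object) -> time steps
    """
    # Belief update: stolen objects are no longer interception points.
    remaining = {o: t for o, t in thief_arrival.items() if o not in stolen}
    for obj, t_thief in sorted(remaining.items(), key=lambda kv: kv[1]):
        if now + travel_time(robot_location, obj) < t_thief:
            return obj
    return None  # no object left in time; the robot needs another strategy


# The robot reaches object 1 at time step 3 and finds it gone.
schedule = {"object 2": 1, "object 1": 3, "object 3": 6}
hops = {("object 1", "object 3"): 2}  # assumed travel time

next_target = replan(
    schedule,
    stolen={"object 2", "object 1"},
    robot_location="object 1",
    now=3,
    travel_time=lambda src, dst: hops[(src, dst)],
)
```

In this example `next_target` is `"object 3"`, because the robot can be there at time step 5, one step before the thief's predicted arrival at step 6. When `replan` returns `None`, a natural fallback (again an assumption) would be to head for the escape possibility the thief is most likely to use.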