Lecture 2
Al Capone was finally convicted for tax evasion. Were the police acting rationally? To answer this, we must first look at how the performance of police forces is viewed: arresting and convicting the people who have committed a crime is a start, but their success in getting criminals off the street is also a reasonable, if contentious, measure. Given that they didn't convict Capone for the murders he committed, they failed on that measure. However, they did get him off the street, so they succeeded there. We must also look at what the police knew and what they had experienced about the environment: they had experienced murders which they knew were undertaken by Capone, but they had no evidence which could convict him of those murders. They did, however, have evidence of tax evasion. Given their knowledge of the environment, namely that they can only arrest when they have evidence, their actions were limited to arresting Capone for tax evasion. As this got him off the street, we could say they were acting rationally. This answer is controversial, and it highlights why we have to think hard about how to assess the rationality of an agent before we consider building it.
To summarise, an agent takes input from its environment and affects that environment. The rational performance of an agent must be assessed in terms of the task it was meant to undertake, its knowledge and experience of the environment, and the actions it was actually able to undertake. This performance should be measured objectively, independently of any internal measures used by the agent.
In English language usage, autonomy means an ability to govern one's actions independently. In our situation, we need to specify the extent to which an agent's behaviour is affected by its environment. Following Russell and Norvig, we say that an agent is autonomous to the extent that its behaviour is determined by its own experience, rather than solely by its built-in knowledge.
At one extreme, an agent might never pay any attention to the input from its environment, in which case, its actions are determined entirely by its built-in knowledge. At the other extreme, if an agent does not initially act using its built-in knowledge, it will have to act randomly, which is not desirable. Hence, it is desirable to have a balance between complete autonomy and no autonomy. Thinking of human agents, we are born with certain reflexes which govern our actions to begin with. However, through our ability to learn from our environment, we begin to act more autonomously as a result of our experiences in the world. Imagine a baby learning to crawl around. It must use in-built information to enable it to correctly employ its arms and legs, otherwise it would just thrash around. However, as it moves, and bumps into things, it learns to avoid objects in the environment. When we leave home, we are (supposed to be) fully autonomous agents ourselves. We should expect similar of the agents we build for AI tasks: their autonomy increases in line with their experience of the environment.
We will mostly be dealing with agents based inside computers, rather than robots based in the real world. However, the museum tour guide robot mentioned in the first lecture offers an ideal example of an autonomous agent which we will use to illustrate various concepts in the rest of this lecture. This robot was called RHINO.
RHINO's job was to inform visitors to the museum about various exhibits. To do this, it had to perform two main tasks: (i) move safely from exhibit to exhibit, and (ii) display information and answer questions about each exhibit it visited. The project was very successful: in an operational time of 47 hours, covering 18.6 kilometres, the software made only one mistake, a minor collision which caused no harm.
We have looked at agents in terms of their external influences and behaviours: they take input from the environment and perform rational actions to alter that environment. We will now look at some generic internal mechanisms which are common to intelligent agents.
The program of an agent is the mechanism by which it turns input from the environment into an action on the environment. The architecture of an agent is the computing device (including software and hardware) upon which the program operates. On this course, we mostly concern ourselves with the intelligence behind the programs, and do not worry about the hardware architectures they run on. In fact, we will mostly assume that the architecture of our agents is a computer getting input through the keyboard and acting via the monitor.
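To make the program/architecture split concrete, here is a minimal sketch in Python of a keyboard-and-monitor agent of the kind just described. All of the names (Agent, program, percept) are illustrative choices, not from any standard library, and the decision-making is just a placeholder.

class Agent:
    """An agent program: maps percepts from the environment to actions."""

    def __init__(self):
        self.knowledge = {}  # built-in and learned knowledge of the world

    def program(self, percept):
        """Turn an input from the environment into an action on it."""
        self.knowledge["last_percept"] = percept
        return "echo: " + percept  # placeholder decision-making

# The "architecture" here is simply a keyboard-and-monitor loop, as
# assumed in the lecture.
if __name__ == "__main__":
    agent = Agent()
    percept = input("> ")          # input arrives via the keyboard
    print(agent.program(percept))  # the action appears via the monitor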
RHINO consisted of the robot itself, including the necessary hardware for locomotion (motors, etc.) and state-of-the-art sensors, including laser, sonar, infrared and tactile sensors. RHINO also carried around three on-board PC workstations and was connected by a wireless Ethernet connection to a further three off-board SUN workstations. In total, it ran up to 25 different processes at any one time, in parallel. The program employed by RHINO was even more complicated than the architecture upon which it ran. RHINO ran software which drew upon techniques ranging from low-level probabilistic reasoning and visual information processing to high-level problem solving and planning using logical representations.
An agent's program will make use of knowledge about its environment and methods for deciding which action to take (if any) in response to a new input from the environment. These methods include reflexes, goal-based methods and utility-based methods.
We must distinguish between knowledge an agent receives through its sensors and knowledge about the world from which the input comes. Knowledge about the world can be programmed in, and/or it can be learned through the sensor input. For example, a chess-playing agent would be programmed with the positions of the pieces at the start of a game, but would maintain a representation of the entire board by updating it with every move it is told about through the input it receives. Note that the sensor inputs are the opponent's moves, and these are different from the knowledge of the world that the agent maintains, which is the board state.
There are three main ways in which an agent can use knowledge of its world to inform its actions. If an agent maintains a representation of the world, then it can use this information to decide how to act at any given time. Furthermore, if it stores its representations of the world, then it can also use information about previous world states in its program. Finally, it can use knowledge about how its actions affect the world.
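As a small illustration of these three uses of world knowledge, consider the following Python sketch of the chess example above. The board representation and move format are deliberately simplified inventions, for illustration only.

class BoardTracker:
    def __init__(self, initial_position):
        self.state = dict(initial_position)  # programmed-in starting knowledge
        self.history = []                    # stored previous world states

    def update(self, move):
        """A sensor input (an opponent's move) updates the world model."""
        self.history.append(dict(self.state))
        src, dst = move
        self.state[dst] = self.state.pop(src)

    def predict(self, move):
        """Knowledge of how an action would affect the world, without acting."""
        imagined = dict(self.state)
        src, dst = move
        imagined[dst] = imagined.pop(src)
        return imagined

tracker = BoardTracker({"e2": "white pawn", "e7": "black pawn"})
tracker.update(("e7", "e5"))          # told about the opponent's move
print(tracker.predict(("e2", "e4")))  # imagine the effect of our own move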
The RHINO agent was provided with an accurate metric map of the museum and exhibits beforehand, carefully mapped out by the programmers. Having said this, the layout of the museum changed frequently as routes became blocked and chairs were moved. By updating its knowledge of the environment, however, RHINO consistently knew where it was, to an accuracy better than 15cm. RHINO didn't move objects other than itself around the museum. However, as it moved around, people followed it, so its actions really were altering the environment. It was because of this (and other reasons) that the designers of RHINO made sure it updated its plan as it moved around.
If an agent decides upon and executes an action in response to a sensor input without consulting its representation of the world, then this can be considered a reflex response. Humans flinch if they touch something very hot, regardless of the particular social situation they are in, and this is clearly a reflex action. Similarly, chess agents are programmed with lookup tables for openings and endings, so that they do not have to do any processing to choose the correct move: they simply look it up. In timed chess matches, this kind of reflex action might save vital seconds to be used in more difficult situations later.
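A reflex of this kind can be sketched as a simple lookup. The tiny opening book below is an invented illustration; real chess programs index far richer descriptions of the game state.

OPENING_BOOK = {
    (): "e2e4",                # our first move as white
    ("e2e4", "c7c5"): "g1f3",  # a standard reply in the Sicilian
}

def reflex_move(moves_so_far):
    """Return a move with no deliberation, or None if no reflex applies."""
    return OPENING_BOOK.get(tuple(moves_so_far))

print(reflex_move([]))                # e2e4, looked up instantly
print(reflex_move(["e2e4", "c7c5"]))  # g1f3
print(reflex_move(["d2d4"]))          # None: fall back to deliberation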
Unfortunately, relying on lookup tables is not a sensible way to program intelligent agents in general: to cover every position, a chess agent would need around 35^100 entries in its lookup table (assuming an average of 35 choices per move over games of around 100 moves), which is considerably more entries than there are atoms in the universe. And if we remember that the world of a chess agent consists of only 32 pieces on 64 squares, it's obvious that we need more intelligent means of choosing a rational action.
For RHINO, it is difficult to identify any reflex actions. This is probably because performing an action without consulting the world representation is potentially dangerous for RHINO, because people get everywhere, and museum exhibits are expensive to replace if broken!
One possible way to improve an agent's performance is to enable it to have some details of what it is trying to achieve. If it is given some representation of the goal (e.g., some information about the solution to a problem it is trying to solve), then it can refer to that information to see if a particular action will lead to that goal. Such agents are called goal-based. Two tried and trusted methods for goal-based agents are planning (where the agent puts together and executes a plan for achieving its goal) and search (where the agent looks ahead in a search space until it finds the goal). Planning and search methods are covered later in the course.
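As a small sketch of the search approach, the following Python program looks ahead through an invented map of rooms (purely illustrative) until the goal test succeeds, using breadth-first search, one of the methods covered later in the course.

from collections import deque

ROOMS = {  # hypothetical layout: which rooms connect to which
    "entrance": ["hall"],
    "hall": ["entrance", "exhibit A", "exhibit B"],
    "exhibit A": ["hall"],
    "exhibit B": ["hall", "exhibit C"],
    "exhibit C": ["exhibit B"],
}

def search_for_goal(start, goal):
    """Look ahead through the search space until the goal is found."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:  # goal test: is this the state we want?
            return path
        for neighbour in ROOMS[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None

print(search_for_goal("entrance", "exhibit C"))
# ['entrance', 'hall', 'exhibit B', 'exhibit C']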
In RHINO, there were two goals: get the robot to an exhibit chosen by the visitors and, when it gets there, provide information about the exhibit. Obviously, RHINO used information about its goal of getting to an exhibit to plan its route to that exhibit.
A purely goal-based agent for playing chess is infeasible: every time it decided which move to play next, it would have to search ahead to see whether that move would eventually lead to a checkmate. Instead, it is better for the agent to assess its progress not against the overall goal, but against a localised measure. Agents' programs often have a utility function which calculates a numerical value for each world state the agent would find itself in if it undertook a particular action. The agent can then check which action would lead to the highest value being returned from the set of actions it has available. Usually the best action with respect to the utility function is taken, as this is the rational thing to do. When the task of the agent is to find something by searching, and it uses a utility function in this manner, this is known as a best-first search.
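The following sketch shows a best-first search driven by a utility function. The graph and the utility values are invented for illustration; the point is only that the frontier is ordered by utility rather than by checking the distant goal at every step.

import heapq

GRAPH = {
    "start": ["a", "b"],
    "a": ["goal"],
    "b": ["goal"],
    "goal": [],
}
UTILITY = {"start": 0, "a": 5, "b": 2, "goal": 10}  # higher is better

def best_first(start, goal):
    # heapq pops the smallest item first, so utilities are negated to
    # make the highest-utility state come off the frontier first
    frontier = [(-UTILITY[start], [start])]
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return path
        if state in visited:
            continue
        visited.add(state)
        for successor in GRAPH[state]:
            heapq.heappush(frontier, (-UTILITY[successor], path + [successor]))
    return None

print(best_first("start", "goal"))  # expands 'a' before 'b'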
RHINO searched for paths from its current location to an exhibit, using the distance to the exhibit as a utility function (with shorter distances scoring higher). However, this was complicated by visitors getting in the way.
We have seen that an intelligent agent should take into account certain information when choosing a rational action, including information from its sensors, information from the world, information from previous states of the world, information from its goal and information from its utility function(s). We also need to take into account some specifics about the environment it works in. On the surface, this consideration would appear to apply more to robotic agents moving around the real world. However, the considerations also apply to software agents which receive data and make decisions which affect the data they receive - in this case we can think of the environment as the flow of information in the data stream. For example, an AI agent may be employed to dynamically update web pages based on the requests from internet users.
We follow Russell and Norvig's lead in characterising information about the environment:
In some cases, certain aspects of an environment which should be taken into account in decisions about actions may be unavailable to the agent. This could happen, for instance, because the agent cannot sense certain things. In these cases, we say the environment is partially inaccessible, and the agent may have to make (informed) guesses about the inaccessible data in order to act rationally.
The builders of RHINO talk about "invisible" objects that RHINO had to deal with. These included glass cases and bars at various heights which could not be detected by the robotic sensors. These are clearly inaccessible aspects of the environment, and RHINO's designers took this into account when designing its programs.
If we can determine what the exact state of the world will be after an agent's action, we say the environment is deterministic. In such cases, the state of the world after an action is dependent only on the state of the world before the action and the choice of action. If the environment is non-deterministic, then utility functions will have to make (informed) guesses about the expected state of the world after possible actions if the agent is to correctly choose the best one.
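A small Python sketch of this idea follows: in a non-deterministic setting, the agent weights each possible outcome of an action by its probability and picks the action with the highest expected utility. The actions, probabilities and utilities are invented numbers, purely for illustration.

ACTIONS = {
    # action: list of (probability, utility of the resulting world state)
    "take corridor": [(0.7, 10), (0.3, -5)],  # may be blocked by visitors
    "take long way": [(1.0, 4)],              # slower but reliable
}

def expected_utility(outcomes):
    """Weight each possible resulting state by its probability."""
    return sum(p * u for p, u in outcomes)

best = max(ACTIONS, key=lambda a: expected_utility(ACTIONS[a]))
print(best, expected_utility(ACTIONS[best]))  # take corridor 5.5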
RHINO's world was non-deterministic because people moved around, and they moved objects such as chairs around. In fact, visitors often tried to trick the robot by setting up roadblocks with chairs. This was another reason why RHINO's plan was constantly updated.
If an agent's current choice of action does not depend on its past actions, then the environment is said to be episodic. In non-episodic environments, the agent will have to plan ahead, because its current action will affect subsequent ones.
Considering only the goal of getting to and from exhibits, the individual trips between exhibits can be seen as episodes in RHINO's actions. Once it had arrived at one exhibit, how it got there would not normally affect its choices in getting to the next exhibit. If we also consider the goal of giving a guided tour, however, RHINO must at least remember the exhibits it had already visited, in order not to repeat itself. So, at the top level, its actions were not episodic.
An environment is static if it doesn't change while an agent's program is making the decision about how to act. When designing agents to operate in dynamic (non-static) environments, the underlying program may have to refer to the changing environment while it deliberates, or to anticipate the change in the environment between the time when it receives an input and when it has to take an action.
RHINO was very fast in making decisions, but because of the amount of visitor movement, by the time it had planned a route, that plan was sometimes wrong because someone was now blocking it. Because decisions could be made so quickly, however, rather than having RHINO refer to the environment during the planning process, its designers chose (as we have said before) to enable it to continually update its plan as it moved.
The nature of the data coming in from the environment will affect how the agent should be designed. In particular, the data may be discrete (composed of a limited number of clearly defined parts) or continuous (seemingly without discernible sections). Of course, given the nature of computer memory (in bits and bytes), even streaming video can be shoe-horned into the discrete category, but an intelligent agent will probably have to deal with this as if it is continuous. The mathematics in your agent's programs will differ depending on whether the data is taken to be discrete or continuous.
RHINO's data came from 3D space, and hence was treated as continuous.
The word 'agent' is extremely popular in AI at the moment, and you will come across research on multi-agent systems. In this approach to AI, a task is broken into subtasks which have to be undertaken simultaneously, and each subtask is given to a different autonomous agent. The agents can communicate in order to co-operate and compete on their tasks. This approach has been shown to be very effective on certain problems and is currently very influential in AI. I recommend Mike Wooldridge's book An Introduction to MultiAgent Systems as an excellent introductory text.