Lecture 2
Al Capone was finally convicted for tax evasion. Were the police acting rationally? To answer this, we must first look at how the performance of police forces is viewed: arresting and convicting the people who have committed a crime is a start, but their success in getting criminals off the street is also a reasonable, if contentious, measure. Given that they didn't convict Capone for the murders he committed, they failed on that measure. However, they did get him off the street, so they succeeded there. We must also look at what the police knew and what they had experienced about the environment: they had experienced murders which they knew were undertaken by Capone, but they had no evidence which could convict him of those murders. They did, however, have evidence of tax evasion. Given their knowledge of the environment, namely that they can only arrest when they have evidence, their actions were limited to arresting Capone for tax evasion. As this got him off the street, we could say they were acting rationally. This answer is controversial, and it highlights why we have to think hard about how to assess the rationality of an agent before we consider building it.
To summarise, an agent takes input from its environment and affects that environment. The rational performance of an agent must be assessed in terms of the task it was meant to undertake, its knowledge and experience of the environment, and the actions it was actually able to undertake. This performance should be measured objectively, independently of any internal measures used by the agent.
In English language usage, autonomy means an ability to govern one's actions independently. In our situation, we need to specify the extent to which an agent's behaviour is affected by its environment. Following Russell and Norvig, we say that an agent is autonomous to the extent that its behaviour is determined by its own experience, rather than solely by its built-in knowledge.
At one extreme, an agent might never pay any attention to the input from its environment, in which case, its actions are determined entirely by its built-in knowledge. At the other extreme, if an agent does not initially act using its built-in knowledge, it will have to act randomly, which is not desirable. Hence, it is desirable to have a balance between complete autonomy and no autonomy. Thinking of human agents, we are born with certain reflexes which govern our actions to begin with. However, through our ability to learn from our environment, we begin to act more autonomously as a result of our experiences in the world. Imagine a baby learning to crawl around. It must use in-built information to enable it to correctly employ its arms and legs, otherwise it would just thrash around. However, as it moves, and bumps into things, it learns to avoid objects in the environment. When we leave home, we are (supposed to be) fully autonomous agents ourselves. We should expect similar of the agents we build for AI tasks: their autonomy increases in line with their experience of the environment.
We will mostly be dealing with agents based inside computers, rather than robots based in the real world. However, the museum tour guide robot mentioned in the first lecture offers an ideal example of an autonomous agent which we will use to illustrate various concepts in the rest of this lecture. This robot was called RHINO.
RHINO's job was to inform visitors to the museum about various exhibits. To do this, it had to perform two main tasks: (i) move safely from exhibit to exhibit, and (ii) display information and answer questions about each exhibit it visited. The project was very successful: in an operational time of 47 hours, covering 18.6 kilometres, the software made only one mistake, a minor collision which caused no harm.
We have looked at agents in terms of their external influences and behaviours: they take input from the environment and perform rational actions to alter that environment. We will now look at some generic internal mechanisms which are common to intelligent agents.
The program of an agent is the mechanism by which it turns input from the environment into an action on the environment. The architecture of an agent is the computing device (including software and hardware) upon which the program operates. On this course, we mostly concern ourselves with the intelligence behind the programs, and do not worry about the hardware architectures they run on. In fact, we will mostly assume that the architecture of our agents is a computer getting input through the keyboard and acting via the monitor.
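To make the program/architecture split concrete, here is a minimal sketch in Python of a keyboard-and-monitor agent of the kind just described. All of the names (Agent, program, percept) are illustrative choices, not from any standard library, and the decision-making is just a placeholder.

class Agent:
    """An agent program: maps percepts from the environment to actions."""

    def __init__(self):
        self.knowledge = {}  # built-in and learned knowledge of the world

    def program(self, percept):
        """Turn an input from the environment into an action on it."""
        self.knowledge["last_percept"] = percept
        return "echo: " + percept  # placeholder decision-making

# The "architecture" here is simply a keyboard-and-monitor loop, as
# assumed in the lecture.
if __name__ == "__main__":
    agent = Agent()
    percept = input("> ")          # input arrives via the keyboard
    print(agent.program(percept))  # the action appears via the monitor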
RHINO consisted of the robot itself, including the necessary hardware for locomotion (motors, etc.) and state-of-the-art sensors, including laser, sonar, infrared and tactile sensors. RHINO also carried around three on-board PC workstations and was connected by a wireless Ethernet connection to a further three off-board SUN workstations. In total, it ran up to 25 different processes at any one time, in parallel. The program employed by RHINO was even more complicated than the architecture upon which it ran. RHINO ran software which drew upon techniques ranging from low-level probabilistic reasoning and visual information processing to high-level problem solving and planning using logical representations.
An agent's program will make use of knowledge about its environment and methods for deciding which action to take (if any) in response to a new input from the environment. These methods include reflexes, goal-based methods and utility-based methods.
We must distinguish between knowledge an agent receives through its sensors and knowledge about the world from which the input comes. Knowledge about the world can be programmed in, and/or it can be learned through the sensor input. For example, a chess-playing agent would be programmed with the positions of the pieces at the start of a game, but would maintain a representation of the entire board by updating it with every move it is told about through the input it receives. Note that the sensor inputs are the opponent's moves, and these are different from the knowledge of the world that the agent maintains, which is the board state.
There are three main ways in which an agent can use knowledge of its world to inform its actions. If an agent maintains a representation of the world, then it can use this information to decide how to act at any given time. Furthermore, if it stores its representations of the world, then it can also use information about previous world states in its program. Finally, it can use knowledge about how its actions affect the world.
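As a small illustration of these three uses of world knowledge, consider the following Python sketch of the chess example above. The board representation and move format are deliberately simplified inventions, for illustration only.

class BoardTracker:
    def __init__(self, initial_position):
        self.state = dict(initial_position)  # programmed-in starting knowledge
        self.history = []                    # stored previous world states

    def update(self, move):
        """A sensor input (an opponent's move) updates the world model."""
        self.history.append(dict(self.state))
        src, dst = move
        self.state[dst] = self.state.pop(src)

    def predict(self, move):
        """Knowledge of how an action would affect the world, without acting."""
        imagined = dict(self.state)
        src, dst = move
        imagined[dst] = imagined.pop(src)
        return imagined

tracker = BoardTracker({"e2": "white pawn", "e7": "black pawn"})
tracker.update(("e7", "e5"))          # told about the opponent's move
print(tracker.predict(("e2", "e4")))  # imagine the effect of our own move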
The RHINO agent was provided with an accurate metric map of the museum and exhibits beforehand, carefully mapped out by the programmers. Having said this, the layout of the museum changed frequently as routes became blocked and chairs were moved. By updating its knowledge of the environment, however, RHINO consistently knew where it was, to an accuracy better than 15cm. RHINO didn't move objects other than itself around the museum. However, as it moved around, people followed it, so its actions really were altering the environment. It was because of this (and other reasons) that the designers of RHINO made sure it updated its plan as it moved around.
If an agent decides upon and executes an action in response to a sensor input without consulting its representation of the world, then this can be considered a reflex response. Humans flinch if they touch something very hot, regardless of the particular social situation they are in, and this is clearly a reflex action. Similarly, chess agents are programmed with lookup tables for openings and endings, so that they do not have to do any processing to choose the correct move: they simply look it up. In timed chess matches, this kind of reflex action might save vital seconds to be used in more difficult situations later.
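A reflex of this kind can be sketched as a simple lookup. The tiny opening book below is an invented illustration; real chess programs index far richer descriptions of the game state.

OPENING_BOOK = {
    (): "e2e4",                # our first move as white
    ("e2e4", "c7c5"): "g1f3",  # a standard reply in the Sicilian
}

def reflex_move(moves_so_far):
    """Return a move with no deliberation, or None if no reflex applies."""
    return OPENING_BOOK.get(tuple(moves_so_far))

print(reflex_move([]))                # e2e4, looked up instantly
print(reflex_move(["e2e4", "c7c5"]))  # g1f3
print(reflex_move(["d2d4"]))          # None: fall back to deliberation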
Unfortunately, relying on lookup tables is not a sensible way to program intelligent agents in general: to cover every position, a chess agent would need around 35^100 entries in its lookup table (assuming an average of 35 choices per move over games of around 100 moves), which is considerably more entries than there are atoms in the universe. And if we remember that the world of a chess agent consists of only 32 pieces on 64 squares, it's obvious that we need more intelligent means of choosing a rational action.
For RHINO, it is difficult to identify any reflex actions. This is probably because performing an action without consulting the world representation is potentially dangerous for RHINO, because people get everywhere, and museum exhibits are expensive to replace if broken!
One possible way to improve an agent's performance is to enable it to have some details of what it is trying to achieve. If it is given some representation of the goal (e.g., some information about the solution to a problem it is trying to solve), then it can refer to that information to see if a particular action will lead to that goal. Such agents are called goal-based. Two tried and trusted methods for goal-based agents are planning (where the agent puts together and executes a plan for achieving its goal) and search (where the agent looks ahead in a search space until it finds the goal). Planning and search methods are covered later in the course.
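As a small sketch of the search approach, the following Python program looks ahead through an invented map of rooms (purely illustrative) until the goal test succeeds, using breadth-first search, one of the methods covered later in the course.

from collections import deque

ROOMS = {  # hypothetical layout: which rooms connect to which
    "entrance": ["hall"],
    "hall": ["entrance", "exhibit A", "exhibit B"],
    "exhibit A": ["hall"],
    "exhibit B": ["hall", "exhibit C"],
    "exhibit C": ["exhibit B"],
}

def search_for_goal(start, goal):
    """Look ahead through the search space until the goal is found."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:  # goal test: is this the state we want?
            return path
        for neighbour in ROOMS[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None

print(search_for_goal("entrance", "exhibit C"))
# ['entrance', 'hall', 'exhibit B', 'exhibit C']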
In RHINO, there were two goals: get the robot to an exhibit chosen by the visitors and, when it gets there, provide information about the exhibit. Obviously, RHINO used information about its goal of getting to an exhibit to plan its route to that exhibit.
A purely goal-based agent for playing chess is infeasible: every time it decided which move to play next, it would have to search ahead to see whether that move would eventually lead to a checkmate. Instead, it is better for the agent to assess its progress not against the overall goal, but against a localised measure. Agents' programs often have a utility function which calculates a numerical value for each world state the agent would find itself in if it undertook a particular action. The agent can then check which action would lead to the highest value being returned from the set of actions it has available. Usually the best action with respect to the utility function is taken, as this is the rational thing to do. When the task of the agent is to find something by searching, and it uses a utility function in this manner, this is known as a best-first search.
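The following sketch shows a best-first search driven by a utility function. The graph and the utility values are invented for illustration; the point is only that the frontier is ordered by utility rather than by checking the distant goal at every step.

import heapq

GRAPH = {
    "start": ["a", "b"],
    "a": ["goal"],
    "b": ["goal"],
    "goal": [],
}
UTILITY = {"start": 0, "a": 5, "b": 2, "goal": 10}  # higher is better

def best_first(start, goal):
    # heapq pops the smallest item first, so utilities are negated to
    # make the highest-utility state come off the frontier first
    frontier = [(-UTILITY[start], [start])]
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return path
        if state in visited:
            continue
        visited.add(state)
        for successor in GRAPH[state]:
            heapq.heappush(frontier, (-UTILITY[successor], path + [successor]))
    return None

print(best_first("start", "goal"))  # expands 'a' before 'b'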
RHINO searched for paths from its current location to an exhibit, using the distance to the exhibit as a utility function (with shorter distances scoring higher). However, this was complicated by visitors getting in the way.
We have seen that an intelligent agent should take into account certain information when choosing a rational action, including information from its sensors, information from the world, information from previous states of the world, information from its goal and information from its utility function(s). We also need to take into account some specifics about the environment it works in. On the surface, this consideration would appear to apply more to robotic agents moving around the real world. However, the considerations also apply to software agents which receive data and make decisions which affect the data they receive - in this case we can think of the environment as the flow of information in the data stream. For example, an AI agent may be employed to dynamically update web pages based on the requests from internet users.
We follow Russell and Norvig's lead in characterising information about the environment:
In some cases, certain aspects of an environment which should be taken into account in decisions about actions may be unavailable to the agent. This could happen, for instance, because the agent cannot sense certain things. In these cases, we say the environment is partially inaccessible, and the agent may have to make (informed) guesses about the inaccessible data in order to act rationally.
The builders of RHINO talk about "invisible" objects that RHINO had to deal with. These included glass cases and bars at various heights which could not be detected by the robotic sensors. These are clearly inaccessible aspects of the environment, and RHINO's designers took this into account when designing its programs.
If we can determine what the exact state of the world will be after an agent's action, we say the environment is deterministic. In such cases, the state of the world after an action is dependent only on the state of the world before the action and the choice of action. If the environment is non-deterministic, then utility functions will have to make (informed) guesses about the expected state of the world after possible actions if the agent is to correctly choose the best one.
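A small Python sketch of this idea follows: in a non-deterministic setting, the agent weights each possible outcome of an action by its probability and picks the action with the highest expected utility. The actions, probabilities and utilities are invented numbers, purely for illustration.

ACTIONS = {
    # action: list of (probability, utility of the resulting world state)
    "take corridor": [(0.7, 10), (0.3, -5)],  # may be blocked by visitors
    "take long way": [(1.0, 4)],              # slower but reliable
}

def expected_utility(outcomes):
    """Weight each possible resulting state by its probability."""
    return sum(p * u for p, u in outcomes)

best = max(ACTIONS, key=lambda a: expected_utility(ACTIONS[a]))
print(best, expected_utility(ACTIONS[best]))  # take corridor 5.5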
RHINO's world was non-deterministic because people moved around, and they moved objects such as chairs around. In fact, visitors often tried to trick the robot by setting up roadblocks with chairs. This was another reason why RHINO's plan was constantly updated.
If an agent's current choice of action does not depend on its past actions, then the environment is said to be episodic. In non-episodic environments, the agent will have to plan ahead, because its current action will affect subsequent ones.
Considering only the goal of getting to and from exhibits, the individual trips between exhibits can be seen as episodes in RHINO's actions. Once it had arrived at one exhibit, how it got there would not normally affect its choices in getting to the next exhibit. If we also consider the goal of giving a guided tour, however, RHINO must at least remember the exhibits it had already visited, in order not to repeat itself. So, at the top level, its actions were not episodic.
An environment is static if it doesn't change while an agent's program is making the decision about how to act. When designing agents to operate in dynamic (non-static) environments, the underlying program may have to refer to the changing environment while it deliberates, or to anticipate the change in the environment between the time when it receives an input and when it has to take an action.
RHINO was very fast in making decisions, but because of the amount of visitor movement, by the time it had planned a route, that plan was sometimes wrong because someone was now blocking it. Because decisions could be made so quickly, however, rather than having RHINO refer to the environment during the planning process, its designers chose (as we have said before) to enable it to continually update its plan as it moved.
The nature of the data coming in from the environment will affect how the agent should be designed. In particular, the data may be discrete (composed of a limited number of clearly defined parts) or continuous (seemingly without discernible sections). Of course, given the nature of computer memory (in bits and bytes), even streaming video can be shoe-horned into the discrete category, but an intelligent agent will probably have to deal with this as if it is continuous. The mathematics in your agent's programs will differ depending on whether the data is taken to be discrete or continuous.
RHINO's data came from 3D space, and hence was treated as continuous.
The word 'agent' is extremely popular in AI at the moment, and you will come across research on multi-agent systems. In this approach to AI, a task is broken into subtasks which have to be undertaken simultaneously, and each subtask is given to a different autonomous agent. The agents can communicate in order to co-operate and compete on their tasks. This approach has been shown to be very effective on certain problems and is currently very influential in AI. I recommend Mike Wooldridge's book An Introduction to MultiAgent Systems as an excellent introductory text.