1. Introduction
Tyrrell's Simulated Environment provides a model of a creature in a well-populated environment, where the creature is presented with the problem of surviving long enough to mate sufficiently often to assist the survival of its species. The environment is also populated with feline and avian predators, as well as prey, potential mates and other "irrelevant" animals that fall into none of the above categories. Fruit and cereal food is freely available in the environment, as is water. All the different types of food, including prey, provide the animal with its requirements of protein, fat and carbohydrate. Should the animal fail to eat or drink enough, fail to clean itself enough, encounter a dangerous place such as a swamp, or meet a predator, its health will suffer, and it will not live as long as it would otherwise have done. The problem for the animal, then, is to satisfy each of its goals to a sufficient degree to avoid suffering ill health. This is made all the more difficult by the fact that the animal has limited perceptions, which are often incorrect, and a fuzzy memory of where it is and has been.
A snapshot of the animal in its environment is given in the image below:
[Image: snapshot of the animal in its environment]
The full text of Tyrrell's PhD thesis, which describes the Simulated Environment in full, is available here.
The remainder of this page provides a summary of the information required in order to understand the structure of the environment and the nature of the animal's goals.
Tyrrell states that the basic task of an animal's brain can be divided into four parts: perception, navigation, action selection and motor control.
The model of the animal provided by the Simulated Environment software provides the animal's perception, navigation and motor control. The remaining task is to build the action selection mechanism (ASM).
2.1 Perception
The perception process is described in sections 2.5 and 3.3 of Tyrrell's thesis. The following is a brief extract from the thesis.
"This process transforms a large amount of low-level sensory data (e.g. signals from numerous different smell receptors, signals from retinal rods and cones) into a higher-level description of the environment in terms of features and their positions. Animals use a wide variety of different senses to obtain information about the environment (e.g. sight, sound, taste, smell, touch, echo-location, heat-detection). The main assumption made here is that the animal, using whatever senses, can only sense a fairly restricted area around it (i.e. a local area of the environment). The model used here is of an error-prone process. For every feature in the area of the SE local to the animal, a probability of incorrect perception is calculated. If a randomly distributed number between 0.0 and 1.0 is less than this probability then the perception of the feature is distorted and the animal will either perceive nothing there or mistakenly perceive a different type of feature. In some cases the animal will also perceive features where none exist. The probability of incorrect perception is affected by intervening vegetation, distance from the animal, the selected action of the animal, and the time of day.
The animal can also sense the values of its internal variables (e.g. food and water deficits and body temperature), again with a degree of noise built into the perception."
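The error model in the extract above can be sketched as follows. This is a minimal illustration, not Tyrrell's code: the function name `perceive`, the feature labels, and the even split between perceiving nothing and perceiving a wrong feature are all assumptions. The thesis derives the error probability from vegetation, distance, selected action and time of day; here it is simply passed in.

```python
import random

# Hypothetical feature labels; the real SE defines its own set.
FEATURES = ["water", "cereal_food", "fruit_food", "cover"]

def perceive(actual, p_error, rng):
    """Return what the animal perceives for one feature in a local cell.

    actual  -- the true feature in the cell
    p_error -- probability of incorrect perception (assumed precomputed)
    rng     -- a random.Random instance
    """
    if rng.random() < p_error:
        # Distorted: either nothing is perceived or a different feature is.
        # The 50/50 split is an assumption made for illustration only.
        if rng.random() < 0.5:
            return None
        return rng.choice([f for f in FEATURES if f != actual])
    return actual
```

Passing the random number generator in explicitly keeps the sketch reproducible when a fixed seed is used.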
2.2 Navigation
The navigation process is described in sections 2.5 and 3.4 of Tyrrell's thesis. The following is a brief extract from the thesis.
"The term navigation is used here to cover the creation and usage of a ‘map’ (a collection of memories of the positions and attributes of various features), as well as the estimation by the animal of its current position, and the calculation of how to move towards a remembered feature. The animal forms error-prone memories of features based on its perception of them and on the animal’s estimate of its own position when it encounters them. The variance of the animal’s estimate of its position increases each time it moves, but can decrease when the animal encounters recognised features. The memories will become more accurate as the animal visits the features more often.
...although the navigational inputs to an ASM will change through the animal’s lifetime as it becomes more acquainted with the layout of its environment, this does not mean that any learning takes place...
There is a limit to how many features can be remembered at any one time. The strength of a memory is dependent on the utility of the feature, the number of times it has been visited and how long since it was last visited. Lower strength memories can be removed from the map (be ‘forgotten’) as newer, stronger ones are added. The animal can use its memories of features to head towards a remembered feature, though the likelihood of finding it depends on how well it has remembered the feature’s position and on how well it knows its own position, as well as on the existence of hazards between the animal and the feature.
In summary, the map consists of error-prone memories which can be forgotten over time and which can not always be used successfully to find the feature they represent."
2.3 Motor Control
The motor control process is described in sections 2.5 and 3.5 of Tyrrell's thesis. The following is a brief extract from the thesis.
"This process models the transformation from a chosen action into a set of lower level motor commands that bring about movements of the animal’s body. In the model used here the process is once again error-prone. There is a fixed probability of an action not being successfully executed. This probability only increases if the animal is incapacitated (has very bad health). The low-level details of motor control are not simulated, only the end effects of each action (e.g. an increase in the level of the animal’s internal water and a decrease in level of a water source after the animal drinks there, a change in the position of the animal and an increase in the animal’s internal temperature when it moves fast)."
3. Environment
The environment consists of 625 cells arranged into a square of 25x25. Each cell may contain one physical feature, and any number of animal features. The following are the nine physical features that can occupy a cell.
In addition to the animal itself, there are five other types of animal features.
Each day in the environment consists of 500 timesteps. The first 1/6 and last 1/6 of the day are considered to be night time. Sunrise occupies the first 1/12 of the day after night, and sunset occupies the last 1/12 of the day before night. Morning, midday and afternoon each take one third of the remaining time.
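The day structure described above works out to night for the first and last 1/6, sunrise and sunset for 1/12 each, and 1/6 each for morning, midday and afternoon (the remaining half of the day split three ways). A minimal sketch of that mapping follows. The function name `phase_of_day` is hypothetical, timestep 0 is assumed to be the start of the day, and exact fractions are used because the boundaries do not fall on whole timesteps; how the SE itself rounds them is not stated here.

```python
from fractions import Fraction

TIMESTEPS_PER_DAY = 500

# Upper bound of each phase as a fraction of the day.
PHASE_ENDS = [
    (Fraction(1, 6), "night"),
    (Fraction(1, 4), "sunrise"),
    (Fraction(5, 12), "morning"),
    (Fraction(7, 12), "midday"),
    (Fraction(3, 4), "afternoon"),
    (Fraction(5, 6), "sunset"),
    (Fraction(1, 1), "night"),
]

def phase_of_day(timestep):
    """Map a running timestep counter to the name of the current phase."""
    position = Fraction(timestep % TIMESTEPS_PER_DAY, TIMESTEPS_PER_DAY)
    for end, name in PHASE_ENDS:
        if position < end:
            return name
    raise AssertionError("unreachable: position is always below 1")
```

For example, `phase_of_day(250)` falls in the middle of the day and returns `"midday"`.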

At every timestep the environment is updated as features are added, moved and removed. The animal can perceive the cells in its vicinity up to a radius of three cells. This perception can be impaired by the time of day or night, the degree to which the cells are obscured by each other, and the animal's health.
The animal starts with a health of 1.0 and dies when its health reaches 0. Its main goal is to mate, but in order to do that it must find a mate, maintain its health and avoid predators. Given the complexity of the environment and the goals of the animal, the animal is faced with the following twelve problems (the majority of the text below is taken from Tyrrell's thesis).
4.1 Cleaning
This sub-problem models the need of animals in the wild to maintain their feathers, fur or skin in a clean and parasite-free state. Lack of preening or cleaning or grooming can lead to difficulties caused by less effective insulation, infection of wounds, infestation with parasites, etc. The animal is provided with a low-level action CLEAN, and an internal variable cleanliness which can vary in the range 0.0 (maximally dirty/dishevelled/parasite-ridden) to 1.0 (maximally clean). At every timestep cleanliness is decreased by an average of 0.001. On every occasion that the animal selects the action CLEAN then the difference between the current value of cleanliness and 1.0 is reduced by a factor of 0.15. The animal's health is decreased by one half of the difference between 1.0 and cleanliness.
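The cleanliness dynamics can be sketched as below. Several points are assumptions rather than facts from the text: "reduced by a factor of 0.15" is read as closing 15% of the gap to 1.0, the random variation around the 0.001 decay is omitted, and decay is applied before the effect of CLEAN. The function names are hypothetical.

```python
DECAY_PER_STEP = 0.001   # average decrease per timestep (noise omitted)
CLEAN_FRACTION = 0.15    # assumed: fraction of the gap to 1.0 closed by CLEAN

def step_cleanliness(cleanliness, action):
    """Return the cleanliness value after one timestep."""
    cleanliness = max(0.0, cleanliness - DECAY_PER_STEP)
    if action == "CLEAN":
        cleanliness += CLEAN_FRACTION * (1.0 - cleanliness)
    return cleanliness

def cleanliness_health_penalty(cleanliness):
    """Half of the shortfall from perfect cleanliness, as stated above."""
    return 0.5 * (1.0 - cleanliness)
```

Under this reading, each CLEAN yields less the cleaner the animal already is, so cleaning competes less strongly with other goals once cleanliness is high.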
4.2 Obtaining food
Instead of assuming one unitary food variable, three different internal food variables – fat, carbohydrate and protein – are assumed. Each of the three variables can lie in the range 0.0 (death through lack) to 1.0 (death through surfeit).
In practice the values of those food variables are not allowed to reach 1.0, but rather when any of them exceeds a value of 0.75, then vomiting is assumed to occur, i.e. the value returns to 0.75 and the animal incurs a recoverable health penalty. When the value of any of the food variables falls below 0.25 then the animal’s health is decreased by ((0.25 - value) / 0.25)^2. The value of each of the three variables decreases slowly over time when the animal is not eating. At each timestep the amount of decrease is dependent on the action of the animal, with actions such as moving fast causing higher decrease.
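The two threshold rules for a food variable can be sketched as follows, reading the deficit penalty as the squared ratio of the shortfall below 0.25 to 0.25, so that it runs from 0 at the threshold to 1 at a value of 0.0. The function names are hypothetical, and the size of the recoverable vomiting penalty is not given in the text, so it is only flagged rather than computed.

```python
VOMIT_THRESHOLD = 0.75
DEFICIT_THRESHOLD = 0.25

def cap_at_vomit_threshold(value):
    """Return (new_value, vomited) after applying the 0.75 ceiling."""
    if value > VOMIT_THRESHOLD:
        return VOMIT_THRESHOLD, True
    return value, False

def deficit_penalty(value):
    """Health decrement for a food variable that has fallen below 0.25."""
    if value >= DEFICIT_THRESHOLD:
        return 0.0
    return ((DEFICIT_THRESHOLD - value) / DEFICIT_THRESHOLD) ** 2
```

Because the penalty is quadratic, a small shortfall costs little, while a value approaching 0.0 removes almost all of the animal's health.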
The environment has a primitive model of weather, and the values of the cereal type food instances vary in relation to the amount of rain that has occurred over the last few days. The fruit type food is not related to weather, but is cyclic in nature.
When the animal eats from fruit food or cereal food sources then their values are decremented by the amount the animal eats. They then ‘recuperate’ at rates of 0.3 per day and 0.2 per day respectively until they return to their normal values.
Three low-level actions are needed in the animal’s repertoire: EAT_CF, EAT_FF and POUNCE (on prey).
4.3 Obtaining water
There is one internal variable, water, which can lie in the range 0.0 (death through lack) to 1.0 (death through surfeit). As with food variables, values greater than 0.75 lead to vomiting and values less than 0.25 lead to a decrement in health of ((0.25 - value) / 0.25)^2. Again, the amount that the value of water decreases each timestep depends on how strenuous an action has been selected.
When an animal drinks from a water source then the water source’s value is decreased by however much the animal drank, after which it can recuperate at the rate of 0.03 every 10 timesteps, until it returns to what its value would have been otherwise. Each water source has a 5% chance of being toxic, in which case it will cause a decrement in the animal’s health of 0.0 – 0.2 some time later.
An extra action of DRINK is added to the animal’s repertoire.
The animal is given an internal variable temperature, which can vary in the range 0.0 (death due to cold) to 1.0 (death due to heat). If temperature is less than 0.25 then the animal’s health is decremented by ((0.25 - temperature) / 0.25)^2. If temperature is greater than 0.75 then the animal’s health is decremented by ((temperature - 0.75) / 0.25)^2.
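Unlike food and water, temperature is penalised on both sides of a comfortable band. A minimal sketch, with a hypothetical function name, reads each penalty as the squared distance outside the 0.25 to 0.75 band, scaled by 0.25:

```python
COLD_THRESHOLD = 0.25
HOT_THRESHOLD = 0.75
BAND_WIDTH = 0.25  # distance from each threshold to the lethal extreme

def temperature_penalty(temperature):
    """Health decrement for a body temperature outside the safe band."""
    if temperature < COLD_THRESHOLD:
        return ((COLD_THRESHOLD - temperature) / BAND_WIDTH) ** 2
    if temperature > HOT_THRESHOLD:
        return ((temperature - HOT_THRESHOLD) / BAND_WIDTH) ** 2
    return 0.0
```

The penalty is zero anywhere inside the band and reaches 1.0, the animal's full starting health, at either extreme.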
The value of the internal variable temperature is dependent on
A further action REST is made available to the animal. This action leads to a lower than average body temperature. Other actions such as MATE and MOVE_FAST lead to higher than average body temperature.
If a predator1 catches the animal (gets in the same square and the animal doesn’t escape) then the animal’s health will be decremented by a random number between 0.0 and 0.3 (65% chance) or between 0.3 and 1.0 (35% chance). The animal also has a 25% chance of suffering 0.0 – 0.5 permanent injury. In short, if the animal is caught by a predator1 then it has a good chance of being killed outright, whatever its current state of health. A predator2 can inflict more damage than a predator1.
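To put a number on the chance of being killed outright by a predator1, the sketch below computes the probability that a single catch does at least as much damage as the animal's current health. It assumes that damage is uniformly distributed within each of the two bands (the text only says "a random number") and ignores permanent injury. The function name is hypothetical.

```python
# (probability, low, high) bands for predator1 health damage, from the text.
DAMAGE_BANDS = [(0.65, 0.0, 0.3), (0.35, 0.3, 1.0)]

def chance_damage_at_least(threshold):
    """Probability that one catch does at least `threshold` damage,
    assuming a uniform draw within each band (an assumption)."""
    total = 0.0
    for prob, low, high in DAMAGE_BANDS:
        if threshold <= low:
            total += prob  # the whole band lies at or above the threshold
        elif threshold < high:
            total += prob * (high - threshold) / (high - low)
    return total
```

Under these assumptions an animal at half health has a one-in-four chance of dying from a single catch, since `chance_damage_at_least(0.5)` evaluates to 0.35 × 0.5 / 0.7 = 0.25.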
Two new types of animal action are added for this sub-problem: MOVE_FAST in a direction enables the animal to try and run away from a predator, and FREEZE (become motionless) makes the animal inconspicuous and, especially when the animal is in vegetation, less likely to be spotted or kept sight of by the predator.
4.6 Vigilance
The animal needs to scan the environment for predators every so often, especially when it has perceived one or more predators recently. Two new actions LOOK_AROUND and LOOK (in a particular direction) allow the animal to perceive its environment to a greater distance and with improved accuracy. Two new stimuli time_since_last_scan and time_since_predator_last_perceived are used for this sub-problem.
4.7 Hazard avoidance
Dangerous places only have an effect on the animal if it moves into a square containing one of them, in which case the animal has a 40% chance of suffering 0.0 – 0.6 damage to its health, and a 10% chance of suffering 0.6 – 1.0 damage. There is also a 30% chance of 0.0 – 0.5 permanent injury, a 5% chance of 0.5 – 1.0 injury, and a 15% chance of 1.0 injury. In short, there is a very high chance of death or at least very serious injury if the animal enters a square with a dangerous place feature in it.
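The figures above can be summarised as an overall chance of harm and an average amount of harm per visit to a dangerous place. The sketch below does that arithmetic, assuming a uniform draw within each band (the text gives only the ranges) and treating the 15% case as a fixed injury of 1.0. The names are hypothetical.

```python
# (probability, low, high) bands taken from the text above.
HEALTH_DAMAGE = [(0.40, 0.0, 0.6), (0.10, 0.6, 1.0)]
PERMANENT_INJURY = [(0.30, 0.0, 0.5), (0.05, 0.5, 1.0), (0.15, 1.0, 1.0)]

def chance_of_any(bands):
    """Total probability that at least one of the bands applies."""
    return sum(prob for prob, _, _ in bands)

def expected_value(bands):
    """Mean outcome per visit, assuming a uniform draw within each band."""
    return sum(prob * (low + high) / 2 for prob, low, high in bands)
```

With these bands, half of all visits cause some health damage and half cause some permanent injury; the mean health damage is 0.4 × 0.3 + 0.1 × 0.8 = 0.2 per visit.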
4.8 Irrelevant animal avoidance
If the animal is in the same square as an irrelevant animal and does not manage to escape (the probability of which is dependent on the action of the animal and on the thickness of vegetation) then it will incur 0.0 – 0.4 damage to its health and 0.0 – 0.4 permanent injury.
Predators are able to perceive their surroundings normally at night, whereas the animal’s perception becomes progressively worse as night approaches, until the animal can perceive nothing at all outside of its own square. Because of this severely reduced perception at night the animal is also liable to be injured or killed by encounters with dangerous places or irrelevant animals if it moves about, because it cannot see them so as to avoid them. It is therefore highly advantageous for the animal to spend nights in its den.
Another action, SLEEP, is introduced. This brings about the lowest reduction of food and water per timestep. Two new indeterminate stimuli, proximity_of_night and distance_from_den, are included.
It is advantageous for the animal to stay in fairly close proximity to cover (or rather, any sort of protection from predators — fruit type food, shade, the animal’s den and cover all offer varying degrees of protection from perception or attack by predators) so that it can head there if a predator is perceived. A new stimulus of distance_from_cover is required.
4.11 Not getting lost
The animal must keep the variance of its estimate of its own position to a minimum in order to be able to find its way back to the den.
4.12 Reproduction
This sub-problem is the only one that does not concern survival, as well as the only one involving conspecifics, although of the opposite sex. The feature mate moves about the SE and has a 50% chance of being receptive (i.e. ‘in heat’) at any time (it is assumed to be a female and the animal to be a male). Two new actions COURT and MATE are added to the animal’s repertoire. If the animal enters the same square as a mate and performs the action COURT, and the mate is receptive, then the mate will respond to the courting to show that it is ready to mate. If the animal then performs the further action MATE then reproduction will be assumed to have occurred and the animal’s genetic fitness will increase by 1.0.
If the animal tries to mate with a mate that is not receptive or not courted then the mate will attack the animal. If the animal does not manage to escape then it will suffer 0.0 – 0.25 decrement to health and have a 5% chance of 0.0 – 0.2 permanent injury.
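The rule for the MATE action reduces to a two-condition check, sketched below. The function name and the shape of the returned dictionary are hypothetical; whether an attack actually causes damage depends on the animal failing to escape, and that probability is not given in the text, so it is left out.

```python
def resolve_mate_action(receptive, courted):
    """Outcome of the animal selecting MATE in a square containing a mate.

    receptive -- whether the mate is currently receptive
    courted   -- whether the animal has already performed COURT on it
    """
    if receptive and courted:
        # Reproduction is assumed to have occurred.
        return {"fitness_gain": 1.0, "mate_attacks": False}
    # Not receptive or not courted: the mate attacks the animal.
    return {"fitness_gain": 0.0, "mate_attacks": True}
```

An action selection mechanism therefore has good reason to perform COURT first and to select MATE only after the mate has responded.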
At every timestep, the environment provides a state containing all the stimuli required for the animal to select an action. This information includes the animal's perceptions of its environment, its memories, and various other stimuli such as the time of day and the time since it last scanned for predators. Authors of mind services must use this information to select an action from the thirty-four low-level actions available to the animal. This is described in the page on State and Action.
The World-Wide-Mind service for the world implementation is available as a world service for SOML version 1.0.
The URL of the service that will receive messages using either HTTP POST or HTTP GET is http://w2mind.computing.dcu.ie/services/tyrrell.
The definition of state and action for this service is available here.
Ciarán O'Leary, Ciaran.OLeary@comp.dit.ie, www.comp.dit.ie/coleary. Last updated 8th September 2003.