What is Cybernetics?


Cybernetics is the interdisciplinary study of the structure of regulatory systems. It is closely related to control theory and systems theory. Both in its origins and in its evolution in the second half of the 20th century, cybernetics has been equally applicable to physical and social (that is, language-based) systems.

Contemporary cybernetics began in the 1940s as an interdisciplinary study connecting the fields of control systems, electrical network theory, mechanical engineering, logic modeling, evolutionary biology, neuroscience, anthropology, and psychology; its origin is often attributed to the Macy Conferences.

Other fields of study which have influenced or been influenced by cybernetics include game theory, system theory (a mathematical counterpart to cybernetics), psychology (especially neuropsychology, behavioral psychology, and cognitive psychology), philosophy, and architecture.

Wednesday, January 28, 2009

Control system

A control system is a device or set of devices to manage, command, direct or regulate the behavior of other devices or systems.

There are two common classes of control systems, with many variations and combinations: logic or sequential controls, and feedback or linear controls. There is also fuzzy logic, which attempts to combine some of the design simplicity of logic with the utility of linear control. Some devices or systems are inherently not controllable.

The term "control system" may be applied to the essentially manual controls that allow an operator to, for example, close and open a hydraulic press, where the logic requires that it cannot be moved unless safety guards are in place.

An automatic sequential control system may trigger a series of mechanical actuators in the correct sequence to perform a task. For example, various electric and pneumatic transducers may fold and glue a cardboard box, fill it with product and then seal it in an automatic packaging machine.

In the case of linear feedback systems, a control loop, including sensors, control algorithms and actuators, is arranged in such a fashion as to try to regulate a variable at a setpoint or reference value. For example, such a loop may increase the fuel supply to a furnace when a measured temperature drops. PID controllers are common and effective in cases such as this. Control systems that include some sensing of the results they are trying to achieve are making use of feedback and so can, to some extent, adapt to varying circumstances. Open-loop control systems do not directly make use of feedback, but run only in pre-arranged ways.


Logic control

Pure logic control systems were historically implemented by electricians with networks of relays, and designed with a notation called ladder logic. Today, most such systems are constructed with programmable logic devices.

Logic controllers may respond to switches, light sensors, pressure switches, etc., and cause the machinery to perform some operation. Logic systems are used to sequence mechanical operations in many applications. Examples include elevators, washing machines and other systems with interrelated stop-go operations.

Logic systems are quite easy to design, and can handle very complex operations. Some aspects of logic system design make use of Boolean logic.
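
As a rough illustration of how a ladder-logic rung reduces to Boolean logic, the sketch below writes a start/stop rung with a seal-in contact as a single expression. The function and parameter names are hypothetical and chosen only for this example.

```python
def motor_rung(start_pressed: bool, stop_pressed: bool, running: bool) -> bool:
    """One ladder-logic rung as a Boolean expression.

    The 'running' input acts as a seal-in contact: once started, the motor
    holds itself on until the stop button breaks the rung.
    """
    return (start_pressed or running) and not stop_pressed


# Walk through a short button sequence, feeding the output back in each scan.
running = False
history = []
for start, stop in [(True, False), (False, False), (False, True), (False, False)]:
    running = motor_rung(start, stop, running)
    history.append(running)
```

Feeding the output back as an input is what gives the rung its memory; the rest is plain AND, OR and NOT.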


On-off control

A thermostat is a simple negative-feedback control: when the temperature (the "measured variable" or MV) goes below a set point (SP), the heater is switched on. Another example is a pressure switch on an air compressor: when the pressure (MV) drops below the threshold (SP), the pump is powered. Refrigerators and vacuum pumps contain similar mechanisms operating in reverse, but still providing negative feedback to correct errors.

Simple on-off feedback control systems like these are cheap and effective. In some cases, like the simple compressor example, they may represent a good design choice.

In most applications of on-off feedback control, some consideration needs to be given to other costs, such as wear and tear of control valves and possibly other start-up costs when power is reapplied each time the MV drops. Therefore, practical on-off control systems are designed to include hysteresis, usually in the form of a deadband, a region around the setpoint value in which no control action occurs. The width of the deadband may be adjustable or programmable.
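
The deadband idea can be sketched in a few lines. This is a minimal illustration for a heater, assuming a deadband centred on the setpoint; the function name and the numbers in the usage lines are made up for the example.

```python
def heater_command(mv: float, sp: float, deadband: float, heater_on: bool) -> bool:
    """On-off control with hysteresis.

    Below the deadband the heater switches on, above it the heater switches
    off, and inside it the previous state is kept (no control action).
    """
    half = deadband / 2.0
    if mv < sp - half:
        return True
    if mv > sp + half:
        return False
    return heater_on


# Setpoint 20 degrees with a 2-degree deadband (thresholds at 19 and 21).
state = False
trace = []
for temperature in [18.5, 19.5, 20.5, 21.5, 20.5, 19.5, 18.9]:
    state = heater_command(temperature, 20.0, 2.0, state)
    trace.append(state)
```

In this trace the heater stays on while the temperature climbs through the deadband and stays off while it falls back through it, so the actuator switches far less often than it would with a single threshold.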


Linear control

Linear control systems use linear negative feedback to produce a control signal mathematically based on other variables, with a view to maintaining the controlled process within an acceptable operating range.

The output from a linear control system into the controlled process may be in the form of a directly variable signal, such as the position of a valve that may be anywhere from 0% to 100% open. Sometimes this is not feasible and so, after calculating the current required corrective signal, a linear control system may repeatedly switch an actuator, such as a pump, motor or heater, fully on and then fully off again, regulating the duty cycle using pulse-width modulation.
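
A minimal sketch of the duty-cycle idea: given the fraction of full output the controller wants and a fixed switching period, compute how long the actuator is held on and off in each cycle. The function name and the 10-second period are assumptions for illustration only.

```python
def pwm_times(duty: float, period_s: float) -> tuple[float, float]:
    """Split one switching period into on-time and off-time.

    'duty' is the requested fraction of full output; it is clamped to the
    range 0.0 to 1.0 because an actuator cannot be more than fully on.
    """
    duty = min(1.0, max(0.0, duty))
    on_time = duty * period_s
    return on_time, period_s - on_time


# A heater asked for 25% output with a 10-second period.
on_s, off_s = pwm_times(0.25, 10.0)
```

Here the heater would be on for 2.5 seconds and off for 7.5 seconds in every cycle, which averages out to a quarter of full power.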

Proportional control

When controlling the temperature of an industrial furnace, it is usually better to control the opening of the fuel valve in proportion to the current needs of the furnace. This helps avoid thermal shocks and applies heat more effectively.

Proportional negative-feedback systems are based on the difference between the required set point (SP) and measured value (MV) of the controlled variable. This difference is called the error. Power is applied in direct proportion to the current measured error, in the correct sense so as to tend to reduce the error (and so avoid positive feedback). The amount of corrective action that is applied for a given error is set by the gain or sensitivity of the control system.

At low gains, only a small corrective action is applied when errors are detected: the system may be safe and stable, but it may be sluggish in response to changing conditions, and errors will remain uncorrected for relatively long periods of time. Such a system is over-damped. If the proportional gain is increased, such systems become more responsive and errors are dealt with more quickly. There is an optimal value for the gain setting when the overall system is said to be critically damped. Increases in loop gain beyond this point will lead to oscillations in the MV; such a system is under-damped.

Under-damped furnace example

In the furnace example, suppose the temperature is increasing towards a set point at which, say, 50% of the available power will be required for steady-state operation. At low temperatures, 100% of available power is applied. When the MV is within, say, 10° of the SP, the heat input begins to be reduced by the proportional controller. (Note that this implies a 20° "proportional band" (PB) from full to no power input, evenly spread around the setpoint value.) At the setpoint the controller will be applying 50% power as required, but stray stored heat within the heater sub-system and in the walls of the furnace will keep the measured temperature rising beyond what is required. At 10° above SP, we reach the top of the proportional band and no power is applied, but the temperature may continue to rise even further before beginning to fall back. Eventually, as the MV falls back into the PB, heat is applied again, but now the heater and the furnace walls are too cool and the temperature falls too low before its fall is arrested, so that the oscillations continue.
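
The power levels quoted in this example follow from a simple rule, sketched below under the assumption that the proportional band is centred on the setpoint and that 50% power is applied at zero error. The function name and the 1000° setpoint are hypothetical.

```python
def proportional_power(mv: float, sp: float, band_width: float, bias: float = 50.0) -> float:
    """Percent power requested by a proportional-only controller.

    band_width is the proportional band in degrees (full power to no power);
    bias is the power applied when the error is zero.
    """
    gain = 100.0 / band_width      # percent power per degree of error
    error = sp - mv                # positive when the furnace is too cold
    return min(100.0, max(0.0, bias + gain * error))


# 20-degree proportional band around a 1000-degree setpoint.
readings = [proportional_power(t, 1000.0, 20.0) for t in (900.0, 990.0, 995.0, 1000.0, 1010.0)]
```

With these numbers the requested power is 100% anywhere below 990°, 75% at 995°, 50% at the setpoint and 0% at 1010° and above. Note that the rule says nothing about stored heat, which is why the real furnace in the example overshoots.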

Over-damped furnace example

The temperature oscillations that an under-damped furnace control system produces are unacceptable for many reasons, including the waste of fuel and time (each oscillation cycle may take many minutes), as well as the likelihood of seriously overheating both the furnace and its contents.

Suppose that the gain of the control system is reduced drastically and it is restarted. As the temperature approaches, say, 30° below SP (a 60° proportional band now), the heat input begins to be reduced, the rate of heating of the furnace has time to slow and, as the heat is still further reduced, the furnace is eventually brought up to set point just as 50% power input is reached, and it is operating as required. There was some wasted time while the furnace crept to its final temperature using only 52% then 51% of available power, but at least no harm was done. By carefully increasing the gain (i.e. reducing the width of the PB) this over-damped and sluggish behavior can be improved until the system is critically damped for this SP temperature. Doing this is known as 'tuning' the control system. A well-tuned proportional furnace temperature control system will usually be more effective than on-off control, but will still respond more slowly than the furnace could under skillful manual control.

PID control

Apart from sluggish performance to avoid oscillations, another problem with proportional-only control is that power application is always in direct proportion to the error. In the example above we assumed that the set temperature could be maintained with 50% power. What happens if the furnace is required in a different application where a higher set temperature will require 80% power to maintain it? If the gain was finally set to a 50° PB, then 80% power will not be applied unless the furnace is 15° below setpoint, so for this other application the operators will have to remember always to set the setpoint temperature 15° higher than actually needed. This 15° figure is not completely constant either: it will depend on the surrounding ambient temperature, as well as other factors that affect heat loss from or absorption within the furnace.
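
The 15° figure can be checked with a little arithmetic. The sketch below assumes, as in the example, a proportional band centred on the setpoint with 50% power at zero error; the function name is hypothetical.

```python
def required_error(required_power: float, band_width: float, bias: float = 50.0) -> float:
    """Degrees below setpoint at which a proportional-only controller
    delivers 'required_power' percent."""
    gain = 100.0 / band_width          # percent power per degree of error
    return (required_power - bias) / gain


offset = required_error(80.0, 50.0)    # 50-degree band, 80% power needed
```

A 50° band gives 2% power per degree, so the extra 30% of power needed appears only when the furnace sits 15° below the setpoint. If the application happened to need exactly 50% power, the same calculation gives an error of zero.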

To resolve these two problems, many feedback control schemes include mathematical extensions to improve performance. The most common extensions lead to proportional-integral-derivative control, or PID control (pronounced pee-eye-dee).

Derivative action

The derivative part is concerned with the rate of change of the error with time: if the measured variable approaches the setpoint rapidly, then the actuator is backed off early to allow it to coast to the required level; conversely, if the measured value begins to move rapidly away from the setpoint, extra effort is applied, in proportion to that rapidity, to try to maintain it.

Derivative action makes a control system behave much more intelligently. On systems like the temperature of a furnace, or perhaps the motion-control of a heavy item like a gun or camera on a moving vehicle, the derivative action of a well-tuned PID controller can allow it to reach and maintain a setpoint better than most skilled human operators could.

If derivative action is over-applied, it can lead to oscillations too. An example would be a temperature that increased rapidly towards SP, then halted early and seemed to "shy away" from the setpoint before rising towards it again.
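
A minimal sketch of derivative action in a sampled controller, assuming the rate of change is estimated from two successive error samples. The gains and error values are invented for illustration, and a practical controller would usually filter the derivative as well.

```python
def pd_output(kp: float, kd: float, error: float, prev_error: float, dt: float) -> float:
    """Proportional plus derivative output from two successive error samples."""
    derivative = (error - prev_error) / dt   # rate of change of the error
    return kp * error + kd * derivative


# Error shrinking fast (approaching setpoint): the derivative term backs off.
approaching = pd_output(2.0, 5.0, error=10.0, prev_error=14.0, dt=1.0)
# Error growing (moving away from setpoint): the derivative term adds effort.
receding = pd_output(2.0, 5.0, error=10.0, prev_error=8.0, dt=1.0)
```

Both cases have the same present error of 10, yet the output is 0 when the error is closing quickly and 30 when it is opening up. Choosing a derivative gain that is too large exaggerates the first effect and produces the "shying away" behaviour described above.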

Integral action

The integral term magnifies the effect of long-term steady-state errors, applying ever-increasing effort until they reduce to zero. In the example of the furnace above working at various temperatures, if the heat being applied does not bring the furnace up to setpoint, for whatever reason, integral action increasingly moves the proportional band relative to the setpoint until the time-integral of the MV error is reduced to zero and the setpoint is achieved.
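
To see integral action remove a steady-state error, the sketch below simulates a toy furnace with and without an integral term. Everything about the plant is an assumption made for this example (a first-order model with invented constants), and the guard on the integrator is one simple way, among several, of limiting wind-up while the output is saturated.

```python
def final_error(ki: float, steps: int = 2000, dt: float = 1.0) -> float:
    """Simulate a toy furnace and return the remaining error (SP - MV) in degrees.

    Hypothetical plant: settles at ambient + plant_gain * power,
    with a time constant of tau seconds.
    """
    ambient, plant_gain, tau = 20.0, 10.0, 100.0
    sp = 820.0                # needs 80% power to hold in this model
    kp, bias = 2.0, 50.0      # 50-degree proportional band, 50% power at zero error
    temp, integral = ambient, 0.0
    for _ in range(steps):
        error = sp - temp
        raw = bias + kp * error + ki * integral
        power = min(100.0, max(0.0, raw))
        if raw == power:
            # Accumulate error only while the output is unsaturated.
            integral += error * dt
        temp += dt * (plant_gain * power - (temp - ambient)) / tau
    return sp - temp


p_only = final_error(ki=0.0)     # proportional-only: a persistent offset remains
with_i = final_error(ki=0.05)    # integral action drives the error towards zero
```

In this model the proportional-only run settles a little over 14° below the setpoint, while the run with integral action ends with an error that is negligibly small. The integral term has, in effect, shifted the proportional band until 80% power is delivered at zero error.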

Other techniques

Another common technique is to filter the MV or error signal. Such a filter can reduce the response of the system to undesirable frequencies, to help eliminate instability or oscillations. Some feedback systems will oscillate at just one frequency. By filtering out that frequency, one can use very "stiff" feedback and the system can be very responsive without shaking itself apart.
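
As a simple illustration of filtering the measured variable, here is a first-order low-pass filter (an exponential moving average). It is a swapped-in, simpler technique than the single-frequency filter mentioned above: it attenuates all fast variations rather than one chosen frequency. The function name and the smoothing factor are assumptions.

```python
def low_pass(samples: list[float], alpha: float) -> list[float]:
    """First-order low-pass filter; alpha in (0, 1], smaller means smoother."""
    state = samples[0]
    filtered = []
    for x in samples:
        state += alpha * (x - state)   # move a fraction of the way to the new sample
        filtered.append(state)
    return filtered


# A measurement that flips between 10 and 12 on every sample.
noisy = [10.0, 12.0] * 20
smooth = low_pass(noisy, alpha=0.2)
ripple = max(smooth[-10:]) - min(smooth[-10:])
```

The raw signal swings by 2 units on every sample, while the filtered signal, once settled, ripples by only a small fraction of that. The price is delay: a smaller alpha smooths more but makes the loop respond later to genuine changes.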

The most complex linear control systems developed to date are in oil refineries (model predictive control). The chemical reaction paths and control systems are normally designed together using specialized computer-aided-design software.

Feedback systems can be combined in many ways. One example is cascade control in which one control loop applies control algorithms to a measured variable against a setpoint, but then actually outputs a setpoint to another controller, rather than affecting power input directly.
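
A bare-bones sketch of one cascade step, assuming two proportional-only loops: an outer temperature loop whose output is the setpoint for an inner fuel-flow loop. The variable names, gains and units are hypothetical.

```python
def cascade_step(temp_sp: float, temp_mv: float, flow_mv: float,
                 outer_gain: float, inner_gain: float) -> tuple[float, float]:
    """One update of a two-loop cascade.

    The outer loop compares temperature with its setpoint and outputs a
    flow setpoint; the inner loop compares flow with that setpoint and
    outputs the actuator (valve) command.
    """
    flow_sp = outer_gain * (temp_sp - temp_mv)   # outer loop: outputs a setpoint
    valve = inner_gain * (flow_sp - flow_mv)     # inner loop: drives the actuator
    return flow_sp, valve


flow_sp, valve = cascade_step(temp_sp=100.0, temp_mv=90.0, flow_mv=3.0,
                              outer_gain=0.5, inner_gain=2.0)
```

With these numbers the outer loop asks for a flow of 5.0 and the inner loop, seeing a flow of 3.0, issues a valve command of 4.0. In practice each loop would typically be a full PID controller with output limits.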

Usually if a system has several measurements to be controlled, feedback systems will be present for each of them.


Fuzzy logic

Fuzzy logic is an attempt to get the easy design of logic controllers and yet control continuously-varying systems. Basically, a measurement in a fuzzy logic system can be partly true; that is, if yes is 1 and no is 0, a fuzzy measurement can be anywhere between 0 and 1.

The rules of the system are written in natural language and translated into fuzzy logic. For example, the design for a furnace would start with: "If the temperature is too high, reduce the fuel to the furnace. If the temperature is too low, increase the fuel to the furnace."

Measurements from the real world (such as the temperature of a furnace) are converted to values between 0 and 1 by seeing where they fall on a triangle. Usually the tip of the triangle is the maximum possible value which translates to "1."
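
The "triangle" conversion can be sketched as a membership function. The corner values used below are invented for illustration; a real design would choose them from the process.

```python
def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership (0 to 1) of x in a triangular fuzzy set.

    The value is 0 at and outside the two feet, 1 at the peak, and rises
    and falls linearly in between.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# A hypothetical "temperature is about right" set: feet at 0 and 100, peak at 50.
memberships = [triangle(t, 0.0, 50.0, 100.0) for t in (0.0, 25.0, 50.0, 75.0, 120.0)]
```

A reading at the peak maps to 1, readings halfway up either side map to 0.5, and readings outside the triangle map to 0.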

Fuzzy logic then modifies Boolean logic to be arithmetical. Usually the "not" operation is "output = 1 - input," the "and" operation is "output = input.1 multiplied by input.2," and "or" is "output = 1 - ((1 - input.1) multiplied by (1 - input.2))."
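
Written as code, the three arithmetic operations described above look like this. The function names are chosen for the example; the formulas are the ones given in the text.

```python
def fuzzy_not(a: float) -> float:
    return 1.0 - a


def fuzzy_and(a: float, b: float) -> float:
    return a * b


def fuzzy_or(a: float, b: float) -> float:
    return 1.0 - (1.0 - a) * (1.0 - b)


half_and_half = fuzzy_and(0.5, 0.5)    # 0.25
half_or_half = fuzzy_or(0.5, 0.5)      # 0.75
```

When the inputs are restricted to exactly 0 and 1, these formulas give the same results as ordinary Boolean NOT, AND and OR, which is why they can be seen as an arithmetic extension of Boolean logic.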

The last step is to "defuzzify" an output. Basically, the fuzzy calculations make a value between zero and one. That number is used to select a value on a line whose slope and height converts the fuzzy value to a real-world output number. The number then controls real machinery.
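
The defuzzification step described here amounts to evaluating a straight line. In the sketch below the line is specified by its two end values rather than by slope and height, which is an equivalent choice made for readability; the names and ranges are hypothetical.

```python
def defuzzify(truth: float, low: float, high: float) -> float:
    """Map a fuzzy value in [0, 1] onto the line running from low to high."""
    truth = min(1.0, max(0.0, truth))   # guard against out-of-range inputs
    return low + (high - low) * truth


# A fuzzy "increase the fuel" value of 0.25 mapped onto a 0-100% valve opening.
valve_percent = defuzzify(0.25, 0.0, 100.0)
```

Here a fuzzy value of 0.25 becomes a valve opening of 25%. Other defuzzification schemes exist; this linear mapping is simply the one the text describes.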

If the triangles are defined correctly and rules are right the result can be a good control system.

When a robust fuzzy design is reduced into a single, quick calculation, it begins to resemble a conventional feedback loop solution. For this reason, many control engineers think one should not bother with it. However, the fuzzy logic paradigm may provide scalability for large control systems where conventional methods become unwieldy or costly to derive.

Fuzzy electronics is an electronic technology that uses fuzzy logic instead of the two-value logic more commonly used in digital electronics.


Physical implementations

Since modern small microcontrollers are so cheap (often less than $1 US), it's very common to implement control systems, including feedback loops, with computers, often in an embedded system. The feedback controls are simulated by having the computer make periodic measurements and then calculating from this stream of measurements.

Computers emulate logic devices by making measurements of switch inputs, calculating a logic function from these measurements and then sending the results out to electronically-controlled switches.
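
The measure, calculate and output cycle can be sketched as a small scan loop. The input and output functions are passed in as parameters so the sketch stays independent of any particular hardware; all names, including the elevator rule, are hypothetical.

```python
def scan_cycle(read_inputs, logic, write_outputs, cycles: int) -> None:
    """Run a fixed number of read / evaluate / write scans."""
    for _ in range(cycles):
        inputs = read_inputs()       # sample every switch once per scan
        outputs = logic(inputs)      # pure Boolean function of the inputs
        write_outputs(outputs)       # drive the electronically controlled switches


def elevator_logic(inputs: dict) -> dict:
    # Hypothetical rule: the motor may run only with the door closed and a call pending.
    return {"motor_run": inputs["door_closed"] and inputs["call_pending"]}


# Stand-ins for real hardware: canned input samples and a list that records outputs.
samples = iter([
    {"door_closed": False, "call_pending": True},
    {"door_closed": True, "call_pending": True},
])
written = []
scan_cycle(lambda: next(samples), elevator_logic, written.append, cycles=2)
```

A real controller would repeat the scan indefinitely at a fixed rate; the fixed cycle count here only keeps the example finite.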

Logic systems and feedback controllers are usually implemented with programmable logic controllers which are devices available from electrical supply houses. They include a little computer and a simplified system for programming. Most often they are programmed with personal computers.

Logic controllers have also been constructed from relays, hydraulic and pneumatic devices, and electronics using both transistors and vacuum tubes (feedback controllers can also be constructed in this manner).

Computer vision

Computer vision is the science and technology of machines that see. As a scientific discipline, computer vision is concerned with the theory for building artificial systems that obtain information from images. The image data can take many forms, such as a video sequence, views from multiple cameras, or multi-dimensional data from a medical scanner.

As a technological discipline, computer vision seeks to apply the theories and models of computer vision to the construction of computer vision systems. Examples of applications of computer vision systems include systems for:

  • Controlling processes (e.g. an industrial robot or an autonomous vehicle).
  • Detecting events (e.g. for visual surveillance or people counting).
  • Organizing information (e.g. for indexing databases of images and image sequences).
  • Modeling objects or environments (e.g. industrial inspection, medical image analysis or topographical modeling).
  • Interaction (e.g. as the input to a device for computer-human interaction).

Computer vision can also be described as a complement (but not necessarily the opposite) of biological vision. In biological vision, the visual perception of humans and various animals is studied, resulting in models of how these systems operate in terms of physiological processes. Computer vision, on the other hand, studies and describes artificial vision systems that are implemented in software and/or hardware. Interdisciplinary exchange between biological and computer vision has proven increasingly fruitful for both fields.

Sub-domains of computer vision include scene reconstruction, event detection, tracking, object recognition, learning, indexing, motion estimation, and image restoration.



State of the art

The field of computer vision can be characterized as immature and diverse. Even though earlier work exists, it was not until the late 1970s that a more focused study of the field started when computers could manage the processing of large data sets such as images. However, these studies usually originated from various other fields, and consequently there is no standard formulation of "the computer vision problem." Also, and to an even larger extent, there is no standard formulation of how computer vision problems should be solved. Instead, there exists an abundance of methods for solving various well-defined computer vision tasks, where the methods often are very task specific and seldom can be generalized over a wide range of applications. Many of the methods and applications are still in the state of basic research, but more and more methods have found their way into commercial products, where they often constitute a part of a larger system which can solve complex tasks (e.g., in the area of medical images, or quality control and measurements in industrial processes). In most practical computer vision applications, the computers are pre-programmed to solve a particular task, but methods based on learning are now becoming increasingly common.


Related fields

Figure: Relation between computer vision and various other fields.

A significant part of artificial intelligence deals with autonomous planning or deliberation for systems which can perform mechanical actions such as moving a robot through some environment. This type of processing typically needs input data provided by a computer vision system, acting as a vision sensor and providing high-level information about the environment and the robot. Other parts which are sometimes described as belonging to artificial intelligence and which are used in relation to computer vision are pattern recognition and learning techniques. As a consequence, computer vision is sometimes seen as a part of the artificial intelligence field or the computer science field in general.

Physics is another field that is strongly related to computer vision. A significant part of computer vision deals with methods which require a thorough understanding of the process in which electromagnetic radiation, typically in the visible or the infra-red range, is reflected by the surfaces of objects and finally is measured by the image sensor to produce the image data. This process is based on optics and solid-state physics. More sophisticated image sensors even require quantum mechanics to provide a complete comprehension of the image formation process. Also, various measurement problems in physics can be addressed using computer vision, for example motion in fluids. Consequently, computer vision can also be seen as an extension of physics.

A third field which plays an important role is neurobiology, specifically the study of the biological vision system. Over the last century, there has been an extensive study of eyes, neurons, and the brain structures devoted to processing of visual stimuli in both humans and various animals. This has led to a coarse, yet complicated, description of how "real" vision systems operate in order to solve certain vision related tasks. These results have led to a subfield within computer vision where artificial systems are designed to mimic the processing and behaviour of biological systems, at different levels of complexity. Also, some of the learning-based methods developed within computer vision have their background in biology.

Yet another field related to computer vision is signal processing. Many methods for processing of one-variable signals, typically temporal signals, can be extended in a natural way to processing of two-variable signals or multi-variable signals in computer vision. However, because of the specific nature of images there are many methods developed within computer vision which have no counterpart in the processing of one-variable signals. A distinct character of these methods is the fact that they are non-linear which, together with the multi-dimensionality of the signal, defines a subfield in signal processing as a part of computer vision.

Besides the above-mentioned views on computer vision, many of the related research topics can also be studied from a purely mathematical point of view. For example, many methods in computer vision are based on statistics, optimization or geometry. Finally, a significant part of the field is devoted to the implementation aspect of computer vision: how existing methods can be realized in various combinations of software and hardware, or how these methods can be modified in order to gain processing speed without losing too much performance.

The fields most closely related to computer vision are image processing, image analysis, robot vision and machine vision. There is a significant overlap in terms of what techniques and applications they cover. This implies that the basic techniques that are used and developed in these fields are more or less identical, which can be interpreted as meaning that there is only one field with different names. On the other hand, it appears to be necessary for research groups, scientific journals, conferences and companies to present or market themselves as belonging specifically to one of these fields and, hence, various characterizations which distinguish each of the fields from the others have been presented.

The following characterizations appear relevant but should not be taken as universally accepted:

  • Image processing and image analysis tend to focus on 2D images and on how to transform one image to another, e.g., by pixel-wise operations such as contrast enhancement, local operations such as edge extraction or noise removal, or geometrical transformations such as rotating the image. This characterization implies that image processing/analysis neither requires assumptions nor produces interpretations about the image content.
  • Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. Computer vision often relies on more or less complex assumptions about the scene depicted in an image.
  • Machine vision tends to focus on applications, mainly in industry, e.g., vision based autonomous robots and systems for vision based inspection or measurement. This implies that image sensor technologies and control theory often are integrated with the processing of image data to control a robot and that real-time processing is emphasized by means of efficient implementations in hardware and software. It also implies that the external conditions such as lighting can be and are often more controlled in machine vision than they are in general computer vision, which can enable the use of different algorithms.
  • There is also a field called imaging which primarily focuses on the process of producing images, but sometimes also deals with processing and analysis of images. For example, medical imaging contains lots of work on the analysis of image data in medical applications.
  • Finally, pattern recognition is a field which uses various methods to extract information from signals in general, mainly based on statistical approaches. A significant part of this field is devoted to applying these methods to image data.
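
As a small illustration of the pixel-wise operations mentioned under image processing, the sketch below performs a linear contrast stretch on a grayscale image stored as a list of rows. The 0 to 255 output range and the function name are assumptions made for the example.

```python
def stretch_contrast(image: list[list[int]]) -> list[list[int]]:
    """Linearly rescale pixel values so the darkest becomes 0 and the brightest 255.

    Each output pixel depends only on the matching input pixel (plus the
    global minimum and maximum), which is what makes the operation pixel-wise.
    """
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [row[:] for row in image]   # flat image: nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


stretched = stretch_contrast([[50, 100], [150, 200]])
```

The four input values, which occupy only part of the available range, are spread out to 0, 85, 170 and 255. No assumption about what the image depicts is needed, in line with the characterization of image processing given above.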

A consequence of this state of affairs is that you can be working in a lab related to one of these fields, apply methods from a second field to solve a problem in a third field and present the result at a conference related to a fourth field.


Applications for computer vision

One of the most prominent application fields is medical computer vision or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Generally, image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images. An example of information which can be extracted from such image data is detection of tumours, arteriosclerosis or other malign changes. It can also be measurements of organ dimensions, blood flow, etc. This application area also supports medical research by providing new information, e.g., about the structure of the brain, or about the quality of medical treatments.

A second application area in computer vision is in industry. Here, information is extracted for the purpose of supporting a manufacturing process. One example is quality control where details or final products are being automatically inspected in order to find defects. Another example is measurement of position and orientation of details to be picked up by a robot arm.

Military applications are probably one of the largest areas for computer vision. The obvious examples are detection of enemy soldiers or vehicles and missile guidance. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area based on locally acquired image data. Modern military concepts, such as "battlefield awareness", imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions. In this case, automatic processing of the data is used to reduce complexity and to fuse information from multiple sensors to increase reliability.

Figure: Artist's concept of a rover on Mars, an example of an unmanned land-based vehicle. Note the stereo cameras mounted on top of the rover. (Credit: Maas Digital LLC)


One of the newer application areas is autonomous vehicles, which include submersibles, land-based vehicles (small robots with wheels, cars or trucks), aerial vehicles, and unmanned aerial vehicles (UAVs). The level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles where computer vision based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, i.e., for knowing where they are or for producing a map of their environment (SLAM), and for detecting obstacles. It can also be used for detecting certain task-specific events, e.g., a UAV looking for forest fires. Examples of supporting systems are obstacle warning systems in cars, and systems for autonomous landing of aircraft. Several car manufacturers have demonstrated systems for autonomous driving of cars, but this technology has still not reached a level where it can be put on the market. There are ample examples of military autonomous vehicles, ranging from advanced missiles to UAVs for reconnaissance missions or missile guidance. Space exploration is already being carried out with autonomous vehicles using computer vision, e.g., NASA's Mars Exploration Rover.

Other application areas include:

  • Support of visual effects creation for cinema and broadcast, e.g., camera tracking (matchmoving).
  • Surveillance.

Typical tasks of computer vision

Each of the application areas described above employs a range of computer vision tasks: more or less well-defined measurement problems or processing problems, which can be solved using a variety of methods. Some examples of typical computer vision tasks are presented below.

Recognition

The classical problem in computer vision, image processing and machine vision is that of determining whether or not the image data contains some specific object, feature, or activity. This task can normally be solved robustly and without effort by a human, but is still not satisfactorily solved in computer vision for the general case: arbitrary objects in arbitrary situations. The existing methods for dealing with this problem can at best solve it only for specific objects, such as simple geometric objects (e.g., polyhedrons), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera.

Different varieties of the recognition problem are described in the literature:

  • Recognition: one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene.
  • Identification: An individual instance of an object is recognized. Examples: identification of a specific person's face or fingerprint, or identification of a specific vehicle.
  • Detection: the image data is scanned for a specific condition. Examples: detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analyzed by more computationally demanding techniques to produce a correct interpretation.

Several specialized tasks based on recognition exist, such as:

  • Content-based image retrieval: finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter, and have no cars in them).
  • Pose estimation: estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an assembly line situation.
  • Optical character recognition (or OCR): identifying characters in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or indexing (e.g. ASCII).

Motion

Several tasks relate to motion estimation, in which an image sequence is processed to produce an estimate of the velocity either at each point in the image or in the 3D scene. Examples of such tasks are:

  • Egomotion: determining the 3D rigid motion of the camera.
  • Tracking: following the movements of objects (e.g. vehicles or humans).

Scene reconstruction

Given one or (typically) more images of a scene, or a video, scene reconstruction aims at computing a 3D model of the scene. In the simplest case the model can be a set of 3D points. More sophisticated methods produce a complete 3D surface model.

Image restoration

The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images. The simplest possible approach to noise removal is to apply various types of filters, such as low-pass filters or median filters. More sophisticated methods assume a model of what the local image structures look like, a model which distinguishes them from the noise. By first analysing the image data in terms of the local image structures, such as lines or edges, and then controlling the filtering based on local information from the analysis step, a better level of noise removal is usually obtained compared to the simpler approaches.
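
As an example of the simpler approaches, here is a minimal 3x3 median filter for a grayscale image stored as a list of rows. Leaving the border pixels unchanged is a simplification chosen for brevity; the function name is hypothetical.

```python
from statistics import median


def median_filter_3x3(image: list[list[int]]) -> list[list[int]]:
    """Replace each interior pixel with the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# A flat patch with one "salt" pixel caused by sensor noise.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

The isolated bright pixel is replaced by the value of its neighbours. Unlike an averaging low-pass filter, the median does not smear the outlier into the surrounding pixels, which is why it is a common first choice for impulse-like noise.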


Computer vision systems

The organization of a computer vision system is highly application dependent. Some systems are stand-alone applications which solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. There are, however, typical functions which are found in many computer vision systems.

  • Image acquisition: A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultra-sonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.
  • Pre-processing: Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data in order to ensure that it satisfies certain assumptions implied by the method. Examples are
    • Re-sampling in order to ensure that the image coordinate system is correct.
    • Noise reduction in order to ensure that sensor noise does not introduce false information.
    • Contrast enhancement to ensure that relevant information can be detected.
    • Scale-space representation to enhance image structures at locally appropriate scales.
  • Feature extraction: Image features at various levels of complexity are extracted from the image data. Typical examples of such features are
    • Lines, edges and ridges.
    • Localized interest points such as corners, blobs or points.
More complex features may be related to texture, shape or motion.
  • Detection/Segmentation: At some point in the processing a decision is made about which image points or regions of the image are relevant for further processing. Examples are
    • Selection of a specific set of interest points
    • Segmentation of one or multiple image regions which contain a specific object of interest.
  • High-level processing: At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example:
    • Verification that the data satisfy model-based and application specific assumptions.
    • Estimation of application specific parameters, such as object pose or object size.
    • Classifying a detected object into different categories.
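
The later stages listed above can be sketched on a toy image. This is only an illustrative outline under simplifying assumptions: the image is a small list of rows, the "feature" is a horizontal intensity difference, segmentation is a fixed threshold, and the high-level step estimates object size and centroid. All function names are invented for this example.

```python
def horizontal_edges(image):
    """Feature extraction: absolute difference between horizontal neighbours."""
    return [[abs(row[i + 1] - row[i]) for i in range(len(row) - 1)] for row in image]


def threshold(image, level):
    """Detection/segmentation: mark pixels brighter than `level` as object (1)."""
    return [[1 if value > level else 0 for value in row] for row in image]


def object_size_and_centroid(mask):
    """High-level processing: estimate area and centroid of the object region."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return 0, None
    area = len(points)
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area
    return area, (cx, cy)


# A dark image containing one bright 2x2 object.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
edges = horizontal_edges(image)
mask = threshold(image, level=4)
area, centroid = object_size_and_centroid(mask)
```

Real systems replace each stage with far more robust methods, but the flow of data is the same: a full image goes in, and a small set of application-specific parameters comes out.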

Robotics

The Shadow robot hand system

Robotics is the science and technology of robots, and their design, manufacture, and application. Robotics has connections to electronics, mechanics, and software.


Origins

Stories of artificial helpers and companions and attempts to create them have a long history, but fully autonomous machines only appeared in the 20th century. The first digitally operated and programmable robot, the Unimate, was installed in 1961 to lift hot pieces of metal from a die casting machine and stack them. Today, commercial and industrial robots are in widespread use performing jobs more cheaply or with greater accuracy and reliability than humans. They are also employed for jobs which are too dirty, dangerous, or dull to be suitable for humans. Robots are widely used in manufacturing, assembly and packing, transport, earth and space exploration, surgery, weaponry, laboratory research, safety, and mass production of consumer and industrial goods.

Date | Significance | Robot name | Inventor
First century A.D. and earlier | Descriptions of more than 100 machines and automata, including a fire engine, a wind organ, a coin-operated machine, and a steam-powered engine, in Pneumatica and Automata by Heron of Alexandria | | Ctesibius, Philo of Byzantium, Heron of Alexandria, and others
1206 | First programmable humanoid robots | Boat with four robotic musicians | Al-Jazari
c. 1495 | Designs for a humanoid robot | Mechanical knight | Leonardo da Vinci
1738 | Mechanical duck that was able to eat, flap its wings, and excrete | Digesting Duck | Jacques de Vaucanson
1800s | Japanese mechanical toys that served tea, fired arrows, and painted | Karakuri toys | Tanaka Hisashige
1921 | First fictional automatons called "robots" appear in the play R.U.R. | Rossum's Universal Robots | Karel Čapek
1930s | Humanoid robot exhibited at the 1939 and 1940 World's Fairs | Elektro | Westinghouse Electric Corporation
1948 | Simple robots exhibiting biological behaviors[4] | Elsie and Elmer | William Grey Walter
1956 | First commercial robot, from the Unimation company founded by George Devol and Joseph Engelberger, based on Devol's patents[5] | Unimate | George Devol
1961 | First installed industrial robot | Unimate | George Devol
1963 | First palletizing robot[6] | Palletizer | Fuji Yusoki Kogyo
1973 | First industrial robot with six electromechanically driven axes[7] | Famulus | KUKA Robot Group
1975 | Programmable universal manipulation arm, a Unimation product | PUMA | Victor Scheinman

According to the Oxford English Dictionary, the word robotics was first used in print by Isaac Asimov, in his science fiction short story "Liar!", published in May 1941 in Astounding Science Fiction. Asimov was unaware that he was coining the term; since the science and technology of electrical devices is electronics, he assumed robotics already referred to the science and technology of robots. The word robot was introduced to the public by Czech writer Karel Čapek in his play R.U.R. (Rossum's Universal Robots), which premiered in 1921.

Components of robots

Structure

The structure of a robot is usually mostly mechanical and can be called a kinematic chain (its functionality being similar to the skeleton of the human body). The chain is formed of links (its bones), actuators (its muscles), and joints which can allow one or more degrees of freedom. Most contemporary robots use open serial chains in which each link connects the one before to the one after it. These robots are called serial robots and often resemble the human arm. Some robots, such as the Stewart platform, use a closed parallel kinematical chain. Other structures, such as those that mimic the mechanical structure of humans, various animals, and insects, are comparatively rare. However, the development and use of such structures in robots is an active area of research (e.g. biomechanics). Robots used as manipulators have an end effector mounted on the last link. This end effector can be anything from a welding device to a mechanical hand used to manipulate the environment.

Power source

At present, mostly (lead-acid) batteries are used, but potential power sources include:

  • compressed air canisters
  • flywheel energy storage
  • organic garbage (through anaerobic digestion)
  • feces (human, animal); may be interesting in a military context, as the feces of small combat groups could be reused to meet the energy requirements of the robot assistant (see DEKA's project Slingshot Stirling engine for how the system would operate)
  • still untested energy sources (e.g. Joe Cell)
  • radioactive sources (as in the Ford car proposed in the 1950s); also proposed in movies such as Red Planet

Actuation

A robot leg powered by Air Muscles

Actuators are the "muscles" of a robot, the parts which convert stored energy into movement. By far the most popular actuators are electric motors, but there are many others, powered by electricity, chemicals, and compressed air.

  • Motors: The vast majority of robots use electric motors, including brushed and brushless DC motors.
  • Stepper motors: As the name suggests, stepper motors do not spin freely like DC motors; they rotate in discrete steps, under the command of a controller. This makes them easier to control, as the controller knows exactly how far they have rotated, without having to use a sensor. Therefore, they are used on many robots and CNC machines.
  • Piezo motors: Recent alternatives to DC motors are piezo motors, or ultrasonic motors. These work on a fundamentally different principle, whereby tiny piezoceramic elements, vibrating many thousands of times per second, cause linear or rotary motion. There are different mechanisms of operation; one type uses the vibration of the piezo elements to walk the motor in a circle or a straight line. Another type uses the piezo elements to cause a nut to vibrate and drive a screw. The advantages of these motors are nanometer resolution, speed, and available force for their size. These motors are already available commercially, and are being used on some robots.
  • Air muscles: The air muscle is a simple yet powerful device for providing a pulling force. When inflated with compressed air, it contracts by up to 40% of its original length. The key to its behavior is the braiding visible around the outside, which forces the muscle to be either long and thin, or short and fat. Since it behaves in a very similar way to a biological muscle, it can be used to construct robots with a similar muscle/skeleton system to an animal. For example, the Shadow robot hand uses 40 air muscles to power its 24 joints.
  • Electroactive polymers: Electroactive polymers (EAPs) are a class of plastics which change shape in response to electrical stimulation. They can be designed so that they bend, stretch, or contract, but so far there are no EAPs suitable for commercial robots, as they tend to have low efficiency or are not robust. Indeed, all of the entrants in a recent competition to build EAP-powered arm wrestling robots were beaten by a 17-year-old girl. However, they are expected to improve in the future, and may be useful for microrobotic applications.
  • Elastic nanotubes: These are a promising, early-stage experimental technology. The absence of defects in nanotubes enables these filaments to deform elastically by several percent, with energy storage levels of perhaps 10 J per cubic centimetre for metal nanotubes. Human biceps could be replaced with an 8 mm diameter wire of this material. Such compact "muscle" might allow future robots to outrun and outjump humans.
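
The point made above about stepper motors, that the controller knows how far the motor has rotated without a sensor, can be sketched as simple step counting. The class name, the default of 200 steps per revolution, and the assumption that no steps are ever missed are illustrative choices for this example. A real motor that stalls or is overloaded would drift from this estimate.

```python
class StepperMotor:
    """Open-loop model of a stepper motor: position is inferred from step count."""

    def __init__(self, steps_per_rev=200):
        # 200 steps per revolution (1.8 degrees per step) is an assumed value.
        self.steps_per_rev = steps_per_rev
        self.step_count = 0

    def step(self, n):
        """Command n steps; a negative n rotates in the reverse direction."""
        self.step_count += n

    @property
    def angle_deg(self):
        """Shaft angle in degrees within one revolution, from counting alone."""
        return (self.step_count % self.steps_per_rev) * 360.0 / self.steps_per_rev


motor = StepperMotor()
motor.step(50)
quarter_turn = motor.angle_deg
motor.step(-100)
after_reverse = motor.angle_deg
```

The controller never reads a sensor here; it trusts its own command history. This is why steppers are simple to use, and also why an undetected missed step produces a persistent position error.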

Manipulation

Robots which must work in the real world require some way to manipulate objects: to pick up, modify, destroy, or otherwise have an effect. Thus the 'hands' of a robot are often referred to as end effectors, while the arm is referred to as a manipulator. Most robot arms have replaceable effectors, each allowing them to perform some small range of tasks. Some have a fixed manipulator which cannot be replaced, while a few have one very general purpose manipulator, for example a humanoid hand.

  • Mechanical Grippers: One of the most common effectors is the gripper. In its simplest manifestation it consists of just two fingers which can open and close to pick up and let go of a range of small objects. See end effectors.
  • Vacuum Grippers: Pick and place robots for electronic components and for large objects like car windscreens will often use very simple vacuum grippers. These are very simple astrictive devices, but can hold very large loads provided the prehension surface is smooth enough to ensure suction.
  • General purpose effectors: Some advanced robots are beginning to use fully humanoid hands, like the Shadow Hand and the Schunk hand. These are highly dexterous manipulators, with as many as 20 degrees of freedom and hundreds of tactile sensors.

For the definitive guide to all forms of robot end effectors, their design, and usage, consult the book "Robot Grippers".

Locomotion

Rolling robots

Segway in the Robot museum in Nagoya.

For simplicity, most mobile robots have four wheels. However, some researchers have tried to create more complex wheeled robots, with only one or two wheels.

  • Two-wheeled balancing: While the Segway is not commonly thought of as a robot, it can be thought of as a component of a robot. Several real robots do use a similar dynamic balancing algorithm, and NASA's Robonaut has been mounted on a Segway.
  • Ballbot: Carnegie Mellon University researchers have developed a new type of mobile robot that balances on a ball instead of legs or wheels. "Ballbot" is a self-contained, battery-operated, omnidirectional robot that balances dynamically on a single urethane-coated metal sphere. It weighs 95 pounds and is the approximate height and width of a person. Because of its long, thin shape and ability to maneuver in tight spaces, it has the potential to function better than current robots can in environments with people.
  • Track Robot: Another type of rolling robot is one that has tracks, like NASA's Urban Robot, Urbie.

Walking robots

iCub robot, designed by the RobotCub Consortium

Walking is a difficult and dynamic problem to solve. Several robots have been made which can walk reliably on two legs; however, none have yet been made which are as robust as a human. Many robots have also been built that walk on more than two legs, as these are significantly easier to construct. Hybrids have also been proposed in movies such as I, Robot, where robots walk on two legs and switch to four (arms and legs) when sprinting. Typically, robots on two legs can walk well on flat floors, and can occasionally walk up stairs. None can walk over rocky, uneven terrain. Some of the methods which have been tried are:

  • ZMP Technique: The Zero Moment Point (ZMP) is the algorithm used by robots such as Honda's ASIMO. The robot's onboard computer tries to keep the total inertial forces (the combination of earth's gravity and the acceleration and deceleration of walking) exactly opposed by the floor reaction force (the force of the floor pushing back on the robot's foot). In this way, the two forces cancel out, leaving no moment (force causing the robot to rotate and fall over). However, this is not exactly how a human walks, and the difference is quite apparent to human observers, some of whom have pointed out that ASIMO walks as if it needs the lavatory. ASIMO's walking algorithm is not static, and some dynamic balancing is used (see below). However, it still requires a smooth surface to walk on.
  • Hopping: Several robots, built in the 1980s by Marc Raibert at the MIT Leg Laboratory, successfully demonstrated very dynamic walking. Initially, a robot with only one leg, and a very small foot, could stay upright simply by hopping. The movement is the same as that of a person on a pogo stick. As the robot falls to one side, it would jump slightly in that direction, in order to catch itself. Soon, the algorithm was generalised to two and four legs. A bipedal robot was demonstrated running and even performing somersaults. A quadruped was also demonstrated which could trot, run, pace, and bound. For a full list of these robots, see the MIT Leg Lab Robots page.
  • Dynamic Balancing: A more advanced way for a robot to walk is by using a dynamic balancing algorithm, which is potentially more robust than the Zero Moment Point technique, as it constantly monitors the robot's motion, and places the feet in order to maintain stability. This technique was recently demonstrated by Anybots' Dexter Robot, which is so stable, it can even jump.
  • Passive Dynamics: Perhaps the most promising approach utilizes passive dynamics where the momentum of swinging limbs is used for greater efficiency. It has been shown that totally unpowered humanoid mechanisms can walk down a gentle slope, using only gravity to propel themselves. Using this technique, a robot need only supply a small amount of motor power to walk along a flat surface or a little more to walk up a hill. This technique promises to make walking robots at least ten times more efficient than ZMP walkers, like ASIMO.
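
To make the ZMP idea above more concrete, the sketch below uses a common textbook simplification, the "cart-table" model. The robot is treated as a point mass at constant height, and along one horizontal axis the ZMP is the centre-of-mass position minus (height / g) times the centre-of-mass acceleration. This is an assumed simplification for illustration, not ASIMO's actual controller, and the function names and foot dimensions are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def zmp_x(com_x, com_height, com_accel_x, g=G):
    """Zero Moment Point along one axis for a point mass at constant height."""
    return com_x - (com_height / g) * com_accel_x


def is_balanced(zmp, foot_min_x, foot_max_x):
    """The robot does not tip while the ZMP stays inside the support area."""
    return foot_min_x <= zmp <= foot_max_x


# Standing still: the ZMP lies directly below the centre of mass.
standing = zmp_x(com_x=0.0, com_height=0.8, com_accel_x=0.0)

# Accelerating forward pushes the ZMP backward, here past the heel of
# a foot assumed to span from -0.05 m to +0.15 m.
accelerating = zmp_x(com_x=0.0, com_height=0.981, com_accel_x=1.0)
tips_over = not is_balanced(accelerating, foot_min_x=-0.05, foot_max_x=0.15)
```

A ZMP-based walker plans its centre-of-mass motion so that this point always remains under the supporting foot, which is one reason such robots need a flat, predictable surface.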


Other methods of locomotion

RQ-4 Global Hawk unmanned aerial vehicle
  • Flying: A modern passenger airliner is essentially a flying robot, with two humans to manage it. The autopilot can control the plane for each stage of the journey, including takeoff, normal flight, and even landing. Other flying robots are uninhabited, and are known as unmanned aerial vehicles (UAVs). They can be smaller and lighter without a human pilot onboard, and fly into dangerous territory for military surveillance missions. Some can even fire on targets under command. UAVs are also being developed which can fire on targets automatically, without the need for a command from a human. However, these robots are unlikely to see service in the foreseeable future because of the moral issues involved. Other flying robots include cruise missiles, the Entomopter, and the Epson micro helicopter robot.
Two robot snakes. Left one has 64 motors (with 2 degrees of freedom per segment), the right one 10.
  • Snaking: Several snake robots have been successfully developed. Mimicking the way real snakes move, these robots can navigate very confined spaces, meaning they may one day be used to search for people trapped in collapsed buildings. The Japanese ACM-R5 snake robot can even navigate both on land and in water.
  • Skating: A small number of skating robots have been developed, one of which is a multi-mode walking and skating device, Titan VIII. It has four legs, with unpowered wheels, which can either step or roll. Another robot, Plen, can use a miniature skateboard or rollerskates, and skate across a desktop.
  • Swimming: It is calculated that when swimming some fish can achieve a propulsive efficiency greater than 90%. Furthermore, they can accelerate and maneuver far better than any man-made boat or submarine, and produce less noise and water disturbance. Therefore, many researchers studying underwater robots would like to copy this type of locomotion. Notable examples are the Essex University Computer Science Robotic Fish, and the Robot Tuna built by the Institute of Field Robotics, to analyze and mathematically model thunniform motion.


Environmental interaction and navigation

RADAR, GPS, LIDAR, ... are all combined to provide proper navigation and obstacle avoidance

Robots also require navigation hardware and software in order to anticipate their environment. In particular, unforeseen events (e.g. people and other obstacles that are not stationary) can cause problems or collisions. Some highly advanced robots, such as ASIMO, EveR-1, and the Meinü robot, have particularly good navigation hardware and software. Self-controlled cars, Ernst Dickmanns' driverless car, and the entries in the DARPA Grand Challenge are also capable of sensing the environment well and making navigation decisions based on this information. Most of these robots include a regular GPS navigation device with waypoints, along with radar, sometimes combined with other sensor data such as LIDAR, video cameras, and inertial guidance systems for better navigation in between waypoints.

Human interaction

Kismet can produce a range of facial expressions.

If robots are to work effectively in homes and other non-industrial environments, the way they are instructed to perform their jobs, and especially how they will be told to stop will be of critical importance. The people who interact with them may have little or no training in robotics, and so any interface will need to be extremely intuitive. Science fiction authors also typically assume that robots will eventually be capable of communicating with humans through speech, gestures, and facial expressions, rather than a command-line interface. Although speech would be the most natural way for the human to communicate, it is quite unnatural for the robot. It will be quite a while before robots interact as naturally as the fictional C-3PO.

  • Speech recognition: Interpreting the continuous flow of sounds coming from a human (speech recognition), in real time, is a difficult task for a computer, mostly because of the great variability of speech. The same word, spoken by the same person, may sound different depending on local acoustics, volume, the previous word, whether or not the speaker has a cold, etc. It becomes even harder when the speaker has a different accent. Nevertheless, great strides have been made in the field since Davis, Biddulph, and Balashek designed the first "voice input system" which recognized "ten digits spoken by a single user with 100% accuracy" in 1952. Currently, the best systems can recognize continuous, natural speech, up to 160 words per minute, with an accuracy of 95%.
  • Gestures: One can imagine, in the future, explaining to a robot chef how to make a pastry, or asking directions from a robot police officer. On both of these occasions, making hand gestures would aid the verbal descriptions. In the first case, the robot would be recognizing gestures made by the human, and perhaps repeating them for confirmation. In the second case, the robot police officer would gesture to indicate "down the road, then turn right". It is quite likely that gestures will make up a part of the interaction between humans and robots. A great many systems have been developed to recognize human hand gestures.
  • Facial expression: Facial expressions can provide rapid feedback on the progress of a dialog between two humans, and soon they may be able to do the same for humans and robots. Frubber robotic faces have been constructed by Hanson Robotics, allowing a great number of facial expressions thanks to the elasticity of the rubber facial coating and embedded subsurface motors (servos) that produce the expressions. The coating and servos are built on top of a metal skull. A robot should know how to approach a human, judging by their facial expression and body language. Whether the person is happy, frightened, or crazy-looking affects the type of interaction expected of the robot. Likewise, a robot like Kismet can produce a range of facial expressions, allowing it to have meaningful social exchanges with humans.
  • Artificial emotions: Artificial emotions can also be embedded and are composed of a sequence of facial expressions and/or gestures. As can be seen from the movie Final Fantasy: The Spirits Within, the programming of these artificial emotions is quite complex and requires a great amount of human observation. To simplify this programming in the movie, presets were created together with a special software program. This allowed the producers to greatly reduce the time required to make the film. These presets could possibly be transferred for use in real-life robots.
  • Personality: Many of the robots of science fiction have a personality, and that is something which may or may not be desirable in the commercial robots of the future. Nevertheless, researchers are trying to create robots which appear to have a personality, i.e. they use sounds, facial expressions and body language to try to convey an internal state, which may be joy, sadness, or fear. One commercial example is Pleo, a toy robot dinosaur, which can exhibit several apparent emotions.

Control

A robot-manipulated marionette, with complex control systems

The mechanical structure of a robot must be controlled to perform tasks. The control of a robot involves three distinct phases - perception, processing, and action (robotic paradigms). Sensors give information about the environment or the robot itself (e.g. the position of its joints or its end effector). This information is then processed to calculate the appropriate signals to the actuators (motors) which move the mechanical structure.

The processing phase can range in complexity. At a reactive level, it may translate raw sensor information directly into actuator commands. Sensor fusion may first be used to estimate parameters of interest (e.g. the position of the robot's gripper) from noisy sensor data. An immediate task (such as moving the gripper in a certain direction) is inferred from these estimates. Techniques from control theory convert the task into commands that drive the actuators.
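
The perception, processing, and action cycle described above can be sketched as a short loop. The example assumes a one-dimensional gripper position, the crudest possible sensor fusion (averaging several readings), a proportional controller as the control-theory step, and an idealised actuator that moves exactly as commanded. All names, the gain value, and the sensor model are hypothetical.

```python
def fuse(readings):
    """Sensor fusion in its simplest form: average several noisy readings."""
    return sum(readings) / len(readings)


def proportional_command(target, estimate, gain=0.5):
    """Control step: command a move proportional to the remaining error."""
    return gain * (target - estimate)


def run(target, position, steps, read_sensors):
    """Repeat the perception, processing, and action cycle `steps` times."""
    for _ in range(steps):
        estimate = fuse(read_sensors(position))            # perception
        command = proportional_command(target, estimate)   # processing
        position += command                                # action (ideal actuator)
    return position


def two_sensors(position):
    """Hypothetical sensor pair with equal and opposite measurement errors."""
    return [position - 0.01, position + 0.01]


final_position = run(target=1.0, position=0.0, steps=20, read_sensors=two_sensors)
```

With a gain of 0.5 the remaining error halves on every cycle, so the gripper converges on the target. Real controllers add integral and derivative terms, actuator limits, and better estimators, but the loop structure is the same.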

At longer time scales or with more sophisticated tasks, the robot may need to build and reason with a "cognitive" model. Cognitive models try to represent the robot, the world, and how they interact. Pattern recognition and computer vision can be used to track objects. Mapping techniques can be used to build maps of the world. Finally, motion planning and other artificial intelligence techniques may be used to figure out how to act. For example, a planner may figure out how to achieve a task without hitting obstacles, falling over, etc.

Control systems may also have varying levels of autonomy. Direct interaction is used for haptic or tele-operated devices, and the human has nearly complete control over the robot's motion. Operator-assist modes have the operator commanding medium-to-high-level tasks, with the robot automatically figuring out how to achieve them. An autonomous robot may go for extended periods of time without human interaction. Higher levels of autonomy do not necessarily require more complex cognitive capabilities. For example, robots in assembly plants are completely autonomous, but operate in a fixed pattern.


Dynamics and kinematics

The study of motion can be divided into kinematics and dynamics. Direct kinematics refers to the calculation of end effector position, orientation, velocity, and acceleration when the corresponding joint values are known. Inverse kinematics refers to the opposite case in which required joint values are calculated for given end effector values, as done in path planning. Some special aspects of kinematics include handling of redundancy (different possibilities of performing the same movement), collision avoidance, and singularity avoidance. Once all relevant positions, velocities, and accelerations have been calculated using kinematics, methods from the field of dynamics are used to study the effect of forces upon these movements. Direct dynamics refers to the calculation of accelerations in the robot once the applied forces are known. Direct dynamics is used in computer simulations of the robot. Inverse dynamics refers to the calculation of the actuator forces necessary to create a prescribed end effector acceleration. This information can be used to improve the control algorithms of a robot.
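
Direct and inverse kinematics can be illustrated with the standard planar two-link arm. This is a minimal sketch under assumed conditions: both links lie in a plane, angles are in radians, and the link lengths default to 1.0. The function names and the flip_elbow flag are chosen for this example. The flag selects between the two joint configurations that reach the same point, a small instance of there being different ways to achieve the same end effector position.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Direct kinematics: joint angles -> end effector position (x, y)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0, flip_elbow=False):
    """Inverse kinematics: end effector position -> joint angles."""
    cos_theta2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_theta2 <= 1.0:
        raise ValueError("target is out of reach")
    theta2 = math.acos(cos_theta2)
    if flip_elbow:
        theta2 = -theta2  # the mirrored solution for the same target
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


target = forward_kinematics(0.3, 0.8)
solution_a = inverse_kinematics(*target)
solution_b = inverse_kinematics(*target, flip_elbow=True)
```

Both solutions place the end effector at the same target, which can be checked by passing each back through forward_kinematics. A target at or near the edge of the workspace, where the arm is fully stretched, is an example of a singular configuration.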

In each area mentioned above, researchers strive to develop new concepts and strategies, improve existing ones, and improve the interaction between these areas. To do this, criteria for "optimal" performance and ways to optimize design, structure, and control of robots must be developed and implemented.


Robot Research

Much of the research in robotics focuses not on specific industrial tasks, but on investigations into new types of robots, alternative ways to think about or design robots, and new ways to manufacture them.

A first notable innovation in robot design is the open-sourcing of robot projects. To describe the level of advancement of a robot, the term Generation Robots can be used. This term was coined by Professor Hans Moravec, Principal Research Scientist at the Carnegie Mellon University Robotics Institute, in describing the near-future evolution of robot technology. First generation robots, Moravec predicted in 1997, should have an intellectual capacity comparable to perhaps a lizard and should become available by 2010. Because the first generation robot would be incapable of learning, however, Professor Moravec predicts that the second generation robot would be an improvement over the first and become available by 2020, with an intelligence maybe comparable to that of a mouse. The third generation robot should have an intelligence comparable to that of a monkey. Though Professor Moravec predicts that fourth generation robots, robots with human intelligence, would become possible, he does not predict this happening before around 2040 or 2050.

The second is Evolutionary Robots. This is a methodology that uses evolutionary computation to help design robots, especially the body form, or motion and behavior controllers. In a similar way to natural evolution, a large population of robots is allowed to compete in some way, or their ability to perform a task is measured using a fitness function. Those that perform worst are removed from the population, and replaced by a new set, which have new behaviors based on those of the winners. Over time the population improves, and eventually a satisfactory robot may appear. This happens without any direct programming of the robots by the researchers. Researchers use this method both to create better robots, and to explore the nature of evolution. Because the process often requires many generations of robots to be simulated, this technique may be run entirely or mostly in simulation, then tested on real robots once the evolved algorithms are good enough.
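
The evolutionary loop described above can be sketched in a few lines. In this toy version a "robot" is just a list of numbers (its genome), and the fitness function simply rewards genomes close to an arbitrary target. In real evolutionary robotics the fitness would come from simulating or running the robot on a task. The population size, mutation strength, target values, and function names are all assumptions for this example.

```python
import random


def evolve(fitness, genome_length=3, population_size=20, generations=50, seed=0):
    """Evolve genomes by keeping the better half and mutating copies of it."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [
        [rng.uniform(-1, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Rank by fitness and remove the worst-performing half.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        # Replace them with mutated copies of randomly chosen survivors.
        children = [
            [gene + rng.gauss(0, 0.1) for gene in rng.choice(survivors)]
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


def toy_fitness(genome):
    """Stand-in for a task score: higher (closer to zero) is better."""
    target = [0.5, -0.2, 0.8]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


best = evolve(toy_fitness)
```

No genome is programmed directly; selection and mutation alone move the population toward higher fitness. Because each evaluation may require a full simulation, the number of generations is usually the main cost.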


Education and Training

The SCORBOT-ER 4u - educational robot.

Robotics as an undergraduate area of study is fairly common, although few universities offer robotics degrees.

In the United States, only Worcester Polytechnic Institute offers a Bachelor of Science in Robotics Engineering. Universities that have graduate degrees focused on robotics include Carnegie Mellon University, MIT, UPENN, and UCLA.

In Australia, there are Bachelor of Engineering degrees at the universities belonging to the Centre for Autonomous Systems (CAS): University of Sydney, University of New South Wales, and the University of Technology, Sydney. Other universities include Deakin University, Flinders University, Swinburne University of Technology, and the University of Western Sydney. Others offer degrees in Mechatronics.

In India a post-graduate degree in Mechatronics is offered at Madras Institute of Technology, Chennai.

In the UK, Robotics degrees are offered by a number of institutions including the Heriot-Watt University, University of Essex, the University of Liverpool, University of Reading, Sheffield Hallam University, Staffordshire University, University of Sussex, Robert Gordon University and the University of Wales, Newport.

In Mexico, the Monterrey Institute of Technology and Higher Education offers a Bachelor of Science in Digital Systems and Robotics Engineering and a Bachelor of Science in Mechatronics.

In Iran, the Shahrood University of Technology and Hamedan University of Technology offer a Bachelor of Science in Robotics Engineering. Others offer degrees in Mechatronics. Universities that have graduate degrees focused on Mechatronics include Sharif University of Technology, Amirkabir University of Technology, Khajeh Nasiroddin Tusi University of Technology, Tabriz University, and Semnan University.

Robots have recently become a popular tool for raising interest in computing among middle and high school students. First-year computer science courses have been developed at several universities that involve programming a robot instead of the traditional software engineering based coursework. Examples include Course 6 at MIT and the Institute for Personal Robots in Education at the Georgia Institute of Technology with Bryn Mawr College.

Some specialised robotics jobs require new skills, such as those of robot installer and robot integrator. While universities have long included robotics research in their curricular offerings and tech schools have taught industrial robotic arm control, new college programs in applied mobile robots are under development at universities in both the US and EU, with help from Microsoft, MobileRobots Inc and other companies encouraging the growth of robotics.

Employment in robotics

A robot technician builds small all-terrain robots. (Courtesy: MobileRobots Inc)

As the number of robots increases, robotics-related jobs grow. Some jobs require existing job skills, such as building cables, assembling parts and testing.


Healthcare

Script Pro manufactures a robot designed to help pharmacies fill prescriptions that consist of oral solids, that is, medications in pill form. The pharmacist or pharmacy technician enters the prescription information into the pharmacy's information system. The system determines whether the drug is stocked in the robot and, if it is, sends the information to the robot for filling. The robot fills vials of three different sizes, chosen according to the size of the pill. The robot technician, user, or pharmacist determines the needed vial size for each tablet when the robot is stocked. Once the vial is filled, it is brought up to a conveyor belt that delivers it to a holder that spins the vial and attaches the patient label. It is then set on another conveyor that delivers the patient's medication vial to a slot labeled with the patient's name on an LED readout. The pharmacist or technician then checks the contents of the vial to ensure it is the correct drug for the correct patient, seals the vial, and sends it out front to be picked up. The robot is a time-efficient device that the pharmacy depends on to fill prescriptions.

McKesson's Robot RX is another healthcare robotics product; it helps inpatient pharmacies dispense thousands of medications daily with few or no errors. The robot can be ten feet wide and thirty feet long and can hold hundreds of different kinds of medications and thousands of doses. It saves the pharmacy resources, such as staff time, that are otherwise scarce in a resource-constrained industry. It uses an electromechanical head coupled with a pneumatic system to capture each dose and deliver it to either its stocked or its dispensed location. The head moves along a single axis while it rotates 180 degrees to pull the medications. During this process it uses barcode technology to verify that it is pulling the correct drug. It then delivers the drug to a patient-specific bin on a conveyor belt. Once the bin is filled with all of the drugs that a particular patient needs and that the robot stocks, the bin is released and returned on the conveyor belt to a technician waiting to load it into a cart for delivery to the floor.

TUG robots, from Aethon, are medication delivery robots used by hospital inpatient pharmacies. They are stationed at or near the pharmacy on a charging base designed to keep their batteries at optimal levels. Once the pharmacy has a number of meds to send to the floors, staff load a TUG by entering their code to unlock the drawers and sorting the meds by delivery station. After the TUG has been loaded, the user selects the locations in the order they want them delivered and then hits the send button. The TUG backs up, turns, and sets off on its path to its destination. It uses a series of navigational tools to find its way around. For the most part it is laser guided and uses a 180-degree laser to check for walls and obstacles in its path. It also makes use of infrared sensors and sonar for navigation, obstacle avoidance, and detection. Using these navigational tools, it follows an internal map, created by the TUG itself together with an Implementation Specialist from Aethon, to drive down a planned path to its destinations. If it needs to navigate between floors, the company will, with help from an elevator vendor, set up an elevator computer interface, and the TUG will communicate wirelessly with an elevator controller to gain access to and control of an elevator to take it to the desired floor. From that point the TUG will make its delivery, return home, and wait for another delivery.

Tuesday, January 27, 2009

Artificial intelligence

Artificial intelligence (AI) is the intelligence of machines and the branch of computer science which aims to create it. Major AI textbooks define the field as "the study and design of intelligent agents," where an intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. John McCarthy, who coined the term in 1956, defines it as "the science and engineering of making intelligent machines."

The field was founded on the claim that a central property of human beings, intelligence—the sapience of Homo sapiens—can be so precisely described that it can be simulated by a machine. This raises philosophical issues about the nature of the mind and limits of scientific hubris, issues which have been addressed by myth, fiction and philosophy since antiquity. Artificial intelligence has been the subject of breathtaking optimism, has suffered stunning setbacks and, today, has become an essential part of the technology industry, providing the heavy lifting for many of the most difficult problems in computer science.

AI research is highly technical and specialized, so much so that some critics decry the "fragmentation" of the field. Subfields of AI are organized around particular problems, the application of particular tools and long-standing theoretical differences of opinion. The central problems of AI include such traits as reasoning, knowledge, planning, learning, communication, perception and the ability to move and manipulate objects. General intelligence (or "strong AI") is still a long-term goal of (some) research.



Perspectives on AI

AI in myth, fiction and speculation

Thinking machines and artificial beings appear in Greek myths, such as Talos of Crete, the golden robots of Hephaestus and Pygmalion's Galatea. Human likenesses believed to have intelligence were built in many ancient societies; some of the earliest being the sacred statues worshipped in Egypt and Greece, and including the machines of Yan Shi, Hero of Alexandria, Al-Jazari or Wolfgang von Kempelen. It was widely believed that artificial beings had been created by Geber, Judah Loew and Paracelsus. Stories of these creatures and their fates discuss many of the same hopes, fears and ethical concerns that are presented by artificial intelligence.

Mary Shelley's Frankenstein considers a key issue in the ethics of artificial intelligence: if a machine can be created that has intelligence, could it also feel? If it can feel, does it have the same rights as a human being? The idea also appears in modern science fiction: the film A.I. Artificial Intelligence considers a machine in the form of a small boy which has been given the ability to feel human emotions, including, tragically, the capacity to suffer. This issue, now known as "robot rights", is currently being considered by, for example, California's Institute for the Future, although many critics believe that the discussion is premature.

Another issue explored by both science fiction writers and futurists is the impact of artificial intelligence on society. In fiction, AI has appeared as a saviour of the human race (R. Daneel Olivaw in the Foundation Series), a servant (R2D2 in Star Wars), a comrade (Lt. Commander Data in Star Trek), an extension to human abilities (Ghost in the Shell), a conqueror (The Matrix), a dictator (With Folded Hands), an exterminator (Terminator, Battlestar Galactica) and a race (Asurans in "Stargate Atlantis"). Academic sources have considered such consequences as: a decreased demand for human labor; the enhancement of human ability or experience; and a need for redefinition of human identity and basic values.

Several futurists argue that artificial intelligence will transcend the limits of progress and fundamentally transform humanity. Ray Kurzweil has used Moore's law (which describes the relentless exponential improvement in digital technology with uncanny accuracy) to calculate that desktop computers will have the same processing power as human brains by the year 2029, and that by 2045 artificial intelligence will reach a point where it is able to improve itself at a rate that far exceeds anything conceivable in the past, a scenario that science fiction writer Vernor Vinge named the "technological singularity". Edward Fredkin argues that "artificial intelligence is the next stage in evolution," an idea first proposed by Samuel Butler's Darwin Among the Machines (1863), and expanded upon by George Dyson in his book of the same name in 1998. Several futurists and science fiction writers have predicted that human beings and machines will merge in the future into cyborgs that are more capable and powerful than either. This idea, called transhumanism, which has roots in Aldous Huxley and Robert Ettinger, is now associated with robot designer Hans Moravec, cyberneticist Kevin Warwick and inventor Ray Kurzweil. Transhumanism has been illustrated in fiction as well, for example in the manga Ghost in the Shell and the science fiction series Dune. Pamela McCorduck writes that these scenarios are expressions of an ancient human desire to, as she calls it, "forge the gods."


History of AI research

In the middle of the 20th century, a handful of scientists began a new approach to building intelligent machines, based on recent discoveries in neurology, a new mathematical theory of information, an understanding of control and stability called cybernetics, and, above all, the invention of the digital computer, a machine based on the abstract essence of mathematical reasoning.

The field of modern AI research was founded at a conference on the campus of Dartmouth College in the summer of 1956. Those who attended would become the leaders of AI research for many decades, especially John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon, who founded AI laboratories at MIT, CMU and Stanford. They and their students wrote programs that were, to most people, simply astonishing: computers were solving word problems in algebra, proving logical theorems and speaking English. By the middle 60s their research was heavily funded by the U.S. Department of Defense and they were optimistic about the future of the new field:

  • 1965, H. A. Simon: "[M]achines will be capable, within twenty years, of doing any work a man can do"
  • 1967, Marvin Minsky: "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved."

These predictions, and many like them, would not come true. The researchers had failed to recognize the difficulty of some of the problems they faced. In 1974, in response to the criticism of England's Sir James Lighthill and ongoing pressure from Congress to fund more productive projects, the U.S. and British governments cut off funding for all undirected, exploratory research in AI. This was the first AI Winter.

In the early 80s, AI research was revived by the commercial success of expert systems (a form of AI program that simulated the knowledge and analytical skills of one or more human experts). By 1985 the market for AI had reached more than a billion dollars and governments around the world poured money back into the field. However, just a few years later, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, more lasting AI Winter began.

In the 90s and early 21st century AI achieved its greatest successes, albeit somewhat behind the scenes. Artificial intelligence was adopted throughout the technology industry, providing the heavy lifting for logistics, data mining, medical diagnosis and many other areas. The success was due to several factors: the incredible power of computers today, a greater emphasis on solving specific subproblems, the creation of new ties between AI and other fields working on similar problems, and above all a new commitment by researchers to solid mathematical methods and rigorous scientific standards.


Philosophy of AI

Artificial intelligence, by claiming to be able to recreate the capabilities of the human mind, is both a challenge and an inspiration for philosophy. Are there limits to how intelligent machines can be? Is there an essential difference between human intelligence and artificial intelligence? Can a machine have a mind and consciousness? A few of the most influential answers to these questions are given below.

  • Turing's "polite convention": If a machine acts as intelligently as a human being, then it is as intelligent as a human being. Alan Turing theorized that, ultimately, we can only judge the intelligence of a machine based on its behavior. This theory forms the basis of the Turing test.
  • The Dartmouth proposal: "Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it." This assertion was printed in the proposal for the Dartmouth Conference of 1956, and represents the position of most working AI researchers.
  • Newell and Simon's physical symbol system hypothesis: "A physical symbol system has the necessary and sufficient means of general intelligent action." This statement claims that the essence of intelligence is symbol manipulation. Hubert Dreyfus argued that, on the contrary, human expertise depends on unconscious instinct rather than conscious symbol manipulation and on having a "feel" for the situation rather than explicit symbolic knowledge.
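  • Gödel's incompleteness theorem: A consistent formal system powerful enough to express arithmetic (such as a computer program that reasons about numbers) cannot prove all true statements that can be expressed in it. Roger Penrose is among those who claim that Gödel's theorem limits what machines can do.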
  • Searle's strong AI hypothesis: "The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds." Searle counters this assertion with his Chinese room argument, which asks us to look inside the computer and try to find where the "mind" might be.
  • The artificial brain argument: The brain can be simulated. Hans Moravec, Ray Kurzweil and others have argued that it is technologically feasible to copy the brain directly into hardware and software, and that such a simulation will be essentially identical to the original. This argument combines the idea that a suitably powerful machine can simulate any process, with the materialist idea that the mind is the result of physical processes in the brain.

AI research

In the 21st century, AI research has become highly specialized and technical. It is deeply divided into subfields that often fail to communicate with each other. Subfields have grown up around particular institutions, the work of particular researchers, particular problems (listed below), long standing differences of opinion about how AI should be done (listed as "approaches" below) and the application of widely differing tools (see tools of AI, below).


Problems of AI

The problem of simulating (or creating) intelligence has been broken down into a number of specific sub-problems. These consist of particular traits or capabilities that researchers would like an intelligent system to display. The traits described below have received the most attention.

Deduction, reasoning, problem solving

Early AI researchers developed algorithms that imitated the step-by-step reasoning that human beings use when they solve puzzles, play board games or make logical deductions. By the late 80s and 90s, AI research had also developed highly successful methods for dealing with uncertain or incomplete information, employing concepts from probability and economics.

For difficult problems, most of these algorithms can require enormous computational resources — most experience a "combinatorial explosion": the amount of memory or computer time required becomes astronomical when the problem goes beyond a certain size. The search for more efficient problem solving algorithms is a high priority for AI research.
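
As a rough illustration of the combinatorial explosion described above, the following Python sketch counts how many orderings an exhaustive search would have to consider for a problem such as visiting n locations in the best order. The function name `orderings` and the choice of problem are assumptions made for this example; they are not taken from any particular AI system.

```python
from math import factorial


def orderings(n_items: int) -> int:
    """Number of distinct orders in which n_items can be arranged."""
    return factorial(n_items)


# The count grows far faster than the problem size itself.
for n in (5, 10, 15, 20):
    print(f"{n:2d} items -> {orderings(n):,} orderings to check")
```

Twenty items already give more than 2 × 10^18 orderings, which is why checking every possibility stops being practical well before problems reach a realistic size.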

Human beings solve most of their problems using fast, intuitive judgments rather than the conscious, step-by-step deduction that early AI research was able to model. AI has made some progress at imitating this kind of "sub-symbolic" problem solving: embodied approaches emphasize the importance of sensorimotor skills to higher reasoning; neural net research attempts to simulate the structures inside human and animal brains that give rise to this skill.

Knowledge representation

Knowledge representation and knowledge engineering are central to AI research. Many of the problems machines are expected to solve will require extensive knowledge about the world. Among the things that AI needs to represent are: objects, properties, categories and relations between objects; situations, events, states and time; causes and effects; knowledge about knowledge (what we know about what other people know); and many other, less well researched domains. A complete representation of "what exists" is an ontology (borrowing a word from traditional philosophy), of which the most general are called upper ontologies.

Among the most difficult problems in knowledge representation are:

Default reasoning and the qualification problem
Many of the things people know take the form of "working assumptions." For example, if a bird comes up in conversation, people typically picture an animal that is fist-sized, sings, and flies. None of these things are true about all birds. John McCarthy identified this problem in 1969 as the qualification problem: for any commonsense rule that AI researchers care to represent, there tend to be a huge number of exceptions. Almost nothing is simply true or false in the way that abstract logic requires. AI research has explored a number of solutions to this problem.
The breadth of commonsense knowledge
The number of atomic facts that the average person knows is astronomical. Research projects that attempt to build a complete knowledge base of commonsense knowledge (e.g., Cyc) require enormous amounts of laborious ontological engineering — they must be built, by hand, one complicated concept at a time.
The subsymbolic form of some commonsense knowledge
Much of what people know isn't represented as "facts" or "statements" that they could actually say out loud. For example, a chess master will avoid a particular chess position because it "feels too exposed" or an art critic can take one look at a statue and instantly realize that it is a fake. These are intuitions or tendencies that are represented in the brain non-consciously and sub-symbolically. Knowledge like this informs, supports and provides a context for symbolic, conscious knowledge. As with the related problem of sub-symbolic reasoning, it is hoped that situated AI or computational intelligence will provide ways to represent this kind of knowledge.
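
One very simple way to picture default reasoning about birds is a table of working assumptions plus a table of exceptions that override them. The sketch below is only an illustration under that assumption: the names `BIRD_DEFAULTS`, `EXCEPTIONS` and `believes` are hypothetical, and real default-reasoning systems are considerably more involved.

```python
# Working assumptions about a typical bird (defaults, not universal facts).
BIRD_DEFAULTS = {"flies": True, "sings": True}

# Hypothetical exceptions that override the defaults for specific kinds of bird.
EXCEPTIONS = {
    "penguin": {"flies": False, "sings": False},
    "ostrich": {"flies": False},
}


def believes(kind: str, prop: str) -> bool:
    """Return the most specific belief available: a listed exception first,
    otherwise the default for birds in general."""
    return EXCEPTIONS.get(kind, {}).get(prop, BIRD_DEFAULTS[prop])


print(believes("robin", "flies"))    # no exception listed, so the default applies
print(believes("penguin", "flies"))  # the exception overrides the default
print(believes("ostrich", "sings"))  # nobody listed an exception, so the default is assumed
```

The last call shows the qualification problem in miniature: the table of exceptions is never finished, so the system keeps falling back on defaults that may be wrong.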

Planning

Intelligent agents must be able to set goals and achieve them. They need a way to visualize the future (they must have a representation of the state of the world and be able to make predictions about how their actions will change it) and be able to make choices that maximize the utility (or "value") of the available choices.

In some planning problems, the agent can assume that it is the only thing acting on the world and it can be certain what the consequences of its actions may be. However, if this is not true, it must periodically check if the world matches its predictions and it must change its plan as this becomes necessary, requiring the agent to reason under uncertainty.
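
The idea of choosing the action that maximizes utility when outcomes are uncertain can be sketched in a few lines of Python. Everything concrete here (the two routes, the probabilities, the utility values and the function names) is invented for illustration.

```python
def expected_utility(action, outcomes, utility):
    """Utility of each predicted state, weighted by its predicted probability."""
    return sum(p * utility(state) for state, p in outcomes[action].items())


def choose_action(outcomes, utility):
    """Pick the action whose predicted consequences have the highest expected utility."""
    return max(outcomes, key=lambda a: expected_utility(a, outcomes, utility))


# The agent's (assumed) predictions: action -> {resulting state: probability}.
OUTCOMES = {
    "take_highway": {"on_time": 0.7, "late": 0.3},
    "take_back_roads": {"on_time": 0.9, "late": 0.1},
}

# How much the agent values each resulting state (assumed numbers).
UTILITY = {"on_time": 10.0, "late": -5.0}


def utility(state):
    return UTILITY[state]


best = choose_action(OUTCOMES, utility)
print(best)
```

An agent that cannot trust its predictions would rerun `choose_action` with updated probabilities each time it observes that the world has diverged from its model.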

Multi-agent planning uses the cooperation and competition of many agents to achieve a given goal. Emergent behavior such as this is used by evolutionary algorithms and swarm intelligence.

Learning

Machine learning has been central to AI research from the beginning. Unsupervised learning is the ability to find patterns in a stream of input. Supervised learning includes both classification (determining what category something belongs in, after seeing a number of examples of things from several categories) and regression (given a set of numerical input/output examples, discovering a continuous function that would generate the outputs from the inputs). In reinforcement learning the agent is rewarded for good responses and punished for bad ones. These can be analyzed in terms of decision theory, using concepts like utility. The mathematical analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory.
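
The two kinds of supervised learning mentioned above can be illustrated with very small pure-Python sketches: a least-squares line fit for regression and a one-nearest-neighbour rule for classification. These two techniques and the sample data are choices made for this example; they are far from the only methods used in practice.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor(examples, x):
    """Classification: return the label of the labelled example closest to x.
    `examples` is a list of (value, label) pairs."""
    return min(examples, key=lambda example: abs(example[0] - x))[1]


# Regression: the examples were generated by y = 2x + 1, which the fit recovers.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)

# Classification: a new measurement is assigned the category of its nearest example.
EXAMPLES = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
print(nearest_neighbor(EXAMPLES, 7.2))
```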

Natural language processing

Natural language processing gives machines the ability to read and understand the languages that human beings speak. Many researchers hope that a sufficiently powerful natural language processing system would be able to acquire knowledge on its own, by reading the existing text available over the internet. Some straightforward applications of natural language processing include information retrieval (or text mining) and machine translation.

Motion and manipulation

ASIMO uses sensors and intelligent algorithms to avoid obstacles and navigate stairs.

The field of robotics is closely related to AI. Intelligence is required for robots to be able to handle such tasks as object manipulation and navigation, with sub-problems of localization (knowing where you are), mapping (learning what is around you) and motion planning (figuring out how to get there).

Perception

Machine perception is the ability to use input from sensors (such as cameras, microphones, sonar and others more exotic) to deduce aspects of the world. Computer vision is the ability to analyze visual input. A few selected subproblems are speech recognition, facial recognition and object recognition.

Social intelligence

Kismet, a robot with rudimentary social skills.

Emotion and social skills play two roles for an intelligent agent:

  • It must be able to predict the actions of others, by understanding their motives and emotional states. (This involves elements of game theory, decision theory, as well as the ability to model human emotions and the perceptual skills to detect emotions.)
  • For good human-computer interaction, an intelligent machine also needs to display emotions — at the very least it must appear polite and sensitive to the humans it interacts with. At best, it should appear to have normal emotions itself.

Creativity

A sub-field of AI addresses creativity both theoretically (from a philosophical and psychological perspective) and practically (via specific implementations of systems that generate outputs that can be considered creative).

General intelligence

Most researchers hope that their work will eventually be incorporated into a machine with general intelligence (known as strong AI), combining all the skills above and exceeding human abilities at most or all of them. A few believe that anthropomorphic features like artificial consciousness or an artificial brain may be required for such a project.

Many of the problems above are considered AI-complete: to solve one problem, you must solve them all. For example, even a straightforward, specific task like machine translation requires that the machine follow the author's argument (reason), know what it's talking about (knowledge), and faithfully reproduce the author's intention (social intelligence). Machine translation, therefore, is believed to be AI-complete: it may require strong AI to be done as well as humans can do it.


Approaches to AI

There is no established unifying theory or paradigm that guides AI research. Researchers disagree about many issues. A few of the most long standing questions that have remained unanswered are these: Can intelligence be reproduced using high-level symbols, similar to words and ideas? Or does it require "sub-symbolic" processing? Should artificial intelligence simulate natural intelligence, by studying human psychology or animal neurobiology? Or is human biology as irrelevant to AI research as bird biology is to aeronautical engineering? Can intelligent behavior be described using simple, elegant principles (such as logic or optimization)? Or does artificial intelligence necessarily require solving many unrelated problems?

Cybernetics and brain simulation

The human brain provides inspiration for artificial intelligence researchers; however, there is no consensus on how closely it should be simulated.

In the 40s and 50s, a number of researchers explored the connection between neurology, information theory, and cybernetics. Some of them built machines that used electronic networks to exhibit rudimentary intelligence, such as W. Grey Walter's turtles and the Johns Hopkins Beast. Many of these researchers gathered for meetings of the Teleological Society at Princeton University and the Ratio Club in England.

Traditional symbolic AI

When access to digital computers became possible in the middle 1950s, AI research began to explore the possibility that human intelligence could be reduced to symbol manipulation. The research was centered in three institutions: CMU, Stanford and MIT, and each one developed its own style of research. John Haugeland named these approaches to AI "good old fashioned AI" or "GOFAI".

Cognitive simulation
Economist Herbert Simon and Allen Newell studied human problem solving skills and attempted to formalize them, and their work laid the foundations of the field of artificial intelligence, as well as cognitive science, operations research and management science. Their research team performed psychological experiments to demonstrate the similarities between human problem solving and the programs (such as their "General Problem Solver") they were developing. This tradition, centered at Carnegie Mellon University, would eventually culminate in the development of the Soar architecture in the middle 80s.
Logical AI
Unlike Newell and Simon, John McCarthy felt that machines did not need to simulate human thought, but should instead try to find the essence of abstract reasoning and problem solving, regardless of whether people used the same algorithms. His laboratory at Stanford (SAIL) focused on using formal logic to solve a wide variety of problems, including knowledge representation, planning and learning. Logic was also the focus of the work at the University of Edinburgh and elsewhere in Europe which led to the development of the programming language Prolog and the science of logic programming.
"Scruffy" symbolic AI
Researchers at MIT (such as Marvin Minsky and Seymour Papert) found that solving difficult problems in vision and natural language processing required ad-hoc solutions – they argued that there was no simple and general principle (like logic) that would capture all the aspects of intelligent behavior. Roger Schank described their "anti-logic" approaches as "scruffy" (as opposed to the "neat" paradigms at CMU and Stanford). Commonsense knowledge bases (such as Doug Lenat's Cyc) are an example of "scruffy" AI, since they must be built by hand, one complicated concept at a time.
Knowledge based AI
When computers with large memories became available around 1970, researchers from all three traditions began to build knowledge into AI applications. This "knowledge revolution" led to the development and deployment of expert systems (introduced by Edward Feigenbaum), the first truly successful form of AI software. The knowledge revolution was also driven by the realization that truly enormous amounts of knowledge would be required by many simple AI applications.

Sub-symbolic AI

During the 1960s, symbolic approaches had achieved great success at simulating high-level thinking in small demonstration programs. Approaches based on cybernetics or neural networks were abandoned or pushed into the background. By the 1980s, however, progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition. A number of researchers began to look into "sub-symbolic" approaches to specific AI problems.

Bottom-up, embodied, situated, behavior-based or nouvelle AI
Researchers from the related field of robotics, such as Rodney Brooks, rejected symbolic AI and focussed on the basic engineering problems that would allow robots to move and survive. Their work revived the non-symbolic viewpoint of the early cybernetics researchers of the 50s and reintroduced the use of control theory in AI. These approaches are also conceptually related to the embodied mind thesis.
Computational Intelligence
Interest in neural networks and "connectionism" was revived by David Rumelhart and others in the middle 1980s. These and other sub-symbolic approaches, such as fuzzy systems and evolutionary computation, are now studied collectively by the emerging discipline of computational intelligence.
Formalisation
In the 1990s, AI researchers developed sophisticated mathematical tools to solve specific subproblems. These tools are truly scientific, in the sense that their results are both measurable and verifiable, and they have been responsible for many of AI's recent successes. The shared mathematical language has also permitted a high level of collaboration with more established fields (like mathematics, economics or operations research). Russell & Norvig (2003) describe this movement as nothing less than a "revolution" and "the victory of the neats."

Intelligent agent paradigm

The "intelligent agent" paradigm became widely accepted during the 1990s. An intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. The simplest intelligent agents are programs that solve specific problems. The most complicated intelligent agents are rational, thinking human beings. The paradigm gives researchers license to study isolated problems and find solutions that are both verifiable and useful, without agreeing on one single approach. An agent that solves a specific problem can use any approach that works: some agents are symbolic and logical, some are sub-symbolic neural networks and others may use new approaches. The paradigm also gives researchers a common language to communicate with other fields, such as decision theory and economics, that also use concepts of abstract agents.
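
The perceive-then-act cycle at the heart of the agent paradigm can be shown with about the simplest agent imaginable, a thermostat. The class name, the toy environment and all of the numbers below are assumptions made for this sketch.

```python
class ThermostatAgent:
    """A simple reflex agent: it perceives a temperature and picks an action."""

    def __init__(self, setpoint):
        self.setpoint = setpoint

    def act(self, temperature):
        return "heat_on" if temperature < self.setpoint else "heat_off"


def run(agent, temperature, steps):
    """Toy environment: heating adds 1 degree per step; otherwise the room loses 0.5."""
    history = []
    for _ in range(steps):
        action = agent.act(temperature)  # the agent perceives and chooses an action
        temperature += 1.0 if action == "heat_on" else -0.5  # the action changes the world
        history.append(temperature)
    return history


history = run(ThermostatAgent(setpoint=20.0), temperature=17.0, steps=10)
print(history)
```

Nothing in `run` depends on how the agent reaches its decision, which is the point of the paradigm: a symbolic planner or a neural network could be dropped in behind the same `act` interface.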

Integrating the approaches

An agent architecture or cognitive architecture allows researchers to build more versatile and intelligent systems out of interacting intelligent agents in a multi-agent system. A system with both symbolic and sub-symbolic components is a hybrid intelligent system, and the study of such systems is artificial intelligence systems integration. A hierarchical control system provides a bridge between sub-symbolic AI at its lowest, reactive levels and traditional symbolic AI at its highest levels, where relaxed time constraints permit planning and world modelling. Rodney Brooks' subsumption architecture was an early proposal for such a hierarchical system.


Tools of AI research

In the course of 50 years of research, AI has developed a large number of tools to solve the most difficult problems in computer science. A few of the most general of these methods are discussed below.

Search and optimization

Many problems in AI can be solved in theory by intelligently searching through many possible solutions: Reasoning can be reduced to performing a search. For example, logical proof can be viewed as searching for a path that leads from premises to conclusions, where each step is the application of an inference rule. Planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis. Robotics algorithms for moving limbs and grasping objects use local searches in configuration space. Many learning algorithms use search algorithms based on optimization.
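
To make the idea of "reasoning as search" concrete, here is a minimal breadth-first search in which each step applies one rule to the current state, much as each step of a proof applies one inference rule. The number puzzle, the two rules and the function name are invented for this sketch.

```python
from collections import deque


def shortest_derivation(start, target, rules):
    """Breadth-first search for the shortest sequence of rule applications
    that turns `start` into `target`. Returns a list of rule names, or None."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == target:
            return path
        for name, rule in rules.items():
            nxt = rule(state)
            # Both toy rules only make the number larger, so any state
            # beyond the target is a dead end and can be skipped.
            if nxt <= target and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None


RULES = {"add 1": lambda n: n + 1, "double": lambda n: n * 2}

steps = shortest_derivation(1, 10, RULES)
print(steps)
```

Because breadth-first search explores all one-step derivations before any two-step derivation, and so on, the first path it finds to the target is guaranteed to be among the shortest.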

Simple exhaustive searches are rarely sufficient for most real world problems: the search space (the number of places to search) quickly grows to astronomical numbers. The result is a search that is too slow or never completes. The solution, for many problems, is to use "heuristics" or "rules of thumb" that eliminate choices that are unlikely to lead to the goal (called "pruning the search tree"). Heuristics supply the program with a "best guess" for what path the solution lies on.
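
One widely used heuristic search is A*, sketched below on a small grid. Note that A* uses its heuristic to decide which choice to explore first rather than to discard choices outright, so it is an example of heuristic guidance rather than of pruning in the strict sense. The grid, the function name and the Manhattan-distance heuristic are assumptions chosen for this example.

```python
import heapq


def a_star(grid, start, goal):
    """Length of the shortest 4-connected path from start to goal, or None.
    `grid` is a list of equal-length strings; '#' marks a wall."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Heuristic "best guess": Manhattan distance to the goal, ignoring walls.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (cost so far + guess, cost so far, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale entry: a cheaper route to this cell was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None


GRID = [
    "....",
    ".##.",
    "....",
]
print(a_star(GRID, (0, 0), (2, 3)))
```

The heuristic never overestimates the remaining distance on this kind of grid, which is what lets A* favour promising cells while still returning a shortest path.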

A very different kind of search came to prominence in the 1990s, based on the mathematical theory of optimization. For many problems, it is possible to begin the search with some form of a guess and then refine the guess incrementally until no more refinements can be made. These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top. Other optimization algorithms are simulated annealing, beam search and random optimization.
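The hill-climbing idea can be sketched in a few lines. This is a minimal one-dimensional version under stated assumptions: the guess is a single number, the neighbours are one fixed step to either side, and the names hill_climb, score and step are chosen for this example only.

```python
def hill_climb(score, start, step=0.01, max_iters=100_000):
    """Move a one-dimensional guess uphill until neither neighbour scores higher."""
    current = start
    for _ in range(max_iters):
        # Look one step to the left and one step to the right.
        best = max((current - step, current + step), key=score)
        if score(best) <= score(current):
            break  # no neighbouring guess is better: a (possibly local) peak
        current = best
    return current


# Example landscape with a single peak at x = 3.
peak = hill_climb(lambda x: -(x - 3) ** 2, start=0.0)
```

On a landscape with several peaks this procedure stops at whichever peak is nearest the starting point, which is why methods such as simulated annealing add occasional downhill moves.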

Evolutionary computation uses a form of optimization search. For example, an evolutionary algorithm may begin with a population of organisms (the guesses) and then allow them to mutate and recombine, selecting only the fittest to survive each generation (refining the guesses). Forms of evolutionary computation include swarm intelligence algorithms (such as ant colony or particle swarm optimization) and evolutionary algorithms (such as genetic algorithms and genetic programming).
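The mutate, recombine and select loop might look like the following sketch of a simple genetic algorithm over bit lists. The specific choices here (keeping the fitter half as parents, one-point crossover, a per-bit mutation rate, and carrying the best individual forward unchanged) are assumptions for illustration, not a canonical design.

```python
import random


def evolve(population, fitness, generations=50, mutation_rate=0.05, rng=None):
    """Evolve a population of equal-length bit lists and return the fittest one.

    Illustrative sketch: needs at least two individuals of length two or more.
    """
    rng = rng or random.Random(0)
    size = len(population)
    for _ in range(generations):
        # Selection: only the fitter half become parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, size // 2)]
        children = [ranked[0]]  # elitism: the best guess always survives
        while len(children) < size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, len(mum))  # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)
```

A simple way to try it is to use sum as the fitness function on random bit lists, so that the fittest possible organism is the all-ones list.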

Logic

Logic was introduced into AI research by John McCarthy in his 1958 Advice Taker proposal. The most important technical development was J. Alan Robinson's discovery of the resolution and unification algorithm for logical deduction in 1963. This procedure is simple, complete and entirely algorithmic, and can easily be performed by digital computers. However, a naive implementation of the algorithm quickly leads to a combinatorial explosion or an infinite loop. In 1974, Robert Kowalski suggested representing logical expressions as Horn clauses (statements in the form of rules: "if p then q"), which reduced logical deduction to backward chaining or forward chaining. This greatly alleviated (but did not eliminate) the problem.
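Forward chaining over propositional Horn clauses can be sketched as repeatedly firing any rule whose premises are all known. The representation below (each rule as a pair of a premise set and a single conclusion) and the name forward_chain are assumptions made for this example.

```python
def forward_chain(facts, rules):
    """Derive every atom that follows from `facts` under propositional Horn rules.

    rules: iterable of (premises, conclusion), where premises is a set of atoms.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule "if premises then conclusion" once all premises hold.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


rules = [({"p"}, "q"), ({"q", "r"}, "s"), ({"t"}, "u")]
derived = forward_chain({"p", "r"}, rules)  # "u" is not derived: "t" is unknown
```

Each pass either adds a new atom or stops, so the loop always terminates for a finite rule set.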

Logic is used for knowledge representation and problem solving, but it can be applied to other problems as well. For example, the satplan algorithm uses logic for planning, and inductive logic programming is a method for learning. There are several different forms of logic used in AI research.

  • Propositional or sentential logic is the logic of statements which can be true or false.
  • First-order logic also allows the use of quantifiers and predicates, and can express facts about objects, their properties, and their relations with each other.
  • Fuzzy logic is a version of first-order logic which allows the truth of a statement to be represented as a value between 0 and 1, rather than simply True (1) or False (0). Fuzzy systems can be used for uncertain reasoning and have been widely used in modern industrial and consumer product control systems.
  • Default logics, non-monotonic logics and circumscription are forms of logic designed to help with default reasoning and the qualification problem.
  • Several extensions of logic have been designed to handle specific domains of knowledge, such as: description logics; situation calculus, event calculus and fluent calculus (for representing events and time); causal calculus; belief calculus; and modal logics.
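To make the fuzzy-logic entry in the list above concrete, here is a small sketch using the common min/max/complement operators on truth values between 0 and 1. Other operator families exist, and the membership function warm, with its 15 and 25 degree thresholds, is a made-up example.

```python
def fuzzy_and(a, b):
    """Conjunction of two truth values in [0, 1] (minimum operator)."""
    return min(a, b)


def fuzzy_or(a, b):
    """Disjunction of two truth values in [0, 1] (maximum operator)."""
    return max(a, b)


def fuzzy_not(a):
    """Negation of a truth value in [0, 1]."""
    return 1.0 - a


def warm(temp_c):
    """Hypothetical membership function: 0 at or below 15 C, 1 at or above 25 C."""
    return min(1.0, max(0.0, (temp_c - 15.0) / 10.0))


# "The room is warm AND the window is (mostly) closed", with closed = 0.8.
truth = fuzzy_and(warm(20.0), 0.8)
```

A fuzzy controller would typically combine several such rule strengths and convert the result back into a concrete output, a step that is omitted here.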

Probabilistic methods for uncertain reasoning

Many problems in AI (in reasoning, planning, learning, perception and robotics) require the agent to operate with incomplete or uncertain information. Starting in the late 1980s and early 1990s, Judea Pearl and others championed the use of methods drawn from probability theory and economics to devise a number of powerful tools to solve these problems.

Bayesian networks are a very general tool that can be used for a large number of problems: reasoning (using the Bayesian inference algorithm), learning (using the expectation-maximization algorithm), planning (using decision networks) and perception (using dynamic Bayesian networks).
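The smallest possible case of Bayesian inference is a two-node network, a hidden cause and one piece of evidence, where inference reduces to Bayes' rule. The sketch below works through that case only. The function name and the numbers in the usage line are illustrative assumptions, not data from any study.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(cause | evidence) for a binary cause and one positive observation."""
    # Total probability of seeing the evidence at all.
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / evidence


# Hypothetical numbers: 1% prior, 90% detection rate, 5% false alarms.
p = posterior(prior=0.01, true_positive_rate=0.9, false_positive_rate=0.05)
```

With these assumed numbers the posterior is roughly 0.15: a positive observation raises the probability well above the 1% prior, yet the cause remains unlikely because false alarms are far more common than true detections.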

Probabilistic algorithms can also be used for filtering, prediction, smoothing and finding explanations for streams of data, helping perception systems to analyze processes that occur over time (e.g., hidden Markov models and Kalman filters).
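Filtering with a hidden Markov model can be sketched as a repeated predict and update cycle. The example below uses the familiar textbook setting of guessing whether it is raining from whether someone carries an umbrella. The probabilities and the name hmm_filter are assumptions for illustration.

```python
def hmm_filter(prior, transition, emission, observations):
    """Return the belief over hidden states after a sequence of observations.

    prior:      {state: probability}
    transition: {state: {next_state: probability}}
    emission:   {state: {observation: probability}}
    """
    belief = dict(prior)
    for obs in observations:
        # Predict: push the current belief through the transition model.
        predicted = {
            s: sum(belief[p] * transition[p][s] for p in belief) for s in belief
        }
        # Update: weight each state by how well it explains the observation.
        unnormalised = {s: predicted[s] * emission[s][obs] for s in belief}
        total = sum(unnormalised.values())
        belief = {s: v / total for s, v in unnormalised.items()}
    return belief


prior = {"rain": 0.5, "dry": 0.5}
transition = {"rain": {"rain": 0.7, "dry": 0.3}, "dry": {"rain": 0.3, "dry": 0.7}}
emission = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "dry": {"umbrella": 0.2, "none": 0.8},
}
belief = hmm_filter(prior, transition, emission, ["umbrella", "umbrella"])
```

After two umbrella sightings the belief in rain is about 0.88. A Kalman filter follows the same predict and update pattern with Gaussian beliefs over continuous states.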

A key concept from the science of economics is "utility": a measure of how valuable something is to an intelligent agent. Precise mathematical tools have been developed that analyze how an agent can make choices and plan, using decision theory, decision analysis and information value theory. These tools include models such as Markov decision processes, dynamic decision networks, game theory and mechanism design.
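In its simplest form, decision theory says to choose the action with the highest expected utility. The sketch below shows that single step. The action names, probabilities and utility values are invented for the example.

```python
def expected_utility(outcomes):
    """Sum of probability-weighted utilities for one action."""
    return sum(probability * utility for probability, utility in outcomes)


def best_action(actions):
    """Pick the action whose outcomes have the highest expected utility.

    actions: {name: [(probability, utility), ...]}
    """
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical choice with a 30% chance of rain.
choices = {
    "take_umbrella": [(0.3, 8), (0.7, 6)],    # dry either way, but encumbered
    "leave_umbrella": [(0.3, 0), (0.7, 10)],  # soaked if it rains
}
decision = best_action(choices)
```

Under these assumed utilities leaving the umbrella wins narrowly (about 7.0 against 6.6). Markov decision processes extend the same calculation to sequences of actions.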

Classifiers and statistical learning methods

The simplest AI applications can be divided into two types: classifiers ("if shiny then diamond") and controllers ("if shiny then pick up"). Controllers do however also classify conditions before inferring actions, and therefore classification forms a central part of many AI systems.

Classifiers are functions that use pattern matching to determine a closest match. They can be tuned according to examples, making them very attractive for use in AI. These examples are known as observations or patterns. In supervised learning, each pattern belongs to a certain predefined class. A class can be seen as a decision that has to be made. All the observations combined with their class labels are known as a data set.

When a new observation is received, that observation is classified based on previous experience. A classifier can be trained in various ways; there are many statistical and machine learning approaches.
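One of the simplest ways to classify a new observation from previous experience is the k-nearest neighbor rule: find the k stored observations closest to the new one and take a majority vote of their class labels. The sketch below assumes numeric feature tuples and Euclidean distance. The name knn_classify and the toy data set are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(data_set, observation, k=3):
    """Label `observation` by majority vote among its k nearest neighbours.

    data_set: list of (features, label) pairs, with features as numeric tuples.
    """
    nearest = sorted(data_set, key=lambda item: math.dist(item[0], observation))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


data_set = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b"),
]
label = knn_classify(data_set, (2, 2))
```

There is no separate training step here: the data set itself is the model, which makes the method easy to understand but slow on large data sets.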

A wide range of classifiers are available, each with its strengths and weaknesses. Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems; this is also referred to as the "no free lunch" theorem. Various empirical tests have been performed to compare classifier performance and to find the characteristics of data that determine classifier performance. Determining a suitable classifier for a given problem is, however, still more an art than a science.

The most widely used classifiers are the neural network, kernel methods such as the support vector machine, the k-nearest neighbor algorithm, the Gaussian mixture model, the naive Bayes classifier, and the decision tree. The performance of these classifiers has been compared over a wide range of classification tasks in order to find data characteristics that determine classifier performance.

Neural networks

A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.

The study of artificial neural networks began in the decade before the field of AI research was founded. In the late 1950s Frank Rosenblatt developed an important early version, the perceptron. Paul Werbos developed the backpropagation algorithm for multilayer perceptrons in 1974, which led to a renaissance in neural network research and connectionism in general in the middle 1980s. The Hopfield net, a form of attractor network, was first described by John Hopfield in 1982.

Common network architectures which have been developed include the feedforward neural network, the radial basis network, the Kohonen self-organizing map and various recurrent neural networks. Neural networks are applied to the problem of learning, using such techniques as Hebbian learning, competitive learning and the relatively new architectures of Hierarchical Temporal Memory and Deep Belief Networks.
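As a minimal example of a neural network learning from examples, the sketch below trains a single perceptron with the classic error-correction rule. It is far simpler than the architectures listed above, and the names train_perceptron and predict, the learning rate and the choice of the logical AND task are assumptions for illustration.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, rate=1.0):
    """Learn weights for a single perceptron from (inputs, target) pairs."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why tasks such as exclusive-or need the multilayer networks that backpropagation made trainable.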

Control theory

Control theory, the grandchild of cybernetics, has many important applications, especially in robotics.

Specialized languages

AI researchers have developed several specialized languages for AI research:

  • IPL includes features intended to support programs that could perform general problem solving, including lists, associations, schemas (frames), dynamic memory allocation, data types, recursion, associative retrieval, functions as arguments, generators (streams), and cooperative multitasking.
  • Lisp is a practical mathematical notation for computer programs based on lambda calculus. Linked lists are one of Lisp languages' major data structures, and Lisp source code is itself made up of lists. As a result, Lisp programs can manipulate source code as a data structure, giving rise to the macro systems that allow programmers to create new syntax or even new domain-specific programming languages embedded in Lisp. There are many dialects of Lisp in use today.
  • Prolog is a declarative language where programs are expressed in terms of relations, and execution occurs by running queries over these relations. Prolog is particularly useful for symbolic reasoning, database and language parsing applications. Prolog is widely used in AI today.
  • STRIPS is a language for expressing automated planning problem instances. It expresses an initial state, the goal states, and a set of actions. For each action, preconditions (what must be established before the action is performed) and postconditions (what is established after the action is performed) are specified.
  • Planner is a hybrid between procedural and logical languages. It gives a procedural interpretation to logical sentences where implications are interpreted with pattern-directed inference.
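The STRIPS entry in the list above describes actions by their preconditions and postconditions. The sketch below models that idea in Python rather than in STRIPS syntax: a state is a set of facts, postconditions are split into facts added and facts deleted, and a simple breadth-first search stands in for a real planner. The class name Action, the function plan and the door example are all assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A STRIPS-style action over states represented as sets of facts."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset = frozenset()

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.delete_effects) | self.add_effects


def plan(initial, goal, actions):
    """Breadth-first search for a shortest action sequence reaching `goal`."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for action in actions:
            if action.applicable(state):
                nxt = action.apply(state)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [action.name]))
    return None  # no sequence of actions reaches the goal


actions = [
    Action("open_door", frozenset({"at_door", "door_closed"}),
           frozenset({"door_open"}), frozenset({"door_closed"})),
    Action("walk_through", frozenset({"at_door", "door_open"}),
           frozenset({"inside"}), frozenset({"at_door"})),
]
steps = plan({"at_door", "door_closed"}, frozenset({"inside"}), actions)
```

Here the planner has to open the door before walking through it, because the second action's precondition is only established by the first.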

AI applications are also often written in standard languages like C++ and languages designed for mathematics, such as Matlab and Lush.

Evaluating artificial intelligence

How can one determine if an agent is intelligent? In 1950, Alan Turing proposed a general procedure to test the intelligence of an agent now known as the Turing test. This procedure allows almost all the major problems of artificial intelligence to be tested. However, it is a very difficult challenge and at present all agents fail.

Artificial intelligence can also be evaluated on specific problems such as small problems in chemistry, hand-writing recognition and game-playing. Such tests have been termed subject matter expert Turing tests. Smaller problems provide more achievable goals and there are an ever-increasing number of positive results.

The broad classes of outcome for an AI test are:

  • optimal: it is not possible to perform better
  • strong super-human: performs better than all humans
  • super-human: performs better than most humans
  • sub-human: performs worse than most humans

For example, performance at checkers (draughts) is optimal, performance at chess is super-human and nearing strong super-human, and performance at many everyday tasks performed by humans is sub-human.

Competitions and prizes

There are a number of competitions and prizes to promote research in artificial intelligence. The main areas promoted are: general machine intelligence, conversational behaviour, data-mining, driverless cars, robot soccer and games.


Applications of artificial intelligence

Artificial intelligence has been used successfully in a wide range of fields including medical diagnosis, stock trading, robot control, law, scientific discovery, video games and toys. Frequently, when a technique reaches mainstream use it is no longer considered artificial intelligence, a phenomenon sometimes described as the AI effect. It may also become integrated into artificial life.