Torbjørn Semb Dahl

Research

Research Statement

Understanding the structural and interaction-related principles behind the emergence of intelligent behaviour is the greatest scientific challenge of the present time. Studying these principles in robots is advantageous for two reasons. First, while robots face all the complexities of real-world interaction, they are easier to study than animals, including humans. Animals provide working examples but, for practical and ethical reasons, we cannot easily manipulate their control systems for experimental purposes. Robots allow us more freedom to explore different principles for autonomous control. Second, by reproducing intelligent behaviour in robots we can automate tasks that are undesirable or dangerous for humans.

One of the most baffling aspects of intelligent behaviour in animals, including humans, is its pervasive adaptability. The behaviour of an animal or a human continues to adapt throughout its lifetime. Brain areas can be reprogrammed to provide new functionality in the case of serious injury such as the loss of a limb. In spite of this adaptability, animal and human behaviour is coherent, focused and stable. My interest is in developing algorithms for robot behaviours that have similar properties.

My goal is to develop a successful, internationally recognised research programme in intelligent robotics and related fields. Since my period as a post-doc I have not been in an environment conducive to academic research and, as a result, I have had limited success in securing funding, supervising research students and doing research in general. My immediate aim is to get into a position that allows me to develop as a researcher as well as a teacher and manager. Given such an opportunity, I will make every effort necessary to accelerate my research career.

Current Research

My current research focuses on three areas of intelligent robotics. The first attempts to increase the capabilities of existing reinforcement learning (RL) algorithms to a level where they can be used by robots to learn complex time-extended behaviours from human teachers. When using high-level learning methods such as repetition of guided behaviour and imitation from observation, teachers do not have to be computer programmers. This will significantly reduce the cost of developing the desired robot behaviours. The goal is to develop algorithms that can do reward estimation and state identification using a learned hierarchy of state-action sequences represented as self-organising maps. Such a hierarchy is a significantly more efficient form of memory than traditional traces.

The second area of effort is the development and application of a behaviour development methodology. This methodology is inspired by evolution and has been shown to produce novel solutions with a distinct biological flavour. In applying this methodology to increasingly complex problem domains, e.g., multi-robot control, we aim to identify new and better ways of structuring robot behaviours and also to further develop the methodology itself.

The last area is the implementation of robot behaviours with a ‘neural’ interface. Such behaviours can be seamlessly combined with neural network modules to produce increasingly complex learning capabilities. The ornithologist Bruce Moore has presented evidence that different types of learning, as found in birds, form a strict hierarchy. This suggests that complex types of learning can only take place on top of simpler types. This work models this hierarchy on robots in order to gain new insights into the structure of adaptive behaviours and learning, as well as their integration with pre-programmed behaviours.

Research Experience

  • At the Norwegian Defence Research Institute I led a team of scientists researching models of humans for tactical and strategic simulations. The work included game-theoretic solutions, reinforcement learning, neural networks and rule-based systems.
  • As a post-doc I researched distributed reinforcement learning in cooperative multi-robot systems. I demonstrated how specialised behaviours would reliably appear in homogeneous groups of adaptive robots and used this phenomenon to develop an adaptive, distributed resource allocation algorithm based on vacancy chains, a resource distribution mechanism found in human and animal societies.
  • My PhD research was on the development of adaptive robot behaviours with inspiration from evolution. In particular I looked at the incremental development of complex adaptive strategies using self-contained intermediate solutions. I showed how such an approach produced unconventional solutions that were more robust than existing monolithic solutions.
  • As a research assistant I developed a proof engine that added the capability of considering background domain knowledge to the knowledge discovery (data mining) tool Primus.
  • For my MSc project I applied a prototype proof procedure to the problem of controlling a multi-elevator system using logic programming.
  • As a research student at Hewlett-Packard I developed an agent-based task allocation system for an IT help desk. I also looked at the evolution of online market mechanisms in the context of minimal-intelligence traders.

Research Supervision

PhD Completions

Ongoing PhD Supervision

PhD Examinations

  • External examiner, Qiming Shen, University of Hertfordshire, 2010
  • External examiner, Fei Chao, Aberystwyth University, 2009