Application of reinforcement learning to the defense against attacks on communication networks

Kleanthis Malialis, Sam Devlin & Daniel Kudenko, Distributed reinforcement learning for adaptive and robust network intrusion response, Connection Science, DOI: 10.1080/09540091.2015.1031082.

Distributed denial of service (DDoS) attacks constitute a rapidly evolving threat in the current Internet. Multiagent Router Throttling is a novel approach to defend against DDoS attacks where multiple reinforcement learning agents are installed on a set of routers and learn to rate-limit or throttle traffic towards a victim server. The focus of this paper is on online learning and scalability. We propose an approach that incorporates task decomposition, team rewards and a form of reward shaping called difference rewards. One of the novel characteristics of the proposed system is that it provides a decentralised coordinated response to the DDoS problem, thus being resilient to DDoS attacks themselves. The proposed system learns remarkably fast, thus being suitable for online learning. Furthermore, its scalability is successfully demonstrated in experiments involving 1000 learning agents. We compare our approach against a baseline and a popular state-of-the-art throttling technique from the network security literature and show that the proposed approach is more effective, adaptive to sophisticated attack rate dynamics and robust to agent failures.
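
The abstract mentions difference rewards without defining them. In the general reward-shaping literature, agent i receives D_i = G(z) - G(z_-i), where G is the team reward and z_-i is the joint action with agent i's action replaced by a default. The following is a minimal Python sketch of that definition only. The traffic rates, the server capacity, the form of `toy_global_reward`, and the choice of "no throttling" as the default action are all assumptions made for illustration, not the paper's experimental setup.

```python
from typing import Callable, List, Sequence

# Hypothetical toy scenario: per-router legitimate and attack traffic rates.
LEGIT_RATES = [2.0, 0.0, 3.0]
ATTACK_RATES = [0.0, 6.0, 0.0]
SERVER_CAPACITY = 8.0


def toy_global_reward(throttles: Sequence[float]) -> float:
    """Assumed team reward G: -1 if the victim is overloaded, otherwise
    the fraction of legitimate traffic that still reaches it."""
    legit = sum(r * (1.0 - t) for r, t in zip(LEGIT_RATES, throttles))
    attack = sum(r * (1.0 - t) for r, t in zip(ATTACK_RATES, throttles))
    if legit + attack > SERVER_CAPACITY:
        return -1.0
    return legit / sum(LEGIT_RATES)


def difference_rewards(
    actions: Sequence[float],
    global_reward: Callable[[Sequence[float]], float],
    default_action: float,
) -> List[float]:
    """Return D_i = G(z) - G(z_-i) for every agent i."""
    base = global_reward(actions)
    rewards = []
    for i in range(len(actions)):
        # Counterfactual joint action: agent i falls back to the default.
        counterfactual = list(actions)
        counterfactual[i] = default_action
        rewards.append(base - global_reward(counterfactual))
    return rewards


if __name__ == "__main__":
    # Router 1 drops everything (throttle 1.0); the others do not throttle.
    print(difference_rewards([0.0, 1.0, 0.0], toy_global_reward, 0.0))
```

In this toy case only the router carrying the attack traffic changes the outcome by throttling, so it is the only agent with a non-zero difference reward. That is the credit-assignment effect the shaping is meant to provide.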

Example of both bottom-up and top-down processes that are integrated in a solution for the recognition of shapes

Ching L. Teo, Cornelia Fermüller, and Yiannis Aloimonos, A Gestaltist approach to contour-based object recognition: Combining bottom-up and top-down cues, The International Journal of Robotics Research April 2015 34: 627-652, first published on March 25, 2015, DOI: 10.1177/0278364914558493.

This paper proposes a method for detecting generic classes of objects from their representative contours that can be used by a robot with vision to find objects in cluttered environments. The approach uses a mid-level image operator to group edges into contours which likely correspond to object boundaries. This mid-level operator is used in two ways, bottom-up on simple edges and top-down incorporating object shape information, thus acting as the intermediary between low-level and high-level information. First, the mid-level operator, called the image torque, is applied to simple edges to extract likely fixation locations of objects. Using the operator’s output, a novel contour-based descriptor is created that extends the shape context descriptor to include boundary ownership information and accounts for rotation. This descriptor is then used in a multi-scale matching approach to modulate the torque operator towards the target, so it indicates its location and size. Unlike other approaches that use edges directly to guide the independent edge grouping and matching processes for recognition, both of these steps are effectively combined using the proposed method. We evaluate the performance of our approach using four diverse datasets containing a variety of object categories in clutter, occlusion and viewpoint changes. Compared with current state-of-the-art approaches, our approach is able to detect the target with fewer false alarms in most object categories. The performance is further improved when we exploit depth information available from the Kinect RGB-Depth sensor by imposing depth consistency when applying the image torque.

Robot kidnapping detection based on support vector machines

Dylan Campbell, Mark Whitty, Metric-based detection of robot kidnapping with an SVM classifier, Robotics and Autonomous Systems, Volume 69, July 2015, Pages 40-51, ISSN 0921-8890, DOI: 10.1016/j.robot.2014.08.004.

Kidnapping occurs when a robot is unaware that it has not correctly ascertained its position, potentially causing severe map deformation and reducing the robot’s functionality. This paper presents metric-based techniques for real-time kidnap detection, utilising either linear or SVM classifiers to identify all kidnapping events during the autonomous operation of a mobile robot. In contrast, existing techniques either solve specific cases of kidnapping, such as elevator motion, without addressing the general case or remove dependence on local pose estimation entirely, an inefficient and computationally expensive approach. Three metrics that measured the quality of a pose estimate were evaluated and a joint classifier was constructed by combining the most discriminative quality metric with a fourth metric that measured the discrepancy between two independent pose estimates. A multi-class Support Vector Machine classifier was also trained using all four metrics and produced better classification results than the simpler joint classifier, at the cost of requiring a larger training dataset. While metrics specific to 3D point clouds were used, the approach can be generalised to other forms of data, including visual, provided that two independent ways of estimating pose are available.

A nice SLAM approach based on hybrid Normal Distribution Transform (NDT) + occupancy grid maps, intended for long-term operation in dynamic environments

Erik Einhorn, Horst-Michael Gross, Generic NDT mapping in dynamic environments and its application for lifelong SLAM, Robotics and Autonomous Systems, Volume 69, July 2015, Pages 28-39, ISSN 0921-8890, DOI: 10.1016/j.robot.2014.08.008.

In this paper, we present a new, generic approach for Simultaneous Localization and Mapping (SLAM). First of all, we propose an abstraction of the underlying sensor data using Normal Distribution Transform (NDT) maps that are suitable for making our approach independent from the used sensor and the dimension of the generated maps. We present several modifications for the original NDT mapping to handle free-space measurements explicitly. We additionally describe a method to detect and handle dynamic objects such as moving persons. This enables the usage of the proposed approach in highly dynamic environments. In the second part of this paper we describe our graph-based SLAM approach that is designed for lifelong usage. Therefore, the memory and computational complexity is limited by pruning the pose graph in an appropriate way.
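
For readers unfamiliar with NDT maps, the basic idea is that each grid cell stores a normal distribution (mean and covariance) fitted to the sensor points that fall inside it. The sketch below shows only that basic construction in 2D, using the standard library. It does not implement the paper's free-space handling, dynamic-object detection, or pose-graph pruning, and the function name `build_ndt_map` and its parameters are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]
Gaussian = Tuple[Point, Tuple[Point, Point]]  # (mean, 2x2 covariance)


def build_ndt_map(points: Sequence[Point], cell_size: float) -> Dict[Cell, Gaussian]:
    """Fit one 2D Gaussian per grid cell from the points inside that cell."""
    buckets: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        buckets[cell].append((x, y))

    ndt: Dict[Cell, Gaussian] = {}
    for cell, pts in buckets.items():
        n = len(pts)
        if n < 2:
            continue  # a sample covariance needs at least two points
        mx = sum(p[0] for p in pts) / n
        my = sum(p[1] for p in pts) / n
        # Unbiased sample covariance (divide by n - 1).
        sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
        syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
        sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
        ndt[cell] = ((mx, my), ((sxx, sxy), (sxy, syy)))
    return ndt


if __name__ == "__main__":
    scan = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.5, 5.5)]
    for cell, (mean, cov) in build_ndt_map(scan, cell_size=2.0).items():
        print(cell, mean, cov)
```

Because every cell is summarized by a mean and a covariance regardless of where the points came from, a representation of this kind does not depend on the particular sensor, which matches the sensor independence the abstract claims.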

Study of the explanation of probability and reasoning in the human mind through mental models, probability logic and classical logic

P.N. Johnson-Laird, Sangeet S. Khemlani, Geoffrey P. Goodwin, Logic, probability, and human reasoning, Trends in Cognitive Sciences, Volume 19, Issue 4, April 2015, Pages 201-214, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.02.006.

This review addresses the long-standing puzzle of how logic and probability fit together in human reasoning. Many cognitive scientists argue that conventional logic cannot underlie deductions, because it never requires valid conclusions to be withdrawn – not even if they are false; it treats conditional assertions implausibly; and it yields many vapid, although valid, conclusions. A new paradigm of probability logic allows conclusions to be withdrawn and treats conditionals more plausibly, although it does not address the problem of vapidity. The theory of mental models solves all of these problems. It explains how people reason about probabilities and postulates that the machinery for reasoning is itself probabilistic. Recent investigations accordingly suggest a way to integrate probability and deduction.

Neurological evidence for the hierarchical organization of motor skill learning

Jörn Diedrichsen, Katja Kornysheva, Motor skill learning between selection and execution, Trends in Cognitive Sciences, Volume 19, Issue 4, April 2015, Pages 227-233, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.02.003.

Learning motor skills evolves from the effortful selection of single movement elements to their combined fast and accurate production. We review recent trends in the study of skill learning which suggest a hierarchical organization of the representations that underlie such expert performance, with premotor areas encoding short sequential movement elements (chunks) or particular component features (timing/spatial organization). This hierarchical representation allows the system to utilize elements of well-learned skills in a flexible manner. One neural correlate of skill development is the emergence of specialized neural circuits that can produce the required elements in a stable and invariant fashion. We discuss the challenges in detecting these changes with fMRI.

Interesting paper on fault tolerance applied to robotics, with a good survey of the subject

D. Crestani, K. Godary-Dejean, L. Lapierre, Enhancing fault tolerance of autonomous mobile robots, Robotics and Autonomous Systems, Volume 68, June 2015, Pages 140-155, ISSN 0921-8890, DOI: 10.1016/j.robot.2014.12.015.

Experience demonstrates that autonomous mobile robots running in the field in a dynamic environment often break down. Generally, mobile robots are not designed to efficiently manage faulty or unforeseen situations. Even if some research studies exist, there is a lack of a global approach that really integrates dependability and particularly fault tolerance into the mobile robot design.
This paper presents an approach that aims to integrate fault tolerance principles into the design of a robot real-time control architecture. A failure mode analysis is firstly conducted to identify and characterize the most relevant faults. Then the fault detection and diagnosis mechanisms are explained. Fault detection is based on dedicated software components scanning faulty behaviors. Diagnosis is based on the residual principle and signature analysis to identify faulty software or hardware components and faulty behaviors. Finally, the recovery mechanism, based on the modality principle, proposes to adapt the robot’s control loop according to the context and current operational functions of the robot.
This approach has been applied and implemented in the control architecture of a Pioneer 3DX mobile robot.

Novelty detection as a way of enhancing the learning capabilities of a robot, plus a brief but interesting survey of motivational theories and how they differ from attention

Y. Gatsoulis, T.M. McGinnity, Intrinsically motivated learning systems based on biologically-inspired novelty detection, Robotics and Autonomous Systems, Volume 68, June 2015, Pages 12-20, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.02.006.

Intrinsic motivations play an important role in human learning, particularly in the early stages of childhood development, and ideas from this research field have influenced robotic learning and adaptability. In this paper we investigate one specific type of intrinsic motivation, that of novelty detection and we discuss the reasons that make it a powerful facility for continuous learning. We formulate and present one original type of biologically inspired novelty detection architecture and implement it on a robotic system engaged in a perceptual classification task. The results of real-world robot experiments we conducted show how this original architecture conforms to behavioural observations and demonstrate its effectiveness in terms of focusing the system’s attention in areas that are potential for effective learning.

On the history of IEEE Transactions on Robotics and Automation, ICRA, and others

Sabanovic, S.; Milojevic, S.; Asaro, P.; Francisco, M., Robotics Narratives and Networks [History], Robotics & Automation Magazine, IEEE, vol.22, no.1, pp.137-146, March 2015, DOI: 10.1109/MRA.2014.2385564.

Somewhere around 1983, maybe late 1982, there was talk beginning about doing something more formal within IEEE that dealt with robotics and automation. Informally, activity was getting started through the Control Society,…also Systems, Man and Cybernetics, which obviously makes a lot of sense with the telerobotics things and a few others. But we wanted to build a more permanent home for it, so there was one of the first meetings. George Saridis chaired the meeting. I know George Bekey was there, Tony Bejczy, Lou Paul, probably another half dozen people.

Abstract data type for exchanging information in real-time systems, prioritizing access to the newest data over the oldest

Dantam, N.T.; Lofaro, D.M.; Hereid, A.; Oh, P.Y.; Ames, A.D.; Stilman, M., The Ach Library: A New Framework for Real-Time Communication, Robotics & Automation Magazine, IEEE, vol.22, no.1, pp.76-85, March 2015, DOI: 10.1109/MRA.2014.2356937.

Correct real-time software is vital for robots in safety-critical roles such as service and disaster response. These systems depend on software for locomotion, navigation, manipulation, and even seemingly innocuous tasks such as safely regulating battery voltage. A multiprocess software design increases robustness by isolating errors to a single process, allowing the rest of the system to continue operation. This approach also assists with modularity and concurrency. For real-time tasks, such as dynamic balance and force control of manipulators, it is critical to communicate the latest data sample with minimum latency. There are many communication approaches intended for both general-purpose and real-time needs [9], [13], [15], [17], [19]. Typical methods focus on reliable communication or network transparency and accept a tradeoff of increased message latency or the potential to discard newer data. By focusing instead on the specific case of real-time communication on a single host, we reduce communication latency and guarantee access to the latest sample. We present a new interprocess communication (IPC) library, Ach, which addresses this need, and discuss its application for real-time multiprocess control on three humanoid robots (Figure 1). (Ach is available at http://www.golems.org/projects/ach.html. The name Ach comes from the common abbreviation for the motor neurotransmitter Acetylcholine and the computer networking term ACK.)
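
The "newest data first" idea can be illustrated with a small in-process data structure: a bounded buffer whose writer never blocks and simply overwrites the oldest frame, while each reader can either jump straight to the latest sample or step through the frames it has not yet seen. The sketch below is a single-threaded Python illustration of that abstract data type only. It is not the Ach API; the class and method names (`LatestFirstChannel`, `Reader`, `get_latest`, `get_next`) are hypothetical, and a real IPC implementation would also need shared memory and synchronization between processes.

```python
from collections import deque
from typing import Deque, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class LatestFirstChannel(Generic[T]):
    """Bounded buffer: writes never block, the oldest frame is overwritten."""

    def __init__(self, capacity: int) -> None:
        self._frames: Deque[Tuple[int, T]] = deque(maxlen=capacity)
        self._next_seq = 0

    def put(self, sample: T) -> None:
        # deque(maxlen=...) silently drops the oldest frame when full.
        self._frames.append((self._next_seq, sample))
        self._next_seq += 1

    def latest_after(self, seq: int) -> Optional[Tuple[int, T]]:
        """Newest frame, if it is newer than sequence number `seq`."""
        if self._frames and self._frames[-1][0] > seq:
            return self._frames[-1]
        return None

    def next_after(self, seq: int) -> Optional[Tuple[int, T]]:
        """Oldest buffered frame that is newer than `seq`."""
        for frame in self._frames:
            if frame[0] > seq:
                return frame
        return None


class Reader(Generic[T]):
    """Each reader keeps its own cursor, so readers do not affect each other."""

    def __init__(self, channel: LatestFirstChannel[T]) -> None:
        self._channel = channel
        self._last_seen = -1

    def get_latest(self) -> Optional[T]:
        return self._take(self._channel.latest_after(self._last_seen))

    def get_next(self) -> Optional[T]:
        return self._take(self._channel.next_after(self._last_seen))

    def _take(self, frame: Optional[Tuple[int, T]]) -> Optional[T]:
        if frame is None:
            return None  # nothing new since the last read
        self._last_seen = frame[0]
        return frame[1]


if __name__ == "__main__":
    channel: LatestFirstChannel[float] = LatestFirstChannel(capacity=2)
    reader = Reader(channel)
    for voltage in (11.9, 12.0, 12.1):
        channel.put(voltage)
    print(reader.get_latest())  # skips the older samples
```

The design choice to note is that a slow reader loses old samples instead of delaying the writer or receiving stale data, which is the opposite of the trade-off made by a reliable FIFO queue.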