Tag Archives: Neural Networks Explanation

A good review of the state of the art in hybridizing NNs and physical knowledge

Mikel Merino-Olagüe, Xabier Iriarte, Carlos Castellano-Aldave, Aitor Plaza, Hybrid modelling and identification of mechanical systems using Physics-Enhanced Machine Learning, Engineering Applications of Artificial Intelligence, Volume 159, Part C, 2025, DOI: 10.1016/j.engappai.2025.111762.

Obtaining mathematical models for mechanical systems is a key subject in engineering. These models are essential for calculation, simulation and design tasks, and they are usually obtained either from physical principles or by fitting a black-box parametric input–output model to experimental data. However, both methodologies have limitations: physics-based models may not take some phenomena into account, and black-box models are difficult to interpret. In this work, we develop a novel methodology based on discrepancy modelling, which combines physical principles with neural networks to model mechanical systems with partially unknown or unmodelled physics. Two different mechanical systems with partially unknown dynamics are successfully modelled and the values of their physical parameters are obtained. Furthermore, the obtained models enable numerical integration for future state prediction, linearization and the possibility of varying the values of the physical parameters. The results show how a hybrid methodology provides accurate and interpretable models for mechanical systems when some physical information is missing. In essence, the presented methodology is a tool to obtain better mathematical models, which could be used for analysis, simulation and design tasks.
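To make the discrepancy-modelling idea concrete, here is a minimal, hypothetical sketch (not the authors' code): a mass-spring-damper whose known physics is written explicitly with trainable physical parameters, plus a small neural network that learns the residual, unmodelled force. All class, function and parameter names, and the toy system itself, are illustrative assumptions.

```python
# Illustrative sketch of discrepancy modelling (not the paper's implementation):
# the state derivative is a known physics term plus a neural-network residual
# that stands in for unmodelled dynamics.
import torch
import torch.nn as nn

class HybridOscillator(nn.Module):
    """Mass-spring-damper with a learned residual force (hypothetical example)."""
    def __init__(self, mass=1.0, stiffness=2.0, damping=0.1):
        super().__init__()
        # Physical parameters are trainable, so they can be identified from data.
        self.m = nn.Parameter(torch.tensor(mass))
        self.k = nn.Parameter(torch.tensor(stiffness))
        self.c = nn.Parameter(torch.tensor(damping))
        # A small NN captures whatever the physics term misses (e.g. nonlinear friction).
        self.residual = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, state):
        # state = (position, velocity); returns (velocity, acceleration).
        x, v = state[..., :1], state[..., 1:]
        physics_force = -self.k * x - self.c * v
        total_force = physics_force + self.residual(state)
        return torch.cat([v, total_force / self.m], dim=-1)

def rollout(model, state0, dt=0.01, steps=500):
    """Explicit-Euler integration for future state prediction."""
    states = [state0]
    for _ in range(steps):
        states.append(states[-1] + dt * model(states[-1]))
    return torch.stack(states)
```

Training such a model would minimise the mismatch between the rollout and measured trajectories; the fitted stiffness and damping remain interpretable physical parameters, while the residual network accounts for the unknown physics.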

Building explanations for AI plans by modifying the user’s model so that those plans become optimal within it

Sarath Sreedharan, Tathagata Chakraborti, Subbarao Kambhampati, Foundations of explanations as model reconciliation, Artificial Intelligence, Volume 301, 2021, DOI: 10.1016/j.artint.2021.103558.

Past work on plan explanations primarily involved the AI system explaining the correctness of its plan and the rationale for its decision in terms of its own model. Such soliloquy is wholly inadequate in most realistic scenarios where users have domain and task models that differ from that used by the AI system. We posit that the explanations are best studied in light of these differing models. In particular, we show how explanation can be seen as a “model reconciliation problem” (MRP), where the AI system in effect suggests changes to the user’s mental model so as to make its plan optimal with respect to that changed user model. We will study the properties of such explanations, present algorithms for automatically computing them, discuss relevant extensions to the basic framework, and evaluate the performance of the proposed algorithms both empirically and through controlled user studies.
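As a rough illustration of the model reconciliation idea (a toy sketch, not the paper's algorithms): the explainer searches for the smallest set of updates to the user's model under which the agent's plan is no worse than the alternatives the user has in mind. The cost function, the dictionary encoding of a model, and all names below are assumptions made only for this example.

```python
# Toy sketch of the model-reconciliation idea (not the paper's algorithm):
# find the fewest updates to the user's model under which the agent's plan
# is at least as good as any alternative plan the user might consider.
from itertools import combinations

def cost(plan, model):
    """Hypothetical plan-cost function under a given (user or agent) model."""
    return sum(model.get(step, float("inf")) for step in plan)

def minimal_explanation(agent_plan, alternatives, user_model, agent_model):
    """Return the smallest model update that makes agent_plan optimal for the user."""
    # Candidate updates: facts where the agent's model differs from the user's.
    diffs = [(k, v) for k, v in agent_model.items() if user_model.get(k) != v]
    for size in range(len(diffs) + 1):                 # prefer smaller explanations
        for subset in combinations(diffs, size):
            patched = {**user_model, **dict(subset)}   # user model after the update
            if all(cost(agent_plan, patched) <= cost(alt, patched)
                   for alt in alternatives):
                return dict(subset)                    # the explanation itself
    return None

# Example: the user believes action "b" is cheap, so the agent's plan ["a", "a"]
# looks suboptimal until the explanation corrects that single belief.
user_model = {"a": 1, "b": 1}
agent_model = {"a": 1, "b": 5}
print(minimal_explanation(["a", "a"], [["b"], ["a", "a"]], user_model, agent_model))
# -> {'b': 5}
```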

Generating counterfactual explanations of Deep RL decisions to identify flawed agents

Matthew L. Olson, Roli Khanna, Lawrence Neal, Fuxin Li, Weng-Keen Wong, Counterfactual state explanations for reinforcement learning agents via generative deep learning, Artificial Intelligence, Volume 295, 2021, DOI: 10.1016/j.artint.2021.103455.

Counterfactual explanations, which deal with “why not?” scenarios, can provide insightful explanations to an AI agent’s behavior [Miller [38]]. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents which operate in visual input environments like Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanations based on generative deep learning. Specifically, a counterfactual state illustrates what minimal change is needed to an Atari game image such that the agent chooses a different action. We also evaluate the effectiveness of counterfactual states on human participants who are not machine learning experts. Our first user study investigates if humans can discern if the counterfactual state explanations are produced by the actual game or produced by a generative deep learning approach. Our second user study investigates if counterfactual state explanations can help non-expert participants identify a flawed agent; we compare against a baseline approach based on a nearest neighbor explanation which uses images from the actual game. Our results indicate that counterfactual state explanations have sufficient fidelity to the actual game images to enable non-experts to more effectively identify a flawed RL agent compared to the nearest neighbor baseline and to having no explanation at all.
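A hedged sketch of the counterfactual-state idea, assuming a pretrained generative model `generator`, an agent policy `policy` that returns action logits, a latent code `z_original`, and a `target_action` index; all of these names are hypothetical. The paper trains a specific deep generative architecture for this purpose, whereas the sketch below only shows the generic latent-space optimisation that trades off switching the agent's action against staying close to the original frame.

```python
# Hedged sketch of counterfactual state generation (not the paper's exact method):
# given a pretrained generator G and an agent policy pi, find a small latent shift
# whose decoded frame makes the agent prefer a different (target) action.
import torch
import torch.nn.functional as F

def counterfactual_state(generator, policy, z_original, target_action,
                         steps=200, lr=0.05, proximity_weight=1.0):
    """Optimise a latent code so the agent switches to target_action
    while the generated frame stays close to the original one."""
    z = z_original.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([z], lr=lr)
    original_frame = generator(z_original).detach()
    for _ in range(steps):
        frame = generator(z)
        logits = policy(frame)                       # action logits, shape (batch, n_actions)
        # Push the policy toward the target action (target_action: LongTensor of indices)...
        action_loss = F.cross_entropy(logits, target_action)
        # ...while changing the frame as little as possible.
        proximity_loss = F.mse_loss(frame, original_frame)
        loss = action_loss + proximity_weight * proximity_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return generator(z).detach()                     # the counterfactual state to show the user
```

Showing this counterfactual frame next to the original lets a non-expert see what the agent is actually attending to, which is the property the user studies exploit.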