Multiparty Dynamics and Failure Modes for Machine Learning and Artificial Intelligence

David Manheim    

Abstract

An important challenge for safety in machine learning and artificial intelligence systems is a set of related failures involving specification gaming, reward hacking, fragility to distributional shifts, and Goodhart's or Campbell's law. This paper presents additional failure modes for interactions within multi-agent systems that are closely related. These multi-agent failure modes are more complex, more problematic, and less well understood than the single-agent case, and are also already occurring, largely unnoticed. After motivating the discussion with examples from poker-playing artificial intelligence (AI), the paper explains why these failure modes are in some senses unavoidable. Following this, the paper categorizes failure modes, provides definitions, and cites examples for each of the modes: accidental steering, coordination failures, adversarial misalignment, input spoofing and filtering, and goal co-option or direct hacking. The paper then discusses how extant literature on multi-agent AI fails to address these failure modes, and identifies work which may be useful for the mitigation of these failure modes.
