
Do Large Language Models Show Human-like Biases? Exploring Confidence–Competence Gap in AI

Aniket Kumar Singh    
Bishal Lamichhane    
Suman Devkota    
Uttam Dhakal and Chandra Dhakal    

Abstract

This study investigates self-assessment tendencies in Large Language Models (LLMs), examining whether the patterns resemble human cognitive biases such as the Dunning–Kruger effect. LLMs, including GPT, BARD, Claude, and LLaMA, are evaluated using confidence scores on reasoning tasks. The models provide self-assessed confidence levels before and after responding to different questions. The results show cases where high confidence does not correlate with correctness, suggesting overconfidence. Conversely, low confidence despite accurate responses indicates potential underestimation. Confidence scores vary across problem categories and difficulty levels, with lower confidence reported for more complex queries. GPT-4 displays consistent confidence, while LLaMA and Claude demonstrate more variation. Some of these patterns resemble the Dunning–Kruger effect, where incompetence leads to inflated self-evaluations. While not conclusively evident, these observations parallel this phenomenon and provide a foundation for further exploring the alignment of competence and confidence in LLMs. As LLMs continue to expand their societal roles, further research into their self-assessment mechanisms is warranted to fully understand their capabilities and limitations.
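The abstract does not give implementation details, but the pre- and post-response confidence elicitation it describes could be scripted roughly as in the Python sketch below. Here `query_model` is a hypothetical placeholder for a call to whichever LLM is under evaluation, and the prompt wording, the 0–100 scale, and the substring-based correctness check are illustrative assumptions rather than the authors' exact protocol.

```python
import re

def query_model(prompt: str) -> str:
    """Placeholder for a call to the LLM under evaluation (GPT, BARD, Claude, LLaMA, ...)."""
    raise NotImplementedError

def parse_score(reply: str) -> float:
    """Pull the first number out of the model's reply and clamp it to [0, 100]."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    return min(max(float(match.group()), 0.0), 100.0) if match else 0.0

def evaluate_question(question: str, reference_answer: str) -> dict:
    # Pre-answer self-assessment.
    pre = parse_score(query_model(
        "On a scale of 0-100, how confident are you that you can answer the "
        f"following question correctly? Reply with a number only.\n\n{question}"))
    # The model's actual answer.
    answer = query_model(question)
    # Post-answer self-assessment.
    post = parse_score(query_model(
        f"Question: {question}\nYour answer: {answer}\n\nOn a scale of 0-100, "
        "how confident are you that this answer is correct? Reply with a number only."))
    # Crude correctness check; the study's actual grading may differ.
    correct = reference_answer.strip().lower() in answer.lower()
    return {"pre_confidence": pre, "post_confidence": post, "correct": correct}

# High confidence paired with an incorrect answer signals overconfidence;
# low confidence paired with a correct answer signals underestimation.
```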

Similar articles

Vincenzo Manca    
This paper presents an agile method of logical semantics based on high-order Predicate Logic. An operator of predicate abstraction is introduced that provides a simple mechanism for logical aggregation of predicates and for logical typing. Monadic high-o...
Journal: Information

 
Andrei Paraschiv, Teodora Andreea Ion and Mihai Dascalu    
The advent of online platforms and services has revolutionized communication, enabling users to share opinions and ideas seamlessly. However, this convenience has also brought about a surge in offensive and harmful language across various communication m...
Journal: Information

 
Rongsheng Li, Jin Xu, Zhixiong Cao, Hai-Tao Zheng and Hong-Gee Kim    
In the realm of large language models (LLMs), extending the context window for long text processing is crucial for enhancing performance. This paper introduces SBA-RoPE (Segmented Base Adjustment for Rotary Position Embeddings), a novel approach designed...
Journal: Applied Sciences

 
Jiahao Fan and Weijun Pan    
In recent years, automatic speech recognition (ASR) technology has improved significantly. However, the training process for an ASR model is complex, involving large amounts of data and a large number of algorithms. The task of training a new model for a...
Journal: Aerospace

 
Xin Tian and Yuan Meng    
The judicious configuration of predicates is a crucial but often overlooked aspect in the field of knowledge graphs. While previous research has primarily focused on the precision of triples in assessing knowledge graph quality, the rationality of predic...
Journal: Algorithms