Applied Sciences, Vol. 14, No. 7 (2024)

Extending Context Window in Large Language Models with Segmented Base Adjustment for Rotary Position Embeddings

Rongsheng Li, Jin Xu, Zhixiong Cao, Hai-Tao Zheng and Hong-Gee Kim

Abstract

In the realm of large language models (LLMs), extending the context window for long text processing is crucial for enhancing performance. This paper introduces SBA-RoPE (Segmented Base Adjustment for Rotary Position Embeddings), a novel approach designed to efficiently extend the context window by segmentally adjusting the base of rotary position embeddings (RoPE). Unlike existing methods, such as Position Interpolation (PI), NTK, and YaRN, SBA-RoPE modifies the base of RoPE across different dimensions, optimizing the encoding of positional information for extended sequences. Through experiments on the Pythia model, we demonstrate the effectiveness of SBA-RoPE in extending context windows, particularly for texts exceeding the original training lengths. We fine-tuned the Pythia-2.8B model on the PG-19 dataset and conducted passkey retrieval and perplexity (PPL) experiments on the Proof-pile dataset to evaluate model performance. Results show that SBA-RoPE maintains or improves model performance when extending the context window, especially on longer text sequences. Compared to other methods, SBA-RoPE exhibits superior or comparable performance across various lengths and tasks, highlighting its potential as an effective technique for context window extension in LLMs.
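
The abstract does not state how the dimensions are segmented or which base values are used, so the following is only a minimal sketch of the general idea of giving different groups of RoPE dimensions different bases. The function names, the equal-width contiguous segments, and the example bases are all assumptions made for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
import math


def rope_inv_freq(dim, base=10000.0):
    """Standard RoPE: one inverse frequency per pair of dimensions."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def segmented_inv_freq(dim, segment_bases):
    """Hypothetical segmented variant (illustrative, not the paper's rule).

    The dim // 2 frequency pairs are split into equal contiguous segments,
    and each segment uses its own base from `segment_bases`.
    """
    half = dim // 2
    n_seg = len(segment_bases)
    freqs = []
    for i in range(half):
        seg = min(i * n_seg // half, n_seg - 1)  # segment index of pair i
        freqs.append(segment_bases[seg] ** (-2.0 * i / dim))
    return freqs


def rotate(vec, position, inv_freq):
    """Apply the RoPE rotation to adjacent pairs (x0, x1), (x2, x3), ..."""
    out = []
    for i, f in enumerate(inv_freq):
        angle = position * f
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


if __name__ == "__main__":
    # Assumed example values: keep the usual base for the first half of the
    # pairs and enlarge it for the second half, which lowers those frequencies.
    freqs = segmented_inv_freq(8, [10000.0, 40000.0])
    print(freqs)
    print(rotate([1.0, 0.0] * 4, position=5, inv_freq=freqs))
```

In this sketch, passing the same base for every segment reproduces standard RoPE, and raising the base of a segment lowers its rotation frequencies, so those dimensions wrap around more slowly over long sequences.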
