ARTICLE
TITLE

Applicability Analysis and Ensemble Application of BERT with TF-IDF, TextRank, MMR, and LDA for Topic Classification Based on Flood-Related VGI

Wenying Du    
Chang Ge    
Shuang Yao    
Nengcheng Chen and Lei Xu    

Abstract

Volunteered geographic information (VGI) plays an increasingly crucial role in responding to flash floods. However, topic classification and spatiotemporal analysis are complicated by the varied expressions and lengths of social media text. This paper analyzed the applicability of bidirectional encoder representation from transformers (BERT) and four traditional methods: TextRank, term frequency–inverse document frequency (TF-IDF), maximal marginal relevance (MMR), and linear discriminant analysis (LDA). The results show that, by user type, BERT performs best on the Government Affairs Microblog, whereas LDA-BERT performs best on the We Media Microblog. By text length, TF-IDF-BERT works better for texts shorter than 70 words or longer than 140 words, and LDA-BERT performs best for texts of 70–140 words. For the spatiotemporal evolution pattern, the study suggests that during the Henan rainstorm the textual topics followed the general pattern of “situation-tips-rescue”. Moreover, this paper detected the hotspot of “Metro Line 5” related to the Henan rainstorm and found that the topical focus shifted spatially from Zhengzhou, first to Xinxiang and then to Hebi, a marked south-to-north tendency that is consistent with the report issued by the authorities. We integrated multiple methods to improve the overall topic classification accuracy of Sina microblogs, facilitating the spatiotemporal analysis of flooding.
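The selection findings reported in the abstract can be read as a simple lookup. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the bounds 70 and 140 are assumed to be inclusive for the middle range, and the two criteria are kept separate because the abstract does not say how they should be combined.

```python
# Illustrative lookup of the best-performing method reported in the abstract.
# All names here are hypothetical; only the method labels and thresholds
# come from the abstract itself.

def best_method_by_length(n_words: int) -> str:
    """Return the method the abstract reports as best for a given text length."""
    if n_words < 0:
        raise ValueError("text length must be non-negative")
    # Assumption: 70 and 140 themselves fall in the middle range.
    if 70 <= n_words <= 140:
        return "LDA-BERT"
    # Shorter than 70 or longer than 140 words.
    return "TF-IDF-BERT"


def best_method_by_user_type(user_type: str) -> str:
    """Return the method the abstract reports as best for a given user type."""
    reported = {
        "Government Affairs Microblog": "BERT",
        "We Media Microblog": "LDA-BERT",
    }
    if user_type not in reported:
        # The abstract reports results only for the two user types above.
        raise KeyError(f"no result reported for user type: {user_type!r}")
    return reported[user_type]


if __name__ == "__main__":
    for length in (30, 100, 200):
        print(length, best_method_by_length(length))
    print(best_method_by_user_type("We Media Microblog"))
```

A caller would typically compute the text length first and use the user-type lookup only when that metadata is available; which criterion should take precedence is left open here because the abstract does not address it.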
