Abstract
Rapid urbanization places higher demands on the refined management of cities. Fine-scale population data play an increasingly important role in improving urban population management and in allocating resources scientifically. Current population estimation studies mainly work at low spatial resolutions, such as the city and county scales, and do not consider differences in population distribution within cities. This paper mines and defines the spatial correlations of multi-source data, including urban building data, point of interest (POI) data, census data, and administrative division data. Because populations are mainly distributed in residential buildings, a population estimation model at the subdistrict scale is established based on building classifications. POI data consist of spatial information and attribute information and are spaced irregularly. Based on this characteristic, the text classification method term frequency-inverse document frequency (TF-IDF) is applied to obtain functional classifications of buildings. Residential buildings are then screened out, and characteristic variables are quantified for each subdistrict, including the perimeter, area, and total number of floors of its residential buildings. To assess the validity of the variables, the random forest method is selected for variable screening and correlation analysis, because this method has clear advantages when dealing with unbalanced data. Under the assumption of linearity, multiple regression analysis is conducted to obtain a linear model relating the number of buildings and their geometric characteristics to the population in each administrative division. Experiments show that the urban fine-scale population estimation model established in this study can estimate the population at the subdistrict scale with high accuracy. This method improves the precision and automation of urban population estimation.
It allows the accurate estimation of the population at a subdistrict scale, thereby providing important data to support the overall planning of regional energy resource allocation, economic development, social governance, and environmental protection.
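As an illustration of the TF-IDF step only, the following minimal sketch treats each building as a "document" whose "terms" are the categories of the POIs associated with it, and labels the building with its highest-weighted category. The building identifiers, the category names, the assignment of POIs to buildings, and the highest-weight labelling rule are all hypothetical assumptions made for this example; they are not taken from the study's data or its exact procedure.

```python
import math
from collections import Counter


def tfidf_by_building(pois_by_building):
    """Return {building_id: {poi_category: TF-IDF weight}}.

    TF  = share of a building's POIs that fall in the category.
    IDF = log(number of buildings / number of buildings containing the category).
    """
    n_buildings = len(pois_by_building)

    # Document frequency: in how many buildings does each category occur?
    doc_freq = Counter()
    for categories in pois_by_building.values():
        doc_freq.update(set(categories))

    weights = {}
    for building_id, categories in pois_by_building.items():
        counts = Counter(categories)
        total = sum(counts.values())
        weights[building_id] = {
            cat: (n / total) * math.log(n_buildings / doc_freq[cat])
            for cat, n in counts.items()
        }
    return weights


def dominant_function(weights):
    """Label each building with its highest-weighted category (None if it has no POIs)."""
    return {
        building_id: (max(w, key=w.get) if w else None)
        for building_id, w in weights.items()
    }


# Hypothetical toy input: POI categories already assigned to buildings.
pois = {
    "b1": ["residential", "residential", "residential", "retail"],
    "b2": ["office", "office", "retail"],
    "b3": ["school", "retail"],
}

labels = dominant_function(tfidf_by_building(pois))

# Keep only buildings labelled residential, as input to the later regression step.
residential_ids = [b for b, label in labels.items() if label == "residential"]
```

In this toy input the "retail" category occurs in every building, so its IDF is zero and it never decides a label. This is the property that makes TF-IDF useful for separating distinctive building functions from ubiquitous ones.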