Computation, Vol. 10, Issue 11 (2022)
ARTICLE
TITLE

Robust Variable Selection and Regularization in Quantile Regression Based on Adaptive-LASSO and Adaptive E-NET

Innocent Mudhombo and Edmore Ranganai    

Abstract

Although variable selection and regularization procedures have been extensively considered in the literature for the quantile regression (QR) scenario via penalization, many such procedures fail to deal simultaneously with data aberrations in the design space, namely, high leverage points (X-space outliers) and collinearity challenges. Some high leverage points, referred to as collinearity influential observations, tend to adversely alter the eigenstructure of the design matrix by inducing or masking collinearity. Therefore, the literature recommends that the problems of collinearity and high leverage points be dealt with simultaneously. In this article, we suggest adaptive LASSO and adaptive E-NET penalized QR (QR-ALASSO and QR-AE-NET) procedures, where the weights are based on a QR estimator, as remedies. We extend this methodology to the penalized weighted QR (WQR) versions, WQR-LASSO and WQR-E-NET, that we suggested earlier. In the literature, adaptive weights are based on the RIDGE regression (RR) parameter estimator. Although the use of this estimator may be plausible at the $\ell_1$ estimator (QR at $\tau = 0.5$) for the symmetrical distribution, it may not be so at extreme quantile levels. Therefore, we use a QR-based estimator to derive adaptive weights. We carried out a comparative study of QR-LASSO, QR-E-NET, and the ones we suggest here, viz., QR-ALASSO, QR-AE-NET, and the weighted QR ALASSO-penalized and weighted QR AE-NET-penalized (WQR-ALASSO and WQR-AE-NET) procedures. The simulation study results show that QR-ALASSO, QR-AE-NET, WQR-ALASSO and WQR-AE-NET generally outperform their non-adaptive counterparts.

At predictor matrices with collinearity-inducing points under normality, QR-ALASSO and QR-AE-NET, respectively, outperform the non-adaptive procedures in the unweighted scenarios as follows: in all 16 cases (100%) with respect to correctly selected (shrunk) zero coefficients; in 88% with respect to correctly fitted models; and in 81% with respect to prediction. In the weighted penalized WQR scenarios, WQR-ALASSO and WQR-AE-NET outperform their non-adaptive versions as follows: 75% of the time with respect to both correctly fitted models and correctly shrunk zero coefficients, and 63% of the time with respect to prediction.

At predictor matrices with collinearity-masking points under normality, QR-ALASSO and QR-AE-NET, respectively, outperform the non-adaptive procedures in the unweighted scenarios as follows: in prediction, 100% and 88% of the time; with respect to correctly fitted models, in 100% and 50% (while in 50% they perform equally); and with respect to correctly shrunk zero coefficients, 100% of the time. In the weighted scenario, WQR-ALASSO and WQR-AE-NET outperform their respective non-adaptive versions as follows: with respect to prediction, both 63% of the time; with respect to correctly fitted models, 88% of the time; and with respect to correctly shrunk zero coefficients, 100% of the time.

At predictor matrices with collinearity-inducing points under the t-distribution, the QR-ALASSO and QR-AE-NET procedures outperform their respective non-adaptive procedures in the unweighted scenarios as follows: in prediction, 100% and 75% of the time; with respect to correctly fitted models, 88% of the time each; and with respect to correctly shrunk zero coefficients, 88% and 100% of the time. Additionally, comparing WQR-ALASSO and WQR-AE-NET with their unweighted versions, the former outperform the latter in all respective cases with respect to prediction, whilst there is no clear "winner" with respect to the other two measures. Overall, WQR-ALASSO generally outperforms all other models with respect to all measures.

At the predictor matrix with collinearity-masking points under the t-distribution, all adaptive versions outperformed their respective non-adaptive versions with respect to all metrics. In the unweighted scenarios, QR-ALASSO and QR-AE-NET dominate their non-adaptive versions as follows: in prediction, 63% and 75% of the time; with respect to correctly fitted models, in 100% and 38% (while in 62% they perform equally); and with respect to correctly shrunk zero coefficients, 100% of the time. In the weighted scenarios, all adaptive versions outperformed their non-adaptive versions as follows: 62% of the time in both respective cases with respect to prediction, while it is vice versa with respect to correctly fitted models and correctly shrunk zero coefficients. In the weighted scenarios, WQR-ALASSO and WQR-AE-NET dominate their respective non-adaptive versions as follows: with respect to correctly fitted models, 62% of the time, and with respect to correctly shrunk zero coefficients, 100% of the time in both cases.

At the design matrix with both collinearity and high leverage points under the heavy-tailed distributions (t-distributions with $d \in (1; 6)$ degrees of freedom), the dominance of the adaptive procedures over the non-adaptive ones is again evident. In the unweighted scenarios, QR-ALASSO and QR-AE-NET outperform their non-adaptive versions as follows: in prediction, 75% and 62% of the time; with respect to correctly fitted models, 100% and 88% of the time; and with respect to correctly shrunk zero coefficients, 100% of the time in both cases. In the weighted scenarios, WQR-ALASSO and WQR-AE-NET dominate their non-adaptive versions as follows: with respect to prediction, 100% of the time in both cases; and with respect to both correctly fitted models and correctly shrunk zero coefficients, 88% of the time in both cases.

Results from applications of the suggested procedures to real-life data sets are broadly in line with the simulation study results.
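
The adaptive-LASSO idea described in the abstract (penalty weights derived from a QR estimator rather than from a ridge estimator) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the pilot estimator is an unpenalized QR fit at the same quantile level, that the weights take the usual adaptive-LASSO form 1/|beta_j|^gamma, and that the problem is solved as a linear program. The function names `penalized_qr` and `adaptive_lasso_qr`, the guard `eps`, and the tuning values are hypothetical choices made for illustration. The adaptive E-NET variant adds a quadratic (ridge) term and would need a quadratic-programming solver, so it is not covered here, and neither is the weighted (WQR) variant.

```python
import numpy as np
from scipy.optimize import linprog


def penalized_qr(X, y, tau, lam=0.0, weights=None):
    """Minimize sum_i rho_tau(y_i - b0 - x_i'beta) + lam * sum_j w_j * |beta_j|
    by rewriting the problem as a linear program (illustrative sketch)."""
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    # Variable order: [b0+, b0-, beta+ (p), beta- (p), u+ (n), u- (n)], all >= 0.
    # u+ and u- are the positive and negative parts of the residuals.
    c = np.concatenate([[0.0, 0.0], lam * w, lam * w,
                        tau * np.ones(n), (1.0 - tau) * np.ones(n)])
    ones = np.ones((n, 1))
    # Equality constraint: b0 + X beta + u+ - u- = y.
    A_eq = np.hstack([ones, -ones, X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    z = res.x
    intercept = z[0] - z[1]
    beta = z[2:2 + p] - z[2 + p:2 + 2 * p]
    return intercept, beta


def adaptive_lasso_qr(X, y, tau, lam, gamma=1.0, eps=1e-6):
    """Adaptive-LASSO QR with weights from a pilot QR fit (assumed: unpenalized)."""
    _, beta_pilot = penalized_qr(X, y, tau)        # lam = 0 gives plain QR
    weights = 1.0 / (np.abs(beta_pilot) + eps) ** gamma
    return penalized_qr(X, y, tau, lam, weights)


# Small simulated example (assumed data; three of the five coefficients are zero).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
beta_true = np.array([3.0, 0.0, 0.0, 1.5, 0.0])
y_demo = 1.0 + X_demo @ beta_true + 0.5 * rng.normal(size=200)
b0_hat, beta_hat = adaptive_lasso_qr(X_demo, y_demo, tau=0.5, lam=10.0)
print(b0_hat, np.round(beta_hat, 3))
```

Because a small pilot coefficient yields a large weight, the penalty falls most heavily on coefficients that the pilot fit already places near zero, while large coefficients are penalized lightly. In practice `lam` would be chosen by cross-validation or an information criterion rather than fixed as in this sketch.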

Similar articles

       
 
Cihan Ates, Dogan Bicat, Radoslav Yankov, Joel Arweiler, Rainer Koch and Hans-Jörg Bauer    
In this study, we propose a population-based, data-driven intelligent controller that leverages neural-network-based digital twins for hypothesis testing. Initially, a diverse set of control laws is generated using genetic programming with the digital tw...
Journal: Algorithms

 
Yuwen Fu, E. Xia, Duan Huang and Yumei Jing    
Machine learning has been applied in continuous-variable quantum key distribution (CVQKD) systems to address the growing threat of quantum hacking attacks. However, the use of machine learning algorithms for detecting these attacks has uncovered a vulner...
Journal: Applied Sciences

 
Yiran Liu, Boyi Chen, Jinbao Chen and Yanbin Liu    
This paper investigates a rapid modeling method and robust analysis of hypersonic vehicles using multidisciplinary integrated techniques. First, the geometrical configuration is described using parametric methods based on the class-shape technique. Aerod...
Journal: Applied Sciences

 
Ruben Tapia-Olvera, Francisco Beltran-Carbajal and Antonio Valderrabano-Gonzalez    
The synchronous generator is one of the most important active components in current electric power systems. New control methods should be designed to guarantee an efficient dynamic performance of the synchronous generator in strongly interconnected nonli...
Journal: Applied Sciences

 
Chen Chen, Weidong Zhou and Lina Gao    
A proper filtering method for jump Markov system (JMS) is an effective approach for tracking a maneuvering target. Since the coexisting of heavy-tailed measurement noises (HTMNs) and one-step random measurement delay (OSRMD) in the complex scenarios of t...