Abstract
Multi-task learning (MTL) is a paradigm for learning multiple tasks simultaneously with a shared network, in which a task-specific head network is further tailored to each task. Personalized federated learning (PFL) can be achieved through MTL in the context of federated learning (FL), where tasks are distributed across clients, referred to as personalized federated MTL (PF-MTL). Statistical heterogeneity, caused by differences in task complexity across clients and the non-i.i.d. (not independent and identically distributed) nature of local datasets, degrades system performance. To overcome this degradation, we propose FedGradNorm, a distributed dynamic weighting algorithm that balances learning speeds across tasks by normalizing the corresponding gradient norms in PF-MTL. We prove an exponential convergence rate for FedGradNorm. Further, we propose HOTA-FedGradNorm, which combines FedGradNorm with over-the-air (OTA) aggregation in a hierarchical FL (HFL) setting. HOTA-FedGradNorm is designed for efficient communication between the parameter server (PS) and clients in the power- and bandwidth-limited regime. We conduct experiments with both FedGradNorm and HOTA-FedGradNorm on the Multi-Task Facial Landmark (MTFL) and wireless communication system (RadComDynamic) datasets. The results indicate that both frameworks achieve faster training than equal-weighting strategies. In addition, FedGradNorm and HOTA-FedGradNorm compensate for imbalanced datasets across clients and adverse channel effects.
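To illustrate the gradient-norm balancing idea behind FedGradNorm, here is a minimal NumPy sketch of GradNorm-style dynamic task weighting: tasks that have improved less (a larger loss ratio relative to their initial loss) are assigned a larger target gradient norm, and the weights are renormalized to sum to the number of tasks. The function name, the multiplicative update step, and the hyperparameter `alpha` are illustrative assumptions, not the authors' exact update rule.

```python
import numpy as np

def gradnorm_weights(grad_norms, losses, init_losses, alpha=1.5):
    """Sketch of GradNorm-style task weights (illustrative, not the paper's exact rule).

    grad_norms:  current L2 norms of each task's gradient w.r.t. shared layers
    losses:      current per-task losses
    init_losses: per-task losses at the start of training
    alpha:       balancing strength (assumed hyperparameter)
    """
    grad_norms = np.asarray(grad_norms, dtype=float)
    # Inverse training rate: tasks that have improved less get a larger rate.
    loss_ratios = np.asarray(losses, dtype=float) / np.asarray(init_losses, dtype=float)
    inv_rates = loss_ratios / loss_ratios.mean()
    # Target gradient norm per task: mean norm scaled by the inverse rate.
    target = grad_norms.mean() * inv_rates ** alpha
    # One multiplicative step toward the target, then renormalize so the
    # weights sum to the number of tasks.
    w = target / grad_norms
    return w * len(w) / w.sum()
```

For example, a task whose loss has barely decreased receives a weight above 1, speeding up its learning relative to tasks that are already well trained.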