Jeiyoon Park, Chanhee Lee, Chanjun Park, Kuekyeng Kim and Heuiseok Lim
Despite its significant effectiveness in adversarial training approaches to multidomain task-oriented dialogue systems, adversarial inverse reinforcement learning of the dialogue policy frequently fails to balance the performance of the reward estimator ...