Recent posts

[ICML submitted] FLAG: Flow Policy MaxEnt-RL by Latent Augmented Guidance

Abstract: Maximum entropy reinforcement learning (MaxEnt-RL) enables robust exploration, yet practical implementations often restrict policies to simple Gaussians. While recent MaxEnt-RL approaches incorporate expressive generative policies via weighted supervised learning, they use importance sa...