Taming Wild Reward Functions: The Score Function Gradient Estimator Trick
This post explains the need for the score function gradient estimator trick and how it works.
Discussing good, bad and ugly practices of reinforcement learning in neural machine translation.
How can we train semantic parsers if neither question-parse nor question-answer pairs can be collected?