Counterfactual Learning of Semantic Parsers When Even Gold Answers Are Unattainable
How can we train semantic parsers if neither question-parse nor question-answer pairs can be collected?
How can we train semantic parsers if neither question-parse nor question-answer pairs can be collected?
Discussing good, bad and ugly practices of reinforcement learning in neural machine translation.
This post explains the need for the score function gradient estimator trick and how it works.