Description
(Available in versions 1.1.0 and higher.)
This operation re-estimates smoothed n-gram models by imposing marginalization constraints, similar to those used in Kneser-Ney modeling, on absolute discounting models. Specifically, the algorithm modifies the lower-order distributions so that the expected frequencies of lower-order n-grams under the model equal the smoothed relative frequency estimates of the baseline smoothing method. Unlike Kneser-Ney, this algorithm may require multiple iterations to converge, because the steady-state probabilities change as the model is re-estimated.
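As a rough sketch of the constraint being imposed (our notation, not the toolkit's): for each lower-order history \bar{h} and each word w, the lower-order distribution is adjusted so that

  \sum_{h \succeq \bar{h}} P(h)\, p(w \mid h) \;=\; \hat{p}(\bar{h}w)

where the sum runs over history states h that back off (directly or transitively) to \bar{h}, P(h) is the steady-state probability of state h, and \hat{p}(\bar{h}w) is the baseline smoothed relative frequency estimate of the n-gram \bar{h}w.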
Usage
ngrammarginalize [--opts] [in.mod [out.mod]]
--iterations: type = int, default = 1, number of iterations of steady-state probability calculation
--max_bo_updates: type = int, default = 10, maximum within-iteration updates to backoff weights
--output_each_iteration: type = bool, default = false, whether to output a model after each iteration in addition to the final model
--steady_state_file: type = string, default = "", name of separate file from which to derive steady-state probabilities
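For instance, to derive the steady-state probabilities from a separate model file (hypothetical filenames; steady.mod is assumed to be a model in the same format):

ngrammarginalize --steady_state_file=steady.mod in.mod >out.mod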
class NGramMarginal(StdMutableFst *model);
Examples
ngrammarginalize --iterations=5 earnest.mod >earnest.marg.mod
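The same multi-iteration re-estimation can be run through the C++ class shown above. The sketch below mirrors the command-line example, re-reading the model each iteration while the weights vector carries the steady-state probabilities from one iteration to the next (in.mod and out.mod are placeholder filenames):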
int total_iterations = 5;
std::vector<double> weights;  // carries steady-state probabilities across iterations
for (int iteration = 1; iteration <= total_iterations; ++iteration) {
  // Each iteration re-reads the original model and re-applies the
  // marginalization constraints using the updated steady-state weights.
  StdMutableFst *model = StdMutableFst::Read("in.mod", true);
  NGramMarginal ngrammarg(model);
  ngrammarg.MarginalizeNGramModel(&weights, iteration, total_iterations);
  if (iteration == total_iterations)
    ngrammarg.GetFst().Write("out.mod");  // write only the final model
  delete model;
}
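Note that the single ngrammarginalize call with --iterations=5 in the first example performs the equivalent multi-iteration re-estimation; adding --output_each_iteration would also write out each intermediate model.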
Caveats
Note that this method assumes that the baseline smoothed model provides smoothed relative frequency estimates for all n-grams in the model. The method is thus not generally applicable to models trained with Kneser-Ney smoothing, since the lower-order n-gram weights produced by that method do not represent relative frequency estimates. See the reference below for further details on the algorithm.
References
B. Roark, C. Allauzen, and M. Riley. 2013. "Smoothed marginal distribution constraints for language modeling". In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 43-52.