Extended libraries The following are additional free libraries built using Pynini: pynini rewrite: Python module for applying rewrite rules edit transducer...
OpenFst Quick Tour Below is a brief tutorial on the OpenGrm SFST library based on a running example. We use the command-line SFST utilities for this; we could have...
OpenGrm SFst Library: Stochastic Finite-State Transducer Library OpenGrm SFst Version 1.1.0 is now available for download. SFst is a library for normalizing...
OpenGrm SFst Background Material The following is provided as background reading about stochastic finite-state transducers and related material. For material...
FST optimization There are several ways to "optimize" a weighted finite-state transducer (WFST). The C++ template function fst::Optimize underlies Pynini's optimize...
OpenGrm Pynini: Finite-state grammar development in Python Version 2.0.9 is now available for download. OpenGrm Pynini, like Thrax, compiles grammars expressed...
OpenGrm NGram Library Version 1.3.8 is now available for download. The OpenGrm NGram library is used for making and modifying n-gram language models encoded...
OpenGrm Thrax Grammar Development Tools Version 1.3.1 is now available for download. The OpenGrm Thrax tools compile grammars expressed as regular expressions...
Specialty operators This describes specialty FST functions for grammar compilation. fst::CrossProduct The cross product operation generates a transducer from two...
OpenGrm SFst Available Operations The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g. canonical...
OpenGrm SFST Glossary $ backoff-complete FST: a canonical FST for which each state s that has a failure transition to a state s' and another transition with...
OpenGrm Advanced Usage Below are a variety of topics covered in greater depth or of more specialized interest than found in the Quick Tour. Reading the Quick...
OpenGrm NGram Library Quick Tour This tour is organized around the stages of n-gram model creation, modification and use: corpus I/O (ngramsymbols, farcompilestrings...
OpenGrm Thrax Grammar Compiler The OpenGrm Thrax Grammar Compiler is a set of tools for compiling grammars expressed as regular expressions and context-dependent...
Symbol table operations This describes functions for symbol table operations. Each FST arc has an input (ilabel) and output (olabel) label. Symbol tables can be...
Path iteration This describes classes for iterating over paths in an FST. fst::PathIterator This template class provides a basic iterator over paths. It is constructed...
Thrax Release 0.1 (Alpha version.) Thrax Release 1.0 Removed dependency on ICU for UTF8 string parsing: with icu configuration flag no longer needed and...
OpenGrm Libraries OpenGrm is a collection of open-source libraries for constructing, combining, applying and searching formal grammars and related representations...
OpenGrm SFst COPYING Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License. You may obtain...
OpenGrm NGram README OpenGrm NGram Release 1.3 The OpenGrm NGram library is used for making and modifying n-gram language models encoded as weighted finite-state...
String (de)compilation This directory contains functions useful for mapping strings into FSAs (compilation) and for mapping string FSTs onto strings (printing...
Thrax README Thrax Release 1.2 Thrax is a toolkit for compiling grammars based on regular expressions and context-dependent rewrite rules into weighted finite-state...
Known Bugs Temporary bug in Thrax 1.2.2 where AssertNull and AssertEmpty are not being properly registered. This will get fixed soon, but in the meantime as a...
NGramMarginal Description (Available in versions 1.1.0 and higher.) This operation re-estimates smoothed n-gram models by imposing marginalization constraints...
Thrax COPYING Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License. You may obtain a copy...
OpenGrm NGram COPYING Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License. You may obtain...
NGramShrink Description This operation shrinks or prunes an n-gram language model in one of three ways: count pruning: prunes based on count cutoffs for...
NGramMake Description This operation produces a smoothed, normalized language model from an input n-gram count FST. It smooths the model in one of six ways: witten...
NGramCount Description This utility counts n-grams from an input FST archive. This produces a count FST with the same topology as the eventual normalized model,...
NGramMerge Description This operation merges two n-gram language models or two n-gram count FSTs. The operation provides options for weighting the two input FSTs...
NGramInfo Description The command-line utility ngraminfo prints various information about an n-gram model obtained from the NGramModel class and the underlying...
NGramPerplexity Description Command-line utility to calculate the perplexity of a corpus given a model. Verbose mode gives the per-word contribution to the perplexity...
NGramPrint Description By default, only n-grams are printed (without backoff epsilon transitions), in the same format as discussed above for reading in n-gram...
NGram Model Format The following gives the encoding of all n-gram models produced by the utilities here, including those with unnormalized counts, as a cyclic weighted...