The Bulgarian grammar checker is based on a language model derived from the frequency list of the annotated Bulgarian National Corpus. The model covers 893,626,788 3-grams with part-of-speech (POS) tags, including punctuation. For an arbitrary POS-tagged 3-gram, it gives the probability that the 3-gram is valid in the language.
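As a rough illustration of how a frequency list can yield such probabilities, the sketch below turns raw 3-gram counts into relative frequencies. The source does not say how the checker estimates its probabilities, so the relative-frequency estimate, the function name `trigram_probabilities`, the tag names, and the counts are all assumptions made for this example.

```python
from collections import Counter


def trigram_probabilities(counts):
    """Convert raw 3-gram counts into relative frequencies (an assumed estimator)."""
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()}


# Hypothetical toy frequency list of POS-tag 3-grams; tags and counts are illustrative only.
toy_counts = Counter({
    ("ADJ", "NOUN", "VERB"): 6,
    ("NOUN", "VERB", "PUNCT"): 3,
    ("VERB", "VERB", "VERB"): 1,
})

probs = trigram_probabilities(toy_counts)
```

A real model of this size would likely need smoothing for unseen 3-grams; this sketch leaves that out.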
The language model is implemented as finite automata. For each sentence, the checker applies the model to each consecutive 3-gram in turn, and every 3-gram whose probability falls below the threshold is flagged as a potential error.
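The flagging step can be sketched as a sliding window over the POS tags of a sentence. This is a minimal sketch, not the checker's actual code: it swaps the finite automata for a plain dictionary lookup, and the function name `flag_trigrams`, the tag names, the probabilities, the threshold value, and the choice to treat unseen 3-grams as probability 0.0 are assumptions.

```python
def flag_trigrams(tags, probs, threshold):
    """Return (position, 3-gram, probability) for every 3-gram below the threshold."""
    flagged = []
    for i in range(len(tags) - 2):
        gram = tuple(tags[i:i + 3])
        # Assumption: a 3-gram missing from the model gets probability 0.0.
        p = probs.get(gram, 0.0)
        if p < threshold:
            flagged.append((i, gram, p))
    return flagged


# Hypothetical model and sentence; all values are illustrative only.
toy_probs = {
    ("PRON", "VERB", "NOUN"): 0.02,
    ("VERB", "NOUN", "PUNCT"): 0.03,
}
sentence_tags = ["PRON", "VERB", "NOUN", "NOUN", "PUNCT"]

flagged = flag_trigrams(sentence_tags, toy_probs, threshold=0.01)
for position, gram, p in flagged:
    print(position, gram, p)
```

Here the first 3-gram passes, while the two 3-grams containing the doubled `NOUN` tag are absent from the toy model and are flagged at positions 1 and 2.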