More artificial Finnish
Vowel harmony is an interesting feature of some languages, such as Finnish and Turkish: vowels fall into groups that harmonize with each other, and vowels from incompatible groups can’t appear in the same word. In Finnish, which has a lot of suffixes, the vowel of a suffix depends on the vowel group of the word it attaches to. (More on vowel harmony, exemplified by Turkish, here.)
Mark Dominus ran into this when trying to generate plausible-sounding Finnish with Markov chains:
M. Vacklin pointed out that a number of words in my sample output violated the Finnish rules of vowel harmony… Vowel harmony is a phenomenon found in certain languages, including Finnish. These languages class vowels into two antithetical groups. Vowels from one group never appear in the same word as vowels from the other group. When one has a prefix or a suffix that normally has a group A vowel, and one wants to join it to a word with group B vowels, the vowel in the suffix changes to match. This happens a lot in Finnish, which has a zillion suffixes. In many languages, including Finnish, there is also a third group of vowels which are “neutral” and can be mixed with either group A or with group B.
The vowel harmony thing is interesting in this context for the following reason. My pseudo-Finnish was generated by a Markov process: each letter was selected at random so as to make the overall frequency of the output match that of real Finnish. Similarly, the overall frequency of two- and three-letter sequences in pseudo-Finnish should match that in real Finnish. Is this enough to generate plausible (although nonsensical) Finnish text? For English, we might say maybe. But for Finnish the answer is no, because this process does not respect the vowel harmony rules. The Markov process doesn’t remember, by the time it gets to the end of a long word, whether it is generating a word in vowel category A or B, and so it doesn’t know which vowels it should be generating.
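To make the memory problem concrete, here is a minimal Python sketch. It is not Dominus's actual program. The function names (`violates_harmony`, `build_model`, `generate`), the two-letter context length, and the tiny two-word corpus are all assumptions made for illustration. The vowel groups are the usual Finnish ones: back vowels a, o, u; front vowels ä, ö, y; with e and i neutral. The checker is deliberately naive and ignores compounds and loanwords, which can legitimately mix groups.

```python
import random
from collections import defaultdict

BACK = set("aou")    # one harmony group
FRONT = set("äöy")   # the opposing group
# e and i are neutral and may combine with either group.


def violates_harmony(word):
    """Return True if the word mixes back and front vowels."""
    letters = set(word.lower())
    return bool(letters & BACK) and bool(letters & FRONT)


def build_model(text, order=2):
    """Map each `order`-letter context to the letters that follow it in text."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(model, seed, length, rng):
    """Extend seed letter by letter, looking only at the last len(seed) letters."""
    order = len(seed)
    out = seed
    while len(out) < length:
        followers = model.get(out[-order:])
        if not followers:  # context never seen mid-text: stop early
            break
        out += rng.choice(followers)
    return out


# Hypothetical toy corpus: "in the house", "in the village".
model = build_model("talossa kylässä", order=2)
print(model["ss"])  # the context "ss" is followed by both "a" and "ä"
print(generate(model, "ta", 20, random.Random()))
```

After generating "taloss", the model sees only "ss" and has forgotten the back vowels a and o earlier in the word, so it can pick ä just as readily as a. Depending on the random draw, the output may be "talossä", which `violates_harmony` flags.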