KeithSchwarz.com

The (re)Constitution

What is this?

You're probably wondering what on earth all this is (and what all that gibberish means). The (re)Constitution is a randomly generated string of text created by analyzing the original United States Constitution and reconstructing it using a method known as a Markov chain.

Markov chains work on principles of probability. The idea is simple: by reading in a document and storing the probability of words appearing in certain orders, a program can generate a new document based on the contents of the first.
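
As a rough illustration of what "storing the probability" might look like, here is a minimal Python sketch that works at the word level. The function name follower_probabilities and the sample sentence are made up for this example; they are not taken from the program that produced the (re)Constitution.

```python
from collections import Counter, defaultdict

def follower_probabilities(text):
    """Map each word to the probability of each word that directly follows it."""
    words = text.split()
    counts = defaultdict(Counter)
    # Count how often each word is followed by each other word.
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    # Turn the raw counts into probabilities.
    return {
        word: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for word, followers in counts.items()
    }

# Hypothetical sample input, chosen only to keep the output small.
sample = "we the people of the united states"
probs = follower_probabilities(sample)
print(probs["the"])  # {'people': 0.5, 'united': 0.5}
```

In the sample, "the" is followed once by "people" and once by "united", so each gets a probability of one half.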

The generation algorithm is quite simple, and works like this:

  1. Take a source document and read it into memory.
  2. From this document, calculate the probability of each combination of letters appearing immediately after each sequence of ten letters.
  3. Choose a random seed string of ten letters.
  4. Based on this seed, choose a character combination that appeared after it in the original text.
  5. Output that character combination, then repeat the previous step using the ten most recently written letters.
  6. Terminate when a certain number of characters have been written.
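
The steps above can be sketched in Python roughly as follows. This is one possible reading, not the original program: it assumes the "character combination" is a single character, it picks the seed from windows that actually occur in the source, and it simply stops early if it reaches a window that only appears at the very end of the source. The names build_model and generate, and the order and rng parameters, are hypothetical.

```python
import random
from collections import defaultdict

def build_model(text, order=10):
    """Map every `order`-character window to the characters that followed it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        # Keeping repeats in the list means a uniform pick from it
        # is weighted by how often each follower appeared.
        model[text[i:i + order]].append(text[i + order])
    return dict(model)

def generate(text, length, order=10, rng=None):
    """Write at most `length` characters in the style of `text`."""
    rng = rng or random.Random()
    model = build_model(text, order)
    if not model:
        raise ValueError("source text must be longer than the window size")
    out = rng.choice(list(model))  # random seed window (assumption: taken from the source)
    while len(out) < length:
        followers = model.get(out[-order:])
        if followers is None:
            break  # assumption: stop when the window has no recorded follower
        out += rng.choice(followers)
    return out[:length]

# Hypothetical usage with a tiny source text; a smaller window is used
# here only because the sample is so short.
source = "we the people of the united states, in order to form a more perfect union, "
print(generate(source * 3, 120, order=4))
```

With a real source such as the full Constitution, the window of ten letters from the steps above would be the natural choice for order.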

This algorithm contains no implementation of grammatical rules or even clues as to how words are spelled. Merely by processing the input source text (in our case, the United States Constitution), the Markov chain algorithm can correctly place words a good deal of the time, and rarely (if ever) makes spelling mistakes not present in the original text.
