Watch a few minutes of C-SPAN and you quickly realize that most political speeches are ordinary and predictable. They contain stock phrases and tired terminology, not the inspired poetry of great oratory.
That means political speech-writing ought to be automatable. In other words, it's just a matter of stringing together a series of building-block terms ("death tax," "budget request," "clean coal") in the right sort of order.
And that's basically what the algorithm developed by Valentin Kassarnig, a researcher at the University of Massachusetts, Amherst, does. It follows the conventions of hundreds of real speeches to create new ones. See if you can tell that the following was computer-written:
Mr. Speaker. For years, honest but unfortunate consumers have had the ability to plead their case to come under bankruptcy protection and have their reasonable and valid debts discharged. The way the system is supposed to work, the bankruptcy court evaluates various factors including income, assets and debt to determine what debts can be paid and how consumers can get back on their feet. They need to have money to pay for child care. They need transportation. It allows them to get reestablished, and we think this is certainly very helpful.
Kassarnig analyzed 3,857 speech segments from 53 Congressional floor debates in 2005. In all, the dataset includes about 50,000 sentences, averaging about 23 words each. He labeled each speech by whether it was made by a Republican or a Democrat, and used both a "language model" (where the computer predicts the sixth word of a string based on the first five) and a "topic model" (which predicts a topic based on the previous subject of the speech) to generate new sentences.
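To make the "language model" idea concrete, here is a minimal sketch of a word predictor that looks at the previous five words and picks a sixth based on what followed those same five words in the training text. It is an illustration only, not Kassarnig's code: the function names (train_ngram_model, predict_next, generate), the whitespace tokenization, and the toy sentences are all assumptions, and the topic model and party labels are left out entirely.

```python
import random
from collections import Counter, defaultdict


def train_ngram_model(sentences, context_size=5):
    """Count which word follows each run of `context_size` words.

    Hypothetical helper: splits on whitespace, which is far cruder than
    whatever preprocessing a real system would use.
    """
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - context_size):
            context = tuple(words[i:i + context_size])
            model[context][words[i + context_size]] += 1
    return model


def predict_next(model, context, rng=random):
    """Sample the next word in proportion to how often it followed `context`."""
    counts = model.get(tuple(context))
    if not counts:
        return None  # this five-word context never appeared in training
    words = list(counts)
    weights = [counts[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(model, seed_words, max_words=30, rng=random):
    """Extend `seed_words` one word at a time until no continuation is known."""
    context_size = len(seed_words)
    output = list(seed_words)
    while len(output) < max_words:
        next_word = predict_next(model, output[-context_size:], rng)
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)


# Toy corpus (made up for illustration, not taken from the real dataset).
toy_sentences = [
    "we need to pay for child care",
    "we need to pay for transportation",
]
toy_model = train_ngram_model(toy_sentences, context_size=5)
toy_speech = generate(toy_model, ["need", "to", "pay", "for", "child"])
```

With only two training sentences, the seed "need to pay for child" has exactly one known continuation, so the sketch simply completes the phrase. Trained on tens of thousands of real sentences, the same counting approach has many continuations to choose from, which is how a generator of this kind can produce new text that still sounds like the speeches it learned from.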
"In my opinion, the experiment has clearly shown that the majority of the speeches follow a certain pattern that distinguishes them from any other texts," Kassarnig says in an email. "[It] has also shown that people are so used to hear those kind of speeches that they tend to disregard the actual content."
Kassarnig doesn't think the method will be used by real pols, even the lazy ones. But, in a paper discussing the experiment, he says the program could be useful for summarizing lots of speeches on the same topic (for example, if members of the same party all stand up and say more or less the same thing).
Indeed, you can see how such a system could end the need for Congressional reporters. Who wants to sit through 20 speeches saying the same thing, if a computer can record everything and report back on the themes involved?