Of all the coworkers a man named John Arnold emailed, Andy Zipper got the worst of John's wrath. Specifically, John’s emails to Andy contained 4.7% less joy, 10% more sadness, 8.7% more fear, and 7% less anticipation than his emails to the rest of his colleagues.
Whether John actually feared, loathed, or despaired of emailing Andy only John can truly know, but Saif Mohammad, a research officer at the National Research Council Canada's Institute for Information Technology, says that the data mining tool used to detect the emotional temperature of John’s words is between 60% and 70% accurate at the sentence level when its results are checked by human readers. And it can decipher more than just the negative stuff. In fact, Mohammad’s tool can analyze words in text for up to eight basic emotions. Within a year, he’s hoping to roll out a Gmail app that can help users measure the emotional content of their inboxes and outboxes.
How might an app like this be useful? Self-reflection, for one. When Mohammad and his colleague Tony Yang analyzed one set of 32,045 employee emails, they found significant differences in the way men and women communicated. Men received more emails with "trust" words, while women received more emails tagged with joy. When men wrote to women, they tended to use more anticipatory words, and when men wrote to men, they used more terms loaded with fear. Women, meanwhile, communicated more sadness to other women, but also less trust.
"If our differences are making some people treat us differently—and negatively in some cases—then you want to know what’s going on here," Mohammad says. "My goal is not to make everybody the same, it’s not to propagate stereotypes, but to give people power to analyze their own data, and how they speak."
It's easy to see how Mohammad's tool might empower personal communication, but it's also fairly easy to imagine how it might be abused. If you work in sales, it's conceivable that an employer might mandate a certain degree of happy words in messages to clients. To take another step in that dystopian direction, what if advertisers only wanted copy next to articles that made readers feel a palpable sense of joy?
Mohammad hasn't thought about the app in those capacities. Instead, he stresses its value as a personal analytic tool. "You could look if there’s an abrupt change. That can be sort of a self-discovery: ‘When I started my Ph.D. program, my sadness seems to have increased a whole lot!’" Mohammad laughs. "That might seem frivolous, but in some cases it could be pretty useful," he adds.
If you’re depressed at work, but also find that you’re receiving emails with a high degree of hostility, that knowledge alone might make you feel less dysfunctional, weak, or subjective, Mohammad says. In cases where kids are targeted by cyber-bullies, he suggests that sentiment analysis could help them critically understand why they feel so low.
Mohammad’s tool doesn’t take the whole complexity of human communication into consideration. Colloquialisms and sarcasm are just two examples of communication it can't measure. Machine learning has also not yet been able to determine why one person’s communications might be loaded with hostility or other emotions towards another. He concedes that these algorithms are to be taken with a grain of salt, and that they aren't particularly helpful in analyzing the depths of one relationship between two people. If you’re looking to get a handle on the general mood of lots of emails, tweets, or Facebook posts, however, machine learning could be a friend.
Emails aren’t the only pieces of text Mohammad has analyzed using the eight basic human emotions (joy, sadness, anger, fear, trust, disgust, surprise, anticipation) identified by psychologist Robert Plutchik more than 30 years ago. He’s applied the tool to fairy tales, novels, and Shakespeare. He’s analyzed suicide notes compiled online by journalist Art Kleiner and an archive of love letters. He is also currently working with the psychology department at the University of Pennsylvania to see whether his sentiment analysis algorithm could play a role in predicting heart attacks based on a person’s language.
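For readers curious how word-level emotion tagging of this kind can work in principle, here is a minimal Python sketch. It is not Mohammad's tool: the tiny word list, the function names, and the two sample messages are all invented for illustration, and the sketch assumes that figures such as "10% more sadness" are relative differences in the share of emotion-tagged words, which the article does not specify.

```python
import re
from collections import Counter

# Plutchik's eight basic emotions, as listed in the article.
EMOTIONS = ("joy", "sadness", "anger", "fear",
            "trust", "disgust", "surprise", "anticipation")

# Hypothetical toy lexicon, invented for this example only.
# A real emotion lexicon would cover many thousands of words.
TOY_LEXICON = {
    "happy": {"joy"},
    "thanks": {"joy", "trust"},
    "sorry": {"sadness"},
    "worried": {"fear", "sadness"},
    "furious": {"anger"},
    "reliable": {"trust"},
    "awful": {"disgust", "sadness"},
    "suddenly": {"surprise"},
    "soon": {"anticipation"},
}


def emotion_shares(text, lexicon=TOY_LEXICON):
    """Return, for each emotion, the fraction of words in `text` tagged with it."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        # A word may carry several emotions; untagged words carry none.
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    total = len(words)
    return {e: (counts[e] / total if total else 0.0) for e in EMOTIONS}


def relative_difference(share_a, share_b):
    """Percent by which share_a differs from share_b (None if share_b is 0)."""
    if share_b == 0:
        return None
    return 100.0 * (share_a - share_b) / share_b


# Two made-up messages standing in for "emails to one person" and
# "emails to everyone else".
to_one_person = emotion_shares("Sorry, I am worried about this awful trade.")
to_everyone_else = emotion_shares("Thanks, happy to help. Talk soon.")

for emotion in EMOTIONS:
    diff = relative_difference(to_one_person[emotion], to_everyone_else[emotion])
    print(emotion, to_one_person[emotion], to_everyone_else[emotion], diff)
```

A counter like this knows nothing about context, so it would miss sarcasm and slang entirely, which is one reason such scores are better read across many messages than within a single one.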
"Emotions are central to our life," Mohammad says. "There are implications in health, there are implications in social cultural aspects, there are implications in product marketing."
Mohammad is working with the Google Apps API to roll out his data mining tool for Gmail, and will eventually make a public call for volunteers willing to test their inboxes.
John Arnold and Andy Zipper, meanwhile, are real people. They used to work for Enron, and their emails are now available for anyone to decipher after the Federal Energy Regulatory Commission published some 200,000 Enron employee emails sent between 1998 and 2002. Mohammad chose Enron for his gendered email analysis, and John Arnold’s emails in particular, he says, because Arnold's were typical of Enron employees.
"I hope he’s not mad about it," Mohammad says.