In an English corpus database I use, I found that 10% of usage was an anathema.
The Enron Corpus has led to a better understanding of how language is used and how social networks function, and it has improved efforts to uncover social groups based on e-mail communication.
It analyzes the corpus of existing translations and finds statistical matches.
One way to gauge the prevalence of a word is to consult the Oxford English Corpus, a body of 2 billion words.
Such tools owe a debt to an unlikely, though appropriate, source: the electronic mail database known as the Enron Corpus.
A search of the spoken category of the Corpus of Contemporary American English finds that "I" is about eight times more common than "me", but "who" is 57 times more common than "whom".