All Our N-gram are Belong to You
I couldn’t come up with a better title for this than the one Google Research gave it. Basically, they’re releasing word and phrase (n-gram) counts from their dataset of a trillion words. I dig it for many reasons: words are cool, that’s a lot of words, and making the data available to the public for insane linguistic research is so Google.