Really large NLP corpora

Jeez, people. You’re all noisy. I’m sure it was all done for posterity’s sake.

23M     irclogs/MagNET/#perl.log
29M     irclogs/freenode/#mysql.log
36M     irclogs/freenode/#debian.log
37M     irclogs/foonetic/#xkcd.log
39M     irclogs/OFTC/#debian.log
43M     irclogs/freenode/#jquery.log
44M     irclogs/freenode/#perl.log

$ for file in irclogs/MagNET/#perl.log irclogs/freenode/#mysql.log irclogs/freenode/#debian.log irclogs/foonetic/#xkcd.log irclogs/OFTC/#debian.log irclogs/freenode/#jquery.log irclogs/freenode/#perl.log; do printf '%s: ' "$file"; head -n 1 "$file"; done
irclogs/MagNET/#perl.log: --- Log opened Thu May 26 08:31:32 2011
irclogs/freenode/#mysql.log: --- Log opened Wed Dec 28 09:03:49 2011
irclogs/freenode/#debian.log: --- Log opened Tue Mar 12 12:52:40 2013
irclogs/foonetic/#xkcd.log: --- Log opened Wed Dec 28 19:33:43 2011
irclogs/OFTC/#debian.log: --- Log opened Tue Jul 12 19:25:48 2011
irclogs/freenode/#jquery.log: --- Log opened Tue Jan 31 16:47:51 2012
irclogs/freenode/#perl.log: --- Log opened Thu Dec 15 09:31:47 2011
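
The size listing and the first-line loop above could be combined into one pass. What follows is only a rough sketch, not the commands actually used for this post: `log_summary` is a made-up helper name, and it assumes the logs live under `irclogs/<network>/<channel>.log` as in the paths shown, and that `du -k` separates size and path with a tab (as GNU and BSD `du` do).

```shell
# Hypothetical helper (not from the original post): for every *.log file
# under a directory, print "sizeK<TAB>path<TAB>first line", smallest first.
# Assumes du -k emits "size<TAB>path" and that paths contain no newlines.
log_summary() {
    dir=${1:-irclogs}
    tab=$(printf '\t')
    find "$dir" -type f -name '*.log' -exec du -k {} + |
        sort -n |
        while IFS="$tab" read -r size path; do
            # head -n 1 gives the "--- Log opened ..." line for each log
            printf '%sK\t%s\t%s\n' "$size" "$path" "$(head -n 1 "$path")"
        done
}
```

Running `log_summary irclogs` would then list each log with its size in kilobytes and its opening timestamp, which makes it easy to see how long each channel took to get that noisy.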