Category: language

  • Recovering videos from DV tapes with Canon ZR80

    I am recovering some tapes from back in the day that some of you may enjoy. Here is a log of the process so that maybe you can recover some of your own DV tapes. Seems to work well in modern Debian. To attach to the camcorder, I used a PCI-e card that has an…

  • I took this picture

    Could you folks please update my credentials? KTHXBI

  • Really large NLP corpora

    Jeez, people. You’re all noisy. I’m sure it was all done for posterity’s sake.

    23M irclogs/MagNET/#perl.log
    29M irclogs/freenode/#mysql.log
    36M irclogs/freenode/#debian.log
    37M irclogs/foonetic/#xkcd.log
    39M irclogs/OFTC/#debian.log
    43M irclogs/freenode/#jquery.log
    44M irclogs/freenode/#perl.log

    $ for file in irclogs/MagNET/#perl.log irclogs/freenode/#mysql.log irclogs/freenode/#debian.log irclogs/foonetic/#xkcd.log irclogs/OFTC/#debian.log irclogs/freenode/#jquery.log irclogs/freenode/#perl.log; do echo -n "$file: " ; head -1 $file ; done
    irclogs/MagNET/#perl.log: --- Log opened Thu…

  • PocketSphinx on Android via the NDK

    While working on my project for the Spring ’10 “NLP on Mobile Devices” course, I put together a PocketSphinx NDK build. You can pull it down from my git repo: $ git clone git:// I haven’t written any of the JNI marshaling functions yet, though.

  • Logs from talk with Daniel & Zalmai

    I had a phone conversation with Daniel Mills and Zalmai Zahir today in order to vet our test sentences for ling 567. I’ve got to say that Daniel is really showing his worth as a linguist here. I’m happy to bask in his glow and make sure the fonts render right. Go Daniel! The write-up…

  • Font for composing Lushootseed

    At the recommendation of David Beck, I have installed a TTF font from It took a few minutes for me to figure out how to get it going, but it was pretty straightforward after that. Here are some quick instructions for those of you running Debian variants such as Ubuntu. Fetch the zip files…

  • Lushootseed characters

    Here are some of the characters used to represent text in the Lushootseed languages. This is an imperfect representation. There doesn’t seem to be a COMBINING LATIN SMALL LETTER W, so I’m using a second character in these cases. I also can’t find any fonts that render a c with both a caron and a…

  • A quick update – I’m a grad student!

    Hey all! I’m sorry I haven’t been very active with my packages recently. I all-of-a-sudden started grad school and have been swamped with studying. I also started a contract and have been busy trying to learn a new codebase while contributing something other than snark. I promise I’ll get back to packaging IronRuby and IronPython…

  • Well, that was an eventful day!

    *whew* I did a bunch of things yesterday. We took our kindergärtner to her first Friday at her new school (and were about 10 minutes tardy. oops). We then took our toddler to a nearby playground with swings and slides and let her expend some energy. After she had been sufficiently exercised, we walked back…

  • some links to my language-related blog posts