Far Off Places - Texas Expat

| No Comments | No TrackBacks

Just a quick note to point out that we’ve just launched the iOS version of Far Off Places, the magazine of written whimsy I help publish. The magazine’s app is available in both iPhone and iPad versions (I prefer the iPhone/iPod Touch version myself!); you can find it listed as Far Off Places under Newsstand, where it is (at the time of this writing) #1 on the U.S. and U.K. charts for literary journals.

Post-NAACL - Texas Expat

Just setting off for home after my annual week of conferencing. This year it was NAACL rather than trusty old CogSci, with a correspondingly tougher crowd. I gave a talk on the second day to a (large) full room of computational linguists, computer scientists, and other sundry NLP folk, which quite frankly went better than it had any right to. Good questions, good feedback, and while I’m pretty sure that I lost half my audience the first time I uttered the phrase “cognitive plausibility”, I felt like I kept the rest around long enough to at least leave ‘em with a solid introduction to the HRG algorithm.

And to my surprise I then spent the next day and a half fielding questions and generally getting interesting feedback. The best of these by far was from the chair of the session, who cornered me afterward to grinningly confess that my talk had been “a very Mirella-ish piece of work.” I’m pretty sure that’s a good, if slightly mixed, thing. On the one hand it’s quite flattering to have my work identified as being on her level; on the other, it’s slightly galling to have my lack of a distinct personal style so neatly pointed out (“I was just waiting for the Mechanical Turk study at the end!”).

Ah, well.

It was a pretty great conference all around, actually — I might be getting the hang of this business by now. I’ve collected a circle of conference-going friends, if naught else; a shout out to Alessandra, Marten, Yonatan et al. And three cheers for holding academic conferences in countries where beer is served in sensible quantities. Shame about Québécois breweries, though.

Convert for Alfred - Texas Expat

One of my least favourite things in life is converting between units of measurement, especially when cooking. All my mum’s recipes use imperial units, you see, yet my kitchen’s supply of measuring widgets is exclusively metric. Constantly launching a web browser and running the conversion through Google is just tiresome — but having just made the switch from Quicksilver to Alfred, I decided to have a go at building an Alfred Extension to do quick unit conversions using Google’s calculator API.

To use it, download the extension and install it via Alfred’s preferences window. Summon Alfred as usual, then make unit conversions with:


It can handle anything Google knows about, so queries like

  • “convert 500 grams to lbs”
  • “convert 375 fahrenheit to celsius”
  • “convert 2 cups to liters”

will work just fine. For brevity’s sake, you can omit the ‘to’. Happy trans-cultural baking!
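
For the curious, the query shape above (including the optional ‘to’) can be sketched in a few lines of Ruby. This is only an illustration of the input format: the extension itself simply hands the text over to Google, and parse_conversion is a hypothetical name I've made up here, not something the extension contains.

```ruby
# Hypothetical helper: split a query such as "convert 500 grams to lbs"
# into an amount, a source unit, and a target unit. The 'to' is optional.
def parse_conversion(query)
  pattern = /\A(?:convert\s+)?(-?\d+(?:\.\d+)?)\s+(.+?)\s+(?:to\s+)?(\S+)\z/i
  match = pattern.match(query.strip)
  return nil unless match   # not a conversion query

  { amount: match[1].to_f, from: match[2], to: match[3] }
end

p parse_conversion("convert 500 grams to lbs")
p parse_conversion("convert 2 cups liters")
```

Anything that doesn't look like a number followed by two units comes back as nil, which is roughly the point at which you'd fall through to a plain web search.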

c_progressbar - Texas Expat

I do a lot of programming, in a lot of different languages. Each language evokes a particular sort of mindset when I’m using it, as well: Python and Objective-C are for games, Ruby and C are for research, PHP is for webdev. Within each mindset, though, there’s a fair amount of blending — and I can’t tell you how often I wish I was writing in Ruby when I find myself writing in C.

My usual workflow for research-oriented coding involves prototyping a tool or model in Ruby (fast to write, easy to understand), then porting the final, working version into C for use with real data (my models are usually trained on massive statistical corpora, something Ruby is, unfortunately, ill-equipped to handle). This is all well and good, but I always miss the usability that I can so easily sprinkle into the Ruby version. OptionParser is one such gem; Ruby/ProgressBar is another.

In a moment of frustration, I re-implemented Ruby/ProgressBar in pure, unadulterated C99. If you’re interested in that sort of thing, you can grab a copy from my GitHub repository:

git clone git@github.com:doches/progressbar.git

For comparison, here is how you use a progressbar in Ruby:

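require 'rubygems'
require 'progressbar'

progress = ProgressBar.new("Loading", 100)
(0..99).each do |i|
  # Do some stuff
  progress.inc    # advance the bar by one step
end
progress.finish   # fill the bar and move to a new line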

…and in C:

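#include "progressbar.h"

progressbar *progress = progressbar_new("Loading", 100);
for (int i = 0; i < 100; i++) {
  // Do some stuff
  progressbar_inc(progress);  // advance the bar by one step
}
progressbar_finish(progress); // draw the completed bar and free it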

More examples, including custom formatting and indeterminate progress, are in test/progressbar_demo.c.
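
Under the hood, both versions boil down to redrawing a single line of text as the count goes up. Here is a rough Ruby sketch of that rendering step, assuming a simple "label: NN% |ooo   |" layout. The render_bar name and the exact format are mine, for illustration only; neither library necessarily draws its bar this way.

```ruby
# Hypothetical helper: build one line of progress-bar text.
#   label - text shown to the left of the bar
#   value - steps completed so far
#   max   - total number of steps
#   width - number of character cells inside the bar
def render_bar(label, value, max, width = 20)
  done = value.clamp(0, max)
  filled = max.zero? ? width : done * width / max    # cells to fill
  percent = max.zero? ? 100 : done * 100 / max       # whole percent
  format("%s: %3d%% |%s%s|", label, percent, "o" * filled, " " * (width - filled))
end

# Redraw in place by returning the cursor to the start of the line.
(0..100).step(25) do |i|
  print "\r" + render_bar("Loading", i, 100)
end
puts
```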

CogSci 2010 - Texas Expat

I just got back from CogSci 2010, to which I had successfully submitted a paper, “Meaning Representation in Natural Language Categorization.” Unfortunately, I wasn’t invited to present it as a talk — but on the upside, I was invited to present it as a poster. As a consequence, I may now write the following sentence, of which I am more proud than practically anything I have done in my life to date:

Fountain, T. & Lapata, M. (2010). Meaning Representation in Natural Language Categorization. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.


The work I presented deals with whether corpus co-occurrence can be used as a stand-in for norming data, at least in the context of a categorization task; as part of that work I collected a rather large amount of data that I am now making publicly available. The dataset extends the McRae et al. feature norms by grouping the words into forty-one categories, and includes norming data, integrated into the standard McRae features, for each of the newly-added category labels.

For over a year now I’ve been wanting to start writing games for the iPhone and iPod Touch; in mid-February I was finally seized by the bug and sat down to figure it all out. The learning curve was pretty huge, frankly: I remember realizing one evening that I was writing code in a language I didn’t know (Objective-C) using an IDE I’d never used before (Xcode) to work with an API I’d previously avoided like the plague (OpenGL).

Fun stuff.

On the upside, running your own code on the iPhone is really cool — I haven’t felt this much like I was “hacking” (in the classical sense of the term) in a long time. The upshot of it all is that I’ve just released my first game, Levelheaded, for the iPhone. It’s a port of my old Global Game Jam project from last year, Jarhead, updated for the iPhone’s touch input and limited screen real estate. Now, onward with the marketing!

As a follow-on to my ConfigParser Sugar for the ever-awesome Espresso, I’ve written a workable Sugar that turns Espresso into a full-featured LaTeX editor. It offers syntax highlighting, basic itemizers, templates for figures, tables, and common documents, plus code completion for a large number of commonly used LaTeX commands.

The code is hosted on GitHub; I’d appreciate a comment if you’re using the Sugar, just to help keep me motivated to work on the thing. Similarly, if you feel like contributing I’d love to accept patches or pull requests.

I’ve recently fallen in love with MacRabbit’s lovely new text editor for OS X, Espresso. For one, it’s a fantastic editor: not quite as Swiss-army-esque as the stalwart TextMate, but close. Unlike TextMate, Espresso is downright gorgeous, and gorgeous in a way that doesn’t interfere with actually using the thing. A great tool, beautifully designed, and with a deliciously extensible core to boot.

Naturally, this last bit is what really got me hooked. Espresso plugins are called ‘Sugars’, and can provide truly ludicrous extensions to the editor. Want to add a syntax highlighter for a new language? Specialized code folding for a certain coding idiom? Autocomplete and suggestion tools for a library? You can do it, and (almost) entirely in XML. Beautiful.

The first thing I did after downloading the trial version was hunt around for relevant Sugars, which I found to be somewhat sparse. Espresso is, unfortunately, quite a young editor. For instance, I couldn’t find a Sugar for working with configuration files produced/read by Python’s ConfigParser module — something I desperately need for a super-secret project I’m working on. Several hours of reading, writing, and frantic github searching later, I present:


An Espresso Sugar providing syntax highlighting for configuration files laid out according to RFC 822. Among other things, it understands Windows .ini files and config files produced by Python’s ConfigParser module.
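
To give a sense of what the Sugar has to recognize, here is a rough Ruby sketch that picks out the same pieces of a config file: section headers, "key = value" or "key: value" pairs, and comment lines. It is a deliberate simplification (no continuation lines, no interpolation), and parse_config is a made-up name for illustration; the Sugar itself is defined in XML, not Ruby.

```ruby
# Hypothetical, simplified parser for ConfigParser-style files.
# Returns a hash of { section name => { key => value } }.
def parse_config(text)
  sections = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#", ";")   # blank line or comment

    if (m = line.match(/\A\[(.+)\]\z/))                  # [section]
      current = (sections[m[1]] ||= {})
    elsif current && (m = line.match(/\A([^=:]+?)\s*[=:]\s*(.*)\z/))
      current[m[1]] = m[2]                               # key = value / key: value
    end
  end
  sections
end

sample = <<~INI
  ; settings for a made-up project
  [server]
  host = example.org
  port: 8080
INI

p parse_config(sample)
```

Every value comes back as a string here; turning "8080" into a number is left to the caller, much as it is with ConfigParser itself.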


Clone the GitHub project somewhere, with the following:

git clone git://github.com/doches/ConfigParser.sugar.git ./ConfigParser.sugar

And then link it into your Sugars directory:

ln -s "$(pwd)/ConfigParser.sugar" "/Users/$(whoami)/Library/Application Support/Espresso/Sugars/"

The Flixel Experience - Texas Expat

Learning ActionScript 3 and working with Flixel these last few weekends has been an absolute joy. Two weeks ago I sat down, knowing absolutely nothing about ActionScript coding, and banged out Invaders, a bog-simple Space Invaders clone with a minimalist streak and a misguided theme of non-violence. It was, to say the least, a terrible game. But I wrote it in something like eight hours, and at the start of those eight hours I didn’t even know the language.

ActionScript 3 is fantastic. Flixel is beautiful.

Last weekend I thought I’d try it again, and got myself immersed in building another simple game, a little avoidance game where you navigate a ship through a cluttered trench. Owing to (unfortunate) science-related obligations, I had to put it away for the week — but a few hours of polishing menus and hacking audio this afternoon brought it up to something like a completed state. It’s nothing special, but you can play Trench Run on Kongregate if you’re so inclined. There are certainly worse ways to spend your next 97 seconds.

The Flixel community, while small, is growing by leaps and bounds, and is already more vibrant than other development communities I’ve been involved with. I made heavy use of Timothy Hely’s ludicrously detailed Flixel tutorial in creating both games, and had to poke around IRC for a bit to get some questions answered. The library itself is open source as well, so a little bit of hacking is all that ever stood between me and an (albeit ugly) solution to whatever problems I might encounter. A new version of Flixel is supposed to drop any day now, and I can’t wait!

One thing I would dearly love to see, though: an official Flixel repository on GitHub or the like. There are enough people who’d love to contribute back to the library (myself included, I’d like to think) that this could turn into something huge rather quickly. Go Flixel! Go!

Ed. -- Ask, and ye shall receive: Flixel on GitHub.

I should start this off by making something clear: I am an extremely lazy programmer. I long ago adopted the Ruby mantra that programmer time is more precious than machine time, and tend to write most of my code in a way that is clear and comprehensible, with little regard for speed of execution.

When working with massive NLP datasets, this is perhaps not an entirely good idea.

Lately my experiments have begun to take up more and more days to complete. Part of this, of course, is Ruby’s fault — the language, or at least the current interpreter, is far from fast — and part of the blame lies with me, for writing fundamentally lazy code. I have, however, come up with something of an interesting fix for the problem by parallelizing almost all of my big experiments. Easily done, really — but I then realized that I don’t have access to any of the department’s clusters. But I do have a login that works on all of the public-access machines in the (huge) undergraduate labs.


Enter clusterfuck, a subversive job-distribution tool I’ve been working on to solve this niggly little dilemma. It’s basically a tool for automating the process of ssh-ing into multiple machines and starting jobs on each of them, writ large. It’s installable via GitHub (I’ll push a gem out to Gemcutter when I have a nice, polished version ready) and configured rake-style via clusterfiles:

Clusterfuck::Task.new do |task|
    task.hosts = %w{clark asimov}
    task.jobs = ["hostname","hostname","hostname","hostname"]
    task.temp = "fragments"
    task.username = "SSHUSERNAME"
    task.password = "SSHPASSWORD"
    #task.debug = true
end

You kick off a batch of cluster jobs via the command clusterfuck (which takes a clusterfile as an argument, or defaults to clusterfile if it exists). This example file will run the command hostname four times on two machines (two each, unless one goes down or is slow to respond) and save the output of each command into a directory fragments. A bit of a toy example to be sure, but replacing the uninteresting hostname job with something more time-consuming and extending the list of hosts to, say, several dozen idle machines can get large, decomposable jobs completed in a fraction of the time necessary to run ‘em on a single machine.
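
That “two each, unless one goes down or is slow to respond” behaviour is what you would expect from a shared work queue: each host pulls the next job whenever it is free, so quicker hosts simply end up doing more. Below is a minimal Ruby sketch of that idea. To be clear, this is not clusterfuck’s actual code; distribute is a hypothetical name, and the block passed to it stands in for the real SSH call so that the sketch runs locally.

```ruby
# Hypothetical sketch: hand out jobs to hosts from a shared queue.
# The block receives (host, job) and returns that job's output; a real
# tool would run the job over SSH, here it is a local stand-in.
def distribute(hosts, jobs)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [index, job] }
  queue.close                      # no more jobs: pop returns nil once empty
  results = Array.new(jobs.size)

  workers = hosts.map do |host|
    Thread.new do
      # Each host keeps taking jobs until the queue runs dry.
      while (item = queue.pop)
        index, job = item
        results[index] = { host: host, output: yield(host, job) }
      end
    end
  end
  workers.each(&:join)
  results
end

results = distribute(%w[clark asimov], ["hostname"] * 4) do |host, job|
  "#{job} ran on #{host}"          # stand-in for the remote command
end
results.each { |r| puts r[:output] }
```

Which host ends up with which job isn't fixed in advance, and that is rather the point: a slow or dead machine just takes fewer jobs off the queue.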

Why “clusterfuck”? The early versions of this tool were, to say the least, somewhat unreliable, and had a frustrating tendency to leave CPU-intensive processes floating around. Needless to say, the network admins were not happy about this state of affairs, and the entire thing turned into a giant, well, you get the idea.