Monday, October 17, 2011

Browsing man pages in vim

So I've been playing around a lot with my .vimrc lately, and this is one of the more useful things I've added. I have forgotten where it came from exactly, but here it is:

source $VIMRUNTIME/ftplugin/man.vim
nmap K :Man <cword><CR>

What's it good for? The default key binding for K (that's "shift-k" I guess) in vim is to look up the word under the cursor using the man command. The sad thing about this process is that vim gets replaced by less (or whatever pager you happen to be using), and that once you're done reading the page, you have to press "Enter" one additional time to get back into vim and back to whatever you were doing. Kinda breaks your flow, you know?

Once you've added the two lines above to your .vimrc things are quite different. When you hit K, the man page opens as a new split window inside of vim so you're staying in the same environment. All the usual binds for switching between windows work, so you can keep the man page open while going back to your code. Better yet, the man page will be "syntax highlighted" using different colors for headings, text, and (you guessed it) references to other man pages. And the best thing? You can browse man pages the same way you browse tags: use "ctrl-]" to open another man page and use "ctrl-t" to "go back" to the previous one.

Now that's how man pages were supposed to be integrated with your editor. Very nice indeed... :-D

There's one small problem that I have not been able to work around yet: The original K could be preceded by the section number to look in, but this won't work in the replacement above. I am not enough of a vim hacker yet to add that capability. Shame on me?

Update: Actually, I forced myself to learn just enough of vimscript to cobble together something ugly for section numbers:

" experimental hack to get section numbers to work as well

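function! ManWrapper(n, w)
  " A count typed before the mapping arrives as a line range, so a:n is
  " the last line of that range; subtract the cursor line to recover it.
  if a:n > 0
    let cnt = a:n-line(".")+1
    execute "Man" cnt a:w
  else
    execute "Man" a:w
  endif
endfunction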

com -count=0 -nargs=+ CMan :call ManWrapper(<count>, <f-args>)
nmap X :CMan <cword><cr>

Yes, I know, it's quite horrific! If you know this dreadful language better, please tell me how to rewrite this cleanly.

Tuesday, October 11, 2011

Random Design Patterns, Part 1

I have no special reason to start writing these, except that I've been re-reading some patterns stuff recently. And while they are "warm" in my brain, I might as well try to write them down as that always seems to help me "solidify" things. None of the patterns I'll write about are new in any way, so feel free to skip these posts, you pattern gurus!

First pattern, simple as can be: Null Object. Say you have some operation that returns a Sprite object that you then do something to. What if there is no sprite? You could return NULL or nil or None or whatever your language of choice calls the thing. But then you have to check the returned value:
sprite = some_operation()
if sprite:
    sprite.do_something()  # placeholder for whatever you do to the sprite
If instead you return a Sprite instance that simply doesn't do anything, your code becomes a little more straightforward:
sprite = some_operation()
sprite.do_something()  # placeholder again; no check needed
Not exactly a big deal, but of course it could add up to something more significant if you were using this in a more complicated way.

It's certainly not a good idea to always ignore the fact that you didn't find the sprite in question; for example, you don't want to keep inserting NullSprite objects into a list over and over. So when you do care, you need a way of telling that it's a real sprite. One way is to guarantee that only a single NullSprite is ever created and to make that one globally accessible. Then, in places where you care, you can say this:
sprite = some_operation()
if sprite is not Sprite.NULL:
    sprites.append(sprite)  # placeholder: only real sprites get stored
Whether Null Object is particularly useful therefore depends on how often you care versus how often you don't care (but still have to check if you use the language's builtin version of "no such object"). As with all design patterns: Think before you apply the pattern!
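
Here is a minimal Python sketch of the whole idea, both the "don't care" and the "do care" case. The names `move`, `find_sprite` and `SPRITES` are made up for illustration; only `Sprite`, `NullSprite` and `Sprite.NULL` come from the text above.

```python
class Sprite:
    """A drawable thing; only the parts needed for this sketch."""

    NULL = None  # replaced below by the single shared null sprite

    def __init__(self, name):
        self.name = name
        self.x = 0

    def move(self, dx):
        self.x += dx


class NullSprite(Sprite):
    """Stands in for 'no sprite': same interface, does nothing."""

    def __init__(self):
        Sprite.__init__(self, "<null>")

    def move(self, dx):
        pass  # deliberately ignore the request


Sprite.NULL = NullSprite()  # the one globally accessible instance

SPRITES = {"hero": Sprite("hero")}  # hypothetical sprite registry


def find_sprite(name):
    """Hypothetical lookup that never returns None."""
    return SPRITES.get(name, Sprite.NULL)


# Callers that don't care just go ahead.
find_sprite("hero").move(5)
find_sprite("ghost").move(5)  # no such sprite, so nothing happens

# Callers that do care compare against the shared instance.
visible = []
for name in ("hero", "ghost"):
    sprite = find_sprite(name)
    if sprite is not Sprite.NULL:
        visible.append(sprite)
```

The identity check with `is not` only works because there is exactly one `NullSprite`; if code elsewhere created more of them, the check would quietly let them through.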

Monday, October 10, 2011

Sets of Dictionaries in Python

I may come to regret this post in the future, we'll see. So I was hacking on some sysadmin tool that collects data. Each item is a dictionary of various things, and I had all those dictionaries in a list. Wait! What if I parse another piece of data that results in an identical dictionary? I don't want to keep growing the list to infinity with duplicates, do I? So without much thought I replaced the list with a set, but that doesn't work:
>>> set([{}])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
Of course this makes sense: Python implements both sets and dictionaries as hash tables, and you cannot safely use something mutable as a key in a hash table. (If you don't see why that's so, think harder.) But sensible or not, what I certainly don't want to do is search my list of dictionaries for duplicates before every single insertion! So I came up with a little hack to make dictionaries hashable:
def hash_string_dict(obj):
    """
    Return a unique-ish string for a dictionary mapping
    string-ish things to string-ish things.
    """
    import hashlib
    # Sort the (key, value) pairs so the digest doesn't depend on
    # ordering and every key stays paired with its own value.
    items = sorted((str(k), str(v)) for k, v in obj.items())
    # sha1 wants bytes, so encode the string form of the sorted pairs.
    digest = hashlib.sha1(repr(items).encode("utf-8")).hexdigest()
    return digest
Alright, so the dictionaries are not really hashable, instead I produce a hashable digest of a dictionary's contents. You can now give me a lecture on how this is not very efficient, and some part of me would agree. However, given a long enough list of dictionaries, the time I spend on computing these digests will be less than looking through the whole list for a duplicate.

So instead of a list of dictionaries or a set of dictionaries, I end up with a dictionary of dictionaries: The key in the outer dictionary is the digest of the inner dictionary. If a duplicate comes along it'll produce an identical digest and I can forget about it after one (expected!) constant time lookup.
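
A minimal sketch of that outer dictionary, assuming a digest function along the lines described above. `digest_of` and `add_unique` are names invented for this example, and the host/disk records are made-up data.

```python
import hashlib


def digest_of(d):
    """Digest of a dict's sorted (key, value) pairs, all taken as strings."""
    items = sorted((str(k), str(v)) for k, v in d.items())
    return hashlib.sha1(repr(items).encode("utf-8")).hexdigest()


def add_unique(store, d):
    """Insert d into store unless an equal-looking dict is already there.

    Returns True if d was new, False if it was a duplicate.
    """
    key = digest_of(d)
    if key in store:
        return False
    store[key] = d
    return True


seen = {}  # outer dictionary: digest -> inner dictionary
add_unique(seen, {"host": "alpha", "disk": "sda"})
add_unique(seen, {"disk": "sda", "host": "alpha"})  # same content, other order
add_unique(seen, {"host": "beta", "disk": "sda"})

records = list(seen.values())  # the distinct dictionaries collected so far
```

Each insertion costs one digest computation plus one dictionary lookup, regardless of how many records are already stored.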

Of course all of this depends crucially on my dictionaries being immutable as far as my application is concerned. Python itself doesn't have the luxury of wondering about this, it has to "worst-case" it and assume all dictionaries are mutable, period. I wonder: Should I package this as a container class? :-D

Update: Note that it's sort of important that the keys and values you have in that dictionary produce "useful" string representations. In other words, don't apply this trick without thinking through the kind of data you're pushing around and whether the digest has a reasonable chance of being accurate enough for duplicate detection.
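
Two ways this can go wrong, sketched in Python. `digest_of` is a hypothetical stand-in for the digest function, and `Point` is a made-up class used only to show an unhelpful string representation.

```python
import hashlib


def digest_of(d):
    """Digest of a dict's sorted (key, value) pairs, all taken as strings."""
    items = sorted((str(k), str(v)) for k, v in d.items())
    return hashlib.sha1(repr(items).encode("utf-8")).hexdigest()


class Point:
    """No __str__ or __repr__, so str() falls back to the default
    '<... object at 0x...>' form, which includes the memory address."""

    def __init__(self, x, y):
        self.x, self.y = x, y


# False duplicate: the integer 1 and the string "1" look the same as strings.
false_duplicate = digest_of({"port": 1}) == digest_of({"port": "1"})

# Missed duplicate: two live Points with equal fields still print differently,
# because each one's default string form contains its own address.
p, q = Point(1, 2), Point(1, 2)
missed_duplicate = digest_of({"pos": p}) != digest_of({"pos": q})
```

Both flags come out true here, which is exactly the kind of thing to check for before trusting the digest with your own data.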

Sunday, October 9, 2011

And a new blog?

I used to have a tiny blog of my own at the now-defunct which was run by our local student ACM chapter. For some reason JHU decided that they want the name back to run their own blogging service, so the students (and me, and a whole bunch of other people who were using their service) got booted off that URL. Of course JHU never did set up their own service, sigh.

Anyway, apparently the students got frustrated enough to not re-open the blogging service under a different name, so for the past year or so I've not had a blog. Fast-forward to yesterday: I needed some information from one of my old blog posts, so I decided to request the raw data (thanks for the SQL dump Rich!) and to start over somewhere else. So here I am.

Not sure how long it'll take me to import the old posts, but I guess eventually you'll be able to read them here. Not that they were terribly interesting before, but hey, gotta have a blog, no?