Will >> Will's blog

purpose: Will Kahn-Greene's blog of Miro, PyBlosxom, Python, GNU/Linux, random content, PyBlosxom, Miro, and other projects mixed in there ad hoc, half-baked, and with a twist of lemon

Fri, 10 May 2013

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is temporal---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.

Sun, 15 Apr 2012

mozilla status: April 15th, 2012

I haven't had time to blog much in the last few months. At work, I've been spending all my time with elasticsearch, elasticutils, and SUMO bug fixing. I've been working on the conversion from Sphinx search to elasticsearch for SUMO since I started at Mozilla, but I've only recently felt like I'm really getting the hang of it. There are a bunch of elasticutils-related things I want to blog about, but those will come in fugure entries.

In my spare time, I've been working on richard. This project has nothing to do with Richard of air mozilla fame, but rather is a video indexing web application. It's the software that runs pyvideo.org.

pyvideo.org has the distinction of being the first Django application I've built from the ground up. That distinction is both a virtue (yay for first apps!) and a vice (boo for silly things I did when doing it!).

The one thing I did that I'm really proud of is that when building the software, I knew I needed help if it was to succeed and thus I worked to make it easy and inviting for contributors to get involved:

The end result of that is that there are 4 contributors to richard including myself and one of them is very active.

Asheesh did a talk at LibrePlanet 2012 that mentioned Mako's power law of contributions to open source projects. The gist of it is that most open source projects only ever have one contributor. [3]

Well, I've got 5 on my video index web application software that I "launched" a month ago. I'm feeling good about that.

[1]Several of my friends point out that GitHub kind of takes the D out of DVCS.
[2]Though didn't have any tests when I "launched".
[3]I may fix this paragraph after Asheesh corrects me.