Will >> Will's blog
Wed, 14 May 2014
I just kicked off a script that's going to take around 4 hours to complete, mostly because the API it's running against doesn't want me doing more than 60 requests/minute. With about 13k requests to do, that takes a while.
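The arithmetic behind that estimate, plus a throttled loop, can be sketched as below. This is a minimal sketch, not the actual script: `fetch` is a hypothetical stand-in for the real API call, and the function names are made up for illustration.

```python
import time


def estimated_hours(total_requests, per_minute):
    """Return how many hours a run takes at a fixed request rate."""
    return total_requests / per_minute / 60.0


def run_throttled(items, fetch, per_minute=60,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch(item) for each item, starting at most per_minute calls a minute.

    fetch is a hypothetical callable standing in for the real API request.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    interval = 60.0 / per_minute
    results = []
    next_allowed = clock()
    for item in items:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        # The next call may start no sooner than one interval after this one.
        next_allowed = clock() + interval
        results.append(fetch(item))
    return results
```

With these numbers, `estimated_hours(13000, 60)` comes out a bit over 3.6 hours, which lines up with "around 4 hours" once you add per-request latency.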
I'm (ab)using Elasticsearch to store the data from my script so that I can analyze it more easily--terms facet is pretty handy here.
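For anyone who hasn't used one, a terms facet request body looks roughly like the sketch below. It uses the facets API of that era (facets were later deprecated in favor of aggregations). The field name `status` and the helper name are hypothetical; the real index and fields aren't shown in this post.

```python
import json


def terms_facet_body(field, size=10):
    """Build a search body that counts the most common values of a field."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # only the facet counts are wanted, not the matching docs
        "facets": {
            # The facet is named after the field purely for convenience.
            field: {"terms": {"field": field, "size": size}},
        },
    }


# "status" is a hypothetical field name used for illustration.
body = terms_facet_body("status")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to the index's `_search` endpoint; the response then carries a list of terms and their counts under the facet's name.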
Given that I've got some free time now, I spent 5 minutes setting up Kibana.
- download the tarball
- untar it into a directory
- edit kibana-3.0.1/config.js to point to my local Elasticsearch cluster (the defaults were fine, so I could have skipped this step)
- cd kibana-3.0.1/ and run python -m SimpleHTTPServer 5000 (I'm using a Python-y thing here, but you can use any web-server)
- point my browser to http://localhost:5000
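A note on the web-server step: `python -m SimpleHTTPServer` is the Python 2 spelling. On Python 3 the command-line equivalent is `python3 -m http.server 5000`. If you'd rather start it from a script, something like the following sketch should work on Python 3.7 or later. The function name `serve_directory` is made up for this example, and it assumes the Kibana files live in whatever directory you pass in.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=5000, host="127.0.0.1"):
    """Serve the files in `directory` over HTTP from a background thread.

    Returns the server object; call shutdown() on it to stop serving.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `serve_directory("kibana-3.0.1")` and then browsing to http://localhost:5000 would mirror the steps above. Any static file server does the same job, since Kibana 3 is just HTML and JavaScript talking to Elasticsearch from the browser.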
Now I'm using Kibana.
Now that I've got it working, the first thing I do is click on the cog in the upper right-hand corner, click on the Index tab, and change the index to the one I want to look at. Now I'm looking at the data my script is producing.
The Kibana site says Kibana excels at timestamped data, but I think it's helpful for what I'm looking at now despite it not being timestamped. I get immediate terms facets on the fields for the doc type I'm looking at. I can run queries, pick specific columns, reorder, do graphs, save my dashboard to look at later, etc.
If you're doing Elasticsearch stuff, it's worth looking at if only to give you another tool to look at data with.
Fri, 10 May 2013
I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is time-sensitive: it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.
Thus I decided to write a series of blog posts.
This one covers indexing. Later ones will cover mappings, searching and other things.
It's also long, rambling and contains code. The rest is after the break.
Sun, 15 Apr 2012
I haven't had time to blog much in the last few months. At work, I've been spending all my time with Elasticsearch, ElasticUtils, and SUMO bug fixing. I've been working on the conversion from Sphinx search to Elasticsearch for SUMO since I started at Mozilla, but I've only recently felt like I'm really getting the hang of it. There are a bunch of ElasticUtils-related things I want to blog about, but those will come in future entries.
In my spare time, I've been working on richard. This project has nothing to do with Richard of Air Mozilla fame, but rather is a video indexing web application. It's the software that runs pyvideo.org.
pyvideo.org has the distinction of being the first Django application I've built from the ground up. That distinction is both a virtue (yay for first apps!) and a vice (boo for silly things I did when doing it!).
The one thing I did that I'm really proud of is that when building the software, I knew I needed help if it was to succeed and thus I worked to make it easy and inviting for contributors to get involved:
- I wrote documentation: license file, README, documentation covering how to install it for hacking, how to contribute, where to find me, ...
- I parked the code on GitHub to make it easier for people to access. 
- I made sure there were a series of issues in the issue tracker that showed the next round of things that needed to be done.
- I made sure I had an IRC channel and that people knew where to find me to ask questions.
- I quickly got the documentation built on ReadTheDocs.
- I had a test infrastructure set up. 
- I respond to everyone who sends an email, creates a pull request, writes an issue, says hi on IRC, ...
The end result of that is that there are 4 contributors to richard including myself and one of them is very active.
Well, I've got 5 on my video index web application software that I "launched" a month ago. I'm feeling good about that.
Several of my friends point out that GitHub kind of takes the D out of DVCS.
Though I didn't have any tests when I "launched".
I may fix this paragraph after Asheesh corrects me.