Will >> Will's blog

purpose: Will Kahn-Greene's blog of Miro, PyBlosxom, Python, GNU/Linux, random content, PyBlosxom, Miro, and other projects mixed in there ad hoc, half-baked, and with a twist of lemon

Wed, 14 May 2014

Fiddling with Kibana

I just kicked off a script that's going to take around 4 hours to complete mostly because the API it's running against doesn't want me doing more than 60 requests/minute. Given I've got like 13k requests to do, that takes a while.

I'm (ab)using Elasticsearch to store the data from my script so that I can analyze it more easily--terms facet is pretty handy here.

Given that I've got some free time now, I spent 5 minutes setting up Kibana.

Steps:

  1. download the tarball
  2. untar it into a directory
  3. edit kibana-3.0.1/config.js to point to my local Elasticsearch cluster (the defaults were fine, so I could have skipped this step)
  4. cd kibana-3.0.1/ and run python -m SimpleHTTPServer 5000 (I'm using a Python-y thing here, but you can use any web-server)
  5. point my browser to http://localhost:5000

Now I'm using Kibana.

Now that I've got it working, first thing I do is click on the cog in the upper right hand corner, click on the Index tab and change the index to the one I wanted to look at. Now I'm looking at the data my script is producing.

The Kibana site says Kibana excels at timestamped data, but I think it's helpful for what I'm looking at now despite it not being timestamped. I get immediate terms facets on the fields for the doc type I'm looking at. I can run queries, pick specific columns, reorder, do graphs, save my dashboard to look at later, etc.

If you're doing Elasticsearch stuff, it's worth looking at if only to give you another tool to look at data with.

Fri, 10 May 2013

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is temporal---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.

Sun, 15 Apr 2012

mozilla status: April 15th, 2012

I haven't had time to blog much in the last few months. At work, I've been spending all my time with elasticsearch, elasticutils, and SUMO bug fixing. I've been working on the conversion from Sphinx search to elasticsearch for SUMO since I started at Mozilla, but I've only recently felt like I'm really getting the hang of it. There are a bunch of elasticutils-related things I want to blog about, but those will come in fugure entries.

In my spare time, I've been working on richard. This project has nothing to do with Richard of air mozilla fame, but rather is a video indexing web application. It's the software that runs pyvideo.org.

pyvideo.org has the distinction of being the first Django application I've built from the ground up. That distinction is both a virtue (yay for first apps!) and a vice (boo for silly things I did when doing it!).

The one thing I did that I'm really proud of is that when building the software, I knew I needed help if it was to succeed and thus I worked to make it easy and inviting for contributors to get involved:

The end result of that is that there are 4 contributors to richard including myself and one of them is very active.

Asheesh did a talk at LibrePlanet 2012 that mentioned Mako's power law of contributions to open source projects. The gist of it is that most open source projects only ever have one contributor. [3]

Well, I've got 5 on my video index web application software that I "launched" a month ago. I'm feeling good about that.

[1]Several of my friends point out that GitHub kind of takes the D out of DVCS.
[2]Though didn't have any tests when I "launched".
[3]I may fix this paragraph after Asheesh corrects me.