Will's blog

purpose: Will Kahn-Greene's blog of Miro, PyBlosxom, Python, GNU/Linux, random content, PyBlosxom, Miro, and other projects mixed in there ad hoc, half-baked, and with a twist of lemon

[home | blog home]

Wed, 20 Mar 2013

ElasticUtils sprint at PyCon US 2013

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

PyCon US 2013 sprint

I was only at the sprints for a single day. Rob and I spent some time working on elasticutils. Several good things came out of that:

  1. Rob wrote up an elasticutils Django middleware which throws a 501 or 503 page if an unhandled pyelasticsearch or requests exception is raised
  2. I fixed the Django tasks, added a test, and updated the documentation
  3. I cleaned up the Django ElasticSearchTestCase class
  4. I spent a bunch of time thinking about queries, syntax and functionality

Someone on IRC asked whe the next version of elasticutils will go out. I have no schedule right now, but I think it's important to let the code get used by projects that don't mind being bleeding edge and bake for a bit. The code in master tip right now is 0.7.dev and the big change since 0.6 is that we switched from pyes to pyelasticsearch. That's a big change---the more baking it does, the better.

Having said that, a release depends mostly on how much free time I have in the near future. I'm about to lose all free time for a bit, so my guess is that we won't see a 0.7 release until this summer unless there's a compelling reason to push one out.

In the meantime, I'm actively maintaining the v0.5 and v0.6 branches. I'd like to stop maintaining the v0.5 branch, but need to get Mozillians and AMO off of it first.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Fri, 08 Feb 2013

SUMO: Now ... in pirate!

A while back, I wrote a post about poxx.py which talked about a script I based on Ned Batchelder's poxx.py script and overhauled to provide a faux "Swedish Chef" translation of Miro strings allowing me to test localizations of the application.

The transform from English to "Swedish Chef" had the following four impotant properties:

  1. the output is vaguely readable
  2. the output is longer than the input which helps us find ui issues
  3. the output is clearly distinguished from English which helps us find strings that aren't getting translated
  4. the output is mildly amusing which is sometimes important in dark times

Back in August, I made some changes and pulled it into fjord. This helped us suss out localization issues on a new site. However, I wasn't really happy with it. Amongst other things, I always wondered if "Swedish Chef" was kind of culturally insensitive.

A couple of weeks ago, I overhauled poxx.py again. This time, PIRATE! It continues to have the four properties I think are important for a test locale.

We're using it now for SUMO development. It's the grog to your Jolly Roger:

SUMO -- In Pirate!

SUMO -- In Pirate!

We're using this script on both SUMO and Fjord now. You can use it for your site, too! The code is at https://github.com/mozilla/kitsune/tree/master/scripts/.

If you see any problems with it, toss me a message in a bottle.

Note

This localization is only available in development environments. Unlike Miro where we shipped the Swedish Chef translation (or used to---I'm not sure if they do anymore), you cannot see this on the -dev, -stage or -prod SUMO sites.

Mon, 31 Dec 2012

SUMO and Input: 2012

This was my first full year at Mozilla and it was intense. I essentially worked on four projects: SUMO, Input, ElasticUtils and Gaia. This blog post talks about the first two which are worked on by the James' Rifles SUMINPUT Megalosaur team.

We accomplished a lot on SUMO this year. I spent a couple of hours last week throwing together a rough "year in review" script that looked at Bugzilla and git and crunched some numbers:

Twas the year: 2012
===================


Bugzilla
========

Bugs created: 938

              rrosario : 201
               a.topal : 188
                willkg : 108
           scoobidiver : 51
               igarcia : 41
                mverdi : 36
      swarnavasengupta : 30
                 james : 29
                  bram : 19
            tobbi.bugs : 17

Bugs resolved: 1025

              rrosario : 335
                       : WORKSFORME 18
                       :    INVALID 16
                       :  DUPLICATE 23
                       :    WONTFIX 7
                       :      FIXED 263
                       : INCOMPLETE 8
               a.topal : 182
                       : WORKSFORME 36
                       :    INVALID 41
                       :  DUPLICATE 11
                       :    WONTFIX 70
                       :      FIXED 21
                       : INCOMPLETE 3
                willkg : 131
                       :  DUPLICATE 6
                       :      FIXED 110
                       : WORKSFORME 2
                       :    WONTFIX 11
                       :    INVALID 2
                rdalal : 84
                       :      FIXED 84
                 james : 51
                       : WORKSFORME 6
                       :    INVALID 5
                       :  DUPLICATE 3
                       :    WONTFIX 15
                       :      FIXED 14
                       : INCOMPLETE 8
               mcooper : 37
                       :      FIXED 36
                       :    INVALID 1
            tobbi.bugs : 29
                       :      FIXED 29
             tgavankar : 28
                       :    WONTFIX 1
                       : WORKSFORME 1
                       :      FIXED 26
           scoobidiver : 28
                       :      FIXED 4
                       :  DUPLICATE 4
                       : WORKSFORME 11
                       :    WONTFIX 3
                       :    INVALID 6
               bmo2010 : 13
                       :      FIXED 1
                       :  DUPLICATE 3
                       : WORKSFORME 3
                       :    INVALID 6

            INCOMPLETE : 21
             DUPLICATE : 61
            WORKSFORME : 82
               INVALID : 91
               WONTFIX : 117
                 FIXED : 653

git
===

Total commits: 916

         Ricky Rosario : 430
      Will Kahn-Greene : 192
           Rehan Dalal : 98
           Mike Cooper : 44
             Erik Rose : 34
                 Tobbi : 29
        Tanay Gavankar : 23
           Kadir Topal : 11
             Tim Watts : 10
         Berker Peksag : 9
           James Socol : 7
            Victor Neo : 6
      Cesar Carruitero : 5
           David Lilly : 4
                  Ibai : 3
        Isac Lagerblad : 2
                 icaaq : 1
           TylerDowner : 1
              browning : 1
         ricky rosario : 1
    Anatoli Papirovski : 1
     Clauber Stipkovic : 1
          Jason Thomas : 1
                atopal : 1
      Florin Strugariu : 1

There are some interesting bits in there:

  1. Ricky does a lot of work! Holy cow!

  2. There were 23 people who contributed code to Kitsune (the SUMO codebase) this year. Of those, about half are volunteer contributors.

    Compare with 2011, we had 19 people who contributed to the code base and less than half were volunteer contributors.

  3. We resolved more bugs than we created in 2012. We did that in 2011 as well, so that's two years in a row. I've never seen that happen before on a project I work on.

The codebase is pretty different now than it was at the beginning of the year. I helped with the following semi-massive overhauls:

  1. The push for more metrics and dashboards to view the numbers.
  2. The switch from Sphinx to ElasticSearch.
  3. The new Information Architecture which affected browsing and searching across the site.
  4. The site redesign which covered both the desktop and mobile versions of the site.
  5. The upgrade to Django 1.4.
  6. The switch from arecibo to sentry.
  7. The push to switch from fixtures to model makers for all our tests.
  8. The switch from weekly deployments on Tuesdays to deploying whenever we want. Continuous deployment is fantastic.
  9. Started switching the whole site from Webtrends to Google Analytics. I saw Ricky write up a bunch of bugs to finish up that work, so I'll say it's in progress.
  10. During the redesign, Rehan redid all the CSS and switched us to use LESS.
  11. I spun off some code I wrote for richard, then ported to Fjord, then improved into a project called django-eadred. That makes it a lot easier to generate sample data for a variety of purposes like new contributors, bootstrapping, and large random data sets.

On top of that, we did a lot of work on the documentation and making it easier to get to a working Kitsune development environment. We switched to a sprint-based work flow using Scrumbugz. We also nixed our daily checkin conference call for an IRC-based checkin system that we wrote called Standup.

It's been a big year.

For Input, it was a bigger year. We decided to abandon the old Input codebase (omfg yay) in favor of rewriting it from the ground up. The rewrite took a couple of months and then has sort of been sitting around waiting for a security review. In the meantime, we (actually, Mike did) fixed a bunch of issues with the old site code because that's what's currently in production.

Rewriting Input wouldn't have taken so long except that we did a lot of work fixing bugs in external libraries and updating Playdoh. That work definitely cut into our schedule, but it benefitted a bunch of other groups/people/sites, so that's good.

That's the gist of the year: it was a lot of work, but we accomplished a ton.

w00t for 2012!

Thu, 02 Aug 2012

ElasticUtils v0.4 released!

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

v0.4 released!

I released v0.4 a couple of days ago. This release adds new functionality, fixes some issues, adds more tests, and includes improved documentation.

On top of that, we removed the requirement for Django and moved the Django-aiding components into elasticutils.contrib.django. I personally like this because it makes it much easier to write test scripts to see how things react.

For the complete list of what's new, What's new in Version 0.4.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Fri, 18 May 2012

elasticutils status -- May 18th, 2012

A few months ago, I "took over" maintenance of elasticutils. We use it in SUMO as the API for building search queries with elasticsearch.

One of the first things I did was spend some time figuring out whether we should keep working on elasticutils at all. django-haystack also provides a django-ish API for working with elasticsearch. Why have two libraries that at a high level do the same thing?

The thing is that they're not exactly the same. django-haystack is really great and supports a variety of backends for search, elasticsearch being one of them. Right now, it only has support for elasticsearch in 2.0 which is in either an alpha or beta state now (their web-site could use some updates). However, because it supports a bunch of backends, it only supports functionality that works across all of them.

elasticutils, on the other hand, is elasticsearch-specific. As elasticsearch adds functionality, we can, too. That's the compelling reason to keep working on this library. However, django-haystack has some awesome ideas that we'd like to implement in elasticutils, too. This will fix some sharp edges in elasticutils, but also make it much easier for projects to switch from one to the other.

Currently, elasticutils only handles the query side of things. django-haystack handles that, but also has an API for defining mappings, indexing, and all the other things you need with a search system.

Thus, Rob Hudson and I are going to embrace and extend elasticutils to:

  1. fix the current situation where it seems every elasticutils user is actually using their own branch with additional functionality in it (ew!)
  2. implement the rest of the things you need with a search system
  3. document the things we've learned while working with elasticutils because at a minimum, it seems most of the Mozilla projects that use elasticutils bumped into, spent time on, and solved the same problems---that's a huge waste of time and a failure on my part

One of the things users of a library need is for the library to be a mature project with releases, tagged version, documentation, tests, stability, reliability, reproduceability, communication, community and all that. Thus, I'm also going to spend some time to turn this into a real project. Towards that end, I created #elasticutils on irc.mozilla.org where we'll talk dirty elasticutils stuff. If we end up with more people pitching in, we'll create a mailing list. But for now, IRC will do.

My next step is to spend a little time cleaning up what's in the master branch, then tag and release a baseline version.

After that, I'm going to spend time identifying, thinking about and merging in the divergent functionality in the various branches while Rob works on continuing his imperative mapping work.

I think in a couple of months, we'll be in a better place and that'll make it easier for Mozilla projects and anyone else who wants to use elasticutils to use and contribute to it.

If you're a user of elasticutils, please come hang out with us! Let us know how we can better help you.

Wed, 02 May 2012

webdev workweek thoughts

I'm at the webdev work week in Santa Cruz, CA, USA this week. It's great to meet people I've been talking to for the last 6 months. It's also kind of nice to take a break from the SUMO sprints. I've been spending the time lifting my head and seeing what's been happening while I wasn't paying attention.

List of three things on my mind:

  1. I wish I had more time. There are so many people doing exciting things that I'd love to work on, too, but I just don't have enough time. It's hard to suppress the, "Wow---that looks awesome! I'll help you on that project!" reflex. I just don't have time. It's kind of a bummer. On the flip side, if I do discover myself with free time, I have a long (long) list of things I'm interested in.
  2. It's nice to talk about technical things. It's nice to make a (bad) joke about mime types and get a response. (An HTTP response! ha ha!)
  3. Everyone is really super and friendly. I don't think I've been in a group this big that's lacking a token jerk. This has caused me to idly wonder if maybe I'm the token jerk. If so, I hope someone leaves me a note so I can adjust accordingly.

Things I'm taking away (so far):

  1. I should dig up my compilers book and re-read it because I really liked studying compilers way back when.
  2. I screwed up with spreading elasticsearch knowledge. I keep having conversations about problems we've seen and solved in SUMO that other projects are having. Plus it seems like everyone has their own branch of elasticutils with differing features. I'm going to try to spend some time working on making the situation better both by documenting things and also improving elasticutils so we're not all maintaining our own versions.
  3. Need to spend more time learning things.


pyblosxom::1.5.3.wgkg

Copyright 1996 to 2013, Will Guaraldi Kahn-Greene, under the Creative Commons BY-SA 3.0 license

Creative Commons License
Will's Blog by William Kahn-Greene is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.