Will's blog

purpose: Will Kahn-Greene's blog of Miro, PyBlosxom, Python, GNU/Linux, random content, PyBlosxom, Miro, and other projects mixed in there ad hoc, half-baked, and with a twist of lemon

Fri, 10 May 2013

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is temporal---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.

Wed, 03 Apr 2013

pyvideo status: April 3rd, 2013

What is pyvideo.org

pyvideo.org is an index of Python-related conference and user-group videos on the Internet. Saw a session you liked and want to share it? It's likely you can find it, watch it, and share it with pyvideo.org.

Status

  • Videos for PyCon US 2013 are still going up. There are 115 posted and live now. There are around 30 that are waiting for presenters to look at the metadata and tell Carl whether the metadata is good or not. More on that later.

  • Several new people submitted patches to richard! Several of the patches were fixes to broken things they saw on pyvideo.org. I've applied the fixes to the site directly, but have been waiting on making any non-critical updates to the site until after things have cooled off. I think I'll do a site update in the next week or so.

  • PyData 2013 was recorded. When videos are posted, they'll be in the PyData category. I don't know what the posting schedule is.

  • I was contacted a couple of times by the inimitable Montréal Python to post their videos. They're going to test out steve, which is the tool I've been writing for the last 6 months to make it possible for other folks to generate the video metadata needed by pyvideo.org.

    I eagerly look forward to their progress and to their videos getting on the site.

    If it works out well, I'll blog more about steve and look for volunteers to use steve to generate the video metadata for the ever increasing backlog.

  • Several people are gittip'ing me. It's not a lot of money, but that and the many emails I've gotten over the last few weeks about the site have been really great. I work on pyvideo.org in my free time, of which I don't have a lot. It's nice to know that prioritizing pyvideo.org work over other things helps you.

That's the gist of things!

Most of the PyCon US 2013 videos that aren't live are waiting for presenters to tell Carl at NextDayVideo (carl at nextdayvideo dot com) whether the metadata is good.

  • If you see your name on this list and you've told Carl the metadata is fine already, please send him a friendly reminder.
  • If you see your name on this list and you haven't told Carl anything, please send him a "yes, this is great!" or the list of things you need corrected.
  • If you see a friend on this list, tell your friend to do one of the above.

I'll update this list as I'm aware of changes. However, I don't work for NextDayVideo, so it's entirely possible my list is not current and/or there are errors. If so, please let me know.

Here's the list (last updated 2013-04-12 7:13am -0400):

  • Digital signal processing through speech, hearing, and Python -- Mel Chua
  • Faster Python Programs through Optimization -- Mike Müller
  • Python beyond the CPU -- Andy Terrel, Travis Oliphant, Mark Florisson
  • Code to Cloud in under 45 minutes -- John Wetherill
  • A Gentle Introduction to Computer Vision -- Katherine Scott, Anthony Oliver
  • Documenting Your Project in Sphinx -- Brandon Rhodes
  • Contribute with me! Getting started with open source development -- Jessica McKellar
  • Intermediate Twisted: Test-Driven Networking Software -- Itamar Turner-Trauring
  • Gittip: Inspiring Generosity -- Chad Whitacre
  • The Magic of Metaprogramming -- Jeff Rush
  • You can be a speaker at PyCon! -- Anna Ravenscroft
  • sys._current_frames(): Take real-time x-rays of your software for fun and performance -- Leonardo Rochael
  • Planning and Tending the Garden: The Future of Early Childhood Python Education -- Kurt Grandis
  • powerful pyramid features -- Carlos de la Guardia
  • Python for Robotics and Hardware Control -- Jonathan Foote
  • Copyright and You -- Frank Siler
  • Chef: Automating web application infrastructure -- Kate Heddleston
  • Numba: A Dynamic Python compiler for Science -- Travis Oliphant, Siu Kwan Lam, Mark Florisson
  • Integrating Jython with Java -- Jim Baker, Shashank Bharadwaj
  • Iteration & Generators: the Python Way -- Luciano Ramalho
  • ApplePy: An Apple ][ emulator in Python -- James Tauber
  • Distributed Coordination with Python -- Ben Bangert
  • Become a logging expert in 30 minutes -- Gavin M. Roy
  • PyNES: Python programming for Nintendo 8 bits -- Guto Maia
  • Purely Python Imaging with Pymaging -- Jonas Obrist
  • Namespaces in Python -- Eric Snow

These are all set now:

  • IPython in-depth: high-productivity interactive and parallel python -- Fernando Perez, Brian Granger, Min RK
  • Pyramid for Humans -- Paul Everitt
  • Learn Python Through Public Data Hacking -- David Beazley
  • Rethinking Errors: Learning from Scala and Go -- Bruce Eckel

Wed, 20 Mar 2013

ElasticUtils sprint at PyCon US 2013

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

PyCon US 2013 sprint

I was only at the sprints for a single day. Rob and I spent some time working on elasticutils. Several good things came out of that:

  1. Rob wrote up an elasticutils Django middleware which throws a 501 or 503 page if an unhandled pyelasticsearch or requests exception is raised
  2. I fixed the Django tasks, added a test, and updated the documentation
  3. I cleaned up the Django ElasticSearchTestCase class
  4. I spent a bunch of time thinking about queries, syntax and functionality
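To give a rough idea of what item 1 looks like, here is a minimal sketch of the middleware shape. It is not Rob's code. SearchBackendError and SearchErrorMiddleware are hypothetical names, the dict stands in for a Django HttpResponse, and only the 503 case is sketched. The one piece borrowed from Django is the process_exception hook, which returns a response to short-circuit error handling or None to let it continue.

```python
class SearchBackendError(Exception):
    """Hypothetical stand-in for a pyelasticsearch or requests exception."""


class SearchErrorMiddleware:
    """Sketch of a Django-style middleware that defines only process_exception."""

    # Exception types this middleware turns into a 503 (assumed list).
    handled = (SearchBackendError,)

    def process_exception(self, request, exception):
        if isinstance(exception, self.handled):
            # A real middleware would return an HttpResponse with status=503.
            return {"status": 503, "body": "Search is temporarily unavailable."}
        # Returning None lets the framework's normal exception handling continue.
        return None
```

The point of doing this in middleware is that individual views don't each need their own try/except around search calls.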

Someone on IRC asked when the next version of elasticutils will go out. I have no schedule right now, but I think it's important to let the code get used by projects that don't mind being bleeding edge and bake for a bit. The code in master tip right now is 0.7.dev and the big change since 0.6 is that we switched from pyes to pyelasticsearch. That's a big change---the more baking it does, the better.

Having said that, a release depends mostly on how much free time I have in the near future. I'm about to lose all free time for a bit, so my guess is that we won't see a 0.7 release until this summer unless there's a compelling reason to push one out.

In the meantime, I'm actively maintaining the v0.5 and v0.6 branches. I'd like to stop maintaining the v0.5 branch, but need to get Mozillians and AMO off of it first.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Tue, 19 Mar 2013

Adding Persona authentication to richard

tl;dr

This is a post covering my first-time experience with integrating Persona authentication into my Django project named richard. I briefly cover why I did it, what I used, and list the commits I did the work in as an example of how it can be done. I hope this helps others implement it on their sites.

why

A month ago, I added Persona authentication support to richard. This allowed me to use Persona authentication for pyvideo.org. I did this for several reasons:

  1. I wanted to try it out and see how well it worked on a small Django site (tl;dr works great---I'll use this on all my sites)
  2. I wanted people to authenticate with an email-based identity rather than a social network based identity
  3. I wanted to allow people to create accounts on pyvideo.org, but didn't want to deal with the responsibility of protecting things like passwords

So that's where I'm coming from.

how

I used django-browserid which gives you some JavaScript and a few template tags that make it easy to incorporate Persona authentication into a Django app.

It took about 15 minutes to get it working. I've made some minor edits to the code since then and updated to v0.8 of django-browserid. All told, I think I've spent a couple of hours on Persona implementation.

In the process of doing that work, I hit a few minor issues, created some pull requests, helped with other pull requests and became one of the maintainers. Yay!

Here are the commits I did the work in. I figured the diffs might help you implement similar things on your sites:

That last commit updates to django-browserid master tip to pick up a fix to login failures if BROWSERID_CREATE_USER is False. That fix will be released in v0.8.1 soon.

further reading

The Mozilla Persona site helps understand why it exists and has a Developer FAQ.

The django-browserid docs are pretty good and walk through setting it up, advanced usage, and troubleshooting. I encourage you to read through them in full---it'll give you a better understanding of the pieces.

Dan Callahan did a talk at PyCon US 2013 on Persona. That's worth watching. It covers why Mozilla built it, how it works, and why it's important that it works that way. He also demos integrating it into sites and talks about using Persona authentication alongside other authentication methods.

If you're interested in adding Persona authentication to your Django site and need help, let me know.

Sat, 16 Feb 2013

Django Eadred v0.2 released! Django app for generating sample data.

Django Eadred gives you some scaffolding for generating sample data to make it easier for new contributors to get up and running quickly, bootstrapping required database data, and generating large amounts of random data for testing graphs and things like that.

For v0.2, I added some helper methods for generating names, email addresses, sentences and paragraphs. It's definitely the case that these helpers won't handle all use cases, but I think they'll help specific ones.
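To illustrate the kind of helper I mean, here is a standard-library-only sketch. These are not eadred's actual functions. The names fake_names, fake_emails, fake_sentence and fake_paragraph, the word list, and the example.com domain are all made up for illustration.

```python
import itertools
import random

WORDS = ["lemon", "goblin", "video", "index", "search", "python", "sprint", "patch"]


def fake_names(names=("alex", "bri", "casey")):
    """Yield names forever, adding a numeric suffix after each full pass."""
    for cycle in itertools.count():
        for name in names:
            yield name if cycle == 0 else "%s%d" % (name, cycle)


def fake_emails(names=("alex", "bri", "casey"), domain="example.com"):
    """Yield one email address per generated name."""
    for name in fake_names(names):
        yield "%s@%s" % (name, domain)


def fake_sentence(rng, min_words=3, max_words=8):
    """Build one capitalized sentence of random words ending in a period."""
    count = rng.randint(min_words, max_words)
    words = [rng.choice(WORDS) for _ in range(count)]
    return " ".join(words).capitalize() + "."


def fake_paragraph(rng, sentences=3):
    """Join several random sentences into a paragraph."""
    return " ".join(fake_sentence(rng) for _ in range(sentences))
```

Passing in a seeded random.Random instance keeps the generated text repeatable, which is handy when the sample data feeds tests.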

There are no backwards-compatibility problems with v0.1.

To update, do:

pip install -U eadred

Sun, 03 Feb 2013

pyvideo status: February 3rd, 2013

What is pyvideo.org

pyvideo.org is an index of Python-related conference and user-group videos on the Internet. Saw a session you liked and want to share it? It's likely you can find it, watch it, and share it with pyvideo.org.

Status

  • Videos for PyCon AU 2012 are posted.

    That's probably the last conference I'm going to do on my own. More about that later.

  • I've made some big changes to richard. For one, formatted fields use Markdown instead of HTML now (yay!). I've improved the API. I've made a lot of layout tweaks and user interface improvements.

  • I pushed out steve v0.1 and then promptly made a bunch of fixes, tweaks and changes. So I need to do a new release soon. steve is the utility people can use to generate conference data for pyvideo.org. See the commandline chapter for details.

I've been working on getting steve and richard to the point where I'm neither doing all the work nor am I the bottleneck for work being done.

I still need to write up a blog post on how to use steve to generate JSON files for pyvideo.org. That will make it possible for anyone to add conference video.

I'm working on changing richard to allow for other people to edit video metadata. It'll continue to be curated, but this will make it possible for other people to help because there are like 1600 videos and the repository continues to grow and I'm just one man. I have some of this worked out on paper, but it needs to be implemented.

That's the current push. I'm hoping to have a lot of this done for PyCon 2013.

Mon, 05 Nov 2012

Donate to MediaGoblin, get a chance at free PyCon tickets!

MediaGoblin is currently running a fund-raising campaign to raise funds for Chris to work on MediaGoblin for the next year implementing federation, making it easier to install and use, and other features as well.

We just announced that the next 25 people who donate $200.00 or more will get a chance to get free tickets to PyCon.

That's pretty awesome!

If you've been waiting to support MediaGoblin, now's the time to do it!

Tue, 23 Oct 2012

Gaia: First week

For the next few months, I'm switching projects to help work on Gaia. I essentially started yesterday, but I'm still missing a bunch of pieces, so I haven't actually done any work. What I have done is spent time immersing myself in the project and trying to get my bearings.

Thus this blog post covers how I got my bearings so far.

Gaia is a project in heavy flux and moving fast. The state and stability of things change day to day. There are things that aren't documented. There are things that are documented that are out of date. There are dozens of etherpads, lists of bugs, wiki pages, and tips and tricks scattered around. This is the way it is currently. Even this will probably change.

However it's not all chaos and entropy. While a lot of things are in flux, some things stay the same. That's why I decided to write this blog post of things I think help get you up and running faster.

Note

This blog post definitely has a lifespan. If you're reading this in 2013, it's probably out of date.

On Monday

Read through these hacking wiki pages:

Join the #gaia and #b2g IRC channels on irc.mozilla.org.

Join the dev-gaia mailing list.

Fork and clone the Gaia github repository.

That seems like a short list, but take the time to catalog in your head all the things that are there.

On Tuesday

Go to the Gaia weekly meeting.

Mute and facemute yourself. Make sure to follow along in the Etherpad notes they link to. The meeting I went to used this Etherpad: https://etherpad.mozilla.org/gaia-meeting-notes

About 1/3 of the way down that pad, there's a list of components, who's working on them, their status, etc---that's current as of the time of this writing.

In going to the meeting and reading through the notes, you'll get a sense of who's who, who's working on what, what the current sprint priorities are, and you might also get an indication of where you can help out.

After that, work on getting Gaia working in the B2G desktop nightly build.

The B2G desktop periodically has stability issues. If you run into problems, ask on #gaia on IRC.

Note

I have a ThinkPad x200 running Debian testing and I couldn't get the B2G desktop to work well enough to use. The animations were super slow. I have problems with graphics acceleration on this laptop with other applications, so I'm pretty sure that's the problem. Because of that, I switched to a Macbook Pro running OSX 10.8.

I have no experience with B2G desktop on other systems. I've heard it works fine in Linux in some situations, but I have no clue what the details are.

On Wednesday

Assuming you have everything working so far, now's the time to start looking for bugs to work on and/or testing the existing apps.

As of the time of this writing, the B2G/Triage wiki page has a variety of lists of bugs in various states. There's the P1 and P2 lists in the Gaia section.

Also, I've accumulated these lists, but they may not be valid anymore:

I think the workflow is something like this:

  1. find a bug you can work on that's not assigned to anyone
  2. assign that bug to yourself
  3. work on it
  4. produce a patch --- must include tests!
  5. create a pull request on github
  6. find a reviewer to look at it --- probably want someone who works on that component; ask on #gaia on IRC
  7. go through review until it's good
  8. get someone to land it --- I'm fuzzy on this step, but the person needs commit access to the repository on github; ask on #gaia on IRC

Thursday, Friday, etc

Rinse, repeat.

Conclusion

Hope this helps someone else! I think the important thing is to go to a Gaia weekly meeting.

Addendums

Random thoughts that didn't fit anywhere else in this hastily written post:

  1. If you bump into incorrect information in the Gaia/Hacking wiki page, please update it or ask someone on #gaia to verify it's incorrect.
  2. If you ask a question and no one replies to you on IRC, wait a bit, then ask again. Folks are busy and in different time zones, but they are paying attention.
  3. If you see anything in this blog post that's incorrect, find me on IRC. I'm willkg.
  4. e.me stands for "everything.me"
  5. FTU stands for "first-time-usage"

Also, I overheard this on IRC and it helped:

<fzzzy> here's something important to understand about ffos: there's b2g, and there's gaia

<fzzzy> b2g is the big compiled blob of c++ and some js modules

<fzzzy> gaia is all js, but it is preprocessed into a profile directory

<fzzzy> if you double-click B2G.app, you get a gaia profile that is inside of the app

<fzzzy> if you run b2g-bin from the command line, you can pass the -profile /path/to/profile flag, and b2g will use that gaia

<fzzzy> it just depends if you want to just kick the tires, or actually hack on gaia itself

Tue, 16 Oct 2012

Django Eadred v0.1 released! Django app for generating sample data.

I work on a few projects that had a need for generating sample data to make it easier for new contributors to get up and running quickly with little effort. These projects are fairly data-driven---they're kind of useless without data.

To satisfy that need, we wrote an app in richard to generate sample data across all the other apps in the project. Then I rewrote it for input.

Then we had a hankering for it in SUMO, plus I thought it made sense to turn it into its own app. So I spun it out into its own project.

Thus django-eadred was born.

Generally, it allows you to define a sampledata.py module with a generate_sampledata function that takes command line options to generate sample data for any app you want to generate sample data for.

You can use it to define different ways of generating sample data specified by the command line.

You can use it to generate random data, non-random data, initial data, data for contributors, sample data for large data sets, fixture data, etc.
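Here is a rough sketch of what such a sampledata.py module might look like. The module and function names come from the description above. Everything else is an assumption for illustration: that the options arrive as a dict, the count and prefix keys, and the VIDEOS list standing in for real Django models.

```python
# sampledata.py -- a sketch; a real module would create Django model instances.

VIDEOS = []  # hypothetical in-memory stand-in for a model table


def generate_sampledata(options):
    """Create sample videos.

    `options` is assumed to be a dict of values taken from the command line.
    """
    count = int(options.get("count", 5))
    prefix = options.get("prefix", "Sample talk")
    for i in range(1, count + 1):
        VIDEOS.append({"title": "%s %d" % (prefix, i), "state": "live"})
    # Report how many records this call created.
    return count
```

Branching on the options is what lets a single module cover both "a handful of records for a new contributor" and "thousands of records for testing graphs".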

Check out django-eadred.readthedocs.org for use cases, documentation and project details.

Mon, 15 Oct 2012

Donate to GNU MediaGoblin! Help us cross the chasm!

The GNU MediaGoblin project is raising funds to allow Chris Webber to work on it full time for the next year. The project has done really well over the last year and a half and has come a long way. However, there's a bunch of work that needs to be done and the sooner it gets done, the better. Essentially, we're staring at a chasm between "bootstrapping the project" where we needed enough to grow a community and have something people can build on and "1.0" where it's generally usable by our target audience.

Because of that, Chris quit his job at Creative Commons to work on MediaGoblin full time in a valiant attempt to get us across that chasm.

There are a lot more details on the MediaGoblin campaign page and a movie that Chris and Deb put together that explain why and why now.

Support GNU MediaGoblin!

Please help fund MediaGoblin so we can get across that chasm!

Please Tweet, Dent, Facebook, blog and otherwise get the word out, too! Use the campaign url when you do. That helps a ton! Thank you!

Copyright 1996 to 2013, Will Guaraldi Kahn-Greene, under the Creative Commons BY-SA 3.0 license

Will's Blog by William Kahn-Greene is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.