Will >> Will's blog
Sun, 20 Jul 2014
This is the status report for development on Input. I publish a status report to the input-dev mailing list every couple of weeks or so covering what was accomplished and by whom and also what I'm focusing on over the next couple of weeks. I sometimes ruminate on some of my concerns. I think one time I told a joke.
Last status report was at the end of June. This status report covers the last few things we landed in 2014q2 as well as everything we've done so far in 2014q3.
Landed and deployed:
- 6ecd0ce [bug 1027108] Change default doc theme to mozilla sphinx (Anna Philips)
- 070f992 [bug 1030526] Add cors; add api feedback get view
- f6f5bc9 [bug 1030526] Explicitly declare publicly-visible fields
- c243b5d [bug 1027280] Add GengoHumanTranslater.translate; cleanup
- 3c9cdd1 [bug 1027280] Add human tests; overhaul Gengo tests
- ff39543 [bug 1027280] Add support for the Gengo sandbox
- 258c0b5 [bug 1027280] Add test for get_balance
- 44dd8e5 [bug 1027280] Implement Gengo Human push_translations
- 35ae6ec [bug 1027280] Clean up API code
- a7bf90a [bug 1027280] Finish pull_translations and tests
- c9db147 [bug 1027286] Gengo translation system status
- f975f3f [bug 1027291] Implement spot Gengo human translation
- f864b6b [bug 1027295] Add translation_sync cron job
- c58fd44 [bug 1032226] en-GB should copyover, too
- 7480f87 [bug 1032226] Tweak the code to be more defensive
- 7ac1114 [bug 1032571] CSRF exempt the API
- ac856eb [bug 1032571] Fix tests to catch csrf issues in the api
- 74e8e09 [bug 1032967] Handle unsupported language pairs
- 74a409e [bug 1026503] First pass at vagrantification
- a7a440f Continued working on docs; ditched hacking howto
- 44e702b [bug 1018727] Backfill translations
- 69f9b5b Fix date_end issue
- e59d4f6 [bug 1033852] Better handle unsupported src languages
- cc3c4d7 Add list of unsupported languages to admin
- 32e7434 [bug 1014874] Fix translate ux
- 672abba [bug 1038774] Hide responses from hidden products
- e23eca5 Fix a goof in the last commit
- 6f78e2e [bug 947767] Nix authentication for API stuff
- a9f2179 Fix response view re: non-existent products
- e4c7c6c [Bug 1030905] fjord feedback api tests for dates (Ian Kronquist)
- 0d8e024 [bug 935731] Add FactoryBoy
- 646156f Minor fixes to the existing API docs
- f69b58b [bug 1033419] Heartbeat backend prototype
- f557433 [bug 1033419] Add docs for heartbeat posting
Landed, but not deployed:
- 7c7009b [bug 935731] Switch all tests to use FactoryBoy
- 2351fb5 Generate locales so ubuntu will quite whining (Ian Kronquist)
Current head: 7ea9fc3
At a high level, this is:
- Landed automated Gengo human translation and a bunch of minor fixes to make it work more smoothly.
- Reworked how we build development environments to use vagrant. This radically simplifies the instructions and should make it a lot easier for contributors to build a development environment. This in turn should lead to more people working on Input.
- Fixed a bug where products marked as "hidden" were still showing up in the dashboard.
- Implemented a GET API for Input responses. (https://wiki.mozilla.org/Firefox/Input/Dashboards_for_Everyone)
- Implemented the backend for the Heartbeat prototype. (https://wiki.mozilla.org/Firefox/Input/Heartbeat)
- Also, I'm fleshing out the Input section in the wiki complete with project plans. (https://wiki.mozilla.org/Firefox/Input)
Over the next two weeks
- Continue fleshing out project plans for in-progress projects on the wiki.
- Gradient sentiment and product picker work.
What I need help with
- We have a new system for setting up development environments. I've tested it on Linux. Ian has, too (pretty sure he's using Linux). We could use some help testing it on Windows and Mac OSX.
Do the instructions work on Windows? Do the instructions work on Mac OSX? Are there important things the instructions don't cover? Is there anything confusing?
- I'm changing the way I'm managing Fjord development. All project plans will be codified in the wiki. A rough roadmap of which projects are on the drawing board, in-progress, completed, etc. is also on the wiki. I threw together a structure for all of this that I think is good, but it could use some review.
Do these project plans provide useful information? Are there important questions that need answering that the plans do not answer?
If you're interested in helping, let me know! We hang out on #input on irc.mozilla.org and there's the input-dev mailing list.
I think that covers it!
Tue, 01 Jul 2014
I'm going to start doing quarterly post-mortems for Input development. The goal is to be more communicative about what happened, why, what's in the works and what I need more help with.
NB: "Fjord" is the name of the codebase that runs Input.
Bug and git stats
Bugzilla
========

Bugs created: 63
Bugs fixed: 54

git
===

Total commits: 151

    Will Kahn-Greene : 142 (+14758, -4599, files 438)
    ossreleasefeed : 3 (+197, -42, files 9)
    Anna Philips : 2 (+734, -6, files 24)
    Joshua Smith : 2 (+65, -31, files 5)
    Swarnava Sengupta : 1 (+2, -2, files 1)
    Ricky Rosario : 1 (+0, -0, files 0)

Total lines added: 15756
Total lines deleted: 4680
Total files changed: 477
We added a lot of lines of code this quarter:
- April 1st, 2014: 15195 total, 6953 Python
- July 1st, 2014: 20456 total, 9247 Python
That's a pretty big jump in LOC. I think a bunch of that is the translation-related changes.
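To put a number on that jump, here is a minimal sketch that computes the change from the two line counts quoted above. The `growth` helper is just for illustration; it isn't part of Fjord.

```python
# Line counts quoted in the post at the start and end of 2014q2.
april = {"total": 15195, "python": 6953}
july = {"total": 20456, "python": 9247}


def growth(before, after):
    """Return (absolute change, percent change rounded to one decimal)."""
    delta = after - before
    return delta, round(100.0 * delta / before, 1)


for key in ("total", "python"):
    delta, pct = growth(april[key], july[key])
    print(f"{key}: +{delta} lines ({pct}%)")
# prints:
# total: +5261 lines (34.6%)
# python: +2294 lines (33.0%)
```

So both the total and the Python-only counts grew by roughly a third over the quarter.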
5 non-core people contributed to Fjord development.
I spent some time over the weekend finishing up the Vagrant provisioning script and rewriting the docs. I'm planning to spend some more time in 2014q3 reducing the complexity and barriers for setting up a Fjord development environment to the point where someone can contribute.
Additionally, I'm planning to create more bugs that are contributor-friendly. I started doing that in the last week. I think a good goal for Input is to have around 20 contributor-y bugs hanging around at any given time.
Site health dashboard: I wrote a mediocre site health dashboard that's good enough to give me a feel for how the site is performing before and after a deployment. This still needs some work, but I'll schedule that for a rainy day.
Client side smoke tests: I wrote smoke tests for the client side. I based them on the defunct input-tests code that QA was maintaining up until we rewrote Input. There are still a bunch of tests that I want to write to have better coverage of things, but having something is way better than nothing. I'm hoping the smoke tests will reduce the amount of manual testing I'm doing, too.
Vagrant: I took some inspiration from Erik Rose and DXR and wrote a Vagrant provisioning shell script. This includes a docs overhaul as well. This work is almost done, but needs some more testing and will probably land in the next week or two. This will make people's lives easier.
Automated translation system (human and machine): I wrote an automated translation system. It's generalized so that it isn't model/field specific. It's also generalized so that we can add plugins for other translation systems. It's currently got plugins for Dennis, Gengo machine translation and Gengo human translation. I turned the automated human translation on yesterday and it seems to be working well. That was a HUGE project. I'm glad it's done.
One thing it includes is a lot of auditing and metrics gathering. This will make it possible for me to go back in time and look at how the translation system worked on various Input feedback responses and hone the system going forward to reduce the number of human translations we're doing and also reduce the number of problems we have doing them.
Better query syntax: We were upgraded to Elasticsearch 0.90.10. I switched the query syntax for the dashboard search field to use Elasticsearch simple_query_string. That allows users to express search queries they weren't previously able to express.
utm_source and utm_campaign handling: I finished the support for handling utm_source and utm_campaign querystring parameters. This allows us to differentiate between organic feedback and non-organic feedback.
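As a rough illustration of the idea (not the Fjord implementation), the sketch below pulls the two parameters out of a URL with the standard library. The rule that feedback with neither parameter counts as organic is an assumption made for this example, and the URLs are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def extract_utm(url):
    """Return the utm_source and utm_campaign values from a URL ("" if absent)."""
    params = parse_qs(urlsplit(url).query)
    return {
        "source": params.get("utm_source", [""])[0],
        "campaign": params.get("utm_campaign", [""])[0],
    }


def is_organic(url):
    """Assumption for this sketch: no utm_source and no utm_campaign means organic."""
    utm = extract_utm(url)
    return not (utm["source"] or utm["campaign"])


# Placeholder URLs, not real Input endpoints.
print(extract_utm("https://example.com/feedback?utm_source=newsletter&utm_campaign=q3"))
print(is_organic("https://example.com/feedback"))
```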
More like this: I added a "more like this" section to the response view. This makes it possible for UA analyzers to look at a response and see other responses that are similar.
Dashboards for you, dashboards for everyone!
I'm putting this in its own section because it's intriguing. I'll write another blog post about it later in July as things gel.
On Thursday, a couple of days after d3 training that Matt organized, I threw together a better GET API for Input feedback responses. It's not documented, it probably has some bugs and it's probably going to change a bit, but the gist of it is that it lets you more easily build a dashboard that meets your needs against live Input data.
Here's a proof-of-concept:
That's looking at live Input data using the new GET API. The code is in a GitHub gist. It auto-updates every 2 minutes.
The problem is that I've got a ton of Input work to do and I just can't write dashboard code on Input fast enough. Further, the people I've talked to who use the front page dashboard all have really different questions they're asking of the data. I'm hoping this alleviates that bottleneck by letting you and everyone else write dashboards that meet your needs.
I encourage you to take my proof-of-concept, fork the gist, tweak it, use bl.ocks.org or something to "host" the gist. Build the dashboard that answers your questions. Share it with other people. Plus, let me know about it. If you have issues with the API, submit a bug and tell me.
If this scratches the itch I think needs scratching, it should result in a bunch of interesting dashboards. If that happens, I'll write some code in Input to create a curated list of them so people can find them more easily.
This was a really crazy quarter and parts of it really sucked, but we got a lot accomplished and we laid some groundwork for some really interesting things for 2014q3.
Mon, 23 Jun 2014
I publish a status report to the input-dev mailing list every couple of weeks or so covering what was accomplished and by whom and also what I'm focusing on over the next couple of weeks. I sometimes ruminate on some of my concerns. I think one time I told a joke.
Since the last report:
Landed and deployed:
Landed, but not deployed:
- c348989 Add bug triaging for new contributors section
- 5b7dc67 Add Gengo API tests and skip_if infrastructure
- 98d30fb [bug 1026131] Add Gengo human translations bookkeeping
- 38d8584 [bug 1026131] Rework translations system logging code
- 1d9e67a [bug 1027293] Add audit records to response view
Mostly I spent the last couple of weeks working on automated Gengo human translation support. This involved some infrastructure rewriting plus some additional infrastructure so that when we push all this out, we can see what's going on as it is happening.
Additionally, I went through and updated the mentor metadata for mentored bugs, added a bunch of new mentored bugs and worked with two potential contributors on them.
Over the next week (last week in 2014q2):
- finish up automated Gengo human translation work
First thing in 2014q3, I'll spend some time "opening up" the development side of the project. This will make it easier/possible to follow and participate in development. I'm still figuring out some of the details and it's likely I'll continue to change how things work over the course of the quarter, but plan to follow advice from the Community Building team and Erik Rose who seems to be doing really super with DXR.
Tue, 13 May 2014
Better search syntax is here!
Yesterday I landed the changes for bug 986589 which affect all the search boxes and search feeds on Input. Now they use the Elasticsearch simple_query_string query instead of the hand-rolled query parser I wrote.
This was only made possible in the last month after we were updated from Elasticsearch 0.20.6 (or whatever it was) to 0.90.10.
Tell me more about this ... syntax.
I'm pretty psyched! It's pretty much the minimum required syntax for useful searching. It's kind of lame it took a year to get to this point, but so it goes.
To quote the Elasticsearch 0.90 documentation:
    + signifies AND operation
    | signifies OR operation
    - negates a single token
    " wraps a number of tokens to signify a phrase for searching
    * at the end of a term signifies a prefix query
    ( and ) signify precedence
Negation and prefix were the two operators my hand-rolled query parser didn't have.
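For a feel of what those operators look like in practice, here is a small sketch that wraps some search strings in a simple_query_string request body. The `description` field name and the `build_search` helper are placeholders for illustration and may not match Fjord's index mapping or code. How unjoined clauses combine also depends on the query's default operator, so the comments only restate the operator meanings quoted above.

```python
import json


def build_search(text, field="description"):
    """Build an Elasticsearch request body using simple_query_string.

    The field name is a placeholder; the real index mapping may differ.
    """
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                "fields": [field],
            }
        }
    }


examples = [
    "crash + flash",           # "+" signifies AND
    "crash | hang",            # "|" signifies OR
    "crash -flash",            # "-" negates the token "flash"
    '"slow startup"',          # quotes wrap a phrase
    "crash*",                  # "*" at the end of a term is a prefix query
    "(crash | hang) + flash",  # parentheses signify precedence
]

for text in examples:
    print(json.dumps(build_search(text)))
```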
What does this mean for you?
It means that you need to use the new syntax for searches on the dashboard and other parts of the site.
Further, this affects feeds, so if you're using the Atom feed, you'll probably need to update the search query there, too.
Also, we added a ? next to search boxes which links to a wiki page that documents the syntax with examples. It's a wiki page, so if the documentation is subpar or it's missing examples, feel free to let me know or fix it yourself.
Thu, 19 Dec 2013
It was a big year for Input. In 2012, we spent the last half rewriting Input. In 2013, it went through secreview, had a bunch of things fixed and then we migrated to the new system.
Since then, we've been fixing bugs, reimplementing features that were lost and writing the scaffolding for the new set of User Advocacy dashboards and tools.
Let's look at some Bugzilla and git stats for the year:
Twas the year: 2013
===================

Bugzilla
========

Bugs created: 150

    willkg : 100
    cwwmozilla : 5
    fbraun : 4
    mgrimes : 4
    tdowner : 3
    stephen.donner : 3
    me+bugzilla : 2
    gasell+mozilla : 2
    mcooper : 2
    glind : 2
    mozaakash : 1
    kdurant35rules : 1
    hitmanarky : 1
    kbrosnan : 1
    bob.silverberg : 1
    splewako : 1
    rrosario : 1
    mattbasta : 1
    educmale : 1
    feer56 : 1
    326374 : 1
    anthony : 1
    shopov.bogomil : 1
    peterbe : 1
    l10n : 1
    chrismore.bugzilla : 1
    landis : 1
    dron.rathore : 1
    rq : 1
    MattN+bmo : 1
    joshua-smith : 1
    cturra : 1
    swagat.kanungo : 1

Bugs resolved: 268

    willkg : 157
        : WONTFIX 50
        : FIXED 89
        : WORKSFORME 8
        : DUPLICATE 9
        : INVALID 1
    cwwmozilla : 57
        : FIXED 1
        : WONTFIX 7
        : WORKSFORME 29
        : DUPLICATE 1
        : INVALID 19
    mgrimes : 10
        : FIXED 1
        : DUPLICATE 1
        : WORKSFORME 5
        : INVALID 3
    shopov.bogomil : 7
        : WONTFIX 1
        : WORKSFORME 2
        : INVALID 1
        : FIXED 2
        : DUPLICATE 1
    mcooper : 6
        : DUPLICATE 1
        : FIXED 5
    mozilla : 5
        : FIXED 5
    me+bugzilla : 4
        : WONTFIX 1
        : FIXED 1
        : DUPLICATE 1
        : INVALID 1
    mozaakash : 2
        : WORKSFORME 1
        : INVALID 1
    trifandreialin : 2
        : WORKSFORME 2
    rrosario : 2
        : FIXED 2
    joshua-smith : 2
        : FIXED 1
        : INVALID 1
    aaron.train : 2
        : WONTFIX 1
        : DUPLICATE 1
    stephen.donner : 1
        : INCOMPLETE 1
    emorley : 1
        : FIXED 1
    curtisk : 1
        : INVALID 1
    unghost : 1
        : WORKSFORME 1
    rajul.iitkgp : 1
        : FIXED 1
    jruderman : 1
        : INCOMPLETE 1
    chris.lonnen : 1
        : FIXED 1
    nigelbabu : 1
        : FIXED 1
    tofumatt : 1
        : FIXED 1
    cturra : 1
        : FIXED 1
    fwenzel : 1
        : FIXED 1
    mbrandt : 1
        : FIXED 1

    INCOMPLETE : 2
    DUPLICATE : 15
    INVALID : 28
    WORKSFORME : 48
    WONTFIX : 60
    FIXED : 115

git
===

Total commits: 277

    Will Kahn-Greene : 249 (+51602, -16851, files 1130)
    Mike Cooper : 11 (+38528, -236, files 217)
    Brandon Burton : 7 (+42, -215, files 9)
    Ricky Rosario : 4 (+36, -19, files 6)
    Bob Silverberg : 3 (+19, -9, files 3)
    Rajul : 1 (+3, -0, files 1)
    Joshua Smith : 1 (+10, -5, files 1)
    bogomil : 1 (+1, -1, files 1)

Total lines added: 90241
Total lines deleted: 17336
Total files changed: 1368
I want to highlight some interesting bits:
We resolved more bugs than we created. That's partially due to us going through and closing out old bugs for the old Input that aren't relevant anymore.
According to the Bugzilla and git data, there were 47 contributors to Input this year: 326374, Bob Silverberg, Brandon Burton, Joshua Smith, MattN+bmo, Mike Cooper, Rajul, Ricky Rosario, Will Kahn-Greene, aaron.train, anthony, bogomil, chris.lonnen, chrismore.bugzilla, cturra, curtisk, cwwmozilla, dron.rathore, educmale, emorley, fbraun, feer56, fwenzel, gasell+mozilla, glind, hitmanarky, jruderman, kbrosnan, kdurant35rules, l10n, landis, mattbasta, mbrandt, me+bugzilla, mgrimes, mozaakash, nigelbabu, peterbe, rajul.iitkgp, rq, splewako, stephen.donner, swagat.kanungo, tdowner, tofumatt, trifandreialin, and unghost.
That doesn't include localizers who do a ton of work translating the strings in the Input ui.
That includes some of the folks who work on the input-tests repository, but possibly misses some.
Most of the 47 contributors are not "core developers". That's cool, but I could be doing a better job here making it easier for non-core developers to contribute.
Those are the stats.
At a high-level, we accomplished the following:
- stood up a new Input code base
- the beginnings of spam identification and removal
- Input API for feedback submission
- Firefox OS feedback form
- infrastructure for an Analysts group with special privileges
- the beginnings of an Occurrence Comparison report dashboard
One thing I discovered in 2013q4 was that it's really hard to be the mostly-solo dev on a project like this. I'm lucky that I'm part of a larger team, so peer review for work I've done is possible and timely. However, I find I'm switching contexts between the technical details of what I'm working on now and the high-level details of a bunch of possible future tasks/projects. That's really hard to do day-to-day and still maintain development momentum. I have some thoughts on how to serialize my work so that I'm doing less context switching and I can focus on individual things more deeply which should produce better outcomes.
My goals for Input for 2014 are these:
- clean up the code base: there's still a bunch of weird stuff in there from the rapid development work we did in 2012
- reduce barriers to entry for new contributors: better documentation, fewer steps to get up and running, more bugs marked for mentoring, more outreach, ...
- build infrastructure that we can use for better User Advocacy tools: watched alerts, email notifications, dashboards, ...
- flesh out tests: we're really light on smoketests and regression-catching tests
- work with Matt and Cheng to figure out where Input fits into the grand scheme of things; how can we make it a general-purpose feedback system? how can we handle non Firefox products and initiatives?
Yay for 2013!
My script only showed top tens which misses tons of people who did work. I redid the data and that increases the number of contributors from 16 to 47. Oops!
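The actual script isn't shown here, so the following is only a sketch of how a top-ten cutoff can hide contributors when counting git authors. The function names are made up for this example. `git_author_log` assumes a local clone and relies on `git log --pretty=format:%an`, which prints one author name per commit. Names are matched exactly, so two spellings of the same person count separately.

```python
import subprocess
from collections import Counter


def git_author_log(repo_path, year):
    """Return one author name per line for commits made during the given year."""
    cmd = [
        "git", "log", "--pretty=format:%an",
        f"--since={year}-01-01 00:00:00",
        f"--until={year + 1}-01-01 00:00:00",
    ]
    return subprocess.check_output(cmd, cwd=repo_path, text=True)


def count_authors(log_output):
    """Count commits per author from one-name-per-line text."""
    names = (line.strip() for line in log_output.splitlines())
    return Counter(name for name in names if name)


def summarize(log_output, top=None):
    """Return (the `top` most active authors, total number of distinct authors)."""
    counts = count_authors(log_output)
    return counts.most_common(top), len(counts)


# Made-up sample: twelve distinct authors, one of them with three commits.
sample = "\n".join(["author0"] * 3 + [f"author{i}" for i in range(1, 12)])
top_ten, total = summarize(sample, top=10)
print(len(top_ten), "authors listed,", total, "authors in total")
```

Reporting the total number of distinct authors alongside any top-N listing keeps the cutoff from being mistaken for the full contributor count.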
Mon, 31 Dec 2012
This was my first full year at Mozilla and it was intense. I essentially worked on four projects: SUMO, Input, ElasticUtils and Gaia. This blog post talks about the first two which are worked on by the James' Rifles SUMINPUT Megalosaur team.
We accomplished a lot on SUMO this year. I spent a couple of hours last week throwing together a rough "year in review" script that looked at Bugzilla and git and crunched some numbers:
Twas the year: 2012
===================

Bugzilla
========

Bugs created: 938

    rrosario : 201
    a.topal : 188
    willkg : 108
    scoobidiver : 51
    igarcia : 41
    mverdi : 36
    swarnavasengupta : 30
    james : 29
    bram : 19
    tobbi.bugs : 17

Bugs resolved: 1025

    rrosario : 335
        : WORKSFORME 18
        : INVALID 16
        : DUPLICATE 23
        : WONTFIX 7
        : FIXED 263
        : INCOMPLETE 8
    a.topal : 182
        : WORKSFORME 36
        : INVALID 41
        : DUPLICATE 11
        : WONTFIX 70
        : FIXED 21
        : INCOMPLETE 3
    willkg : 131
        : DUPLICATE 6
        : FIXED 110
        : WORKSFORME 2
        : WONTFIX 11
        : INVALID 2
    rdalal : 84
        : FIXED 84
    james : 51
        : WORKSFORME 6
        : INVALID 5
        : DUPLICATE 3
        : WONTFIX 15
        : FIXED 14
        : INCOMPLETE 8
    mcooper : 37
        : FIXED 36
        : INVALID 1
    tobbi.bugs : 29
        : FIXED 29
    tgavankar : 28
        : WONTFIX 1
        : WORKSFORME 1
        : FIXED 26
    scoobidiver : 28
        : FIXED 4
        : DUPLICATE 4
        : WORKSFORME 11
        : WONTFIX 3
        : INVALID 6
    bmo2010 : 13
        : FIXED 1
        : DUPLICATE 3
        : WORKSFORME 3
        : INVALID 6

    INCOMPLETE : 21
    DUPLICATE : 61
    WORKSFORME : 82
    INVALID : 91
    WONTFIX : 117
    FIXED : 653

git
===

Total commits: 916

    Ricky Rosario : 430
    Will Kahn-Greene : 192
    Rehan Dalal : 98
    Mike Cooper : 44
    Erik Rose : 34
    Tobbi : 29
    Tanay Gavankar : 23
    Kadir Topal : 11
    Tim Watts : 10
    Berker Peksag : 9
    James Socol : 7
    Victor Neo : 6
    Cesar Carruitero : 5
    David Lilly : 4
    Ibai : 3
    Isac Lagerblad : 2
    icaaq : 1
    TylerDowner : 1
    browning : 1
    ricky rosario : 1
    Anatoli Papirovski : 1
    Clauber Stipkovic : 1
    Jason Thomas : 1
    atopal : 1
    Florin Strugariu : 1
There are some interesting bits in there:
Ricky does a lot of work! Holy cow!
There were 23 people who contributed code to Kitsune (the SUMO codebase) this year. Of those, about half are volunteer contributors.
Compare that with 2011, when we had 19 people who contributed to the code base and fewer than half were volunteer contributors.
We resolved more bugs than we created in 2012. We did that in 2011 as well, so that's two years in a row. I've never seen that happen before on a project I work on.
The codebase is pretty different now than it was at the beginning of the year. I helped with the following semi-massive overhauls:
- The push for more metrics and dashboards to view the numbers.
- The switch from Sphinx to ElasticSearch.
- The new Information Architecture which affected browsing and searching across the site.
- The site redesign which covered both the desktop and mobile versions of the site.
- The upgrade to Django 1.4.
- The switch from arecibo to sentry.
- The push to switch from fixtures to model makers for all our tests.
- The switch from weekly deployments on Tuesdays to deploying whenever we want. Continuous deployment is fantastic.
- Started switching the whole site from Webtrends to Google Analytics. I saw Ricky write up a bunch of bugs to finish up that work, so I'll say it's in progress.
- During the redesign, Rehan redid all the CSS and switched us to use LESS.
- I spun off some code I wrote for richard, then ported to Fjord, then improved into a project called django-eadred. That makes it a lot easier to generate sample data for a variety of purposes like new contributors, bootstrapping, and large random data sets.
On top of that, we did a lot of work on the documentation and making it easier to get to a working Kitsune development environment. We switched to a sprint-based work flow using Scrumbugz. We also nixed our daily checkin conference call for an IRC-based checkin system that we wrote called Standup.
It's been a big year.
For Input, it was a bigger year. We decided to abandon the old Input codebase (omfg yay) in favor of rewriting it from the ground up. The rewrite took a couple of months and since then it has sort of been sitting around waiting for a security review. In the meantime, we (actually, Mike did) fixed a bunch of issues with the old site code because that's what's currently in production.
Rewriting Input wouldn't have taken so long except that we did a lot of work fixing bugs in external libraries and updating Playdoh. That work definitely cut into our schedule, but it benefitted a bunch of other groups/people/sites, so that's good.
That's the gist of the year: it was a lot of work, but we accomplished a ton.
w00t for 2012!