The question of how long an A/B test needs to run comes up all the time. And the answer is that it really depends. It depends on how much traffic you have, on how you divide it up, on the base rates of the metrics you’re trying to change, and on how much you manage to change them. It also depends on what you deem to be acceptable rates for Type I and Type II errors.
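To make the dependencies concrete, here is a minimal sketch of the arithmetic, using the standard normal-approximation sample size formula for comparing two proportions. The function name and all of the traffic, base rate, and lift numbers below are hypothetical placeholders, not figures from the tool or from any real experiment.

```python
# A rough duration estimate for a two-variant A/B test on a conversion rate.
# Assumes a 50/50 split of the traffic that enters the experiment.
from scipy.stats import norm

def required_days(daily_visits, pct_in_test, base_rate, lift,
                  alpha=0.05, power=0.80):
    """Approximate days needed to detect a relative `lift` in `base_rate`,
    with `pct_in_test` of `daily_visits` split evenly between two variants."""
    p1 = base_rate
    p2 = base_rate * (1 + lift)
    z_alpha = norm.ppf(1 - alpha / 2)   # threshold for the Type I error rate
    z_beta = norm.ppf(power)            # threshold for the Type II error rate
    # Visitors needed per variant for the chosen alpha and power.
    n_per_variant = ((z_alpha + z_beta) ** 2 *
                     (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)
    visitors_per_day_per_variant = daily_visits * pct_in_test / 2
    return n_per_variant / visitors_per_day_per_variant

# e.g. 50,000 daily visits, 10% in the test, 2% base conversion, +5% relative lift
print(round(required_days(50_000, 0.10, 0.02, 0.05)))  # roughly 126 days
```

With inputs like these the test runs for months, not a week; doubling the fraction of traffic in the experiment halves the duration, and settling for detecting only a larger lift shrinks it much faster than that.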
In the face of this complexity, community concerns (“we don’t want too many people to see this until we’re sure about it”) and scheduling concerns (“we’d like to release this week”) can dominate. But letting them win can set you up for failure, committing you to experiments that have little chance of detecting positive or negative changes. Sometimes adjustments can be made to avoid this. And sometimes they can’t.
"You ran an A/B test at one percent for a week" - the seldom-heard, missing verse of You Played Yourself.
To help with this, I built a tool that will let you play around with all of the inputs. You can find it here: