Okay, so Ben Gordon has written a critique of the Halfpixel Business Model (as described in How To Make Webcomics) and come to the conclusion it doesn’t work. I wish I had time to dig into this the way it deserves, but there’s no way I’m going to be able to in the near future.
So let’s be clear that this is not a formal analysis of Gordon’s entire thesis, but specifically a response based on his numbers. I’m going to talk about this using casual terminology so as to make my thoughts as accessible as possible to everybody that doesn’t know (and, rightfully, doesn’t care) about the difference between skew and kurtosis. Onwards.
Gordon looked at a sample of webcomics, and sought to estimate how much money could be made from his reading of HTMW’s “10% Rule” (5 – 10% of your readership will open their wallets and buy things). His calculations led him to conclude that the rule is fundamentally flawed, but pointed out:
I hope someone will find fault with my analysis, because if it is sound, it is a setback for webcomics.
I’m not sure if his conclusion can be proved or disproved (we are, after all, talking about applying mathematical rules to a creative endeavour), but if his conclusion’s based solely on the numbers, I think that I’ve found the fault he was looking for, from a purely statistical standpoint. Consider the following statements from his posting:
- [the business model] cannot be verified by the majority of case studies
- I’ve chosen comics in a range of sizes from a list in Wikipedia which reports comics that support their creator(s). … I removed the ones that don’t belong and analyzed the rest.
- The formula for estimating each comic’s profit is: … We assume the average profit per sale is $5 — typical for a t-shirt
- [five calculations of estimated webcomic profits ranging from $975 to $24,000]
First off, we need to agree on some terminology: Gordon doesn’t have “a majority of case studies”, he’s got one study with five data points. Semantics? Nope, because the number of data points is a critical element of how reliably we can draw conclusions from the numbers. We’ll come back to that in a moment.
Secondly, Gordon’s eliminated data that “don’t belong” (for example, Achewood was eliminated because Time magazine declared it the best graphic novel of 2007 — which may have artificially inflated its numbers, I guess), meaning that we’re not looking at a random sample. We’ll come back to that, too.
Thirdly, the assumption of profit per sale is entirely arbitrary: $5, which is described as the average profit on a t-shirt (I don’t sell shirts so I can’t say, but having ordered custom shirts from the same guy many webcomickers use, I think it’s probably a bit low). But the exact profit per shirt doesn’t matter anyway, because the model assumes that any item the creator sells will produce the same profit. Unfortunately, this doesn’t hold up.
Case in point: I have purchased a number of originals from a number of webcomickers (some of whom describe themselves as entirely self-employed by their strips and others that do not); prices have ranged from $20 to $175. Profit on even the lowest priced of them is several times Gordon’s assumption, and on the high end it utterly destroys his model. Okay, many webcomickers sell shirts, and okay, the profit on a shirt probably occupies a fairly narrow range of values, but what do we do with all the other items? You’ve got books, prints, hoodies, skateboard decks, hot sauce, and an upsell (of $5 to $10, generally) to get the item signed/sketched. That’s an incredible variation.
That price range actually points to the real problem in Gordon’s analysis: the distribution curve of those “price per original” data would form a flat line. It’s not a set of consensus values with outliers, because there are too few points, and that does not allow for meaningful statistical analysis. The same situation exists with the estimated profit figures he gives: 975, 2012, 8000, 17270, 24000 … that’s only five data points. The confidence that we can derive from any analysis over such a wide range, with a distribution curve that looks like a flat line, is vanishingly small.
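To put a rough shape on "such a wide range", here is a minimal sketch that summarises the five profit figures quoted above. The function name `spread_summary` and the particular measures chosen are my own illustration, not anything from Gordon's post; the only data used are the five numbers already listed.

```python
from statistics import mean, stdev

# The five estimated profit figures quoted in the text above.
estimated_profits = [975, 2012, 8000, 17270, 24000]


def spread_summary(values):
    """Return a few basic measures of how widely a small sample is spread.

    Hypothetical helper for illustration only.
    """
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "n": len(values),
        "mean": avg,
        "stdev": sd,
        "cv": sd / avg,  # coefficient of variation: spread relative to the mean
        "max_to_min": max(values) / min(values),
    }


summary = spread_summary(estimated_profits)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

With these inputs the sample standard deviation comes out nearly as large as the mean itself, and the largest figure is more than 24 times the smallest, which is the "flat line" problem in numeric form.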
Statistical analysis only works if any random datum that you select to calculate can be assumed to represent many, many, similar (to the point of being essentially identical) other data that you don’t bother to include in the analysis. The key thought here is Margin of Error. You know MoE — it’s what tells you that a political race between, say, the Harbinger of the New Golden Age and the Evil Throwback to All That’s Unholy is presently split 52% to 48%, plus or minus 4.3% (and since the MoE is greater than the difference between HNGA and ETATU, we essentially don’t know who’s ahead).
Also bear in mind that the MoE is probably only to the standard level of “95% confidence”, which means that there’s a 5% chance that the real numbers are off by even more than 4.3%. I’m going to run one simple equation to drive this home. It’s a rule of thumb that if you want to calculate the margin of error to a 95% confidence level you can do so approximately with:
MoE ≈ 0.98/√n
where n is the number of samples. In this case, n equals 5, which gives us
0.98/2.236 = 0.438 = plus or minus 43.8%
So at the 95% confidence level, any number we derive from these five data points about webcomics earnings potential could conceivably be off by as much as 43.8% from the true value. That’s not a number that we can be very confident in. Add to that the fact that statistics in general is predicated on random samples (but Gordon selected his population), and we have numbers that can’t be relied upon to any degree, even if we take the problematic $5 assumption off the table.
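For anybody who wants to check the arithmetic, here is a minimal sketch of that rule of thumb. The function name `rule_of_thumb_moe` is made up for this example. The constant 0.98 is the usual 1.96 (the 95% z-value) times 0.5 (the worst-case spread for a percentage), so strictly speaking it is a polling-style approximation being borrowed here as a yardstick.

```python
import math


def rule_of_thumb_moe(n):
    """Approximate 95% margin of error as 0.98 / sqrt(n).

    Hypothetical helper: n is the number of samples, and the result is a
    fraction (0.438 means plus or minus 43.8%).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of samples")
    return 0.98 / math.sqrt(n)


moe = rule_of_thumb_moe(5)
print(f"n = 5: plus or minus {moe:.1%}")  # prints "n = 5: plus or minus 43.8%"
```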
Heck, even recalculating for every self-reported self-supporting webcomicker isn’t going to help, because the number is still too low to provide statistical significance (honestly, we’d want a population of several thousand and a sample of at least 500 to have much confidence in the numbers). It’s still an anomaly to make a living this way, and there simply are not enough data to allow for any analysis beyond the anecdotal, which is precisely what HTMW affords. This is not to say that Gordon’s question shouldn’t be asked or that his conclusions are wrong, but it is pointless to try to draw any statistical meaning from these numbers.
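The same 0.98/√n rule of thumb can be turned around to show why a sample of about 500 is the sort of size where the numbers start to mean something. This is only a sketch under that one approximation; `moe_for` and `sample_size_for` are hypothetical names, and it ignores the population-size side of the argument entirely.

```python
import math


def moe_for(n):
    """Approximate 95% margin of error for n samples: 0.98 / sqrt(n)."""
    return 0.98 / math.sqrt(n)


def sample_size_for(target_moe):
    """Smallest n whose rule-of-thumb margin of error is at most target_moe.

    Rearranges MoE = 0.98 / sqrt(n) into n = (0.98 / MoE)^2, rounded up.
    """
    if target_moe <= 0:
        raise ValueError("target_moe must be positive")
    return math.ceil((0.98 / target_moe) ** 2)


for n in (5, 50, 500):
    print(f"n = {n:>3}: plus or minus {moe_for(n):.1%}")

# The 4.3% margin from the election example corresponds to roughly this many samples.
print(sample_size_for(0.043))
```

By this yardstick, five samples give a margin of about 44%, while 500 bring it down to a bit over 4%, the same ballpark as the 4.3% in the polling example earlier.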
Speaking of “pointless”, I strongly urge that you avoid the related thread at The Daily Cartoonist, as it quickly devolved (despite Alan Gardner’s specific request to stay on the damn topic) into truly astonishing levels of dickery re: webcomickers do not have careers/incomes/lives/redeeming qualities.
It never ceases to astonish me that individuals that I have met — and who are perfectly polite and rational in person — turn into such raging exemplars of John Gabriel’s best known theorem (minus the anonymity … weird) when discussing this particular topic. I stopped reading in disgust after about 20 comments and won’t go back there. Proceed at your own risk.
The discussion at the original post is, by contrast, civil, productive, and based on logic. Gordon has been polite in responding to questions and everybody is doing their best to treat the question as an intellectual exercise designed to figure out the truth. Bravo.