comScore weblog report

I am always on the lookout for weblog statistics, as they have become a core part of my thesis. Today a marketing company by the name of comScore released a report making a number of claims about the weblog community. I’d like to take a moment to remind people that this is a marketing survey, and as such should be carefully scrutinized before drawing any conclusions.

First, comScore’s methodology claims that they have 2 million active subjects, recruited through Random Digit Dial and an “online recruitment program,” for which they provide no details. They do, however, list the incentives provided to those individuals:

  • Server-based virus protection
  • Attractive sweepstakes prizes
  • Opportunity to impact and improve the Internet

Setting aside the third incentive, which is the blanket “feel-good” incentive for all surveys, I challenge you to think of someone who is attracted to the first two. Let’s just say they’re not your average person or internet user. They also note:

All demographic segments of the online population are represented in the comScore Global Network, with large samples of participants in each segment. For example, our network includes hundreds of thousands of high-income Internet users - one of the most desirable and influential groups to measure, yet also one of the most difficult to recruit.

Without diving into what “high-income Internet users” are, having hundreds of thousands of subjects from a presumably small portion of the population leads me to believe that they’re not really interested in representativeness, but rather, umm, marketing. Given that they neither justify their sample nor provide margins of error, the initial sampling frame should be considered bunk.
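To illustrate why a huge panel does not by itself settle the question, here is a minimal sketch using the standard normal-approximation margin of error for a proportion. The numbers (`true_share`, `selection_bias`, and the panel sizes) are hypothetical and are not taken from the comScore report; the function name `margin_of_error` is my own.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical numbers, not taken from the comScore report.
true_share = 0.30      # assumed true share of Internet users who read blogs
selection_bias = 0.10  # assumed over-representation caused by how the panel was recruited

for n in (1_000, 100_000, 1_500_000):
    # The panel estimates the biased share, however large n gets.
    moe = margin_of_error(true_share + selection_bias, n)
    print(f"n={n:>9,}  margin of error=±{moe:.4f}  bias={selection_bias:+.2f}")
```

The sampling margin of error shrinks as the panel grows, but the assumed bias term stays the same size, which is why an unjustified sample is a problem regardless of how many subjects it contains.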

Second, if their sampling of weblogs seems strange at first, it is. They were interested in how the aforementioned sample visited weblogs, so they decided to look at visits to 400 blog-related domains, which they culled from “top blog lists.” These domains include hosting services (e.g. “*.livejournal.com”) among the other top blogs. Keep in mind that this sample of 400 domains incorporates community sites (freerepublic.com, fark.com, slashdot.org, metafilter.com, etc.), professionally written sites (gawker.com, drudgereport.com, fleshbot.com, etc.) and potentially spam (crazyass13.com trips my spam alarm).

I’m assuming, based on their distribution of unique visitors shown below, that all of these sites are included in one sample, with the top sites being blog hosts (although note the missing blogspot, which supposedly saw 19 million unique visitors), and the second group being community sites and professional blogs. As far as many people might be concerned, the “real blogs” start around #30, for which they provide no description. How this is a sample of weblogs at all, I can’t say. But building categories around this strange set of sites seems a little unsound.

[Figure: comScore statistics]

What this report, in sum, seems to say to me is that some large number of people have visited either a professional weblog or some weblog on any number of the hosted services in the past year. This should not be surprising. I get a blog site response from Google just about once every five queries. Without any description of how many of these blog visitors saw only one blog in the entire period, I’d say an overwhelming majority could be from search engines (which they admit).

Given their sampling frame and blog selection methodology, it seems hard to extrapolate any meaningful statistics about true blog readership. Until they release the data, I would quote these numbers with extreme caution.


7 Comments

  1. Posted August 8, 2005 at 4:32 pm | Permalink

    Cameron,

    If you’d read the methodology closely, you’d have seen that I’m a co-author. Feel free to call me for any details. For that matter, feel free to call comScore for answers to your questions.

    You make a few erroneous assumptions. First of all, it wasn’t a “survey” in the sense of questions that were asked of consumers. comScore has a panel of more than 1.5 million people (in the U.S.) and a total of 2 million worldwide who gave explicit permission for comScore to track everywhere they go online.

    The exact details of their recruitment methodology aren’t explicit on their site, but it’s not fundamentally different than the recruitment process for virtually any online research panel; that is, you’ll get about the same level of representation of Internet users from Harris Interactive, Greenfield or anywhere else you’re likely to see cited as the basis of any online research. In fact, comScore goes farther by doing a periodic (quarterly, semi-annually, I’m not sure) telephone random-digit-dial study (generally regarded as the most sound sample recruitment methodology) to weight the demographics of their large panel to Internet user norms.

    comScore, along with their chief competitor NetRatings, is the gold standard for online media research among the greater online ad sector and web publishers. Most of the big web publishers subscribe to them and respect their numbers. All research methodologies can be attacked for some kinds of bias — it’s virtually unavoidable — but comScore’s approach is well respected in the $9.6 billion world of online advertising and web media.

    The methodology for how the blogs in the study were selected was clearly explained in the report. We started with a list of 14,000 blogs, culled from many lists of top blogs, most notably all that were then on the list of TruthLaidBear, and we fed those into comScore which came out with a list of the 400 top ones, including blog hosting services (about two dozen).

    We didn’t draw a distinction between commercial blogs and non-commercial blogs, but we did clearly separate out actual blogs from blog networks. As for CrazyAss13.com, what’s wrong with that? It looks like a normal blog as far as I can tell.

    As for how we categorized the blogs, that was a manual process. I did much of that work myself, along with (the uncredited in the report) Steve Hall of Adrants.

    I think you’re doing your readers a disservice to nitpick. To the extent that you believe I have a high standard for research methodology, I stake my reputation on the fact that this is the soundest analysis that I’m aware of about the composite of blog readers.

    As for the legit question of what % are regular readers versus occasional visitors via Google or elsewhere, as you point out we did address that issue. We also give a rank of top blogs by visits versus visitors, so you can get a sense of whose sites are stickier with readers. Besides, just based on the differences in the characteristics between those designated as blog visitors versus the general online population, it’s clear that the Internet users identified in this analysis have common characteristics as distinct from the overall audience.

  2. Posted August 8, 2005 at 5:12 pm | Permalink

    Thanks for the clarification, Rick. Just for the record, Rick is my friend (or was my friend), and I did NOT notice that he was part of this team. But I swear I read the study closely.

    Ok, responding in order:

    The study is not a survey, but it still falls under the same sampling methodologies that surveys do, i.e. there’s a population you want to study (target) and a way you recruit them (frame). There are the same issues with selection and nonresponse bias as far as I can tell.

    I agree with all of the tactics chosen by comScore, but I can’t really argue with them because they’re not public. Nor should they be, they’re for the private use of the clients, which I am not. Of course there will be bias in any sample, but it’s in comScore’s best interest to hide one if it exists. I just wanted to draw a line between academic studies (for which data is usually public) and marketing (for which it is not). I should have been more explicit about that. There are different motivations behind academics and marketing, and I think there is an important distinction.

    Regarding CrazyAss13.com, take a closer look. How could a site that has been posting only for the past two months have gotten hundreds of thousands of unique visitors? It seems too unlikely to be true.

    And finally, I guess what I was looking for was a breakdown by the number of total visits made by all users in the sample. You show the difference between the top blogs, but I assume that over 50% of the people only came to a blog once, and that’s an important statistic to reveal.

    I apologize for being so harsh, it’s just that time of the thesis writing.

  3. Posted August 8, 2005 at 6:43 pm | Permalink

    Sorry to be cranky myself. Yes, Cameron and I are buddies; as for “friends,” it gets down to complicated definitional issues: http://www.bruner.net/blog/archives/015143.shtml

    Definitional issues are what it’s all about. Market research is an inexact science. I always say it’s best at answering the question “Is it bigger than a bread box?” It’s all directional. But I think this research was directionally quite interesting. And in the bubble I live in (Internet advertising and media research), comScore’s methodology is about as good as it gets. I don’t know why they don’t talk about their “margin of error,” but it would presumably be very small since their panel is so ginormous. With a panel of 1.5 million longitudinal panel members and results weighted to random-digit-dial population surveys, they’ve got the seal of approval from the Advertising Research Foundation and a large market share for online media research. Think “wisdom in crowds.”

    I wish they did have a more detailed methodological statement, but c’est la vie.

    As for Ole CrazyAss, his archives go back over a year: http://crazyass13.com/archive.htm

    He gives his email address and IM handle. And the content looks pretty homemade to me. So what if he’s featuring a lot of glam shots and text ads and porn site links? Lots of bloggers are making a lot of effort to make some cash and appeal to our more trivial instincts. Really, let’s not have a discussion on “what makes a blog a blog”; that’s so 2003.
    ;-)

  4. Posted August 9, 2005 at 2:02 am | Permalink

    Rick, my main beef is seeing freerepublic and democraticunderground both count as blogs. These yin and yang sites are clearly political discussion boards. I’d guess Freerepublic predates blogging by a year or more as well.

    Was there some gain in stats by including highly trafficked discussion boards as blogs?

  5. none
    Posted August 13, 2005 at 3:31 am | Permalink

    Marketscore/comScore spyware warnings, install methods, etc.

    http://security.calpoly.edu/spyware/msproxy.html

    NBC article

    http://www.msnbc.msn.com/id/7546554/

  6. Norman
    Posted October 11, 2005 at 8:58 am | Permalink

    CrazyAss13 isn’t spam. If you look at the old archive the site goes back to 11/6/01, and it looks like he did the site by hand for a long time. He’s only used WordPress for a few months, but the site’s been around for a long time. Not spam.

  7. Posted February 27, 2006 at 8:27 pm | Permalink

    CrazyAss13 isn’t spam. If you look at the old archive the site goes back to 11/6/01, and it looks like he did the site by hand for a long time. He’s only used WordPress for a few months, but the site’s been around for a long time. Not spam.
