Freudian T9

Do you ever feel like your cat can read your mind? Mine can, but that’s beside the point. I think this “technology” that they’re calling “T9” is actually a lot more sophisticated than I’ve given it credit for. In fact I would go so far as to say that “T9” isn’t “predictive,” it’s “mind reading.”

A little bit of background at this point would help. I hate eggs. Don’t get me wrong, I love to bake and I’ll even eat French toast, but the sight, smell and texture of cooked eggs have always made me gag. I have distinct childhood memories of being given eggs and wanting to yack. My mom even says that I used to spit them out as a baby (before the memories started). And I’m not militant about it; I have nothing against people who like to eat unborn fetuses.

Another piece of context that might be relevant to the story (it’s coming, I swear) is that while in Ireland recently I became a pretty skilled text messager. I’m not quitting my job as a grad student just yet, but I can walk down the street, write text messages without looking, and avoid traffic, all while juggling and eating ice cream. But seriously, I don’t look at the phone anymore; I just let T9 do its business and proof the message before I send it.

So I’m walking home today and a friend sends me a message that she’s looking for some eggs. Of course I keep eggs around for baking purposes, but I don’t really use them that often, and I don’t know how old they are. So I write her a message to convey this information, and this is what I got:

i hate eggs but i can’t touch them since i only able with them and i haven’t baked in a while.

Which of course is not what I meant to write, but by golly, it was nearly coherent and pretty much true. The real message, after a couple of corrections:

i have eggs but i can’t vouch for them since i only bake with them and i haven’t baked in a while
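For the curious, here is a rough sketch of why those substitutions happen. It assumes the standard phone keypad layout (2 = abc, 3 = def, and so on) and says nothing about how T9 picks among candidates; the function name t9_keys is my own invention. Each wrong word turns out to use exactly the same key presses as the word I meant.

```python
# Standard phone keypad layout: each digit key covers several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def t9_keys(word):
    """Return the digit sequence pressed to enter `word` (hypothetical helper)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


# The three substitutions from the message: (what I got, what I meant).
pairs = [("hate", "have"), ("touch", "vouch"), ("able", "bake")]
for got, meant in pairs:
    print(got, t9_keys(got), meant, t9_keys(meant))
```

Since both words in each pair map to one digit string, the phone has to guess between them, and it guessed wrong three times in a single sentence.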

I’ve come across some pretty funny T9 substitutions in the process of writing messages, but this is the gosh-darndest, most stupefying experience I’ve ever had. T9 people, I bow down to you.

Beat Rick Fox!

Come on, we all have a reason to hate the Lakers. If it’s not Rick Fox and his beautiful wife, well, it’s Rick Fox again with his “I wear a facemask because I don’t want to hurt my beautiful smile” attitude. Or maybe it’s the way that they always squeak out a lucky victory that makes you want to rethink statistics. Or how about Rick Fox? Did I mention Rick Fox? Well, I hate to break it to them, but if you flip the coin enough times, eventually it’ll come up tails, and the entire country will rejoice in your defeat.

The only other dynasty in my sports-watching years was the Bulls, a charismatic team led by a charismatic Michael Jordan. When MJ pulled up for a jumper at the buzzer you were convinced the Bulls would pull it off, and that luck would not be a factor. How can two teams so similar in prowess create such a different reaction from America? My answer: Rick Fox.

I watched the Blazers lose three rough playoff battles (well, rough for me) to the Lakers, and that kind of anguish is bad for a player like Rasheed Wallace. This guy already gets technical fouls without agitation, and you make him guard Mr. Myself-AND-Vanessa-Williams-think-I’m-hot Rick Fox, and you’re bound to give the poor guy a heart attack. It gave me such pleasure tonight to watch Rasheed embarrass the Lakers (and Rick Fox) wearing a Detroit uniform. I don’t think I’ve ever encountered a team in sports as widely hated as the Lakers, and with this defeat I feel like we can all go to work tomorrow with a little less tension in our shoulders. With only one technical foul tonight, it might even add a few years to Rasheed Wallace’s life (courtesy of Rick Fox).

Metrosexual marketing

Today I received a curious little book from Nordstrom’s called “The Grooming Game,” which features a host of men’s hair and skin products (available online as The Men’s Grooming Guide). Up until now I had thought Nordstrom’s figured me for a woman since they regularly send me lingerie and dainty dress catalogs. This book, though, is an extremely tactful piece of marketing, with ambiguous writing and suggestive photography. The opening page says it all: an image of a man shaving, staring off into the distance with a narcissistic gaze, reaffirmed by the text, “think strategically: forget vanity—sharpening your image lets you take your big self out in the world. If life is a game, why not play? When it comes to grooming, Nordstrom knows the rules.”

Every page features some sort of beauty, er, masculine product, from face care to cologne, even a man-blush (a Jean Paul Gaultier product called Better Than Tan Matte Bronzer). There is also a host of Queer Eye wisdom, such as “look the part: natural may be nice, but sometimes you’ve got to buff your bluff,” or “keep them guessing: maintain your fresh face by moisturizing at night.” For the man with all the products they even offer an Alpha Lipoic Acid Face Firming Activator from N.V. Perricone Cosmeceuticals in a 2 oz. bottle for $95. The question on everyone’s mind: what the hell is a cosmeceutical?

My friend Zach has been saying for quite some time that metrosexualism is the cosmetics companies’ solution to a saturated market. In this age, it’s nearly impossible to sell a woman another product regardless of its quality, innovation or ingredients. What these companies need is a new market, and what better than an untapped population equal in size to women? I’m sure that Queer Eye is just the beginning, and that over the next few years I’ll be inundated with enough eye creams to make me sick (even if I don’t eat them). I’ll be the first to admit that I’ve had a facial before, and I tend to use moisturizer in the morning, but I’m not sure I’m ready for the onslaught of advertising I’ve been able to ignore up until now. It’s just a matter of time before I get my first L’Oréal catalog.

Weblogs and authority

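This week I’ll be presenting a paper at the International Communication Association Conference in New Orleans titled Audience, Structure and Authority in the Weblog Community. The paper is an analysis of two different metrics for measuring authority within weblogs: blogroll degree, the number of links a weblog receives from other weblogs’ blogrolls, and permalink degree, the number of links made to its individual entries.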

The following table shows the top 20 for each measure. One observation is that many of the top ranked sites are community weblogs (e.g. Slashdot or Memepool). These sites play the important role of hubs, maintaining ties to more weblogs than a single person would be able to. They allow information to diffuse quickly between distant parts of the network of readership.

     Blogroll Degree Rank                    Permalink Degree Rank
Rank Links  URL                              Links  URL
1.   2581   metafilter.com                   1322   boingboing.net
2.   2434   slashdot.org                     1270   diveintomark.org
3.   2146   boingboing.net                   1096   metafilter.com
4.   1825   kottke.org                       1073   slashdot.org
5.   1604   instapundit.com                  982    kottke.org
6.   1527   scripting.com                    976    weblog.siliconvalley.com/column/dangillmor
7.   1307   evhead.com                       956    instapundit.com
8.   1220   andrewsullivan.com               828    andrewsullivan.com
9.   1062   memepool.com                     827    themorningnews.org
10.  1007   doc.weblogs.com                  826    rathergood.com
11.  977    megnut.com                       819    textism.com
12.  961    littlegreenfootballs.com/weblog  683    denbeste.nu
13.  899    diveintomark.org                 626    doc.weblogs.com
14.  880    littleyellowdifferent.com        625    asmallvictory.net
15.  848    textism.com                      582    rightwingnews.com
16.  846    rebeccablood.net                 577    microcontentnews.com
17.  758    plasticbag.org                   568    joi.ito.com
18.  737    dashes.com/anil                  560    buzzmachine.com
19.  719    ftrain.com                       553    waxy.org
20.  714    plastic.com                      522    a.wholelottanothing.org

A second observation is that the lists are fairly distinct. While some webloggers hold top positions in both ranks, the lists diverge considerably further down the rankings. While blogrolls tend to support the weblog elders (scripting.com, evhead.com, etc.), permalinks suggest a different set of authors as influencers (joi.ito.com, buzzmachine.com, etc.). Looking at the differential between the ranks in the figure below, it is apparent that as soon as the rank passes 100, the correlation between Blogroll and Permalink rank becomes less defined.

Permalink and Blogroll rank differential
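To make the idea of a rank differential concrete, here is a small sketch that computes it for the top-20 lists in the table above. The sign convention (permalink rank minus blogroll rank) and the function name are my assumptions for illustration; the figure itself covers the full rankings, not just these 20 sites.

```python
# Top-20 lists copied from the table, in rank order.
blogroll = [
    "metafilter.com", "slashdot.org", "boingboing.net", "kottke.org",
    "instapundit.com", "scripting.com", "evhead.com", "andrewsullivan.com",
    "memepool.com", "doc.weblogs.com", "megnut.com",
    "littlegreenfootballs.com/weblog", "diveintomark.org",
    "littleyellowdifferent.com", "textism.com", "rebeccablood.net",
    "plasticbag.org", "dashes.com/anil", "ftrain.com", "plastic.com",
]
permalink = [
    "boingboing.net", "diveintomark.org", "metafilter.com", "slashdot.org",
    "kottke.org", "weblog.siliconvalley.com/column/dangillmor",
    "instapundit.com", "andrewsullivan.com", "themorningnews.org",
    "rathergood.com", "textism.com", "denbeste.nu", "doc.weblogs.com",
    "asmallvictory.net", "rightwingnews.com", "microcontentnews.com",
    "joi.ito.com", "buzzmachine.com", "waxy.org", "a.wholelottanothing.org",
]


def rank_differential(first, second):
    """For sites in both lists, return rank in `second` minus rank in `first`.

    Ranks start at 1. A negative value means the site ranks higher
    (closer to 1) in `second` than in `first`.
    """
    second_rank = {site: rank for rank, site in enumerate(second, start=1)}
    return {
        site: second_rank[site] - rank
        for rank, site in enumerate(first, start=1)
        if site in second_rank
    }


diff = rank_differential(blogroll, permalink)
for site, delta in sorted(diff.items(), key=lambda item: item[1]):
    print(f"{delta:+d}  {site}")
```

Only sites that appear in both lists get a value here, which already hints at how quickly the two rankings part ways.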

This sheds new light on the age-old weblog power law debate. While the blogroll rankings (reflected by Shirky’s original analysis) suggest a model of preferential attachment, many of those weblogs listed in the top permalink ranks are much younger. If the weblog social structure is governed by a law of the “rich getting richer,” we would expect older weblogs to have more influence, and hence more links to their entries.

There are obviously many caveats and details, all of which are listed in the full paper below. Since I’m presenting it this coming Friday, I’d appreciate any feedback you may have.

Full paper: Audience, Structure and Authority in the Weblog Community (pdf 228k)

Popular press and weblogs

While preparing a paper for a conference at the end of the month, I did some research on the coverage of weblogs in the popular press. I queried the LexisNexis database for references to "weblog," "web log," and "blog," resulting in 4051 magazine and newspaper articles from 1998 to the present. The first article, published in the Independent on February 18, 1998, isn’t actually a reference to weblogs as we know them, but rather another invention of the term:

Just how tricky the whole thing is is shown by the many drafts through which that note has already gone. Some of these drafts are available on the Internet, and for those of you unfortunate enough to be without a weblog* , I bring you today some of the first versions of that note to Saddam Hussein.

* Weblog. This is a new Internet word I have made up, which I hope will catch on. If it does, I will work out a meaning for it later.

The second reference is an article published in the Guardian, November 11, 1998, citing Jorn Barger’s Robot Wisdom:

Can computers model the human predicament? John Barger’s page sets out to tackle the idea of ‘robot wisdom’, taking in James Joyce, artificial intelligence and Internet issues along the way. The real gem is the weblog, a daily account of John’s travels around the web. Watch a highly observant and thoughtful surfer at work.

The story behind weblogs becomes more complete when they start receiving attention in mid-1999, with weblog exclusives by Jim McClellan of the Guardian (June 3, 1999) and Dan Gillmor (June 14, 1999). Both of these articles followed shortly after a piece by Scott Rosenberg in Salon (May 28, 1999), which unfortunately is not indexed by LexisNexis.

Weblog citations over time

The chart above shows the number of articles citing weblogs over time, along with the average number of times the term was used per article in each month. The data have been normalized so that they can be seen on the same plot; the maximum value for occurrences of the term per article occurred in October 1999 at 31, and the maximum number of articles was published in April 2004 at 296.
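As a rough illustration of that normalization, the sketch below divides each series by its own peak so that both run from 0 to 1. This is my assumption about a reasonable way to do it, not a record of the exact procedure used for the chart. The monthly values are made up for the example; only the two peaks (31 and 296) come from the data.

```python
def normalize_to_max(values):
    """Scale a series so its largest value becomes 1.0 (assumes a positive peak)."""
    peak = max(values)
    return [value / peak for value in values]


# Hypothetical monthly values; only the peaks (31 and 296) are real.
uses_per_article = [12, 31, 9, 4]
articles_per_month = [3, 5, 120, 296]

print(normalize_to_max(uses_per_article))
print(normalize_to_max(articles_per_month))
```

Once both series share the same 0 to 1 scale, they can be drawn on one set of axes without the article counts flattening the per-article averages.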

The exponential growth of attention to the topic is striking, although it appears in the last month to have tapered off. Comparing this trend with the average number of uses of the term per article, it appears that the more frequently the concept is cited, the fewer times the word is used per article. The obvious interpretation is that the term is slowly becoming part of our vernacular, and when journalists write about weblogs today, much less context is necessary than in 1999. Also, the number of articles exclusively about weblogs is probably on the decline, while stories only tangentially related to weblogs are on the rise.

Another surprising characteristic of the media presentation of weblogs is the oversight of the most popular tools:

Weblog tool    # of articles
Blogger        1913
MovableType    919
LiveJournal    181
DiaryLand      114
Xanga          31

While an extremely large contingent of weblog users rely on the last three tools in this list, nearly all of the attention has gone to Blogger and MovableType. Given that LiveJournal, DiaryLand and Xanga are private communities, it could simply be that the press is not aware of how explosive their growth is.

If you’re interested in working with the data, I’m offering it up in zip (21 MB) and gzip (19 MB) formats. I’ve stripped the HTML of unnecessary cruft, but it could still use being converted to XML.

Spam finger

I feel like my Bayesian spam filter is winning the arms race against spammers, or at least making the filtering process manageable. One of the side effects of having my mail presorted is that I can evaluate which of my email addresses are attracting the most attention. Over the past few months I’ve been watching this statistic very closely, and found that two addresses produce an overwhelming majority of my garbage: mit.edu and uchicago.edu. The irony there is that I never use either address. Where are they harvesting my email from? My best guess is finger.

While companies tend to use more sophisticated directory systems, most universities use finger as an open white pages for students, faculty and administration. In the stone age of the internet, it was ostensibly the only way to find a person’s email address, and it remains the most effective means of tracking down a user of an academic network. In most cases, all one needs is a first or last name and the university they work for. On most unix systems, simply typing finger name@host.edu will return a list of entries in the host.edu database matching "name."
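To show just how little work such a lookup takes, here is a minimal sketch of a finger client. It follows the protocol as described in RFC 1288 (connect over TCP to port 79, send one line, read until the server hangs up). The function names are mine, and host.edu is the placeholder from above, not a real server.

```python
import socket

FINGER_PORT = 79  # TCP port assigned to the finger protocol (RFC 1288)


def build_query(name):
    """A finger query is a single line: the name followed by CRLF."""
    return name.encode("ascii") + b"\r\n"


def finger(name, host, timeout=10.0):
    """Send `name` to the finger server on `host` and return its text reply."""
    with socket.create_connection((host, FINGER_PORT), timeout=timeout) as conn:
        conn.sendall(build_query(name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


# Example call (placeholder host, so it is left commented out):
# print(finger("name", "host.edu"))
```

There is no login, no rate limit in the protocol itself, and nothing stopping a script from looping over a list of names.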

This is a veritable gold mine of data for spammers: current students that will be graduating at some point, starting families, and needing loads of xanax, valium and viagra to cope. All the spammer has to do to tap into the finger database is know a first or last name, query the server, and take the email address. Alternatively, they can just finger all of the names, ranked in descending order of popularity thanks to the 1990 census statistics. Since Cameron is the 336th most common name, it’s no surprise that I’ve been getting a flood of email from my fingerable addresses.

MIT does provide one level of indirection by giving each user an alias, mine being C-marlow. If you turn around and finger C-marlow at mit.edu, MIT responds with all of my contact information. I am in no way a privacy pundit; I just don’t appreciate getting unsolicited email. At this stage in the game, it seems to me that finger must die. Schools that still want to provide a directory service should do it through a web email interface, obscuring the addresses of students and employees. Otherwise they threaten to render their email addresses useless by serving them up wholesale to spammers.

MIT power outage

At about 1pm this afternoon, the entire MIT campus suffered a complete power outage, the first time such an event has occurred in my 5-year tenure here. I was in the lab when the power cut out, and I was lucky enough to catch the sound of some 50 computers just outside my office spinning down simultaneously. The power was off for about three hours, at which point we began picking up the pieces of our poor, beaten network.

MIT maintains its own power cogeneration plant, which supplies the campus and a good part of Cambridge through an exchange with NStar Electric. Talking with friends in the area, it appears that much of the city went offline for some time, but was restored long before MIT powered up. This outage doesn’t seem to appear on the New England power consumption stats, but I assume that is because MIT maintains its own grid. Thankfully, MIT maintains its own statistics on the cogeneration site, which I’ve cached below:

MIT Power Consumption, 5.3.04

There hasn’t been any news yet as to the cause, but I would expect it to at least make the local news (it seems like a pretty major event, considering that it took a full 3 hours to restore power to the campus). It wasn’t anywhere near the magnitude of the recent blackout in New York, but it did have a similar socializing effect, as people crawled out from under their desks and scurried outside into the jarring sunlight. Frankly I’d be happier if they threw the switch on a regular basis, just to test people’s ability to communicate face-to-face, a sort of fire drill for social interaction.

May 4: The MIT newspaper has covered the story, noting that an outage of this magnitude has happened only once in recent history.

The Globe reports that many Verizon customers lost voicemail when their data center in Cambridge went offline yesterday.

Occam’s Razor

After sitting on the Hot Abercrombie Chick story for a week, I still had an unsettling feeling that the story was unresolved. I decided to go back to my initial hunch and check the IP addresses of the comments she posted to a few weblogs. Then I realized that I had been sitting on my own data the whole time: the IP addresses of sites added to Blogdex. Here’s what I’ve found:

   Date		    Source	     IP		  Owner
----------------------------------------------------------
2004-02-11	Blogdex		68.91.65.131	swbell.net
2004-02-21	Anil Dash	65.69.87.117	swbell.net
2004-02-28	Undisclosed	68.90.64.194	swbell.net
2004-04-14	Blogdex		68.89.157.59	swbell.net
2004-04-02	Blogdex		65.69.86.105	swbell.net
2004-04-07	Blogdex		128.252.173.54	wustl.edu

All of these IP addresses originate from St. Louis, Missouri, one directly from Washington University, where Daniel Zeigenbein transferred after leaving Vassar (also compare this to Amanda Doerty’s profile on MSN). While many open questions remain (the motive of the author, the identity of the girl in the picture, etc.), this is enough evidence for me to close the case on this one.

I find a great deal of pleasure in doing research on the internet, especially when it involves unearthing information that is nonobvious or difficult to procure. It’s the same type of excitement that one gets watching Woodward and Bernstein slowly uncover the trail of leads that eventually finds Nixon. I never had any intention of defaming HAC or Daniel Zeigenbein; it was simply a mystery where the facts didn’t add up and the truth should exist somewhere, buried in the internet.

Justin has also posted a more in-depth analysis of the evidence.

Something me

While walking home from work the other day I passed a group of guys emerging from a pizza joint. After a few handshakes and goodbyes they parted ways and made arrangements for their next meeting. And then one of them yelled across the street, "something me on Thursday." His friend looked a little confused, but I knew exactly what he was talking about. He added, "IM, call, email… I don’t care."

Despite our proximity to MIT, these guys did not strike me as the type who wear t-shirts that say “Go away or I’ll replace you with a simple shell script” or tote around Leathermans in their utility belts. These were just normal people with too many ways to talk to each other.

I’m guessing that we have reached some saturation point in communication technology where the actual medium itself has become unimportant. When I thought about the expression, "something me," I realized that we don’t have a satisfactory, general expression for communicating in our common vernacular. It seems like an issue that will only become more important as we add media and devices to the current equation, but at present I can’t come up with anything better. If you have a better idea, something it to me.

Dude, where’s my Google?

I’m not much of a conspiracy theorist, but I have to take notice when events coincide. As many people noticed yesterday, my Hot Abercrombie Chick post had a quick rise to prominence on Google, ranking at #1 for the query "Abercrombie Chick" and #2 for "Hot Abercrombie Chick." Today I was shocked to find that the page is no longer in any of the results for these searches. On the other hand, the post on Wizbangblog still maintains its original rank on both queries. In fact, my site has been removed from Google’s index while his remains. Check the screengrabs (click for more detail):

Search for my post
Search for response post

Google’s policy about page removal is quite explicit:

Except in instances involving legal issues or spam, Google’s policy for removing a page from our index requires that we obtain the permission of that page’s webmaster. This prevents competitors from sabotaging each other’s listings.

I’m assuming my page is not spam. Without any emails from Google, I can only assume that it has been removed for legal reasons. Has Amanda or some other party emailed Google to remove my allegations? I’ve sent an email to Google to inquire.

April 23: The page is back in the index as of sometime this morning. Looking back on the whole incident, it’s pretty amazing that Google had the page in the index within 2 days, along with its PageRank. They have been caching weblogs within a few hours but this is the first time I’ve seen an individual weblog post go online in that amount of time. I now render this post defunct.