YouTube Epidemiology Interface

YouTube recently launched some amazing statistics, hidden under the collapsed “Statistics & Data” header. Instead of a random list of awards, it now shows a timeline of the video’s growth in popularity, along with references to each source. Take for instance the video “Chap-hop History” by Mr. B the Gentleman Player:

Youtube Statistics

In addition to the existing statistics page (including awards and demographics), this interface now shows a chronology of the video’s popularity. In the case of Mr. B, the link first appears on b3ta, then @DJYodaUK and a few other Twitter users, followed by Facebook, more b3ta, and then Planet Gnome. For the first time I feel like I have a clear, concise view of how a piece of content went viral. Take as a counter-example a popular video from this week, The Cat That Betrayed His Girlfriend, whose popularity seems to have existed before the video was on the site (its first source of traffic was searches for the title). Browsing around a bit, it seems as though most big videos get their start from external sources, related videos, and searches.

This feels like a secret view into the inner workings of the internet, but all the pieces are still scattered around. I guess YouTube is the only one that can put them together.

Brijit closes

The long-form article summarization service Brijit decided to close its doors today. It would appear that their business model (paying people to summarize long-form news articles) was not viable. I may be in the minority, but I used the tool as a clipping service, subscribing to feeds of articles about the industry (Facebook, Social Networks, etc.). I’m really sad to see it go; it was one of the only tools I actually used on a regular basis. You can still get access to their archives.

Aaron Swartz has founded a community “for people with large data sets.” It’s essentially a wiki and a few Google Groups centered around three topics:

  • Get: acquiring data through scraping, crawling, or otherwise
  • Process: conversions, queries, algorithms and the like
  • View: visualization in any number of forms

I’m excited by the definition of this community, but a bit skeptical of the scope. My intuition is that communities centered around tools (e.g. Processing) tend to thrive more than general task-focused groups. I really hope the emergent data understanding population finds this tool and helps shape it accordingly. (via waxy)

Orkut to take over MySpace?

Alexa has recently been improving the global coverage of their traffic statistics. Their Global 500 now shows a number of sites that have almost zero attention in the US market (e.g. Baidu, QQ, and Yahoo Japan). Many on this list had a negligible presence on Alexa a year ago; the change is most likely due to Alexa marketing its toolbar in foreign markets.

While I was looking at the list of top 10 global sites, one was extremely startling: Orkut, Google’s social networking service that has been extremely successful in Brazil. Since January of last year, Orkut has grown by a factor of 10, moving from a daily reach of 3,000 to 30,000 per million. Since MySpace’s traffic has been more or less constant over that time period, it’s not surprising that Orkut has covered some major ground towards being the world’s largest social networking service:

Orkut vs. MySpace 1 year

Looking more specifically at the race for top ranked social networking service, it appears that the two will be neck and neck from here on out:

Orkut vs. MySpace

Orkut gets no attention in the US market mainly because their US presence is tiny compared to Facebook, MySpace or even Friendster. If they take the number one spot worldwide, will Americans respond? Google paid $900M to be MySpace’s search provider, a partnership that might lead people to believe that their business interest in social networking was diminishing. Another explanation could be their interest in monetizing a mature social networking service while Orkut continues to grow. As the service continues to drive traffic globally, it is inevitable that the press will take notice, and Google can take this opportunity to grow their domestic user base.

I hadn’t logged in to Orkut for years, but upon returning I realized that very little has changed. Same strange photos, same hearts and ice cubes, same periwinkle-and-purple color scheme. Orkut’s growth reinforces the fact that the value of social networking services, and social software in general, comes from the base of active users, not the set of features they offer.

Flickr Social 101

A certain friend of mine (who shall go unnamed) is living in Kenya. I met him today for lunch and was super psyched to hear about his life, but a little vexed to discover that he has 5000 photos he’s sitting on, a Flickr pro account, and a substantial internet connection. I asked him why he doesn’t upload one photo a day and he responded with, “I use Flickr as storage.” I thought this was completely absurd, until I remembered that some people have never been properly introduced to the social features within Flickr.

What I find most engaging about the product are the lightweight social interactions that allow me to keep up with my friends, even those who don’t have blogs or active MySpace/Facebook accounts. Much of the social functionality, such as searching for people, adding them as contacts, and following their progress, is not immediately obvious in the interface. This short tutorial should get him up to speed, along with anyone else who uses “Flickr for storage”**:

1. Find your friends. First, do a few searches for your connected Flickr friends. These will be the people who told you about Flickr, extol its virtues every time you’re out with them, or use their phone cameras to beam pictures to the internets. Whenever you see someone you know, drag your mouse over their photo, and when a box appears over the photo, press the arrow to expose the secret contact menu. Here you can click the link that says “add them as a contact.” You can also use this menu to get to their contacts page where their friends are listed. Use this list to find any friends you might have in common. Also, when you meet someone you like, exchange Flickr handles. Chances are they’ll be happy to swap photos.

2. Set up alerts. The next step is to make yourself aware of what your friends are doing on Flickr, both in their world and on your photos (you probably didn’t know people were commenting on your photos, did you?!). This is probably because you only go to Flickr when it’s time to dump some photos into the hard drive. There are three feeds of information that you want to track. Find your user id using idgettr and modify these URLs to be your own (right-click, copy the URL and paste it somewhere to replace USERID with your id):

Once you have these three feeds, you need to track them. If you already use an RSS reader, simply add these to your list of feeds. Done. If not, Yahoo! Mail beta users can add them to the feeds box in their mail. For people who don’t know what the hell RSS is, and don’t care, use RMail or RSSFwd to send updates to your email. This should take 5 minutes to set up.
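If you would rather script the USERID swap from step 2 than paste by hand, here is a minimal Python sketch. The three template URLs and their names are placeholders I made up (example.com stands in for the real feed addresses), so substitute the actual links before using it.

```python
# Hypothetical templates: the real feed URLs are not reproduced here,
# and the three feed names are only a guess at what they cover.
FEED_TEMPLATES = {
    "friends_photos": "https://example.com/feeds/friends?user_id=USERID",
    "comments_on_my_photos": "https://example.com/feeds/comments?user_id=USERID",
    "recent_activity": "https://example.com/feeds/activity?user_id=USERID",
}


def personalize_feeds(user_id, templates=FEED_TEMPLATES):
    """Return a copy of the templates with USERID replaced by the given id."""
    return {
        name: url.replace("USERID", user_id)
        for name, url in templates.items()
    }


if __name__ == "__main__":
    # "12345678@N00" is a made-up id in the shape idgettr returns.
    for name, url in personalize_feeds("12345678@N00").items():
        print(name, url)
```

Paste the printed URLs into your RSS reader or email forwarder as described above.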

3. Be social. When you see that your friend has uploaded a picture, say something to them. It might not make their day, but it will probably make them happy to know that you’re watching. It’s a small amount of effort that can have a positive impact on someone’s day. Or, if you’re a little shy in the online world, drop one of their photos in an offline conversation, e.g., “hey man, I thought that picture of you with a tiny 49ers helmet on was nuts! Were you drunk?” The answer will, undeniably, be “no.”

I made this guide so that every time I have this conversation I can tell people to search for “flickr social 101.” Please let me know if you have any comments on how to improve this tutorial and I’ll be happy to incorporate them.

** note this guide does not apply to misanthropes or pop-stars.

LinkedIn to launch answers product

In a MarketWatch story and online video interview, Keith Rabois of LinkedIn revealed that the business network is planning to launch a question/answer service similar to Yahoo! Answers, but directed at business intelligence. The service will take advantage of the identities LinkedIn users have taken time to construct, and utilize the existing social relationships as an incentive to get people to answer questions. From the interview with the head of business development:

Rabois: One of the things we’re doing is transcending the traditional value propositions of LinkedIn. Historically we’ve been focused a lot on hiring, recruiting, and finding new jobs and opportunities. We’re going to be using LinkedIn exploring new opportunities for people to conduct research, business research. One of the most important uses of a professional network is to get intelligence and get business information. For example, if one wanted to know what three changes are going to occur in patent law in the next five years, LinkedIn is a perfect tool to find that answer. Or, what venture capitalists are most appropriate for investing in a sports medical device. So we’re going to have a LinkedIn answers. You can sort of envision a useful version of Yahoo! Answers tied to people’s professional credentials and profile so you can assess the validity and credibility of people’s answers.

Francisco: And how do you get the incentives? How do you provide incentives for people to actively participate in that?

Rabois: Principally social capital. It’s going to work within two degrees, and that means a friend of a friend. So if someone I know is asking a question and I know the possible answer, I’ll be willing to respond because I know the person in common. So I’ll earn some social capital as well as develop a professional set of expertise and reputational devices on our site that allow you to market yourself as an expert in a particular topic.

Answer sites have been coming out of the woodwork lately; after the overwhelming success of Naver and others in Asia, Yahoo! Answers opened a free, public question/answer site in America. Microsoft followed shortly afterward with Live Q&A and then Amazon with Askville. Each of these sites has a slightly different take on the incentives and social dynamics that make up the system, and each hopes to find the magic arrangement that creates high-quality content for its users. Of course the holy grail for these services is to achieve what Naver did, namely taking the #1 share of the search market a year after launching its Q/A product.

LinkedIn presents an interesting player in this game, specifically because they have a substantial amount of information about their users, and because these profiles represent serious, professional concerns. Their opinion of Yahoo! Answers is obvious (“imagine a useful version of Yahoo! Answers…”), and they believe the users of LinkedIn will participate in something much more serious than the current competition. The interview does not mention when they plan to launch the system, but I would expect it to be soon if the head of BD is talking to MarketWatch.

Explanatory algorithms

There is a trend in recommender systems that I think is extremely interesting: systems are starting to explain themselves. The first place I noticed this was at Amazon in their personal recommendations section, at the bottom of a given suggestion:

Amazon recommendation

In this case, Amazon recommended Moon Palace because I had rated another book by Paul Auster. This makes perfect sense: I rated something by an author, so the system recommended other books by the same author. The second place this popped up was at the new social music service iLike. Every time a user views another user’s profile, the system calculates a compatibility score based on how similar your favorite artists are, as shown here:

iLike recommendation

In this case, I share interest in the bands ESG, TV on the Radio, et al. with this user, so our compatibility is high. When I share more popular artists like Miles Davis or Bob Dylan, my compatibility score is lower. This makes sense since rarer bands suggest a closer connection. Last.fm has added a similar feature called Taste-o-meter.
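iLike has not published its formula, so the following is only a sketch of how a rarity-weighted score could work, not their method. It assumes each artist is weighted by the log of how few listeners it has, and that the score is the weighted overlap of two users' favorites. The function names and listener counts are invented for illustration.

```python
import math


def rarity_weight(listeners, total_users):
    """Weight an artist by rarity: fewer listeners gives a larger weight."""
    return math.log(total_users / listeners)


def compatibility(mine, theirs, listener_counts, total_users):
    """Rarity-weighted overlap of two sets of favorite artists (0.0 to 1.0)."""
    combined = mine | theirs

    def weight(artist):
        return rarity_weight(listener_counts[artist], total_users)

    total = sum(weight(a) for a in combined)
    if total == 0:
        return 0.0
    return sum(weight(a) for a in mine & theirs) / total


# Made-up listener counts out of a made-up population of one million users.
COUNTS = {
    "ESG": 1_000,
    "TV on the Radio": 5_000,
    "Miles Davis": 400_000,
    "Bob Dylan": 500_000,
}

me = {"ESG", "Bob Dylan"}
shares_rare_band = {"ESG", "Miles Davis"}
shares_popular_band = {"Bob Dylan", "TV on the Radio"}

rare_score = compatibility(me, shares_rare_band, COUNTS, 1_000_000)
popular_score = compatibility(me, shares_popular_band, COUNTS, 1_000_000)
```

With these invented numbers, the user who shares ESG with me scores well above the user who shares Bob Dylan, which is the behavior described above.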

What’s interesting about these examples is not the algorithm, some augmented form of collaborative filtering, but rather the way that the algorithm explains itself to the user. Many years ago, with the likes of Firefly and CDNow showing off the power of recommender systems, this sort of behavior would have been considered crazy. Showing users elements of how your algorithm works? What if they reverse engineer it and copy your methods and copy your system and steal all your users?!

Not likely. For most intents and purposes, recommender systems are within wiggling distance of each other. Netflix is holding a contest to see if theirs can be improved, offering a cool $1M to anyone who can show a 10% gain over their current algorithm. While the current leaderboard shows the best contenders at a 4% gain over the original algorithm, Netflix does not expect people to make the 10% gain necessary anytime soon, suggesting the contest could run until 2011. But companies like Amazon and iLike are making improvements through the way that these algorithms are explained.
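For a sense of what a 10% gain means: the contest scores entries by root mean squared error (RMSE) between predicted and actual ratings, and the gain is the relative drop in that error. A minimal sketch follows; the function names and all numbers in it are invented, not Netflix data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def percent_gain(baseline_rmse, new_rmse):
    """Relative improvement over the baseline error, as a percentage."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Made-up ratings on a 1-5 star scale.
actual = [4, 3, 5, 2]
baseline = [3.0, 3.0, 4.0, 3.0]
challenger = [3.5, 3.0, 4.5, 2.5]

gain = percent_gain(rmse(baseline, actual), rmse(challenger, actual))
```

Because the target is relative, each extra point of gain means shaving error off an already tuned baseline.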

Explanation creates understanding, and understanding leads to trust.
What if all systems started to take this approach? We mostly assume that search providers keep their ranking algorithms in a 6-foot safe behind a wall of lasers, but at the same time Google is starting to release more information about PageRank through various systems. Someday we might have search results that explain themselves, while keeping the special sauce away from SEO geeks and spammers. Imagine if a top search result said “This result is first because: your search term was in the title, the author is a well known writer, and the host is a reputable newspaper.” I would probably say “that makes sense,” and in turn I would trust that system even more.

Amazon launches answers site

Today I received an invite to join a new community at Amazon called Askville:

You’re Invited!

As a valued Amazon customer, you’ve been specially picked to get an early look at a new website called Askville where you can ask any question on any topic and get real answers from real people. It’s new, and best of all, it’s free!

This site will compete with Yahoo! Answers and Microsoft’s Live Q&A in the free question-answering space, except that it might be able to leverage the Amazon community of experts. For those who have not been following this area, these systems enable knowledge creation by allowing users to ask questions that are then answered by other users in exchange for reputation within the system. The first success in this space was a startup in Korea named Naver that took control of the search market share in a very short period of time.

Amazon’s system is similar to all of its American counterparts, with its large fonts and friendly messaging (“ask.. answer.. meet.. play”), except for a few subtle distinctions:

  • Users are rewarded for asking questions as well as answering them
  • Questions are limited to 5 answers total
  • Best answers are chosen by the group of question asker and answerers, where the asker gets one more vote than the answerers

Probably the most significant change is the flow of the question/answering exchange. In Yahoo! Answers, and elsewhere, answers are shown publicly as they are received; in Askville, answers are hidden from the public until 5 answers have been received. Any discussion or clarification can happen in a public message board attached to the question. After 5 answers have been collected, the group of asker and answerers votes and the whole thing is made public.
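The flow described above can be sketched in a few lines of Python. This is my reading of the rules, not Amazon's implementation: I assume each answerer casts one vote and the asker two, and the class and method names are invented. How Askville breaks ties is unknown to me, so the sketch does not try to settle that.

```python
from collections import Counter

MAX_ANSWERS = 5
ANSWERER_VOTES = 1                 # assumption: one vote per answerer
ASKER_VOTES = ANSWERER_VOTES + 1   # the asker gets one more vote


class Question:
    def __init__(self, asker, text):
        self.asker = asker
        self.text = text
        self.answers = {}  # answerer name -> answer text

    def add_answer(self, answerer, text):
        if len(self.answers) >= MAX_ANSWERS:
            raise ValueError("question already has 5 answers")
        self.answers[answerer] = text

    def public_answers(self):
        """Answers stay hidden until the fifth one has arrived."""
        if len(self.answers) < MAX_ANSWERS:
            return {}
        return dict(self.answers)

    def best_answer(self, ballots):
        """ballots maps each voter to the answerer whose answer they picked."""
        if len(self.answers) < MAX_ANSWERS:
            raise ValueError("voting opens after 5 answers")
        tally = Counter()
        for voter, choice in ballots.items():
            if choice not in self.answers:
                raise ValueError("vote for an unknown answer")
            if voter == self.asker:
                tally[choice] += ASKER_VOTES
            elif voter in self.answers:
                tally[choice] += ANSWERER_VOTES
            else:
                raise ValueError("only the asker and answerers vote")
        if not tally:
            return None
        # Ties fall to whichever tied answer was counted first (arbitrary).
        return tally.most_common(1)[0][0]
```

The extra vote means the asker's pick beats any single dissenting answerer.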

Askville rewards users with “coins,” a virtual currency that will be redeemable in another community named Questville slated for release in early 2007.

The system has given me 25 invitations for other accounts. If you’re interested in trying out the system, shoot me an email.

Update: I apologize, but all of my invitations have been distributed! It seems like the invitations are spreading though, so look for one on a weblog near you…

YouTube adds backlinks

Last week YouTube released a new player along with a few other features across the site (for some reason they have yet to blog about these changes). Personally I liked the look and feel of their old player more, but that is beside the point: the new interface exposes the most popular off-site links to the given video (a.k.a. backlinks), allowing users to discover who is really driving viewership. From my brief interaction, it appears the video must be embedded on the source page, and not simply a text link. Here are the backlinks for their most popular video of all time (Evolution of Dance):

YouTube backlinks

What you find inside this little hidden window is usually not startling: MySpace profiles and so on. But in some cases the backlinks are more interesting. Take for instance the popular Treadmills video by OK Go, apparently the most downloaded music video of all time. In YouTube’s backlinks we find the largest driver of traffic to be a weblog named fugufish, with over 200k links. Who is this mysterious blogger? Were they the first to identify the video, or simply a maven that brought it to a larger audience? For a researcher of diffusion (such as myself), these links are fascinating, and start allowing us to understand how giant internet memes tend to spread.

This move also continues to solidify YouTube’s place in the ever-evolving media ecology of the web, as a platform integrated with other media providers. You can think of this feature as a sort of traffic-share with those sites that drive the most viewership; just as AdSense shares Google’s ad revenue with publishers who use the ads, YouTube will now reward bloggers with traffic for using YouTube as their video platform. YouTube was a pioneer in the platform approach to building an audience, and they continue to innovate.