Brijit closes

The long-form article summarization service Brijit decided to close its doors today. It would appear that their business model (paying people to summarize long-form news articles) was not viable. I may be in the minority, but I used the tool as a clipping service, subscribing to feeds of articles about the industry (Facebook, Social Networks, etc.). I’m really sad to see it go; it was one of the only tools I actually used on a regular basis. You can still get access to their archives.


theinfo.org

theinfo.org is a community “for people with large data sets,” founded by Aaron Swartz. It’s essentially a wiki and a few Google Groups centered around three topics:

  • Get: acquiring data through scraping, crawling, or otherwise
  • Process: conversions, queries, algorithms and the like
  • View: visualization in any number of forms

I’m excited by the definition of this community, but a bit skeptical of the scope. My intuition is that communities centered around tools (e.g. Processing) tend to thrive more than general task-focused groups. I really hope the emergent data understanding population finds this tool and helps shape it accordingly. (via waxy)


Orkut to take over MySpace?

Alexa has recently been improving the global coverage of their traffic statistics. Their Global 500 now shows a number of sites that get almost zero attention in the US market (e.g. Baidu, QQ, and Yahoo Japan). Many on this list had a negligible presence on Alexa a year ago; the change is most likely due to Alexa’s marketing of the Alexa Toolbar in foreign markets.

While I was looking at the list of top 10 global sites, one was extremely startling: Orkut, Google’s social networking service that has been extremely successful in Brazil. Since January of last year, Orkut has grown by a factor of 10, moving from a daily reach of 3,000 to 30,000 per million. Since MySpace’s traffic has been more or less constant over that time period, it’s not surprising that Orkut has covered some major ground towards being the world’s largest social networking service:

Orkut vs. MySpace 1 year

Looking more specifically at the race for top ranked social networking service, it appears that the two will be neck and neck from here on out:

Orkut vs. MySpace

Orkut gets no attention in the US market mainly because their US presence is tiny compared to Facebook, MySpace or even Friendster. If they take the number one spot worldwide, will Americans respond? Google paid $900M to be MySpace’s search provider, a partnership that might lead people to believe that their business interest in social networking was diminishing. Another explanation could be their interest in monetizing a mature social networking service while Orkut continues to grow. As the service continues to drive traffic globally, it is inevitable that the press will take notice, and Google can take this opportunity to grow their domestic user base.

I hadn’t logged in to Orkut for years, but upon returning I realized that very little has changed. Same strange photos, same hearts and ice cubes, same periwinkle-and-purple color scheme. Orkut’s growth reinforces the fact that the value of social networking services, and social software in general, comes from the base of active users, not the set of features they offer.


Flickr Social 101

A certain friend of mine (who shall go unnamed) is living in Kenya. I met him today for lunch and was super psyched to hear about his life, but a little vexed to discover that he has 5000 photos he’s sitting on, a Flickr pro account, and a substantial internet connection. I asked him why he doesn’t upload one photo a day and he responded with, “I use Flickr as storage.” I thought this was completely absurd, until I remembered that some people have never been properly introduced to the social features within Flickr.

What I find most engaging about the product are the lightweight social interactions that allow me to keep up with my friends, even those who don’t have blogs or active MySpace/Facebook accounts. Much of the social functionality, such as searching for people, adding them as contacts, and following their progress, is not immediately obvious in the interface. This short tutorial should get him up to speed, along with anyone else who uses “Flickr for storage”**:

1. Find your friends. First, do a few searches for your connected Flickr friends. These will be the people who told you about Flickr, extol its virtues every time you’re out with them, or use their phone cameras to beam pictures to the internets. Whenever you see someone you know, drag your mouse over their photo, and when a box appears over the photo, press the arrow to expose the secret contact menu. Here you can click the link that says “add them as a contact.” You can also use this menu to get to their contacts page where their friends are listed. Use this list to find any friends you might have in common. Also, when you meet someone you like, exchange Flickr handles. Chances are they’ll be happy to swap photos.

2. Set up alerts. The next step is to make yourself aware of what your friends are doing on Flickr, both in their world and on your photos (you probably didn’t know people were commenting on your photos, did you?!). This is probably because you only go to Flickr when it’s time to dump some photos into the hard drive. There are three feeds of information that you want to track. Find your user id using idgettr and modify these URLs to be your own (right-click, copy the URL and paste it somewhere to replace USERID with your id):

Once you have these three feeds, you need to track them. If you already use an RSS reader, simply add these to your list of feeds. Done. If not, Yahoo! Mail beta users can add them to the feeds box in their mail. For people who don’t know what the hell RSS is, and don’t care, use RMail or RSSFwd to send updates to your email. This should take 5 minutes to set up.

3. Be social. When you see that your friend has uploaded a picture, say something to them. It might not make their day, but it will probably make them happy to know that you’re watching. It’s a small amount of effort that can have a positive impact on someone’s day. Or, if you’re a little shy in the online world, drop one of their photos in an offline conversation, e.g., “hey man, I thought that picture of you with a tiny 49ers helmet on was nuts! Were you drunk?” The answer will, undeniably, be “no.”
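The URL edit in step 2 is just a find-and-replace, so it is easy to script. Here is a minimal Python sketch. The three templates are hypothetical placeholders on example.com (the real feed links are not reproduced here), and the user id is made up; swap in the actual feed URLs and the id that idgettr gives you.

```python
# Hypothetical stand-ins for the three feed links; these are NOT real Flickr URLs.
FEED_TEMPLATES = [
    "https://feeds.example.com/one?user_id=USERID",
    "https://feeds.example.com/two?user_id=USERID",
    "https://feeds.example.com/three?user_id=USERID",
]


def personalize(templates, user_id):
    """Return the feed URLs with the USERID placeholder filled in."""
    return [template.replace("USERID", user_id) for template in templates]


# "00000000@N00" is a made-up id, used only to show the shape of the result.
my_feeds = personalize(FEED_TEMPLATES, "00000000@N00")
for url in my_feeds:
    print(url)
```

Paste the printed URLs into your feed reader or one of the email forwarders mentioned in step 2.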

I made this guide so that every time I have this conversation I can tell people to search for “flickr social 101.” Please let me know if you have any comments on how to improve this tutorial and I’ll be happy to incorporate them.

** note this guide does not apply to misanthropes or pop-stars.


LinkedIn to launch answers product

In a MarketWatch story and online video interview, Keith Rabois of LinkedIn revealed that the business network is planning to launch a question/answer service similar to Yahoo! Answers, but directed at business intelligence. The service will take advantage of the identities LinkedIn users have taken time to construct, and utilize the existing social relationships as an incentive to get people to answer questions. From the interview with the head of business development:

Rabois: One of the things we’re doing is transcending the traditional value propositions of LinkedIn. Historically we’ve been focused a lot on hiring, recruiting, and finding new jobs and opportunities. We’re going to be using LinkedIn exploring new opportunities for people to conduct research, business research. One of the most important uses of a professional network is to get intelligence and get business information. For example, if one wanted to know what three changes are going to occur in patent law in the next five years, LinkedIn is a perfect tool to find that answer. Or, what venture capitalists are most appropriate for investing in a sports medical device. So we’re going to have a LinkedIn answers. You can sort of envision a useful version of Yahoo! Answers tied to people’s professional credentials and profile so you can assess the validity and credibility of people’s answers.

Francisco: And how do you get the incentives? How do you provide incentives for people to actively participate in that?

Rabois: Principally social capital. It’s going to work within two degrees, and that means a friend of a friend. So if someone I know is asking a question and I know the possible answer, I’ll be willing to respond because I know the person in common. So I’ll earn some social capital as well as develop a professional set of expertise and reputational devices on our site that allow you to market yourself as an expert in a particular topic.

Answer sites have been coming out of the woodwork lately; after the overwhelming success of Naver and others in Asia, Yahoo! Answers opened a free, public question/answer site in America. Microsoft followed shortly afterward with Live Q&A and then Amazon with Askville. Each of these sites has a slightly different take on the incentives and social dynamics that make up the system, and each hopes to find the magic arrangement that creates high-quality content for its users. Of course the holy grail for these services is to achieve what Naver did, namely gaining the #1 search market share a year after launching their Q/A product.

LinkedIn presents an interesting player in this game, specifically because they have a substantial amount of information about their users, and because these profiles represent serious, professional concerns. Their opinion of Yahoo! Answers is obvious (“envision a useful version of Yahoo! Answers…”), and they believe the users of LinkedIn will participate in something much more serious than the current competition. The interview does not mention when they plan to launch the system, but I would expect it to be soon if the head of BD is talking to MarketWatch.


Explanatory algorithms

There is a trend in recommender systems that I think is extremely interesting: systems are starting to explain themselves. The first place I noticed this was at Amazon in their personal recommendations section, at the bottom of a given suggestion:

Amazon recommendation

In this case, Amazon recommended Moon Palace because I had rated another book by Paul Auster. This makes perfect sense: I rated something by an author, so the system recommended other books by the same author. The second place this popped up was at the new social music service iLike. Every time a user views another user’s profile, the system calculates a compatibility score based on how similar your favorite artists are, as shown here:

iLike recommendation

In this case, I share interest in the bands ESG, TV on the Radio, et al. with this user, so our compatibility is high. When I share more popular artists like Miles Davis or Bob Dylan, my compatibility score is lower. This makes sense since rarer bands suggest a closer connection. Last.fm has added a similar feature called Taste-o-meter.
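I don’t know how iLike actually computes its score, but the behavior described above (rare shared artists count for more than popular ones) is easy to reproduce. The sketch below is one way to do it, not iLike’s method: it assumes an IDF-style weight, log(total users / listeners), and a weighted overlap of the two favorite lists. The listener counts are invented for illustration.

```python
import math


def rarity_weight(artist, listener_counts, total_users):
    """Weight an artist by how few users list it (an IDF-style weight)."""
    return math.log(total_users / listener_counts[artist])


def compatibility(favorites_a, favorites_b, listener_counts, total_users):
    """Weighted overlap of two sets of favorite artists, between 0 and 1."""
    def total_weight(artists):
        return sum(rarity_weight(a, listener_counts, total_users) for a in artists)

    union_weight = total_weight(favorites_a | favorites_b)
    if union_weight == 0:
        return 0.0
    return total_weight(favorites_a & favorites_b) / union_weight


# Invented listener counts out of 10,000 hypothetical users.
TOTAL_USERS = 10_000
LISTENERS = {"ESG": 50, "TV on the Radio": 200, "Miles Davis": 5_000, "Bob Dylan": 8_000}

me = {"ESG", "Bob Dylan"}
rare_match = {"ESG", "Miles Davis"}           # shares a rare band with me
popular_match = {"Bob Dylan", "Miles Davis"}  # shares a popular artist with me

rare_score = compatibility(me, rare_match, LISTENERS, TOTAL_USERS)
popular_score = compatibility(me, popular_match, LISTENERS, TOTAL_USERS)
print(rare_score > popular_score)
```

Both comparisons share exactly one artist, yet the ESG match scores far higher than the Bob Dylan match, which is the effect described above.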

What’s interesting about these examples is not the algorithm, some augmented form of collaborative filtering, but rather the way that the algorithm explains itself to the user. Many years ago, with the likes of Firefly and CDNow showing off the power of recommender systems, this sort of behavior would have been considered crazy. Showing users elements of how your algorithm works? What if they reverse engineer it and copy your methods and copy your system and steal all your users?!

Not likely. For most intents and purposes, recommender systems are within wiggling distance of each other. Netflix is holding a contest to see if theirs can be improved, offering a cool $1M to anyone who can show a 10% gain over their current algorithm. While the current leaderboard shows the best contenders at a 4% gain over the original algorithm, Netflix does not expect people to make the 10% gain necessary anytime soon, suggesting the contest could run until 2011. But companies like Amazon and iLike are making improvements through the way that these algorithms are explained.
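For a sense of what a “10% gain” means: the Netflix contest scores entries by root mean squared error (RMSE) on held-out ratings, and the gain is the relative drop in RMSE against the baseline. A small sketch of that arithmetic follows. The ratings and predictions are made up and tiny, so the resulting number is unrealistically large.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))


def percent_gain(baseline_rmse, candidate_rmse):
    """Relative RMSE improvement of a candidate over a baseline, in percent."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up star ratings and two made-up sets of predictions.
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 3]
candidate_predictions = [3.5, 3, 4.5, 2.5]

gain = percent_gain(rmse(baseline_predictions, actual),
                    rmse(candidate_predictions, actual))
print(f"gain over baseline: {gain:.1f}%")
```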

Explanation creates understanding, and understanding leads to trust.
What if all systems started to take this approach? We mostly assume that search providers keep their ranking algorithms in a 6-foot safe behind a wall of lasers, but at the same time Google is starting to release more information about PageRank through various systems. Someday we might have search results that explain themselves, while keeping the special sauce away from SEO geeks and spammers. Imagine if a top search result said “This result is first because: your search term was in the title, the author is a well known writer, and the host is a reputable newspaper.” I would probably say “that makes sense,” and in turn I would trust that system even more.
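Here is a toy sketch of what a self-explaining ranker could look like. Everything in it is hypothetical: the three signals are the ones from my imagined explanation, and the weights are arbitrary. The point is the shape: each signal carries a human-readable label, and the ranker returns the labels that fired alongside the score.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    title: str
    author_is_known: bool
    host_is_reputable: bool


# Hypothetical signals: (label shown to the user, secret weight, test).
SIGNALS = [
    ("your search term was in the title", 3.0,
     lambda page, query: query.lower() in page.title.lower()),
    ("the author is a well known writer", 2.0,
     lambda page, query: page.author_is_known),
    ("the host is a reputable newspaper", 1.0,
     lambda page, query: page.host_is_reputable),
]


def score_with_reasons(page, query):
    """Return (score, labels of the signals that fired) for one page."""
    fired = [(label, weight) for label, weight, test in SIGNALS if test(page, query)]
    return sum(weight for _, weight in fired), [label for label, _ in fired]


def explained_search(pages, query):
    """Rank pages by score, keeping each page's reasons next to it."""
    scored = [(page, *score_with_reasons(page, query)) for page in pages]
    return sorted(scored, key=lambda item: item[1], reverse=True)


pages = [
    Page("https://blog.example.org/misc", "Weekend notes", False, False),
    Page("https://news.example.com/orkut", "Orkut growth in Brazil", True, True),
]
top_page, top_score, top_reasons = explained_search(pages, "orkut")[0]
explanation = "This result is first because: " + ", ".join(top_reasons)
print(explanation)
```

Note that the explanation exposes only the labels, never the weights, so the special sauce stays in the safe.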


Amazon launches answers site

Today I received an invite to join a new community at Amazon called Askville:

You’re Invited!

As a valued Amazon customer, you’ve been specially picked to get an early look at a new website called Askville where you can ask any question on any topic and get real answers from real people. It’s new, and best of all, it’s free!

This site will compete with Yahoo! Answers and Microsoft Q&A in the free question-answering space except that it might be able to leverage the Amazon community of experts. For those that have not been following this area, these systems enable knowledge creation by allowing users to ask questions that are then answered by other users in exchange for reputation within the system. The first success in this space was a startup in Korea named Naver that took control of the search market share in a very short period of time.

Amazon’s system is similar to all of its American counterparts, with its large fonts and friendly messaging (“ask.. answer.. meet.. play”), except for a few subtle distinctions:

  • Users are rewarded for asking questions as well as answering them
  • Questions are limited to 5 answers total
  • Best answers are chosen by the group of question asker and answerers, where the asker gets one more vote than the answerers

Probably the most significant change is the flow of the question/answering exchange. In Yahoo! Answers, and elsewhere, answers are shown publicly as they are received; in Askville, answers are hidden from the public until 5 answers have been received. Any discussion or clarification can happen in a public message board attached to the question. After 5 answers have been collected, the group of asker and answerers votes and the whole thing is made public.
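To make the flow concrete, here is a small model of a single question as I understand the rules. It is a sketch, not Askville’s implementation. Two details are my assumptions: each answerer gets one vote and the asker gets two, and nobody votes twice. The class and method names are invented.

```python
from collections import Counter

ANSWER_LIMIT = 5      # from the rules above: a question takes 5 answers
ANSWERER_VOTES = 1    # assumption: one vote per answerer...
ASKER_VOTES = 2       # ...and one more than that for the asker


class Question:
    def __init__(self, asker, text):
        self.asker = asker
        self.text = text
        self.answers = {}       # answerer name -> answer text
        self.tally = Counter()  # answerer name -> votes received
        self.voted = set()      # participants who have already voted

    def add_answer(self, answerer, text):
        if len(self.answers) >= ANSWER_LIMIT:
            raise ValueError("this question already has its 5 answers")
        self.answers[answerer] = text

    def is_complete(self):
        return len(self.answers) == ANSWER_LIMIT

    def visible_answers(self):
        """Answers stay hidden until all 5 have been received."""
        return dict(self.answers) if self.is_complete() else {}

    def vote(self, voter, choice):
        if not self.is_complete():
            raise ValueError("voting opens after the fifth answer")
        if voter != self.asker and voter not in self.answers:
            raise ValueError("only the asker and the answerers may vote")
        if voter in self.voted or choice not in self.answers:
            raise ValueError("invalid or repeated vote")
        self.voted.add(voter)
        self.tally[choice] += ASKER_VOTES if voter == self.asker else ANSWERER_VOTES

    def best_answer(self):
        """Return the answerer with the most votes, or None before any vote."""
        return self.tally.most_common(1)[0][0] if self.tally else None


question = Question("ann", "What is a good first telescope?")
for name in ["bob", "cat", "dan", "eve"]:
    question.add_answer(name, f"answer from {name}")
hidden = question.visible_answers()   # only 4 answers so far
question.add_answer("fay", "answer from fay")
question.vote("ann", "bob")           # the asker's choice carries extra weight
question.vote("cat", "dan")
print(question.best_answer())
```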

Askville rewards users with “coins,” a virtual currency that will be redeemable in another community named Questville slated for release in early 2007.

The system has given me 25 invitations for other accounts. If you’re interested in trying out the system, shoot me an email.

Update: I apologize, but all of my invitations have been distributed! It seems like the invitations are spreading though, so look for one on a weblog near you…


YouTube adds backlinks

Last week YouTube released a new player along with a few other features across the site (for some reason they have yet to blog about these changes). Personally I liked the look and feel of their old player more, but that is beside the point: the new interface exposes the most popular off-site links to the given video (a.k.a. backlinks), allowing users to discover who is really driving viewership. From my brief interaction, it appears the video must be embedded on the source page, and not simply a text link. Here are the backlinks for their most popular video of all time (Evolution of Dance):

YouTube backlinks

What you find inside this little hidden window is usually not startling: MySpace profiles, i-am-bored.com, and so on. But in some cases the backlinks are more interesting. Take for instance the popular Treadmills video by OK Go, apparently the most downloaded music video of all time. In YouTube’s backlinks we find the largest driver of traffic to be a weblog named fugufish, with over 200k links. Who is this mysterious blogger? Were they the first people to identify the video, or simply a maven that brought it to a larger audience? For a researcher of diffusion (such as myself), these links are fascinating, and start allowing us to understand how giant internet memes tend to spread.
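YouTube hasn’t said how the list is built, but a backlink table like this can be derived from nothing more than the referring page of each embedded view. A rough sketch, assuming a hypothetical log of referrer URLs (all the URLs below are invented):

```python
from collections import Counter
from urllib.parse import urlparse


def normalize(url):
    """Drop the query string and fragment so each page is counted once."""
    return urlparse(url)._replace(query="", fragment="").geturl()


def top_backlinks(referrer_urls, limit=5):
    """Return the pages that drove the most embedded views, most popular first."""
    return Counter(normalize(url) for url in referrer_urls).most_common(limit)


# Invented referrer log: one entry per embedded view of a single video.
referrers = [
    "http://blog.example.com/post?utm=feed",
    "http://blog.example.com/post",
    "http://profile.example.org/user",
    "http://blog.example.com/post#comments",
]
for page, views in top_backlinks(referrers):
    print(views, page)
```

Attach a timestamp to each view and the same log would also tell you who embedded the video first, which is the question a diffusion researcher really wants answered.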

This move also continues to solidify YouTube’s place in the ever-evolving media ecology of the web, as a platform integrated with other media providers. You can think of this feature as a sort of traffic-share with those sites that drive the most viewership; just as AdSense shares Google’s ad revenue with publishers who use the ads, YouTube will now reward bloggers with traffic for using YouTube as their video platform. YouTube was a pioneer in the platform approach to building an audience, and they continue to innovate.


Privacy and transparency

Recently Facebook has introduced a new feature that has attracted a lot of attention among users and bloggers. This piece of the system, called the News Feed, shows you the activity of your contacts within Facebook. If your friend posts to their blog, uploads a photo, attends an event or changes almost anything in their profile, these show up in your news feed. Many of the users have voiced complaints, saying that this is an infringement of their privacy, namely exposing their activities in a way that makes it easy for people to track their behavior.

Why would Facebook implement such a feature? My guess is that they hope this new level of transparency will cause people to be more active. For instance, if I see that my friend is attending an event, I might choose to join them and attend as well. This type of event log is not new; Upcoming has had a similar feature for many months now. In both cases these logs detail events happening around the system that could be observed otherwise, but in a form that is much more easily consumed. Why do Facebook users care? The system has now made people’s actions too transparent, in a way that limits their ability to express themselves without fear of their privacy being breached. In fact, this new feature may have the opposite effect from the one Facebook anticipated: users might start censoring their actions in order to avoid being noticed.

As another example, take the political campaign contribution website Fundrace.org. Thanks to campaign financing laws, all contributions of over $200 to a political campaign or party must be made public. These data are collected by the Federal Election Commission (FEC) and made available in electronic form for download. Eyebeam researchers took the files provided by the FEC, indexed them, and made a search engine available to the web. Now anyone could easily find contributors by their last name or address.

One result of this simple transformation was that campaign contributors thought their privacy had been breached. Even though their contributions are required to be public, that does not mean that they are required to be indexed so that people can be easily found. In this case Eyebeam made the FEC data more transparent, and as a result, those who contributed felt betrayed.
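It is worth seeing how small that transformation is. The sketch below is not Eyebeam’s code, and the records and field names are invented rather than taken from the FEC file format. It only shows that a lookup table keyed by last name or address turns a bulk download into a people finder in a few lines.

```python
from collections import defaultdict

PUBLIC_THRESHOLD = 200  # from the post: contributions over $200 are public

# Invented records; the field names are assumptions, not the FEC layout.
RECORDS = [
    {"last_name": "Smith", "address": "1 Main St", "recipient": "Party A", "amount": 250},
    {"last_name": "Jones", "address": "9 Oak Ave", "recipient": "Party B", "amount": 1000},
    {"last_name": "Smith", "address": "4 Elm Rd", "recipient": "Party B", "amount": 150},
]


def build_index(records, field):
    """Group the public records by a normalized value of one field."""
    index = defaultdict(list)
    for record in records:
        if record["amount"] > PUBLIC_THRESHOLD:
            index[record[field].strip().lower()].append(record)
    return index


def search(index, term):
    """Look up every public record matching the search term."""
    return index.get(term.strip().lower(), [])


by_last_name = build_index(RECORDS, "last_name")
by_address = build_index(RECORDS, "address")

matches = search(by_last_name, "smith")
print(matches)
```

The data is identical before and after; only the cost of finding a particular person has changed.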

In developing social software, there is an inherent tradeoff between transparency and privacy; finding the correct balance is a demanding task, and one that needs to be carefully user-tested. While the overall benefit to a system might be positive, some features will cause some users to be angry, while others will result in serious privacy infringements. While the benefits of transparency can be huge, users must feel safe and protected at all times, and the transition from comfort to discomfort can happen in a matter of seconds.

Online communities just a few years ago were mainly opaque about user activities, most probably in protection of EULAs and privacy advocacy. Now nearly every Web 2.0 company alerts you when people have affected your account, such as someone adding you as a contact in Flickr or MySpace. It’s only now that we’re pushing up against this line of transparency, and I would expect that in the next few years a set of best practices will evolve as to which features are admissible and which are not.