OpenSource Science vs. The Journals

Ever since George Soros announced that he would be donating $3 million to the Budapest Open Access Initiative, the debate over e-journals versus traditional journals has been heating up. An article today from the BBC highlights a few critics attacking the net journal initiative.

In many ways, these criticisms are the same ones being made in the debate over peer-to-peer journalism versus Journalism-with-a-capital-J. Those arguments probably shouldn’t be made so hastily, because the two media aren’t necessarily competing. As Physical Review Letters has shown, online journals can provide an entirely different type of information (namely late-breaking results) that augments, rather than undermines, traditional journals.


Internet addiction? Yay!

Perhaps I need to see someone from an Internet/Computer Addiction Service. My job might necessitate anti-depressants.


Society is happy... barely

People are marginally happy, at best, according to Google: “I love my life,” 8280 pages to “I hate my life,” 8070.


Scrabble vowels gone missing

People making games beware: Scrabble vowel shortage revealed, thanks to faulty pseudorandomness. Also a tasty morsel of Scrabble trivia:

Alfred Butts invented Scrabble in 1931 after studying the front page of the New York Times for months to calculate how often each of the 26 letters in the Roman alphabet is used in English words. He settled on an appropriate score and number of tiles for each letter. There are 12 copies of E, the most common letter, and it has a value of 1; there is only one Z, but it has a value of 10.
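For anyone making a tile game, here is a minimal sketch of a sanity check for a tile generator: compare the average number of vowels per opening rack against what the tile counts predict. The full tile table is the standard English Scrabble distribution as I know it (the article above only gives the counts for E and Z), and Python's built-in `random` module stands in for whatever generator a real game would use. It is not the code from the game in the story.

```python
import random

# Standard English Scrabble tile counts (assumed; 100 tiles including 2 blanks).
TILE_COUNTS = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1,
    "?": 2,  # blank tiles
}
VOWELS = set("AEIOU")


def make_bag():
    """Return a fresh list with one entry per physical tile."""
    return [letter for letter, n in TILE_COUNTS.items() for _ in range(n)]


def expected_vowels(rack_size=7):
    """Vowels per rack predicted by the tile counts alone."""
    bag = make_bag()
    vowel_tiles = sum(1 for tile in bag if tile in VOWELS)
    return rack_size * vowel_tiles / len(bag)


def average_vowels(rng, trials=10_000, rack_size=7):
    """Average vowels per opening rack when drawing with the generator `rng`."""
    total = 0
    for _ in range(trials):
        rack = rng.sample(make_bag(), rack_size)  # draw without replacement
        total += sum(1 for tile in rack if tile in VOWELS)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(2002)  # hypothetical seed, only for repeatability
    print("expected:", expected_vowels())
    print("observed:", average_vowels(rng))
```

If the observed average drifts well away from the expected one over many trials, the shuffling or the generator behind it deserves a closer look.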

Journalist Josh has a backlash

Josh had quite a reaction to our panel at SXSW.


NewsBlaster

In today’s SearchDay, Chris Sherman introduced a new project from the Columbia Natural Language Processing group called Newsblaster, an automatic content aggregator, which, unlike Blogdex, actually culls similar content into one descriptive passage. Chris noted:

“If such a system were combined with a URL monitoring service, and seeded with a taxonomy of subjects personally interesting to you, it could effectively create your own web “advisory” service, automatically building directories of promising sites annotated with high-level summaries that would spare you the time of manual searching.”

Sounds to me like the coming of personalized news, the underlying goals of which have always left me a little uneasy. Personalized news tends to converge, as one might expect, on your personal interests. If we take this model to the extreme, then would I ever learn anything entirely new? Something revolutionary? What I want is someone else’s personalized news, someone like myself, but different enough that they will lead me in a new direction. That, my friends, is a weblog. And thankfully, there are lots of those.

To be fair, I’m sure Chris is referring to some hybrid of weblog and news content, which is taking all of this into account. I just had to get the ‘personalized news’ rant off my chest. I feel better now, in case anyone is wondering.


Real research on weblogs

I’ve been swimming in blog data for the past few days, preparing for my upcoming presentation at Sunbelt XXII, a social networks conference. Needless to say, I’ll be working up until the last minute, looking for new insights. I’ll be sure to let you know if I have any :)


Hard-to-describe numbers

who thought that one number could be so powerful?

consider a number so rare that most of math’s difficult enigmas can be solved simply by reciting the number. this number might be represented somewhere in the universe, but even if a person stumbled upon it, they would not be able to identify it. have i successfully made this number sound like an incantation from some numerology text?

well, charley bennett wrote a proof of this number’s existence in his 1979 paper “on random and hard-to-describe numbers” (requires acrobat). the paper is a dense read, even for those versed in computability theory, but if you get stuck, you can skip to the end, where you’ll find gems like this one:

“to know it in detail, one would have to accept its uncomputable digit sequence on faith like words of a sacred text. it embodies an enormous amount of wisdom in a very small space, inasmuch as its first few thousand digits, which could be written on a small piece of paper, contain the answers to more mathematical questions than could be written down in the entire universe…”


“Um yes, let me connect you to the inventor of Lisp”

guy steele, the guy who wrote the book on lisp and co-wrote the book on java, just called my office by accident (wrong number). MIT is a unique place.


Style classification in text

researchers in italy have created a system which can automatically classify texts by their style, with hopes of clearing up debates over attribution for old texts (e.g. francis bacon as william shakespeare).

it’s also a pretty daunting proposition for people in the plagiarism business, such as stephen ambrose. his fans recently rallied around him, blurting support along the lines of “so what if he plagiarized? everyone plagiarizes to some extent!” my gut instinct says that some people plagiarize more than others, and with any luck we’ll be able to prove this statement in the near future.
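for the curious, here is a minimal sketch of how this kind of style comparison could work. it uses a compression-based similarity (normalized compression distance over zlib), which is my own stand-in and not necessarily what the researchers built. the function names and the idea of picking the nearest author sample are assumptions made for illustration.

```python
import zlib


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs to store the text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small when a and b share patterns."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def closest_author(unknown: str, samples: dict) -> str:
    """Return the name whose known sample is nearest to the unknown text.

    `samples` maps an author name to a sample of their writing
    (a hypothetical structure for this sketch).
    """
    return min(samples, key=lambda name: ncd(samples[name], unknown))
```

the intuition: if two texts share vocabulary and phrasing, compressing them together costs little more than compressing one alone. zlib only looks back about 32 KB, so this toy version is meant for short samples; serious attribution work would need much more care.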