Posts Tagged ‘Worth Reading’
Aardvark Publishes A Research Paper Offering Unprecedented Insights Into Social Search
In 1998, Larry Page and Sergey Brin published a paper [PDF] titled “The Anatomy of a Large-Scale Hypertextual Web Search Engine”, in which they outlined the core technology behind Google and the theory behind PageRank. Now, twelve years after that paper was published, the team behind social search engine Aardvark has drafted its own research paper that looks at the social side of search. Dubbed “Anatomy of a Large-Scale Social Search Engine”, the paper has just been accepted to WWW2010, the same conference where the classic Google paper was published.
Aardvark will be posting the paper in its entirety on its official blog at 9 AM PST, and they gave us the chance to take a sneak peek at it. It’s an interesting read to say the least, outlining some of the fundamental principles that could turn Aardvark and other social search engines into powerful complements to Google and its ilk. The paper likens Aardvark to a ‘Village’ search model, where answers come from the people in your social network; Google is part of ‘Library’ search, where the answers lie in already-written texts. The paper is well worth reading in its entirety (and most of it is pretty accessible), but here are some key points:
- On traditional search engines like Google, the ‘long-tail’ of information can be acquired with the use of very thorough crawlers. With Aardvark, breadth of knowledge depends entirely on how many knowledgeable users are on the service. This leads Aardvark to conclude that “the strategy for increasing the knowledge base of Aardvark crucially involves creating a good experience for users so that they remain active and are inclined to invite their friends”. This will likely be one of Aardvark’s greatest challenges.
- Beyond asking you about the topics you’re most familiar with, Aardvark will actually look at your past blog posts, existing online profiles, and tweets to identify what topics you know about.
- If you seem to know about a topic and your friends do too, the system assumes you’re more knowledgeable than if you were the only one in a group of friends to know about that topic.
- Aardvark concludes that while the trust users place in information on engines like Google is related to a source website’s authority, the trust they place in a source on Aardvark is based on intimacy: how they’re connected to the person giving them the information.
- Some parts of the search process are actually easier for Aardvark’s technology than they are for traditional search engines. On Google, when you type in a query, the engine has to pair you up with exact websites that hold the answer to your query. On Aardvark, it only has to pair you with a person who knows about the topic — it doesn’t have to worry about actually finding the answer, and can be more flexible with how the query is worded.
- As of October 2009, Aardvark had 90,361 users, of whom 55.9% had created content (asked or answered a question). The site’s average query volume was 3,167.2 questions per day, with the median active user asking 3.1 questions per month. Interestingly, mobile users are more active than desktop users. The Aardvark team attributes this to users wanting quick, short answers on their phones without having to dig for anything. They also think people are more accustomed to using natural language on their phones.
- The average query length was 18.6 words (median of 13) versus 2.2-2.9 words on a standard search engine. Some of this difference comes from the more natural language people use (with words like “a”, “the”, and “if”). People also tend to add more context to their queries, knowing that a human will read them and that the extra detail will likely lead to a better answer.
- 98.1% of questions asked on Aardvark were unique, compared with between 57 and 63% on traditional search engines.
- 87.7% of questions submitted were answered, and nearly 60% of them were answered within 10 minutes. The median answering time was 6 minutes and 37 seconds, with the average question receiving two answers. 70.4% of answers were rated ‘good’, 14.1% ‘OK’, and 15.5% ‘bad’.
- 86.7% of Aardvark users had been asked by Aardvark to answer a question, of whom 70% actually looked at the question and 38% could answer. 50% of all members had answered a question (including 75% of all users who had ever actually interacted with the site), though 20% of users accounted for 85% of answers.
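To make the ‘Village’ routing idea from the points above a little more concrete, here is a minimal Python sketch. It is not the model from Aardvark’s paper: the `User` class, the scoring functions, the weights (`friend_boost`, the 1.0 / 0.5 / 0.1 connectedness values) and the sample users are all hypothetical, chosen only to illustrate two of the ideas described, namely that expertise counts for more when your friends also know the topic, and that candidates closer to the asker are preferred.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record; not Aardvark's actual data model."""
    name: str
    topics: dict[str, float] = field(default_factory=dict)  # topic -> expertise in [0, 1]
    friends: set[str] = field(default_factory=set)


def expertise(user, topic, users, friend_boost=0.5):
    """Expertise on a topic, boosted by the share of friends who also know it."""
    base = user.topics.get(topic, 0.0)
    if base == 0.0 or not user.friends:
        return base
    knowing = sum(1 for f in user.friends if users[f].topics.get(topic, 0.0) > 0)
    return base * (1 + friend_boost * knowing / len(user.friends))


def connectedness(asker, candidate):
    """Crude intimacy score (assumed values): friend, friend-of-friend, stranger."""
    if candidate.name in asker.friends:
        return 1.0
    if asker.friends & candidate.friends:
        return 0.5
    return 0.1


def rank_answerers(asker, topic, users):
    """Order candidate answerers by expertise times connectedness, best first."""
    scored = []
    for u in users.values():
        if u.name == asker.name:
            continue
        score = expertise(u, topic, users) * connectedness(asker, u)
        if score > 0:
            scored.append((score, u.name))
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Made-up sample data for illustration only.
users = {
    "alice": User("alice", friends={"bob"}),
    "bob": User("bob", {"hiking": 0.6}, {"alice", "carol"}),
    "carol": User("carol", {"hiking": 0.6}, {"bob"}),
    "dave": User("dave", {"hiking": 0.9}),
}
ranking = rank_answerers(users["alice"], "hiking", users)
print(ranking)  # ['bob', 'carol', 'dave']
```

Note that dave has the highest raw expertise but ranks last because he has no connection to alice, which is the trade-off between authority and intimacy that the paper’s summary describes.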
Intel’s FTC response hints at depth of AMD mismanagement
Intel has posted a detailed, 25-page rebuttal to the antitrust complaint that the FTC filed against the chipmaker this past November. It can be paraphrased as: “We deny almost all of the charges, except for the charges that are bogus because the FTC doesn’t even understand the basics of the processor business.”
But much more interesting than the Intel rebuttal (which, like the FTC complaint, is eminently worth reading for educational value) is what some of the material in the document appears to reveal about the depth of AMD’s mismanagement over the past decade. Specifically, the filing quotes some 2004 comments from former AMD marketing chief Henri Richard, who had some surprisingly disparaging things to say about AMD’s processors in an internal memo:
YouTube Launches Real-Time Discussion Search and Tracking
Real-time information is red hot all around the web but it made a surprise appearance on YouTube tonight in the form of real-time search for comments, of all things. YouTube comments are notoriously not worth reading, but now you can search their full text…in real time. There are some very real, potential use-cases crying out for a tool like this. Companies in particular are likely to want to know what people are saying about their names in the comments on YouTube. You name your topic, though: it’s now available for real-time search across viewer discussion.
Real-time search appears to have been rolled out very recently, with no mention on this page. In addition to search results continuously updated à la Facebook’s news feed (“3 new results”), there’s also a frequently updated list of “trending topics” on the search page.

Unfortunately, there are no feeds being published to syndicate these search results into a reader off-site. The regular search on YouTube now has RSS feeds and Google Wonder Wheel data being published, so perhaps comment search will have feeds added soon as well.
Proper nouns will likely be of interest to searchers watching YouTube comments. This could be a popular addition to the toolkits of social media watchers everywhere.
What’s the benefit of serving those results up in real time? For certain search queries you don’t want to wait around to find out there are new results.
What could be next? Presuming this feature is as real as it looks and goes live to the public soon, we’d love to see YouTube support something like the Salmon comment aggregation protocol and publish updates for this and other GData feeds in a real-time syndication format.
Thanks to Tikva Morowati for the tip. Tikva is the Community Platform Director for KGBWeb, a stealth startup made up of ex-Googlers and others in New York City that will likely make a splash among web-watchers later this year.

