Saturday, November 04, 2006

Web 2.0, for the layman!

We often hear the term Web 2.0. What does it mean? Surely no new version of the web has been released, nor is there a particular group of web sites that open only in Firefox 2.0. So what is it?

Web 2.0 refers to a new era of the web that goes well beyond delivering just HTML content to the user. It is a new way of thinking: the web gets smarter and richer as more and more people use it. The current trends on the web are social interaction, user-generated content and blogs, which taken together form the next-generation, user-driven, intelligent web.

Soon the days when people used desktop software that came in versions will be gone. Software has become a service. There are no versions any more, only continuous improvement. Do you know which version of Google Reader or Gmail you use, or what improvements have been made to it lately? The whole software cycle of design-develop-test-ship-install will come to an end. Software will instead rely on the beta development model, where it is continually refined and improved and the users become beta developers. That is Web 2.0.

The description above is taken from an excerpt of the Web 2.0 report by John Musser and Tim O'Reilly, "Web 2.0 Principles and Best Practices".

Also, the Third Annual Web 2.0 Conference takes place from November 7 to 9, 2006.

The term Web 2.0 was coined by O'Reilly Media in 2004 to describe the second generation of web services, where people can collaborate and share information. There are many examples of Web 2.0 services: Orkut and MySpace are two of the most popular social networking sites, Google's spreadsheets and documents are another example, and then there is YouTube for sharing videos and Slideshare.net for sharing slide shows. There is also an Indian startup, burrp, where users write reviews and recommendations for the right spots, be it a pan vaala or a good restaurant... and the list is unending!

I would like to end with a line by Ross Mayfield:

Web 1.0 was commerce, Web 2.0 is people.

Biasing Web Results for Topic Familiarity

This post is based on a research paper from Yahoo! Research, written by Omid Madani and Rosie Jones of Yahoo! and Giridhar Kumaran of the University of Massachusetts. Its premise is that, depending on the user’s familiarity with the search topic, it would be appropriate to give him either introductory or advanced search results.

The findings are based on a four-step procedure. First, advanced and introductory web pages are defined.

An Introductory webpage is defined as a page that doesn’t presuppose any background knowledge of the topic and to an extent introduces or defines key terms in the topic.

An advanced webpage is one that assumes sufficient background knowledge of the topic and probably builds upon it.

Then it is shown, through inter-labeler agreement, that the definitions above hold across a set of people. Three annotators were asked to label randomized sets of results for particular queries and were found to agree about 70% of the time. Their labels also show that search engines are roughly equally biased towards introductory and advanced web pages. The precision for an introductory page at position 1 is slightly more than 0.5, showing that search engines make the top result an introductory one slightly more often than not. The work tries to improve the precision for introductory documents across positions 1 to 10.
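To make the two numbers above concrete, here is a small sketch of how agreement and precision could be computed. It assumes simple pairwise agreement and a plain precision-at-k; the post does not say which agreement measure the paper actually used, and the function names and labels ("intro", "adv") are made up for illustration.

```python
from itertools import combinations


def pairwise_agreement(labels_by_annotator):
    """Fraction of (annotator pair, document) comparisons with equal labels.

    labels_by_annotator: one list of labels per annotator, all in the
    same document order. This is an assumed measure, not necessarily
    the one used in the paper.
    """
    matches = 0
    total = 0
    for first, second in combinations(labels_by_annotator, 2):
        for label_a, label_b in zip(first, second):
            matches += label_a == label_b
            total += 1
    return matches / total


def precision_at_k(ranked_labels, k, target="intro"):
    """Share of the top-k results that carry the target label."""
    top = ranked_labels[:k]
    return sum(1 for label in top if label == target) / len(top)
```

With three annotators, every document contributes three pairwise comparisons, so a single disagreement already costs two of them.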

In one experiment, the introductory and advanced documents were scored with the Fog, Flesch and Kincaid readability indices. All three marked the documents as unreadable and weren’t able to tell the introductory from the advanced ones, showing that reading-level measures alone aren’t enough to distinguish the documents. In another experiment, a query was expanded using introductory trigger words, but this didn’t bring about a significant improvement in the rankings of the introductory documents.
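For reference, the three indices are simple formulas over word, sentence and syllable counts. The sketch below uses the standard published constants; the function names are mine, and counting the syllables and "complex" words (three or more syllables) is left to the caller, since the post does not say how the paper did it.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Gunning Fog index; complex_words are words of 3+ syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))
```

All three depend only on sentence length and word length, which may be why they struggle to separate an introduction from an advanced treatment of the same topic.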

Thus a familiarity classifier was developed using reading-level measures, the distribution of stop words in the text, and non-text features such as the average line length. Once trained, this classifier can label documents as introductory or advanced. It can be used to increase the precision at the top rankings by placing more of the desired documents there. However, relevance can’t be increased this way. But the documents can be classified at crawl time, which addresses this problem too.
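As a rough idea of what such features look like, here is a minimal sketch that computes two of them for a piece of text. The stop-word list is a tiny made-up sample and the feature names are my own; the paper's actual feature set and classifier are not reproduced here.

```python
# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "in", "and"}


def familiarity_features(text):
    """Return two simple features for a document's text."""
    lines = [line for line in text.splitlines() if line.strip()]
    words = text.lower().split()
    # Share of tokens that are stop words.
    stop_fraction = (
        sum(word in STOP_WORDS for word in words) / len(words) if words else 0.0
    )
    # Average length in characters of the non-empty lines.
    avg_line_length = (
        sum(len(line) for line in lines) / len(lines) if lines else 0.0
    )
    return {"stop_fraction": stop_fraction, "avg_line_length": avg_line_length}
```

A feature dictionary like this, together with the readability scores, could then be fed to any standard classifier trained on the annotators' labels.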

The study was able to re-rank the documents, producing a statistically significantly higher proportion of introductory documents in the top 5 and top 10 positions than the baseline search engine retrieval. This kind of topic-independent, user-independent classifier is empowering for personalized search: with a single change to the retrieval re-ranking, any user can specify whether they want introductory or advanced documents for any query.
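One simple way to picture the re-ranking step is below. It assumes each result already carries a label from the classifier and just moves the preferred kind to the front of the top positions; the post does not describe the paper's actual re-ranking method, so treat the function and its parameters as hypothetical.

```python
def rerank(results, prefer="intro", window=10):
    """Move results with the preferred label to the front of the top window.

    results: list of dicts with a "label" key, in the engine's order.
    Only the first `window` results are reordered; the rest are untouched.
    """
    head, tail = results[:window], results[window:]
    # sorted() is stable, so results with the same label keep the
    # engine's original relative order.
    head = sorted(head, key=lambda result: result["label"] != prefer)
    return head + tail
```

Because the sort is stable, the engine's relevance ordering is preserved within each group, which fits the point that re-ranking changes the mix at the top and not the relevance.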

Further work in this area would be to integrate a user profile to infer the user’s knowledge level automatically, so that the user doesn’t have to state explicitly whether he wants advanced or introductory results. Under this scheme, the majority of the 10 results could match the user profile information. If the user clicks on the minority results, it’s clear that he wants the opposite kind of information. The classifier could also include more features that help to better distinguish advanced documents from introductory ones.

The publication is available here.

Friday, November 03, 2006

Useful Keyboard shortcuts for Microsoft Word



C refers to Control
S refers to Shift

C+Home Go to the beginning of the document from anywhere
C+End Go to the end of the document from anywhere
C+S+> Increase the font size of selected text in increments like the drop down font size menu
C+S+< Decrease the font size of selected text in increments like the drop down font size menu
C+S++ Apply superscript formatting
C+= Apply subscript formatting
C+] Increase the font size of selected text by one point
C+[ Decrease the font size of selected text by one point
S+F3 Change case of the letters
C+S+W Underline words but not spaces
C+E Center a paragraph
C+BkSp Delete one word to the left
C+Del Delete one word to the right
C+J Justify a paragraph
C+L Left align a paragraph
C+R Right align a paragraph
C+S+(<--/-->) Extend selection to the beginning/end of a word