Adaptive On-Line Page Importance Computation

http://www2003.org/cdrom/papers/refereed/p007/p7-abiteboul.html

A good explanation of the convergence of various algorithms. The paper also describes an adaptive, on-line algorithm for computing page importance. It can be used for focused crawling as well as for search engine ranking.

Authoritative Sources in a Hyperlinked Environment

http://www.cs.cornell.edu/home/kleinber/auth.pdf

HITS is a link-structure analysis algorithm that ranks pages as "authorities" (pages that have many incoming links and provide the best source of information on a given topic) and "hubs" (pages that have many outgoing links and provide useful lists of possibly relevant pages). Ranking is performed at query time. [PDF format]
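
The hub and authority idea described above can be sketched as a short iteration. This is only an illustrative sketch, not the paper's code: the function name `hits`, the dictionary-based graph format, the fixed iteration count, and the toy graph are all assumptions made for the example.

```python
def hits(links, iterations=50):
    """Rough hub/authority iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    Returns (hub, authority) score dicts, each scaled to unit length.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score: sum of the authority scores of pages linked to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, []))
                   for p in pages}

        # Normalize so the scores do not grow without bound.
        a_norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth


# Toy graph: three list-like pages pointing at two content pages.
toy_links = {"h1": ["a", "b"], "h2": ["a", "b"], "h3": ["a"]}
hub_scores, auth_scores = hits(toy_links)
```

In the toy graph, page "a" ends up with the highest authority score because all three list pages point to it, and "h1" and "h2" score higher as hubs than "h3" because they point to more authorities.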

DiscoWeb: Discovering Web Communities Via Link Ana

http://www.cs.rutgers.edu/~davison/discoweb/

This paper describes a prototype system, later known as the Teoma Search Engine. It performs link analysis, loosely based on Kleinberg's method, computed at query time.

Exploiting the Block Structure of the Web for Comp

http://www.stanford.edu/~taherh/papers/blockrank.pdf

A hierarchical approach for computing PageRank. The local PageRanks of the pages on each host are computed independently and then used to compute the global PageRank of the Web graph.

Extrapolation Methods for Accelerating PageRank Co

http://dbpubs.stanford.edu:8090/pub/2003-16

A paper about the computation of PageRank using the standard Power Method and the new Quadratic Extrapolation, which computes the principal eigenvector of the Markov matrix representing the Web link graph with a speedup of about 50-300%.

Finding Authorities and Hubs From Link Structures

http://www10.org/cdrom/papers/314/

A survey of PageRank, HITS and SALSA. It also describes two Bayesian statistical algorithms for ranking hyperlinked documents and the concepts of monotonicity and locality, as well as various notions of distance and similarity between ranking algorithms.

Improved Algorithms for Topic Distillation in Hype

http://gatekeeper.dec.com/pub/DEC/SRC/publications/monika/sigir98.pdf

Addresses topic distillation: given a typical user query, find quality documents related to the query topic. It uses a variation of HITS.

Improvement of HITS-based Algorithms on Web Docume

http://www2002.org/CDROM/refereed/643/

It proposes a new weighted HITS-based method that assigns appropriate weights to in-links of root documents and combines content analysis with HITS-based algorithms.

Improvement to Clever Algorithm

http://www2002.org/CDROM/poster/171.pdf

An improvement to Kleinberg's algorithm. [PDF format]

Larry Page Describes PageRank

http://www-db.stanford.edu/~backrub/pageranksub.ps

PostScript-format slides introducing citation importance ranking, by Larry Page, Google's co-founder.

Link Analysis, Eigenvectors, and Stability

http://www.cs.berkeley.edu/~alicez/ijcai01-linkanalysis.ps

Do HITS and PageRank (and some variations) give stable rankings under small perturbations to the linkage patterns? [PS format]

PageRank as a Random Walk

http://www8.org/w8-papers/2c-search-discover/measuring/measuring.html

A general framework for measuring the quality of an index, with background on PageRank and random walks. Imagine a Web surfer who wanders the Web. At each step, the surfer either jumps to a page chosen uniformly at random from the Web or follows a link chosen from those on the current page.
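
The random-surfer description above corresponds to a simple power iteration. The following is a minimal sketch under stated assumptions, not the paper's implementation: the function name `pagerank`, the jump probability of 0.15, the treatment of pages without out-links (the surfer is assumed to jump uniformly at random), and the toy graph are all choices made for illustration.

```python
def pagerank(links, jump=0.15, iterations=100):
    """Random-surfer scores by repeated redistribution.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format). jump is the assumed probability
    of moving to a uniformly random page instead of following a link.
    """
    pages = sorted(set(links) | {d for ts in links.values() for d in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives an equal share of the random jumps.
        new_rank = {p: jump / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Following a link: split the rest evenly over out-links.
                share = (1.0 - jump) * rank[p] / len(targets)
                for d in targets:
                    new_rank[d] += share
            else:
                # Assumption: a page with no out-links acts as a
                # uniform jump, so no probability mass is lost.
                share = (1.0 - jump) * rank[p] / n
                for d in pages:
                    new_rank[d] += share
        rank = new_rank

    return rank


# Toy graph: a three-page cycle plus one page that only links in.
toy_ranks = pagerank({"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]})
```

The scores form a probability distribution over pages. In the toy graph, page "d" has no incoming links, so it is reached only by random jumps and receives the lowest score.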

PageRank Calculation Techniques

http://dbpubs.stanford.edu:8090/pub/1999-31

Describes efficient techniques for computing PageRank.

PageRank Calculation with Lossy Encoding

http://dbpubs.stanford.edu:8090/pub/2002-58

Lossy encoding for large scale PageRank calculation.

PageRank Computation Methods

http://www2002.org/CDROM/poster/173.pdf

A poster paper by the Stanford database group describing iterative methods for calculating PageRank. [PDF format]

PageRank U.S. Patent 6,285,999

http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=6285999

Lawrence Page's PageRank Patent.

PageRank Used to Characterize Web Structure

http://www.cs.purdue.edu/homes/gopal/prank.pdf

PageRank values on the Web follow a power law. A high in-degree of a node does not imply a high PageRank, and vice versa. [PDF format]

PageRank: A Circuital Analysis

http://www2002.org/CDROM/poster/165.pdf

It presents theoretical results for understanding how PageRank distributes score across the Web. Seven golden rules for building good pages are presented. [PDF format]

Probabilistic Combination of Content and Links

http://research.microsoft.com/copyright/accept.asp?path=http://research.microsoft.com/~sdumais/SIGIR

It introduces a probabilistic model that integrates link topology (used to identify important pages), anchor text (used to augment the text of cited pages), and activation (spread to linked pages). Experiments use the MSN Directory. [PDF format]

SALSA: The Stochastic Approach for Link-Structure

http://www.cs.technion.ac.il/~moran/r/PS/lm-feb01.ps

A focused search algorithm (SALSA) based on Markov chains. It starts with a query on a broad topic, discards useless links, and then weights the remaining terms. A stochastic crawl is used to discover the authorities on this topic. [PS format]

Scaling Personalized Web Search

http://dbpubs.stanford.edu:8090/pub/2002-12

Link popularity algorithms biased toward a user-specified set of interesting pages.

Survey on Google's PageRank

http://pr.efactory.de/

Information on the algorithm, how to increase PageRank, what diminishes it and how to distribute PageRank within a website.

The Clever Project

http://www.almaden.ibm.com/cs/k53/clever.html

The CLEVER search engine incorporates several algorithms that make use of hyperlink structure for discovering information on the Web. It is an extension of the HITS method.

The EigenTrust Algorithm for Reputation Management

http://www.stanford.edu/~sdkamvar/papers/eigentrust.pdf

An eigenvector-based algorithm for calculating reputation in P2P networks and isolating malicious peers. It is related to the PageRank algorithm.

The Intelligent Surfer: Probabilistic Combination

http://www.cs.washington.edu/homes/pedrod/papers/nips01b.pdf

This method uses query dependent importance scores and a probabilistic approach to improve upon PageRank. It pre-computes importance scores offline for every possible text query. [PDF format]

The Missing Link - A Probabilistic Model of Docume

http://www.cs.cmu.edu/~cohn/papers/nips00.pdf

This paper describes a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. [PDF format]

The PageRank Citation Ranking: Bringing Order to t

http://dbpubs.stanford.edu:8090/pub/1999-66

First Stanford paper about PageRank. It is a static ranking, performed at indexing time, which interprets a link from page A to page B as a vote, by page A, for page B. The Web is seen as a directed graph, and votes propagate recursively from node to node. Used by Google.

The Second Eigenvalue of the Google Matrix

http://www.stanford.edu/~taherh/papers/secondeigenvalue.pdf

A mathematical paper about the convergence of the methods used to compute PageRank from the Google matrix.

The World's Largest Matrix Computation

http://www.mathworks.com/company/newsletters/news_notes/clevescorner/oct02_cleve.html

"Google's PageRank is an eigenvector of a matrix of order 2.7 billion"

Topic-Sensitive PageRank

http://dbpubs.stanford.edu:8090/pub/2002-6

Integrates ODP data into the PageRank calculation to perform probabilistic ranking at query time.

Towards Exploiting Link Evolution

http://www.almaden.ibm.com/cs/people/siva/papers/linkevol.ps

It describes how to compute PageRank incrementally when the Web graph's link topology changes. [PS format]

Web Page Scoring Systems for Horizontal and Verti

http://www2002.org/CDROM/refereed/629/

An extension of the "Random Surfer" model. At each step of its traversal of the Web graph, the surfer can jump to a random node, follow a hyperlink, follow a back-link (a hyperlink in the inverse direction), or stay at the same node.
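
The four surfer actions listed above can be illustrated with a small iteration. This is a rough sketch and not the paper's formulation: the function name `surfer_scores`, the fixed action probabilities shared by all pages, and the rule that an unavailable action (no out-links or no back-links) falls back to a random jump are all assumptions made for the example.

```python
def surfer_scores(links, p_jump=0.15, p_forward=0.60,
                  p_back=0.15, p_stay=0.10, iterations=200):
    """Scores for a surfer that can jump, follow a link, follow a
    back-link, or stay. The four probabilities are hypothetical
    values and are expected to sum to 1.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {d for ts in links.values() for d in ts})
    n = len(pages)

    # Invert the graph once so back-links can be followed.
    backlinks = {p: [] for p in pages}
    for src, targets in links.items():
        for dst in targets:
            backlinks[dst].append(src)

    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_score = {p: 0.0 for p in pages}
        jump_mass = 0.0
        for p in pages:
            mass = score[p]
            out_links = links.get(p, [])
            in_links = backlinks[p]
            jump = p_jump

            # Stay at the same node.
            new_score[p] += p_stay * mass

            # Follow a hyperlink, or fall back to a jump if none exists.
            if out_links:
                for d in out_links:
                    new_score[d] += p_forward * mass / len(out_links)
            else:
                jump += p_forward

            # Follow a back-link, or fall back to a jump if none exists.
            if in_links:
                for s in in_links:
                    new_score[s] += p_back * mass / len(in_links)
            else:
                jump += p_back

            jump_mass += jump * mass

        # Random jumps land uniformly on every page.
        for p in pages:
            new_score[p] += jump_mass / n
        score = new_score

    return score


toy_scores = surfer_scores({"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]})
```

Setting `p_back` and `p_stay` to zero and `p_forward` to `1 - p_jump` reduces this sketch to the ordinary random-surfer iteration.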

Web-Trec 8 and PageRank

http://trec.nist.gov/pubs/trec8/papers/acsys.pdf

On the use of PageRank with the TREC-8 Web Track "large" and "small" datasets. [PDF format]

Web-Trec 9 and Link Popularity

http://trec.nist.gov/pubs/trec9/papers/unine9.pdf

On the use of link popularity with the TREC-9 Web Track datasets. [PDF format]

What is this Page Known for? Computing Web Page Re

http://www.cs.ualberta.ca/~drafiei/papers/www9.ps

A generalization of PageRank and of hubs and authorities based on the topic of Web pages. It defines a model in which a surfer can move forward (following an outgoing link) or backward (following an incoming link in the inverse direction). [PS format]