Index Structure For Metadata Extracted From Large Hypertext Collections
Publisher
Department of Computer Science
Abstract
A growing amount of hypertext data can be found in various contexts, such as weblogs and
online journals, intranet webs, the World Wide Web (WWW), online communities,
intra-organizational wikis, and other collaborative content management platforms.
In such collections, the combination of content and hyperlink structure reflects useful
information about various phenomena, such as the existence of cyber communities, the
documents similar to a given document, the popularity and importance of documents, and the
probability of reaching a document from any other document by following a sequence of
hyperlinks. These can all be determined by analyzing a hypertext web, so different kinds of
analysis can be performed on hypertext collections. Such analysis requires locating
information in the hypertext collection, which in turn requires an index. Since a hypertext
database is large, the index structure must be efficient. Keys are used both to construct
the index and to search it. The URLs of web pages are used as keys to construct the index
for hypertext collections. Since URLs vary in length, an index that supports
variable-length keys is needed. To meet these requirements, a multilevel index supporting
variable-length keys has been constructed as an index for hypertext collections.
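
The abstract does not describe the layout of the index, so the following is only a minimal sketch of one way a multilevel index over variable-length URL keys could work, not the structure actually built in this work. It assumes a static, in-memory, sparse index: the sorted URLs form the leaf level, and each higher level keeps the first key of every block of the level below. The class name `MultilevelIndex`, the `fanout` parameter, and the example URLs are all hypothetical.

```python
from bisect import bisect_right


class MultilevelIndex:
    """Static sparse multilevel index over sorted, variable-length string keys.

    Illustrative assumption only: the source does not specify this design.
    """

    def __init__(self, entries, fanout=4):
        if fanout < 2:
            raise ValueError("fanout must be at least 2")
        self.fanout = fanout
        # Leaf level: unique URLs in sorted order, with their metadata alongside.
        items = sorted(dict(entries).items())
        self.keys = [url for url, _ in items]
        self.values = [meta for _, meta in items]
        # Each upper level holds the first key of every block of the level below.
        self.levels = []
        level = self.keys
        while len(level) > fanout:
            level = level[::fanout]
            self.levels.append(level)
        self.levels.reverse()  # levels[0] is now the smallest (top) level

    def get(self, url, default=None):
        chain = self.levels + [self.keys]
        lo, hi = 0, len(chain[0])
        for depth, level in enumerate(chain):
            # Rightmost entry in the current block whose key is <= url.
            pos = bisect_right(level, url, lo, hi) - 1
            if pos < lo:
                return default  # url sorts before every key in the index
            if depth == len(chain) - 1:
                return self.values[pos] if level[pos] == url else default
            # Descend into the block of the next level that this entry covers.
            lo = pos * self.fanout
            hi = min(lo + self.fanout, len(chain[depth + 1]))
        return default


# Hypothetical usage: URLs of different lengths mapped to page metadata.
pages = {
    "http://a.example/": {"outlinks": 3},
    "http://a.example/blog/2009/index.html": {"outlinks": 12},
    "http://b.example/wiki/Main_Page": {"outlinks": 40},
}
index = MultilevelIndex(pages, fanout=2)
print(index.get("http://b.example/wiki/Main_Page"))
```

Because Python strings compare lexicographically regardless of length, the variable-length keys need no padding here. A disk-based index would additionally have to store key lengths or offsets within each block, which this sketch leaves out.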
