NEW APPROACHES: VISUALIZATION
HYPERTEXTUAL RETRIEVAL
An emerging discipline in information retrieval
from large hypertext corpora proposes using the links of the World Wide
Web to improve the precision of search engines. Sullivan (1998) gives a
brief introduction in "Counting Clicks and Looking at Links".
Several promising projects are under way, and all of
them could have a deep impact on scientometric research on the Internet,
such as:
• BIRD
is a bibliometric query-by-example search engine. Given a set of pages
of interest to the user, it retrieves a set of similar documents by
following citation paths that pass through those given documents.
• The CLEVER
project builds on Jon Kleinberg's HITS (Hypertext-Induced
Topic Search) algorithm, which seeks to find authoritative sources (Authorities)
of information on the Web, together with sites (Hubs)
featuring good compilations of such authoritative sources. The original
HITS algorithm, devised while Kleinberg was a visiting scientist at
IBM Almaden, first uses a standard text search engine to gather a "root
set" of pages matching the query subject. Next, it adds to the
pool all pages pointing to or pointed to by the root set. Thereafter,
it uses only the links between these pages to distill the best authorities
and hubs. The key insight is that these links capture the annotative
power (and effort) of millions of individuals independently building
web pages.
° J. Kleinberg. Authoritative
sources in a hyperlinked environment. Proc. 9th ACM-SIAM
Symposium on Discrete Algorithms, 1998.
Also appears as IBM Research Report RJ 10076, May 1997.
° J. Kleinberg and Steve Lawrence. The
Structure of the Web. Science, Vol 294, Issue 5548, 1849-1850,
30 November 2001.
° D. Gibson, J. Kleinberg, P. Raghavan. Inferring
Web communities from link topology. Proc. 9th ACM Conference on
Hypertext and Hypermedia, 1998.
° S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg,
P. Raghavan, S. Rajagopalan, Automatic
resource list compilation by analyzing hyperlink structure and associated
text. Proc. 7th International World Wide Web Conference, 1998.
° S. Chakrabarti, et al. (1999) Hypersearching
the Web.
° S. Chakrabarti, et al. (1999) Mining
the link structure of the World Wide Web.
° Kleinberg J et al (1999) The
Web as a graph: Measurements, models and methods.
° D. Gibson, J. Kleinberg, P. Raghavan. Structural
Analysis of the World Wide Web. Invited position paper at the
WWW Consortium Web Characterization Workshop, November 1998.
• CiteSeer.
An Autonomous Citation Indexing (ACI) system that automates the construction
of citation indexes.
° Bollacker, K.; Lawrence, Steve; Giles, C. Lee
(1998). "CiteSeer:
An Autonomous Web Agent for Automatic Retrieval and Identification
of Interesting Publications". Proceedings of the 2nd International
ACM Conference on Autonomous Agents, pp. 116, 1998.
° Giles, C. Lee; Lawrence, Steve & Krovetz,
Bob (1998). "Access
to Information on the Web". Letter to Science, 280, (5371):1815.
° Giles, C. Lee; Bollacker, K. & Lawrence,
Steve (1998). "CiteSeer:
An Automatic Citation Indexing System". Proceedings
of the 3rd ACM Conference on Digital Libraries, pp. 89-98, 1998.
° Lawrence, Steve & Giles, C. Lee. "Context
and Page Analysis for Improved Web Search". IEEE Internet
Computing, 2(4), July/August 1998:38-46.
° Lawrence, Steve & Giles, C. Lee. (1998).
"Searching
the World Wide Web". Science, 280(5360):98.
° Lawrence, Steve & Giles, C. Lee (1998).
"Inquirus,
The NECI Meta Search Engine". Proceedings of the Seventh
International World Wide Web Conference, Brisbane, Australia, 1998.
° Lawrence, Steve & Giles, C. Lee (1998).
"Searching
the Web". Letter to Science, 281(5374):175
• Web
Archeology. A project by Digital Research:
° Bharat, Krishna & Henzinger, Monika
R. (1998). "Improved
Algorithms for Topic Distillation in Hyperlinked Environments".
Proceedings of the 21st International ACM SIGIR Conference on Research
and Development in Information Retrieval, 1998.
° Bharat, Krishna; Broder, Andrei; Henzinger,
Monika; Kumar, Puneet & Venkatasubramanian, Suresh
(1998). "The
connectivity server: Fast access to linkage information on the Web".
Proceedings of the 7th International World Wide Web Conference, 469-477,
April 1998.
• Google
is becoming one of the main search
engines, and it also embodies a clever approach to the problem of
searching large volumes of information by exploiting hypertext links.
° Brin, Sergey & Page, Lawrence (1998).
The Anatomy
of a Large-Scale Hypertextual Web Search Engine. Proceedings of
the 7th International World Wide Web Conference, April 1998.
° Cho, Junghoo; Garcia-Molina, Hector &
Page, Lawrence (1998). Efficient
Crawling Through URL Ordering. Proceedings of the 7th International
World Wide Web Conference, April 1998.
• The
Open GRiD Project is a proposal to add peer review and peer
recognition to the Google model in order to improve the ranking of web pages.
Its author (Maxim Lifantsev) descriptively calls it the voting
model.
° Lifantsev, Maxim (2000). Open
Peer-Review as Web's Self-Organization Force. Submitted
to the 26th International Conference on Very Large Databases, Cairo,
Egypt, September 2000.
° Lifantsev, Maxim (2000). Voting
Model for Ranking Web Pages. Accepted to the 1st International
Conference on Internet Computing, Las Vegas, U.S.A., June 2000.
° Lifantsev, Maxim (1999). Rank
Computation Methods for Web Documents. Technical Report TR-76,
ECSL, Department of Computer Science, SUNY at Stony Brook, Stony Brook,
NY, November 1999.
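The HITS procedure described under the CLEVER project above (gather a root set, expand it with neighbouring pages, then score pages using only the links between them) can be illustrated with a minimal Python sketch. It is not the CLEVER implementation: the function names `expand_root_set`, `hits` and `normalize`, the dictionary representation of the link graph, the fixed iteration count and the toy data are all assumptions made for this example, and the text search step that produces the root set is taken as given.

```python
from math import sqrt


def expand_root_set(root, links):
    """Return the root set plus every page that points to it or is pointed to by it.

    `links` maps each page to the list of pages it links to (hypothetical format).
    """
    base = set(root)
    for src, targets in links.items():
        for dst in targets:
            if src in root or dst in root:
                base.update((src, dst))
    return base


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Iteratively compute hub and authority scores from link structure only."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        # A page's hub score is the sum of the authority scores of pages it links to.
        new_hubs = {
            p: sum(new_auths[dst] for dst in links.get(p, ())) for p in pages
        }
        auths = normalize(new_auths)
        hubs = normalize(new_hubs)
    return hubs, auths


if __name__ == "__main__":
    # Toy link graph (invented data): "a", "b", "c" are link collections,
    # "x" and "y" are content pages.
    toy_links = {"a": ["x", "y"], "b": ["x", "y"], "c": ["x"], "x": [], "y": []}
    base = expand_root_set({"x"}, toy_links)
    # Keep only the links between pages of the base set.
    sub_links = {
        p: [t for t in toy_links.get(p, []) if t in base] for p in base
    }
    hub_scores, auth_scores = hits(sub_links)
    print(sorted(auth_scores.items(), key=lambda kv: -kv[1]))
    print(sorted(hub_scores.items(), key=lambda kv: -kv[1]))
```

Normalising after every round keeps the scores bounded. A fixed number of rounds is used here for simplicity; a convergence test on successive score vectors would be a reasonable alternative.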
NEURAL NETWORKS AND SELF-ORGANISED MAPS
The University of Waterloo maintains an extensive directory
of bibliographies about neural networks.
The Internet offers a new setting for representing
scientometric relationships as maps, as the projects developed
at the CWTS have successfully shown. Extensive information
about neural networks is also available.
In recent months, several proposals have broadened
the potential uses of mapping data. These techniques are now also applied
to information on the Web, offering many ways to improve the knowledge
organisation of Web content by mapping web sites.
• Planning Diagrams to Site Maps. Seminar by
Paul Kahn
http://www.dynamicdiagrams.com/seminars/mapping/maptoc.htm
(link no longer available)
This could increase the capabilities of search engines,
providing them with a new and friendly graphical interface.
• WEBSOM:
Self-Organizing Maps for Internet Exploration. This is the latest result
of the work of Teuvo Kohonen (http://www.cis.hut.fi/nnrc/teuvo.html).
The algorithm is developed in the following articles:
° T. Kohonen. Self-organizing formation of topologically
correct feature maps. Biological Cybernetics, 43(1):59--69, 1982.
° T. Kohonen. The self-organizing map. Proceedings
of the IEEE, 78:1464--1480, 1990.
• SEMIO.
Semio's text-mining software provides the ability to discover and leverage
new value in the glut of textual information on corporate Intranets.
° SemioMap® has an on-line
demo; this tool displays how phrases interact across documents,
uncovering meaning that is not readily apparent.
• Cyberspace
geography visualization. Mapping the World-Wide Web to
help people find their way in cyberspace.
• TouchGraph.
This tool offers a solution by presenting a graph where similar items
are clustered next to each other.
• Opte
Project. This project was created to make a visual representation
of a space that is very much one-dimensional, a metaphysical universe.
• NeuralWare.
NeuralWare offers proven technology tools for developing and deploying
neural networks and advanced empirical models.
• Topic
Maps. An independent consortium of parties interested
in developing the applicability of the Topic Maps Paradigm to the World
Wide Web, by leveraging the XML family of specifications as required.
• Newsmap.
Newsmap is an application that visually reflects the constantly changing
landscape of the 'Google News' news aggregator. A treemap visualization
algorithm helps display the enormous amount of information gathered
by the aggregator.
• Mapping
Cyberspace.
• Tamara Munzner and Paul Burchard. Visualizing
the Structure of the World Wide Web in 3D Hyperbolic Space.
Proceedings of VRML '95, (San Diego, California, December 14-15, 1995),
special issue of Computer Graphics, ACM SIGGRAPH, New York, 1995, pp.
33-38.
• Kevin Gurney. Neural
Nets.