The Internet provides an exceptional testbed for developing algorithms that can improve browsing and searching large information spaces. Browsing and searching tasks are susceptible to problems of information overload and vocabulary differences. Much of the current research is aimed at the development and refinement of algorithms to improve browsing and searching by addressing these problems. Our research was focused on discovering whether two of the algorithms our research group has developed, a Kohonen algorithm category map for browsing, and an automatically generated concept space algorithm for searching, can help improve browsing and/or searching the Internet. Our results indicate that a Kohonen self-organizing map (SOM)-based algorithm can successfully categorize a large and eclectic Internet information space (the Entertainment subcategory of Yahoo!) into manageable sub-spaces that users can successfully navigate to locate a homepage of interest to them. The SOM algorithm worked best with browsing tasks that were very broad, and in which subjects skipped around between categories. Subjects especially liked the visual and graphical aspects of the map. Subjects who tried to do a directed search, and those that wanted to use the more familiar mental models (alphabetic or hierarchical organization) for browsing, found that the map did not work well. The results from the concept space experiment were especially encouraging. There were no significant differences among the precision measures for the set of documents identified by subject-suggested terms, thesaurus-suggested terms, and the combination of subject- and thesaurus-suggested terms. The recall measures indicated that the combination of subject- and thesaurus-suggested terms exhibited significantly better recall than subject-suggested terms alone. Furthermore, analysis of the homepages indicated that there was limited overlap between the homepages retrieved by the subject-suggested and thesaurus-suggested terms. Since the retrieved homepages for the most part were different, this suggests that a user can enhance a keyword-based search by using an automatically generated concept space. Subjects especially liked the level of control that they could exert over the search, and the fact that the terms suggested by the thesaurus were "real" (i.e., originating in the homepages) and therefore guaranteed to have retrieval success.
|Original language||English (US)|
|Number of pages||22|
|Journal||Journal of the American Society for Information Science|
|Publication status||Published - 1998|
ASJC Scopus subject areas