Toshiyuki Masui, Mitsuru Minakuchi, George R. Borden IV, Kouichi Kashiwagi
Proceedings of the ACM Symposium on User Interface Software and Technology (UIST'95) (November 1995), ACM Press, pp.199-206.
To be demonstrated at ACM SIGIR'96 Conference (held at ETH, Zurich, Switzerland, August 18-22, 1996)
Although various visualization techniques have been proposed for information retrieval tasks, most of them are based on a single strategy for viewing and navigating through the information space, and vague knowledge such as a fragment of the name of the object cannot be used effectively in the search. In contrast, people usually look for things using various vague clues simultaneously. For example, in a library, people can not only walk through the shelves to find a book they have in mind, but can also be reminded of the author's name by viewing the books on the shelf, and can check the index cards to get more information.
To enable such realistic search strategies, we developed a multiple-view information retrieval system where data visualization, keyword search, and category search are integrated with the same smooth zooming interface, and any vague knowledge about the data can be utilized to narrow the search space. Users can navigate through the information space at will, by modifying the search area in each view.
In recent years, a huge amount of data has become widely available, owing to the widespread use of mass-storage devices and the Internet. Although conventional information retrieval (IR) techniques such as full-text keyword searches are useful for handling gigabytes of data[15], they are not powerful enough to handle large amounts of multimedia data, and many new IR techniques have been proposed.
The most promising approach is using various visualization techniques for showing a large amount of data in a small display space. These techniques include 3D visualization techniques where far objects are displayed in small sizes[5,7,11], distortion-oriented techniques where only important objects are displayed[4,6,9,12,13], and zooming-based techniques where users can smoothly zoom into any part of the display to enter the next level of detail[3,10]. Virtual reality (VR) can also be considered to be a 3D visualization technique.
Although visualization-based approaches are very useful for finding data by navigating in the information space, they are not as powerful as everyday searches conducted by humans. For example, when a person goes to a library to look for a book, he can walk through the shelves which contain topics relating to the book and look around to find it, just like using visualization-based IR systems. But in addition to that, he can also check the index cards if he remembers the title or the author of the book, and even when he does not remember the exact title or the name of the author, he has a great chance of recalling them by seeing the related books on the shelves. In most cases, people do not remember the exact title or the name of the author and they cannot judge the book's category, but with such vague knowledge, it is very likely that they can eventually find the book by using various searching techniques like looking around, seeing other books, etc. In this way, visualization-based IR systems can add great power, when they are supplemented by other IR techniques like keyword search, hypertext, and a hierarchical conceptual database. Visualization-based IR systems without these features are like libraries without any indexing systems.
Considering this, we believe that integration of the following IR schemes is the key to effective multimedia IR systems.
Either a 3D visualization technique, a distortion-oriented technique, or a zooming-based technique can be used for this purpose.
The Dynamic Query (DQ) technique[14] is very important in recent IR systems. In a DQ system, related information is redisplayed as soon as any change is made to the query conditions. A virtual reality system is considered to be an IR system incorporating DQ, since every single move of a user will change the total view. The DQ approach should be applied to all IR operations so that users can browse a vast amount of information smoothly.
In addition to conventional approaches to specifying keywords, new approaches using advanced visualization techniques are also possible. We will introduce one such technique in this paper.
Searching by category is also important in most searches. Various visualization techniques can be used to visualize a complicated category structure.
In this paper, we introduce a tourist information system based on the combination of these techniques, and show that existing visualization techniques and new techniques for selecting keywords and categories, when smoothly integrated, work together for intuitive information retrieval.
Based on the assumptions shown in the last section, we developed a multimedia Nara information system, or the WING (Whole Interactive Nara Guide) system. Nara, located about 40 kilometers south of Kyoto, is an ancient capital of Japan and full of tourist attractions like old shrines and temples. One of the most famous temples is the Todaiji Temple, which holds the Great Bronze Buddha. The Buddha is 15 meters in height and sits in the Daibutsuden Hall, which is the largest wooden structure in the world (48 meters in height).
Figure 1 shows a snapshot of the WING system. The display consists of four subviews. Upper-left is the Map View, which shows the map and the terrain of Nara city. A WING user can move the viewpoint to any point in the Map View and see roads, rivers, mountains, etc. like in VR systems. At the center of the Map View, tourist attractions and other information are displayed as cubes with different colors, and their pictures and information are displayed in the Content View at upper-right. Lower-left is the Category View, where all the data are hierarchically categorized and users can dynamically zoom into subcategories to find an individual datum or to constrain the overall search to a particular category. Lower-right is the Index View, where users can search data by text. In the following sections, we will show how the four views work together in more detail.
The Map View displays a 3D map of Nara city. We do not have enough data for making the view look like the real Nara, and only the roads, railroads, rivers, and ponds are displayed in the view. In spite of this simplicity, users can easily recognize geographical locations and perceive the view as a very simple form of virtual reality.
Users can smoothly rotate the view, move the map, zoom into a location, and change the angle of the view by mouse operations. The view looks very much like an ordinary map when seen from above (Figure 2), and looks more like walking on the ground when seen from a lower position (Figure 3).
Figure 2: The Map View seen from above in the air.
Figure 3: The Map View seen from the ground.
Data items close to the center of the view are displayed as cubes with different colors, and pictures and other information related to the items are displayed in the Content View. Watching the Content View, users can look for data items by moving around in the map.
Moving around in the Map View is interesting by itself, just like playing with flight simulators and VR systems. Users can get some knowledge of Nara just by moving around in the Map View and watching the changing Content View.
For data items shown close to the center of the Map View, related pictures and information are displayed in the Content View, which looks like a travel guidebook. For each data item, a degree of interest is calculated from the distance between the item and the center of the map and from the importance of the item. The items closer to the center of the Map View usually have a higher degree of interest, and their information is displayed closer to the top of the Content View.
Figure 4: Content View.
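The ranking described above can be illustrated with a minimal sketch. The text states only that the degree of interest depends on the distance from the center of the Map View and on the importance of the item; the particular formula used here (importance divided by one plus the distance), the `Item` fields, and the coordinates and importance values are all assumptions made for illustration, not the formula used in WING.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    x: float           # map coordinates (hypothetical units)
    y: float
    importance: float  # larger means more important (assumed scale)


def degree_of_interest(item, center):
    """Assumed formula: importance discounted by distance from the view center."""
    distance = math.hypot(item.x - center[0], item.y - center[1])
    return item.importance / (1.0 + distance)


def content_view_order(items, center, limit=5):
    """Return item names ordered from highest to lowest degree of interest."""
    ranked = sorted(items, key=lambda it: degree_of_interest(it, center), reverse=True)
    return [it.name for it in ranked[:limit]]


# Hypothetical positions and importance values.
items = [
    Item("Todaiji Temple", 0.0, 1.0, 10.0),
    Item("Hotel Fujita", 0.5, 0.0, 2.0),
    Item("Nara Women's University", 3.0, 4.0, 3.0),
]
print(content_view_order(items, center=(0.0, 0.0)))
```

Under these assumptions, an important item slightly off-center can still be listed above a less important item that is nearer the center, which matches the statement that closer items "usually", not always, rank higher.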
Data items closely related to the data items shown in the Content View are also displayed in the Content View. For example, when looking at Nara Women's University, information on its affiliated high school is also displayed in the Content View. Clicking on the information in the Content View makes the Map View gradually move to the specified data item, thereby making the selected item move to the top of the Content View.
All the data items used in the WING system are hierarchically categorized. For example, ``Todaiji Temple'' is an instance of the ``Temple'' class, which is a subclass of the ``Sightseeing'' class. In fact, there is no distinction between a class and an instance, and all the data items constitute a simple object-oriented database. The root of the class hierarchy is the ``All'' class, which has seven subclasses. First, the seven subclasses are displayed as shown in Figure 5.
Users can zoom into any part of the view to show or hide the subclasses and superclasses, just like the Pad system[10]. If a user zooms into the ``Sightseeing'' class, the view changes to Figure 6.
Figure 6: Zooming into ``Sightseeing'' category.
Then the user can see the ``Temple'' class (Figure 7), and finally find the ``Todaiji Temple'' entry (Figure 8).
Figure 7: Magnifying the ``Sightseeing'' category.
Figure 8: Magnifying the ``Temple'' category.
When a user clicks the mouse on a data item which has no subclass, the Map View gradually moves to display the data item in the center of the Map View. If the user clicks on a category, the category is used to narrow the search space, and the data shown in the Map View and the Content View changes instantly. For example, if a user selects the ``Restaurant'' category in the Category View, data items not related to restaurants immediately disappear from the Map View and the Content View, and only the data items related to restaurants will be displayed afterwards. The category is deselected when the user clicks on the same category again.
Zooming and moving the view can be performed by the same mouse operations as in the Map View.
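The category behavior described in this section can be sketched as follows. This is a simplified illustration, not the WING implementation: the `Node` and `CategoryFilter` names are hypothetical, a single node type stands in for both classes and instances (following the remark that there is no distinction between them), and ``Example Cafe'' is an invented data item.

```python
class Node:
    """A data item; classes and instances share one representation."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_a(self, category):
        """True if this node is `category` itself or one of its descendants."""
        node = self
        while node is not None:
            if node is category:
                return True
            node = node.parent
        return False


class CategoryFilter:
    """Tracks the selected category and filters the displayed items."""

    def __init__(self):
        self.selected = None

    def click(self, node):
        """Clicking a category selects it; clicking it again deselects it."""
        if not node.children:
            return  # an item without subclasses moves the Map View instead (not modeled)
        self.selected = None if self.selected is node else node

    def visible(self, items):
        """Items shown in the Map View and the Content View under the current filter."""
        if self.selected is None:
            return list(items)
        return [it for it in items if it.is_a(self.selected)]


root = Node("All")
sightseeing = Node("Sightseeing", root)
temple = Node("Temple", sightseeing)
todaiji = Node("Todaiji Temple", temple)
restaurant = Node("Restaurant", root)
cafe = Node("Example Cafe", restaurant)  # invented item for illustration

category_filter = CategoryFilter()
category_filter.click(restaurant)
print([n.name for n in category_filter.visible([todaiji, cafe])])
```

Because the filter is just a walk up the parent chain, selecting ``Sightseeing'' would keep ``Todaiji Temple'' visible even though it is two levels below the selected category.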
Using the Index View, users can find a data item by name from a long list of data names in the database.
Many techniques have been proposed for selecting an item from a large list of items. A pulldown menu may be the most popular interaction tool for selecting an item from a small number of items. A scrollbar is also used for selecting a text line from a long text. Sliders can also be used for selecting words from large dictionaries[1,8]. In the Index View of the WING system, a new zooming-based technique is used for selecting an item from the item list.
Although the Index View can display only a portion of the long name list, users can easily locate a name from the list by utilizing the following schemes. First, users can smoothly magnify and shrink the gap between words, just like the zooming operations in the Map View and the Category View. Second, the list consists of a permuted index, where the same name appears in many locations in the list, sorted by valid substrings of the names. For example, ``Hotel Fujita'' appears at around ``ho'', ``fu'', ``ji'', and ``ta'', so that all the data items which contain the same string are listed together, line by line. (In this case, only ``fu'', ``ji'', and ``ta'' are the valid substrings of ``fujita'', since ``fujita'' consists of three Japanese characters.) We will show how the search operation works in the following example.
At the beginning of the search, the Index View looks like Figure 9. All the data names are sorted at their valid substrings, shown in white characters. The background color under the name corresponds to the position of the name in the list. The rainbow-like colors in Figure 9 indicate that the displayed words are selected from almost all parts of the list.
In Figure 9, only one data name in every 256 names in the list is displayed. To search for ``Hotel Fujita'', a user expands the gap between the entries ``Ikeda Gankoudou'' and ``Isuien Sanshuu'' by a mouse operation, since the character ``H'' is between ``G'' and ``I''. Then the view changes as shown in Figure 10.
Figure 10: Expanding the gaps between entries.
When the user expands the gap further and the gap becomes wide enough, a string between the two strings appears, as in Figure 11. At this moment, one data name in every 128 names in the list is displayed.
Figure 11: More entries appear in the expanded gaps.
By repeating the zoom-in operations in this way, the user will see the Index View expanding to Figure 12 and Figure 13.
Figure 12: One hotel found in the view.
Figure 13: Fully expanded Index View.
Here, although there exist many hotels with the string ``Hotel'' either at the top or at the end of their names, all the hotels are listed at the same position in the list. This is like invoking the ``grep'' command on UNIX with an argument ``Hotel''. In this way, even when a user doesn't remember the exact name of the data item, he can still easily find it just by zooming. For example, he can find the ``Hotel Fujita'' entry, as long as he remembers that the name ends with the string ``jita''.
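The permuted index and its grep-like behavior can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, each name is supplied already split into units (standing in for the Japanese character boundaries that determine the valid substrings), and the segmentations shown are invented for the example.

```python
import bisect


def build_permuted_index(names):
    """Return a sorted list of (key, name) pairs.

    `names` maps each display name to its units. One key is produced per
    unit boundary: the lowercased text from that boundary to the end.
    """
    entries = []
    for name, units in names.items():
        for start in range(len(units)):
            entries.append(("".join(units[start:]).lower(), name))
    entries.sort()
    return entries


def lookup(entries, fragment):
    """Names that have an index key starting with `fragment`, in list order."""
    fragment = fragment.lower()
    # Entries whose key starts with the fragment are adjacent in the sorted list.
    lo = bisect.bisect_left(entries, (fragment,))
    found = []
    for key, name in entries[lo:]:
        if not key.startswith(fragment):
            break
        if name not in found:
            found.append(name)
    return found


# Hypothetical segmentations, for illustration only.
names = {
    "Hotel Fujita": ["Hotel ", "Fu", "ji", "ta"],
    "Nara Hotel": ["Na", "ra ", "Hotel"],
    "Todaiji Temple": ["To", "dai", "ji ", "Temple"],
}
index = build_permuted_index(names)
print(lookup(index, "hotel"))
print(lookup(index, "jita"))
```

Since every key that starts with a given fragment occupies one contiguous run of the sorted list, zooming into that region of the list shows all matching names next to each other, whether the fragment occurs at the start or in the middle of the name.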
Clicking on the item name makes the Map View gradually move to display the data item in the center of the Map View, just like clicking in other views.
We also tried this zooming technique for finding a movie title from about 10,000 titles. Users can find an entry faster than with other existing techniques which use sliders and scroll bars, and queries like ``list all the titles including `New York' '' can easily be performed.
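The progressive disclosure in the zooming example (one name in every 256, then one in every 128, and so on) can be approximated by a simple stride rule. The sketch below is a simplification resting on assumptions: it halves the stride uniformly over the whole list at each zoom level, whereas the Index View expands the gap at the position chosen by the user, and the hue value merely stands in for the position-dependent background color. The function name and the generated entries are hypothetical.

```python
def visible_entries(entries, zoom_level, base_stride=256):
    """Return (index, hue, text) for the entries shown at a zoom level.

    At level 0 one entry in every `base_stride` is shown; each further
    level halves the stride until every entry is visible.
    """
    stride = max(1, base_stride >> zoom_level)
    total = len(entries)
    # hue in [0, 1) encodes the position of the entry in the full list
    return [(i, i / total, entries[i]) for i in range(0, total, stride)]


entries = [f"name{i:04d}" for i in range(1024)]  # placeholder index entries
for level in (0, 1, 8):
    print(level, len(visible_entries(entries, level)))
```

With 1024 placeholder entries, level 0 shows 4 entries, level 1 shows 8, and level 8 shows all of them; widely spaced hue values at low zoom levels correspond to the rainbow-like appearance described for Figure 9.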
Using the WING system, users can easily retrieve useful information which is usually difficult to retrieve in other systems. In this section, we show two example scenarios of using the WING system.
Suppose you are visiting Nara to attend a conference, and you want to get information, with only the following knowledge.
With only these vague clues, you can use WING and easily get enough information for your visit.
Since the conference will be held at a public convention hall, you can first select the ``Public'' category in the Category View, and move along the hillsides in the Map View. Public offices and halls on the hillsides are displayed in the Content View, and you can easily locate the convention hall. You can also use the Index View to list all places with the word ``Hall'' in their names. If you vaguely remember the name of the convention hall, this strategy also works well.
First, you look for the name of your hotel in the Index View, and move to your hotel in the Map View by clicking the name in the Index View. Now that you know the locations of both your hotel and the convention hall, you can see how to get to the hall in the Map View. If you are not sure, you can look for the name of the convention hall in the Index View, and by clicking the name, you can see how to get to the hall by watching the moving Map View.
When you move to your hotel in the Map View, a lot of information, including restaurants, tourist attractions, etc., is displayed in the Content View. If you select ``Drinking Spots'' in the Category View, information not related to the subject disappears from the Content View. If you move closer to the hotel in the Map View, drinking spots which are far from the hotel disappear from the Content View, and you can easily locate the spot nearest to the hotel. In the same manner, you can find tourist attractions and souvenir shops close to your hotel.
Although you had very little information concerning the conference and Nara City, after using WING in this way, not only have you succeeded in getting all the information you needed, but you have also seen many other items of interest around the convention hall and your hotel. You moved around in the Map View, viewing the information shown in both the Map View and the Content View, and now you know the terrain, tourist attractions, and many other things related to the area, just like walking around the city with a guidebook, or staying long hours in a library. Navigating in this way is like navigating in hypertext, with the difference that you seldom get lost in the Map View.
Suppose you are going to be transferred to Nara, and you have to look for a place to live in the city. In this situation, you have to consider many things, such as the distance from the house to your office, the distances to train stations, schools and hospitals around the house, the terrain, etc. Before going to Nara, you can use WING to get a basic understanding of Nara and have some ideas about where to look. You can fly over Nara in the Map View and enjoy the scenery and information, and without effort you will eventually know the names and locations of schools, hospitals, shopping centers, etc. With actual real estate information, WING works as an improved version of the HomeFinder system[2].
Figure 14 shows the architecture of the WING system. Whenever a user changes the viewpoint in the Map View, the system redraws the Map View, and at the same time, the degree of interest for each data item is calculated and important items are shown in the Content View and in the Map View. When an item is selected by a user, a path from the current viewpoint to a new viewpoint close to the item is generated, and for the viewpoints along the path, the same calculation is performed to represent gradual movement to the new location. The zooming actions in the Category View and the Index View are performed independently of other views, and the viewpoint changes only when an item is selected in these views.
Display speed and processing speed are the keys to smooth interaction techniques. We implemented the WING system in C with the graphics library (GL) facilities of Silicon Graphics computers, to enable smooth interactions in the Map View. The Category View does not require much display speed, and the Index View requires less processing power. The zooming technique for keyword searching can be used even on PDAs.
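The gradual movement to a selected item can be sketched as a path of intermediate viewpoints, each of which triggers the same redraw and recalculation as a manual viewpoint change. This is an illustration under assumptions: the text does not say how the path is generated, so plain linear interpolation is used here, and the `(x, y, height)` viewpoint representation and the `on_viewpoint_change` callback are hypothetical.

```python
def viewpoint_path(start, goal, steps):
    """Yield `steps` viewpoints moving linearly from `start` to `goal`."""
    for k in range(1, steps + 1):
        t = k / steps
        yield tuple(s + (g - s) * t for s, g in zip(start, goal))


def move_to(start, goal, steps, on_viewpoint_change):
    """Visit each viewpoint on the path and return the final viewpoint.

    `on_viewpoint_change` stands for the work done on every viewpoint
    change: redrawing the map and recomputing degrees of interest.
    """
    current = start
    for current in viewpoint_path(start, goal, steps):
        on_viewpoint_change(current)
    return current


visited = []
# Viewpoints are (x, y, height) in hypothetical map units.
final = move_to((0.0, 0.0, 10.0), (4.0, 8.0, 2.0), steps=4,
                on_viewpoint_change=visited.append)
print(visited)
```

Routing the animation through the same callback as manual navigation means the Content View updates continuously during the move, which is consistent with the description of seeing the route while the Map View moves.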
Various interface techniques like 3D visualization, dynamic query, zooming interfaces, and permuted keyword search are integrated in the WING system. As we discussed in the introduction, using only one of these techniques is not powerful enough, and the combination of these techniques works better than using them separately.
Users can get explicit and implicit information while they move around in each subview of the WING system. Information such as the location of a temple is explicit, while knowing where to find old temples is implicit. Navigating in the views of the WING system, users can easily and smoothly acquire a great deal of implicit information. That is to say, using only one searching strategy is like looking at vast information through a narrow pipe, but using various techniques working in combination, users can view and manipulate vast amounts of information at will. This is just like the way people usually look for and find things. People look around the scene, find a clue, see the clue in more detail and find another clue, etc.
The idea of using multiple related views is useful for many kinds of information retrieval tasks. The Map View can be substituted by any kind of existing data visualization techniques, including 3D visualization techniques, distortion-oriented visualization techniques, zooming-based visualization techniques, etc. The Content View can always show relevant information, and the Category View and the Index View can be generally used to narrow the search space. Here we show examples of applying the technique to other application areas.
File system visualizers[5,11] can be augmented by putting more views around them. Figure 15 shows a file system visualizer augmented with the category view, the index view, and the content view. Users can search for a file not only using the 3D view of the file system, but also using the index view and the category view.
Figure 15: Visualizing a large file system.
The on-line manual is another area where the multiple-view approach is effective. When a user wants to know how a particular portion of a machine works, he usually must check the manual to find out what it does. However, if he has no idea about the name and the function of the portion, he cannot find the answer in either the table of contents or the index, and he has to look for figures in the manual to find out the name of the portion first, and then check the index. Using a visualization system in addition to the table of contents and the index, users can easily get the information they need.
Figure 16: Visualizing an on-line manual.
We are trying to use more sophisticated input devices to enable more intuitive and smooth interaction. An LCD pen tablet with a pressure sensor is a promising candidate for this purpose. We are also trying to use various 3D display devices for more realistic information visualization.
We introduced a new unified approach for various information retrieval tasks. By integrating a visualization technique, a keyword search technique, and a category search technique with a consistent smooth zooming interface, we made various forms of intuitive information retrieval possible. We believe this technique can be used in a wide range of applications.