Internet search has become a routine computing activity, with regular visits to a search engine—usually Google—the norm for most of us. The vast majority of searchers, as recent studies of Internet search behavior reveal, search only in the most basic of ways and fail to avail themselves of options that could easily and effortlessly improve search quality. Despite a wealth of literature on how to better use search engines, those same search-behavior studies suggest that searchers ignore nearly all the advice.
Many searches seek simple factual results or a known site when the URL is forgotten. For this, a simple Google search works well. Getting good results for a more complex query can depend on a well-crafted search strategy and the use of multiple search engines, deep Web resources, and possibly commercial library databases. For anything beyond the mere factual, the limited search methods used by most searchers are largely ineffective.
This article examines alternative engines with unique features that might improve search quality. I also advocate the use of search engines that graphically illustrate what searchers miss when they use only one engine. Finally, the article explores the possibilities for using social bookmarking communities as the ultimate search engine alternative.
Developing good search skills requires study, experimentation, exploration, and a routine for keeping up with the latest developments in search engine technology. That goes for information technologists, instructional technologists, and librarians. As information professionals expand their repertoire of tools and techniques to improve their own search skills, they can in turn educate their end-user communities to do better than just good enough.
Librarians working with the public know a good deal about search behavior. No matter the search system or interface used, searchers tend to exhibit little sophistication in the development of search terms and strategies. Beyond anecdotal evidence, though, what do we know about search behaviors? The body of knowledge about this topic advanced in January 2005 through a Pew Internet & American Life study about search engine use and experience.1 While the study didn’t explore how searchers choose their search words or how they enter them, it still yields some revealing findings:
- A remarkable 87 percent of searchers say they have successful search experiences most of the time, including some 17 percent who claim they always find the information for which they look.
- Internet users behave conservatively as searchers: They tend to settle quickly on a single search engine and then stick with it rather than switching as search technology evolves or comparing results from different search systems.
- The study also indicated that satisfied Internet users don’t really understand why and how they use search engines and seem unaware of how search engines work.
The Nielsen Norman Group conducts an annual survey that seeks to learn more about how people use the Internet. Their studies have yielded additional information about search techniques and strategies typically used with search systems.2 Among their findings:
- Six in 10 of those surveyed search only one word to find what they are looking for.
- Only 3 percent of searchers use quote marks to search an exact phrase, although this technique can significantly reduce the number of hits.
- Only 1 percent of searchers use the advanced search interface.
- Few people look beyond the first page of results. According to Jakob Nielsen, Web usability expert, “If it’s beyond the first page, it’s as if it doesn’t exist.”3
In another 2005 study of Internet search behavior, a team of researchers at Cornell University found that searchers unthinkingly accept the defaults given and rarely base their choices on anything else. To test whether users click the first result because it is the best information presented or simply because it is the first result presented, the research team gave searchers the results in two ways. They first presented the default results, and then presented them again with the number one and number two hits reversed. In each case searchers showed a preference for the result in the top spot.4
These studies, combined with search experts’ advice, indicate Internet searchers could certainly improve the quality of their search techniques. This article focuses on only a few of the dozens of ways to improve individual searching behavior. Integrating these techniques can yield vast improvement in search results, and all can be mastered quickly. Those considered most essential are:
- Using more than one search engine
- Experimenting with new search engines that offer different features
- Using alternatives to search engines
What’s Your Search Missing?
A 2005 study found that restricting Internet searches to a single engine tremendously limits access to information. Researchers examined search results from more than 12,500 random queries on Ask.com, Google, MSN Search, and Yahoo. The overlap in first-page results for these four engines was a scant 1.1 percent on average for a given query. Of the total results, 85 percent were unique to one engine, and even overlap between just two engines occurred only 11 percent of the time. This lack of overlap, the researchers pointed out, means that using one Web search engine may impede the user’s ability to find the desired information.5 It may take more than a research study, however, to convince searchers they should use more than one engine.
Three new engines provide graphic results that leave no uncertainty about the importance of searching multiple engines. Dogpile’s Search Comparison Engine (http://comparesearchengines.dogpile.com) graphically compares search results from the big three engines: Google, Yahoo, and MSN. For example, searching “computer virus protection,” I found that 14 of Google’s 17 first-page results were unique to Google. Of Yahoo’s 16 first-page results, 13 were unique to Yahoo, with just two Web sites duplicated on Google. Almost any search will demonstrate the minimal overlap among the first-page results of these three search engines.
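The unique-versus-shared counts that such comparison tools display come down to simple set arithmetic. The Python sketch below is illustrative only: the engine names come from the text, but the result URLs are invented placeholders, and `unique_per_engine` and `shared_by_all` are hypothetical helper names, not part of any engine's or comparison tool's interface.

```python
from typing import Dict, Set


def unique_per_engine(results: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each engine, return the URLs that no other engine returned."""
    unique = {}
    for engine, urls in results.items():
        # Pool every URL returned by the *other* engines.
        others = set().union(*(u for e, u in results.items() if e != engine))
        unique[engine] = urls - others
    return unique


def shared_by_all(results: Dict[str, Set[str]]) -> Set[str]:
    """Return the URLs that every engine returned."""
    if not results:
        return set()
    return set.intersection(*results.values())


# Placeholder first-page results (invented for illustration).
first_page = {
    "Google": {"a.example", "b.example", "c.example"},
    "Yahoo": {"b.example", "d.example"},
    "MSN": {"b.example", "c.example", "e.example"},
}

unique = unique_per_engine(first_page)
shared = shared_by_all(first_page)

for engine in first_page:
    print(engine, "unique:", len(unique[engine]), "of", len(first_page[engine]))
print("Returned by all engines:", sorted(shared))
```

With real first-page results in place of the placeholders, the same two functions would reproduce counts like "14 of 17 unique to Google."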
Jux2 (http://www.jux2.com) yields a display of results that clearly shows where each retrieved page is ranked on Google, Yahoo, or MSN. A searcher can quickly see the results unique to one engine, as well as any duplication among them. This information reinforces the importance of going beyond the first result page. A searcher using only Google may miss an important site that its search algorithm places on the fourth or fifth results page, while Yahoo might rank that site differently and place it on page one.
Thumbshots Ranking (http://ranking.thumbshots.com) can compare up to seven different search engines head-to-head. Its entirely graphic display of the results makes it all too obvious that two search engines can rank the same sites completely differently. Again, it sends a strong message that using just one site or failing to delve beyond the first few results pages can lead to more misses than hits.
Librarians and other information technologists can put these tools to good use helping students and colleagues better understand the importance of using more than one search engine. It’s a technique I’ve used in classroom sessions to help students better understand that using just one engine can lead to missing more than is found. It tends to grab the students’ attention and focus their thinking on the quality and reliability of their Internet search methods.
Think there are just a few search engines to choose from? Think again. In September 2005 Wendy Boswell, the search engine expert at About.com, began her “100 Search Engines in 100 Days” series to demonstrate the depth and breadth of the search tools at our disposal.6 Just knowing there are so many choices makes the practice of sticking with just one seem impractical. One option is to add Yahoo, MSN, or AOL to the Google search. According to the cover story of the October 2005 PC Magazine, using some combination of the “big four” reduces a searcher’s risk of missing a valuable site.7 Searchers who stop there will still miss the unique and interesting approaches to search offered by some new or second-tier engines, of course. Several engines are worth exploring.
Exalead (http://www.exalead.com) received significant attention in 2005 as an up-and-coming contender in the search engine race. For starters, it offers an effective proximity search. When exact phrase search is too limiting, proximity allows for searching terms with up to 16 intervening words. Add Exalead’s respectable “truncation” option, which allows root word (for example, strateg*) searches, and the engine excels by offering searchers a variety of narrowing and widening options.
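To make the two features concrete, here is a minimal Python sketch of what proximity and truncation matching mean. It is not Exalead's implementation or query syntax; the function names, the tokenizer, and the sample sentence are all assumptions made for illustration. Only the 16-word window and the strateg* example come from the text.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_proximity(text, first, second, max_gap=16):
    """True if the two terms occur, in either order, with at most
    max_gap words between them."""
    words = tokenize(text)
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) - 1 <= max_gap  # number of intervening words
        for i in pos_first
        for j in pos_second
        if i != j
    )


def truncation_matches(text, pattern):
    """Return the words that start with the root of a pattern such as 'strateg*'."""
    root = pattern.rstrip("*").lower()
    return [w for w in tokenize(text) if w.startswith(root)]


sample = "A sound search strategy helps librarians plan strategic instruction."

print(within_proximity(sample, "search", "instruction"))             # wide window
print(within_proximity(sample, "search", "instruction", max_gap=2))  # narrow window
print(truncation_matches(sample, "strateg*"))
```

Proximity narrows a search less severely than an exact phrase does, while truncation widens it by accepting every word that shares the root.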
Ask.com (http://www.ask.com) increased its versatility early in 2006 when it acquired and merged Teoma into its search technology. It helps searchers refine a search by recommending terms that might make the search more precise. For example, if the term “library” is searched, “refine” will display terms such as public library, library of congress, and others that might help a searcher narrow the scope of the query. If the search results are too narrow, Ask.com will suggest other terms to expand the search. Unfortunately, Ask.com has not carried over Teoma’s resource page feature, which listed existing resource pages alongside the search results and saved time by providing pages of topical links. I can only hope Ask.com will bring it back.
Clusty (http://www.clusty.com), as in “clustering,” takes a different approach to delivering search results. It organizes results into categories that represent subsets of the topic. A search on the term “library” creates clusters for “university” and “public” as well as other types of libraries. Clusty can also preview the actual Web site on the results screen, so searchers can examine a site’s home page without having to jump in and out of Clusty. All Clusty results identify other search engines that list the same site, should the searcher wish to look at results elsewhere.
Internet searchers should also pay attention to the blogosphere’s rich content, which a number of search engines mine. Technorati (http://www.technorati.com) claims to index the content of more blogs than any other engine, and it usually produces robust results. Technorati offers advanced searching features, as well as ways to locate blogs using different criteria. Other blog search engines worth examining are Feedster.com, Findory.com (which can search blogs, news, and Web sites), and Daypop.com. The challenge with searching blog content is that the results can be extremely unpredictable given the range of blog content. A well-defined, more-specific search topic, rather than a general one, will almost always yield better results.
This summary barely touches the search world, as there are dozens more engines, many specialized for academic fields such as science, business, or education. No searcher will know all the possibilities, but being well versed with three or four engines, each offering a unique way of accessing Web content, can offer far better results than using the same engine over and over.
Collective wisdom, a philosophy popularized by James Surowiecki in The Wisdom of Crowds, asserts that groups produce better decisions and solutions than individuals. That premise has spurred the growth of social and collaborative online communities, where information is found among shared bookmarks.
Social and collaborative networks are virtual representations of an old paradigm for finding information. If I’ve already gathered some useful resources about a specific topic, why not allow others who share my interest to use those books and articles? When many individuals point to the same resources, those sources are judged to be of good quality—demonstrating the wisdom of the crowd. This now occurs in large, anonymous Web communities of information hunters and gatherers who bookmark favored sites for others to share. A few communities are worthy of further exploration.
FURL (http://www.furl.net) helps searchers keep found things found, storing favored sites. FURL subscribers can capture entire Web pages and store them for future retrieval even if the original Web page no longer exists—FURL maintains the page on its server. Saved content is organized by categories, and subscribers can assign keywords to and attach clippings from stored content. When saved articles are retrieved, FURL will point to other users who saved the same content. It also displays related content saved by other FURL subscribers, enabling FURL users to locate useful Web sites and articles without search engines.
FURL subscribers can create an RSS (Really Simple Syndication) feed for their content. When I subscribe to the RSS feeds of other FURL subscribers, my news aggregator (http://www.bloglines.com) tracks new content added to their FURL archives. It is relatively easy to identify others with similar interests, and this never fails to lead to new information. With social collaboration software, any useful information I miss will likely be found within the collective.
Among the more popular social bookmarking communities is del.icio.us (http://del.icio.us). Sites of this kind lay the groundwork for a new search paradigm through tagging. Tags are words each subscriber assigns to any bookmarked site to aid future retrieval. For example, if I bookmarked the EDUCAUSE site, I might tag it with the terms educause, hawkins, technology. Those tags help lead other subscribers to additional information about EDUCAUSE. That’s how del.icio.us can supplant search. Instead of using search engines, searchers can navigate to del.icio.us, identify an appropriate tag for their topic, and then use that tag to locate links added to del.icio.us by all the subscribers who used the tag.
Tag selection is completely arbitrary, of course; individuals choose whatever makes the most sense to them. As a result, finding information in communities like del.icio.us can be hit or miss. If someone tags the EDUCAUSE site only with “conference” and “podcasts,” those searching for “technology” would completely miss the link. The tagging system has flaws, but a vague search of Google, followed by visits to even a few sites, can be far less effective, which is why many searchers have turned to collaborative bookmarking sites like del.icio.us.
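Tag-based retrieval, and the way it can miss, is easy to model as a small index from tags to bookmarked links. The Python sketch below is a toy model under stated assumptions, not how del.icio.us works internally: the `TagIndex` class, the user names, and the placeholder URL are all hypothetical, while the tags themselves come from the example in the text.

```python
from collections import defaultdict


class TagIndex:
    """Toy social bookmarking index: tag -> URL -> users who applied the tag."""

    def __init__(self):
        self._by_tag = defaultdict(lambda: defaultdict(set))

    def bookmark(self, user, url, tags):
        """Record that a user saved a URL under the given tags."""
        for tag in tags:
            self._by_tag[tag.lower()][url].add(user)

    def lookup(self, tag):
        """Return URLs carrying the tag, most frequently tagged first."""
        urls = self._by_tag.get(tag.lower(), {})
        return sorted(urls, key=lambda u: (-len(urls[u]), u))


SITE = "https://educause.example/"  # placeholder URL, not a real address

# One subscriber tags the site as in the article's first example.
rich = TagIndex()
rich.bookmark("user_a", SITE, ["educause", "hawkins", "technology"])

# Another community where the only bookmark uses different tags.
sparse = TagIndex()
sparse.bookmark("user_b", SITE, ["conference", "podcasts"])

print(rich.lookup("technology"))    # the link is found
print(sparse.lookup("technology"))  # the link is missed
```

The more subscribers who tag a site, the more routes lead to it, which is where the collective compensates for any one person's arbitrary choice of tags.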
Academics may prefer CiteULike (http://www.citeulike.com), which describes itself as a “free service to help academics share, store, and organize the academic papers they are reading.” It advises subscribers that they can share their library with others and find out who is reading the same papers. Most of the content stored at CiteULike is scholarly. Searches often yield more current and authoritative articles than a similar Google Scholar search. When CiteULike users locate relevant articles, they can discover who else links to them, and that can ultimately lead to colleagues who share research interests.
RSS technology also contributes to the marginalization of traditional search engines. Why search for information when RSS technology will “push” it to me? The mainstream media and Internet news services such as Yahoo and Findory offer RSS feeds for their news and allow feeds for customized searches. Many of the social bookmarking sites allow any tag to be saved as an RSS feed. The paradigm is shifting from searching existing content to retrieving, by preselected variables, future content as it is published on the Internet. The social networking communities of the Internet are breaking down the concept of the solo searcher and giving credence to the power of sharing information.
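The shift from searching to being "pushed" new content can be sketched in a few lines of Python. The sample feed, its URLs, and the `read_items` and `new_items` helpers are invented for illustration; the sketch assumes a plain RSS 2.0 layout with `item`, `title`, and `link` elements and uses only the standard library. A real aggregator would fetch the feed over the network and handle many more cases.

```python
import xml.etree.ElementTree as ET

# Invented sample standing in for a feed of links saved under one tag.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Links tagged: technology</title>
    <item>
      <title>First saved article</title>
      <link>https://one.example/article</link>
    </item>
    <item>
      <title>Second saved article</title>
      <link>https://two.example/article</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_links):
    """Return only the items not seen before, and remember them as seen."""
    fresh = [(t, l) for t, l in read_items(feed_xml) if l not in seen_links]
    seen_links.update(l for _, l in fresh)
    return fresh


seen = {"https://one.example/article"}  # already read on an earlier check
for title, link in new_items(SAMPLE_FEED, seen):
    print(title, "-", link)
```

The searcher chooses the variables once (here, a tag) and afterward receives only what has been published since the last check.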
Can We Live With Good Enough?
An immediate reaction to this article might be, “Why bother?” After all, in a Google universe does expanding one’s search skills and diversifying the engines used really matter? Isn’t it sufficient to simply pick the most convenient search engine, search whatever words come to mind, and then make the most of whatever appears on the first results screen?
These are legitimate questions. Do we sanction “satisficing” by students or hold them to higher standards? As the world of search changes, librarians, instructional technologists, information technologists, and faculty must rethink what quality research means and what our research expectations for students should be.
Social bookmarking communities and RSS feeds probably won’t replace traditional search engines soon, but these new technologies, along with new and different search engines, should be examined more closely by faculty and information professionals. Educators should lead the way in helping students improve the quality of their Internet research to better prepare them for a future workplace where information is a critical commodity. Simply accepting the poor quality that results from “just good enough” research need not be our destiny.
We can use teachable moments and formal instruction to spread the word about better search methods and new strategies. We should advocate research strategies that encourage the use of many options, everything from traditional library databases (see the sidebar) and Internet search engines to resource pages and directories, deep Web resources, RSS feeds, and social bookmarking communities. Students can learn when each option is appropriate for particular types of research. But it will begin with raising our own awareness of the many options for searching and retrieving information and committing ourselves to gaining proficiency in using them.