
Part II: Understanding Results

Google strives to make it easy to find whatever you’re seeking, whether it’s a web page, a news article, a definition, or something to buy. After you enter a query, Google returns a results list ordered by what it considers the items’ relevance to your query, listing the best match first. (Sponsored links may appear above and to the right of the search results.) This part of Google Guide describes what appears on a results page and how to evaluate what you find so you’ll be better able to determine if a page includes the information you’re seeking or links to it.

How Google Works

If you aren’t interested in learning how Google creates the index and the database of documents that it accesses when processing a query, skip this description. I adapted the following overview from Chris Sherman and Gary Price’s wonderful description of How Search Engines Work in Chapter 2 of The Invisible Web (CyberAge Books, 2001).

Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:

  • Googlebot, a web crawler that finds and fetches web pages.
  • The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
  • The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.

Let’s take a closer look at each part.

1. Googlebot, Google’s Web Crawler

Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.

Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.

Googlebot finds pages in two ways: through an add URL form, www.google.com/addurl.html, and through finding links by crawling the web.

Screen shot of web page for adding a URL to Google.

Unfortunately, spammers figured out how to create automated bots that bombarded the add URL form with millions of URLs pointing to commercial propaganda. Google rejects those URLs submitted through its Add URL form that it suspects are trying to deceive users by employing tactics such as including hidden text or links on a page, stuffing a page with irrelevant words, cloaking (aka bait and switch), using sneaky redirects, creating doorways, domains, or sub-domains with substantially similar content, sending automated queries to Google, and linking to bad neighbors. So now the Add URL form also has a test: it displays some squiggly letters designed to fool automated “letter-guessers”; it asks you to enter the letters you see — something like an eye-chart test to stop spambots.

When Googlebot fetches a page, it culls all the links appearing on the page and adds them to a queue for subsequent crawling. Googlebot tends to encounter little spam because most web authors link only to what they believe are high-quality pages. By harvesting links from every page it encounters, Googlebot can quickly build a list of links that can cover broad reaches of the web. This technique, known as deep crawling, also allows Googlebot to probe deep within individual sites. Because of their massive scale, deep crawls can reach almost every page in the web. Because the web is vast, this can take some time, so some pages may be crawled only once a month.

Although its function is simple, Googlebot must be programmed to handle several challenges. First, since Googlebot sends out simultaneous requests for thousands of pages, the queue of “visit soon” URLs must be constantly examined and compared with URLs already in Google’s index. Duplicates in the queue must be eliminated to prevent Googlebot from fetching the same page again. Second, Googlebot must determine how often to revisit a page. On the one hand, it’s a waste of resources to re-index an unchanged page. On the other hand, Google wants to re-index changed pages to deliver up-to-date results.
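The link harvesting and duplicate elimination described above can be sketched in a few lines. This is only an illustration, not Googlebot's actual code: the link graph `LINKS`, the `example` host names, and the `crawl` function are all hypothetical, and a real crawler would fetch pages over the network and throttle requests per server.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real web pages.
LINKS = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["a.example/", "c.example/"],
    "c.example/": [],
}

def crawl(seed, fetch_links, limit=100):
    """Visit pages breadth-first, never queueing the same URL twice."""
    queue = deque([seed])   # the "visit soon" queue
    seen = {seed}           # every URL ever queued, used to drop duplicates
    order = []
    while queue and len(order) < limit:
        url = queue.popleft()
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

visited = crawl("a.example/", lambda url: LINKS.get(url, []))
print(visited)
```

Keeping a separate `seen` set means a URL is checked in constant time no matter how long the queue grows.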

To keep the index current, Google continuously recrawls popular, frequently changing web pages at a rate roughly proportional to how often the pages change. Such crawls keep an index current and are known as fresh crawls. Newspaper pages are downloaded daily; pages with stock quotes are downloaded much more frequently. Of course, fresh crawls return fewer pages than the deep crawl. The combination of the two types of crawls allows Google both to make efficient use of its resources and to keep its index reasonably current.
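The guide doesn't say how Google chooses a revisit rate, so the following is only one plausible way to make revisits track how often a page changes. The halve-or-double rule, the function name `next_interval`, and the one-hour and thirty-day bounds are all assumptions made for illustration.

```python
def next_interval(current_hours, changed, min_hours=1, max_hours=24 * 30):
    """Return the wait before the next visit (assumed halve/double rule).

    A page that changed since the last visit is revisited sooner;
    an unchanged page is revisited later, within fixed bounds.
    """
    proposed = current_hours / 2 if changed else current_hours * 2
    return max(min_hours, min(max_hours, proposed))

# A page checked daily that keeps changing drifts toward hourly visits.
interval = 24
for _ in range(3):
    interval = next_interval(interval, changed=True)
print(interval)
```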

2. Google’s Indexer

Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.

To improve search performance, Google ignores (doesn’t index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. To improve performance further, the indexer also ignores some punctuation and multiple spaces, and converts all letters to lowercase.
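The data structure described in this section is commonly called an inverted index. Here is a minimal sketch, assuming a tiny stop-word list and two made-up documents; the names `STOP_WORDS`, `tokenize`, and `build_index` are illustrative, and dropping every single-character token is a simplification of "certain single digits and single letters."

```python
import re
from collections import defaultdict

# Assumed stop-word list, taken from the examples in the text.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}

def tokenize(text):
    """Lowercase the text and split on punctuation and whitespace."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to {doc_id: [word positions]}, skipping stop words."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            if word in STOP_WORDS or len(word) == 1:
                continue
            index[word][doc_id].append(position)
    return index

docs = {
    "a": "The History of the Brassiere",
    "b": "History is written on the web",
}
index = build_index(docs)
for term in sorted(index):          # alphabetical, like the index described above
    print(term, dict(index[term]))
```

Positions are counted before stop words are dropped, so the stored locations still reflect where each word sits in the original text.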

3. Google’s Query Processor

The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.

PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.

Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page. Visit SEOmoz.org’s report for an interpretation of the concepts and the practical applications contained in Google’s patent application.

Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling-correcting system uses such techniques to figure out likely alternative spellings. Google closely guards the formulas it uses to calculate relevance; they’re tweaked to improve quality and performance, and to outwit the latest devious techniques used by spammers.

Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page, options offered by Google’s Advanced Search Form and Using Search Operators (Advanced Operators).
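To make the idea of favoring nearby, in-order terms concrete, here is a toy scoring function over word positions such as those stored in the index. The formula (inverse distance, doubled when the terms appear in query order) and the name `proximity_score` are assumptions for illustration; Google's actual formulas are not public.

```python
def proximity_score(first_positions, second_positions):
    """Score a page for a two-word query from the words' positions.

    Closer occurrences score higher, and an occurrence in the same
    order as the query counts double (an assumed weighting).
    """
    best = 0.0
    for a in first_positions:
        for b in second_positions:
            if a == b:
                continue
            score = 1.0 / abs(b - a)
            if b > a:               # second word follows the first
                score *= 2
            best = max(best, score)
    return best

adjacent_in_order = proximity_score([0], [1])
adjacent_reversed = proximity_score([1], [0])
far_apart = proximity_score([0], [10])
print(adjacent_in_order, adjacent_reversed, far_apart)
```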

Let’s see how Google processes a query.

  1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book: it tells which pages contain the words that match any particular query term.
  2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
  3. The search results are returned to the user in a fraction of a second.
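The three steps above can be mimicked with two dictionaries, one playing the index servers and one the doc servers. Everything here is hypothetical: the contents of `INDEX` and `DOCS`, the function names, and the snippet rule (a fixed window of text around the first matching term).

```python
# Hypothetical "index server": term -> ids of documents containing it.
INDEX = {"google": {1, 2}, "guide": {1}, "search": {2}}

# Hypothetical "doc server": id -> stored document text.
DOCS = {
    1: "Google Guide is an online tutorial about using Google.",
    2: "Search tips for getting more out of Google.",
}

def lookup(query):
    """Step 1: find documents that contain every query term."""
    postings = [INDEX.get(term, set()) for term in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

def snippet(text, terms, width=20):
    """Step 2: cut a short excerpt around the first term that matches."""
    lower = text.lower()
    for term in terms:
        pos = lower.find(term)
        if pos != -1:
            return text[max(0, pos - width):pos + len(term) + width]
    return text[:2 * width]

def search(query):
    """Step 3: return (doc id, snippet) pairs to the user."""
    terms = query.lower().split()
    return [(doc_id, snippet(DOCS[doc_id], terms)) for doc_id in lookup(query)]

print(search("google guide"))
```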

For more information on how Google works, take a look at the following articles.


This page was last modified on: Friday February 2, 2007

Results Page

The results page is filled with information and links, most of which relate to your query.

Screen shot indicating what is shown on a Google results page.

  • Google Logo: Click on the Google logo to go to Google’s home page.
  • Statistics Bar: Describes your search, includes the number of results on the current results page and an estimate of the total number of results, as well as the time your search took. For the sake of efficiency, Google estimates the number of results; it would take considerably longer to compute the exact number. This estimate is unreliable.

    Every underlined term in the statistics bar is linked to its dictionary definition. Queries that are linked to just one definition are followed by a definition link.

  • Tips: Sometimes Google displays a tip in a box just below the statistics bar.
    Screen shot of a Google tip
  • Search Results: Ordered by relevance to your query, with the result that Google considers the most relevant listed first. Consequently you are likely to find what you’re seeking quickly by looking at the results in the order in which they appear. Google assesses relevance by considering over a hundred factors, including how many other pages link to the page, the positions of the search terms within the page, and the proximity of the search terms to one another.

    Below are descriptions of some search-result components. These components appear in fonts of different colors on the result page to make it easier to distinguish them from one another.

    • Page Title: (blue) The web page’s title, if the page has one, or its URL if the page has no title or if Google has not indexed all of the page’s content. Click on the page title (e.g., The History of the Brassiere - Mary Phelps Jacob) to display the corresponding page.
    • Snippets: (black) Each search result usually includes one or more short excerpts of the text that matches your query with your search terms in boldface type. Each distinct excerpt or snippet is separated by an ellipsis (…). These snippets, which appear in a black font, may provide you with

      • The information you are seeking
      • What you might find on the linked page
      • Ideas of terms to use in your subsequent searches

      When Google hasn’t crawled a page, it doesn’t include a snippet. A page might not be crawled because its publisher requested no crawling, or because the page was written in such a way that it was too difficult to crawl.

    • URL of Result: (green) Web address of the search result. In the screen shot, the URL of the first result is inventors.about.com/library/weekly/aa042597.htm.
    • Size: (green) The size of the text portion of the web page. It is omitted for sites not yet indexed. In the screen shot, “5k” means that the text portion of the web page is 5 kilobytes. One kilobyte is 1,024 (2^10) bytes. One byte typically holds one character. In general, the average size of a word is six characters. So each 1k of text is about 170 words. A page containing 5k of text thus is about 850 words long.

      Large web pages are far less likely to be relevant to your query than smaller pages. For the sake of efficiency, Google searches only the first 101 kilobytes (approximately 17,000 words) of a web page and the first 120 kilobytes of a pdf file. Assuming 15 words per line and 50 lines per page, Google searches the first 22 pages of a web page and the first 26 pages of a pdf file. If a page is larger, Google will list the page as being 101 kilobytes or 120 kilobytes for a pdf file. This means that Google’s results won’t reference any part of a web page beyond its first 101 kilobytes or any part of a pdf file beyond the first 120 kilobytes.

    • Date: (green) Sometimes the date Google crawled a page appears just after the size of the page. The date tells you the freshness of Google’s copy of the page. Dates are included for pages that have recently had a fresh crawl.
    • Indented Result: When Google finds multiple results from the same website, it lists the most relevant result first with the second most relevant page from that same site indented below it. In the screen shot, the indented result and the one above it are both from the site inventors.about.com.

      Limiting the number of results from a given site to two ensures that pages from one site will not dominate your search results and that Google provides pages from a variety of sites.

    • More Results: When there are more than two results from the same site, access the remaining results from the “More results from…” link.

      When Google returns more than one page of results, you can view subsequent pages by clicking either a page number or one of the “o”s in the whimsical “Gooooogle” that appears below the last search result on the page.

      Click on a number or an "o" to see another page of results.

      If you find yourself scrolling through pages of results, consider increasing the number of results Google displays on each results page by changing your global preferences.

      In practice, however, if pages of interest to you aren’t within the first 10 results, consider refining your query instead of sifting through pages of irrelevant results. To simplify such refinements, Google includes a search box at the bottom of the page you can use to enter your refined query.

  • Sponsored Links: Your results may include some clearly identified sponsored links (advertisements) relevant to your search. If any of your search terms appear in the ads, Google displays them in boldface type.
  • Spelling Corrections, Dictionary Definition, Cached, Similar Pages, News, Product Information, Translation, Book results: Your results may include these links, which are described in the next few chapters.
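The size arithmetic in the Size entry above is easy to check. This sketch uses the same assumptions as the text (1,024 bytes per kilobyte, one byte per character, six characters per word, 15 words per line and 50 lines per printed page); the function names are made up. With these exact values the 120-kilobyte PDF limit works out to about 27 printed pages, so the page counts quoted above should be read as rough estimates.

```python
BYTES_PER_KILOBYTE = 1024
CHARS_PER_WORD = 6                 # average word size assumed in the text
WORDS_PER_PRINTED_PAGE = 15 * 50   # 15 words per line, 50 lines per page

def approx_words(kilobytes):
    """Estimate the word count of a text of the given size."""
    return kilobytes * BYTES_PER_KILOBYTE // CHARS_PER_WORD

def approx_printed_pages(kilobytes):
    """Estimate how many printed pages that many words would fill."""
    return approx_words(kilobytes) // WORDS_PER_PRINTED_PAGE

for label, kb in [("5k page", 5), ("web page limit", 101), ("PDF limit", 120)]:
    print(label, approx_words(kb), "words,", approx_printed_pages(kb), "printed pages")
```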


For more on what’s included on Google’s results page, visit www.google.com/help/interpret.html.


This page was last modified on: Friday February 2, 2007

Links Included with Your Results

Google may include links to the following types of information above or alongside your results.

The shortcut links that often appear to the left of an icon are known as OneBox results.


This page was last modified on: Friday February 2, 2007

Spelling Corrections and Suggestions

Not sure how to spell something? Don’t worry, try gessing or speling any way you can. In just the first few months on the job, Google engineer Noam Shazeer developed a spelling correction (suggestion) system based on what other users have entered. The system automatically checks whether you are using the most common spelling of each word in your query.

(We used to suggest that you search Google for phonitick spewling. But so many Web pages added the same example that now — or, at least, when we last checked — Google no longer treats those “words” as incorrectly spelled! Google’s system doesn’t match words against an actual dictionary; it compares them to commonly-used words.)

Want to know the approximate value of a used car? Check out its “Blue Book” value.

Google search box with [ blu book ].  

Notice that Google suggests the correct spelling if you fail to type the final “e” in “blue.”

Google suggests an alternative more common spelling.

Since an alternative spelling is more common, Google asks: Did you mean: blue book. Click the suggested spelling link to launch a new search on the “blue book” spelling instead of the original “blu book.”

Google’s checker is particularly good at recognizing frequently made typos, misspellings, and misconceptions. It analyzes all terms in your query to recognize what you most likely intended to enter. For example, when you search for [ untied stats ], the spelling checker suggests Did you mean: united states, although each individual word is spelled correctly.

Regardless of whether it suggests an alternative spelling, Google returns results that match your query if there are any. If there aren’t any that match your query, Google may offer an alternative spelling, search tips, and a link to Google Answers. The last is a service that provides assistance from expert online researchers for a fee.

If no results match your query, Google offers search tips.

Google figures out possible misspellings and their likely correct spellings by using words it finds while searching the web and processing user queries. So, unlike many spelling correctors, Google can suggest common spellings for:

  • Proper nouns (names and places)
  • Words that may not appear in a dictionary
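A frequency-based suggester of this kind can be sketched without any dictionary at all. The counts in `QUERY_COUNTS` are invented, and the approach (prefer a much more common term that is one keystroke away) is a well-known textbook technique rather than Google's actual system. In particular, it looks at one word at a time, so it would not catch context-dependent cases such as [ untied stats ].

```python
from collections import Counter

# Invented counts of how often each term appears in queries and pages.
QUERY_COUNTS = Counter({"blue": 900, "book": 800, "blu": 3, "britney": 500})

LETTERS = "abcdefghijklmnopqrstuvwxyz"

def one_edit_away(word):
    """All strings reachable by one delete, swap, replace, or insert."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in LETTERS}
    inserts = {a + c + b for a, b in splits for c in LETTERS}
    return deletes | swaps | replaces | inserts

def suggest(word, counts=QUERY_COUNTS):
    """Return a more common nearby spelling, or None if there is none."""
    candidates = {w for w in one_edit_away(word) if w in counts}
    candidates.add(word)
    best = max(candidates, key=lambda w: counts[w])
    return best if best != word else None

query = "blu book"
corrected = " ".join(suggest(w) or w for w in query.split())
print("Did you mean:", corrected)
```

Because the suggestions come from observed usage, proper nouns such as "britney" are handled the same way as ordinary words.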

People searching for Britney Spears have clearly found the spelling checker useful, as it has corrected spellings of her first name ranging from “Brittany” to “Prietny.” Visit www.google.com/jobs/britney.html to see hundreds of other ways people have misspelled her name.

Be aware that the spelling checker isn’t able to distinguish between a variant spelling and a word or name that is spelled similarly. So, before clicking on what Google suggests, check that it’s what you intended. For example, when looking up the San Francisco Bay Area web designer Mistrale, Google asks: Did you mean: Mistral, though I spelled the name correctly.

Screen shot showing how Google makes a suggestion though I spelled the term correctly.

Exercises

The first problem gives you practice in using Google’s spelling-correction system. For hints and answers to selected problems, see the Solutions page.

  1. On National Public Radio (NPR), you heard a researcher at Stanford University whose name sounded like Jeff Naumberg and want to send him email. What is Jeff’s email address?
  2. From Google’s home page, www.google.com, search for “french military victories” and then click on the I’m Feeling Lucky button to see Albino Blacksheep’s parody of a Google spelling correction result.

    Note: Though the page looks like a Google page, if you enter another query in the search box, it will be processed by the hosting site, listed in your browser’s address box.


This page was last modified on: Tuesday March 13, 2007

Dictionary Definitions

Want a definition for your search terms? It’s just a click away.

Google looks for dictionary definitions for your search terms. If it finds any definitions, it shows those words as underlined links or includes a definition link in the statistics bar section of the results page (located below the search box showing your query). Google is able to find definitions for acronyms, colloquialisms, and slang, as well as words that you would expect to find in a dictionary.

Google search box with [ triumvirate ].  

Click on the underlined terms or the definition link in the statistics bar to link to their dictionary definition, which also may include information on pronunciation, part of speech, etymology, and usage.

Screen shot of the underlined terms in the statistics bar, which are linked to their dictionary definitions.

For example, learn what co-founders Larry Page and Sergey Brin, and CEO Eric Schmidt mean when they say they “run Google as a triumvirate” by clicking on the link triumvirate to look up “triumvirate” on dictionary.reference.com.

Screen shot of one of the dictionary definitions for "triumvirate"

Phrases with idiomatic meanings that aren’t necessarily implied by the definitions of the individual words will be linked to their dictionary definitions, e.g., “happy hour,” “put off,” “greasy spoon,” and “raise the roof.”

Google search box with [ happy hour ].  

If Google doesn’t find a definition for a term, try using Google Glossary.

Exercises

These problems give you practice in finding dictionary definitions. For hints and answers to selected problems, see the Solutions page.

  1. According to the dictionary, what is an “urban legend”?
  2. Find the history of the word chivalry. From which language does it come and from what word?
  3. Does Google provide a link to dictionary definitions of terms in languages other than English?
  4. What does zeitgeist mean? What’s on the Google Zeitgeist page www.google.com/press/zeitgeist.html?


This page was last modified on: Tuesday March 13, 2007

Cached Pages

Google takes a snapshot of each page it examines and caches (stores) that version as a back-up. The cached version is what Google uses to judge if a page is a good match for your query.

Practically every search result includes a Cached link. Clicking on that link takes you to the Google cached version of that web page, instead of the current version of the page. This is useful if the original page is unavailable because of:

  • Internet congestion
  • A down, overloaded, or just slow website
  • The owner’s recently removing the page from the Web

Sometimes you can access the cached version of a page from a site that otherwise requires registration or a subscription.

Note: Since Google’s servers are typically faster than many web servers, you can often access a page’s cached version faster than the page itself.

If Google returns a link to a page that appears to have little to do with your query, or if you can’t find the information you’re seeking on the current version of the page, take a look at the cached version.

Let’s search for pages on the Google help basic search operators.

Google search box with [ Google help basic search operators ].  

Screen shot showing cached link in a search result

Click on the Cached link to view Google’s cached version of the page with the query terms highlighted. The cached version also indicates terms that appear only on links pointing to the page and not on the page itself.

On the cached version, Google highlights search terms and indicates terms that appear only on links pointing to the page.

Note: Internet Explorer users may view a page with any word(s) highlighted, not just search terms, by using the highlight feature of the Google Toolbar, which we cover in Making Google Easier with Google Tools.

When Google displays the cached page, a header at the top serves as a reminder that what you see isn’t necessarily the most recent version of the page.

The Cached link will be omitted for sites whose owners have requested that Google remove the cached version or not cache their content, as well as any sites Google hasn’t indexed.

If the original page contains more than 101 kilobytes of text, the cached version of the page will consist of the first 101 kbytes (120 kbytes for pdf files).

You can also retrieve Google’s cached version of a page via the cache: search operator. For example, [ cache:www.pandemonia.com/flying/ ] will show Google’s cached version of Flight Diary in which Hamish Reid documents what’s involved in learning how to fly.

On the cached version of a page, Google will highlight terms in your query that appear after the cache: search operator. For example, in the snapshot of the page www.pandemonia.com/flying/, Google highlights the terms “fly” and “diary” in response to the query [ cache:www.pandemonia.com/flying/ fly diary ].
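If you build such queries often, a small helper can assemble them. The sketch below joins an operator, its target, and any extra terms into a query string, then encodes it into a search URL. The helper names are hypothetical, and the `https://www.google.com/search?q=` address is an assumption about how a query is passed to Google, not something this guide specifies.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.google.com/search"   # assumed search endpoint

def operator_query(operator, target, extra_terms=()):
    """Build a query such as 'cache:www.example.com/ term1 term2'."""
    query = f"{operator}:{target}"
    if extra_terms:
        query += " " + " ".join(extra_terms)
    return query

def search_url(query):
    """Encode the query so it can be pasted into a browser's address box."""
    return SEARCH_URL + "?" + urlencode({"q": query})

query = operator_query("cache", "www.pandemonia.com/flying/", ["fly", "diary"])
print(query)
print(search_url(query))
```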

Use the Wayback Machine when you want to visit a version of a web page that is older than Google’s cached version.

Exercises

These problems give you practice accessing Google’s cached version of a page. For hints and answers to selected problems, see the Solutions page.

  1. After Nelson Blachman received reprints of a paper he wrote for the June 2003 issue of The Mathematical Scientist, he wanted to discover what other sorts of papers appear in the same issue of this semiannual publication. Find a table of contents for The Mathematical Scientist for Nelson.
  2. Compare the dates on the current page with the dates on the cached version for the following organizations:
    • CNN
    • New York Times
    • Linux Magazine
    • North Texas Food Bank

    Note: Google indexes a page (adds it to its index and caches it) frequently if the page is popular (has a high PageRank) and if the page is updated regularly. The new cached version replaces any previous cached versions of the page.

  3. Check the dates that the Wayback Machine archived versions of Google Guide.


This page was last modified on: Tuesday March 13, 2007

Similar Pages

Here’s how to find results similar to another Google search result. Let’s say you’re interested in finding sites similar to that of Consumer Reports. First, search for their site.

Google search box with [ "Consumer Reports" ].  

Click on the Similar pages link that appears on the bottom line for the Consumer Reports result.

Screen shot of Similar pages link in search results

The link may be useful for finding more consumer resources, or information on Consumer Reports’ competitors.

Screen shot of what you see when you click on the Similar pages link

You can also find similar pages by using the Page-Specific Search selector on the Advanced Search page or by using the related: search operator. If you expect to search frequently for similar pages, you may want to install a GoogleScout browser button.

Note: The similar pages feature is most effective on pages that are popular, i.e., that are linked to from many pages.

How does Google find similar pages?

By finding other sites listed on pages that link to the specified page. Let’s see how Google chooses sites similar to Google Guide. I use the related: search operator, which returns the same results as the Similar pages link.

Google search box with [ related:www.googleguide.com ].  

Screen shot showing pages similar to Google Guide

Now let’s look at one of the sites that link to Google Guide, as it was at the time we made the screen shot above. On the Michigan State University (MSU) Libraries page, www.lib.msu.edu/sowards/home/home5.htm (shown in the screen shot below), Google Guide is listed near the top of the page just after a link to Google’s Zeitgeist page, www.google.com/press/zeitgeist.html. The next three sites listed as being similar to Google Guide (Metaspy, the MEL Internet Myths and Hoaxes, and Web Characterization) are also listed on the MSU page. Google automatically selected these sites by considering many factors including the popularity of the pages containing links to Google Guide, the positions, sizes, and proximities of other links to the Google Guide link.
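The core idea, that two sites are related when the same pages link to both, can be sketched as a simple count. The link data in `OUTLINKS` is made up, loosely modeled on the MSU example, and the function ignores the other factors mentioned above (page popularity and the position, size, and proximity of links).

```python
from collections import Counter

# Made-up data: each linking page and the sites it links to.
OUTLINKS = {
    "msu-library": ["googleguide", "zeitgeist", "metaspy", "webchar"],
    "some-blog": ["googleguide", "metaspy"],
    "unrelated": ["zeitgeist", "news"],
}

def similar_sites(target, outlinks):
    """Rank sites by how many pages link to both them and the target."""
    counts = Counter()
    for links in outlinks.values():
        if target not in links:
            continue
        for site in dict.fromkeys(links):   # drop repeats, keep order
            if site != target:
                counts[site] += 1
    return [site for site, _ in counts.most_common()]

ranked = similar_sites("googleguide", OUTLINKS)
print(ranked)
```

A site with few inbound links gives this method little to count, which fits the note above that the feature works best on popular pages.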

Screen shot of an MSU library page that links to Google Guide

Another resource for similar results is the category link that may appear just below the snippet or above your search results, which is described next. If there isn’t a category link, try using Google’s Directory.

For more information about the Similar pages link, visit www.google.com/help/features.html#related.

Exercises

These problems give you practice in using Google’s Similar pages feature. For hints and answers to selected problems, see the Solutions page.

  1. Find a site that will get your name off mailing lists so that you receive less commercial advertising mail. Click on the Similar pages link to find other such sites.
  2. What sites are similar to the Internet Movie Database?


This page was last modified on: Tuesday March 13, 2007

News Headlines

When Google finds current news relating to your query, Google includes up to three headlines that link to news stories above your search results. Why at most three? So as not to push the web search results off the page.

Of course, since news by definition reports recent events, you’ll see the most recent headlines about the United Nations (if there are any recent headlines, that is) when you enter the query [ United Nations ].

Google search box with [ United Nations ].  

News relating to your query appears above your results

For more news stories or to browse the latest headlines, visit Google News Search at news.google.com, which we describe in the Part named Services.

Exercises

These problems give you practice in searching for news headlines. For hints and answers to selected problems, see the Solutions page.

  1. Find the latest news about Google.
  2. Find the latest news on Iraq.


This page was last modified on: Tuesday March 13, 2007

Product Search

When Google finds products relevant to your query, it may show, above your search results, up to three links to items that merchants list in Google’s Product Search service.

Google search box with [ portable dvd player ].  

Screen shot of products that match your query

Product Search is also called Shopping. There are two Shopping links near the top of the screen shot above.

Exercises

These problems give you practice in searching for products.

  1. Find denim jackets.
  2. Find cell phones (mobile phones).


This page was last modified on: Thursday March 13, 2008

File Type Conversion

Google converts all file types it searches to either HTML or text (unless, of course, they already are in one of these formats). Google searches a variety of file formats including

File Format | Suffix | Description
Adobe Acrobat PDF | pdf | A publishing format commonly used for product manuals and documents of all sorts.
Adobe PostScript | ps | A printing format often used for academic papers.
Hypertext Markup Language | html or htm | The primary language for web pages.
Lotus 1-2-3 | wk1, wk2, wk3, wk4, wk5, wki, wks, or wku | A spreadsheet format.
Lotus WordPro | lwp | A word processing format.
MacWrite | mw | A word processing format.
Microsoft Excel | xls | A spreadsheet format.
Microsoft PowerPoint | ppt | A format for presentations and slides.
Microsoft Word | doc | A common word processing format.
Microsoft Works | wks, wps, or wdb | A word processing format.
Microsoft Write | wri | A Macintosh word processing format.
Rich Text Format | rtf | A format used to exchange documents between Microsoft Word and other formats.
Plain Text | ans or txt | Ordinary text with no special formatting.

Clicking on a link to a non-HTML file will launch the associated program for reading the file, provided it’s installed on your system.

If you can’t view the page in the native format — for instance, if you don’t have Adobe Acrobat on your computer, or if you want faster access to the file — click on either the “View as HTML” or “View as Text” link.

Note: Portions of some files converted to HTML or text may be difficult to read.

Non-HTML files can be viewed in their original forms, or as HTML or text.

You can use the File Format section of the Advanced Search form or the filetype: search operator to restrict your results to a particular format.

For more information about file types that Google supports, visit www.google.com/help/faq_filetypes.html.

Exercises

These problems give you practice viewing files of different types. For hints and answers to selected problems, see the Solutions page.

  1. Find a document with tips on job interviewing and salary negotiation that is in PDF/Adobe Acrobat format. What differences in the appearance of the document result from viewing it in its native format (Adobe Acrobat) versus HTML?
  2. Find a PowerPoint slide presentation on first aid and choking. View the presentation as HTML.
  3. Find PDF or PostScript documents and course notes on symplectic geometry that are on university and other educational sites.

    This problem was inspired by Julian Uschersohn.


This page was last modified on: Tuesday March 13, 2007

Translation

As the web has spread across the world, more and more web pages are available in languages other than English. Google provides a translation link and language tools to enable you to read pages written in unfamiliar languages.

Google translates pages by computer. Machine translation is difficult to do well and tends not to be as clear as human translation. But it can give you the gist of what’s written or suggestions for translating something into another language.

Your results may include a “Translate this page” link when a page in your results is written in a language different from your interface language (as specified by your Google Preferences, which we describe soon). Your interface language is the language in which Google displays messages and labels, buttons, and tips on Google’s home page and results page. You can translate pages written in English, French, German, Italian, Portuguese, and Spanish into another language from that set.

Results include a "Translate this page" link when Google finds a page in a language different from your language of choice.

Google’s Language Tools overcome language barriers. Click on the Language Tools link to the right of the search box on Google’s home page,

The Language Tools link on Google's home page

or visit www.google.com/language_tools, or select the Language Tools menu option in the Google Toolbar to:

  • Search for pages written in specific languages
    With the Language Tool, you can search for pages written in a specific language or in a specific country.
  • Search for pages located in specific countries
  • Use the Google interface in another language

    That is, set Google’s home page, messages and labels, and buttons to display in a specific language

  • Visit Google’s site in a specific country.

    For example, visit www.google.de in Germany.

    With the Language Tool, you can visit Google's site in a specific country.

  • Translate any text or web page from a limited set of languages (English, French, German, Italian, Portuguese, and Spanish) into another language in that set.

If you want to translate some text or a page into a language other than those that Google’s translation tool offers, check out Fagan Finder’s Translation Wizard.

If you’re interested in translating Google Guide, please use our contact form and also review Erik Hoy’s advice for Google Guide translators. The Danish Google Guide, bibliotek.kk.dk/soeg_bestil_forny/googleguide, is available through the Copenhagen Central Library’s website. You can find a Hebrew version of Google Guide at www.googleguide.co.il/.

Exercises

These problems give you practice with translating words, pages, and results, and with finding pages in specific countries. For hints and answers to selected problems, see the Solutions page.

  1. Find out about municipal swimming pools that you can use when visiting Naples. Hint: Find the Italian words for “municipal swimming pools Naples” and then search for them on pages in Italy. You can use your browser’s Copy and Paste features to transfer the Italian words from one screen to another.
  2. Find the name of the mayor of Montpellier, France, by searching the city website montpellier.fr. It may help to know the French word for “mayor.”
  3. Translate “I wish to mail a package. Where is the nearest post office? Thank you.” into Spanish.
  4. Find listings or photos of old books at the national library of Spain. Hint: Translate the two unrelated phrases “old books” and “national library Spain” separately; otherwise, the translation software may try to make them into a sentence (and add “noise” words).
  5. Restrict your search to France and search for pages in English on the war in Iraq.


This page was last modified on: Tuesday March 13, 2007

Customizing Your Results: Preferences

Whenever I run a new piece of software, … I [first] … look at the program’s ‘preferences’ panel. By clicking through the options, I rapidly learn what a program can do and what its shortcomings are. Google is no different. — Simson Garfinkel, Getting More from Google, Technology Review, June 4, 2003

You can customize the way your search results appear by configuring your Google global preferences, options that apply across most Google search services. To change these options, click on the Preferences link, which is to the right of Google’s search box, or visit www.google.com/preferences.

A screen shot showing that the Preferences link is to the right of the search box on Google's home page

From the Preferences page, specify your global preferences, including

  • Interface Language: the language in which Google will display tips, messages, and buttons for you
  • Search Language: the language of the pages Google should search for you
  • SafeSearch: automatic filtering and blocking of web pages with explicit sexual content
  • Number of results: how many search results are to be displayed per page
  • Results window: when enabled, clicking on the main link (typically the page title) for a result will open the corresponding page in a new window

When you set your preferences, Google stores your settings in a “cookie” on the computer you are using. Google doesn’t associate that cookie with any other computer you use. So, if you want Google to work similarly on all the computers you use, you will need to set these preferences on each one of them.

1. Interface Language

The set of languages in which you want to allow messages and labels, text on buttons, and tips to be displayed. Your choice of interface languages is much larger than the “translate” set of languages (those that can be translated into your interface language). It includes relatively obscure languages, such as Catalan, Maltese, Occitan, and Welsh; designed languages like Interlingua and Esperanto; and frivolous languages such as Bork, bork, bork!, Hacker, and Pig Latin.

Screen shot showing the selection of languages in which you can display messages and labels, text on buttons, and tips

If you set your interface language to Greek, messages and text on links, tabs, and buttons will be displayed in Greek.

A screen shot showing Google's search box with the interface language set to Greek

The interface language is configured on the Preferences page. The pull-down menu allows you to choose from over 80 languages.

A screen shot showing how to specify your Interface Language

Note: If you don’t find your preferred language in the list, you can volunteer to translate Google’s help information and search interface into that language via the Google In Your Language program.

If you select an interface language other than English, when using Google Web search you will be given the option of searching the entire web or just pages written in your interface language. For example, with French as the interface language the search box looks like this:

A screen shot showing Google's search box with the interface language set to French

Note: Most non-English Google home pages have a “Google.com in English” link in case you can’t read the rest of the page.

2. Search Language

By default, Google Web search includes all pages on the Web. You can choose to restrict your searches to those pages written in the languages of your choice by setting the search language.

Google Search Language Preferences

If you want to restrict results to a single language for just a few queries, consider using the Language section of Google’s Advanced Search page instead.

3. SafeSearch Filtering

Google’s SafeSearch filters out sites with pornography and explicit sexual content. Moderate filtering, the default, is set to exclude most explicit images from Google Image search results but not Google Web search or other Google search services.

Google SafeSearch Filtering Preferences

Google’s philosophy is to filter as little as possible. Google considered adding the capability to filter other controversial content besides pornography, e.g., hate speech, anarchy, and bomb making. But these are much more difficult to filter automatically. For example, if you try to filter hate speech, you may also filter out sites that discuss hate speech.

4. Number of Results

 

The most important setting, located near the bottom of the page, is “Number of Results.” By default, Google returns just 10 results for a search. Since Google’s search algorithms are so accurate, this default saves Google both computer resources and downloading time. But I always increase the default to 100. Although such searches take a little longer to download (especially over a dial-up connection), getting back 100 results saves me time when I’m searching for anything out-of-the-ordinary; it’s much faster to scroll through a Web page than to manually click through 10 pages of intermediate results.

 
  Simson Garfinkel, Getting More from Google, Technology Review, June 4, 2003 (MIT’s Alumni magazine)

You can increase the number of results displayed per page to 20, 30, 50, or 100. The more results displayed per page, the more likely you are to find what you want on the first page of results. The downside is that the more results per page, the more slowly the page loads. How much more time it takes depends on your connection to the Internet.

Google Number of Results Preferences

The Number-of-Results setting applies to Google’s Web, Groups, News, Froogle, and Directory search services. It does not apply to Images and Answers.

5. New Results Window

After you set the Results Window option on the Preferences page, when you click on the main link (typically the page title) for a result, Google will open the corresponding page in a new window.

Google Preferences

Even if you don’t set this option, you can open a result’s page in a new window:

  • In Internet Explorer, hold down the SHIFT key while you click on the link, or right-click the link and select Open in New Window.
  • In Firefox or Netscape, with a three-button mouse, simply click your mouse’s middle button on the link that you wish to display in a new window (this can be configured in the browser’s Preferences or Options section). If your mouse has two buttons and a center scroll wheel, the scroll wheel may also act as a middle button when you press down on it.

    With a two-button mouse, right-click the link and select Open Link in New Window.

6. Cookies and their Effect on Preferences

Google stores your preferences in a cookie on your computer. Among other things, this means:

  • If you use more than one computer, you’ll need to set your Google preferences on each one.
  • If your browser is set to deny cookies, your preferences can’t be saved.
  • If you use “cleanup” software that removes cookies, it may remove your Google preferences.

So, if Google seems to “forget” your preferences settings, look into what’s happening with your cookies. As of this writing, the Mozilla and Firefox web browsers have especially flexible cookie management — including site-by-site cookie preferences and a scrollable list of all saved cookies.

You’ll find more about cookies and how to control them in the pages Tracking and Cookies.

Exercises

These problems give you practice in changing preferences. After you’ve changed your preferences, run a couple of searches. For hints and answers to selected problems, see the Solutions page.

  1. Change your preferences to display 20 results per page.
  2. Change your preferences to use strict filtering, i.e., filter both explicit text and explicit images.
  3. Set your preferences to open search results in a new browser window.
  4. Configure your preferences to suit your needs.
  5. If you would like to have more than one set of preferences on your computer, e.g., one for searching French-language sites and one for searching all sites, find tools that let you keep more than one set of preferences by using more than one cookie.

    (For instance, the Mozilla browser allows you to have multiple “profiles,” each with its own set of cookies. You can also install more than one type of browser on the same computer. Both of these methods let you have more than one “identity” at the same time on the same computer.)


This page was last modified on: Tuesday March 13, 2007

Tracking

One of Google’s corporate philosophies has always been not to “do evil.” Google’s Privacy Policy Highlights explain more. (You’ll also find a link to their complete Privacy Policy on that page.)

Whether you trust Google or not, it’s good to know something about how Google tracks you. What does Google do to remember your Preferences? When does Google record personal information like your name and your email address? And how far can you go to protect yourself without losing Google’s services? We won’t try to answer all of those questions thoroughly or in detail — after all, this is a guide to Google, not to computer security. We’ll hit the highlights, though: enough information to help you understand something about what’s going on inside your browser and on Google’s servers.

Cookies vs. Accounts

Let’s start with an overview of the two main ways Google can keep track of you: by storing cookies in your web browser(s) and by asking you to sign up for a Google Account. The next two pages, Google Accounts and Controlling Cookies, have details.

  • A cookie is a piece of data that’s exchanged by a server (say, Google’s server) with a web browser that’s using its web pages. A cookie lets a web server track information about a particular web browser.

    For instance, a web server could store a cookie to help it track all of the web pages visited by you (actually, by your browser — including any other people who use the browser on your computer).

    Browsers can store many different cookies at the same time. You can control which cookies are set and how long they’re kept.

  • A Google Account holds some or all of the information about yourself that you’ve provided to Google at some time — such as your email address and your name. This information is maintained on Google’s servers. It gives you access to some Google Services, such as your personal shopping wishlist for Froogle.

    Google doesn’t require accounts for most of its services. The exceptions are services where identifying you is important, such as sending messages with Google Groups or Gmail.


This page was last modified on: Tuesday May 1, 2007

Google Accounts

A screen shot of the Google Accounts sign-in page

A Google Account is free of charge. The easiest way to get one is by visiting the Google Accounts sign-in page shown above. There you’ll be asked for information like your email address and a password.

Note: If you’re planning to get a Gmail account, and you’d like to use your Gmail address as your primary email address, you should sign up with Gmail first. Then your Gmail address can automatically become the email address for your Google Account. In fact, signing up for Gmail gives you a Google Account automatically.

Once you have a Google Account, you can tell Google who you are by signing in to the account. You’ll find a “Sign In” box at the top right-hand corner of many Google screens. You can also sign in from the home page of services like Gmail and Groups, as well as from the Accounts page shown above.

When you’re done with your Google account, you can simply go on with your business. You can also close your browser. If you’ve checked the “Remember me on this computer” box (see the example above), Google will set a cookie in your browser so that, the next time you open your browser and go to a Google page, Google’s server will sign you in automatically.


This page was last modified on: Tuesday May 1, 2007

Cookies

As we said in our earlier introduction, Tracking, a cookie is a bit of data from a web server. (Think of “fortune cookies” you might get after a Chinese meal, with little bits of wisdom inside each one.) Each web browser keeps its own set of cookies. So, if you use several computers — or several different web browsers on the same computer — each of those browsers has a different set of cookies in its “cookie jar” (actually, in the computer’s memory and/or disk).

So, for example, if you set your Google preferences on a particular browser, Google’s web server can set a cookie in that browser to maintain your preferences on that browser. But if you go to another computer, those preferences you just set on the previous computer won’t be set here because Google’s server can’t know that it’s you on that other computer. (Google has no idea where you are in a room.)

It’s possible for a web server to associate cookies with other information you enter. It won’t always do that, but it can — and often does. For instance, if you have an account and you sign in, then the web server will know who you are and that you’re using this browser. Then the web server may set cookies on that browser to “remember” that you’re using it and keep track of what you’re doing. A company’s privacy policy may explain what it stores in any cookies it sets.

Remember that, unless you have a Google Account and you sign in, Google can’t track you as a person. It can only track what’s happened on the particular browser you’re using at the moment. (This is true of other web servers, too: not just Google’s.)

You can remove the cookies from your browser by using cookie management programs or by using controls built into your browser itself. You can also prevent cookies from being set in the first place. Doing so can help to preserve your privacy, but you can also lose the advantages of cookies — such as being able to set preferences.

How Long Do Cookies Last?

Each cookie has a name and an expiration date. When a web server sends a cookie, it asks your browser to keep that particular cookie until a certain date and time. The expiration can be:

  • Some date in the future. This might be a few minutes or a few hours from now (to track something like your shopping cart in an online store). Or the cookie might expire many years in the future — which means the server wants to keep track of your browser for a long time.
  • When you close your browser. This is called a session cookie. The next time you start your browser, the session cookies from the previous session will have vanished.
  • Some date in the past. This is how the server asks a browser to remove a previously-stored cookie.
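The three cases in the list above can be sketched in a few lines of code. This is only an illustration of the idea, not how any particular browser is implemented; the function name and the sample dates are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def classify_expiration(expires: Optional[str], now: datetime) -> str:
    """Describe what a browser would do with a cookie's expiration date."""
    if expires is None:
        # No expiration date at all: keep the cookie until the browser closes.
        return "session"
    when = parsedate_to_datetime(expires)  # parses dates like "Sun, 17 Jan 2038 19:14:07 GMT"
    # A future date means "keep it until then"; a past date means "remove it".
    return "persistent" if when > now else "delete"


now = datetime(2007, 5, 1, tzinfo=timezone.utc)  # sample "current" time for the demo
print(classify_expiration(None, now))
print(classify_expiration("Sun, 17 Jan 2038 19:14:07 GMT", now))
print(classify_expiration("Thu, 01 Jan 1970 00:00:00 GMT", now))
```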

As we’ll see in a moment, Google uses a mixture of session cookies and longer-term cookies.

Most web browsers let you prevent a web server from setting cookies. Add-on software can also control cookies. The most sophisticated browsers, such as Firefox, give you a lot of control over cookies.

Your browser probably has a way to remove some or all stored cookies. Doing that will stop most (but not all) tracking that a web server can do. But, of course, you’ll lose the benefits of permanent cookies. For instance, if you have a Google Account, you’ll probably have to sign in again before you use a personalized Google service like Gmail.

If you’re concerned about privacy but also want the advantages of cookies, some browsers have a good compromise: treating some or all cookies as session cookies. That is, if a server asks to store a cookie until next year, your browser can store it as a session cookie instead.
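To make the compromise just described more concrete, here is a minimal sketch of turning a long-lived cookie into a session cookie by discarding its expiration information. It uses Python's standard `http.cookies` module purely for illustration; the function name and the sample cookie value are hypothetical, and real browsers do this internally in their own way.

```python
from http.cookies import SimpleCookie


def as_session_cookie(set_cookie_value: str) -> str:
    """Return the cookie text with its expiration information removed."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    morsel = next(iter(jar.values()))  # a Set-Cookie value carries one cookie
    # Attributes set to an empty string are left out of the output, so the
    # cookie no longer asks to be kept beyond the current browser session.
    morsel["expires"] = ""
    morsel["max-age"] = ""
    return morsel.OutputString()


# Hypothetical cookie that asks to be stored until the year 2038.
original = "PREF=abc; expires=Sun, 17 Jan 2038 19:14:07 GMT; path=/"
print(as_session_cookie(original))
```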

That’s enough, we hope, to give you an idea of what Google is doing “behind the scenes” in your web browser and on their servers. It’s far from everything there is to know, though! If you’d like to know more, please check the website’s privacy policy and some good references about Web security.

Cookie Examples

You can configure the Firefox browser to let you control each cookie and to see details about each of the stored cookies. Let’s use it to show a few examples of Google’s use of cookies.

Note: This section is for people who are interested in more technical details of setting cookies. If you aren’t, please skip ahead to the next chapter.

We’ll start by opening Firefox to a blank page and entering www.google.com as the URL. We’ve configured Firefox to ask before setting each cookie, and we’ve also just used its “Clear privacy data” command to erase all old cookies. As soon as we go to www.google.com, the server asks to set a cookie:

Firefox browser asking if www.google.com can set a cookie

Notice that the server wants the cookie to expire in the year 2038 and that the cookie’s name is PREF. (This may be where the server “remembers” our Google preferences.) We click the Allow for Session button, which tells Firefox to erase the cookie when we quit the browser. We could also have denied the cookie, though, to see what might happen next. It’s likely that Google will work fine with almost all cookies denied — except the cookie(s) that keep your Google Account settings.

Later, after doing some searches, we decide to sign in. Clicking the Sign in button brings up another Confirm setting cookie dialog. This time, the server wants to modify a cookie that it set earlier named GoogleAccountsLocale_session. The cookie will expire at the end of the browser session. In this case, we agree. (We could also have chosen “Use my choice for all cookies from this site” if we didn’t want to answer any more questions about www.google.com.)

Firefox browser asking if www.google.com can modify a cookie

After more searches, we open the Firefox Options dialog to look at the stored cookies. (That’s the little right-hand window in the next screen shot.) Google has set several cookies by now: five for www.google.com, one for groups.google.com, and at least one more for images.google.com. Clicking on one of the cookies shows that it’s the PREF cookie set two screen shots earlier. You generally won’t need to get to this level of detail, but it is possible to, say, remove the stored cookies from a server so that the server can’t “remember” you.

Firefox browser showing cookie settings and stored cookies


This page was last modified on: Tuesday May 1, 2007

Last Results Page

Though the statistics bar may estimate that more than 1000 results match your query, Google doesn’t serve more than 1000 results for any query. You can get to the 1000th or last result by setting your Preferences to display 100 results/page and clicking on the highest number or last “o” at the bottom of the results page.

Click on a number or an "o" to see another page of results.

Alternatively, you can specify a URL (web address) with the results that you want Google to display. Request results 900-999 for the query [ googleguide ] with the URL

If there aren’t 900 results, Google will display the last page of results. If the value of the variable start plus min(num, 100) (the smaller of the variable num and 100) comes to more than 1000, Google will display the following error message:

Sorry, Google does not serve more than 1000 results for any query.
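The rule about start and num can be sketched as a small check. The variable names start and num come from the text above; the `http://www.google.com/search` address, the `q` parameter, and the function name are assumptions made for this illustration.

```python
from urllib.parse import urlencode

MAX_SERVED = 1000  # the limit described above


def results_url(query: str, start: int, num: int = 10) -> str:
    """Build a URL requesting `num` results beginning at offset `start`."""
    per_page = min(num, 100)  # at most 100 results are shown per page
    if start + per_page > MAX_SERVED:
        # Mirrors the error message quoted above.
        raise ValueError("Sorry, Google does not serve more than 1000 results for any query.")
    params = {"q": query, "num": per_page, "start": start}
    return "http://www.google.com/search?" + urlencode(params)


print(results_url("googleguide", start=900, num=100))
```

Asking for start=950 with num=100 would raise the error instead, since 950 + 100 is more than 1000.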


This page was last modified on: Friday February 2, 2007

Ads

Some search engines sell their search results, in addition to showing ads. A sold result means that a link to the buyer’s page is put at or near the top of the results page, just as if the search engine thought it was one of the best results. Usually, there is no indication that the page’s result location was bought and paid for.

Google never sells its search results. If a web page appears in Google’s search results, it’s because Google thought it was a relevant result for your search, not because someone paid Google to put it there.

Google’s approach to ads is similar to its approach to search results: the ads must deliver useful links, or the ads are removed.

  • Ads must be relevant to your search.
  • Ads must not intrude, distract, or annoy (no pop-up or flashy ads).
  • Sponsored links are clearly identified and kept separate from search results.
  • At most, two sponsored links appear above Google’s search results.

You can distinguish ads by their format and the label “Sponsored Link.” Ads contain a title, a short description, and a web address (URL).

A screen shot showing how Google's ads are identified and kept separate from search results

Advertisers decide which queries their ads should match, and then Google decides on placement, i.e., which ads to show and in what order. Google determines placement by an auction; the auction not only considers what the advertiser will pay for the ad, but also its click-through rate, i.e., how often users click on the ad. If users often click on an ad, Google will likely place the ad higher up on the results page. If the click-through rate of an ad falls below a certain level, indicating an ad isn’t relevant to the query, Google removes the ad.
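As a toy illustration of how an auction might weigh both price and click-through rate, the sketch below ranks ads by bid multiplied by click-through rate and drops ads whose rate falls below a cutoff. Both the scoring formula and the cutoff value are assumptions chosen for the example; the text above says only that both factors are considered, not how Google combines them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ad:
    title: str
    bid: float        # what the advertiser will pay per click (hypothetical units)
    clicks: int
    impressions: int

    @property
    def ctr(self) -> float:
        """Click-through rate: how often users click on the ad when shown."""
        return self.clicks / self.impressions if self.impressions else 0.0


MIN_CTR = 0.005  # hypothetical cutoff below which an ad is removed


def place_ads(ads: List[Ad]) -> List[Ad]:
    """Drop low-CTR ads, then order the rest by bid x CTR (assumed scoring rule)."""
    eligible = [ad for ad in ads if ad.ctr >= MIN_CTR]
    return sorted(eligible, key=lambda ad: ad.bid * ad.ctr, reverse=True)


ads = [
    Ad("A", bid=1.00, clicks=10, impressions=1000),
    Ad("B", bid=0.50, clicks=50, impressions=1000),
    Ad("C", bid=5.00, clicks=1, impressions=1000),
]
print([ad.title for ad in place_ads(ads)])
```

In this made-up data, ad B outranks ad A despite the lower bid because users click on it five times as often, and ad C is removed because almost no one clicks on it.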

For the most part, you’ll find advertisements pertinent to your query. However, Google’s automatic matching to words on a page sometimes places an ad inappropriately. For example, in September of 2003, adjacent to a New York Post article about a gruesome murder in which the victim’s body parts were stashed in a suitcase, Google listed an ad for suitcases. Since that incident, Google has improved its filters and automatically pulls ads from pages with disturbing content. So Google is unlikely to make another faux pas on a par with this one.

Some web pages display ads provided by Google’s AdSense service. The hosting website and Google share the amount an advertiser pays when a user clicks on an ad, which varies between US$0.01 and US$50.00. Web publishers typically place Google AdSense ads near the top, on the right, or on the left side of a page to catch your attention. We’ve included such an ad at the top of this page.

Sample Google AdSense advertisement

For why Google sells advertising and not search results, visit www.google.com/honestresults.html.

For more information on Google’s advertising programs, visit www.google.com/ads/.

For what to do if you find a pop-up ad on Google, visit www.google.com/help/nopopupads.html.

Exercises

For hints and answers to selected problems, see the Solutions page.

  1. How many sponsored links (ads) appear on the first results page when you search for answers to the following questions?
    1. Where can you stay in central London at a moderate price?
    2. What’s going on with NASA’s Mars Exploration Program?
  2. Click on several interesting-sounding AdSense ads.
  3. If you have a website, sign up for an AdWords account so that you can purchase ads to bring users to your site.
  4. If you have a website, sign up for an AdSense account so that you can generate revenue from advertising on your site.


This page was last modified on: Tuesday March 13, 2007

Evaluating What You Find

Google’s web-page-ranking system, PageRank, tends to give priority to better respected and trusted information. Well-respected sites link to other well-respected sites. This linking boosts the PageRank of high-quality sites. Consequently, more accurate pages are typically listed before sites that include unreliable and erroneous material. (The various browser toolbars can show you the PageRank of the page you’re currently browsing.) Nevertheless, evaluate carefully whatever you find on the web since anyone can

  • Create pages
  • Exchange ideas
  • Copy, falsify, or omit information intentionally or accidentally

Many people publish pages to get you to buy something or accept a point of view. Google makes no effort to discover or eliminate unreliable and erroneous material. It’s up to you to cultivate the habit of healthy skepticism. When evaluating the credibility of a page, consider the following AAOCC (Authority, Accuracy, Objectivity, Currency, Coverage) criteria and questions, which are adapted from www.lib.berkeley.edu/ENGI/eval_criteria.html.

Authority
  • Who are the authors? Are they qualified? Are they credible?
  • With whom are they affiliated? Do their affiliations affect their credibility?
  • Who is the publisher? What is the publisher’s reputation?
Accuracy
  • Is the information accurate? Is it reliable and error-free?
  • Are the interpretations and implications reasonable?
  • Is there evidence to support conclusions? Is the evidence verifiable?
  • Do the authors properly list their sources, references or citations with dates, page numbers or web addresses, etc.?
Objectivity
  • What is the purpose? What do the authors want to accomplish?
  • Does this purpose affect the presentation?
  • Is there an implicit or explicit bias?
  • Is the information fact, opinion, spoof, or satirical?
Currency
  • Is the information current? Is it still valid?
  • When was the site last updated?
  • Is the site well-maintained? Are there any broken links?
Coverage
  • Is the information relevant to your topic and assignment?
  • What is the intended audience?
  • Is the material presented at an appropriate level?
  • Is the information complete? Is it unique?

Search for [ evaluate web pages ] or [ hints evaluate credibility web pages ] to find resources on how to evaluate the veracity of pages you view.

For a printable form with most of the questions that you will probably want to ask, visit www.lib.berkeley.edu/TeachingLib/Guides/Internet/EvalForm.pdf. If you’re unable to view PDF files, you can get a free PDF viewer from Adobe by visiting www.adobe.com/products/acrobat/readstep2.html. For more information on evaluating what you find, visit www.lib.berkeley.edu/TeachingLib/Guides/Internet/Evaluate.html.

Exercises

Find documents on the web that provide the answers to the following questions. What’s your level of comfort with the referring site(s) and why? For hints and answers to selected problems, see the Solutions page.

  1. Is it true that if you touch a cold halogen light bulb with clean fingers, you will shorten its lifespan?
  2. Are 75% of Americans chronically dehydrated? Find opposing points of view.
  3. Are you less likely to get dental cavities if you drink fluoridated water?
  4. Is clumping kitty litter a major health hazard to cats?
  5. What are the benefits and drawbacks of a flu (influenza) shot?
  6. Does microwaving food in plastic containers or plastic cling wrap release harmful chemicals into the food? Check whether this is an urban legend.

Want more experience assessing the authenticity and integrity of some websites? Try the exercises listed on www.lib.berkeley.edu/TeachingLib/Guides/Internet/EvaluateWhy.html.


This page was last modified on: Tuesday March 13, 2007



For Google tips, tricks, & how Google works, visit Google Guide at www.GoogleGuide.com.

Creative Commons

By Nancy Blachman and Jerry Peek who aren't Google employees. For permission to copy & create derivative works, visit Google Guide's Creative Commons License webpage.

Please send us suggestions for how we can improve Google Guide.