The fact that Nancy has been teaching Internet novices is apparent.
She takes nothing for granted, and even includes tips on how to
navigate a Web page. More savvy users may skip those sections,
however, and focus on the practical examples and exercises.
--Pandia Search Engine News
|
|
|
Nancy Blachman's Google Guide is by far the best guide to using
Google, for beginners & more intermediate users, that I've seen so
far. I see great potential here for plopping patrons down with this
self-guided tutorial, instead of the 20 minute "This is Google, this
is how you search" lecture.
Google Guide offers help with searching Google, for both the novice
and the advanced user. ... The most useful search guides like this one
always seems to come from the fringes, not from the company itself.
|
|
|
While the Google search instruction page is helpful, it's a rather bare bones approach, and your guide fills in the gaps. ... I'm glad your guide is available now and will recommend it to anyone new to the internet. I wish it had been available 5 years ago when I was a newbie.
|
| Power Googling: Getting What You Want from Google |
In this presentation, you can learn
| Selecting Search Terms & Crafting Your Query |
The search terms you enter and the order in which you enter them affect
Use words likely to appear on the pages you want.
USE [ Australia Target store ]
NOT [ Does Australia have Target ]
Avoid using words that you might associate with your topic, but you wouldn't expect to find on the designated page(s).
USE [ lasik eye surgery ]
NOT [ documentation on lasik eye surgery ]
USE [ jobs product marketing Sunnyvale ]
NOT [ listings of product marketing jobs in Sunnyvale ]
Be specific: Use more query terms to narrow your results.
Does your query have enough specific information for Google to determine unambiguously what you're seeking?
USE [ Java Indonesia ], [ java coffee ], or [ java programming language ]
NOT [ java ]
How can you come up with more specific search terms?
Consider answers to the questions, who?, what?, where?, when?, why?, and how?
Add a term that distinguishes among them.
USE [ Tom Watson MP ], [ Tom Watson golf ], or [ Tom Watson IBM ]
NOT [ Tom Watson ]
USE [ quit smoking program ]
NOT [ program on quitting tobacco cigarette smoking addiction ]
You don't have to correct your spelling.
When you enter: [ Anna Kornikova tennis ]
Google responds: Did you mean: Anna Kournikova tennis
To search for a phrase, a proper name, or a set of words
in a specific order, put them in double quotes.
A query with terms in quotes (" ") finds pages containing the
exact quoted phrase.
Find pages mentioning Google's co-founder Larry Page.
[ "Larry Page" ]
Find pages containing
[ Larry Page ]
A quoted phrase is the most widely used type of special search syntax.
[ "what you're looking for is already inside you" Anne Lamott speech ]
Precede each term you do not want to appear in any result with a "-" sign.
Do not put a space between the - and the word.
USE [ dolphins -football ]
NOT [ dolphins - football ]
Search for a twins support group in Minnesota, but exclude pages with the word "baseball".
[ twins support group Minnesota -baseball ]
You can exclude more than one term.
Find pages on "salsa" but not the dance nor dance classes.
Find synonyms by preceding the term with a ~, which is known as the tilde or synonym operator.
The tilde (~) operator takes the word immediately following it and searches both for that specific word and for the word's synonyms. It also searches for the term with alternative endings.
Put the ~ (tilde) next to the word, with no spaces between the ~ and its associated word.
USE [ ~lightweight laptop ]
NOT [ ~ lightweight laptop ]
Why does Google use tilde? In math, the "~" symbol means "is similar to".
The tilde tells Google to search for pages that are synonyms or similar to the term that follows.
[ ~inexpensive ] matches "inexpensive," "cheap," "affordable," and "low cost"
[ ~run ] matches "run," "runner's," "running," as well as "marathon"
Looking for a guide, help, tutorial, or tips on using Google?
[ google ~guide ]
Interested in food facts as well as nutrition and cooking information?
[ ~food ~facts ]
The synonym operator tends not to work well on infrequently-used query terms.
[ ~cockroach ]
If you don't like the synonyms that Google suggests when you use the ~ operator, specify your own synonyms with the OR operator, which I describe next.
Specify synonyms or alternative forms with an uppercase OR or | (vertical bar).
[ Tahiti OR Hawaii ]
[ Tahiti | Hawaii ]
[ blouse OR shirt OR chemise ]
[ blouse | shirt | chemise ]
Note: If you write OR with a lowercase "o" or a lowercase "r," Google interprets the word as a search term instead of an operator.
Note: Unlike OR, a | (vertical bar) need not be surrounded by spaces.
Use quotes (" ") to group compound words and phrases together.
[ filter OR stop "junk email" OR spam ]
[ "New Zealand
[ recumbent bicycle $250..$1000 ]
Find the year the Russian Revolution took place.
[ Russian Revolution 1800..2000 ]
This table summarizes how to use the basic search operators, described on this page. You may include any of these operators multiple times in a query.
| Notation | Find result | Example |
| --- | --- | --- |
| term1 term2 | with both term1 and term2 | [ carry-on luggage ] |
| term1 OR term2, term1 \| term2 | with either term1 or term2 or both | [ Tahiti OR Hawaii ], [ Tahiti \| Hawaii ] |
| "phrase" | with the exact phrase, a proper name, or a set of words in a specific order | [ "I have a dream" ], [ "Rio de Janeiro" ] |
| "term" | with term (The quotation marks operator, " ", is used around stop words that Google would otherwise ignore or when you want Google to return only pages that match your search terms exactly.) | [ "i" spy ] |
| -term | without term | [ twins minnesota -baseball ] |
| ~term | with term or one of its synonyms (currently supported on Web and Directory search) | [ google ~guide ] |
| number1..number2 | with a number in the specified range | [ annual report 2000..2003 ] |
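If you generate queries from a script, the basic operators in the table above can be assembled mechanically. The Python sketch below is only an illustration: the function name `build_query` and its parameters are hypothetical and are not part of Google Guide or of any Google API. It produces nothing more than the text you would type into the search box.

```python
def build_query(terms=(), phrases=(), any_of=(), exclude=(),
                synonyms=(), number_range=None):
    """Assemble a query string from the basic operators (hypothetical helper)."""
    parts = list(terms)                            # plain terms: implicit AND
    parts += ['"%s"' % p for p in phrases]         # exact phrase in double quotes
    if any_of:
        parts.append(" OR ".join(any_of))          # OR must be uppercase
    parts += ["~" + s for s in synonyms]           # no space after the tilde
    parts += ["-" + e for e in exclude]            # no space after the minus
    if number_range is not None:
        low, high = number_range
        parts.append("%s..%s" % (low, high))       # number range
    return " ".join(parts)


print(build_query(terms=["twins", "support", "group", "Minnesota"],
                  exclude=["baseball"]))
print(build_query(terms=["google"], synonyms=["guide"]))
print(build_query(terms=["annual", "report"], number_range=(2000, 2003)))
```

The three calls rebuild queries that already appear in this section, which makes it easy to compare the output with the examples by eye.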
| How Google Interprets Your Query |
Understanding how Google treats your search terms will help you
Google returns only pages that match all your search terms.
Because you don't need to include the word AND between your terms, this notation is called an implicit AND.
Because of implicit AND, you can focus your query by adding more terms.
[ compact lightweight fold-up bicycle ]
Note: If you want pages containing any (instead of all) of your search terms, use the OR operator.
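To see why adding terms narrows your results, it can help to model implicit AND on a handful of made-up pages. The following Python sketch is a toy model only; the page texts and the helper names `matches_all` and `matches_any` are invented for illustration, and real matching is far more elaborate.

```python
def matches_all(page_text, terms):
    """Implicit AND: every term must appear on the page."""
    words = set(page_text.lower().split())
    return all(term.lower() in words for term in terms)


def matches_any(page_text, terms):
    """OR: at least one term must appear on the page."""
    words = set(page_text.lower().split())
    return any(term.lower() in words for term in terms)


# Hypothetical pages, keyed by a short name.
pages = {
    "a": "compact lightweight fold-up bicycle for commuters",
    "b": "lightweight bicycle helmet",
    "c": "compact camera review",
}
query = ["compact", "lightweight", "bicycle"]

and_hits = sorted(name for name, text in pages.items() if matches_all(text, query))
or_hits = sorted(name for name, text in pages.items() if matches_any(text, query))
print(and_hits)  # ['a']
print(or_hits)   # ['a', 'b', 'c']
```

With implicit AND only page "a" survives, while OR keeps every page that mentions at least one of the terms.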
Google returns pages that match your search terms exactly.
"Google simply matches strings of characters together and doesn't currently base inferences on uses of the language." --Internet Research, Second Edition by Ned Fielden (McFarland & Company, 2001)

| If you search for ... | Google won't find ... |
| --- | --- |
| cheap | inexpensive |
| tv | television |
| children | kids |
| Calif OR CA | California |
Google returns pages that match variants of your search terms.
The query [ child bicycle helmet ] finds pages that contain words that are similar to some or all of your search terms, e.g., "child," "children," "children's," "bicycle," "bicycles," "bicycle's," "bicycling," "bicyclists," "helmet," "helmets."
Google calls this feature word variations or automatic stemming.
Google ignores some common words called "stop words," e.g., the, on, where, how, de, la, as well as certain single digits and single letters.
[ lyrics to the Dixie Chicks' songs ]
Google used to indicate when it ignored words, but does so less frequently now.
If your query consists only of common words that Google normally ignores, Google will search for pages that match all the terms.
[ the who ]
Note: Google will search for stop words included in quotes, which it would otherwise ignore.
USE [ "to be or not to be" ]
The same query without quotes returns quite different results.
NOT [ to be or not to be ]
Note: Use the + operator in front of stop words that Google would otherwise ignore or when you want Google to return only those pages that match a search term exactly.
USE [ Star Wars +I ]
NOT [ Star Wars I ]
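The stop-word rules above can be summarized in a few lines of code. The Python sketch below is a simplified model, not Google's implementation: the stop-word list contains only words named in this section (plus "who," implied by the [ the who ] example), every single letter or digit is treated as a stop word, and the function name `effective_terms` is hypothetical.

```python
import re

# Illustrative list only; Google's real stop-word list is not public.
STOP_WORDS = {"the", "on", "where", "how", "de", "la", "in", "who"}


def is_stop_word(word):
    # Simplification: treat every single letter or digit as a stop word.
    return word in STOP_WORDS or len(word) == 1


def effective_terms(query):
    """Return the terms that this toy model would actually search for."""
    tokens = re.findall(r'"[^"]*"|\S+', query.lower())
    kept = []
    for token in tokens:
        if token.startswith('"'):
            kept.append(token)           # quoted text is searched as written
        elif token.startswith("+"):
            kept.append(token[1:])       # + forces a stop word to be included
        elif not is_stop_word(token):
            kept.append(token)
    # A query made up only of stop words is searched in full.
    return kept if kept else tokens


print(effective_terms("Star Wars I"))           # ['star', 'wars']
print(effective_terms("Star Wars +I"))          # ['star', 'wars', 'i']
print(effective_terms("the who"))               # ['the', 'who']
print(effective_terms('"to be or not to be"'))  # ['"to be or not to be"']
```

The first two calls show why [ Star Wars +I ] and [ Star Wars I ] behave differently: without the + sign, the "I" never reaches the search at all.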
Google limits queries to 32 words.
The following query finds sites that have included Google Guide's description of how Google works.
Google favors results that have your search terms near each other.
Google considers the proximity of your search terms within a page.
[ snake grass ]
[ snake in the grass ]
Although Google ignores "in" and "the," Google gives higher priority to pages in which "snake" and "grass" are separated by two words.
Google gives higher priority to pages that have the terms in the same order as in your query.
[ New York library ]
[ new library of York ]
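Proximity and order are easier to picture with a small measurement. The Python sketch below counts how many words separate two terms when they appear in the order given; it is a toy measure I made up for illustration (the name `min_gap` and the sample pages are hypothetical), and Google does not publish how it actually scores proximity.

```python
def min_gap(page_text, first, second):
    """Fewest words between `first` and a later `second`; None if never in that order."""
    words = page_text.lower().split()
    best = None
    for i, word in enumerate(words):
        if word != first:
            continue
        for j in range(i + 1, len(words)):
            if words[j] == second:
                gap = j - i - 1
                if best is None or gap < best:
                    best = gap
                break
    return best


pages = [
    "a snake in the grass",
    "the grass was tall and a snake hid nearby",
    "snake grass is a plant",
]
for text in pages:
    print(min_gap(text, "snake", "grass"), "-", text)
```

A gap of 2 corresponds to "snake in the grass," a gap of 0 to the adjacent words "snake grass," and `None` to a page where the terms never appear in the query's order.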
Google ignores some punctuation and special characters, including ! ? , . ; [ ] @ / # < > .
[ Dr. Ruth ] returns the same results as [ Dr Ruth ]
Note: There are exceptions, e.g., C++ and $99. Math symbols, such as /, <, and >, are not ignored by Google's calculator.
If you're seeking information that includes punctuation that Google ignores, just enter the whole thing including the punctuation.
[ info@amazon.com ]
[ part-time ] matches "part-time," "part time," and "parttime"
[ part time ] matches "part-time" and "part time"
[ e-mail ] matches "e-mail," "email," and "e mail"
[ email ] matches "email"
If you aren't sure whether a word is hyphenated, go ahead and search for it with a hyphen.
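The hyphen examples above follow a simple pattern: a hyphenated term also matches the spaced and the joined forms, while an unhyphenated term matches only itself. The Python sketch below captures just that pattern; `hyphen_variants` is a hypothetical name, and the function describes the examples on this page rather than Google's actual matching code.

```python
def hyphen_variants(term):
    """List the forms a term would match under the pattern shown above."""
    if "-" not in term:
        return [term]                     # e.g. [ email ] matches only "email"
    return [
        term,                             # "part-time"
        term.replace("-", " "),           # "part time"
        term.replace("-", ""),            # "parttime"
    ]


print(hyphen_variants("part-time"))  # ['part-time', 'part time', 'parttime']
print(hyphen_variants("e-mail"))     # ['e-mail', 'e mail', 'email']
print(hyphen_variants("email"))      # ['email']
```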
The following table summarizes how Google interprets your query.
| Search Behaviors | Descriptions |
| --- | --- |
| Implicit AND | Google returns pages that match all your search terms. Because you don't need to include the logical operator AND between your terms, this notation is called an implicit AND. |
| Exact Matching | Google returns pages that match your search terms exactly. |
| Word Variation, Automatic Stemming | Google returns pages that match variants of your search terms. |
| Common-Word Exclusion | Google ignores some common words called "stop words," e.g., the, on, where, and how. Stop words tend to slow down searches without improving results. |
| 32-Word Limit | Google limits queries to 32 words. |
| Term Proximity | Google gives more priority to pages that have search terms near to each other. |
| Term Order | Google gives more priority to pages that have search terms in the same order as the query. |
| Case Insensitivity | Google is case-insensitive; it assumes all search terms are lowercase. |
| Ignoring Punctuation | Google ignores most punctuation and special characters including , . ; ? [ ] ( ) @ / * < > |
| What Appears on the Results Page |
The results page is filled with information and links, most of which relate to your query.
Google Logo: Click on the Google logo to go to Google's home page.
Statistics Bar: Describes your search, includes the number of results on the current results page and an estimate of the total number of results, as well as the time your search took. For the sake of efficiency, Google estimates the number of results; it would take considerably longer to compute the exact number. This estimate is unreliable.
Every underlined term in the statistics bar is linked to its dictionary definition. Queries that are linked to just one definition are followed by a definition link.
Tips: Sometimes Google displays a tip in a box just below the statistics bar.
Search Results: Ordered by relevance to your query, with the result that Google considers the most relevant listed first. Consequently you are likely to find what you're seeking quickly by looking at the results in the order in which they appear. Google assesses relevance by considering over a hundred factors, including how many other pages link to the page, the positions of the search terms within the page, and the proximity of the search terms to one another.
Below are descriptions of some search-result components. These components appear in fonts of different colors on the result page to make it easier to distinguish them from one another.
Snippets: (black) Each search result
usually includes one or more short excerpts
of the text that matches your query with your search terms in
boldface type. Each distinct excerpt or snippet is separated
by an ellipsis (...). These snippets, which appear in a black
font, may provide you with
When Google hasn't crawled a page, it doesn't include a snippet. A page might not be crawled because its publisher requested no crawling, or because the page was written in such a way that it was too difficult to crawl.
URL of Result: (green) Web address of the search result. In the screen shot, the URL of the first result is inventors.about.com/library/weekly/aa042597.htm.
Size: (green) The size of the text portion of the web page. It is omitted for sites not yet indexed. In the screen shot, "5k" means that the text portion of the web page is 5 kilobytes. One kilobyte is 1,024 (2^10) bytes. One byte typically holds one character. In general, the average size of a word is six characters. So each 1k of text is about 170 words. A page containing 5K characters thus is about 850 words long.
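The size arithmetic above is easy to reproduce. The Python sketch below uses the same rough assumptions stated in the text (1,024 bytes per kilobyte, one character per byte, six characters per word); the constant and function names are my own, and the result is only an estimate.

```python
BYTES_PER_KILOBYTE = 1024      # 2**10
CHARS_PER_WORD = 6             # rough average assumed in the text

# Whole words that fit in one kilobyte of text.
WORDS_PER_KILOBYTE = BYTES_PER_KILOBYTE // CHARS_PER_WORD


def estimate_words(size_kb):
    """Estimate the word count of a page from the size Google reports."""
    return size_kb * WORDS_PER_KILOBYTE


print(WORDS_PER_KILOBYTE)   # 170
print(estimate_words(5))    # 850
```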
Large web pages are far less likely to be relevant to your query than smaller pages. For the sake of efficiency, Google searches only the first 101 kilobytes (approximately 17,000 words) of a web page and the first 120 kilobytes of a pdf file. Assuming 15 words per line and 50 lines per page, Google searches the first 22 pages of a web page and the first 26 pages of a pdf file. If a page is larger, Google will list the page as being 101 kilobytes or 120 kilobytes for a pdf file. This means that Google's results won't reference any part of a web page beyond its first 101 kilobytes or any part of a pdf file beyond the first 120 kilobytes.
Date: (green) Sometimes the date Google crawled a page appears just after the size of the page. The date tells you the freshness of Google's copy of the page. Dates are included for pages that have recently had a fresh crawl.
Limiting the number of results from a given site to two ensures that pages from one site will not dominate your search results and that Google provides pages from a variety of sites.
More Results: When there are more than two results from the same site, access the remaining results from the "More results from..." link.
When Google returns more than one page of results, you can view subsequent pages by clicking either a page number or one of the "o"s in the whimsical "Gooooogle" that appears below the last search result on the page.
If you find yourself scrolling through pages of results, consider increasing the number of results Google displays on each results page by changing your global preferences (see the section Changing Your Global Preferences).
In practice, however, if pages of interest to you aren't within the first 10 results, consider refining your query instead of sifting through pages of irrelevant results. To simplify such refinements, Google includes a search box at the bottom of the page you can use to enter your refined query.
Sponsored Links: Your results may include some clearly identified sponsored links (advertisements) relevant to your search. If any of your search terms appear in the ads, Google displays them in boldface type.
Spelling Corrections, Dictionary Definition, Cached, Similar Pages, News, Product Information, Translation, Book results: Your results may include these links, which are described on the next few pages.
Here's another screen shot of the results page in case the one at the top of this page scrolled off your screen.
For more on what's included on Google's results page, visit www.google.com/help/interpret.html.
| How Google Works |
If you aren't interested in learning how Google creates the index and the database of documents that it accesses when processing a query, skip this description. I adapted the following overview from Chris Sherman and Gary Price's wonderful description of How Search Engines Work in Chapter 2 of The Invisible Web (CyberAge Books, 2001).
Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:
Let's take a closer look at each part.
Googlebot, Google's Web Crawler
Googlebot is Google's web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It's easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn't traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google's indexer.
Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it's capable of doing.
Googlebot finds pages in two ways: through an add URL form, www.google.com/addurl.html, and through finding links by crawling the web.
Unfortunately, spammers figured out how to create automated bots that bombarded the add URL form with millions of URLs pointing to commercial propaganda. Google rejects those URLs submitted through its Add URL form that it suspects are trying to deceive users by employing tactics such as including hidden text or links on a page, stuffing a page with irrelevant words, cloaking (aka bait and switch), using sneaky redirects, creating doorways, domains, or sub-domains with substantially similar content, sending automated queries to Google, and linking to bad neighbors. So now the Add URL form also has a test: it displays some squiggly letters designed to fool automated "letter-guessers"; it asks you to enter the letters you see — something like an eye-chart test to stop spambots.
When Googlebot fetches a page, it culls all the links appearing on the page and adds them to a queue for subsequent crawling. Googlebot tends to encounter little spam because most web authors link only to what they believe are high-quality pages. By harvesting links from every page it encounters, Googlebot can quickly build a list of links that can cover broad reaches of the web. This technique, known as deep crawling, also allows Googlebot to probe deep within individual sites. Because of their massive scale, deep crawls can reach almost every page in the web. Because the web is vast, this can take some time, so some pages may be crawled only once a month.
Although its function is simple, Googlebot must be programmed to handle several challenges. First, since Googlebot sends out simultaneous requests for thousands of pages, the queue of "visit soon" URLs must be constantly examined and compared with URLs already in Google's index. Duplicates in the queue must be eliminated to prevent Googlebot from fetching the same page again. Second, Googlebot must determine how often to revisit a page. On the one hand, it's a waste of resources to re-index an unchanged page. On the other hand, Google wants to re-index changed pages to deliver up-to-date results.
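The link-harvesting queue and the duplicate check described above can be sketched in a few lines. The Python example below is a minimal model under stated assumptions, not Googlebot's code: it crawls an in-memory "toy web" instead of making network requests, handles one page at a time, and uses hypothetical names (`crawl`, `fetch_links`, `toy_web`).

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages=100):
    """Breadth-first crawl that never queues the same URL twice."""
    seeds = list(dict.fromkeys(seed_urls))   # drop duplicate seeds, keep order
    queue = deque(seeds)                     # the "visit soon" queue
    seen = set(seeds)                        # URLs already queued or fetched
    fetched = []
    while queue and len(fetched) < max_pages:
        url = queue.popleft()
        fetched.append(url)                  # stand-in for downloading the page
        for link in fetch_links(url):        # cull the links found on the page
            if link not in seen:             # eliminate duplicates
                seen.add(link)
                queue.append(link)
    return fetched


# Hypothetical link structure: each URL maps to the links on that page.
toy_web = {
    "a.example/": ["a.example/1", "b.example/"],
    "a.example/1": ["a.example/", "b.example/"],
    "b.example/": ["a.example/1"],
}
print(crawl(["a.example/"], lambda url: toy_web.get(url, [])))
# ['a.example/', 'a.example/1', 'b.example/']
```

Even though the three pages link to each other in a loop, each one is fetched exactly once because of the `seen` set.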
To keep the index current, Google continuously recrawls popular, frequently changing web pages at a rate roughly proportional to how often the pages change. Such crawls keep an index current and are known as fresh crawls. Newspaper pages are downloaded daily; pages with stock quotes are downloaded much more frequently. Of course, fresh crawls return fewer pages than the deep crawl. The combination of the two types of crawls allows Google to both make efficient use of its resources and keep its index reasonably current.
Google's Indexer
Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google's index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.
To improve search performance, Google ignores (doesn't index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, as well as converting all letters to lowercase, to improve Google's performance.
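The index described above is what programmers call an inverted index. The Python sketch below builds a tiny one, assuming a short illustrative stop-word list taken from the text and treating every single character as a stop word; the function name `build_index` and the sample documents are hypothetical, and Google's real index is of course far more sophisticated.

```python
import re
from collections import defaultdict

# Illustrative list based on the words named in the text.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def build_index(documents):
    """Map each term to {document id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        # Lowercase the text and drop punctuation and extra spaces.
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            if word in STOP_WORDS or len(word) == 1:
                continue                     # stop words are not indexed
            index[word][doc_id].append(position)
    return index


docs = {
    "d1": "The history of the bicycle",
    "d2": "Bicycle helmets: how to choose a helmet",
}
index = build_index(docs)
print(sorted(index))              # terms in alphabetical order
print(dict(index["bicycle"]))     # {'d1': [4], 'd2': [0]}
```

Positions are counted before stop words are dropped, so the stored locations still reflect where each term sits in the original text.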
Google's Query Processor
The query processor has several parts, including the user interface (search box), the "engine" that evaluates queries and matches them to relevant documents, and the results formatter.
PageRank is Google's system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.
Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page. Visit SEOmoz.org's report for an interpretation of the concepts and the practical applications contained in Google's patent application.
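To give a feel for the link-popularity part of ranking, here is a sketch of the simplified PageRank calculation from the published academic description, computed by repeated iteration. It is an assumption-laden illustration: the damping value 0.85, the function name `pagerank`, and the three-page link graph are not from this guide, and the formulas Google actually uses combine many other factors and are not public.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to;
    every linked page must also appear as a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page site: both inner pages link back to "home".
toy_links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Because two pages link to "home" and only one links to each of the others, "home" ends up with the highest score.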
Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling-correcting system uses such techniques to figure out likely alternative spellings. Google closely guards the formulas it uses to calculate relevance; they're tweaked to improve quality and performance, and to outwit the latest devious techniques used by spammers.
Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page, options offered by the Advanced-Search page and search operators.
Let's see how Google processes a query.
1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book--it tells which pages contain the words that match any particular query term.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
Copyright © 2003 Google Inc. Used with permission.
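The three steps of query processing can be mimicked on a very small scale. In the Python sketch below, a dictionary plays the role of the index servers and another plays the role of the doc servers; the names `search`, `index`, and `documents`, the sample texts, and the snippet width are all hypothetical, and the code assumes the index was built from the same documents.

```python
def search(query, index, documents, snippet_width=3):
    """Toy query processor: index lookup, document fetch, snippet, results."""
    terms = query.lower().split()
    if not terms:
        return []
    # Step 1: the "index servers" tell which documents contain every term.
    doc_ids = set(index.get(terms[0], ()))
    for term in terms[1:]:
        doc_ids &= set(index.get(term, ()))
    # Step 2: the "doc servers" retrieve the text and generate a snippet
    # around the first query term.
    results = []
    for doc_id in sorted(doc_ids):
        words = documents[doc_id].split()
        lowered = [word.lower() for word in words]
        at = lowered.index(terms[0])
        start = max(0, at - snippet_width)
        snippet = " ".join(words[start:at + snippet_width + 1])
        results.append((doc_id, snippet))
    # Step 3: the results are returned to the user.
    return results


documents = {
    "d1": "Googlebot fetches pages and hands them to the indexer",
    "d2": "The indexer stores pages in the index database",
}
index = {}
for doc_id, text in documents.items():
    for word in text.lower().split():
        index.setdefault(word, set()).add(doc_id)

print(search("indexer pages", index, documents))
# [('d1', 'them to the indexer'), ('d2', 'The indexer stores pages in')]
```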
For more information on how Google works, take a look at the following articles.
| Advanced Searching |
When you don't find what you're seeking, consider specifying more precisely what you want by using Google's Advanced Search features.
Don't be frightened by the name "Advanced Search"; it's easy to use, and it allows you to select or exclude pages with more precision than Google's standard search box.
Click on the Advanced Search link at the right of Google's search box.
or visit www.google.com/advanced_search and fill in the form.
Filling in the top portion of the Advanced Search form is an easy way to write restricted queries without having to use the " ", +, -, and OR notation discussed in the section Crafting Your Query.
| Advanced Search: Find results | Basic Search: Example | Basic Search: Find results |
| --- | --- | --- |
| with all of the words | [ tap dance ] | with all search terms |
| with the exact phrase | [ "tap dance" ] | with terms in quotes in the specified order only |
| without the words | [ tap -dance ], [ -tap dance ] | including none of the terms preceded by a - |
| with at least one of the words | [ tap OR ballet ] | with at least one of the terms adjacent to OR |
By using special characters and operators, you can
Advanced Operators are query words that have special meaning to Google.
Note: The colon (:) after the operator name is required.
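If you want to create links to searches that use advanced operators, you can build the query text and encode it into a URL. The Python sketch below is an illustration under assumptions: the helper name `search_url` is hypothetical, and the address format `https://www.google.com/search?q=...` is the commonly seen form of a Google results URL rather than something documented in this guide.

```python
from urllib.parse import urlencode


def search_url(terms, **operators):
    """Build a search URL from plain terms plus operator:value pairs."""
    parts = list(terms)
    for name, value in operators.items():
        # The colon is required, with no space between operator and value.
        parts.append("%s:%s" % (name, value))
    return "https://www.google.com/search?" + urlencode({"q": " ".join(parts)})


print(search_url(["google", "guide"], site="googleguide.com", filetype="pdf"))
# https://www.google.com/search?q=google+guide+site%3Agoogleguide.com+filetype%3Apdf
```

In the encoded URL, spaces become + signs and each colon becomes %3A; the browser and the search engine decode them back to the query you intended.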
The following table lists features available on the Advanced Search page that are accessible via search operators.
| Advanced Search Features | Search Operators |
| --- | --- |
| File Format | filetype: |
| Occurrences in the title of the page | allintitle: |
| Occurrences in the text of the page | allintext: |
| Occurrences in the URL of the page | allinurl: |
| Occurrences in the links to the page | allinanchor: |
| Domain | site: |
| Similar | related: |
| Links | link: |
The following table lists the search operators that work with each Google search service.

| Search Service | Search Operators |
| --- | --- |
| Web Search | allinanchor:, allintext:, allintitle:, allinurl:, cache:, define:, filetype:, id:, inanchor:, info:, intext:, intitle:, inurl:, inlink:, phonebook:, related:, rphonebook:, safesearch:, site:, stocks: |
| Image Search | allintitle:, allinurl:, filetype:, inurl:, intitle:, site: |
| Groups | allintext:, allintitle:, author:, group:, insubject:, intext:, intitle: |
| Directory | allintext:, allintitle:, allinurl:, ext:, filetype:, intext:, intitle:, inurl: |
| News | allintext:, allintitle:, allinurl:, intext:, intitle:, inurl:, location:, source: |
| Froogle | allintext:, allintitle:, store: |
For tips, tricks, and examples of many special features and advanced operators, visit the Google Guide Example Reference (Cheat Sheet), www.googleguide.com/example_ref.html.
| Conclusion and Links to Cheatsheets |
Google strives to make it easy to quickly find whatever you're seeking, whether it's a web page, a recent news story, a photograph, advice, or a present for a friend. The following cheatsheets provide handy summaries of some of Google's features and services.
I sincerely hope that Google Guide's Power Googling has helped you become (more) proficient in using Google. I have tried to anticipate your questions and problems. Please let me know if I have missed something or if you have corrections or suggestions for improving Google Guide, by emailing feedback(at)googleguide.com (replace "(at)" by "@"). I welcome all comments. I look forward to hearing from you. --Nancy
This page was last modified on Tuesday February 28, 2006.
For Google tips, tricks, & how Google works, visit
Google Guide at www.GoogleGuide.com. By Nancy Blachman and Jerry Peek who aren't Google employees. For permission to copy & create derivative works, visit Google Guide's Creative Commons License webpage. |