Google indexes 35 trillion web pages. (Compare the volume with LinkedIn’s. LinkedIn profiles are 0.001% of Google’s Index!)
However, mining Google is not straightforward because the web has different kinds of pages. We can search for terms in the page titles, URLs, or links to the page but usually not for values like job titles or companies. If you want a reminder, here is The Full List of Google Advanced Search Operators.
Compared to, say, ten years ago, Google’s search has changed dramatically, not only adding trillions of pages but also learning to recognize users’ intent when they search. That is one definition of Semantic Search. Because of Google’s clever interpretation of search strings, we need to be Masters of Boolean – and know when to let Google control results as well.
From our training experience, we have consistently seen two kinds of attendees. Some have an open mind, try to “get it”, and apply what they learn right away. We offer a month of support on all our offerings to encourage that – and we celebrate with students when they reach understanding and success. Others, especially self-identified “old school” recruiters, copy and keep search strings, use Boolean Builders (all of which have less than optimal templates), or write very long strings on Google. They do not try to understand why the results on the screen look the way they do.
However – everybody can learn, and many have! Boolean search is not Rocket Science. All you need is an open mind and a computer with a browser and wi-fi.
If you are Googling, do overcome these mistakes and unhelpful habits:
- There is no operator AND
- NOT must be written as a minus sign (-), with no space between the minus and the keyword
- Parentheses are ignored. OR always takes priority (unlike on Bing or LinkedIn)
- Operators like site: must be lower-case, no space between the operator and keyword
- Do not trust or compare the numbers of results
- Put your terms in the order you expect to see them
- Do not try to “catch everything” with one search string. Strings are not “built”, they are run and immediately modified
- Do not be a perfectionist. Searches will have false positives and miss something. Your goal is not strings but results
- Overusing the operator OR leads to shrinking results – search simple
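The syntax-related habits in the list above can be checked mechanically before a string is run. Below is a minimal sketch of such a check, not an official Google tool: the function name `lint_google_query`, the short operator list, and the `MAX_ORS` threshold are all assumptions made for illustration, and the rules simply restate the bullets above.

```python
import re

# Assumption: a small, illustrative subset of Google search operators.
KNOWN_OPERATORS = ("site", "intitle", "inurl", "filetype")
# Assumption: an arbitrary threshold for "too many ORs"; tune to taste.
MAX_ORS = 2

OPERATOR_PATTERN = re.compile(
    r"\b(" + "|".join(KNOWN_OPERATORS) + r"):(\s?)", re.IGNORECASE
)


def lint_google_query(query: str) -> list:
    """Return warnings for the habits the list above advises against."""
    warnings = []
    words = query.split()

    if "AND" in words:
        warnings.append("AND: Google has no AND operator; drop it")
    if "NOT" in words:
        warnings.append("NOT: write exclusions as -keyword")
    if "(" in query or ")" in query:
        warnings.append("parentheses: Google ignores them")

    # A minus followed by a space is detached from its keyword.
    if re.search(r"(?:^|\s)-\s", query):
        warnings.append("minus: no space between - and the keyword")

    for match in OPERATOR_PATTERN.finditer(query):
        name, gap = match.group(1), match.group(2)
        if name != name.lower():
            warnings.append(f"{name}: operators must be lower-case")
        if gap:
            warnings.append(f"{name}: no space after the colon")

    if words.count("OR") > MAX_ORS:
        warnings.append("OR: too many ORs; try a simpler search")
    return warnings


if __name__ == "__main__":
    sample = "java AND (python OR ruby) NOT manager SITE: linkedin.com"
    for warning in lint_google_query(sample):
        print(warning)
```

A checker like this only catches syntax. The remaining points on the list (term order, running and modifying searches, not chasing result counts) are judgment calls that still need a person looking at the results.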
I cannot stress the last point on the list (overusing OR) enough. It is not a good idea to use ORs on Google (I never do). I can give you numerous examples of how a simple search works better than long ORs. It is time to change this habit.
Please join me at our fully refreshed webinar “Boolean Basics & Beyond” coming to your laptop at home on April 7, 2020. We will mostly cover Google but will talk about other sites as well. Seating is limited (due to our need to support everyone who signs up).
Wow! No wonder my iterative searches don’t swing the pendulum the way I’m expecting, because my methodologies are outdated. I’m interested in joining the webinar.
Hi Ryan, sounds good! Please sign up at https://sourcingcertification.com/booleanandbeyond/.
Hi Irina. Thanks so much for your post. Can you give the source of the “35 trillion pages” indexed by Google?
Google has always been very stingy with information on the subject. I can only find a few references mentioning “estimates” and “studies”… but none that cite the original source. Back in 2014, JJ Rosen, founder of the company Atiba, stated this figure without giving sources (https://eu.tennessean.com/story/money/tech/2014/05/02/jj-rosen-popular-search-engines-skim-surface/8636081/). In 2016, Google explained that they estimated the size of the Web at 130 trillion pages, without ever claiming that they indexed all of them (https://www.seroundtable.com/google-130-trillion-pages-22985.html). Today, the How Search Works site (https://www.google.com/search/howsearchworks/) no longer displays any numbers. On the other hand, I do find this figure of 35 trillion in a post from Moz, which claims to work with 35 trillion URLs, not pages (https://moz.com/blog/link-count-metrics).
Google “number of pages indexed by Google” or something like that. (Read the article!)
Hi Irina. Rest assured, I read your article carefully… and I do not see any reference indicating where and when this figure of 35 trillion pages in Google’s index comes from. Just curious to know.
The article says to Google simple things. 🙂
On it! https://twitter.com/henkvaness/status/1247106321297670144?s=20
hi – does the same apply to searches on Google Scholar (especially with regards to OR and parentheses)? thank you!
There is no need for parentheses in any Google special search, Scholar included. Yes, I would say don’t use ORs there either.