What Google Can’t Find in Australia and NZ

(This post is about what Google can’t find, with Australia- and NZ-based examples. It should be useful for everyone, no matter where they live.)

At the recent Sourcing Summits NZ and Australia I asked the audience to name some cases where web pages would not be found by Google. Examples provided included: “some people are not online” (at all!) and “if I am logged in to my bank, I hope Google doesn’t find this info”.

Further, we agreed that if anyone is logged in to a member site (whether paid or not), the pages that individual sees contain “inside” data, often personalized to the member, and therefore, in most cases, cannot be found by Google.

Another example brought up was “a page with search results”, whether from a search engine or from any site that provides search. Those results are pulled onto one page just for us to look at once. Pages with search results are usually short-lived; Google will not find them.

There are a few more general categories of web pages that Google will not find, not covered by the above. One case to remember is that webmasters can tell Google to stay away from some pages. Websites can prevent Google from crawling portions of the site by providing directives in a file named robots.txt. Google will respect the rules.
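To make the robots.txt idea concrete, here is a minimal Python sketch that checks whether a crawler would be allowed to fetch a page. The robots.txt content and the example.com URLs are made up for illustration; a real site's rules will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /search
"""

def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for url in ("https://example.com/about",
                "https://example.com/members/profile",
                "https://example.com/search?q=recruiter"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

If a page you expected to X-Ray sits under a disallowed path, that is a hint to try the site's internal search instead.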

It’s not true that Google can find “most of it” either (as someone said they’d hope). Here’s what the big picture looks like:

[Image: The Deep Web]

(By the way, saying “Deep Web search on Google”, an expression I heard from an American colleague who also claimed good skills in it, is not right. The Deep Web is, by definition, what Google can’t find.)

Of course, Google finds LOTS and is worth using extensively for research.

It’s important, however, to set the right expectations about what can be found and what can’t. Here’s a specific case that may not be clear upfront. Even if you do not have to log in to a site, some of its pages may not be found by Google: pages created dynamically in response to a search or, more generally, pages that pull info from a database just to display it.

EXAMPLE

As an example, let’s consider a search for corporate members on the site of the recruitment association RCSA:

[Screenshot: RCSA corporate member search]

Sure enough, the page depicted above is “constructed dynamically” – and will not be found by Google.

Let’s also look at one of the results:

[Screenshot: one of the RCSA member listing pages]

This page, while being “just” a listing page for a member, cannot be found by Google either! None of the four member listing pages found in the search above can be found by Google. If you are in doubt, you can try searching for them: site:membershipcentre.rcsa.com.au “absoluteit.co.nz”

Generally, if you see a question mark ? in a page’s URL, as for the page above, the chances of Google finding the page are split. It may, it may not; it depends. (We’ll go into investigating a bit deeper in a future post.)
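If you are going through a long list of URLs, spotting the question mark can be automated. The Python sketch below flags URLs that carry a query string and shows their parameters. The URLs are hypothetical, and a query string is only a hint that a page may be dynamic, not proof that Google has skipped it.

```python
from urllib.parse import urlsplit, parse_qs

def has_query_string(url: str) -> bool:
    """True if the URL has anything after a '?' (a query string)."""
    return bool(urlsplit(url).query)

def query_parameters(url: str) -> dict:
    """Return the query parameters as a dict of name -> list of values."""
    return parse_qs(urlsplit(url).query)

if __name__ == "__main__":
    # Hypothetical URLs, for illustration only
    for url in ("https://example.com/members/list",
                "https://example.com/members/listing?id=42&view=full"):
        print(url, has_query_string(url), query_parameters(url))
```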

The moral of the story is: if it’s not straightforward whether a site can be “fully” X-Rayed for content, it is always a good idea to try searching both by X-Raying and “internally”. You may find more results by combining both ways. In the RCSA example above, about 98% of the listings that internal search returns will not be found by X-Raying.
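Combining the two result sets is simple bookkeeping. Here is a small Python sketch, assuming you have saved the result URLs from each method into two lists (the URLs below are invented). It merges the lists without duplicates and shows which pages only the internal search surfaced.

```python
def normalize(url: str) -> str:
    """Simplified normalization: trim whitespace and a trailing slash."""
    return url.strip().rstrip("/")

def combine_results(xray_urls, internal_urls):
    """Merge both lists, dropping duplicates and keeping first-seen order."""
    seen = set()
    combined = []
    for url in list(xray_urls) + list(internal_urls):
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            combined.append(key)
    return combined

def internal_only(xray_urls, internal_urls):
    """Pages that the internal search found and the X-Ray did not."""
    xray_keys = {normalize(url) for url in xray_urls}
    return [normalize(url) for url in internal_urls
            if normalize(url) not in xray_keys]

if __name__ == "__main__":
    xray = ["https://example.com/a/", "https://example.com/b"]
    internal = ["https://example.com/b", "https://example.com/c"]
    print(combine_results(xray, internal))
    print(internal_only(xray, internal))
```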

For an in-depth Sourcing Methodologies study and specific Australia- and NZ-based examples to illustrate and work with, please check out my upcoming Webinar (Aug 26, 2014).

Sample Sourcing Certification Exam Questions

Are you interested in getting certified?

Our Sourcing Certification Program, substantially reworked and improved in 2014, now lets you take the Exam and get Certified independently of subscribing to the interactive materials, which we call “The SourceBook”.

You would definitely be well prepared if you subscribe to the material. But if you feel ready and confident, you can give the Exam a try and become Certified. The Exams are offered monthly, at the end of each month. The next round is coming up at the end of August 2014, and we already have quite a few folks signed up for it.

For those who are curious, here are a few sample Exam questions. Please note: no answers will be provided to these; of course, you are welcome to compare notes in the comments. The questions are not too difficult, but you’d need to do a bit of research for each one. The Exam has 60 questions to answer.

Q1) For the person whose work phone number is 413-794-3248, what is her work email address?

Q2) Find a page on the High Country Fusion Company website, listing the company’s Australia staff. One of the people listed on the page does not have a LinkedIn account. What is the last name of that person?

Q3) One of the Engineering staff listed in this Directory has a work phone number publicly shared on his LinkedIn profile. What is that number?

Q4) The site http://doleta.gov has an Excel file (the xlsx type) that contains a long list of professionals with names, company names, titles, emails, phone numbers, and addresses. How many records does the list have?

Q5) Complete the search string by replacing the ??? with a word, to create a Google search for profiles of members of the Florida Direct Marketing Association:

site:fdma.org/???

Q6) Which of the following search strings on Google will find not only the pages with the word Programmer, but also some pages with its synonyms?

  1. Programmer
  2. ~Programmer
  3. Programmer OR Developer OR Engineer
  4. Any one of the above will work

That’s it for now! Additionally, you can test your Sourcing knowledge on the site and get a response from an auto-checker.

The informational materials on the SourceBook, including an outline and a sample video, can be found on the Certification site.

Roads Less Taken

[Photo: roads less taken]

By all means we should be reaching for the “low hanging fruit” first. To find the target potential candidates we should first go to the ATS, search on Job Boards and on LinkedIn. It would be silly to ignore the easily available sources and do something “exotic” instead.

However, everyone else is also going to the same search dialog on LinkedIn and typing in similar, if not the same, keywords, based on the role. As a result, for the popular roles, the same LinkedIn members get similar-sounding InMails. This leads to: 1) their annoyance and 2) our drop in productivity.

Yet some professionals do have profiles on LinkedIn, could be perfect matches for the roles, and are rarely found.

Here’s how to

Find Hard-to-Find Professionals That Others Don’t

The Sourcing Scenario below is based on using the MS Outlook Social Connector, which I wrote about in an earlier post describing how to find almost anyone’s email.

I’ll use an example just discussed at the Sourcing Summit in New Zealand. As you will see, some of the professionals identified this way would not be found by the above LinkedIn search – even using a very long Boolean OR statement.

Step 1. (Step away from LinkedIn for a bit.) Search for JavaScript Developers on GitHub in NZ:

[Screenshot: GitHub search for JavaScript developers in NZ]

Step 2. Collect the email addresses using an email extractor of your choice; then go to the Outlook Social Connector:

[Screenshot: the Outlook Social Connector]

Step 3. Review the profiles.

[Screenshot: the profiles]

As you can see, this way you may find skilled professionals, whom others are not likely to find. This is a road less taken – but it’s not hard to take at all.

The MS Outlook Social Connector is not the only way to perform this kind of “Lazy” Sourcing (I described the general concept in an earlier post), but it is certainly an excellent tool for sourcing and for general productivity. I recommend using it.

P.S. Sourcers – Any guesses where the above photo was taken? :)

Sourcing Methodologies – Thu July 24th, 2014

In this newly developed webinar, “Sourcing Methodologies”, I will cover several innovative sourcing and research concepts that, once put to work, are going to make your Sourcing soar to new heights! (See the registration link below.)

“X-Raying”, “Flipping” and “Peeling” are ways to source talent that were named and conceptualized fifteen years ago. While they remain perfectly applicable and useful, today’s Internet has a gigantic volume and a much more complex structure compared to back then. The modern Sourcing Theory that I will cover takes full advantage of what today’s Internet offers. The material is an outcome of creative, hands-on Sourcing Practice (mine and my colleagues’) across industries.

I have named some selected concepts below; sign up for the webinar for the full, detailed coverage, along with plenty of examples.

1. “Visualize Success”: imagine what you are going to find, then use that to find the target pages. While this sounds like a common sense approach, some of its “extreme” applications may surprise you.

2. “Follow the Leads”: identify your ideal candidate online, then find other promising profiles by looking up that professional’s traces online. Consistently following those traces (as described in a post sometime ago) works wonders, especially if your target professionals are members of committees, associations, or other professional “gatherings”.

3. “Sourcing without Searching” (described in the previous post as “Lazy Sourcing”): obtain a set of data with a known structure (such as emails, resumes, or lists of contacts) from the web, then parse, sort, and filter. There are some true gems that can be found this way. Existing tools make it doable by anyone, not just by “geeks”.

4. “Cross-Referencing”: starting from incomplete initial data, build up professional profiles by locating and assembling the professional bio details. It’s now powered by Social Lookup tools (such as MS Outlook Social Connector) and by creative uses of existing functionality of Social Networks.

While you may already be using some elements of these concepts in your day-to-day sourcing, being aware of the theory behind it will facilitate consistent productivity – and enjoyment – of your work. Get on the phone with that potential candidate faster!

Who should attend: Recruiters, Sourcers, and everyone looking for professionals online.

The webinar will be especially useful for those who feel that their resources are getting exhausted, or that their sourcing is too labor-intensive, and are looking for new ways to source.

Date: Thursday July 24th, 10 AM Pacific Time
Length: 90 minutes
Register: [NOTE - JULY 24: THE REGISTRATION IS NOW CLOSED. PLEASE CONTACT GEORGE@SOURCINGCERTIFICATION.COM WITH QUESTIONS]: (you will receive the login instructions within 24 hours after submitting a payment; you will get all the materials one day after the webinar)
Included: Slides, video-recording, and one month of support applying theory to practice

Seating is limited; sign up early.

“Lazy” Sourcing

I imagine that you would agree with me that advanced Boolean searching for talent on Google is harder than searching on a job board or on LinkedIn, which, in turn, is harder than sorting and filtering a list of professionals in an Excel table. Sure enough, advanced search offers control over the results, but in some cases it becomes very elaborate even for skilled Sourcers – and inefficient as well.

An example is searching for “all” women’s names in searches for diversity. The challenges would be: limits on the length of the search string and the number of keywords; (severe) limits on the number of results displayed for any search; and some names that can be either male or female, to name a few.

As another example, an exhaustive Boolean search using job titles would require significant upfront research for what these professionals are called at target companies (that could vary greatly!) and will run into the search limitations as well.

Yet even if there’s a large Excel file, and even if some records have no relevance to the target whatsoever, you might pick up the promising records quickly. More generally, if you put your results into a system capable of searching, sorting, and filtering, that would make a difference in searching efficiency and the results. Searching in a set of records is much easier than searching among volumes of unstructured web pages.

So here’s the concept of “Lazy Sourcing”. Post a job… no, I didn’t mean to talk about that.

“Lazy Sourcing”: get tons of info, perhaps most of it irrelevant, then filter out what’s good.
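For those who prefer a script to Excel, the “get everything, then filter” idea can be sketched in a few lines of Python. The records and column names below are made up; in practice you would load your own exported file.

```python
import csv
import io

# Made-up records standing in for a large downloaded list
SAMPLE_CSV = """\
name,title,company,location
Ann Lee,JavaScript Developer,Acme,Auckland
Bob Ray,Accountant,Acme,Wellington
Cy Poe,Senior Developer,Globex,Auckland
Di Fox,Office Manager,Initech,Sydney
"""

def filter_records(csv_text, title_keyword, location):
    """Keep rows whose title contains the keyword and whose location matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    keyword = title_keyword.lower()
    wanted_location = location.lower()
    matches = [row for row in reader
               if keyword in row["title"].lower()
               and row["location"].lower() == wanted_location]
    # Sort by name so the shortlist is easy to eyeball
    return sorted(matches, key=lambda row: row["name"])

if __name__ == "__main__":
    for row in filter_records(SAMPLE_CSV, "developer", "Auckland"):
        print(row["name"], "|", row["title"], "|", row["company"])
```

Most of the records are irrelevant, and that is fine: the filter does the work that a long Boolean string would otherwise have to do.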

As one example, you could search for text and Word files that are potentially resumes, using simple, open-ended searching (vs. exhaustive keyword combination searching), and save them on your hard drive using Outwit Doc; then, search within the files. If many of the found files are not even resumes, that is fine too and is easier to digest when you have the files handy. (You can do the same with PDF files, but then you might need additional software for searching within the set. Actually, if you have one to recommend, please do.) Use a resume parser if you have access to one.

Another example would be collecting email addresses on an Association site (if it’s doable, of course), then cross-referencing against LinkedIn (just use the technique described in the referenced post) and only reviewing the records matching your target locations and companies. Note that this would not require any keyword searching at all. Additionally, if you did search with keywords, you wouldn’t be finding many of the records that show up using this technique.
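The collecting step can be done with any email extractor; as an illustration, here is a rough Python sketch that pulls addresses out of saved page text and groups them by domain. The pattern is deliberately simple and will miss unusual addresses, and the sample text is invented. Cross-referencing against LinkedIn is a separate step that is not shown here.

```python
import re

# A simple, permissive pattern; not a full email address validator
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(text):
    """Return unique email addresses in lowercase, in order of appearance."""
    seen = set()
    emails = []
    for match in EMAIL_PATTERN.findall(text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            emails.append(email)
    return emails

def group_by_domain(emails):
    """Group addresses by domain, a rough proxy for the employer."""
    groups = {}
    for email in emails:
        domain = email.split("@", 1)[1]
        groups.setdefault(domain, []).append(email)
    return groups

if __name__ == "__main__":
    page_text = ("Contact Jane.Doe@example.org or sam@example.com. "
                 "Chair: jane.doe@example.org, treasurer: pat@example.com")
    print(group_by_domain(extract_emails(page_text)))
```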

You would certainly need to eyeball the results before taking any action, but that is necessary no matter how you begin.

Searching for Contact Info

Here’s a brief note on searching for contact information on LinkedIn. Posting visible contact info on profiles is discouraged; but, as we know, many members do it, particularly those who are Recruiters, Sales, and “Open Networkers”.

At the same time, LinkedIn quietly takes some measures for us to see less of that contact info.

As an example, a search for “gmail.com” in the UK population returns about 250 results, while an X-Ray for “gmail.com” returns “About 3,570 results” and shows about 700. Sure enough, there may be false positives in the X-Ray, but it’s easy to locate some profiles found in X-Ray that are not included in the above LinkedIn internal search. If you are still in doubt, here is a more narrow search; compare internal search vs. X-Ray for the London, UK Area.

Here is another interesting example, showing the same tendency. Search for “com” in the last name, narrow to the Greater Chicago Area; compare internal search (under 200 results; 3 more if you search in the first name field) vs. X-Ray (About 460 results).

The moral of the story is that X-Raying for gmail-based email addresses and, possibly, anything else that points to an email address included in the profiles, would bring much better results than internal search.

As a side note, there’s quite a bit of other interpretation going on in the LinkedIn internal search, which didn’t use to happen. Somehow the internal search often recognizes keywords that sound like last names and would interpret the search as if you put the keyword in the last name field. That means that profiles mentioning someone by name would not be found. More on that later…

SourcePedia: Reference and Teaser

Here’s a preview of some of the reference materials I will be sharing at the upcoming Webinar – “SourcePedia”, repeated live on Tuesday, July 15th, 2014. Seating is limited; sign up early to get your spot.

(1) All the LinkedIn names for the Geo-locations, world-wide. You would need these for X-Ray searches, such as this:

site:www.linkedin.com/in OR site:www.linkedin.com/pub -pub.dir “location * Glasgow, United Kingdom”
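With a list of location names in hand, strings like the one above can be generated rather than typed. The Python sketch below fills a location into that pattern and builds a Google search link. The function names are mine, and the pattern relies on how LinkedIn public profiles are structured at the time of writing, so treat it as an assumption that may stop working.

```python
from urllib.parse import quote_plus

def linkedin_location_xray(location: str, keywords: str = "") -> str:
    """Fill a LinkedIn location name into the X-Ray pattern shown above."""
    query = ("site:www.linkedin.com/in OR site:www.linkedin.com/pub "
             f'-pub.dir "location * {location}"')
    return f"{query} {keywords}".strip()

def google_search_url(query: str) -> str:
    """Turn a search string into a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)

if __name__ == "__main__":
    for place in ("Glasgow, United Kingdom", "Auckland, New Zealand"):
        query = linkedin_location_xray(place, "JavaScript")
        print(query)
        print(google_search_url(query))
```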

Reference (that you can use): the list of ALL the US-based locations:

[Image: list of LinkedIn US locations]

(2) A “Vocabulary” of 40K+ LinkedIn Skills. (worth attending for this one alone!)

Teaser: the Word Cloud for ALL the Skills:

[Image: word cloud of LinkedIn Skills]

(made with Wordle)

Also covered:

  • Global Search Engines: Syntax
  • Google Boolean Syntax Charts and examples
  • Search Engines Comparison Charts
  • Postal and Telephone Code References
  • Real Time Search Engines
  • Social Sites Directory
  • Registered, Certified, and Licensed Professionals – Resources
  • Diversity Resources
  • Productivity Browser Add-Ons List
  • Scraping, Parsing, Filtering, and Sorting Tools List
  • Social Cross-Referencing Tools List
  • …and more.

The idea is to provide a customizable set of references for everyone who searches for professionals on the web. The crowd that attended the first delivery of the webinar expressed high satisfaction with the content. I hope to “see” many of you at the upcoming webinar repeat next week.

SourcePedia (Sourcing Reference): Webinar Tuesday July 1st

Announcing a new Sourcing Webinar:

SourcePedia (Sourcing Reference): Tuesday July 1st

Which search engines index the Internet globally? Where could I find the list of current Boolean operators for sourcing in a one-page document? What are the differences between the Google and the Bing search syntax? Where can I look up the corporate email formats for a given company? Which email collection and verification tools are out there? Where do I find a good list of diversity-related associations? What are the sites that list certified and licensed professionals? How do I look up a company’s competitors? What are the names of all the locations that LinkedIn uses in each country (e.g. Australia or Canada)? What is the up-to-date tool to look up the hidden names?…

This 90-minute webinar is packed with up-to-date references, resources, and tip sheets, that anyone would find to be handy in the practice of searching for target professionals online. It will answer all the questions above and many more.

The materials are all yours to keep and to enjoy in daily sourcing practice. Besides, one month of support is provided to help you to master the reference materials further.

Who should attend: if you search for professionals online as part of your job, this webinar is for you!

Seating is limited; register early.

Date: Tuesday, July 1st, 2014
Time: 9 AM PDT / 12 PM EDT / 5 PM London
Duration: 90 minutes

Included: The slides for the webinar, a complete recording of the webinar, and one month of support

Register at http://sourcingcertification.com/sourcepedia

Mini-Sourcing Contest :)

It’s been a while since I ran the last sourcing contest. Here is a new one. Try your sourcing and research skills!

The first person to email me the correct answer will be eligible to take the Sourcing Certification Exam in July or August 2014 (their choice; the fee waived) and, as usual, will be featured on the Boolean Network.

Read on.

A good friend of mine is a tour guide in Moscow, Russia. She loves to take photos. Yesterday she sent me a photo of a cool-looking car, brought along by American travelers, whom she was helping to get around during their visit.

[Photo: the car]

I was curious about the visitors.

It turns out that one of the people riding in that car is a woman with an interesting professional background. In the past she used to be a flight test pilot in USAF and spent hundreds of hours in the air. She has other accomplishments too – including several college-level degrees.

The Contest Question: In what year did she get her PhD and what was the title of her PhD thesis?

Deadline: July 2, 2014 (closed if not solved by then).

Tip: you do not need to locate my friend and ask her for any information to find the answer. Google will do the job.

When I receive the correct answer, I will update the post.

Best of luck!

—————–

Update:

Hey All – nice work! I have received the correct answer – which is found by first Googling the car license plate – from several people. You can find more information about the car in this video and an interesting report on the same site about the tour. The rest is also done by Googling. :)

The first person to send me the correct answer was Vidhya Lingappan of Roland & Associates, a company in India that has many excellent researchers, some of whom I have met “virtually”.

Roland himself has achieved the Advanced People Sourcing Certification and is listed among experts on our site.

Congratulations! Vidhya solved it together with a coworker, Siva, so it was a “team” effort.

More to come!

X-Raying Twitter Is Easy

X-Raying Twitter for member bios has just become really easy. We don’t have to struggle excluding the non-bios via something like

…-inurl:lists -inurl:members -inurl:hashtag -inurl:status -inurl:statuses…

That is because Twitter has recently added a new separate view with “Tweets and replies”. A URL for the “Tweets and replies” ends in /with_replies, while preserving all of the bio information.

Therefore, we now have two ways to X-Ray for bios only:

site:twitter.com inurl:with_replies [add keywords]

(easy!) – or, if you like to get to the original profile URLs for some reason, you can search for

site:twitter.com “Tweets and replies” -inurl:with_replies [add keywords]
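If you run these searches often, a tiny helper can assemble them. This Python sketch builds either of the two strings above from a list of keywords, putting quotes around multi-word phrases. The function name and parameters are my own, and the strings only work for as long as Twitter keeps the “Tweets and replies” view.

```python
def twitter_bio_xray(keywords, original_urls=False):
    """Build one of the two Twitter bio X-Ray strings described above."""
    if original_urls:
        # Finds the original profile URLs instead of the /with_replies view
        base = 'site:twitter.com "Tweets and replies" -inurl:with_replies'
    else:
        base = "site:twitter.com inurl:with_replies"
    # Quote multi-word phrases so that Google treats them as exact phrases
    terms = [f'"{word}"' if " " in word else word for word in keywords]
    return " ".join([base] + terms)

if __name__ == "__main__":
    print(twitter_bio_xray(["recruiter", "San Francisco"]))
    print(twitter_bio_xray(["sourcer"], original_urls=True))
```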

When X-Raying Twitter bios, you can use Google search syntax to find people with follower or following counts in a certain range, or those who joined within a certain Month/Year range:

Of course, we have little control over the keywords appearing in the bios vs. in the tweets that were present on the bio page when it was indexed. That is true about the location names as well. We could search for “San Francisco” or “Atlanta GA” and hope that we find these words as locations and not in tweets, but we would need to check.

Why can X-Raying be useful? My former favorite Twitter bio search, Tweepz, created by the same people who are behind the Exalead search, stopped tweeting last November, and the site shows some signs of decline. On the other hand, Google’s advanced syntax and proximity search expressed via the asterisk * may help to do some creative searches.

I wish Twitter would put some labels by the bio and by the location for easier X-Raying. I hope it will! Now, while it’s not easy to “isolate” the bio, it is possible to X-Ray for tweets of a given person – or any person. The elements status and statuses in the URLs let us do that:

It is possible to X-Ray for Lists as well. Here is an example:

If you are an expert in advanced Boolean searches on Google, you may be surprised: in this context you can search for lists that include a given Twitter handle by using the special character @ in the search string, the same character that never helps to find email addresses on Google. Take a look:

Finally, Twitter can also be X-Rayed for popular hashtags:

This concludes a brief investigation of the now-easy twitter X-Raying.

Of course, Twitter itself has Advanced Search and Twitter List search – those and the real time search Topsy (acquired by Apple last year) provide nice search capabilities. But some of the above searches they can’t do.