A Tool, a Hack, a Link, and a Site

booleanstrings Boolean Leave a Comment

#hacker

David Galley and I are obsessed with tools, hacks, links, and sites. On our Slack, we have a channel called “Interesting Links” that gets populated nearly daily. Where do we get the info? Facebook Groups, Recruiting Brainfood Newsletter, shares in Messenger by colleagues, tweets with the hashtag #OSINT, RSS feeds from selected sources, and always, Googling. Once in a while, tool authors get in touch asking to be included in Tools.

Tool: SalesQL

Its target audience is sales, but it works wonders in our sourcing projects (combined with a few others). Not only does it guess contacts, providing a confidence rate – it can also grab many profiles from a search page or download your connections, and it keeps results in exportable projects. Some functions are paid.

Hack: guess developers’ emails with more than 15% certainty (then verify).

It is not news that people often use one handle for both their social accounts and Gmail. The rate of correct guesses varies, so I ran an experiment on 16K+ GitHub handles that we have collected. I constructed the email guesses as <handle>@gmail.com and uploaded the list to LinkedIn Recruiter – and got 15% matches! Given that more than half of GitHub developers likely do not register the same email on GitHub and LinkedIn, the true success rate is more like 30%. The selection, intentionally, was from a wide range of locations and languages.
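For those who prefer a script to a spreadsheet, here is a minimal Python sketch of the guess-and-estimate steps. The handles are made-up placeholders, the function names are hypothetical, and the 0.5 share of people who use the same email on both sites is an assumed round number based on the "more than half" estimate; verifying the guesses still has to happen in a separate tool.

```python
def guess_gmail(handles):
    """Build <handle>@gmail.com guesses, skipping blanks and duplicates."""
    seen = set()
    guesses = []
    for handle in handles:
        cleaned = handle.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            guesses.append(f"{cleaned}@gmail.com")
    return guesses


def adjusted_rate(observed_rate, share_using_same_email):
    """Scale the observed match rate by the (assumed) share of people
    who register the same email on both sites."""
    return observed_rate / share_using_same_email


# Placeholder handles, not real accounts.
handles = ["octodev", "OctoDev", " pyfan42 ", ""]
print(guess_gmail(handles))
print(adjusted_rate(0.15, 0.5))
```

With a 15% observed match rate and an assumed 50% share, the adjusted rate comes out at 30%, which is the estimate quoted above.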

Site: see many LinkedIn public profiles in Incognito

Why would you want to do so? Well – so that you do not have to visit LinkedIn, for whatever reason. Or to figure out which terms to put in your X-Ray search. The site is Google’s Mobile Friendly Viewer.

Check out our Training Library – each class has many tips like the above.

These Keywords Are Not on the Job Description

booleanstrings Boolean 3 Comments

For leadership roles, a professional’s value is aligned with their ability to solve problems, thus allowing the company’s further growth. If a job description calls for a leader, you would be searching for profiles with keywords like leadership, growth, expansion, etc., synonymous with the words in the requirements. (Expect another post soon BTW.)

However, a true problem solver will also have keywords on his/her profile that are usually not in the job description. I am talking about “problem” (or negative) words describing difficult situations – that the person has helped solve. Here is an example (to be used on LinkedIn):

underperforming OR(destroyed) OR(losing) OR(loss) OR(lost) OR(declining) OR(failing) OR(fail) OR(“lack of”) OR(poor) OR(insufficient) OR(“did not meet”) OR(slow) OR(stalled) OR(behind) OR(failure) OR(decline) OR (outdated) OR(obsolete) OR(old) OR(broken) OR(inadequate) OR(“not enough”) OR(problem) OR(problems).

This search returns 26 million problem solvers! Unsurprisingly, tech giants lead in the number of solvers. Give it a try and let me know if it helps to uncover new results.

You can also use the string in Content search (combine the above with some keywords pointing to an industry or company) and find social shares from leaders.

(Can you think of some string additions? Please post in the comments.)

And this is where we humans beat the AI 😉

P.S. Do you own the 5th edition of the “Boolean Book”? There are 300 more searches there.

Three Topics? And Nine Tools

booleanstrings Boolean 3 Comments

Happy New Year, readers, and thanks for following my blog! What would you like me to write about? Please name three topics or questions of interest in the comments – thanks!

In the meantime, here is some recent news.

1. Google launches Question Hub – https://questionhub.withgoogle.com – to allow everyone to “answer” questions by submitting URLs. I imagine Google will be rolling these URLs out in its searches after some checking.

2. A Google algorithm called SMITH reportedly outperforms BERT.

Due to both, we can expect Google to “understand” our queries even better.

3. Russian tool https://search4faces.com does a decent job of face recognition for look-alikes of members of Russian networks like VKontakte.

4. Another Russian tool https://go.mail.ru/search_social is like a CSE over Yandex. 

5. https://www.whatsmyname.app produces a list of over sixty social profiles for a username. (I have used it for our exam questions.)

6. Google search simulation from different locations – this Chrome extension mimics searching from different locations and in different languages. Use it, varying the settings, to get many more results for a search than “your” Google shows; then scrape and combine the results. Or, if you live in one country and source in another, “move” there temporarily.

7. Another way to get different results is by using Anonymized Google on https://www.startpage.com. The more, the merrier!

8. Switch to the old Facebook UI to be able to search across states or countries. (How? Paste a state page URL into the “location” field.)

9. LeadIQ Lead Capture seems to be one of the best contact finders.

Source on!

And please watch for our announcements – we will present a Tools class soon.

Tech Sourcers: Watch GitHub Profile READMEs

booleanstrings Boolean Leave a Comment

Good news! GitHub has added the ability for its users to create an informative bio. They call it Profile READMEs. We can hope Developers will post some detail about themselves so that Recruiters reach out to the right people. 😀

Adding a bio is straightforward – you need to create a repository with the same name as your username and drop a README.md file there. That file’s contents will be shown on your front profile page – publicly.

From Profile readmes, we can get extra info about members, including their skills and preferences, and often, public email addresses. Here is what a typical Profile readme looks like when displayed on the profile home page:

Since most people will follow the template for a readme file that GitHub offers, we can expect at least parts of the phrases from it to be kept in the bio. It starts with “Hi There” and has several phrases to be finished such as “I am currently working on…” “Ask me about…” (I did).

Using the phrases, we can X-Ray for bios or search on GitHub itself:

site:github.com “hi there” “how to reach me” “gmail.com” python

“hi there” “how to reach me” python
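If you run many variations of these strings, a small helper can assemble them for you. The following is a minimal Python sketch; the function names are hypothetical, and the only outside fact it relies on is Google’s standard search URL with the q parameter.

```python
from urllib.parse import quote_plus


def xray_query(site, phrases, keywords=()):
    """Join a site: restriction, quoted phrases, and loose keywords into one query.
    Pass site=None to get a string for searching on the site itself."""
    parts = [f"site:{site}"] if site else []
    parts += [f'"{phrase}"' for phrase in phrases]
    parts += list(keywords)
    return " ".join(parts)


def google_url(query):
    """URL-encode the query for pasting into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = xray_query("github.com", ["hi there", "how to reach me", "gmail.com"], ["python"])
print(query)
print(google_url(query))

# The same phrases without the site: operator, for GitHub's own search box.
print(xray_query(None, ["hi there", "how to reach me"], ["python"]))
```

Swapping "python" for another skill, or "gmail.com" for another mail domain, gives you a new search without retyping the template phrases.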

Members whom you find this way have publicly displayed background and contact info. They are easier to assess as potential candidates and will not mind an appropriate email. Hopefully, members will actively use the feature.

 

Image Diversity Sourcing with No Photos

booleanstrings Boolean 1 Comment

 

Those of us who source for diversity know that reviewing profile photos is one of the key filtering techniques in sourcing for several kinds of diversity. But today, I want to tell you how to find some female candidates with no photos included.

I came up with this hack while sourcing for a Director of Data and AI in Berlin, whom the client would ideally like to be a woman. (It is a very male-dominated field.) I noticed that public XING profiles without a picture have two different generic images depending on gender. So here is what you can do to find women without a profile picture – search by a generic female image while X-Raying XING:

(My friend Glenn Gutmacher noticed that this type of search starts doing something odd when you use phrases in quotes or the inurl: operator. Make sure you screen your results.)

Another such site is Healthgrades. It allows the same sort of search for females:

Searching in images and by images gives you an additional sourcing boost. There are more results in image X-Ray searches (typically, twice as many). Research, translation, diversity sourcing, and a free filtered LinkedIn member search can be implemented through image tools. Additional scraping and “in-scraping” (following links for data) will make your X-Rays perform better than searches in LinkedIn Recruiter.

Sourcing nerds and all those who want to source in less-traveled places –

Check out our brand new webinar, “Sourcing in Images!”

How to Exceed 1,000 Search Results on Bing

booleanstrings Boolean Leave a Comment

Guest post by Glenn Gutmacher

In a recent post, Irina described an intriguing follow-up to a discovery made by Dan Russell. In short, the initial discovery was that Google used a completely different index to store its web-crawled image results from its index of regular webpages. What Irina realized is that the same query run on Google Images could yield very different results than on regular Google search.

Why that’s significant, as she explained, is that you might get only a limited number of results from, say, a regular LinkedIn x-ray (site: search), but if you ran that same search on Google Images, you’d find many additional relevant results missing from the regular results. Using simple web scraping and filtering in Excel, you can quickly get all the unique results out of the combined set. The combined set often exceeds 1,000 results in total. That was the maximum Google used to return in the old days, though today it rarely displays even a third of that!
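If you would rather script the combining step than do it in Excel, here is a minimal Python sketch of the deduplication. The URLs are made-up placeholders and the function names are hypothetical. The normalization rules (ignore the scheme, a leading "www.", the query string, a trailing slash, and letter case) are assumptions about what makes two scraped links point to the same profile; they do not handle country subdomains.

```python
from urllib.parse import urlsplit


def profile_key(url):
    """Reduce a profile URL to a comparable key (host + path, lowercased)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/").lower()


def combine(regular, images):
    """Return (unique keys across both lists, keys present in both lists)."""
    regular_keys = {profile_key(u) for u in regular}
    image_keys = {profile_key(u) for u in images}
    return regular_keys | image_keys, regular_keys & image_keys


# Placeholder URLs standing in for scraped results.
regular = [
    "https://www.linkedin.com/in/example-nurse-1",
    "https://www.linkedin.com/in/example-nurse-2/",
]
images = [
    "https://linkedin.com/in/example-nurse-2?trk=public",
    "https://www.linkedin.com/in/example-nurse-3",
]
unique, overlap = combine(regular, images)
print(len(unique), len(overlap))  # 3 unique profiles, 1 found in both lists
```

The size of the overlap set is the number to watch: the smaller it is, the more the image search adds to the regular one.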

SO WHAT’S NEW TO LEARN?

Why was I invited to write this guest post? Because Irina and I discussed whether this phenomenon might be the case on other search engines, and I agreed to help prove it on the other biggie known for good LinkedIn search results: Bing.com.

I used Bing’s default Safe Search: Moderate filter in all cases. For Bing’s default (“All”) search, I obtained 997 results for her same query of site:www.linkedin.com/in “registered nurse” dallas tx

Note that the total is significantly more than the approximately 350 that Irina reported for this query on Google (I got 313 results when I googled it). This alone might motivate us to use Bing for more LinkedIn searching!

However, it gets even more exciting when you add in Bing images search. To be as apples-to-apples as possible, I initially used Bing Images’ filters to try to match the same criteria/settings that Irina used in her Google test: People → All (this gets faces as well as photos) and image size of 200 x 200:

https://www.bing.com/images/search?q=site%3awww.linkedin.com%2fin+%22registered+nurse%22+dallas+tx&qft=+filterui:imagesize-custom_200_200&form=IRFLTR&first=1&scenario=ImageHoverTitle

This yielded 485 results. However, I found that searching images with the same query but without any filters yielded a few more results (total of 507, or 22 additional), but all remained solely LinkedIn individual people profile results, given the specificity of the site:linkedin.com/in portion of the query.

THE BIG REVEAL

Now for the amazing conclusion: of the 507 Bing image search results and 997 regular Bing search results, the overlap of URLs was only 10. Yes, ten! So that provides 1,494 unique profiles (507 + 997 − 10), and the astonishingly low overlap of about 1% is remarkably similar to what Irina found in her Google test.

I should note there were a handful of false positives in both the regular and image search results where the profile was not the page of a registered nurse nor someone in Dallas. However, it wasn’t really an error because if you looked in the “People also viewed” right-hand column, there was inevitably one Registered Nurse and yet somebody else with a Dallas TX location! In any case, this doesn’t take away from the fact that the LinkedIn results generated from the image search were almost all *completely different* than the regular results.

So it appears that both Google and Bing are harvesting and processing images in a completely different way than other content, and it would behoove all sourcers to search each filetype if the goal is exhaustive, unique results for your query.

My thanks to Irina for inviting me to write this, and I hope we can do it again sometime on another sourcing topic!

Editor’s Note: If you are not familiar with the common methods to download and deduplicate results like this, you can learn to use scraping tools in our class on December 8, 2020.

About the Author: Glenn Gutmacher was one of the early online sourcing trainers (back when it was called “internet recruiting”) and started one of the first job/resume boards for a New England newspaper chain in the ’90s. Since the new millennium, he has been a full-time sourcer or sourcing manager in multi-year stints at several multibillion-revenue companies including Getronics, Microsoft, Avanade and currently State Street Corporation.

Hack: How to Get More Than 1,000 Results on Google #OSINT

booleanstrings Boolean Leave a Comment

While the official displayed results limit is one thousand, these days Google searches never produce more than 300-500 results. But today, I ran into something interesting: you still can get many more results, and even more than a thousand.

The secret is to also search in Images.

If you thought that when you switch to images on Google’s search screen it will pick the indexed images from the displayed results and show them in the same order, you are wrong. As we have already learned, images is a different database.

I switched to Image search, collected results, and compared them with the general search results in some tests to arrive at the following:

  1. In images, you can get a lot more results – close to 1,000!
  2. Results in “all” and images overlap very little! If you combine them, you will easily get over a thousand results.

A search that I ran in one of the tests was as follows:

The results were astonishing. Google search brought in about 350 profiles. Image search – almost 800. And, the overlap was only 14 profiles. So, combined, I got over 1,100 results.

How do you collect, combine, deduplicate, and filter results?

It is straightforward if you use scraping tools. Join us for a class on December 8, 2020 to learn about them!

Three Easy Ways to OCR

booleanstrings Boolean Leave a Comment

Sometimes we need to get the text out of an image (such as the one posted above, a document scan, a screenshot, or a photo – think a group picture with nametags, for example). It is just inconvenient not to be able to search within “image-documents” or parse them.

I have good news for you. Character recognition is a solved computer task. There are several convenient ways to address the challenge.

  1. Upload to Google Drive.
    • It will convert your image to text upon uploading.
    • It is far from perfect, though; most formatting will be lost.
  2. Google Cloud OCR recognizes formatting.
  3. Yandex Images automatically OCRs when you search by image:

Do you have a favorite OCR tool?

I will be speaking about creative sourcing in images at Sourcing Summit Germany – please come listen if you are attending!

My blog readers: would there be interest in an Image-focused sourcing webinar?

Including Unemployed in LinkedIn Searches

booleanstrings Boolean 1 Comment

While Recruiters typically avoid reaching out to unemployed prospects, this year is different. However, if you search for the current title or company, you will not find professionals who have lost their jobs due to COVID.

Here is how to include them.

On LinkedIn, search using keywords and other filters but not a current title or company. In Recruiter, also avoid the filters for years of experience, years at the company, or years in position.

What about X-Raying? For people with a current job, LinkedIn public profiles show their titles and companies. For people without, public profile titles include their school in the format <name> – <school name> – LinkedIn:

So, if you X-Ray for a school name in the page titles – site:linkedin.com/in intitle:“university of chicago” – your results will include the school’s alumni with no current jobs. Unfortunately, your results will also include the school’s employees. But if your target audience is outside of people in Education, you can reliably find the unemployed by X-Raying. Optionally, add #opentowork to narrow your searches.

P.S. Adding or excluding the word “present” in LinkedIn X-Ray does not help find profiles with or without current positions because the word is added to the profiles by JavaScript. Google often does not index it, and neither does Bing.

All that said, if you are a job seeker, it is best to keep your current job “open,” to be found more often. We will understand.

Facebook Photo Discoveries

booleanstrings Boolean Leave a Comment

While you can magically find photos with any given number of people on Facebook, clicking on them (in Google’s image search) lands on the page, not the photo itself. You have to scroll down to find it. As I have found out, to find “just” photos (or rather, pages with one photo only), you need to X-Ray in a more specific manner, namely,

site:facebook.com/*/photos/a (add your keywords).

Here is an example of precise people counting by Facebook. Another example (suppose I am sourcing in Healthcare):

site:facebook.com chicago hospital “from left to right” “image may contain * people”. You can see a screenshot of the results above.

Once you land on a page with the photo in question, you will often see who has posted it and, often, who is in the photo, who liked it, and who commented.

You can find pages like these:

and

You can then collect the data and engage with selected “likers”.

(To narrow to profile photos, if you were wondering, try this: site:facebook.com/*/photos/a “updated their profile photo”.)

My example strings find images but you can, of course, run them in “all” searches.