Where the (Wild) Files Are

booleanstrings Boolean 0 Comments

When data is exposed to search engines due to an incorrect site configuration, that data becomes available for Sourcing to anyone who knows how to find it – including you and me.

About ten years ago, Sourcing techniques like “flipping” or “peeling” still worked, providing creative Researchers with the data to find and parse – sometimes, folders full of files with desired professional data. These techniques are not as effective any longer. As of today, we rarely see websites that would show anything like this (an exposed file directory):

We can try to look for those by Googling for something like

intitle:“index of” name “last modified” “size” <add keywords>,

but we won’t find a whole lot, because many modern site-building platforms have protection built in.

However, the web is full of new data sources that were not available in the past. With the rising popularity of BIG DATA in the CLOUD, we can Google for a different set of files. For example, if a file is stored in the Amazon Cloud and is public (say, it is referenced from a public document), we can locate that file.

Would you like to see some examples? These are resumes stored in the Amazon Cloud. Another example, these are attendee lists.

There are tons more sources beyond the Amazon cloud storage. Here is another source of files:

inurl:wp-content/uploads. (Why? You should be able to explain). Example Google search –

inurl:wp-content/uploads “member directory” ext:PDF – 

finds some interesting data!
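For readers who prefer to script these searches, here is a minimal Python sketch that turns the query strings above into Google search URLs. It assumes the usual `https://www.google.com/search?q=...` URL form. The helper names `google_url` and `uploads_query`, and the sample keyword `resume`, are illustrative choices, not part of the original searches.

```python
from urllib.parse import urlencode

GOOGLE_SEARCH = "https://www.google.com/search"


def google_url(query: str) -> str:
    """Return a Google search URL for the given query string."""
    # urlencode percent-escapes quotes, colons, and slashes in the query.
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


def uploads_query(phrase: str, ext: str) -> str:
    """Build the wp-content/uploads X-Ray string for a phrase and file type."""
    return f'inurl:wp-content/uploads "{phrase}" ext:{ext}'


queries = [
    # "resume" stands in for <add keywords>; it is only a sample keyword.
    'intitle:"index of" name "last modified" "size" resume',
    uploads_query("member directory", "PDF"),
]

for q in queries:
    print(google_url(q))
```

Swapping the phrase or the file extension (for example `xls` instead of `PDF`) gives a quick way to try variations of the same search.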

Learn about other platforms full of uploaded data that may find its way into Google’s index (and much more!) at the Sourcing Methodologies Lecture and Practice.

Get a List of Candidates for Your Requirements

Our new tool Social List continues to grow in both popularity and capabilities. I would like to introduce it to more fellow professionals.

Join me for a demo of https://sociallist.io and get a list sourced for your requirements (one per attendee). After the webinar, you will be able to run the tool on your own on a trial basis and check it out. Sign up @ Social List Webinar, or, go straight to the Gotowebinar Link to register. Date/Time: Tue, Aug 29, 9:00 AM – 9:30 AM PDT.

Social List finds lists of Profiles for our subscribers’ requirements. Users can export the search results for filtering, sharing, and storage.

Here are some recent examples of SocialList-sourced exported results; you will be able to get one or more lists like this:

Tax Accountant, Houston Area, CPA – LinkedIn Agent – Search Results

(Note, Social List fully complies with the rules; any questions, let me know.)

Email addresses at Oracle – Zoominfo Agent – Search Results

 

Java and Python, Seattle, works at Amazon – Universal Github Agent  – Search Results

“See you” there!

 

 

 

Melissa Data Goes to Work

In his post on SourceCon, @RandyBailey wrote about sourcing using Melissadata.com’s service Email to Address. The service provides the “associated” (physical) addresses based on an email, and vice versa – it also shows names, emails, and, often, phone numbers based on a street address.

I was exploring the site in relation to our “Data cleaning and enrichment” webinar. Let me share another, unexpected aspect of how the service works. In Email to Address, I first entered the address of a nearby apartment building. I got an impressive list of names, emails (mostly private ones, like Gmail), and some phone numbers. I was curious whether any of the numbers were mobile; I checked several using http://freecarrierlookup.com, and a few were.

It then occurred to me to try entering the address of a business building. One address I remembered was Apple’s, and I searched for it – One Infinite Loop, Cupertino, CA.

(Try it!)

The results, when I ran the query, contained over 400 names, emails, and some phone numbers. Interestingly, a significant percentage of these emails were apple.com-based.

The stats on the above email list, when cross-referenced with LinkedIn (using Talent Pipeline in Recruiter), were interesting: only ~10% of the addresses were identified, but of those that were, almost all belong to people who work at Apple at that location. It’s hard to generalize from a small sample, but this exercise does give us lists of people working in the building.

Here are some other examples:

As long as we verify these email addresses from Melissadata, we can acquire these contacts in quantity. Want to try the above for some major corporations’ headquarters? Share what you find!

In the webinar “Data cleaning and enrichment” (that we are going to repeat) we’ll study some free ways to verify and refresh recruiting data with the help of Social Networks, as well as delegating the enrichment task to specialized (affordable) tools. One of them, Clearbit, shines in giving us demographic previews of our data. Register for the webinar to find out!

 

Facebook Sourcing Mastery

Which Social Network is best for Talent Sourcing? This question does not have a good answer, because:

  1. There’s no need to narrow our search to one set of data (it would be silly, right?)
  2. We can communicate with the prospects using a different site than the one where we found them.

A better question is – What makes a Social Network valuable?

I think a Social Network is valuable for searching if it has:

  1. Plenty of professional profiles and extra info, containing those backgrounds that we search for;
  2. Convenient, “deep”, structured, within-budget, search capabilities, that allow identifying those professionals.

If the Social Network also allows us to reach out to prospects, that is a plus too!

Over the last few years, we have all been feeling that Facebook’s role in Sourcing and Recruiting will grow – along with widening dissatisfaction with LinkedIn. It’s time to take a close look at Sourcing on Facebook if you haven’t already.

How does Facebook do in terms of the above criteria? Let’s take a look.

  1. “Plenty of professional profiles”.

While a large percent of Facebook members don’t post their professional information, there are obviously many more professionals – and professional data – there. Just take a look at the numbers:

Facebook has a huge amount of professional information, more than any other Social Network does.

  2. “Convenient, “deep”, structured, within-budget, search capabilities”.

In my experience and that of our colleagues, searching on Facebook takes some getting used to. If you are used to searching for field values, such as the job title and location, in a resume database, get ready; this search is different. We need to learn some ART of Facebook searching. For example, we can search Facebook for fields such as a job title, but doing so is not straightforward. Sometimes, similarly-phrased searches produce different results (which is good to know, to get more results for a search). Facebook’s search results are, of course, produced by software code, not by a human, but often seem rather “informal”.

With the ART, we need to learn some SCIENCE – you might need to produce technical-looking searches, such as

While no technical background is necessary to learn to “speak this language”, creating queries like the above involves a learning curve. But the results are rewarding. Best of all, the search is completely free (compare that to the cost of LinkedIn Recruiter!).

By popular request, we are repeating the Facebook Sourcing Mastery webinar on Wednesday, August 16th. In particular, we’ll talk about the ART and SCIENCE of Facebook search in some detail. You can register at the link. Seating is limited.

 

 

Parlez-Vous Francais?

In Geotargeting 101, we narrowed search results to a region using a “Boolean-string-invisible” setting in Google Advanced Search.

The same advanced search dialog has a language setting that is also not reflected in the search string. The search engine gets it via a URL parameter. For example, if we set the language to English, the added URL piece will be &lr=lang_en.

 

Suppose we are looking for English-speaking people in a non-English speaking region. If we X-Ray LinkedIn (or another social site) for member profiles, using the language restriction, in the results, we will see pages that we can informally describe as having “lots” of English. Members behind those profiles are likely to speak the language.

Here is an example search in the Netherlands-based LinkedIn, narrowed to the English language:

site:nl.linkedin.com

Without any keywords, we’ll see some Netherlands-based profiles that have English content, such as a summary or a description of job responsibilities. If we add keywords, we’ll also start seeing profile URLs ending in /en – those are “secondary,” English-language, profiles. Either way, we are encountering people who use English to describe their professional background and are likely to speak the language – which is what we need!
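Since the language setting lives in the URL rather than in the search string, it is easy to add by hand or in a script. Below is a minimal Python sketch, assuming the usual `https://www.google.com/search` URL form and the `lr=lang_en` piece described above; the function name `google_url` is just an illustrative helper.

```python
from urllib.parse import urlencode


def google_url(query: str, language: str = "") -> str:
    """Return a Google search URL, optionally restricted to one language."""
    params = {"q": query}
    if language:
        # Same piece the Advanced Search dialog adds, e.g. lr=lang_en.
        params["lr"] = f"lang_{language}"
    return "https://www.google.com/search?" + urlencode(params)


url = google_url("site:nl.linkedin.com", language="en")
print(url)
```

Keywords can be appended to the query string as usual; the language restriction stays in the URL and never shows up in the search box.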

Try these searches (add keywords to make them interesting) and observe how the language setting works:

 

 

Social List: Searching the Structured Web

Social List Demo

Join us for a webinar on August 2, 2017.

Register now!

http://booleanstrings.ning.com/events/social-list-http-sociallist-io-sourcing-tool-demo 

You are invited to a demo of our new sourcing tool Social List (http://sociallist.io).

Social List searches for structured information on the Internet. It combines the conveniences of Google X-Raying of Social Networks with a precise search for fields such as location, job title, and company. Before Social List, this search precision (or “faceted” search) had only been offered in databases (that are often expensive) but not in X-Raying.

Social List offers exporting lists of profiles that it finds, with the structured information included, in an Excel format. Many of our users start their sourcing here, by generating lists of prospects to explore. I must say, I use the tool for my sourcing projects a lot!

Since our first limited release several months ago, we have enhanced Social List by introducing more search options, new search Agents, and the display and export of structured data such as titles, companies, and locations.

Here are a few quick screenshots; you are invited to see the tool in action at the webinar.

We call our X-Ray tools “Agents”. Here’s a screenshot of the Github Agent search dialog:

Here is what the search results look like – as you can see, we have enhanced them with extra structured information; the example is from our ResearchGate Agent:

Finally, here is what export looks like; the example is from Doximity Agent:

I hope to “see” you at the demo! Any questions, please email me (Irina Shamaeva).

Geotargeting 101

How can we search for pages local to a particular country?

For starters, there are country code top-level domains. To find pages that belong to a country-level domain, we can simply use X-Ray:

site:za 

But there are many other domains that don’t point to a location. When figuring out the region for a page on a generic domain, such as .com or .org, Google uses the page’s IP address (which reveals the location). It may use a few other signals, such as location information within the page HTML code and locations of other pages pointing to this one.

Google’s Advanced Search has a setting that lets us search for pages that Google has identified as belonging to a region:

 

If we set a region in the advanced dialog, we will not see it reflected in the search string. Instead, the setting generates an addition to the search URL that looks like this:

&cr=countryNZ.

Here is an example search narrowed to a region:

chemical engineer

We can exclude country-specific domains and examine what Google brings in as local to a country, based on factors other than the country-specific domain:

chemical engineer -site:nz 
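The two searches above can be reproduced by adding the region piece to the URL ourselves. Here is a minimal Python sketch, assuming the usual `https://www.google.com/search` URL form and the `cr=countryNZ` pattern shown earlier. The helper name `region_url` is hypothetical. It also assumes the country-code domain matches the two-letter code used in `cr`, which holds for NZ but should be checked for other countries.

```python
from urllib.parse import urlencode


def region_url(query: str, country_code: str, exclude_cctld: bool = False) -> str:
    """Return a Google search URL narrowed to a region via the cr parameter."""
    cc = country_code.strip()
    if exclude_cctld:
        # Drop pages on the country-code domain to see what else Google
        # considers local (assumes the ccTLD equals the country code).
        query = f"{query} -site:{cc.lower()}"
    params = {"q": query, "cr": f"country{cc.upper()}"}
    return "https://www.google.com/search?" + urlencode(params)


print(region_url("chemical engineer", "NZ"))
print(region_url("chemical engineer", "NZ", exclude_cctld=True))
```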

An interesting – and practical – use case for using the “region” advanced setting is X-Raying LinkedIn. Take a look:

Now, here is a question for my readers: can you reliably X-Ray LinkedIn for US-based profiles only, using the advanced region setting? The first person to email me a correct answer, supported by examples, will get a guest ticket to one of our next sourcing webinars.

 

 

Which Companies Use Which Technologies

If we are looking for professionals who work with certain technologies (for example, Linux, Selenium, Tensorflow, or NetSuite) and have not been able to find enough to cover the demand, how can we find more? If we know that a company uses a particular technology, then the employees who are on a team that uses the technology are likely to use it too. That includes professionals whom we won’t find online by the keywords.

To build a list of target companies, we might ask, for example,

  1. What are the largest companies that use Selenium for automated testing?
  2. Which companies use Tensorflow for deep learning?
  3. Which companies in Seattle use NetSuite products?

There are several ways to research this, and they can be combined. Let me point out two simple techniques:

  1. Look at job posts. Here is an example job search on Indeed.com – full-time in New York, with Selenium. See the list of identified companies circled in green. These companies use Selenium. We can expect someone who is in the testing department to use Selenium even if they don’t say that on their LinkedIn profile or do not have that profile at all.

 

  2. Search on a social site with a large number of target professionals. This could be LinkedIn or XING or an industry-oriented social site. If we search for people on LinkedIn using a technology keyword, we won’t find “everybody” – but we will find many profiles, which, in turn, will “tell us” about potential target companies.

As an example, which companies in Seattle work with NetSuite? Here is a people search that reveals several companies:

 

There are other sources to check, including the simple “which companies use *” (use a technology name instead of *). That is for those of us who are lazy and want to check if someone else already did the research for us. 🙂
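If the job posts from the first technique have been collected into a list or spreadsheet, tallying them by company is a few lines of code. The Python sketch below is only an illustration: the `posts` records, their field names, and the company names are made-up placeholders, and no job site API is involved.

```python
from collections import Counter
from typing import Dict, List, Tuple


def companies_using(posts: List[Dict[str, str]], technology: str) -> List[Tuple[str, int]]:
    """Count job posts mentioning a technology, grouped by company."""
    tech = technology.lower()
    counts = Counter(
        post["company"] for post in posts if tech in post["description"].lower()
    )
    # Companies with the most matching posts come first.
    return counts.most_common()


# Placeholder records standing in for collected job posts.
posts = [
    {"company": "Company A", "description": "QA engineer, Selenium and Java"},
    {"company": "Company B", "description": "Accountant, NetSuite experience"},
    {"company": "Company A", "description": "SDET with Selenium WebDriver"},
    {"company": "Company C", "description": "Test automation (selenium, Python)"},
]

print(companies_using(posts, "Selenium"))
```

A plain substring match like this will over-match short technology names, so the resulting list is a starting point for research rather than a verified answer.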

Don’t miss the upcoming webinar with Martin Lee – Recruitment Research – What, Why and How!

 

A Github Productivity Tool

It’s not often that I post a blog about a tool the day after I hear about it; this one is an exception. As if in response to my complaint about finding Github profiles where a particular programming language is “dominant”, i.e. most of the person’s code is written in that language, I got a message about OctoHR, a new free, lightweight, and useful Chrome extension. OctoHR shows the language “preference” information and reveals an email address as well. It can save Recruiters who directly source on Github a lot of time.

Here are two screenshots – the tool clearly shows which of the two profiles is a potential “Java Developer” candidate and which is not, independently of their activity level:

Additionally, OctoHR has a quick UI dialog with input text boxes for locations and languages. That is to help those who are not using the advanced Github user search (perhaps for lack of time).

Try the tool out at OctoHR and please give its author @dmitryzaets feedback or ask for desired additional features.

Irina

P.S. I can suggest a desired feature. Can we have a search for Github users for “predominant language”=Java + “follower number” > 50 + “Java repositories number” > 7? That might be a hard-to-implement requirement!
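To make the wished-for filter concrete, here is a minimal Python sketch of the logic applied to profile data we already have. Everything in it is hypothetical: the `Profile` record is not a Github or OctoHR structure, the sample logins are invented, and “predominant language” is approximated as the language with the most repositories rather than the most code.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Profile:
    """Hypothetical record; not a Github or OctoHR data structure."""
    login: str
    followers: int
    # Primary language of each public repository, one entry per repo.
    repo_languages: List[str] = field(default_factory=list)


def predominant_language(profile: Profile) -> Optional[str]:
    """Return the language used by the most repositories, if any."""
    counts = Counter(profile.repo_languages)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def matches(profile: Profile, language: str = "Java",
            min_followers: int = 50, min_repos: int = 7) -> bool:
    """Apply the three conditions from the P.S. (strictly greater than)."""
    language_repos = profile.repo_languages.count(language)
    return (
        predominant_language(profile) == language
        and profile.followers > min_followers
        and language_repos > min_repos
    )


candidates = [
    Profile("dev_a", 120, ["Java"] * 9 + ["Python"] * 2),
    Profile("dev_b", 300, ["JavaScript"] * 10 + ["Java"] * 8),
    Profile("dev_c", 12, ["Java"] * 20),
]
shortlist = [p.login for p in candidates if matches(p)]
print(shortlist)
```

In this sample only the first profile passes: the second has more JavaScript than Java repositories, and the third has too few followers.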

 

 

Lesser-Known Github Sourcing Tips

Searching for Software Engineers? Github.com has gotten plenty of attention from bloggers; you can find a couple of older posts on my blog as well. Recently, I went back to Github to source top-notch Java back-end coders for a fast-expanding start-up. Let me share some search tips picked up along the way.

Choosing Keywords

As a general consideration, to expand the results, it helps to know which skills or technologies imply other skills that may not be explicitly named, and to drop those implied words from the search.

Here is a practical example. I could include back-end OR server in searching, but these days Java is predominantly used for back-end programming, so these keywords were not necessary, as long as I searched for Java.

X-Raying for Languages

We know that the repositories “tabs” on user profiles include programming languages’ names. So, if we narrow X-Raying to those “tabs” (pages), we can add one or more languages as keywords:

site:github.com inurl:tab=repositories Java Scala Python.

(If you were wondering, the = sign can be replaced by another character without affecting the search).

X-Raying for Languages and Technologies

Github repository pages also have code names and descriptions, so if we are looking for keywords that name software libraries or technologies, rather than languages, we can do it in the same manner:

site:github.com inurl:tab.repositories Java Spring NoSQL.

Those familiar with the terminology would appreciate Google bringing in MongoDB as a synonym for NoSQL – quite appropriate for the search since MongoDB is a popular NoSQL database:

 

X-Raying for One “Primary” Language

Repositories pages link to programming language-specific pages, which can be X-Rayed “individually,” as in

site:github.com inurl:tab.repositories inurl:language.Java Spring NoSQL.
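The X-Ray strings in this post follow one pattern, so they can be assembled from parts. The short Python sketch below does only that; the helper name `github_xray` is hypothetical, and the strings it produces are meant to be pasted into Google, not sent to any Github API.

```python
from typing import List, Optional


def github_xray(keywords: List[str], primary_language: Optional[str] = None) -> str:
    """Assemble a Google X-Ray string for Github repository tabs."""
    parts = ["site:github.com", "inurl:tab.repositories"]
    if primary_language:
        # Narrow to the language-specific repositories page.
        parts.append(f"inurl:language.{primary_language}")
    parts.extend(keywords)
    return " ".join(parts)


print(github_xray(["Java", "Scala", "Python"]))
print(github_xray(["Java", "Spring", "NoSQL"]))
print(github_xray(["Spring", "NoSQL"], primary_language="Java"))
```

Keeping the language separate from the keywords makes it easy to run the same technology search once per language and compare the results.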

I wanted to see members who have used Java a lot, not just occasionally. Unfortunately, there seems to be no straightforward way to search for those. (Google’s Numrange – searching for a range of numbers, which could be of help – hasn’t performed that well lately).

To preview and compare the number of repositories in a given language vs. the total number of repositories, we can try something like

results for repositories site:github.com inurl:tab.repositories inurl:language.php

Public Emails Are Gone

Perhaps in response to some indiscriminate Recruiters who mass-email its users, a few months ago Github removed email addresses from its public profiles. We can still get results for something like

“gmail.com” site:github.com inurl:tab=repositories Java Spring NoSQL,

but those results will eventually be gone. We can appreciate the volume of the newly indexed profiles by looking at

sign in to view email site:github.com inurl:tab=repositories.

To see the “public” email addresses, we now need to be logged in. However, Github membership is free so it’s not a huge problem.

That’s it for now. In a future post, I will cover some aspects of the internal GitHub search, beyond the documented operators. Social List subscribers should expect several extra search facets in the Github Agents to be added shortly.