Interesting: Two File Types in Google Images #OSINT

booleanstrings | Boolean, Google, OSINT

It is new – Google Image search responds to the filetype: operator for document types such as PDF, DOCX, or PPTX.

security conference attendee list director vp filetype:pdf

This is cool because Images is a separate database – its results ranking is different, and extra results may surface. And you can preview the images before opening the results.

It is possible to combine the image search in documents with other parameters, such as image color, size, type, etc., available in the Advanced Image Search. For example, here are large blue org charts at banks: filetype:pdf orgchart bank.

Before the change, you could use the operator filetype: in Images looking for image types such as PNG, JPEG, BMP, ICO, GIF, etc. – you still can; for example, bank “new york” filetype:ICO.

Now, Google will respond to TWO filetype operators to search for images of a given type in documents of a given type: filetype:pdf filetype:png butterflies.

I sense that it is a work in progress; Google may be re-implementing its “file types” found in search. (We have seen some early signs before.) This search for two different file types in the “main” Google now brings results: filetype:pdf filetype:html healthcare providers. (Expect another post. 😉 )
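
The query patterns above are easy to generate in code. Here is a minimal sketch, assuming that Google Images can still be reached through the `tbm=isch` URL parameter (an assumption about Google's URL scheme, not something stated in this post). The helper names `build_query` and `images_search_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_query(keywords, filetypes):
    """Prefix the keywords with one filetype: operator per requested type."""
    operators = " ".join(f"filetype:{ft}" for ft in filetypes)
    return f"{operators} {keywords}".strip()


def images_search_url(query):
    """Build a Google Images URL; tbm=isch is assumed to select Images."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "isch"})


# Images of a given type (png) inside documents of a given type (pdf).
query = build_query("butterflies", ["pdf", "png"])
print(query)
print(images_search_url(query))
```

Whether Google honors both operators may change at any time, so treat the generated URLs as something to test by hand.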

It is interesting and worth some research time!


Check Your Assumptions for LinkedIn Connections’ Search

booleanstrings | Boolean, LinkedIn

LinkedIn connections and connection levels are as old as LinkedIn itself. The underlying idea of the Social Network is based on connections!

You would think that LinkedIn (founded in 2002) has figured out the search for connection levels by now. But if you trust connection search – for example, routinely search for second-level members to connect with – you are in for a disappointment: connection search has multiple issues. Here is a summary; these things (or rather, bugs) are good to keep in mind.

A. Search for each connection level vs. no level selection is not the same

You might think selecting each connection level – 1st, 2nd, and 3rd – will provide the same results as selecting no levels. (Based on the definitions, you would be right.) But it is not so: compare a search for everyone with a search for everyone with all connection levels selected. The latter search misses about 1% of profiles – and not necessarily ones out of your network.

B. Some members are not found by connection level search

This is a screenshot showing a result – a legitimate 2nd connection – which disappears when any (or all) connection levels are selected. The hiding effect appears to depend both on something about the member who matches the search and on the person who is searching.

That might explain why selecting connection levels results in a smaller number of profiles.

C. Third and out-of-network search returns second connections

I am sure you have experienced that. The “top” results of the third+ connections search are mostly second connections for me (ones with whom I have connections in common), not the third.

D. Some 3rd level connections are displayed as the second

But there are no common connections:

 

E. Some 2nd level connections are displayed as third

I invite you to find an example for this one 🙂

What about the first-level connections? Something is going on there as well.

It helps to verify our assumptions in sourcing!

P.S. Check out our latest sourcing wisdom about LinkedIn in the LinkedIn Solved class recording.


Custom Search Engines’ Filters – Gitlab Example

booleanstrings | Boolean, OSINT

Gitlab, like Github, is a platform providing a Version Control System, allowing Developers to collaborate on writing code. It claims to have 30+ MLN profiles, quite comparable to Github’s 60+ MLN. There are some differences in functionality and user profiles.

(If you source for Developers, the below search syntax examples might help in finding results – scroll down to see the Gitlab CSE operator syntax. But most importantly, the Gitlab example demonstrates a powerful sourcing technique – filtered search in CSEs).

Gitlab profiles offer a rich Schema.org’s Person structure, allowing you to X-Ray it in a filtered way using (a favorite subject) Custom (Programmable) Search Engines (CSEs). The structure even has a filter for emails!

If you are not familiar with filtered search available through CSEs on sites that support Schema.org’s structure, here is, in brief, how it works.

Web page creators can put some “meta” code within the page source code – not for rendering but for telling Googlebot what “objects” it contains – a person, an organization, or a movie, for example. The Schema standard defines the format for “communicating” those objects.

The objects embedded in meta-code may have values – say, a page may have an object Person with the value “job title” equal to “developer.” CSEs provide a way to search for that structure and values – for example, you can search for all people with the title “developer” on a particular site (if its structure supports it). I.e., you are getting awesome search abilities: now you can search for results based on values such as job title, company, location, and more, depending on the site to X-Ray.

David Galley‘s and my book Custom Search Engines – Discover more: is out (yay!) and is the only book about CSEs. It talks about the subject in-depth. (Make sure you get an electronic copy where you can click on links). So will our class.

Join us for an interactive class, Become a Custom Search Engines Expert, on June 29-30 (Tue-Wed) 2021, and get going with CSEs in just two days! The presentation will be of interest to Recruiters, Sourcers, OSINT Researchers, and anyone who does research using Google.

What about Gitlab? Here is the full list of its CSE search operators – use them in the GitLAB CSE below (or build your own CSE).

  1. more:p:hcard-fn:<name>
  2. more:p:person-role:<role>
  3. more:p:person-worksfor:<organization>
  4. more:p:person-location:<location>
  5. more:p:person-email:<email>
  6. more:p:metatags-og_description:<bio>
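
The operators in the list above can be combined with plain keywords in one query string. Here is a minimal sketch that assembles such a string. The dictionary keys and the function name are hypothetical labels, and the sketch accepts single-word values only, because how multi-word values should be written is not covered here.

```python
# Operator prefixes copied from the list above; the keys are hypothetical
# labels chosen for this sketch.
GITLAB_OPERATORS = {
    "name": "more:p:hcard-fn",
    "role": "more:p:person-role",
    "organization": "more:p:person-worksfor",
    "location": "more:p:person-location",
    "email": "more:p:person-email",
    "bio": "more:p:metatags-og_description",
}


def gitlab_cse_query(keywords="", **filters):
    """Join free-text keywords with operator:value filters into one query."""
    parts = [keywords] if keywords else []
    for field, value in filters.items():
        if field not in GITLAB_OPERATORS:
            raise KeyError(f"unknown filter: {field}")
        if not value or any(ch.isspace() for ch in value):
            # Multi-word values are deliberately out of scope for this sketch.
            raise ValueError("this sketch handles single-word values only")
        parts.append(f"{GITLAB_OPERATORS[field]}:{value}")
    return " ".join(parts)


print(gitlab_cse_query("python", role="developer", location="berlin"))
```

Paste the resulting string into the search box of a CSE that covers Gitlab profiles.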

GitLAB CSE

Here are the values you can search for and extract using CSE APIs – to do so, write some code or use our tool, Social List:
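
If you choose to write some code, it might look like the following sketch. It assumes Google's Custom Search JSON API, where each result item carries a `pagemap` with the structured data. The field names inside `pagemap` (`person`, `hcard`, `role`, `worksfor`, and so on) are inferred from the operator names above and should be checked against a real response. The sample response, API key, and engine ID are made up for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def api_url(api_key, engine_id, query, start=1):
    """Build the request URL for one page of results."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_people(response):
    """Pull Person values out of a parsed JSON response (a dict)."""
    people = []
    for item in response.get("items", []):
        pagemap = item.get("pagemap", {})
        # Each pagemap entry is a list of objects; take the first, if any.
        person = (pagemap.get("person") or [{}])[0]
        hcard = (pagemap.get("hcard") or [{}])[0]
        people.append({
            "url": item.get("link"),
            "name": hcard.get("fn"),
            "role": person.get("role"),
            "organization": person.get("worksfor"),
            "location": person.get("location"),
            "email": person.get("email"),
        })
    return people


# Made-up response fragment, shaped the way the sketch assumes.
sample = {"items": [{
    "link": "https://gitlab.com/example-user",
    "pagemap": {
        "hcard": [{"fn": "Example User"}],
        "person": [{"role": "Developer", "location": "Berlin"}],
    },
}]}
print(api_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "more:p:person-role:developer"))
print(extract_people(sample))
```

Fetching the URL and parsing the JSON are left out so the sketch runs without credentials.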

Search in Business vs. Recruiter (Guess Who Wins)

booleanstrings | Boolean, LIR

[Edited June 26, 2021: alas, the operators are gone for now. I hope they will be back!
Our class LinkedIn [Sourcing] Solved fully reflects the change.]

The never-documented LinkedIn search operators make people search with a personal account – Business, Job Seeker, and Basic – comparable to or better than that of the most expensive subscription, LinkedIn Recruiter (LIR). Let’s go over a comparison, operator by operator.

headline:

The operator works in LinkedIn Business (and Basic). The search for the headline is absent in Recruiter – and altogether, “officially” on LinkedIn. It is such an omission!

Example: Members who are open to work:  (headline:open OR(headline:looking)) (work OR(job))

Who wins: Business

skills:

With a Business account, you can search for the skills that members enter. In Recruiter, it is impossible: Recruiter looks for “assumed” skills, picking keywords from profiles. It is not exactly like keyword search but is close – see an example below. (How is it valuable?)

Example – Business vs. Recruiter

Members with the skill including the word “lazy” – skills:lazy. It shows fewer than 500 profiles that have Lazy Eye Treatment or Lazy Loading (a term in Computer Science) skills. Some also said they are lazy, but they are in a small minority. 🙂

In Recruiter, we get 30K+ results for the skill “lazy”(feel free to laugh!).

Searching “lazy” as a keyword shows 49K+ results. I.e., Recruiter “assigns” a “skill” to over 60% of members who have the keyword somewhere on the profile – for example, work at the “Lazy Dog” restaurant. How Recruiter decides to “promote” a keyword to a “skill” is, of course, a mystery.

Who wins: Business

school:

Example: school:harvard

Search by the school name is absent in Recruiter.

Who wins: Business

fieldsofstudy:

Example: fieldsofstudy:101001 (101001 is the code for “Political Science and Government”)

The Field of Study is available in Recruiter.

Who wins: a tie

degree:

Well, there “should be” an operator for the degree – but, alas, it does not work (200 is the code for Bachelors).

Who wins: Recruiter

startyear: endyear: (years in school)

Example: endyear:2022

Who wins: a tie

companytype: companysize:

Example: companysize:I – people who work at companies with 10K+ employees.

Who wins: a tie

yoe: (years of experience)

(Note: LinkedIn tells us that the “years of experience” is the number of years between starting a job and now – or ending work. But it is not precisely that because some members can be found by different years of experience: yoe:3 AND yoe:6.)

In Recruiter, it is easier to search for an interval. The years are between 0 and 30. In Business, you can search for years of experience between 0 and 100.

Who wins: a tie

spokenlanguage:

The spoken language is a free-form text field on the profiles – members can enter “any” languages, ignoring the prompts for the standard ones. For example, these 11K Developers “speak” Python: spokenlanguage:python.

It is not so easy to search by the “Python” spoken language in Recruiter:

Who wins: Business

functions: seniority:

The operators in LinkedIn Business work just the same as Recruiter selections.

Who wins: a tie

Years at a company or in position:

Alas, I have not found those operators.

Who wins: Recruiter

Also, compared to Recruiter, the Business account allows Boolean search with the operators. For example, you can exclude people from a given industry. NOT (industry:104) excludes members in Staffing and Recruiting.

Who wins: Business

The end score:

  • Business – 6.5 points
  • Recruiter – 4.5 points

Please join me for the class LinkedIn [Sourcing] Solved. We will go over the operators and other less known yet powerful features of a personal LinkedIn account. There is a limited number of seats remaining, and the presentation sold out the first time – sign up now.

10 Alternatives to InMail

booleanstrings | Boolean

You can reach prospects who ignore your LinkedIn InMails in at least ten different ways.

I am sure you are familiar with some. How many of the following methods do you use in practice? (Please feel free to comment.)

  1. Email. If an email address is associated with a LinkedIn profile, your message will land in the same email Inbox as an InMail. However, the email comes from you, while InMail has a linkedin.com return address. Gmail users may get your InMail in the “Social” category, which they never check. Others ignore everything LinkedIn sends! Email does not have that problem.
  2. Call.
  3. Text.
  4. Invite to connect. LinkedIn has introduced a weekly limit for invitations – however, if you upload a list of emails, the limit does not apply. (Besides, uploading will let you see which profiles are associated with which addresses – big deal!)
  5. Ask a common connection to introduce you.
  6. Find someone with whom you have a LinkedIn Group in common, go to the group, and send a message there. Group messaging is no longer limited.
  7. Source in LinkedIn Events, and you can message everyone. If you organize an event, you can even download a list of attendees along with their emails! To do so, you need to create an event as a company and set the registration “via LinkedIn.” If you search for people with security clearance or other “elusive” qualities (meaning keywords absent from profiles), Events are a promising channel.
  8. Message on any site that allows it – Facebook/Messenger, Reddit, XING, Slack, you name it.
  9. @mention them on any site that allows it. It is a semi-private way to reach out on Twitter.
  10. Interact with their posted content and comments.

Learn how to find contacts in our popular workshop next week! June 16-17 (Wed-Thu). Seating is limited.

Google and LinkedIn Speak Different Boolean

booleanstrings | Boolean, Google, LinkedIn

Google and LinkedIn are two sites where sourcers spend most of their time. Both support Boolean search. Yet it works in very different ways.

For starters, the search is not really Boolean (we can call it pseudo-Boolean).

Google finds synonyms to all terms entered without quotation marks. A search like backend java engineer -engineer returns results while “formally” it should not.

The order of words matters, too; Google attempts to interpret words following each other as phrases. Searching for a sentence from a public page even without the quotes will return that page on top.

Google also tunes the results to the perceived intent of the searcher based on the whole string. That is the definition of semantic search. Searches for Apple pie and Apple salary will show the results you expect.

If you want to control the interpretation, you can put keywords and phrases in quotes or use the Verbatim option. But in many cases, Google’s semantic features are trustworthy.

The same search on LinkedIn in Keywords also (unexpectedly) produces results: backend java engineer -engineer. When the Keywords contain a job-title-sounding sequence of words, LinkedIn interprets it as a title past or present and looks for similar titles. Unfortunately, LinkedIn’s interpretation may not match ours – and it is even less intuitive in Recruiter. For example, LinkedIn “thinks” that an Executive Assistant to CEO and CEO are “similar.”

So LinkedIn may add synonyms, but you can’t rely on it to do so as well as Google. LinkedIn Recruiter offers to select titles or sets of titles – but LinkedIn’s standardized values for titles, as well as companies, schools, skills, etc., are limited.

On LinkedIn, you want to avoid the interpretation and control the search using as many synonyms as you can come up with – sr OR snr OR senior, developer OR engineer OR coder, etc.

Choose to enter search strings vs. standardized selections. Text searches will perform better and are easier to adjust. Use our Boolean Builder tool to create LinkedIn-friendly strings (make a copy of the document).
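
As an illustration of the synonym approach, here is a minimal sketch that turns lists of synonyms into one Boolean string. It is not the Boolean Builder tool; the function names are hypothetical, and joining the groups with AND in parentheses is an assumption about how you want to combine them.

```python
def or_group(terms):
    """Join synonyms with OR, quoting any multi-word phrase."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def boolean_string(*groups):
    """AND together several OR groups of synonyms."""
    return " AND ".join(or_group(g) for g in groups)


print(boolean_string(
    ["sr", "snr", "senior"],
    ["developer", "engineer", "coder"],
))
```

Keeping the synonyms in lists makes the string easy to adjust when a search comes back too narrow or too broad.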

Bottom line:

  • On Google, search simply.
  • On LinkedIn, use long Boolean search strings to cover every possibility in every field.

Are you guilty of using only LinkedIn when you search for potential candidates? Join us for the class Sourcing without LinkedIn – coming up shortly!


Diversity Filter Coverage – Women’s Names

booleanstrings | Boolean, Diversity

Diversity Sourcing is not easy. There is no clear way to search for diversity categories on social sites or Google. Our less-than-perfect but necessary approach consists of “shortcuts” – ways to search that are likely to bring up groups of potential diversity candidates. As an example, Jonathan Kidder has a collection of diversity Boolean strings. Glen Cathey’s blog offers quite a few approaches. (In addition to using the shortcuts, we sift through “everyone” in professional search results and notice other profiles of interest.)

It makes sense to combine all possible “shortcuts” – such as the search for female names, associations, schools, etc., for the most inclusive approach. It also makes sense to get an idea of how large a population every approach covers.

Here is some research on finding professional women by the common first names in the US. (For other countries, this exact approach may or may not be possible – I am sharing it as an example).

First, I got a list of the 1,000 most common female names in the US by Googling. I also found lists from the Social Security Baby Names page. In the end, the results for the names from both sources were similar. (But the popularity of the names has varied over the years.)

I did the research on LinkedIn, so it is affected by the data LinkedIn has – but it is LinkedIn that we use to search, so these conclusions are not affected by people (many!) who are not members.

I created a 1,000-long OR name string and tried it in LinkedIn Recruiter. While it was “too much” for it to search, it did show the numbers.
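
Since the full string was too much for Recruiter to run, the name list can be split into smaller OR strings. Here is a minimal sketch; the portion size is a hypothetical parameter, because the post does not say how long a string LinkedIn will accept, and the sample names are placeholders.

```python
def or_chunks(names, size):
    """Yield OR strings that each cover at most `size` names."""
    for i in range(0, len(names), size):
        yield " OR ".join(names[i:i + size])


# Placeholder names; a real run would load the full 1,000-name list.
names = ["Mary", "Patricia", "Jennifer", "Linda", "Elizabeth"]
for portion in or_chunks(names, size=2):
    print(portion)
```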

In the US, LinkedIn has 49% female members (compared to 43% worldwide) and a total of 170M+ members, i.e., it has 83M+ women. Our OR name search has found 73M+ results. So, a search for 1,000 common names (which you would have to do in portions to get results) amounts to 87% of women in the US. (If you were wondering about 1,000 common men’s names, the percentage is even higher.)
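
The arithmetic behind these numbers can be checked with a few lines. This sketch uses the rounded figures quoted above (170M members, 49% female, 73M results), so the output is only as precise as those inputs; the function names are hypothetical.

```python
def estimated_women_millions(total_members_m, female_share):
    """Estimated number of women, in millions."""
    return total_members_m * female_share


def coverage(found_m, population_m):
    """Share of the population that the search results cover."""
    return found_m / population_m


women = estimated_women_millions(170, 0.49)
print(f"Estimated women on LinkedIn in the US: {women:.1f}M")
print(f"Name-search coverage: {coverage(73, women):.1%}")
```

The result is a little under 88%, consistent with the roughly 87% quoted above given that the inputs are rounded “+” figures.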

The results are affected by LinkedIn’s interpretation of first names, which cannot be turned off – we cannot search “verbatim” (either in Recruiter or in a business account). It affects our results in good ways since we will also see nicknames and variations. However, some names (like “Andrea”) can be both men’s and women’s, and LinkedIn’s variations will include “Jerry” and “Gerry” if you search for “Geraldine”. That introduces false positives. But, examining results by adding extra filters, I can tell that the percentage of those is small. (We look at each result when sourcing anyway.)

There are name variations across locations. Here in the San Francisco Bay Area, the population is diverse ethnically, which results in fewer “American” names found – more like 50%, but it is still a high percentage. (To increase it, we can add lists of ethnic names.) If you narrow to an industry – for example, Software – the numbers will go down – but it only reflects the uneven numbers of men and women in the industry.

If you compare the high percentage we got with numbers for other filters (such as “she” or “her”, schools, and memberships), it becomes clear that name search is powerful. The conclusion is – for the US, female name search is an excellent filter. Just make sure you include all 1,000 names and search in other ways as well.

Is your team sourcing for Diversity? Join us for the Certified Diversity Sourcing Professional (CDSP) Program August 2021! (June is now sold out.)


LinkedIn Search Solved

booleanstrings | Boolean, LinkedIn

Searching for professionals on LinkedIn.com? At this time, business and personal users have the most powerful – but not officially documented – search methods, filters, and cross-referencing ability, exceeding those of (the expensive) Recruiter.

You can search for unique filters such as headlines and self-entered skills, and combine other filters such as company size, company type, years of experience, or years at school in Boolean expressions. The Boolean “limitations” can be overcome with modified search syntax. You can upload and cross-reference up to 10K (!) email addresses vs. Recruiter’s 200.

The extra sourcing power is not documented in LinkedIn’s Help.

Take a look at this comparison and join our webinar to learn all about LinkedIn Sourcing.

Search Filter/Account Type Basic or Business RPS and LinkedIn Recruiter
First/Last Name X X
Network Relationship X X
Industry X X
Headline X*
Current Title (Boolean) X X
Current Company (Boolean) X X
Current Company (Checkbox) X X
Past Company (Boolean) X
Past Company (Checkbox) X X
Years at Current Position/Company X
Years of Experience X* X
Company Size X* X
Company Type X* X
School (Boolean) X
School (Checkbox) X
Years of Study X* X
Field of Study X* X
Degree X
Profile Language X X
Spoken Language X* X
Self-entered Skills X*
Calculated Skills X
My Groups X* X
All Groups X
Location (by place name) X X
Location (zip, radius) X
Connections of X
Seniority X* X
Job Function X* X
Recently Joined X

* use the hidden search operators

Search for Group Members (to Message)

booleanstrings | Boolean

LinkedIn used to limit messages to your Group members to 15 per month. This restriction is gone. If you have a basic or business account, you can message fellow group members without restrictions.

However, Group member search only offers finding people by name. How do you find group members who match a professional requirement?

Here is a way I came up with. You can do an advanced LinkedIn people search for your filters. One of the filters is “connections”, 1st, 2nd, and 3rd level. There is no option to search for group members. To do so, all you need to do is add &network=["A"] to the end of the search URL. Add it after you have searched for other values – searching for your group members at the start will break the other filters.

Here is an example search with some filters that would show people you can message – 1st connections and group members.

(BTW, if you want to see a “clean” URL, use a URL decoding tool like this. The above search URL will become more readable, like this: https://www.linkedin.com/search/results/people/?currentCompany=["1441"]&geoUrn=["102095887"]&network=["F","A"]&title=manager. Here, &network=["F","A"] stands for the first level connections and group members.)
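
The URL manipulation described above can also be sketched in Python, using only the standard library. The function name `with_network` is hypothetical. The codes “F” (first-level connections) and “A” (group members) are the ones given in this post, and LinkedIn may change how it reads them.

```python
import json
from urllib.parse import parse_qsl, unquote, urlencode, urlsplit, urlunsplit


def with_network(search_url, codes):
    """Return the search URL with its network parameter set to `codes`."""
    parts = urlsplit(search_url)
    # Keep every existing filter except a previous network value.
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k != "network"]
    # LinkedIn expects a JSON-style list such as ["F","A"]; it goes last.
    params.append(("network", json.dumps(list(codes), separators=(",", ":"))))
    return urlunsplit(parts._replace(query=urlencode(params)))


base = ("https://www.linkedin.com/search/results/people/"
        "?currentCompany=%5B%221441%22%5D&title=manager")
encoded = with_network(base, ["F", "A"])
print(encoded)           # percent-encoded, ready to paste into the browser
print(unquote(encoded))  # the readable, "clean" form
```

Run your other filters in LinkedIn first, copy the resulting URL, and only then pass it through the function, in line with the advice above.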

It takes a number of clicks to send a message to someone in the search results: go to their profile, where you will see the groups you have in common; then go to the group, find the person, and message them.

But it beats InMails because it is free.

Combined with the search operators, it makes a business account an excellent option for sourcing. You can also learn to find members’ contact information in the upcoming class Find Anyone’s Contact Info on May 13, 2021.

Be Negative. Find More

booleanstrings | Boolean

Can being “negative” help in sourcing? I do not mean to suggest that you will source better results when you are in a bad mood, voice dissatisfaction, or upset others. This is an essay on the Boolean “NOT” logic.

When searching, the productive approach we teach is to imagine the “right” terms you will find and put those terms and variations into the search field(s). We call it “visualize success.”

But many social network members do not follow standards in entering their profile data and forget to include “our” keywords. A Boolean “NOT campaign” is a way to dig deeper and find them.

Search, negating some seemingly necessary keywords or titles, and see which other relevant terms and results show up. Use the newly found terms to iterate the search. This approach discovers members who have not used the “right” keywords on the profiles but are worth reviewing.

For example, you might be struggling to locate a “purple squirrel” with rare coding skills and the title developer OR engineer. Try, in addition, searching for NOT developer NOT engineer and the skills. You will be finding people with the titles lead, coder, technical staff, abbreviations like MTS, etc., some “creative,” and even misspelled titles. (The latter helps if you decide not to hold misspellings against potential candidates.)

The following example search is for two obscure programming languages. Given the scarcity of the talent pool, it may help to search like this:

(malbolge OR lolcode) (NOT title:(developer OR engineer))

The results of such a search will likely include profiles that your competition will miss. (Companies that name their employees in non-standard ways like Technical Yahoo do a good job of protecting them from being sourced!)

Iterate. If most results come from several companies, exclude those companies. Or, if most people with the skills reside in a few locations, exclude them and search again:

aws architect NOT geo:"san francisco" NOT geo:"new york"
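
When you iterate like this, the exclusions pile up quickly, so it may help to generate the strings. Here is a minimal sketch; the function name is hypothetical, and the title: and geo: operators are copied from the examples above and may not be available on LinkedIn at any given time.

```python
def not_campaign(must_have, excluded_titles=(), excluded_geos=()):
    """Build a search string that keeps the skills and negates the rest."""
    parts = [must_have]
    if excluded_titles:
        parts.append("(NOT title:(" + " OR ".join(excluded_titles) + "))")
    for geo in excluded_geos:
        parts.append(f'NOT geo:"{geo}"')
    return " ".join(parts)


print(not_campaign("(malbolge OR lolcode)",
                   excluded_titles=["developer", "engineer"]))
print(not_campaign("aws architect",
                   excluded_geos=["san francisco", "new york"]))
```

After each run, add the companies, titles, or locations that dominate the results to the exclusion lists and search again.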

NOTs are also necessary for exploratory research, the initial and ongoing part of every sourcing project, and one of the six core skills we test. For example:

  • By negating the desired title, you will find other possible titles that you can use in your searches.
  • By negating the desired skill, you will find people with comparable skills and learn more terminology.
  • By negating the seniority (excluding directors and managers) but using words pointing to someone in charge – as simple as managed or “in charge” – you will find others. And you will learn what managers call themselves in some cases – some companies have their own “sets” of job titles.
  • You will also find out what the market is like.

The above applies to LinkedIn and LinkedIn Recruiter. But NOTs also help in Googling, and in finding sites to X-Ray in particular. If you are Googling for sites to source from, search for the terms and start excluding sites appearing in the results one by one, like -site:linkedin.com -site:facebook.com, etc. You will find useful, targeted sites to source from. Dan Russell of Google likes to write about this. The search engine Millionshort does this type of “thinking,” showing results you may never see unless you make an effort.
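
The one-by-one site exclusion can be sketched as a small helper that takes the result URLs you have already seen and appends a -site: operator for each domain. The function names and the sample URLs are hypothetical; stripping only a leading “www.” is a simplification.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the host name without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def exclude_seen(query, result_urls):
    """Append -site: operators for every domain already seen in results."""
    domains = []
    for url in result_urls:
        domain = domain_of(url)
        if domain and domain not in domains:
            domains.append(domain)
    return " ".join([query] + [f"-site:{d}" for d in domains])


seen = [
    "https://www.linkedin.com/in/someone",
    "https://facebook.com/some.page",
    "https://www.linkedin.com/in/someone-else",
]
print(exclude_seen("malbolge developer", seen))
```

Feed the new query back into Google, collect the next batch of result URLs, and repeat until targeted sites start to surface.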