LinkedIn-Based IQ Test

booleanstrings | Boolean, LinkedIn | 21 Comments

Whether Boolean, Semantic, or Machine Learning search wins the global search quality competition remains to be seen. But a search system’s quality, by definition, depends on how well users can get the results they want. For that, a user’s understanding of what to expect in the results is important (I hope you agree with the last statement).

Many modern systems have advanced search syntax, at a minimum supporting Boolean logic and quotation marks for phrases. Systems also do some interpretation of the user’s intent when searching. For example, we can expect Google to: a) check for misspellings and offer corrections; b) include synonyms for keywords that are not in quotation marks, and for abbreviations; c) try to find pages where the keywords are close to each other and in the same order, and rank those pages higher.

On ZoomInfo, we can expect that searches for VP and Vice President will return the same results.

On LinkedIn, however, there has been little interpretation of searches. It has had some title abbreviation recognition (VP = Vice President) on and off, working differently in Recruiter and in personal accounts, and it’s rather unclear where that stands. We’ve also long noticed that LinkedIn interprets people’s names: searches for Bob and Robert return similar (though not the same) results.

In any search, we expect that:

  1. If we add a keyword, the number of results goes down (or at least does not go up)
  2. If we add a condition, the number of results goes down (or at least does not go up)
  3. If we don’t use quotation marks, the word order does not matter.
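
These three expectations are easy to state as properties of a plain implicit-AND keyword search. The sketch below is a toy model, not LinkedIn's engine: the profiles, the `search` function, and its `current_company` parameter are all invented for illustration.

```python
# Toy model of a classic keyword search; all data and names are hypothetical.
PROFILES = {
    "p1": {"text": "java developer enterprise systems", "company": "morgan stanley"},
    "p2": {"text": "morgan lee enterprise architect java", "company": "acme"},
    "p3": {"text": "python developer enterprise", "company": "morgan stanley"},
    "p4": {"text": "java engineer", "company": "acme"},
}


def search(query, current_company=None):
    """Return ids of profiles containing every query term (implicit AND)."""
    terms = set(query.lower().split())
    hits = set()
    for pid, profile in PROFILES.items():
        words = set(profile["text"].split())
        if not terms <= words:
            continue  # a term is missing, so the profile does not match
        if current_company is not None and profile["company"] != current_company:
            continue  # the extra condition filters the profile out
        hits.add(pid)
    return hits


# 1. Adding a keyword can only shrink (or keep) the result set.
assert search("java enterprise") <= search("java")
# 2. Adding a condition can only shrink (or keep) the result set.
assert search("java", current_company="morgan stanley") <= search("java")
# 3. Without quotation marks, word order does not matter.
assert search("enterprise java") == search("java enterprise")
```

A search system that interprets the query before running it can break any of these three properties, which is what the searches in this post are about.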

So now, let me present you with an “IQ test”, based on the following searches, which produce unexpected results.

Question for you: what is LinkedIn “thinking” (i.e. what is the internal logic, where does any interpretation come in) when it produces these results? (And yes, it’s not what a user would expect.)

Email me the answers (or hypotheses) about why it works like that, or post them in the comments. The first few correct responses will get a ticket to the upcoming LinkedIn webinar (sold out for this week, but we have scheduled a new session).

Comments 21

  1. I’m going to go with the first thought that came to me: “Morgan Stanley Enterprise Java” is natural language, creating a string representing how one would express the inquiry. Although, not to be unfair to LinkedIn, I’d be surprised if they supported natural language.

    Interested, as well, in your replies, Irina.

  3. Hi Irina, it looks like only Lenore and I have commented, so it seems that you have done a great job stumping us! Please advise … what is LinkedIn “thinking” (i.e. what is the internal logic, where does any interpretation come in) when it produces these results …
    Many Thanks,
    Mike

  4. In short, LinkedIn is reading keywords in sequence.
    Let me know if I need to provide details.

    Waiting for your feedback.

    1. Post Author
      1. Hello Irina,

        I sent you an email with my answer. It might not be a good explanation, but I tried 🙂

  5. Hi Irina. Fascinating stuff!

    Ok, what I think so far is that LinkedIn is trying to interpret the query by matching terms against its field-specific indexes, first for personal names (first name or last name) and then for companies. It is also using proximity for this, hence morgan stanley is recognised as the appropriate company name when the words are together, but not when they are separated. Word placement seems to matter too, in that the first word in the string is more likely to be interpreted as a personal name or a company name than as a keyword. This seems to be predicated on the fact that (presumably many) people will be using the Universal Search Box to search for a person at a known company, i.e. “I want to find Jane Doe at Google”, so I am going to type in Jane Doe Google. Possibly the algorithm works so that if the first word(s) are not in the name index, then the company index is checked. E.g. if morgan is first (without stanley adjacent to it), then it is interpreted as first name OR last name Morgan. If enterprise is first, that is not a name, so it is interpreted as a company.

    When LinkedIn is interpreting a company name, it looks for PAST OR CURRENT Company, but when you use the Company= command it is limited to CURRENT Company only. E.g. your example:

    enterprise AND company=morgan stanley 24 results
    is interpreted as
    PAST COMPANY: enterprise
    CURRENT COMPANY: morgan stanley

    If you were to try

    enterprise morgan stanley 147 results
    is interpreted as
    PAST OR CURRENT COMPANY: enterprise OR morgan stanley

    I haven’t quite figured out at what point terms are being interpreted as keywords, so I look forward to hearing your full explanation!
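
One way to picture the hypothesis in this comment is a small position-sensitive parser. This is only a sketch of the guess, not LinkedIn's actual logic: `NAME_INDEX`, `COMPANY_INDEX`, and `interpret` are hypothetical, and the sketch does not try to reproduce the result counts or settle when terms fall back to plain keywords.

```python
# Hypothetical lookup tables standing in for LinkedIn's name and company indexes.
NAME_INDEX = {"morgan", "jane"}
COMPANY_INDEX = {"morgan stanley", "enterprise", "google"}


def interpret(query):
    """Split a query into name, company, and keyword parts, as guessed above."""
    tokens = query.lower().split()
    parsed = {"name": [], "company": [], "keyword": []}
    i = 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and pair in COMPANY_INDEX:
            parsed["company"].append(pair)  # adjacent words form a company name
            i += 2
            continue
        token = tokens[i]
        if i == 0 and token in NAME_INDEX:
            parsed["name"].append(token)  # a leading word is tried as a name first
        elif i == 0 and token in COMPANY_INDEX:
            parsed["company"].append(token)  # then as a company
        else:
            parsed["keyword"].append(token)  # everything else is a plain keyword
        i += 1
    return parsed


print(interpret("morgan stanley enterprise java"))
print(interpret("morgan enterprise java"))
print(interpret("enterprise java morgan"))
```

Under these assumptions, morgan is read as a company only when stanley follows it, as a name when it leads the query alone, and as a keyword anywhere else.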

    1. Post Author

      Rachel,

      Thanks for your thoughts.
      As a general comment, whatever LinkedIn is doing when it produces results like those for the searches in the post, it is NOT doing a good job. If it were, we, the end users, would be satisfied, not puzzled or fascinated!

      So we are looking to describe what the code does. We see that the code somehow does different things if it recognizes people’s names and company names. We think it might try to find “familiar” names and company names close to the beginning of the string. Those are fine first steps toward a description, but not the full answer (which nobody has provided, by the way).

      LinkedIn has not made it easy to understand what it does – not even to people who routinely search on it.

      -Irina
      P.S. To make things even more interesting, for many of these searches in LinkedIn Recruiter, Navigator, and Lite, the results are quite different.

      1. Ha, yes I am fascinated in the sense of OMG wow what is going on here?! Not in a good way 😉

        I use Recruiter Lite and always search through the Advanced interface (unless doing a very simple known person search) hence hadn’t seen this before.

        The combination of the counter-intuitive way this works compared to most search interfaces, and the fact that there is no real transparency around the interpretation, makes it very difficult for users. Although, to be fair, when you type a single word you do get a prompt of options, e.g. if you type morgan there will be a prompt for ‘People who work at Morgan Stanley’ among other prompts for jobs, groups, or people from your network with Morgan as a first name. So the interpretation is somewhat evident here, although not for more complex searches.

        I totally agree regarding getting different results on different LinkedIn platforms. I recently ran some test searches on Recruiter Lite and Sales Navigator side by side and could not always get an exact match in result numbers (despite comparable Advanced search interfaces), even when searching for the ‘entire universe’ of profiles within a country for example – sometimes RL returned more, sometimes SN did.

  7. I think that in order to understand what’s happening in the searches we need to think about what Glen Cathey pointed out years ago. Search engines are trying to guess relevance not only from the keyword density on the profiles etc., but by truly moving from the “give us what we said” model to the “give us what we want” model (or at least what they think we want). We’ve seen this in the distribution of the profiles in our search results in the past, but it looks like the same approach has now appeared in the search itself as well.

    My wild guess was that LinkedIn produces these results because of how it interprets the queries. I was thinking that it’s doing something similar to what we do when we’re running semantic and proximity searches, also recognizing words that are possibly connected. Because of this, even though we didn’t use quotation marks in the original searches, the word order does matter in some cases.

    To solve the question, I tried to force LinkedIn’s search engine to give me what I said, and I started to play with adding quotation marks.

    1. morgan stanley enterprise java 14 results –> Change word order –>
    2. morgan enterprise java stanley 754 results –> Remove a keyword –>
    3. morgan enterprise java one result –> Change word order –>
    4. enterprise java morgan 46 results
    5. morgan enterprise 130 results –> Add a condition –>
    6. morgan enterprise AND company=morgan stanley 536 results –> Remove a keyword –>
    7. enterprise AND company=morgan stanley 24 results

    (note: at the time when I ran these, the second search was giving 758 and the 6th search 539 results)

    I used the second search as a benchmark because the order of the keywords doesn’t suggest that they are connected, so LinkedIn is “thinking” about a basic AND AND AND type of search.
    To test this, I added quotation marks to each word of the first search and got the same number, 758 results, with:
    – “morgan” “stanley” “enterprise” “java”
    I also tried adding ANDs to the second search and varying the order of the keywords; nothing changed, 758 results.

    To understand LinkedIn’s “thinking” in the first search I started playing around with the quotation marks, and it turns out that in order to get the same 14 results you need to use “morgan stanley”; from there the order of the keywords doesn’t matter. I got the same results with:
    – “morgan stanley” “enterprise” “java”
    – “morgan stanley” enterprise java
    – “morgan stanley” java enterprise
    – java enterprise “morgan stanley”
    – enterprise java “morgan stanley”

    Not only does the order of the other keywords not matter to LinkedIn; we also shouldn’t get too excited about the idea that LinkedIn has developed some great natural language search, since it isn’t connecting java with enterprise as technology-related keywords. It looks like it only recognizes common company names when the order of the keywords suggests one, and then it runs the search:
    – java AND enterprise AND (Current company: “Morgan Stanley” OR Past company: “Morgan Stanley”)
    To check this, I ran a separate search for java AND enterprise in the UK, first with the current company filter and then separately with the past company filter, and in total I got the same results. Just to double-check, I ran a different search, went through the same steps, and found the same things. The string was:
    – bank of america python developer
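
The check described above can be pictured with sets. This is a minimal sketch with invented profile data and a hypothetical `matches` helper; it only shows why a search with a recognized company should equal the “current company” and “past company” filtered searches taken together, not how LinkedIn implements it.

```python
# Invented sample data: keywords on the profile plus current and past employers.
PROFILES = {
    "a": {"keywords": {"java", "enterprise"}, "current": {"morgan stanley"}, "past": set()},
    "b": {"keywords": {"java", "enterprise"}, "current": {"acme"}, "past": {"morgan stanley"}},
    "c": {"keywords": {"java", "enterprise"}, "current": {"acme"}, "past": set()},
    "d": {"keywords": {"java"}, "current": {"morgan stanley"}, "past": set()},
}


def matches(keywords, company, fields):
    """Ids of profiles with all keywords and the company in any listed field."""
    return {
        pid
        for pid, p in PROFILES.items()
        if keywords <= p["keywords"] and any(company in p[f] for f in fields)
    }


keywords = {"java", "enterprise"}
current_only = matches(keywords, "morgan stanley", ["current"])
past_only = matches(keywords, "morgan stanley", ["past"])
either = matches(keywords, "morgan stanley", ["current", "past"])

# The interpreted search equals the two filtered searches taken together.
assert either == current_only | past_only
```

On real data, a person who both works and previously worked at the company would appear in both filtered searches, so the two counts can add up to more than the combined result.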

    I still can’t believe it, but it looks like for the 3rd and 5th searches LinkedIn thinks that we’re looking for a person whose first or last name is Morgan and who has the other keywords mentioned on their profile. It does this if the first keyword can’t be matched with a company name but looks like a person’s name. In that case LinkedIn automatically searches for a name and a name only; it doesn’t even try to look up that keyword in other parts of the profile.

    In the 4th search, because morgan isn’t at the beginning of the search, LinkedIn doesn’t consider it a name, so it looks for it in all parts of the profile (including company, title, and name as well). With this search I also found that, unfortunately, I got the same result as in the 3rd and 5th searches when I changed the order of the keywords but specifically asked for “morgan” in quotation marks or with AND morgan.

    In the 6th search it looks like you can “overwrite” the first or last name search if you start to look for a specific company. In this case LinkedIn will treat the “keyword which looks like a name” as a normal keyword and will look for it on the whole profile page (including name, company, title, etc.).

    Looking at the search results, LinkedIn’s interpretation of the last search is absolutely hilarious… It looks like, because the first (and only) keyword turns up in company names, LinkedIn thinks that it can’t be anything other than a current or past company (even though you’re looking for a specific company at the same time). So basically, in this case LinkedIn just listed the 24 profiles of people who currently work at Morgan Stanley in the UK and who have a past employer (or a second current employer) whose name contains the word Enterprise.

    If my hypothesis/answer is correct, I would have only one question for LinkedIn: But why?!

    1. Post Author

      Adam,

      Cool, huh?
      It looks like you have described the cases and “what they were thinking” close to reality. 🙂 I was looking for an algorithm description outside of the examples.
      Just one note, I don’t think LinkedIn does anything special about single words in quotation marks.

      Thanks!
      Irina

      1. I think that with the new LinkedIn search we can forget Boolean logic, because the search engine tries to process our search string as a whole and not keyword by keyword. It looks like the algorithm analyzes our search, and if it contains a keyword (or several keywords) matching LinkedIn’s database of companies, it automatically processes that as a company. If several keywords in the search are company names, it will process them as AND and not as OR. I tried a few other searches, and it looks like the location of the company or person’s name in the search doesn’t matter. The algorithm tries to squeeze your keywords into 3 categories (company, name, skill), because it’s trying to do some natural language processing and thinks that we humans would always search like this:
        – people who work here and do this
        – this person here who does this
        – the person who worked here and there and has experience with this
        – etc.
        Also, I think that company names have priority over person names and over skills. Try, for example, ENTERPRISE MORGAN MICROSOFT: it’s translated to COMPANY1 NAME COMPANY2, so it will find Morgans who worked at enterprise and microsoft, not Morgans who worked at enterprise and have microsoft .net experience.
        I’m just wondering what happens if the keyword is a first name and/or a company name at the same time. Also what about education? Following this logic the algorithm should process university names automatically as well.
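
The “three boxes” idea with company priority can be sketched in a few lines. This is a guess at the behaviour described in this comment, not LinkedIn's code: `COMPANIES`, `NAMES`, `classify`, and `label` are hypothetical, and the sketch handles single-word terms only (no multi-word company names such as morgan stanley).

```python
# Hypothetical dictionaries; a real system would use far larger indexes.
COMPANIES = {"enterprise", "microsoft", "amazon"}
NAMES = {"morgan", "alexa"}


def classify(token):
    """Assign one category per token, checking company before name before skill."""
    token = token.lower()
    if token in COMPANIES:
        return "company"
    if token in NAMES:
        return "name"
    return "skill"  # anything unrecognized is treated as a skill keyword


def label(query):
    """Label every token in the query, regardless of its position."""
    return [(token, classify(token)) for token in query.split()]


print(label("ENTERPRISE MORGAN MICROSOFT"))
```

Because the lookup order is fixed, microsoft is labelled a company even though it could also be a skill, which is the priority described above. A term that is both a first name and a company name would always land in the company box under this sketch.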

        Looking at the searches again:
        – morgan stanley enterprise java: company1 AND company2 AND skill; the word order doesn’t change anything, and you’ll get the same results if you search:
        – enterprise java morgan stanley: company1 AND skill AND company2
        – morgan enterprise java stanley: it may be that if our search isn’t at all similar to a natural language search, then the algorithm won’t try to squeeze it into the company, name, and skill boxes.
        This could be the reason that there is a difference between the results of 3rd and 4th search:
        – morgans who worked at enterprise and btw have java
        – enterprise people who btw have java and morgan, because there isn’t logic behind asking the question this way: enterprise folks who have java and oh btw who are called morgan.
        – morgan enterprise: name AND company, who was that Morgan who works or worked at Enterprise?

        But the last 2 searches are confusing, maybe not only for me but for the search engine as well 🙂 I can’t really understand the whys: if it’s picking up enterprise as a company in the last search, why isn’t it interpreting it as a company in the 6th search? Instead of morgan enterprise AND company=morgan stanley I tried:
        – morgan amazon AND company=morgan stanley
        – morgan aws AND company=morgan stanley
        It interpreted neither amazon nor aws as a company name; I had search results where the term was only mentioned in the skills. So it looks like, if you’re searching in specific fields and your search is longer than one keyword, LinkedIn will think that “you know what you’re doing”: the algorithm will ditch the semantic search and do classic Boolean. On the other hand, if you just run a quick one-word search, it will again try to match it to a company/skill/name. For example, amazon AND company=morgan stanley gave me the answer to “do any people who work or worked at amazon now work at morgan stanley?”

        As for the single words in quotation marks, it might be that LinkedIn is thinking “you surely have a reason for highlighting that, let me just check if that’s a company or a name by any chance”.
        I think this because these 2 searches gave the same results:
        – enterprise java “morgan”
        – morgan enterprise java

        I’m pretty sure that something happens in the background when you use an AND operator or quotation marks around a specific word to tell LinkedIn “hey, I want something over here”, for example:
        – alexa enterprise java -> 48 results
        – enterprise java alexa -> 3 results
        – enterprise java “alexa” -> same 48 results again

  8. Adam Kovacs – that’s a really interesting observation re the quotes and ANDs.

    I had assumed there was an implicit AND between all the search terms (apart from where morgan is adjacent to stanley and is interpreted as a phrase matching the indexed Company name), and therefore that adding actual ANDs wouldn’t make any difference. If you try adding ANDs between each term in all the example searches (excluding morgan AND stanley, obviously), it makes no difference to the results, other than for this one:

    enterprise java morgan 46 results

    which, when entered as enterprise AND java AND morgan, gives 1 result, because suddenly morgan is presumed to be a person’s name again. It is really hard to see any logic to this!

    Also, given that many searchers are used to using quotes for verbatim in Google, i.e. ‘don’t mess with or interpret this term please’, again we have a counter-intuitive interpretation.

    I do remember reading something, way back before the new LinkedIn interface was released, saying that the search (through the Universal Search Box) was being simplified and aligned with how ‘most people search LinkedIn’ (or words to that effect), so they have consciously designed a semantic search based on (presumably) analysing search behaviour. However, they have done some rather odd things in the process!

    1. Post Author
      1. So in summary, apart from what we can deduce from our own experiments, it seems that no one clearly understands how LinkedIn search is actually supposed to work now, and LinkedIn doesn’t clarify this for us anywhere either?
