A “Dream Software” Design Proposal

booleanstrings Boolean

Disclaimer. This post is somewhat technical and doesn’t contain specific sourcing tips. It is relevant to my SourceCon presentation, where I go over a specific kind of sourcing tool. These tools are apparently gaining attention among recruiters. I am going to post detailed reviews of the tools (listed at the bottom of this post for your reference) here on the blog over the next few weeks.

Here’s a “Dream Software” Design Proposal. The key piece of the dream software, and also the most challenging one, is connecting the parts of a distributed profile. End-users of the dream software don’t appreciate the challenge! For a human, the connection between two online profiles can be obvious, while for a computer it is not as easy, since all of the informal clues need to be formally coded.

If the dream software vendor is reasonably careful and glues profiles together only when it’s very clear that they belong together, the end-user will complain about duplicates. If the vendor boldly makes guesses, then profiles of different individuals will be incorrectly merged into one record, which is, in fact, even worse. (My coworker David Galley tests the software by searching for his own name; if you are one of the vendors, I recommend trying that search in your tool.)

In the proposed design we solve the “matching profiles from different sites” challenge upfront, by working only with unique identifiers: an email address (work or private), a phone number, a combination of a person’s name and a company name that fits only one person, or an image. If a company uses a standard email format, then for people with uncommon first and last names we can reliably construct the work email address, which can be verified with some additional logic while the record is being created. A person’s photo is an excellent identifier as well, since it is often the same across different social profiles.
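To make the email-format idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `build_work_email`, the placeholder names in the pattern (`{first}`, `{last}`, `{f}`, `{l}`), and the default `first.last` format. It only constructs a candidate address; the verification step mentioned above is not shown.

```python
import re
import unicodedata


def _slug(name: str) -> str:
    """Lowercase a name part and keep only ASCII letters (drops accents, apostrophes, spaces)."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z]", "", ascii_name.lower())


def build_work_email(first: str, last: str, domain: str,
                     pattern: str = "{first}.{last}") -> str:
    """Build a candidate work email from a company's (assumed) email format.

    The pattern placeholders are hypothetical: {first}, {last}, {f}, {l}.
    The result is only a candidate and still needs to be verified.
    """
    first_s, last_s = _slug(first), _slug(last)
    if not first_s or not last_s:
        raise ValueError("both name parts must contain letters")
    local = pattern.format(first=first_s, last=last_s, f=first_s[0], l=last_s[0])
    return f"{local}@{domain.lower()}"
```

For example, `build_work_email("José", "García", "example.com", "{f}{last}")` gives `jgarcia@example.com`. For a common name, the same address could belong to more than one person, which is why the proposal limits this trick to uncommon names.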

We start building the database with those identifiers. There are a variety of ways to collect them from the open web. As an example, we could start with recent resumes posted online and take the email addresses from them as the IDs. That alone would yield a very large number of IDs. (Remember our sourcing challenge asking “How many resumes are there on the Internet?”) There are also sites that list attendees, members, and so on, as we teach each other in people sourcing discussions and classes. There are lists of professionals with contact info in Excel and PDF files. If that’s not enough, there are obscured email addresses across email list archives and the like.
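As a rough illustration of collecting email IDs from page text, including a few common obscured forms, here is a small Python sketch. The function names and the regular expressions are my own assumptions for the example; they cover only simple cases such as “john [at] example [dot] org” and “mary at example dot net”, and a real collector would need many more patterns.

```python
import re

# Deliberately simple pattern; it does not try to cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def normalize_obscured(text: str) -> str:
    """Rewrite a few common obscured forms into plain addresses."""
    # Bracketed forms: "john [at] example [dot] org", "john(at)example(dot)org"
    text = re.sub(r"\s*[\[\(\{]\s*at\s*[\]\)\}]\s*", "@", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*[\[\(\{]\s*dot\s*[\]\)\}]\s*", ".", text, flags=re.IGNORECASE)
    # Spelled-out form, accepted only when both " at " and " dot " are present,
    # so that ordinary phrases like "meet at noon" are left alone.
    text = re.sub(
        r"\b([\w.+-]+)\s+at\s+([\w-]+)\s+dot\s+([A-Za-z]{2,})\b",
        r"\1@\2.\3",
        text,
        flags=re.IGNORECASE,
    )
    return text


def extract_emails(text: str) -> list:
    """Return unique, lowercased email addresses in order of first appearance."""
    seen = {}
    for address in EMAIL_RE.findall(normalize_obscured(text)):
        seen.setdefault(address.lower(), None)
    return list(seen)
```

Lowercasing the addresses before storing them keeps `Jane.Doe@Example.com` and `jane.doe@example.com` from turning into two different IDs.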

From the unique IDs we go to various social networks and blogs to pick up additional information by cross-referencing. We know that an email address identifies the member on all major networks, including LinkedIn, Twitter, Facebook, Google+, and more. If we can be friends with Rapportive/LinkedIn, or just with LinkedIn, we get a head start on cross-referencing. In fact, having an agreement with LinkedIn is especially important; in the worst case, if no agreement is reached, a public LinkedIn profile can be retrieved dynamically.

For any social profile that lists other profiles (this often happens on Google+, but not only there) we add those profiles to the person’s record as well. Mind you, at this point we are still confident that these are the same person’s social profiles.
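The “merge only on a shared unique ID” rule described above can be sketched as a small in-memory store. This is a toy model under my own assumptions: the class names `PersonRecord` and `RecordStore`, the `upsert` method, and the `email:` / `phone:` prefixes on identifiers are all hypothetical. The point is that two sets of data end up in one record only when they share an identifier, never because the names look alike.

```python
from dataclasses import dataclass, field


@dataclass
class PersonRecord:
    identifiers: set = field(default_factory=set)  # e.g. {"email:jane@example.com"}
    profiles: set = field(default_factory=set)     # profile URLs for this person


class RecordStore:
    """Toy store that merges data only when a unique identifier is shared."""

    def __init__(self):
        self._by_id = {}  # normalized identifier -> PersonRecord

    def upsert(self, identifiers, profiles=()):
        ids = {i.strip().lower() for i in identifiers}
        if not ids:
            raise ValueError("at least one unique identifier is required")
        # Find existing records that share any of the identifiers.
        matches = []
        for i in ids:
            rec = self._by_id.get(i)
            if rec is not None and all(rec is not m for m in matches):
                matches.append(rec)
        target = matches[0] if matches else PersonRecord()
        # If the new identifiers bridge several records, fold them into one.
        for other in matches[1:]:
            target.identifiers |= other.identifiers
            target.profiles |= other.profiles
        target.identifiers |= ids
        target.profiles |= set(profiles)
        for i in target.identifiers:
            self._by_id[i] = target
        return target

    def records(self):
        # De-duplicate by object identity, since several IDs point to one record.
        return list({id(r): r for r in self._by_id.values()}.values())
```

Profiles found listed on an already-matched profile would be added by calling `upsert` again with one of the record’s known identifiers and the newly found URLs.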

We don’t do much else. Rather, we carefully parse the info obtained by cross-referencing, collect it into our database, and provide reasonable faceted search for the end-user. Parsing can be specifically implemented for a few dozen social networks and forums (we’ll need to watch those for changes to their HTML formats). For online resumes we can rely on a resume parsing tool.
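For readers unfamiliar with the term, faceted search means showing counts per attribute value and letting the user narrow the results by picking values. Here is a bare-bones Python sketch over parsed records; the field names (`skills`, `location`), the sample data, and both function names are made up for the example, and a real product would use a search engine rather than plain lists.

```python
from collections import Counter

# Hypothetical parsed records; each facet holds a list of values.
SAMPLE_RECORDS = [
    {"name": "A", "skills": ["python", "java"], "location": ["Boston"]},
    {"name": "B", "skills": ["python"], "location": ["Austin"]},
    {"name": "C", "skills": ["java"], "location": ["Boston"]},
]


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    counts = Counter()
    for rec in records:
        for value in rec.get(facet, []):
            counts[value] += 1
    return counts


def filter_records(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        rec for rec in records
        if all(value in rec.get(facet, []) for facet, value in selected.items())
    ]
```

With the sample data, `filter_records(SAMPLE_RECORDS, skills="python", location="Boston")` leaves only record A, and `facet_counts` on the filtered list gives the numbers to show next to each remaining choice.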

If there are other proven ways (proven, not guesses) to cross-reference more social profile data from what is already collected in the records, we’d implement those as well.

While every People Sourcer and all the Dream Software tools do cross-referencing, we’ll need to be extremely careful about privacy issues and explore how to best address them.

If anyone is up for funding the proposal, just give me a ring, will you?

Thanks! I will be reviewing the existing Dream Software tools in upcoming blog posts. I will also be sharing an additional design idea for the existing tools that comes directly from my experience with LinkedIn’s Talent Pipeline.

In the meantime, please take a look at some of the tools (repeating, just in case: I am not affiliated with any):

  1. TalentBin
  2. Dice Open Web
  3. TheSocialCV
  4. Entelo
  5. RemarkableHire
  6. Gild

Please stay in touch about your experiences!