Cyborg scraping

LinkedIn has a page that shows the people who most recently followed you.

At first, it shows just 20 people. But as you scroll, it keeps fetching the rest. I’d love to get the full list on a spreadsheet. I’m curious about:

  1. What kind of people follow me?
  2. Which of them has the most followers?
  3. Who are my earliest followers?

But first, I need to scrape this list. Normally, I’d spend a day writing a program. But I tried a different approach yesterday.

Aside: it’s easy to get bored in online meetings. I have a surplus of partially distracted time. So rather than writing code to save me time, I’d rather create simple tasks to keep me occupied. Like scrolling.

So here’s my workflow to scrape the list of followers.

Step 1: Keep scrolling all the way to the bottom until you get all followers.

Step 2: Press F12, open the Developer Tools – Console, and paste this code.

copy($$('.follows-recommendation-card').map(v => {
  let name = v.querySelector('.follows-recommendation-card__name')
  let headline = v.querySelector('.follows-recommendation-card__headline')
  let subtext = v.querySelector('.follows-recommendation-card__subtext')
  let link = v.querySelector('.follows-recommendation-card__avatar-link')
  let followers = '', match
  if (subtext) {
    if (match = subtext.innerText.match(/([\d\.K]+) follower/)) {
      followers = match[1]
    } else if (match = subtext.innerText.match(/([\d\.K]+) other/)) {
      followers = match[1]
    }
  }
  followers = followers.match(/K$/) ? parseFloat(followers) * 1000 : parseFloat(followers)
  return {
    name: name ? name.innerText : '',
    headline: headline ? headline.innerText : '',
    followers: followers,
    link: link ? link.href : ''
  }
}))

Step 3: The name, headline, followers and link are now in the clipboard as JSON. Visit https://www.convertcsv.com/json-to-csv.htm and paste it in “Select your input” under “Enter Data”.

Step 4: Click on the “Download Result” button. The JSON is converted into a CSV you can load into a spreadsheet.

I call this “Cyborg scraping“. I do half the work (scrolling, copy-pasting, etc.) The code does half the work. It’s manual. It’s a bit slow. But it gets the job done quick and dirty.

I’ll share later what I learned about my followers. For now, I’m looking forward to meetings 😉

PS: A similar script to scrape LinkedIn invitations is below. You can only see 100 invitations per page, though.

copy($$('.invitation-card').map(v => ({
  name: (v.querySelector('.invitation-card__title') || {}).innerText || '',
  link: v.querySelector('.invitation-card__link').href,
  subtitle: (v.querySelector('.invitation-card__subtitle') || {}).innerText || '',
  common: (v.querySelector('.member-insights__count') || {}).innerText || '',
  message: (v.querySelector('.invitation-card__custom-message') || {}).innerText || '',
})))

PS: A similar script to scrape LinkedIn people search results is below.

copy($$('.entity-result').map(v => {
  const name = v.querySelector('.entity-result__title-text [aria-hidden="true"]');
  const link = v.querySelector('a');
  const badge = v.querySelector('.entity-result__badge [aria-hidden="true"]');
  const title = v.querySelector('.entity-result__primary-subtitle');
  const subtitle = v.querySelector('.entity-result__secondary-subtitle');
  const summary = v.querySelector('.entity-result__summary--2-lines');
  const insight = v.querySelector(".entity-result__simple-insight-text");
  return {
    name: name?.innerText || '',
    link: (link?.href || '').split('?')[0],
    badge: badge?.innerText || '',
    title: title?.innerText || '',
    subtitle: subtitle?.innerText || '',
    summary: summary?.innerText || '',
    insight: insight?.innerText || '',
  }
}))

2 thoughts on “Cyborg scraping”

  1. I think rather than create JSON, building a comma separated string and console.log-ging that string will show the whole list in the Dev Tools Console. Using the Save As option in the context menu, the list can be saved as a .CSV file directly without Step 3 & 4.

  2. Pingback: Tools to publish annotated talks from videos - S Anand

Leave a Comment

Your email address will not be published. Required fields are marked *