The great data ‘e-scrape’

Veteran cyber security advisor Eric Pinkerton explains how you can safeguard your personal data on social media platforms.

Eric Pinkerton
Wed, 15 Sep 2021

In April this year we opened our news feeds to find that, in the same week, both Facebook and LinkedIn had suffered massive data breaches affecting over 533 million and 500 million users, respectively.

Fast forward two months, and like a bad case of deja vu, our news feeds are again filled with news of another massive data breach at LinkedIn, this time affecting 700 million users.

The problem is that, whilst these incidents have frequently been heralded in the press as ‘breaches’, they are really better described as ‘data scrapes’.

So, what does that actually mean, and more importantly, does it justify the level of hysteria that some of these articles convey?

Put plainly, a breach requires an attacker to make a system do something it was not designed to do. By contrast, a data scrape is the result of an attacker making use of an existing feature in the way it was intended. The contention arises over the scale at which this occurs.

In the case of Facebook, the contacts feature of the mobile app was abused. This feature is one of the methods that Facebook uses to present you with a list of Facebook users you may know. The problem is that by creating a contact list of every possible phone number, the attacker was able to query Facebook’s servers to see if any of their ‘existing contacts’ were Facebook users.

For each match, Facebook allowed them to retrieve the full name, profile pic, home city, and anything else publicly available for the user associated with that number. The resulting database would allow them to search for a given phone number and return the name of the owner of that phone.
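To make the mechanism concrete, here is a minimal sketch of how a contact-matching feature turns into a reverse phone directory. It runs entirely against a toy in-memory dictionary; `DIRECTORY`, `match_contacts`, `build_reverse_lookup`, and every name and number in it are hypothetical illustrations, not Facebook’s actual API or data.

```python
# Toy stand-in for a "find friends from my contacts" feature.
# Every name, number, and function here is hypothetical.
DIRECTORY = {
    "5550100003": {"name": "Alex Example", "city": "Sydney"},
    "5550100007": {"name": "Sam Sample", "city": "Perth"},
}


def match_contacts(contacts):
    """Return the public profile for each uploaded number that has an account."""
    return {n: DIRECTORY[n] for n in contacts if n in DIRECTORY}


def build_reverse_lookup(prefix, count):
    """Upload every number in a block as 'contacts' and keep whatever matches."""
    candidates = [f"{prefix}{i:04d}" for i in range(count)]
    return match_contacts(candidates)


# Ten candidate numbers go in; two phone-to-name records come out.
reverse = build_reverse_lookup("555010", 10)
print(reverse["5550100007"]["name"])  # Sam Sample
```

Nothing in the sketch does anything the feature was not designed to do. The only unusual part is that the ‘contact list’ is every number in a range.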

Now it’s unlikely that Facebook’s developers ever considered a use case involving a legitimate request for every single user, and good practice would have been to employ rate limiting controls or a CAPTCHA to detect and prevent automated bulk queries of this nature.
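As a rough illustration of the rate limiting mentioned above, the sketch below caps how many phone numbers a single client can look up within a time window. The class name, the limits, and the idea of charging one unit per uploaded number are assumptions made for the example, not a description of any platform’s real controls.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` lookups per client within any `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        # client id -> timestamps of that client's recent lookups
        self._history = defaultdict(deque)

    def allow(self, client_id, cost=1):
        """Return True and record the lookups if the client is under its limit."""
        now = self.clock()
        recent = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) + cost > self.limit:
            return False
        recent.extend([now] * cost)
        return True


# Hypothetical policy: 500 contact lookups per client per hour.
limiter = SlidingWindowLimiter(limit=500, window=3600)
print(limiter.allow("client-a", cost=200))  # True
print(limiter.allow("client-a", cost=400))  # False, would exceed 500
```

A genuine address book of a few hundred contacts passes untouched, while an upload of every number in a range is refused. A determined attacker can spread requests across many accounts, so a limit like this raises the cost of scraping without eliminating it.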

  • This issue came to light because a researcher found the resulting database being offered for sale on a forum, but it’s a mathematical certainty that this function was being routinely abused for years.
  • Nobody actively publishes their phone number for all and sundry on their Facebook profile, but it’s now increasingly difficult to register on any social media platform without supplying a mobile number, often as part of a validation check, or even as a second factor of authentication.

When it was launched in 2004, ‘The Facebook’ as it was then called was simply an online version of an existing paper directory containing the names and faces of students on the Harvard Campus.

Initially, all users had access to all the information, and this was not perceived as an issue, because access was limited to other Harvard students.

That changed in 2006 when everyone with an email address was invited to join, and as the news feed was launched the first privacy issues began to surface. Users began to notice that updates and pictures they had intended to share with a small select group of friends were being shared with a much wider audience. This led to cases where vulnerable LGBTI users were unintentionally outed to their families or colleagues.

When Facebook went mobile in 2009 there was an explosion in the amount of media people were uploading and these users were afforded three options for the sharing of information: Everyone; Friends; and the innocent sounding ‘Friends of Friends’.

‘Friends of Friends’ sounds like a fun coach trip with a group of 30 or 40 carefully selected people, but it’s actually more akin to a sellout rock concert in a stadium full of strangers because the average (mean) number of friends on Facebook is 338. Multiply that number by itself and you have a group often in excess of a hundred thousand people.
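The arithmetic is easy to check. The snippet below squares the quoted mean to get a rough upper bound, under the simplifying assumption that every friend also has 338 friends and that none of them overlap; mutual friends will shrink the real figure for any given person.

```python
mean_friends = 338  # average friend count quoted above

# Upper bound: each friend brings the same number of friends, with no overlap.
friends_of_friends = mean_friends * mean_friends

print(friends_of_friends)  # 114244
```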

Of course, it’s never really been in any social media company’s interest to make it easy for users to restrict access to the content they upload.

By drip-feeding us incremental changes to privacy settings, and in many cases making these burdensome for users to navigate and adopt, social networks have engineered privacy to be hard, whilst simultaneously claiming credit for improving the problems they themselves have given birth to.

All of this created the ideal conditions for Cambridge Analytica’s app ‘This Is Your Digital Life’. The app asked a series of questions in order to build out a psychological profile of the user, but in doing so it was able to collect the personal data of that user’s friends using Facebook’s Open Graph platform.

This enabled CA to harvest the data of up to 87 million users, data which the users themselves had explicitly elected not to make public.

Today there exists a wealth of data about each of us that few of us can begin to comprehend, and yet we all actively consented to it being shared with ‘select partners’ because we blindly clicked on ‘I Agree’ at the bottom of a 30-page end user licensing agreement when we signed up.

This dataset includes additional information and metadata that we did not consciously provide. For example, the recent LinkedIn data scrape was purported to contain information inferred by LinkedIn relating to users’ salaries.

When this data is collected, aggregated and correlated with other data sets, its value to an attacker increases considerably.
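A small sketch shows why correlation matters. It joins two invented data sets on a shared field, so a record that started as a phone number and a name ends up with an employer and a salary band attached. The data, the field names, and the `correlate` helper are all hypothetical.

```python
# Two hypothetical scraped data sets, each fairly harmless on its own.
social_scrape = {
    "5550100007": {"name": "Sam Sample", "city": "Perth"},
}
professional_scrape = {
    "Sam Sample": {"employer": "Example Pty Ltd", "inferred_salary_band": "100-120k"},
}


def correlate(phone_records, professional_records):
    """Enrich each phone record with any professional record sharing its name."""
    merged = {}
    for number, profile in phone_records.items():
        extra = professional_records.get(profile["name"], {})
        merged[number] = {**profile, **extra}
    return merged


combined = correlate(social_scrape, professional_scrape)
print(combined["5550100007"])
```

Matching on name alone is crude and would produce false matches at scale; it is used here only to keep the example short.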

So, what if anything can individuals do about this?

Well, short of eschewing all technology and moving to a cabin in the woods, or changing your name and email every few months, there is not much that we can do.

I certainly don’t expect anyone to read and understand all of the EULAs and privacy policies we routinely agree to.

Here’s what I choose to do:

  1. I’m wary of what information I consciously upload to social media, and try to weigh up the positives against the negatives before I post;
  2. I seldom add friends on FB/LinkedIn that I do not know or have never met (especially if they have an attractive picture in their profile);
  3. I use sock puppet accounts and/or burner SIMs for anything I perceive to be risky, such as open source intelligence gathering; and
  4. Finally, I have accepted the reality that privacy is akin to physical fitness: no longer a given, but an aspiration that requires constant effort and sacrifice to attain and maintain.

Eric Pinkerton is a cyber security advisor at Trustwave.
