The rise of #Fakedata

By Rupert Burnham | October 17, 2017

Data. It’s changing the way we live and how society operates, thanks to the level of insight it can provide. However, just as the old adage says you’re only as good as the tools you use, the insight gleaned is only as good as the data it comes from. Herein lies a problem: what if the data used to extrapolate is false or misleading? If so, any insight gained could be wrong and could have serious ramifications.

Fake news abounds, with no sign of abating. In tandem, though given less prominence, reports of fake data are also on the rise, and they are potentially far more serious. In recent months we’ve learnt that Facebook has been artificially inflating its reach. It has come to light that its advertising data doesn’t tally with census data for millennials and other demographics, to the tune of millions of people. According to census data, Facebook has claimed that it can reach more people than actually exist in the UK, US, Australia, Ireland and France. In the UK, Facebook says that it can reach 7.8 million users aged between 18 and 24, though according to the Office for National Statistics there were only 5.8 million people in that age group in the whole of the country in 2016. The problem appears to be systemic and global, with similar discrepancies being found around the world in some of Facebook’s key markets.
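To put the UK figures in perspective, here is a minimal sketch of the arithmetic using only the two numbers quoted above. The function and variable names are illustrative assumptions, not anything Facebook or the Office for National Statistics publishes.

```python
# Figures quoted in this post for UK 18-24 year olds (2016).
claimed_reach = 7_800_000      # Facebook's advertised potential reach
census_population = 5_800_000  # Office for National Statistics estimate


def overshoot(claimed, actual):
    """Return the excess head count and the excess as a percentage of actual."""
    excess = claimed - actual
    return excess, excess / actual * 100


excess, pct = overshoot(claimed_reach, census_population)
print(f"Claimed reach exceeds the census population by {excess:,} people ({pct:.1f}%)")
```

On these figures, the advertised reach is roughly a third larger than the number of people the census says exist in that age group.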

Advertisers are the ones chiefly affected by this, and they are angry, especially as they are still smarting over ads running alongside fake news sites and over the social media site’s admission that it inflated the average time people spend watching videos, in some cases by up to 80%.


In a statement about this latest finding, Facebook admitted that its audience estimates didn’t match census data, but added that this was by design, as ad reach numbers “are designed to estimate how many people in a given area are eligible to see an ad a business might run. They are not designed to match population or census estimates”. It also added, “This is just an estimator and campaign planning tool. It’s not a business’ actual reach or campaign reporting, and is not billable.” That’s all well and good, but what the social media giant has done is tantamount to lying, and the implications are far-reaching. The data it has provided is inaccurate, meaning the calculations on which advertisers have based their decisions are also inaccurate. Worryingly, Facebook doesn’t seem overly concerned by its actions. That a company as large as Facebook should be so blasé about this falsehood is almost more concerning than the falsehood itself. Being as large as it is, it doesn’t even need to inflate its numbers.

This example of fake data isn’t even the worst, though. Recent reports describe a thriving ecosystem of websites that allow users to automatically generate millions of fake “likes” and comments on Facebook, as documented by researchers at the University of Iowa. Working with a computer scientist at Facebook and one in Lahore, Pakistan, the team uncovered more than 50 sites offering free, fake “likes” for users’ posts in exchange for access to their accounts, which were then used to falsely “like” other sites in turn.

But what’s the impact of this? Well, a large number of “likes” pushes a post up in Facebook’s algorithm, making it more likely that the post will be seen by more people, which in turn lends further credibility and legitimacy to the post in an escalating cycle. If, as Mark Twain said, “A lie can travel half way around the world while the truth is putting on its shoes”, then this is the hyper-accelerated version, and it is worrying. It makes it harder to be objective and to ascertain whether content is real or fake. Also of concern is that users are knowingly entering into an agreement to obtain “likes” falsely. In doing so, however, they may not realise what they’re actually giving up. It may seem a relatively harmless agreement, but they’re willingly handing over full control of their Facebook account. This means that the sites providing fake “likes” can also access all of the information that’s available on profile pages, see posts, get hold of friends lists and even read private messages. In this case there was no way of telling whether the information was being collected and sold to others, but it’s a realistic assumption.

Fake data and fake news can also combine, as we saw in last year’s US presidential election. Facebook recently admitted that it had found that an influence operation based in Russia spent $100,000 on ads promoting divisive social and political messages over a two-year period through to May. It ascertained that many of the ads promoted 470 “inauthentic” accounts and pages, which Facebook has now suspended. Rather than backing a particular political candidate (too obvious), the ads spread polarising views on topics such as immigration, race and gay rights. Apparently no links were found between the operation and any presidential campaign, three-quarters of the ads were national in scope, and the rest did not appear to reflect targeting of political swing states as voting neared.

It would be extremely hard, if not impossible, to measure exactly what impact these ads have had, but it is concerning. Such advertising foments division and blurs the line between what is real and what is false, and people may be basing their opinions and decisions on this misinformation. The reach these ads may have had is also hugely concerning. Facebook has said these ads reached 10 million of its users, but they may actually have been seen billions of times.

So what are the ramifications, and what can be done about the proliferation of fake data and news? Increasingly it is felt that the onus is on social media companies and tech giants such as Facebook and Google to ensure that the data and news they serve is accurate. Such is their position in society, and so heavily are they used by companies and individuals alike, that it could be argued they have a moral duty to society to ensure that their content is genuine. Whether this should take the form of regulation or self-governance is the subject of much heated debate.

The problem within all of this is us, the consumers. It seems to be in our nature to believe what’s reported as fact. If the data states that something is correct, then it must be so. All too often we’re happy to leave the critical analysis to others and digest what we’re spoon-fed.

Responsibility for what data is real and fake doesn’t just lie with social media companies and tech giants, though. We have a personal responsibility to ensure that the data and information that we generate and publish willingly and knowingly is also accurate. Often we are happy to sign up without knowing what information or rights we are actually giving away. It is this information that is harvested on a grand scale and used to tailor services, programs, algorithms and ideas. We sometimes purposefully alter the social media profiles that we create. Most of us are guilty of exaggerating some aspects of our profiles, making it unclear whether they’re an accurate and true reflection of us as individuals. In some instances, the viewing profiles on some platforms are based on our personal profiles. However, if these viewing profiles are based on questionable profiles and data, then the suggestions made and insights gathered will be wrong and misleading.

If we are to demand that we be presented with accurate data and news, then we should also be providing accurate data. The more data we generate, individually and collectively, the more pressing the need for checks and procedures becomes. The Data Science Universe, as is often said, has the potential to revolutionise society as we know it. But if the data is wrong, the ramifications become all the more serious. Systems may not work as intended, dubious judgements and decisions could be made, and flawed algorithms and programs could be built. In an extreme scenario, the whole edifice could come crashing down like a house of cards.

There’s a strong argument that social media platforms, much like publishers, should have to prove that the data and information they’re displaying is accurate, and that sources are rigorously checked to ensure their veracity. But if this is what’s required, then the same should apply to us as individuals.

At the very least, in the current climate, we should all be very cautious about what we base our insights and decisions on; dangers abound for those companies that hang their hats on false information. Just recently it was revealed that more than 80% of the comments submitted to a US regulator on the future of net neutrality came from bots, according to researchers, and, worryingly, most were against net neutrality.

In ‘A Scandal in Bohemia’ by Sir Arthur Conan Doyle, Sherlock Holmes said, “It is a capital mistake to theorize before one has data.” Very true, but one wonders just how much fake data the famed detective had to contend with.