There is no substitute for diligent research when a journalist fact-checks a story.
But if you, the reader, suspect an article, video or image might be fake, there are some simple tools you can use to help decide whether what you’re looking at is real or phony.
As journalists working for an organisation that verifies social media content, we spend our time monitoring the depths of the internet, trying to sort fact from fiction on behalf of news providers, including the Australian Broadcasting Corporation.
These are some of the low-cost tools we use to do our work — which are also available to you, at the astounding price of absolutely free.
Digital products can change drastically over time — sometimes for the worse — and are often replaced by better ones, so we’re not saying these tools are fool-proof. But they’re a good place to start if you’re looking to do some digital detective work.
Reverse image searches
You’d be forgiven for seeing this tweet and believing it was real. In the wake of the Paris attacks in November 2015, an image circulated on various social media platforms showing a heartening scene: a crowd holding illuminated letters declaring they were “not afraid”.
Terrorists had killed 130 people late at night on November 13, in a series of coordinated attacks across the city, including at the Bataclan theatre. The world was reeling and needed something beautiful to hold onto.
A heartfelt image usually does the trick, right?
Shame it wasn’t as it seemed.
Despite its hopeful veneer, the image was from earlier in the year in the immediate aftermath of the Charlie Hebdo attack in January. It was taken as Parisians mobilised in the Place de la Republique, holding aloft their hopeful message.
But how do you work it out? It takes little more than a few clicks: a reverse image search.
It’s a simple process: in Google Chrome, right-click on an image and select “Search Google for image”. Tools like TinEye or downloadable extensions such as Reverse Image Search will do the same job.
All three quickly parse the web for earlier instances in which the image, or something resembling it, has been used.
In this case, it surfaced a tweet from January 8 bearing the image, proving it was not from that evening, but from the time of the attack on the Charlie Hebdo offices.
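If you want to script the same check, a search-by-image query can be built as a URL. A minimal sketch, assuming Google’s legacy `/searchbyimage` endpoint and its `image_url` parameter (the image address here is a placeholder, not a real one from the story):

```python
from urllib.parse import quote

def reverse_search_url(image_url: str) -> str:
    """Build a Google search-by-image URL for a publicly hosted image.

    Assumes the legacy /searchbyimage endpoint, which has historically
    accepted the image's address in the image_url query parameter.
    """
    # Percent-encode everything, including "/" and ":".
    return "https://www.google.com/searchbyimage?image_url=" + quote(image_url, safe="")

url = reverse_search_url("https://example.com/paris-crowd.jpg")
```

Opening the resulting URL in a browser runs the same search the right-click menu does.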
Wayback Machine
One of the best ways to monitor changes online is through archiving repositories. A particularly popular one is the Wayback Machine. It has existed since 1996 and is a virtual library of the changes to billions of pages across the web. Crawlers add live grabs of webpages at a series of points throughout the year, saving each exactly as it appeared in that moment, and some people even volunteer their time to regularly archive pages.
Unfortunately, Wayback doesn’t index social media pages or anything that requires a log-in and password, but it’s still extremely useful. If an organisation tries to erase certain content from the face of the internet during a crisis, Wayback is often used as an accountability measure.
In 2017, when the US Department of State published a blog post that seemed to be an advertisement for President Donald Trump’s property Mar-a-Lago, it was promptly pulled amid the backlash. But not before it was archived. That link now shows a short statement about the article and its reason for deletion.
At its core, Wayback is a great way of capturing the world at a point in time. Having the extension installed also notifies the user if there are previous instances of the page that have been archived.
For example, if you wanted to check the Salvation Army’s changing views on same-sex relationships, Wayback will show you that in 2006, its website stated “Homosexual practice, however, is, in the light of Scripture, clearly unacceptable”.
That is no longer the case today: the link is dead, and the church has gone so far as to remove the page in its entirety.
Another example is pm.gov.au — the official website for the ever-changing role of prime minister. On August 24, 2018, it shows two captures. At 1:49am, Malcolm Turnbull beamed out of pm.gov.au. Just 12 hours later, the page was under construction, stating Scott Morrison had been sworn in that day.
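You don’t even need the website or extension for quick lookups: the Internet Archive exposes a public availability API that returns the archived snapshot closest to a given date. A minimal sketch (the pm.gov.au example and date are illustrative; the response structure follows the API’s documented JSON):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"

def availability_query(url: str, timestamp: str = "") -> str:
    """Build a query against the Wayback Machine's availability API.

    timestamp is YYYYMMDD (or longer); the API returns the archived
    snapshot closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)

def closest_snapshot(url: str, timestamp: str = "") -> dict:
    """Fetch the closest snapshot record for a page (network call)."""
    with urlopen(availability_query(url, timestamp)) as resp:
        data = json.load(resp)
    # The record contains the archived URL, its timestamp and an
    # "available" flag, e.g. {"url": "http://web.archive.org/web/...",
    # "timestamp": "20180824...", "available": true}
    return data.get("archived_snapshots", {}).get("closest", {})

query = availability_query("pm.gov.au", "20180824")
```

Calling `closest_snapshot("pm.gov.au", "20180824")` would return the capture nearest to the day of the leadership change.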
Bots
A bot on social media is a profile or account whose posts are automated or scheduled. It would be inaccurate to describe all bot accounts as fake, as some are managed by real people, but not in the way a typical user on Facebook or Twitter would use them.
While they often get a bad rap, not all bots are malicious — and some automated scripts can be helpful, educational and creative.
For example, on Reddit there are automated bots programmed to read headlines on news articles and provide alternative sources on the same story to promote a diverse media diet.
But “good bots” generally don’t try to hide their intentions and it’s often the ones disguising their programming that are most problematic.
Bots commonly exist as scam accounts on Facebook and Twitter, using photos of attractive and scantily clad women to lure people into clicking malicious links or surrendering financial information with the promise of erotic encounters. These accounts are also prevalent on dating apps such as Tinder and Bumble.
Then there are the accounts designed to bolster visibility or promote a particular message, which are used to spam out favourable news stories and broadly influence a political conversation.
Bots can also be used to artificially boost an account’s popularity, allowing thousands of “followers” to be purchased for as little as $5. In an effort to promote transparency and authentic discourse, the major social media platforms announced a massive purge of automated accounts this year, leading some profiles to lose half their follower base.
But they’re still prevalent and harder than ever to spot.
Limitations in language and machine learning used to give bots a sense of the uncanny valley, and automated accounts would typically post with utter contempt for basic grammatical rules. But the wealth of data now available to bots has allowed their posts to appear more sophisticated and avoid the flags that would previously get them auto-banned. The new generation of bots sound more human, even if the logic of their content is off.
It’s possible to analyse an account manually to determine the probability it is automated by employing a series of checks, such as whether its name appears to have been randomly generated by a machine, whether it posts too frequently, whether it uses a generic profile picture, or whether there is any originality in its language.
However, more than once journalists have been disappointed to find an account they believed was a bot was actually very real, if poorly managed.
Botometer, a tool developed by the Observatory on Social Media, attempts to do most of the heavy lifting by analysing over 1,200 features of an account to provide a probability rating on its authenticity. It’s far from perfect and will often produce false positives (for example, it reads official accounts belonging to Australian and American politicians as bots), but it’s easier than counting thousands of tweets yourself.
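The manual checks above can be sketched as a naive score. This is an illustrative toy, not Botometer’s method: the signals, thresholds and equal weights are all assumptions made for the example.

```python
import re

def bot_likelihood(handle: str, posts_per_day: float,
                   has_default_avatar: bool, duplicate_post_ratio: float) -> float:
    """Naive bot-likelihood score in [0, 1] built from manual checks.

    Signals and weights are illustrative assumptions, not a real
    classifier like Botometer's 1,200-feature model.
    """
    score = 0.0
    # Handles ending in a long digit run suggest machine-generated names.
    if re.search(r"\d{6,}$", handle):
        score += 0.25
    # Humans rarely sustain this posting volume around the clock.
    if posts_per_day > 50:
        score += 0.25
    # A generic/default profile picture is a weak but common signal.
    if has_default_avatar:
        score += 0.25
    # Mostly copy-pasted content suggests automation.
    score += 0.25 * min(duplicate_post_ratio, 1.0)
    return score

suspect = bot_likelihood("patriot84759301", posts_per_day=120,
                         has_default_avatar=True, duplicate_post_ratio=0.9)
```

A score near 1 flags an account for closer manual review; as the journalists’ experience shows, it should never be treated as proof on its own.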
Optical character recognition
Google Translate is one of the more reliable free translation services available, but it needs text you can copy and paste into the webpage. That isn’t always possible if you’re trying to translate words embedded in an image, especially when they’re not in a Latin-based script.
Optical character recognition, or OCR, refers to the mechanical or electronic conversion of an image of text into a usable, machine-readable format. Think of it as speech-to-text, but for pictures.
In our newsroom, when we’re trying to geolocate an image or video from a foreign country, online OCR tools allow us to take a screenshot of a road sign or storefront and convert the text into a readable format. This often gives us clues as to where the footage originated and can help us distinguish whether a car explosion occurred in Damascus or Homs in Syria.
A search for online OCR will provide an ample list of tools with multilingual support, but none of them is close to perfect. Handwritten notes or obscured text are difficult for these tools to identify and extract.
Copyfish, for example, has a high degree of reliability with very clean text, like subtitles in a video, but struggles to read anything not cleanly printed.
So whenever you use OCR, it’s worth repeating the attempt with several different programs and tools.
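For those comfortable with a little code, the same workflow can be run locally. A minimal sketch using the open-source pytesseract wrapper — one of many OCR options, and an assumption about your setup (it requires the Tesseract binary and, here, its Arabic language pack). The cleanup helper filters out the short garbage fragments OCR tends to emit on noisy images:

```python
def clean_ocr_text(raw: str, min_len: int = 3) -> str:
    """Keep only lines with real content, dropping the stray one- and
    two-character fragments OCR often produces from image noise."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if len(line) >= min_len)

def extract_sign_text(image_path: str, lang: str = "ara") -> str:
    """Run OCR over a screenshot of a road sign or storefront.

    Assumes pytesseract and Pillow are installed, plus the Tesseract
    engine with the relevant language pack ("ara" for Arabic footage).
    """
    import pytesseract  # third-party: pip install pytesseract
    from PIL import Image  # third-party: pip install Pillow

    raw = pytesseract.image_to_string(Image.open(image_path), lang=lang)
    return clean_ocr_text(raw)
```

The extracted text can then be pasted into Google Translate; in keeping with the advice above, run the same screenshot through more than one OCR tool and compare the results.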
Amy Lees and Kevin Nguyen are journalists at Storyful Sydney, providing editorial consultation for News Corp and ABC in Australia.