Simple strategies to find the right data for your stories

By Gemma Ritchie

At The Outlier, we turn data into compelling charts and stories covering everything from health and climate change to food prices and loadshedding. One of the most common questions we get is: Where do we find our data?

A data story usually starts in one of two ways:

  • You have data that you want to analyse to uncover insights and create charts.
  • You have a topic in mind but need to find data to support your story.

If you’re in the second camp, this guide is for you. We’ll focus on how to find reliable data online.

Many organisations, research institutes and international agencies make data publicly available – you just need to know where to look.

Evaluating data

To ensure you source reliable data on a particular subject, consider the following approaches:

  • Look for reputable sources: Prioritise data from reputable, authoritative and peer-reviewed sources such as academic journals, research institutes, government agencies and professional organisations.
  • Assess the source: Check who collected, analysed and published the data. Evaluate their reputation, expertise and credibility. Transparency regarding their methodology, assumptions and limitations is also key.
  • Avoid unreliable sources: Avoid biased, outdated or unverified sources such as personal blogs, social media posts or commercial websites, especially those of companies that may have vested interests.

For example, mining employment data from the Department of Mineral Resources and Energy is published in their Mineral Economics Bulletin, which helps verify its credibility.

Types of data

Who shares data? Here are links to the data collected and published by some useful and reputable websites:

Governments and their agencies

South Africa:

  • Statistics South Africa: The national statistical service with comprehensive data on various aspects of SA society, economy and environment. Reliable and regularly updated.
  • Municipal Money: This platform offers detailed financial information about municipalities in South Africa, including budgets and expenditures – essential data for assessing municipal performance.

Elsewhere:

International organisations

Academic and research organisations

  • Google Scholar: Provides access to a vast range of scholarly articles across disciplines. It’s an excellent starting point for finding peer-reviewed research.
  • ResearchGate: Repository for academic papers across various fields.
  • Oxford University’s excellent Our World in Data. It compiles research from various sources, focusing on understanding trends in development issues.

Non-profits

  • Open Data Institute, based in the UK, promotes the use of open data to drive innovation.
  • OpenAfrica, a Code for Africa project that collects and shares open data relating to African development issues.
  • World Resources Institute. Focused on environmental issues, its data relates to global sustainability practices and policies.

Industry and private-sector data

  • Statista: Aggregates data from a wide range of industries. Some content is behind a paywall.
  • Bloomberg Terminal: Leading source of financial markets data. Subscription-based.
  • S&P Global: Extensive market intelligence. Subscription-based.
  • Moody’s Analytics: Economic forecasts and risk analysis. Subscription-based.
  • Nielsen: Consumer behaviour analytics and market trends. Paid (business agreement).
  • Quandl (now Nasdaq Data Link): Financial, economic and alternative datasets; particularly useful for quantitative analysis in finance. Free and paid.
  • Amazon Web Services Open Data: Large-scale datasets. Free.
  • Google Trends: Search trends over time at no cost.
  • BP Statistical Review of World Energy: Global energy statistics, useful for analysing trends. Free.

Community and crowdsourced data

  • Wikipedia: Valuable summaries of almost every topic, with references to primary sources that can be further investigated.
  • Ushahidi: Citizen-generated data relating to governance and human rights issues.

Tools to find data

The key is knowing which organisation is likely to have the data you need. There are two ways to find it: using AI tools or smart keyword searches.

1. Using AI

AI tools such as Perplexity, Gemini, ChatGPT and Claude can help you quickly identify organisations that share data – often more efficiently than a traditional Google search. We like starting with Perplexity, as it provides multiple sources as well as descriptions of the data provided.

Example: Finding dam level data

Typing the prompt ‘List reliable and authoritative sources of data about South Africa’s dams’ into Perplexity will likely give you an answer much like this one:

Here are some reliable and authoritative sources of data about South Africa’s dams:

At The Outlier, we also look at other well-known local institutions which may have relevant water-related data, such as:

2. Using keywords

Another simple method is to use a search engine and include terms like data, dashboard or statistics in your searches.

Example: Searching for cholera data

Googling ‘cholera dashboard’ will give you the World Health Organization’s cholera dashboard as a top result. This provides authoritative and freely downloadable data.
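If you run this kind of search often, the keyword approach can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the Google search URL format is an assumption that you can swap for any search engine you prefer.

```python
from urllib.parse import urlencode

# Terms suggested in this guide for surfacing data-rich pages.
DATA_TERMS = ("data", "dashboard", "statistics")


def keyword_queries(topic, terms=DATA_TERMS):
    """Pair a topic with each data-related term, e.g. 'cholera dashboard'."""
    return [f"{topic} {term}" for term in terms]


def search_url(query, base="https://www.google.com/search"):
    """Build a search URL for a query (the base URL is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for query in keyword_queries("cholera"):
        print(query, "->", search_url(query))
```

Opening each printed link in a browser runs the same search you would otherwise type by hand.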

When sourcing data on the web, always note the date of the information and how regularly it is updated. The WHO’s cholera dashboard, for example, is refreshed every two weeks.
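If you keep a list of the sources you rely on, a small check like the one below can flag those that look overdue for an update. It is a rough sketch: the function name and the dates are hypothetical, and the 14-day interval simply mirrors the two-week refresh cycle mentioned above.

```python
from datetime import date, timedelta


def is_stale(last_updated, refresh_every_days, today=None):
    """Return True if a source is older than its expected refresh interval."""
    today = today or date.today()
    return (today - last_updated) > timedelta(days=refresh_every_days)


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(is_stale(date(2024, 1, 1), 14, today=date(2024, 1, 20)))   # True
    print(is_stale(date(2024, 1, 10), 14, today=date(2024, 1, 20)))  # False
```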

Search shortcuts

If you’re looking for something specific, refine your search using advanced Google techniques:

  • site: Limits results to a specific website.
  • filetype: Finds specific document types like Excel spreadsheets.
  • intitle: Searches for pages with specific keywords in the title.
  • inurl: Finds pages with specific keywords in the URL.
  • AND: Includes all search terms.
  • OR: Includes at least one of the search terms.
  • " ": Searches for an exact phrase.
  • *: Includes variations of the search term; also works as a wildcard.
  • ~: Includes synonyms of the search term (note that Google no longer supports this operator).

Example: Finding mining employment data

Searching ‘mining employment South Africa data’ brings up results from Statista, CEIC Data and Statistics South Africa. But what if you specifically need data from the department of mineral resources and energy?

Try adding:

  • site:dmre.gov.za → Limits results to the department of mineral resources and energy’s website.
  • filetype:xlsx → Finds Excel files containing relevant data.
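The same search can be assembled in code, which is handy if you want to repeat it for several websites or file types. The sketch below is a minimal example, not an official tool: build_query is a hypothetical helper, and the Google URL format is an assumption.

```python
from urllib.parse import urlencode


def build_query(keywords, site=None, filetype=None, exact_phrase=None):
    """Combine plain keywords with optional search operators."""
    parts = [keywords]
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')      # exact-phrase match
    if site:
        parts.append(f"site:{site}")           # restrict to one website
    if filetype:
        parts.append(f"filetype:{filetype}")   # restrict to one file type
    return " ".join(parts)


if __name__ == "__main__":
    query = build_query("mining employment", site="dmre.gov.za", filetype="xlsx")
    print(query)  # mining employment site:dmre.gov.za filetype:xlsx
    # The base URL is an assumption; any search engine that supports
    # these operators could be used instead.
    print("https://www.google.com/search?" + urlencode({"q": query}))
```

Changing the filetype argument to pdf or csv is a quick way to check whether the same source publishes its data in other formats.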

Try it yourself!

Now that you have these search strategies, put them to the test. Finding the right data is often just a few smart searches away.
