Downloading data from a Google Spreadsheet is usually as easy as clicking File->Download->CSV
But sometimes you need to automate the download.
If you have editing access to the file, you can do this by publishing a CSV version of the file (File→Share→Publish to the web)
If you don’t have editing access, you can still access the data on public documents by using a URL like this:
https://docs.google.com/spreadsheets/d/KEY/export?format=csv
The KEY in the URL above would be a long alphanumeric string that looks something like this:
1ZpX_twP8sFBOAU6t--Vvh1pWMYSvs60UXINuD5n-K08
Replace the word ‘KEY’ with the matching alphanumeric key string that you find in the URL for a spreadsheet.
For example, this is the URL for a public spreadsheet with loadshedding data for South Africa. The KEY is the portion in bold:
https://docs.google.com/spreadsheets/d/1ZpX_twP8sFBOAU6t--Vvh1pWMYSvs60UXINuD5n-K08/edit#gid=863218371
To construct the CSV URL, remove everything after the forward-slash that follows the KEY and replace it with export?format=csv
The new URL will look like:
https://docs.google.com/spreadsheets/d/1ZpX_twP8sFBOAU6t--Vvh1pWMYSvs60UXINuD5n-K08/export?format=csv
If you open this URL you will have a CSV-format file downloaded to your computer.
Note: The spreadsheet URL used in the example above is for the Eskom Se Push loadshedding dataset. (Loadshedding is the South African term for rolling blackouts, which have been in effect since 2014.) We use this technique to download a daily copy of the data into DataDesk, which we then publish as in both CSV and JSON format. The end result is the loadshedding calendar we publish on The Outlier.
- Update: The Outlier’s latest version of the loadshedding calendar can be found here