Tag: cleaning data
-
OpenRefine Part 2: Removing duplicates and using version history
OpenRefine is one of The Outlier’s favourite tools for working with large datasets. This powerful open-source program is ideal for cleaning messy data. In this post, we focus on two essential features: removing duplicates and using version history to keep track of your changes.
-
5 reasons to switch to OpenRefine to clean data
OpenRefine has a fairly steep learning curve, but it is well worth the effort if you need to clean large amounts of very messy data.
-
10 spreadsheet formulas we use to superpower our data analysis
Both Microsoft Excel and Google Sheets are powerful spreadsheet programs for storing and sorting through mounds of data. Here are our favourite formulas for making sense of that data.