Cleaning data in the Spreadsheet

There's a lot of great efficiency tools that data analysts use all the time, such as:

  • conditional formatting

    It is a spreadsheet tool that changes how cells appear when values meet specific conditions.

  • removing duplicates

  • formatting dates

  • fixing text strings and substrings

  • splitting text to columns

Common functions to use

  • COUNTIF

    Count the number of records according to the condition

  • LEN

    return the length of the string

  • LEFT, RIGHT, MID

    return the left, right or middle substring

  • CONCATENATE

  • TRIM

Common way to find the dirty data

  • Sorting

  • Filtering

  • Pivot table

  • VLOOKUP

    =VLOOKUP(What you want to look up, where you want to look for it, the column number in the range containing the value to return, return an Approximate or Exact match – indicated as 1/TRUE, or 0/FALSE).

  • Graphics

Avanced cleaning skills

Data mapping

Data mapping is the process of matching fields from one database to another.

This is very important to the success of data migration, data integration, and lots of other data management activities.