Standardization aspires to perfect consistency and predictability
across the data set, both within and between fields.
The following examples illustrate non-standardized versus
standardized data:
Notice the inconsistent casing, abbreviations, punctuation, and ‘state’
entries, the missing leading zero in the Connecticut (CT) ZIP code,
and several entirely blank records.
...after
some preliminary standardization routines were run on it:
Now all the records are uppercase, punctuation has been removed,
address elements have been standardized to USPS conventions,
the ZIP+4 has been added, the leading zero on the Connecticut ZIP code
has been restored, and the blank records are gone.
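As a rough illustration of the routines described above, the following Python sketch uppercases every field, strips punctuation, abbreviates a few street suffixes, restores a missing leading zero on a ZIP code, and drops entirely blank records. The field names (`address`, `zip`), the function names, and the tiny `SUFFIXES` table are assumptions made for this example, not part of any particular tool. Appending the ZIP+4 and normalizing free-form state names are left out, since both need an external reference table.

```python
import re

# Hypothetical, deliberately tiny abbreviation table. A real routine would
# use the full list of USPS street-suffix abbreviations.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def standardize(record):
    """Return a cleaned copy of one record, or None if it is entirely blank."""
    if not any(str(value).strip() for value in record.values()):
        return None

    cleaned = {}
    for field, value in record.items():
        # Remove punctuation but keep hyphens, which ZIP+4 codes use.
        text = re.sub(r"[^\w\s-]", "", str(value))
        # Uppercase and collapse runs of whitespace.
        cleaned[field] = " ".join(text.upper().split())

    if "address" in cleaned:
        words = cleaned["address"].split()
        cleaned["address"] = " ".join(SUFFIXES.get(w, w) for w in words)

    # Restore leading zeros lost when a ZIP code was stored as a number.
    zip_code = cleaned.get("zip", "")
    if zip_code.isdigit() and len(zip_code) < 5:
        cleaned["zip"] = zip_code.zfill(5)

    return cleaned


def standardize_all(records):
    """Standardize every record and drop the blank ones."""
    return [r for r in map(standardize, records) if r is not None]
```

For example, a record with the address `12 Main Street.`, the state `ct`, and the ZIP code `6101` comes back with `12 MAIN ST`, `CT`, and `06101`, while a record whose fields are all empty is dropped.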
Validation
Enhancement.