Normalizing (Un-Crosstabbing) Crosstabbed CSV Data TablesΒΆ

un-xtab.py is a Python module and command-line program that rearranges data in a CSV file from a crosstabbed format to a normalized format. It takes data that are in this form:

Station 2006-05-23 2006-06-15 2006-07-19
WQ-01 4.5 3.7 6.8
WQ-02 9.7 5.1 7.2
WQ-03 10 6.1 8.8

And rearranges it into this form:

Station Date Value
WQ-01 2006-05-23 4.5
WQ-02 2006-05-23 3.7
WQ-03 2006-05-23 6.8
WQ-01 2006-06-15 9.7
WQ-02 2006-05-15 5.1
WQ-03 2006-06-15 7.2
WQ-01 2006-07-19 10
WQ-02 2006-07-19 6.1
WQ-03 2006-07-19 8.8

You can use the un-xtab program to rearrange data that have been provided in a format designed for readability into a format that is more suitable for storage in a database, or for use with statistical, modeling, graphics, or other software.

The un-xtab program can deal with crosstabbed formats that include multiple rows of column headers and groups of data values that were crosstabbed together. For example, un-xtab can convert something like this:

Laboratory Name                  
Project Name                  
        Arsenic   Cadmium   Copper  
        mg/kg   mg/kg   mg/kg  
        dry   dry   dry  
Client sample ID Lab sample ID Preparation date Analysis date Concentration Qualifier Concentration Qualifier Concentration Qualifier
SD00231 45298-01 2013-06-23 2013-06-24 0.11 U 112   251  
SD00232 45298-02 2013-06-23 2013-06-24 8.21   582   396  
SD00233 45298-03 2013-06-23 2013-06-24 0.11 U 45.1   2.4 U
SD00235 45298-04 2013-06-23 2013-06-24 0.11 U 69   2.4 U
SD00236 45298-05 2013-06-23 2013-06-24 4.53   98.9   85.8  

to this:

Client sample ID Lab sample ID Preparation date Analysis date Analyte Units Measurement basis Concentration Qualifier
SD00231 45298-01 2013-06-23 2013-06-24 Arsenic mg/kg dry 0.11 U
SD00231 45298-01 2013-06-23 2013-06-24 Cadmium mg/kg dry 112  
SD00231 45298-01 2013-06-23 2013-06-24 Copper mg/kg dry 251  
SD00232 45298-02 2013-06-23 2013-06-24 Arsenic mg/kg dry 8.21  
SD00232 45298-02 2013-06-23 2013-06-24 Cadmium mg/kg dry 582  
SD00232 45298-02 2013-06-23 2013-06-24 Copper mg/kg dry 396  
SD00233 45298-03 2013-06-23 2013-06-24 Arsenic mg/kg dry 0.11 U
SD00233 45298-03 2013-06-23 2013-06-24 Cadmium mg/kg dry 45.1  
SD00233 45298-03 2013-06-23 2013-06-24 Copper mg/kg dry 2.4 U
SD00235 45298-04 2013-06-23 2013-06-24 Arsenic mg/kg dry 0.11 U
SD00235 45298-04 2013-06-23 2013-06-24 Cadmium mg/kg dry 69  
SD00235 45298-04 2013-06-23 2013-06-24 Copper mg/kg dry 2.4 U
SD00236 45298-05 2013-06-23 2013-06-24 Arsenic mg/kg dry 4.53  
SD00236 45298-05 2013-06-23 2013-06-24 Cadmium mg/kg dry 98.9  
SD00236 45298-05 2013-06-23 2013-06-24 Copper mg/kg dry 85.8