pandas read_csv converting mixed types columns as string

DtypeWarning is a Warning, which means it can be caught and acted on. To catch it, wrap the read_csv call in a warnings.catch_warnings block. The affected columns can then be extracted from the warning message with a regular expression and converted to the desired type using .astype(target_type).

import re
import pandas 
import warnings

myfile = 'your_input_file_here.txt'
target_type = str  # The desired output type

with warnings.catch_warnings(record=True) as ws:
    warnings.simplefilter("always")

    mydata = pandas.read_csv(myfile, sep="|", header=None)
    print("Warnings raised:", ws)
    # Some columns raised a mixed-types warning; convert them to the target type
    for w in ws:
        s = str(w.message)
        print("Warning message:", s)
        match = re.search(r"Columns \(([0-9,]+)\) have mixed types\.", s)
        if match:
            columns = match.group(1).split(',') # Get columns as a list
            columns = [int(c) for c in columns]
            print("Applying %s dtype to columns:" % target_type, columns)
            # Assign by column label so the converted columns replace the originals
            labels = mydata.columns[columns]
            mydata[labels] = mydata[labels].astype(target_type)
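
The column-extraction step in the loop can also be pulled out into a small function, which makes it easy to check against a message string without loading a file. This is only a sketch: the function name mixed_type_columns and the sample message are made up for illustration, and the pattern is the one used in the loop, so it assumes the warning lists columns as comma-separated integer positions.

```python
import re

# Same pattern as in the loop; assumes comma-separated integer positions
MIXED_TYPES = re.compile(r"Columns \(([0-9,]+)\) have mixed types\.")


def mixed_type_columns(message):
    """Return the column positions named in a mixed-types message, or []."""
    match = MIXED_TYPES.search(message)
    if not match:
        return []
    # Skip empty pieces in case of a stray comma
    return [int(c) for c in match.group(1).split(",") if c]


sample = "Columns (0,3) have mixed types."  # hypothetical message text
print(mixed_type_columns(sample))
```

If the message format differs in the installed pandas version, the function returns an empty list rather than failing, so no columns are converted.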

The result should be the same DataFrame with the problematic columns converted to str. Note that pandas typically reports string columns with dtype object, so the dtype alone does not show that the values are strings.
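
One way to confirm the conversion is to inspect the values themselves rather than rely on the reported dtype. The following is a minimal sketch on a small hand-built frame; the data is hypothetical and stands in for a file loaded with header=None, where column labels are integer positions.

```python
import pandas

# Hypothetical data: column 0 mixes ints, strings and floats; column 1 is numeric
mydata = pandas.DataFrame({0: [1, "a", 3.5], 1: [10, 20, 30]})
columns = [0]  # positions that would come from the warning message

# Convert only the affected columns to str
mydata[columns] = mydata[columns].astype(str)

# Every value in the converted column should now be a Python str
values_are_str = all(isinstance(v, str) for v in mydata[0])

print(mydata.dtypes)  # the label shown for column 0 depends on the pandas version
print("All values are str:", values_are_str)
```

Checking the values with isinstance avoids depending on how a given pandas version labels string columns.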



This post first appeared on Martin Fitzpatrick – Python Coder, Postgraduate.
