Python - parser class, passing in a way to specify column changes
I'm a bit stuck on which way to go with this. I have a class that is a simple parser of CSV files. It encapsulates the data in self.data and offers methods for working with that data.
    import csv
    import os
    from collections import namedtuple

    from app import config


    class csvreader():
        def __init__(self, csv_name):
            self.csv_name = csv_name

        def read_csv(self):
            # Build one namedtuple per row, with fields taken from the CSV header
            with open(os.path.join(config['csv_path'], self.csv_name)) as f:
                c_read = csv.DictReader(f)
                self.csvrow = namedtuple('csv_entry', c_read.fieldnames)
                self.data = [self.csvrow(**row) for row in c_read]

        # ...
The problem I'm having presents itself when I want different data representations for different columns. Here's some sample data:
    name        is_registered  role
    'crow'      '1'            '3'
    'not crow'  '0'            '2'
In this case, I want something more like this:
    name        is_registered  role
    'crow'      True           'bird'
    'not crow'  False          'user'
For the most part, a lot of the columns will remain strings. However, I want to change some data types to match a more intuitive type or name. What is an efficient way to pass instructions on how to treat columns into this class?
The only solution I have come up with is adding methods like so:
    def column_to_boolean(self, field):
        for index, entry in enumerate(self.data):
            as_dict = entry._asdict()  # namedtuples are immutable
            as_dict[field] = as_dict[field] == '1'
            self.data[index] = self.csvrow(**as_dict)
Then in the app, I do the following:
    my_csv = csvreader('blah.csv')
    my_csv.column_to_boolean('is_registered')
    my_csv.column_to_enum('role', {'3': 'bird', '2': 'user'})
However, this is quite time consuming and rather annoying to do for every single column change on every single row. It seems there must be a quicker way than iterating over the data [number of column changes] times, ideally doing everything in one iteration.
Is there a way to do this?
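One possible direction, sketched here under assumptions rather than as a definitive answer: pass a mapping of column name to converter callable and apply it while the rows are being read, so the data is walked only once. The function name read_rows and the converters parameter are hypothetical, and the sample input assumes a plain comma-separated file without quoted values.

```python
import csv
import io
from collections import namedtuple


def read_rows(f, converters=None):
    """Read CSV rows from file object f, converting columns in one pass.

    converters is a hypothetical mapping of column name -> callable;
    columns without an entry are left as strings.
    """
    converters = converters or {}
    reader = csv.DictReader(f)
    row_type = namedtuple('csv_entry', reader.fieldnames)
    data = []
    for row in reader:
        # Convert only the columns that have a converter registered
        for name, convert in converters.items():
            row[name] = convert(row[name])
        data.append(row_type(**row))
    return data


# Assumed sample input, mirroring the table in the question
sample = io.StringIO("name,is_registered,role\ncrow,1,3\nnot crow,0,2\n")
roles = {'3': 'bird', '2': 'user'}

data = read_rows(sample, {
    'is_registered': lambda value: value == '1',
    'role': roles.get,  # unknown codes would become None
})

print(data[0])  # csv_entry(name='crow', is_registered=True, role='bird')
```

The same idea could probably be folded into read_csv by accepting the mapping in __init__, which would avoid rebuilding every namedtuple once per converted column.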