A CSV reader based on the Haskell cassava library: performance

I implemented my own CSV reader using the cassava library. The reader from the MissingH library was taking too long (~17 seconds) for a file with 43200 lines. I compared the result with the CSV readers in Python's numpy and pandas. Below is a rough comparison.

cassava (ignore #) 3.3 sec
cassava (no support for ignoring #) 2.7 sec
numpy loadtxt > 10 sec
pandas read_csv 1.5 sec

Clearly, pandas does really well at reading CSV files. I was hoping that my CSV reader would do better, but it didn't. It still beats the Parsec-based reader hands down.
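
For readers unfamiliar with cassava, here is a minimal sketch of what a reader that ignores # lines might look like. It is not the implementation benchmarked above. It assumes that comments occupy whole lines starting with '#', that the file has no header row, and that every field is numeric. The names stripComments and readNumericCsv are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Lazy.Char8 as BL
import           Data.Csv (HasHeader (NoHeader), decode)
import qualified Data.Vector as V

-- Drop every line that starts with '#'.
-- Assumption: a comment always takes up a whole line.
stripComments :: BL.ByteString -> BL.ByteString
stripComments = BL.unlines . filter (not . isComment) . BL.lines
  where
    isComment line = "#" `BL.isPrefixOf` line

-- Decode the remaining records as rows of Doubles.
-- Assumption: no header row and purely numeric fields.
readNumericCsv :: BL.ByteString -> Either String (V.Vector (V.Vector Double))
readNumericCsv = decode NoHeader . stripComments

main :: IO ()
main = do
  -- Small in-memory sample; a real program would use BL.readFile.
  let sample = "# time,value\n0.0,1.5\n# a comment\n1.0,2.5\n"
  case readNumericCsv sample of
    Left err   -> putStrLn ("parse error: " ++ err)
    Right rows -> V.mapM_ (print . V.toList) rows
```

Filtering lines before decoding adds an extra pass over the input, which would be consistent with the comment-ignoring variant being somewhat slower than the plain one, although this sketch has not been benchmarked.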

The code is here: https://github.com/dilawar/HBatteries/blob/master/src/HBatteries/CSV.hs
