CSV parser for C#

Need to parse CSV (Comma Separated Values) files in C#? There are many solutions, starting with the OLE DB adapter, but here's an easy-to-use CSV parser written in pure C#: CSVReader.cs. Here's a quick tutorial.

First, let's recap the rules of CSV: Each line in a text file represents a record. The fields on each line are separated by commas. If a field starts with a double quote ("), it ends at the next double quote that is not itself doubled, so a quoted field can contain commas. If you need to embed a quote inside a quoted field, double it (""). Take, for example, the following trivial CSV file:

my fields,go,here
John said: "Don't move","""I won't"", he replied"

The first line parses into three separate fields ("my fields", "go", "here"). The second one is trickier, but it produces two values. Note that the quotes in the first field (John said: "Don't move") do not mark field boundaries, because the field does not start with a quote. The behavior would be different if a double quote started the field, as it does for the second field ("I won't", he replied). This is why the quotes don't need doubling in the first field.
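
The rules above can be sketched in a few lines of C#. This is only a minimal illustration of the rules, not the code in CSVReader.cs: the names CsvRulesSketch and SplitLine are made up for this example, it handles a single line at a time, and it quietly ignores any stray characters between a closing quote and the next comma.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of CSVReader.cs.
public static class CsvRulesSketch
{
    // Splits one CSV line into fields using the rules described above.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        int pos = 0;
        while (true)
        {
            var field = new StringBuilder();
            if (pos < line.Length && line[pos] == '"')
            {
                // Quoted field: read up to the next quote that is not doubled.
                pos++; // skip the opening quote
                while (pos < line.Length)
                {
                    if (line[pos] == '"')
                    {
                        if (pos + 1 < line.Length && line[pos + 1] == '"')
                        {
                            field.Append('"'); // "" stands for one literal quote
                            pos += 2;
                            continue;
                        }
                        pos++; // skip the closing quote
                        break;
                    }
                    field.Append(line[pos++]);
                }
                // Assumption: anything between the closing quote and the next comma is dropped.
                while (pos < line.Length && line[pos] != ',') pos++;
            }
            else
            {
                // Unquoted field: quotes are ordinary characters, a comma ends the field.
                while (pos < line.Length && line[pos] != ',')
                    field.Append(line[pos++]);
            }
            fields.Add(field.ToString());
            if (pos >= line.Length) break;
            pos++; // skip the comma separating this field from the next
        }
        return fields;
    }
}
```

Calling CsvRulesSketch.SplitLine on the second line of the sample file should give the two values described above. Note that this sketch returns a single empty field for an empty line, which may or may not be what you want.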

Now, the CSVReader class can be used to read the file like this:

using (CSVReader csv = new CSVReader(@"c:\myfile.csv")) {
    string[] fields;
    while ((fields = csv.GetCSVLine()) != null) {
        Console.WriteLine("New CSV line begins");
        foreach (string field in fields)
            Console.WriteLine("CSV field: " + field);
    }
}

And as you can guess, the code produces output like this:

New CSV line begins
CSV field: my fields
CSV field: go
CSV field: here
New CSV line begins
CSV field: John said: "Don't move"
CSV field: "I won't", he replied

As usual, feedback and/or bug reports are welcome.

October 23, 2004 · Jouni Heikniemi · 38 Comments
Posted in: .NET

38 Responses

  1. Frank Zehelein - November 16, 2004

    Great software! Thank you for sharing it!
    A very small improvement would be to make the separator character adjustable:
    public char Separator
    {
        get
        {
            return separator;
        }
        set
        {
            separator = value;
        }
    }
    #region Private variables
    private char separator = ',';

    int nextComma = data.IndexOf(separator, fromPos);

  2. Jouni - November 16, 2004

    Yeah Frank, I was thinking about the same thing as well. I'll probably devise something like that for the next JHLib release of the CSVParser.

  3. Akshay - November 19, 2004

    Good stuff, my compliments on some well-written code.
    I realise this might sound trivial, but just to cover my bases:- under what license can I include and re-distribute your code?

  4. Jouni - November 19, 2004

    The newest version of the code was posted in the JHLib collection. I suggest you use JHLib's version, as it has been corrected according to FxCop rules and suggestions (i.e. it fits the generic Microsoft library conventions better). The changes are fairly trivial, though – if you've already taken this code, you're not missing out on much (apart from possible future updates).
    JHLib's license statement applies here, too: "JHLib is free. It is not released under any formal license such as GPL; it's just plainly and simply free. You can do whatever you wish with the code; I don't offer support or carry responsibility for anything related to the source or the binaries. That said, I'm naturally interested in feedback and suggestions, as well as your own code changes. Also, it would be nice to hear if you start using the library somewhere – it always gives ideas for further development."
    (from http://www.heikniemi.net/jhlib/)
    Feel free to mail me if you need any further help.

  5. robin - November 24, 2004

    hi jouni!
    nice class! i am using it in a customer project. one requirement was that there can be spaces between commas and data fields, and since your class got confused by that, i improved it a little bit to handle it. if you are interested, send me an email and i will send you the modified version.
    best regards,
    robin

  6. Dane Paul - December 16, 2004

    Yeah, this is a really good class. I made a simple modification so that it accepts a string from the code and parses only that string. Everything works perfectly now. Thanks a bunch.

  7. Baz - January 4, 2005

    One problem I see with your class is that if I want to read the Nth line, I need to parse N-1 lines even if I don't care about them.

  8. dan hoang - January 19, 2005

    Easy to use, but I found a problem parsing the following line. It can be solved by adding the one line of code marked below.
    "Distance Offset = -21.466 m,,,,,,,,,,,,,,,,,,,,"
    // If we're at the end of the string, let's consider this a field that
    // only contains the quote
    if (fromPos == data.Length-1) {
        startSeparatorPosition = fromPos; // Add this line
        fromPos++;
        return "\"";
    }

  9. Flack - February 4, 2005

    Hello,
    I am trying to use your code here to parse a certain line. The line itself comes from a user selecting some rows in an open Excel file and dropping them onto my form.
    Anyway, the line looks like this:
    """TE,ST""",q,1,",",",/, ,","
    This line corresponds to the row in Excel, which has these values (each line represents a value from A1 to H1):
    "TE,ST"
    q
    1
    "
    ,
    /
    ,
    Now, when this line is parsed, I get back 7 values, as follows:
    "TE,ST"
    q
    1
    ,
    ,/, ,
    "
    Is there any way that your code could be changed to handle this case correctly, or is it too complicated?
    Thanks for the help.

  10. Jouni - February 5, 2005

    I've heard the same being said by somebody else. A guy whose name I don't remember sent me email a couple of months ago and told me he changed the parser to accept Excel-originated CSV without a hitch. He promised to send me the updated source but never did.
    Thanks for the test case though; I'm pretty sure the code can be changed to do what you wish. I have plans to collect up all the recent cases and make fixes so that they all work. No promises on the timeframe though, I'm pretty busy these days. If you hack the code yourself, please do mail me the source if you can. :-)

  11. Rob Mello - April 14, 2005

    Nice work Jouni.

  12. Abi - April 18, 2005

    Nice piece of work.
    If I want to start parsing from the third line of my CSV file, how would I do it using this code? Any suggestions?
    Thanks,
    -Abi

  13. PaLoMo2 - April 26, 2005

    BUG: I have exported a multiline textbox field and the result is this:
    Name,FamilyName,Tel,Note
    John,Holmes,5552522,"test note
    note note"
    I think the problem is the "\n" character: the file reader can only read one line at a time.

  15. Dick Walker - May 15, 2005

    Hi,
    thanks for the code. It doesn't seem to cope with double quotes within double quotes. See the example below: the fifth field is split into two.
    "20000083","HEATHER SMITH MAIL RETURNED","","18 DURBAN WAY 10 APR'97","MINTO "LEFT ADDRESS"","MADE CLASS 9","NSW","2566","","","","",0.00

  16. Chuck King - May 19, 2005

    Cool code…thanks!
    If you don't want to wait a good while for the output, change the sample code to use a StringBuilder, something like:
    private void button1_Click(object sender, System.EventArgs e)
    {
        using (CSVReader csv = new CSVReader(@"c:\Test1.csv"))
        {
            string[] fields;
            int linenumber = 0;
            System.Text.StringBuilder sb = new System.Text.StringBuilder();
            while ((fields = csv.GetCSVLine()) != null)
            {
                linenumber++;
                sb.Append("CSV Line Number " + linenumber.ToString() + " begins ********************\n");
                foreach (string field in fields)
                    sb.Append("——— CSV field: " + field + "\n");
            }
            txt1.Text += sb.ToString();
        }
    }

  17. E - July 29, 2005

    Doesn't support records spread out across multiple lines. :|

  18. Nemanja - September 24, 2005

    Just stumbled upon this code, 'cause I'm too lazy to write a CSV reader from scratch.
    Line:
    if (i < data.Length - 1 && data[i + 1] == '"')
    should be changed to:
    if (i < data.Length - 1 && (data[i + 1] == '"' || data[i - 1] == '\\'))
    This way the reader can recognize quotes embedded in a string more accurately. Hope this helps…

  19. Lee Newman - November 11, 2005

    Thanks!!

  20. Craig - January 16, 2006

    Hi,
    I've got this change to catch the End of File character.
    public string[] GetCSVLine()
    {
        string data = reader.ReadLine();
        if (data == null) return null;
        if (data.Length == 1) return null; // EOF char
        if (data.Length == 0) return new string[0];
        ArrayList result = new ArrayList();
        ParseCSVFields(result, data);
        return (string[])result.ToArray(typeof(string));
    }
    Craig

  21. Chris Walker - March 14, 2006

    Just downloaded this and wanted to say THANKS! I was going to build this class myself and you just saved me some time!
    Woot! You Rock!

  22. Aleksey Sokolovskiy - May 9, 2006

    Thank you very much! It's really a time saver. The code works perfectly.

  23. Steven - May 16, 2006

    Thanks a mil for the code. Very helpful in teaching me how to code better too.

  24. Mark - August 3, 2006

    Anyone know how to hack this to use a file uploaded by a user via FileUpload… yet without saving the file to the server?

  25. Grant Merwitz - August 15, 2006

    Great class, saved me a lot of time.
    Thanks, u rock!

  26. Grant Merwitz - August 15, 2006

    Mark,
    I would suggest using the constructor that reads a stream "public CSVReader(Stream s)",
    try using the FileUpload's stream attribute.
    HTH

  27. AA - October 20, 2006

    I get an error when attempting to open a csv file that is already open in notepad. Just wanted to check if the code is able to handle that in some way.

  28. Anthony Main - October 23, 2006

    I have just found a bug in your reader (I have yet to investigate a fix).
    If a field in the CSV contains data split over multiple lines, it returns an array with only the elements up to that field.

  29. Craig - November 20, 2006

    Wow, thanks a lot, really really helpful!

  31. mandar - March 2, 2007

    Simple CSV parser/reader function
    http://www.codeproject.com/useritems/Basic_CSV_Parser_Function.asp

  32. Michael - March 25, 2007

    Dude, your code is awesome – thanks a stack. Needed to get a test app out very quickly, and it's really saved a lot of time.
    one thing I noticed, though – any fields after the first one which are encapsulated in double quotes are returned with those double quotes – eg:
    "FRP002", "Frozen Peas", "340g", 23.16
    will be returned as:
    FRP002
    "Frozen Peas"
    "340g"
    23.16
    I fixed that on my side with a routine that checks for a pair of double quotes (to get around that kind of problem) and, if it doesn't find a pair, takes off the first and last quotes, i.e. (if strHeader is the string being returned for that value):
    if (strHeader.Length>4)
    {
        if (strHeader.StartsWith("\"\"") == false)
        {
            if (strHeader.StartsWith("\"") == true)
            {
                strHeader = strHeader.Substring(1);
            }
        }
        if (strHeader.EndsWith("\"\"") == false)
        {
            if (strHeader.EndsWith("\"") == true)
            {
                strHeader = strHeader.Substring(0, strHeader.Length-1);
            }
        }
    }
    … and then strHeader will have been "cleaned" of the extra pair of quotation marks…

  33. Michael - October 15, 2007

    For an alternate approach I ended up using a regex from to handle the splitting of a single line read from the csv file:
    public static string[] SplitCsv(string values)
    {
        Regex regex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
        string[] result = regex.Split(values);
        return Array.ConvertAll(result, delegate(string s)
        {
            //remove start and end quote if it exists
            if (s.StartsWith("\""))
                s = s.Substring(1, s.Length - 2);
            //unescape quotes
            return s.Replace("\"\"", "\"");
        });
    }

  34. Erick Rivas - December 6, 2007

    I hope you don't mind. I uploaded the CSVReader project up to ohloh.net, with a couple of minor bug fixes and enhancements.

  35. Paul Sanders - December 10, 2007

    Very useful – saved me a lot of time. Thanks very much for sharing.
    Paul Sanders
    http://www.alpinesoft.co.uk

  36. mo - January 9, 2008

    thank you it works nice

  37. David Kemp - January 10, 2008

    I'm trying this with:
    A,"test","test" something
    Excel (which is the user's default 'benchmark' test) imports this as
    A
    test
    "test" something
    Your library seems to parse this as:
    A
    test
    test
    something
    I can't seem to see which is better, or worse, but I'd like it if it were configurable.

  38. Job - February 13, 2008

    Very nice software, but I'm getting an "out of memory" error.
    while (pos < data.Length)
    {
        result.Add(ParseCSVField(data, ref pos));
    }
    The result array is not large enough for my data.
    The specs say the capacity will be adjusted automatically.
    What can I do?