Simple CSV File Duplicate Row Removal Application

I need a very simple application. I could probably do this in Excel or OpenOffice Calc, but I’d prefer a small application that does it quickly.

I will give the application two .csv files.

Input1.csv will have several hundred thousand rows of data (likely between 100,000 and 500,000). It will have just one column.

Input2.csv will have several columns and anywhere from a few hundred to a few thousand rows.

What I need is for the application to remove every row from Input2.csv whose Column A value already exists in Input1.csv. The comparison is based entirely on Column A (again, Input1.csv has only one column, Column A, so the logic is simple).

Please see the attached examples. Input1.csv is the main database, Input2.csv is the smaller database, and Output.csv is the results file. The results file will be Input2.csv minus the rows that already have a matching record in Input1.csv.
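To make the requested behaviour concrete, here is a minimal sketch in Python. It is only an illustration of the logic described above, not a finished application. The function name remove_known_rows is hypothetical, and the sketch assumes UTF-8 files with no header row, an exact (case- and whitespace-sensitive) match on Column A, and that blank rows can be dropped.

```python
import csv


def remove_known_rows(main_path, candidate_path, output_path):
    """Copy rows from candidate_path to output_path, skipping any row whose
    first column (Column A) also appears in the first column of main_path.
    Returns the number of rows written."""
    # Load every Column A value from the main file into a set, so each
    # lookup is fast even with several hundred thousand entries.
    with open(main_path, newline="", encoding="utf-8") as f:
        known = {row[0] for row in csv.reader(f) if row}

    kept = 0
    with open(candidate_path, newline="", encoding="utf-8") as src, \
            open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Assumption: blank rows are skipped; all other columns of a
            # kept row are written out unchanged.
            if row and row[0] not in known:
                writer.writerow(row)
                kept += 1
    return kept
```

Calling remove_known_rows("Input1.csv", "Input2.csv", "Output.csv") would write the results file and return the number of rows kept. If the files do have a header row, it would need separate handling, since a header that appears in both files would be treated as a match and removed.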

That’s it. I know this can be done with an Excel or Calc macro, but since I’ll often be using files with more than about 65,000 rows (older versions of Excel and Calc are limited to 65,536 rows), I’d prefer a small custom application that lets me do this quickly on my computer.
