How to merge CSV files in Java_问答_开发者_运维开发者技术经验分享

My first CSV file looks like this with header included (header is included only at the top not after every entry):

NAME,SURNAME,AGE
Fred,Krueger,Unknown
.... n records

My second file might look like this:

NAME,MIDDLENAME,SURNAME,AGE
Jason,Noname,Scarry,16
.... n records with this header template

The merged file should look like this:

NAME,SURNAME,AGE,MIDDLENAME
Fred,Krueger,Unknown,
Jason,Scarry,16,Noname
....

Basically开发者_StackOverflow if headers don't match, all new header titles (columns) should be added after original header and their values according to that order.

Update

Above CSV were made smaller so I can illustrate what I want to achieve, in reality CSV files are generated one step before this (merge) and can be up to 100 columns

How can I do this?

I'd create a model for the 'bigger' format (a simple class with four fields and a collection for instances of this class) and implemented two parsers, one for the first, one for the second model. Create records for all rows of both csv files and implement a writer to output the csv in the correct format. IN brief:

 public void convert(File output, File...input) {

   List<Record> records = new ArrayList<Record>();
   for (File file:input) {
     if (input.isThreeColumnFormat()) {
        records.addAll(ThreeColumnFormatParser.parse(file));
     } else {
        records.addAll(FourColumnFormatParser.parse(file));
     }
   }
   CsvWriter.write(output, records);
 }

From your comment I see, that you a lot of different csv formats with some common columns.

You could define the model for any row in the various csv files like this:

public class Record {
  Object id; // some sort of unique identifier
  Map<String, String> values; // all key/values of a single row
  public Record(Object id) {this.id=id;}
  public void put(String key, String value){
    values.put(key, value);
  }
  public void get(String key) {
    values.get(key);
  }
}

For parsing any file you would first read the header and add the column headers to a global keystore (will be needed later on for outputting), then create records for all rows, like:

//...
List<Record> records = new ArrayList<Record>()

for (File file:getAllFiles()) {
  List<String> keys = getColumnsHeaders(file);
  KeyStore.addAll(keys);  // the store is a Set
  for (String line:file.getLines()) {
    String[] values = line.split(DELIMITER);
    Record record = new Record(file.getName()+i);  // as an example for id
    for (int i = 0; i < values.length; i++) {
      record.put(keys.get(i), values[i]);
    }
    records.add(record);
  }
}
// ...

Now the keystore has all used column header names and we can iterate over the collection of all records, get all values for all keys (and get null if the file for this record didn't use the key), assemble the csv lines and write everything to a new file.

Read in the header of the first file and create a list of the column names. Now read the header of the second file and add any column names that don't exist already in the list to the end of the list. Now you have your columns in the order that you want and you can write this to the new file first.

Next I would parse each file and for each row I would create a Map of column name to value. Once the row is parsed you could then iterate over the new list of column names and pull the values from the map and write them immediately to the new file. If the value is null don't print anything (just a comma, if required).

There might be more efficient solutions available, but I think this meets the requirements you set out.

Try this:

http://ondra.zizka.cz/stranky/programovani/ruzne/querying-transforming-csv-using-sql.texy

crunch input.csv output.csv "SELECT AVG(duration) AS durAvg FROM (SELECT * FROM indata ORDER BY duration LIMIT 2 OFFSET 6)"

How to merge CSV files in Java

Update

精彩评论

关注公众号

热门标签

图文推荐

How to merge CSV files in Java

Update

更多 问答 相关资讯：

精彩评论

关注公众号

热门标签

图文推荐

更多问答相关资讯：