Need an algorithm to group several parameters of a person under the person's name

Developer (https://www.devze.com), 2022-12-30 03:59. Source: Internet

I have a bunch of names in alphabetical order, with multiple instances of the same name, so that identical names are all grouped together. Beside each name, after a comma, I have a role that has been assigned to that person, one name-role pair per line, something like what is shown below:

name1,role1

name1,role2

name1,role3

name1,role8

name2,role8

name2,role2

name2,role4

name3,role1

name4,role5

name4,role1

...

..

.

I am looking for an algorithm that takes the above .csv file as input and creates an output .csv file in the following format:

name1,role1,role2,role3,role8

name2,role8,role2,role4

name3,role1

name4,role5,role1

...

..

.

So basically I want each name to appear only once, followed by all of that name's roles in CSV format, for every name and role in the input file.

The algorithm should be language independent. I would appreciate it if it does NOT use OOP principles :-) I am a newbie.


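This is only a sketch, but it will get you started. It assumes the names are already grouped together, as in your sample, and that readName(), readRole() and reader stand in for whatever input routines your language provides.

var lastName = null;

while (reader.isReady()) {            // check for input before reading
  var name = readName();
  var role = readRole();
  if (lastName != name) {
    if (lastName != null) print("\n"); // finish the previous name's line
    print(name);                       // start a new line with the new name
    lastName = name;
  }
  print("," + role);                   // comma goes before each role, so there is no trailing comma
}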


This is easy to do if your language has associative arrays: arrays that can be indexed by anything (such as a string) rather than just numbers. Some languages call them "hashes," "maps," or "dictionaries."

On the other hand, if you can guarantee that the names are grouped together as in your sample data, Stefan's solution works quite well.
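
A minimal sketch of the associative-array approach in Python, where the associative array is a `dict`. The function name `group_roles` and the sample lines are made up for illustration. Because each name is looked up by key, the input does not have to be sorted or grouped.

```python
def group_roles(lines):
    """Collect roles per name from "name,role" lines, in any order."""
    roles_by_name = {}  # name -> list of roles, in order of appearance
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, role = line.split(",", 1)
        roles_by_name.setdefault(name, []).append(role)
    # dicts keep insertion order (Python 3.7+), so names come out
    # in the order they were first seen
    return [",".join([name] + roles) for name, roles in roles_by_name.items()]


sample = ["name1,role1", "name3,role1", "name2,role8", "name1,role2"]
for out_line in group_roles(sample):
    print(out_line)
```

The trade-off is memory: every name and role is held in the dictionary until the end, whereas the grouped-input approach only needs to remember the previous name.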


It's kind of a pity you said it had to be language-agnostic because Python is rather well-qualified for this:

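import itertools

def split(s):
    # "name,role\n" -> ["name", "role"]
    return s.strip().split(',', 1)

filename = 'input.csv'  # assumed path to the input file
with open(filename, 'r') as f:
    # groupby collects consecutive lines that share the same name
    for name, lines in itertools.groupby(f, lambda s: split(s)[0]):
        print(name + ',' + ','.join(split(s)[1] for s in lines))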

Basically the groupby call takes all consecutive lines with the same name and groups them together.

Now that I think about it, though, Stefan's answer is probably more efficient.
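
To make the "consecutive lines" behaviour of `groupby` concrete, here is a small sketch. The helper name `group_consecutive` and the sample lists are illustrative only. With grouped input each name comes out once; if the same name shows up again later in the input, `groupby` starts a new group for it, so this approach relies on the input being sorted or grouped.

```python
import itertools


def group_consecutive(lines):
    """Merge runs of adjacent "name,role" lines that share a name."""
    pairs = (line.strip().split(",", 1) for line in lines)
    return [
        ",".join([name] + [role for _, role in run])
        for name, run in itertools.groupby(pairs, key=lambda pair: pair[0])
    ]


grouped = ["name1,role1", "name1,role2", "name2,role8"]
print(group_consecutive(grouped))

# name1 appears in two separate runs, so it is emitted twice
ungrouped = ["name1,role1", "name2,role8", "name1,role2"]
print(group_consecutive(ungrouped))
```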


Here is a solution in Java:

// needs: import java.io.File; import java.util.*;
Scanner sc = new Scanner(new File(fileName));
// LinkedHashMap keeps names in the order they first appear; a plain HashMap gives no ordering guarantee
Map<String, List<String>> nameRoles = new LinkedHashMap<String, List<String>>();
while (sc.hasNextLine()) {
  String line = sc.nextLine();
  String[] parts = line.split(",", 2); // parts[0] is the name, parts[1] the role
  if (nameRoles.containsKey(parts[0])) {
    nameRoles.get(parts[0]).add(parts[1]);
  } else {
    List<String> roles = new ArrayList<String>();
    roles.add(parts[1]);
    nameRoles.put(parts[0], roles);
  }
}
sc.close();

// then print it out, one line per name, without a trailing comma
for (String name : nameRoles.keySet()) {
  System.out.println(name + "," + String.join(",", nameRoles.get(name)));
}

With this approach, the input does not need to be sorted or grouped. It also works with input in arbitrary order, such as:

name1,role1

name3,role1

name2,role8

name1,role2

name2,role2

name4,role5

name4,role1


Here it is in C# using nothing fancy. It should be self-explanatory:

// Requires: using System; using System.IO;
static void Main(string[] args)
{
    using (StreamReader file = new StreamReader("input.txt"))
    {
        string prevName = "";
        while (!file.EndOfStream)
        {
            string line = file.ReadLine(); // read a line

            string[] tokens = line.Split(','); // split the name and the parameter

            string name = tokens[0]; // this is the name
            string param = tokens[1]; // this is the parameter

            if (name == prevName) // if the name is the same as the previous name we read, we add the current param to that name. This works right because the names are sorted.
            {
                Console.Write(param + " ");
            }
            else // otherwise, we are definitely done with the previous name, and have printed all of its parameters (due to the sorting).
            {
                if (prevName != "") // make sure we don't print an extra newline the first time around
                {
                    Console.WriteLine();
                }
                Console.Write(name + ": " + param + " "); // write the name followed by the first parameter. The output format can easily be tweaked to print commas.
                prevName = name; // store the new name as the previous name.
            }
        }
    }
}