开发者

Data transformation

开发者 https://www.devze.com 2023-01-02 20:08 出处:网络
I have data composed of a list of employers and a list of workers.Each has a many-to-many relationship with the other (so an employer can have many workers, and a worker can have many employers).

I have data composed of a list of employers and a list of workers. Each has a many-to-many relationship with the other (so an employer can have many workers, and a worker can have many employers).

The way the data is retrieved (and given to me) is as follows: each employer has an array of workers. In other words:

employer n has:
  worker x, worker y etc.

So I have a bunch of employer objects each containing an array of workers.

I need to transform this data (and basically invert the relationship). I need to have a bunch of worker o开发者_开发知识库bjects, each containing an array of employers. In other words:

worker x has:
  employer n1, employer n2 etc.

The context is hypothetical so please don't comment on why I need this or why I am doing it this way. I would really just like help on the algorithm to perform this transformation (there isn't that much data so I would prefer readability over complex optimizations which reduce complexity). (Oh and I am using Java, but pseudocode would be fine).


You want a Map<Worker,Set<Employer>> for this, or possibily a Multimap<Worker,Employer> from Guava.

In pseudocode:

initialize employerMap(worker => set of employers)
for every employer n do
    for every worker x in n.workers do
         employerMap[x].add(n)

Essentially a Worker can be mapped to multiple Employer, so you either:

  • Have a Map<Worker,Set<Employer>>
    • each key in a Map can only have one value, so the value in this case is a Set of values
  • Have a Multimap<Worker,Employer>
    • Multimap can map a key to multiple values

Example

Here's an example of doing the mapping. Note that storing the data in Object[][] arr like this is definitely not recommended, and is only used as part of the example. Concentrate on the actual mapping part.

import java.util.*;
public class MultiMapExample {
    public static void main(String[] args) {
        Object[][] arr = {
            { "IBM",    new String[] { "Joe", "Jack", "Carol"   }},
            { "MS",     new String[] { "Jack", "Andy", "Carol" }},
            { "Google", new String[] { "Bob", "Alice", "Carol"  }},
        };
        Map<String,Set<String>> employerMap =
            new HashMap<String,Set<String>>();

        for (Object[] data : arr) {
            String employer = (String) data[0];
            String[] workers = (String[]) data[1];
            for (String worker : workers) {
                Set<String> employers = employerMap.get(worker);
                if (employers == null) {
                    employerMap.put(worker, employers = new HashSet<String>());
                }
                employers.add(employer);
            }
        }

        for (String worker : employerMap.keySet()) {
            System.out.println(worker + " works for " + employerMap.get(worker));
        }
    }
}

This prints (order may vary):

Alice works for [Google]
Jack works for [IBM, MS]
Bob works for [Google]
Andy works for [MS]
Carol works for [IBM, Google, MS]
Joe works for [IBM]

I recommend keeping the data in a Map like this, but if you must transform the list of employees to an array for some reason, you can use Collection.toArray(T[]).

You should generally prefer List and other Java Collections Framework classes to arrays, though.


On implementing equals, hashCode, etc

The above example uses String for simplicity. You probably should have an actual Worker and Employer types instead, which is a good thing. You have to make sure that they implement equals and hashCode properly, though.

See also

  • Effective Java 2nd Edition
    • Item 8: Obey the general contract when overriding equals
    • Item 9: Always override hashcode when you override equals

API links

  • Object.equals(Object)
  • Object.hashCode()
  • java.lang.Comparable - not needed here, but another important Java API contract

Related questions

On equals/hashCode combo:

  • Overriding equals and hashcode in Java
  • Why both hashCode() and equals() exist
  • How to ensure hashCode() is consistent with equals()?

On equals vs ==:

  • Difference between equals and ==
  • why equals() method when we have == operator?
  • Java String.equals versus ==


Not java-specific, but here's the approach:

  • make a dictionary of <Worker, Set<Employer>>,
  • iterate through all the workers of each employer, and
  • for every worker, add the current employer to the Dict[Worker] set

You'll need to add a check to create the set the first time each worker is found.

0

精彩评论

暂无评论...
验证码 换一张
取 消