开发者

Is there a suitable data structure for solving this question?

开发者 https://www.devze.com 2023-03-29 13:43 出处:网络
I have four group of data: //group 1 2 2 6 2 2 7 2 3 5 2 3 6 2 3 7 3 2 5 3 2 6 3 2 7 3 3 4 3 3 5 3 3 6 3 3 7 ... ...

I have four group of data:

//group 1
2 2 6
2 2 7
2 3 5
2 3 6
2 3 7

3 2 5
3 2 6
3 2 7
3 3 4
3 3 5
3 3 6
3 3 7
...
...
7 2 2
7 2 3
开发者_高级运维7 2 5
7 2 7
7 3 2
7 5 2
7 6 2

//group 2
2 2 2
2 2 3
2 2 4
2 2 5
2 3 2
2 3 3

3 3 2
3 3 3
3 4 2
...
...

5 2 2

//group 3
2 4
2 5

3 3
3 4
3 5
3 6
...
...
7 2

//group 4
6
7
8

And what I want to do is for given input number(s), give all the possible results. A example might help to explain what I want to do: Say input is 7, then the output should be the following:

from group 1
7 2 2
7 2 3
7 2 5
7 2 7
7 3 2
7 5 2
7 6 2

from group 2
//nothing

from group 3
7 2

from group 4
7

Then I add second input 2 (so the total input is 7 2), then the result should be

from group 1
7 2 2
7 2 3
7 2 5
7 2 7

from group 2
//nothing

from group 3
7 2

from group 4
//nothing

Then I add a 3rd input 5 (so the total input is 7 2 5), then the result should be

from group 1
7 2 5

from group 2
//nothing

from group 3
//nothing

from group 4
//nothing

It appears to be I need a forest(several trees) for this, correct ? If so , is there any good c++ tree implementation of forest for this task or I better hand made one myself ?

Many Thanks


Something like to hold the data

std::set<std::vector<int> > data;

Now you can create one of these for each group if there is no guarantee that the number of items in each group is the same, or if you know what each group is a specific number of items, then put them all in the same set.

Then use std::find_if with a custom predicate with the above data. And in this predicate have a std::vector which is the sequence you are looking for.

struct search_sequence
{
  bool operator()(std::vector<int> const& cVal) const
  {
    if (sequence.size() <= cVal.size())
      return std::equal(sequence.begin(), sequence.end(), cVal.begin());
    return false;
  }

  std::vector<int> sequence;
};

Now applying this with std::find_if will find all sequences in data which start with the search sequence.

EDIT: To store in the single instance, wrap the vector, e.g.

struct group_entry
{
  int id;
  std::vector<int> data;

  friend bool operator<(group_entry const& lhs, group_entry const& rhs)
  {
    return lhs.id < rhs.id && lhs.data < rhs.data;
  }
};

Now your set contains

std::set<group_entry> data;

Add all the data from all the groups

Modify the predicate:

struct search_sequence
{
  bool operator()(group_entry const& cVal) const
  {
    if (sequence.size() <= cVal.data.size())
      return std::equal(sequence.begin(), sequence.end(), cVal.data.begin());
    return false;
  }

  std::vector<int> sequence;
};


A "forest of trees" in c++ terms would be:

vector<set<string> >

Where the strings in the set are "2 2 6", "2 2 7", etc.

Assuming you only want to use prefixes, and all numbers are one digit (or zero justified to the same width), you could implement the algorithm with set::lower_bound and set::upper_bound on the prefix you want (in the example, first "7" then "7 2", etc).

Example:

void PrintResults(const vector<set<string> >& input, const string& prefix) {
  for (int i = 0, end(input.size()); i < end; ++i) {
    cout << "from group " << i + 1 << endl;
    const set<string>& group_set = input[i];
    set<string>::const_iterator low(group_set.lower_bound(prefix)), high(group_set.upper_bound(prefix));
    if (low == high) {
      cout << "//nothing" << endl;
    } else {
      for (; low != high; ++low) {
        cout << *low << endl;
      }
    }
  }
}

You could use a vector of tries for this, but there isn't a std library version.


Since the depth appears to be fixed

std::map<int, std::map<int, std::set<int> > >

would do the job. Not sure it's worth it given the number of data items though.


Here is the sketch of a possible solution:

class Group
{
    int id;

    std::vector<int> values;
}

You use this class to store both an entire group (the initial data), and the result of a query (content of a group after applying some filter).

A tree is constructed using nodes and edges; each node uses a vector of Group to store the result for that node.

class Node;

typedef Node *NodePtr;

class Edge
{
    NodePtr target;

    int value;
};

class Node
{
    // Results for each group. Maybe empty for certain groups.
    // Contains all elements for all groups in the root node.
    std::vector<Group> results;

    std::vector<Edge> children;
};

When you search, you start at the root. To match, e.g. 7 2, you look for a child of the root that is reached by traversing an edge with value == 7. Then you look at the target node of that edge, and you look for an edge with value == 2. When you reach the final node of the path, you have the result in the results vector.

Update Of course you also have the problem of constructing such a tree. You can do it using a recursive algorithm.

You start with the root node containing all the groups and all the lists. Then, for each first element of a list, you add an edge with a corresponding node and the corresponding result set.

You repeat the above step for each child node, this time looking at the second elements in the lists. And so on until you cannot extend the tree any more.


As other posters have said, you want a prefix tree. Here's a simple sample to get you started, assuming the only characters are 0-7. Note that I put in very little safety, and number assumes that strings given to it are all followed by spaces (EVEN THE LAST ONE), and results are returned the same way (it's easier). For real code, more safety should be involved. Also, code is uncompiled/untested, so probably has errors.

class number { //create a prefix node type
    number& operator=(const number& b); //UNDEFINED, NO COPY
    int endgroup;  //if this node is the end of a string, this says which group
    number* next[8];  // pointers to nodes of the next letter
public:
    number() :group(-1) { //constructor
        for(int i=0; i<8; ++i)
            next[i] = nullptr;
    }
    ~number() { // destructor
        for(int i=0; i<8; ++i)
            delete next[i];
    }
    void add(char* numbers, int group) { //add a string to the tree for a group
        if(next[numbers[0] == '\0') //if the string is completely used, this is an end
            endgroup = group;
        else {
            int index = numbers[0]-'0'; //otherwise, get next letter's node
            if (next[index] == nullptr)
                next[index] = new number; //and go there
            next[index].add(numbers+2, group); //+2 for the space
        }
    }
    void find(char* numbers, 
        std::vector<std::pair<int, std::string>>& out, 
        std::string sofar="") 
    { //find all strings that match
        if(numbers[0]) { //if there's more letters 
            sofar.append(numbers[0]).append(' '); //keep track of "result" thus far
            int index = numbers[0]-'0'; //find next letter's node
            if (next[index] == nullptr)
                return; //no strings match
            next[index].find(numbers+2, out, sofar); //go to next letter's node
        } else { //if there's no more letters, return everything!
            if (endgroup > -1) //if this is an endpoint, put it in results
                out.push_back(std::pair<int, std::string>(endgroup, sofar));
            for(int i=0; i<8; ++i) { //otherwise, try all subsequent letter combinations
                if (next[i]) {
                    std::string try(sofar);  //keep track of "result" thus far
                    try.append('0'+i).append(' ');
                    next[i].find(numbers, out, try); //try this letter
                }
            }
        }
    }
} root; //this is your handle to the tree

int main() {
    //group one
    root.add("2 2 6", 1);
    root.add("2 2 7", 1);
    //...
    //group two
    root.add("2 2 2", 2);
    //...

    std::string pattern;
    char digit;
    while(true) {
        std::cin >> digit;
        if (digit<'0' || digit > '7')
            break;
        pattern.append(digit).append(' ');
        std::vector<std::pair<int, std::string>> results;
        root.find(pattern.c_str(), results);
        for(int g=1; g<4; ++g) {
            std::cout << "group " << g << "\n";
            for(int i=0; i<results.size(); ++i) {
                if( results.first == g)
                    std::cout << results.second;
            }
        } 
    }
}


Arrays of strings will do the trick. Each line is a string. You might decorate the lines with separators to make the search easier, like "/7/2/2/" instead of "7 2 2" so you can search for "/2".

My guess is your professor wants you to use a more complex data structure.

0

精彩评论

暂无评论...
验证码 换一张
取 消