Counting clusters in a hashset in C

Developer https://www.devze.com 2023-02-14 11:36 Source: Internet
I'm making a hashset ADT in C for a homework assignment. I cannot figure out for the life of me why the logic in my function that counts clusters in a hashset isn't working.

 void printClusterStats (hashset_ref hashset) {
   int **clusters = (int**)calloc (hashset->length, sizeof(int));
   assert (clusters);
   int ct = 0;
   // i traverses hashset->array
   // ct adds up words in each cluster
   // this loop screws up vvv
   for ( int i = 0; i < hashset->length; ++i) {
      if (hashset->array[i] == NULL) {
         clusters[ct] += 1;
         ct = 0;
      }else {
        ct += 1; 
      }
   }
   clusters[ct] +=1;  //catch an ending cluster

   printf("%10d words in the hash set\n", hashset->load);
   printf("%10d length of the hash array\n", hashset->length);
   for ( int i = 0; i < hashset->length; i++){
      if (clusters[i] == 0) continue;
      else{
         printf("%10d clusters of size %3d\n", clusters[i], i);
      }
   }
   free(clusters);
}

The output of this function looks like:

        26 words in the hash set
        63 length of the hash array
        96 clusters of size   0
        32 clusters of size   1
        16 clusters of size   2
         4 clusters of size   4
         4 clusters of size   6
       305 clusters of size  33
-703256008 clusters of size  34
-703256008 clusters of size  35

The first two lines are right: my input hashset has 26 words in an array of length 63. The cluster counts, however, are wrong.

EDIT: I've counted the clusters manually and discovered every count is 4 times what it should be. What does that mean?


This line gives you memory that you then treat as an array of pointers to int:

int **clusters = (int**)calloc (hashset->length, sizeof(int));

rather than the array of int you actually want for storing cluster counts:

int *clusters = (int*)calloc (hashset->length, sizeof(int));   

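Consequently, clusters[ct] += 1; is treated as pointer arithmetic. Each element is an int*, so adding 1 advances it by sizeof(int) bytes, which is 4 on your system. That is why every count comes out 4 times too large. If pointers are wider than int on your machine (for example 8 bytes on a 64-bit system), the buffer is also too small to hold hashset->length pointers, so the later entries are read past the end of the allocation, which would explain the garbage values at the end of your output.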
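
To see where the factor comes from, here is a minimal sketch contrasting the two cases. The helper names int_pointer_step and int_counter_step are made up for this illustration and are not part of the question's code. The first function adds 1 to an int* and reports how many bytes the pointer moved; the second adds 1 to a plain int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns how many bytes an int* moves
   when 1 is added to it. */
size_t int_pointer_step(void) {
    int cells[2] = {0, 0};
    int *p = cells;              /* points at cells[0] */
    p += 1;                      /* pointer arithmetic: now points at cells[1] */
    assert(p == &cells[1]);
    return (size_t)((char *)p - (char *)cells);
}

/* Hypothetical helper for contrast: += 1 on a plain int adds exactly 1. */
int int_counter_step(void) {
    int count = 0;
    count += 1;
    return count;
}
```

On a platform where sizeof(int) is 4, int_pointer_step() returns 4 while int_counter_step() returns 1, which matches the "4 times too many" pattern in the question.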


int **clusters = (int**)calloc (hashset->length, sizeof(int));

Should be

int *clusters = (int*)calloc (hashset->length, sizeof(int));

I'm not very competent at C yet, so I can't explain why this solved my problem. But there you go if you were curious!

Here's the correct output:

26 words in the hash set
63 length of the hash array
24 clusters of size   0
 8 clusters of size   1
 4 clusters of size   2
 1 clusters of size   4
 1 clusters of size   6
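
For reference, here is a sketch of the counting step on its own with the int* fix applied. It is not the original function: count_clusters is a hypothetical name, and it takes a plain array of string pointers plus a length as a stand-in for hashset->array and hashset->length, since the hashset type is not shown in the question. It also allocates length + 1 counters, an assumption added here so that a completely full table (one cluster of size length) still has a valid index.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the counting loop in printClusterStats.
   Returns a histogram where result[n] is the number of clusters of size n,
   or NULL if allocation fails. The caller frees the result. */
int *count_clusters(char *const *array, size_t length) {
    assert(array != NULL);
    /* length + 1 counters: cluster sizes range from 0 to length inclusive. */
    int *clusters = calloc(length + 1, sizeof *clusters);
    if (clusters == NULL) return NULL;

    size_t run = 0;                  /* words seen in the current cluster */
    for (size_t i = 0; i < length; ++i) {
        if (array[i] == NULL) {
            clusters[run] += 1;      /* an empty slot closes the current run */
            run = 0;
        } else {
            run += 1;
        }
    }
    clusters[run] += 1;              /* count the run that reaches the end */
    return clusters;
}
```

As in the original loop, every empty slot closes a run, so consecutive empty slots are reported as clusters of size 0.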
