I have a list of names with counts for males and females. It looks like this:
ABEL 32898 82
CALLAN 1087 868
What is the best way in Pig to count up the total number of males and the total number of females?
Have a look at the GROUP ALL operation, which puts every row into a single group so that you can aggregate over the whole relation:
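-- Load the file with a schema so the count columns are typed as integers
data = LOAD 'data' AS (name:CHARARRAY, males_count:INT, females_count:INT);
-- Collect all rows into one group
data_all = GROUP data ALL;
-- Sum each count column over that single group
counts = FOREACH data_all GENERATE SUM(data.males_count) AS tot_males, SUM(data.females_count) AS tot_females;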
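One thing to watch: LOAD without a USING clause relies on PigStorage with a tab delimiter. If your file is really space-separated, as the sample rows suggest (this is an assumption about your data), a variant along these lines may be closer to what you need. The file name 'data' is taken from the snippet above, and DUMP is only there to print the result:

```pig
-- Assumption: fields are separated by a single space rather than a tab
data = LOAD 'data' USING PigStorage(' ') AS (name:chararray, males_count:int, females_count:int);

-- Same aggregation as above: one group holding every row, then a sum per column
data_all = GROUP data ALL;
counts = FOREACH data_all GENERATE SUM(data.males_count) AS tot_males, SUM(data.females_count) AS tot_females;

-- Print the single result tuple; with only the two sample rows (ABEL, CALLAN)
-- as input, this would be (33985,950)
DUMP counts;
```

If the delimiter does not match the file, the whole line ends up in the first field and the count columns come through as null, so the sums will be empty rather than raising an error.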