GC overhead limit exceeded error when reading a text file

I am getting a java.lang.OutOfMemoryError: GC overhead limit exceeded error when reading from a text file. I am not sure what is going wrong. I am running my program on a cluster with sufficient memory. The outer loop iterates 16,000 times, and for each iteration of the outer loop the inner loop iterates about 300,000 times. The error is thrown when the code tries to read a line in the inner loop. Any suggestions will be greatly appreciated. The following is my code snippet:

//Read from the test data output file until readLine() returns null
//Reads a single line at a time from the test data
while((line=br.readLine())!=null)
{
    //Clears the hashmap
    leastFive.clear();

    //Clears the arraylist
    fiveTrainURLs.clear();
    try
    {
        StringTokenizer st=new StringTokenizer(line," ");
        while(st.hasMoreTokens())
        {
            String currentToken=st.nextToken();

            if(currentToken.contains("File"))
            {
                testDataFileNo=st.nextToken();
                String tok="";
                while((tok=st.nextToken())!=null)
                {
                    if (tok==null) break;

                    int topic_no=Integer.parseInt(tok);
                    topic_no=Integer.parseInt(tok);
                    String prob=st.nextToken();

                    //Obtains the double value of the probability
                    double double_prob=Double.parseDouble(prob);
                    p1[topic_no]=double_prob;

                }
                break;
            }
        }
    }
    catch(Exception e)
    {
    }

    //Used to read over the entire training data file
    FileReader fr1=new FileReader("/homes/output_train_2000.txt");

    BufferedReader br1=new BufferedReader(fr1);
    String line1="";

    //Reads the training data output file, one row at a time
    //This is the line on which an exception occurs!
    while((line1=br1.readLine())!=null)
    {
        try
        {
            StringTokenizer st=new StringTokenizer(line1," ");

            while(st.hasMoreTokens())
            {
                String currentToken=st.nextToken();

                if(currentToken.contains("File"))
                {
                    trainDataFileNo=st.nextToken();
                    String tok="";
                    while((tok=st.nextToken())!=null)
                    {
                        if(tok==null)
                            break;

                        int topic_no=Integer.parseInt(tok);
                        topic_no=Integer.parseInt(tok);
                        String prob=st.nextToken();

                        double double_prob=Double.parseDouble(prob);

                        //p2 will contain the probability values of each of the topics based on the indices
                        p2[topic_no]=double_prob;

                    }
                    break;
                }
            }
        }
        catch(Exception e)
        {
            double result=klDivergence(p1,p2);

            leastFive.put(trainDataFileNo,result);
        }
    }
}


16,000 * 300,000 = 4.8 billion. If each token takes up only 6 bytes, that alone is over 24 GB. The garbage collector will run for a very long time once it finally starts collecting with that much in play. It seems like you need to break this up into smaller chunks. You could also limit your app's memory to something reasonable, like 1 GB, so that the GC kicks in sooner and can get something done in the time it has to do its work.
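One concrete way to cut the garbage: the inner loop re-opens and re-tokenizes /homes/output_train_2000.txt on every one of the 16,000 outer iterations, and br1 is never closed. Below is a minimal sketch (hypothetical class and constant names) that parses the training file once into a map up front, so the outer loop only does arithmetic. It assumes each relevant line looks like "File <fileNo> <topic> <prob> <topic> <prob> ...", as your tokenizing suggests, and that the parsed vectors fit comfortably in the heap.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class TrainVectorCache
{
    //Hypothetical constant: set this to the real number of topics (the size of p1/p2)
    static final int NUM_TOPICS = 2000;

    //Parses the training output file once into fileNo -> probability vector,
    //so the outer loop never has to re-read or re-tokenize the file
    static Map<String,double[]> loadTrainVectors(String path) throws IOException
    {
        Map<String,double[]> vectors=new LinkedHashMap<>();

        //try-with-resources closes the reader even on an exception;
        //the original snippet never closes br1, leaking a buffer per outer iteration
        try(BufferedReader br=new BufferedReader(new FileReader(path)))
        {
            String line;
            while((line=br.readLine())!=null)
            {
                String[] tokens=line.trim().split("\\s+");

                //Find the token containing "File", as the original loop does
                int idx=-1;
                for(int i=0;i<tokens.length;i++)
                {
                    if(tokens[i].contains("File")) { idx=i; break; }
                }
                if(idx<0||idx+1>=tokens.length) continue;

                String fileNo=tokens[idx+1];
                double[] p=new double[NUM_TOPICS];

                //Remaining tokens alternate: topic number, then its probability
                for(int i=idx+2;i+1<tokens.length;i+=2)
                {
                    p[Integer.parseInt(tokens[i])]=Double.parseDouble(tokens[i+1]);
                }
                vectors.put(fileNo,p);
            }
        }
        return vectors;
    }
}

Load the map once before the outer loop, then the whole inner read loop collapses to something like:

for(Map.Entry<String,double[]> e:trainVectors.entrySet())
{
    leastFive.put(e.getKey(),klDivergence(p1,e.getValue()));
}

If the vectors turn out to be too numerous to cache, at minimum wrap each per-iteration reader in try-with-resources so its buffer is reclaimed promptly, and cap the heap with -Xmx (e.g. -Xmx1g) so the collector engages sooner, as suggested above.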
