
Copying a Java text file into a String


I run into the following error when I try to store a large file into a String.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
    at java.lang.StringBuffer.append(StringBuffer.java:306)
    at rdr2str.ReaderToString.main(ReaderToString.java:52)

As is evident, I am running out of heap space. Basically, my program looks something like this:

FileReader fr = new FileReader(<filepath>);
sb = new StringBuffer();
char[] b = new char[BLKSIZ];

while ((n = fr.read(b)) > 0) 
     sb.append(b, 0, n);    

fileString = sb.toString();

Can someone suggest why I am running into this heap space error? Thanks.


You are running out of memory because, the way you've written your program, it requires storing the entire, arbitrarily large file in memory. You have two options:

  • You can increase the memory by passing command line switches to the JVM:

    java -Xms<initial heap size> -Xmx<maximum heap size>
    
  • You can rewrite your logic so that it deals with the file data as it streams in, thereby keeping your program's memory footprint low.

I recommend the second option. It's more work, but it's the right way to go.
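For the second option, here is a minimal sketch of processing the file as it streams in, assuming a line-by-line pass is enough for your use case (the path bigfile.txt and the per-line work are placeholders):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class StreamingExample {
    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader("bigfile.txt")); // placeholder path
        try {
            String line;
            long lineCount = 0;
            while ((line = reader.readLine()) != null) {
                // Do your per-line processing here instead of accumulating everything into one String.
                lineCount++;
            }
            System.out.println("Lines processed: " + lineCount);
        } finally {
            reader.close(); // release the file handle even if processing fails
        }
    }
}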

EDIT: To determine your system's defaults for initial and max heap size, you can use this code snippet (which I stole from a JavaRanch thread):

public class HeapSize {
    public static void main(String[] args) {
        long kb = 1024;
        long heapSize = Runtime.getRuntime().totalMemory();
        long maxHeapSize = Runtime.getRuntime().maxMemory();
        System.out.println("Heap Size (KB): " + heapSize / kb);
        System.out.println("Max Heap Size (KB): " + maxHeapSize / kb);
    }
}


  • You allocate a small StringBuffer that has to grow over and over. Preallocate it according to the file size, and you will also be a LOT faster.

  • Note that Java uses Unicode (16-bit) characters internally, while your file most likely does not, so the text takes roughly twice the file's size in memory.

  • Depending on the VM (32-bit? 64-bit?) and the limits set (http://www.devx.com/tips/Tip/14688), you may simply not have enough memory available. How large is the file, actually?
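As a rough illustration of that last point, you can compare the file's size against the JVM's maximum heap before attempting the read. This is only a sketch: the factor of 2 assumes a single-byte file encoding being widened to 16-bit chars, and the path is a placeholder.

import java.io.File;

public class MemoryCheck {
    public static void main(String[] args) {
        File file = new File("bigfile.txt"); // placeholder path
        long maxHeap = Runtime.getRuntime().maxMemory();
        // ~2 bytes per char once 8-bit file data is held as Java chars,
        // plus similar headroom again for StringBuffer growth and toString() copies.
        long inMemoryEstimate = file.length() * 2;
        System.out.println("File size (bytes):         " + file.length());
        System.out.println("Estimated in-memory bytes: " + inMemoryEstimate);
        System.out.println("Max heap (bytes):          " + maxHeap);
        if (inMemoryEstimate * 2 > maxHeap) {
            System.out.println("Reading this file into one String will likely exhaust the heap.");
        }
    }
}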


As the OP shows, your program is aborting while the StringBuffer is being expanded. You should preallocate it to the size you need, or at least close to it. Whenever a StringBuffer must expand, it needs RAM for both the original capacity and the new capacity. As TomTom said, your file likely contains 8-bit characters, so it will be converted to 16-bit Unicode in memory and double in size.

The program has not even reached the next doubling yet: StringBuffer.toString() in Java 6 allocates a new String, and the internal char[] is copied again (in some earlier versions of Java this was not the case). At the moment of that copy you need double the heap space, so at that point you need at least 4 times your actual file size (30 MB * 2 for the byte-to-Unicode conversion, then 60 MB * 2 for the toString() call = 120 MB). Once the method finishes, GC will clean up the temporary objects.

If you cannot increase the heap space for your program, you will have some difficulty. You cannot take the "easy" route and just return a String. Instead, try to process the file incrementally so that you do not need to worry about its size (one of the best solutions).

Look at your web service code in the client. It may provide a way to use a class other than String - perhaps a java.io.Reader, java.lang.CharSequence, or a special interface like the SAX-related org.xml.sax.InputSource. Each of these can be used to build an implementation class that reads from your file in chunks as the caller needs it, instead of loading the whole file at once.

For instance, if your web service handling routines can take a CharSequence, then (if they are written well) you can create a special handler that returns just one character at a time from the file while buffering the input. See this similar question: How to deal with big strings and limited memory.
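As a sketch of what that kind of API change can look like, here is a hypothetical handler that accepts a Reader and consumes the file in buffered chunks rather than as one String (the handler name, chunk size, and path are all illustrative, not part of any real web service API):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class ChunkedHandlerExample {

    // Hypothetical handler: consumes character data chunk by chunk.
    static void handle(Reader source) throws IOException {
        char[] buffer = new char[8192];
        int read;
        while ((read = source.read(buffer)) > 0) {
            // Hand each chunk (buffer[0..read]) to the parser / web service here.
        }
    }

    public static void main(String[] args) throws IOException {
        Reader reader = new BufferedReader(new FileReader("bigfile.txt")); // placeholder path
        try {
            handle(reader);
        } finally {
            reader.close();
        }
    }
}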


Kris has the answer to your problem.

You could also look at Apache Commons IO's FileUtils.readFileToString, which may be a bit more efficient.
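If you do have Commons IO on the classpath, the call looks roughly like this (note that it still loads the entire file into a single String, so the heap limit discussed above still applies; the path is a placeholder):

import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;

public class CommonsIoExample {
    public static void main(String[] args) throws IOException {
        // Reads the whole file into memory in one call; -Xmx still has to be large enough.
        String contents = FileUtils.readFileToString(new File("bigfile.txt"), "UTF-8");
        System.out.println("Characters read: " + contents.length());
    }
}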


Although this might not solve your problem, some small things you can do to make your code a bit better:

  • create your StringBuffer with an initial capacity matching the size of the file you are reading
  • close your FileReader when you are done: fr.close();
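Put together, those two suggestions might look roughly like this (a sketch; the cast assumes the file length fits in an int, and the path is a placeholder):

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class PreallocatedRead {
    public static void main(String[] args) throws IOException {
        File file = new File("bigfile.txt"); // placeholder path
        // Preallocate so the buffer does not have to grow (and copy) repeatedly.
        StringBuffer sb = new StringBuffer((int) file.length());
        FileReader fr = new FileReader(file);
        try {
            char[] b = new char[8192];
            int n;
            while ((n = fr.read(b)) > 0) {
                sb.append(b, 0, n);
            }
        } finally {
            fr.close(); // always release the reader
        }
        String fileString = sb.toString();
        System.out.println("Characters read: " + fileString.length());
    }
}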


By default, Java starts with a very small maximum heap (64 MB on Windows, at least). Is it possible you are trying to read a file that is too large?

If so, you can increase the heap with the JVM parameter -Xmx256M (to set the maximum heap to 256 MB).

I tried running a slightly modified version of your code:

public static void main(String[] args) throws Exception {
    FileReader fr = new FileReader("<filepath>");
    StringBuffer sb = new StringBuffer();
    char[] b = new char[1000];
    int n = 0;
    while ((n = fr.read(b)) > 0) {
        sb.append(b, 0, n);
    }

    String fileString = sb.toString();
    System.out.println(fileString);
}

on a small file (2 KB) and it worked as expected. For your large file, you will need to set the JVM heap parameter as described above.


Trying to read an arbitrarily large file into main memory in an application is bad design. Period. No amount of JVM settings adjustments is going to fix the core issue here. I recommend that you take a break and do some reading about how to process streams in Java; there are good tutorials available to get you started.

