Analyzing memory with MAT - question about UTF characters

Developer https://www.devze.com 2023-02-12 17:56 Source: Internet

I have an .hprof file and I'm analyzing it with Eclipse Memory Analyzer (MAT).

I ran the Top Components report and, in the Duplicate Strings section, MAT detected some String instances with identical content.

I'm already working on String.intern() and other fixes on my side, but that is not my question here. The report shows duplicated Strings like these:

\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000....
\u000a\u0009\u0009
\u000a\u0009\u0009\u0009\u0009

And so on.

The other Strings are readable, but what about these? I think they come from XML parsing (I use JiBX in my app).

My questions are:

Where do you think these strings are coming from? How can I analyse them better?
If they come from XML parsing or something else, how can I clean them up after parsing? Is the JiBX 1.0.1 release perhaps too old and prone to these issues?

Any suggestions about these UTF-8-like Strings would be much appreciated. Thanks in advance.

You can right-click on the suspicious String and select List Objects/With Incoming References. This will show you the objects that reference your Strings.

It is interesting to see Strings with many \u0000 characters. That is very uncommon, since Strings in Java are not 0-terminated, so they were most likely created through a String(byte[]) constructor, or maybe a String(byte[], encoding) constructor, from byte arrays containing 0s.
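To illustrate how such Strings can arise, here is a minimal sketch. It assumes the common pattern of a fixed-size buffer that is only partly filled and then converted as a whole. The class name NulStringDemo and its helper methods are made up for this example and are not taken from the asker's code or from JiBX.

```java
import java.nio.charset.StandardCharsets;

public class NulStringDemo {

    // Render control and non-ASCII characters as backslash-u escapes,
    // similar to how the heap dump report displays them.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x20 || c > 0x7e) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Suspect pattern: decodes the whole buffer, including the unused zero bytes.
    public static String decodeWholeBuffer(byte[] buffer) {
        return new String(buffer, StandardCharsets.UTF_8);
    }

    // Safer pattern: decodes only the bytes that were actually filled.
    public static String decodeFilledPart(byte[] buffer, int length) {
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8]; // Java zero-fills new arrays
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(data, 0, buffer, 0, data.length);

        System.out.println(escape(decodeWholeBuffer(buffer)));
        System.out.println(escape(decodeFilledPart(buffer, data.length)));
    }
}
```

The first line printed is abc followed by five \u0000 escapes, the second is just abc. If the culprit turns out to be code like decodeWholeBuffer, passing the real length (for example the return value of InputStream.read) should make the padded Strings disappear.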

I would use a profiler and analyse the call graphs of these constructors. Then you will find the culprit.
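As for the other Strings in the question: \u000a is a line feed and \u0009 is a tab, so they look like the indentation between XML elements, which fits the XML parsing guess. The sketch below shows the effect with the JDK's own DOM parser, used here only as a stand-in (it is not JiBX, whose internals may differ). The class name WhitespaceTextDemo and the sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class WhitespaceTextDemo {

    // Collect the whitespace-only text nodes directly under the root element.
    public static List<String> whitespaceTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> result = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() == Node.TEXT_NODE && n.getNodeValue().trim().isEmpty()) {
                    result.add(n.getNodeValue());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Render control and non-ASCII characters as backslash-u escapes.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x20 || c > 0x7e) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two items, each indented with a newline and two tabs.
        String xml = "<root>\n\t\t<item>a</item>\n\t\t<item>b</item>\n</root>";
        for (String s : whitespaceTexts(xml)) {
            System.out.println(escape(s));
        }
    }
}
```

Without a DTD or schema the parser cannot tell that this whitespace is ignorable, so every indentation run is reported as character data. If the application keeps those values, each one is a separate String instance with the same content, which is what a duplicate-strings report would pick up. Skipping whitespace-only text while parsing, or sharing one instance for such values, would be possible remedies, assuming the incoming references in MAT confirm this is where they come from.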