boost with icu u32_regex memory leak / cache on Win32

Developer (https://www.devze.com), 2023-03-23 13:07. Source: web
When using the Boost regex class with the optional ICU support enabled (see the Boost documentation for details), I seem to get a memory leak, or rather some sort of memory caching, which I cannot reset or clean up.

Has anyone else seen this, and does anyone know a way of clearing the cache so that the Boost unit test framework does not report a memory leak?

The details for my problem are :-

ICU version 4.6.0
(Built using supplied vs2010 solution in debug and release configuration)
Boost version 1.45
(built with command "bjam variant=debug,release threading=multi link=shared stage" since standard distribution does not include icu support in regex)
OS Windows 7
Compiler MSVC 10 (Visual Studio 2010 Premium)

I also tried this with Boost 1.42 and ICU 4.2.1, which I happened to have built on my system, with the same results, so I don't think it's a problem that would be solved by moving to Boost 1.47 and ICU 4.8.1, which are the latest versions.

Compiling the following code (Test.cpp) :-

#define BOOST_TEST_MAIN    //Ask boost unit test framework to create a main for us
#define BOOST_ALL_DYN_LINK //Ask boost to link to dynamic library rather than purely header support where appropriate
#include <boost/test/auto_unit_test.hpp>

#include <boost/regex.hpp>
#include <boost/regex/icu.hpp> //We use icu extensions to regex to support unicode searches on utf-8
#include <unicode/uclean.h>    //We want to be able to clean up ICU cached objects

BOOST_AUTO_TEST_CASE( standard_regex ) 
{
    boost::regex re( "\\d{3}");
}

BOOST_AUTO_TEST_CASE( u32_regex ) 
{
    boost::u32regex re( boost::make_u32regex("\\d{3}"));
    u_cleanup(); //Ask the ICU library to clean up any cached memory
}

Which can be compiled from a command line by:-

C:\>cl test.cpp /I[BOOST HEADERS PATH] /I[ICU HEADERS] /EHsc /MDd -link /LIBPATH:[BOOST LIB PATH] [ICU LIB PATH]icuuc.lib

With the appropriate paths to headers / libs for your machine

Copy the appropriate boost dlls to the directory containing test.exe if they are not pathed in (boost_regex-vc100-mt-gd-1_45.dll and boost_unit_test_framework-vc100-mt-gd-1_45.dll)

When test.exe from above steps is run I get :-

Running 2 test cases...

*** No errors detected
Detected memory leaks!
Dumping objects ->
{789} normal block at 0x00410E88, 28 bytes long.
 Data: <    0N U        > 00 00 00 00 30 4E CD 55 00 00 00 00 01 00 00 00
{788} normal block at 0x00416350, 14 bytes long.
 Data: <icudt46l-coll > 69 63 75 64 74 34 36 6C 2D 63 6F 6C 6C 00
{787} normal block at 0x00415A58, 5 bytes long.
 Data: <root > 72 6F 6F 74 00
...lots of other blocks removed for clarity ...

I'm guessing that ICU is actually the culprit here, since its name appears at the start of the 2nd block.

Running just the 1st test (i.e. creating only a standard regex, not a u32regex) detects no memory leaks.

Adding multiple u32regex objects to the test does not result in more memory being leaked.

I attempted to clean up the ICU cache by calling u_cleanup(), as described in the "ICU Initialization and Termination" section of the ICU documentation.

However, I am not very familiar with the ICU library (I am only using it because we wanted Unicode-aware regex support) and can't see how to get the u_cleanup() call to actually clean up the data when ICU is being loaded by the Boost regex DLL.

Just to reiterate, the problem appears to be :-

Boost regex is in a DLL compiled with optional ICU support (I'm pretty sure this uses a static link to ICU, but I may be wrong here).

If I link to icuuc.lib in the test program so that I can call u_cleanup(), this doesn't appear to affect the memory held by the instance of ICU loaded via the Boost regex library (it would be rather odd if it did).

I can't find any calls in the regex library which allow me to ask it to clean up the ICU data, which is really where we want to make the call.


u_cleanup() is what cleans up the data; however, it can't clean up the data if any items are still open.

Can you try not calling any Boost function, but just calling u_cleanup(), and see if there are any leaks? Then try calling just u_init() followed by u_cleanup().

I'm not familiar enough with Boost to know whether the above code will clean up the regex, or whether Boost has any internal caching. The leaked objects don't look like the usual ICU data: if ICU's data were still open you would see quite a bit of data, not 14+5 bytes.
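
To make the "still open" point concrete: in the question's u32_regex test case, u_cleanup() is called while the local `re` object is still alive, because locals are only destroyed at the closing brace. The sketch below shows that ordering in plain C++ without ICU or Boost. `Item` and `cleanup()` are hypothetical stand-ins (the real u_cleanup() returns void), and whether a live boost::u32regex actually blocks ICU's cleanup is an assumption here, not something confirmed in this thread.

```cpp
#include <cassert>

namespace scope_sketch {

// Stand-in for the number of library objects that are still open.
int open_items = 0;

// Hypothetical stand-in for an object holding library data,
// such as the boost::u32regex in the question.
struct Item {
    Item() { ++open_items; }
    ~Item() { --open_items; }
    Item(const Item&) = delete;
    Item& operator=(const Item&) = delete;
};

// Hypothetical stand-in for u_cleanup(): returns true only when
// nothing is open, i.e. when everything could be released.
bool cleanup() { return open_items == 0; }

// Mirrors the test case in the question: cleanup runs while `re`
// is still alive, so one item is still open.
bool cleanup_inside_scope() {
    Item re;
    return cleanup();
}

// Here `re` is destroyed at the inner closing brace, before cleanup runs.
bool cleanup_after_scope() {
    {
        Item re;
    }
    return cleanup();
}

} // namespace scope_sketch
```

Applied to the question's test, this would mean wrapping the boost::u32regex in its own inner block and calling u_cleanup() after that block closes. That alone may not remove the reported leaks if the regex DLL keeps its own static data.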


Just thought that I may as well answer the question here, since I did solve this (with help from Boost users).

The problem is the order of teardown: if static objects in the Boost regex DLL are not destructed before the unit test framework, they will still be caching some data, and so the unit test framework (UTF) reports memory leaks. Simply calling u_cleanup() isn't sufficient.

The easiest way of ensuring the order is to link with the unit test framework as a static library. Its objects are then destructed after those of any DLLs, so it doesn't report the cached objects as a memory leak, since they have already been destructed.
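
The effect of teardown order can be sketched in plain C++ without Boost or ICU. In the sketch below, `Cache` is a hypothetical stand-in for data held by static objects in the regex DLL, `leak_check()` is a hypothetical stand-in for the leak report printed at shutdown, and `live_blocks` plays the role of the debug heap's allocation count. It only models the ordering described above, not how the CRT or the unit test framework actually implement it.

```cpp
#include <cassert>
#include <memory>

namespace teardown_sketch {

// Stand-in for the number of heap blocks currently allocated.
int live_blocks = 0;

// Hypothetical stand-in for data cached by static objects in a DLL.
struct Cache {
    Cache() { ++live_blocks; }
    ~Cache() { --live_blocks; }
    Cache(const Cache&) = delete;
    Cache& operator=(const Cache&) = delete;
};

// Hypothetical stand-in for the leak report: every block that is
// still allocated at this moment is counted as leaked.
int leak_check() { return live_blocks; }

// Returns how many blocks the check reports for a given teardown order.
int reported_leaks(bool check_before_cache_teardown) {
    std::unique_ptr<Cache> cache(new Cache);
    int reported = 0;
    if (check_before_cache_teardown) {
        reported = leak_check();  // cache still alive: counted as a leak
        cache.reset();            // freed afterwards, too late for the report
    } else {
        cache.reset();            // cache destroyed first
        reported = leak_check();  // nothing left to report
    }
    return reported;
}

} // namespace teardown_sketch
```

Calling `reported_leaks(true)` models the failing setup (the check runs first) and `reported_leaks(false)` models the fixed one. In terms of the question's Test.cpp, the fix presumably means no longer defining BOOST_ALL_DYN_LINK, which also pulls in the unit test framework as a DLL, and defining only BOOST_REGEX_DYN_LINK instead, with a static build of the unit test framework available (for example by adding link=static to the bjam command). That is a reading of the answer above, not something it spells out.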
