开发者

Understanding the efficiency of an std::string

开发者 https://www.devze.com 2023-02-28 02:43 出处:网络
I\'m trying to learn a little bit more about c++ strings. consider const char开发者_如何学C* cstring = \"hello\";

I'm trying to learn a little bit more about c++ strings.

consider

const char开发者_如何学C* cstring = "hello";
std::string string(cstring);

and

std::string string("hello");

Am I correct in assuming that both store "hello" in the .data section of an application and the bytes are then copied to another area on the heap where the pointer managed by the std::string can access them?

How could I efficiently store a really really long string? I'm kind of thinking about an application that reads in data from a socket stream. I fear concatenating many times. I could imagine using a linked list and traverse this list.

Strings have intimidated me for far too long!

Any links, tips, explanations, further details, would be extremely helpful.


I have stored strings in the 10's or 100's of MB range without issue. Naturally, it will be primarily limited by your available (contiguous) memory / address space.

If you are going to be appending / concatenating, there are a few things that may help efficiency-wise: If possible, try to use the reserve() member function to pre-allocate space-- even if you have a rough idea of how big the final size might be, it would save from unnecessary re-allocations as the string grows.

Additionally, many string implementations use "exponential growth", meaning that they grow by some percentage, rather than fixed byte size. For example, it might simply double the capacity any time additional space is needed. By increasing size exponentially, it becomes more efficient to perform lots of concatenations. (The exact details will depend on your version of stl.)

Finally, another option (if your library supports it) is to use rope<> template: Ropes are similar to strings, except that they are much more efficient when performing operations on very large strings. In particular, "ropes are allocated in small chunks, significantly reducing memory fragmentation problems introduced by large blocks". Some additional details on SGI's STL guide.


Since you're reading the string from a socket, you can reuse the same packet buffers and chain them together to represent the huge string. This will avoid any needless copying and is probably the most efficient solution possible. I seem to remember that the ACE library provides such a mechanism. I'll try to find it.

EDIT: ACE has ACE_Message_Block that allows you to store large messages in a linked-list fashion. You almost need to read the C++ Network Programming books to make sense of this colossal library. The free tutorials on the ACE website really suck.

I bet Boost.Asio must be capable of doing the same thing as ACE's message blocks. Boost.Asio now seems to have a larger mindshare than ACE, so I suggest looking for a solution within Boost.Asio first. If anyone can enlighten us about a Boost.Asio solution, that would be great!


It's about time I try writing a simple client-server app using Boost.Asio to see what all the fuss is about.


I don't think efficiency should be the issue. Both will perform well enough.

The deciding factor here is encapsulation. std::string is a far better abstraction than char * could ever be. Encapsulating pointer arithmetic is a good thing.

A lot of people thought long and hard to come up with std::string. I think failing to use it for unfounded efficiency reasons is foolish. Stick to the better abstraction and encapsulation.


As you probably know, an std::string is really just another name for basic_string<char>.

That said, they are a sequence container and memory will be allocated sequentially. It's possible to get an exceptions from an std::string if you try to make one bigger than the available contiguous memory that you can allocate. This threshold is typically considerably less than the total available memory due to memory fragmentation.

I've seen problems allocating contiguous memory when trying to allocate, for instance, large contiguous 3D buffers for images. But these issues don't start happening at least on the order of 100MB or so, at least in my experience, on Windows XP Pro (for instance.)

Are your strings this big?

0

精彩评论

暂无评论...
验证码 换一张
取 消