Embedded C++ : to use STL or not?_问答_开发者_运维开发者技术经验分享

I have always been an embedded software engineer, but usually at Layer 3 or 2 of the OSI stack. I am not really a hardware guy. I have generally always done telecoms products, usually hand/cell-phones, which generally means something like an ARM 7 processor.

Now I find myself in a more generic embedded world, in a small start-up, where I might move to "not so powerful" processors (there's the subjective bit) - I cannot predict which.

I have read quite a bit about debate about using STL in C++ in embedded systems and there is no clear cut answer. There are some small worries about portability, and a few about code size or run-time, but I have two major concerns:

1 - exception handling; I am still not sure whether to use it (see Embedded C++ : to use exceptions or not?)

2 - I strongly dislike dynamic memory allocation in embedded systems, because of the problems it can introduce. I generally have a buffer pool which is statically allocated at compile time and which serves up only fixed size buffers (if no buffers, system reset). The STL, of course, does a lot of dynamic allocation.

Now I have to make the decision whether to use or forego the STL - for the whole company, for ever (it's going into some very core s/w).

Which way do I jump? Super-safe & lose much of what constitutes C++ (i开发者_如何学Pythonmo, it's more than just the language definition) and maybe run into problems later or have to add lots of exception handling & maybe some other code now?

I am tempted to just go with Boost, but 1) I am not sure if it will port to every embedded processor I might want to use and 2) on their website, they say that they doesn't guarantee/recommend certain parts of it for embedded systems (especially FSMs, which seems weird). If I go for Boost & we find a problem later ....

I work on real-time embedded systems every day. Of course, my definition of embedded system may be different than yours. But we make full use of the STL and exceptions and do not experience any unmanageable problems. We also make use of dynamic memory (at a very high rate; allocating lots of packets per second, etc.) and have not yet needed to resort to any custom allocators or memory pools. We have even used C++ in interrupt handlers. We don't use boost, but only because a certain government agency won't let us.

It is our experience you can indeed use many modern C++ features in an embedded environment as long as you use your head and conduct your own benchmarks. I highly recommend you make use of Scott Meyer's Effective C++ 3rd edition as well as Sutter and Alexandrescu's C++ Coding Standards to assist you in using C++ with a sane programming style.

Edit: After getting an upvote on this 2 years later, let me post an update. We are much farther along in our development and we have finally hit spots in our code where the standard library containers are too slow under high performance conditions. Here we did in fact resort to custom algorithms, memory pools, and simplified containers. That is the beauty of C++ though, you can use the standard library and get all the good things it provides for 90% of your use cases. You don't throw it all out when you meet problems, you just hand-optimize the trouble spots.

Super-safe & lose much of what constitutes C++ (imo, it's more than just the language definition) and maybe run into problems later or have to add lots of exception handling & maybe some other code now?

We have a similar debate in the game world and people come down on both sides. Regarding the quoted part, why would you be concerned about losing "much of what constitutes C++"? If it's not pragmatic, don't use it. It shouldn't matter if it's "C++" or not.

Run some tests. Can you get around STL's memory management in ways that satisfy you? If so, was it worth the effort? A lot of problems STL and boost are designed to solve just plain don't come up if you design to avoid haphazard dynamic memory allocation... does STL solve a specific problem you face?

Lots of people have tackled STL in tight environments and been happy with it. Lots of people just avoid it. Some people propose entirely new standards. I don't think there's one right answer.

The other posts have addressed the important issues of dynamic memory allocation, exceptions and possible code bloat. I just want to add: Don't forget about <algorithm>! Regardless of whether you use STL vectors or plain C arrays and pointers, you can still use sort(), binary_search(), random_shuffle(), the functions for building and managing heaps, etc. These routines will almost certainly be faster and less buggy than versions you build yourself.

Example: unless you think about it carefully, a shuffle algorithm you build yourself is likely to produce skewed distributions; random_shuffle() won't.

Paul Pedriana from Electronic Arts wrote in 2007 a lengthy treatise on why the STL was inappropriate for embedded console development and why they had to write their own. It's a detailed article, but the most important reasons were:

STL allocators are slow, bloated, and inefficient
Compilers aren't actually very good at inlining all those deep function calls
STL allocators don't support explicit alignment
The STL algorithms that come with GCC and MSVC's STL aren't very performant, because they're very platform-agnostic and thus miss a lot of microoptimizations that can make a big difference.

Some years ago, our company made the decision not to use the STL at all, instead implementing our own system of containers that are maximally performant, easier to debug, and more conservative of memory. It was a lot of work but it has repaid itself many times over. But ours is a space in which products compete on how much they can cram into 16.6ms with a given CPU and memory size.

As to exceptions: they are slow on consoles, and anyone who tells you otherwise hasn't tried timing them. Simply compiling with them enabled will slow down the entire program because of the necessary prolog/epilog code -- measure it yourself if you don't believe me. It's even worse on in-order CPUs than it is on the x86. For this reason, the compiler we use doesn't even support C++ exceptions.

The performance gain isn't so much from avoiding the cost of an exception throw — it's from disabling exceptions entirely.

Let me start out by saying I haven't done embedded work for a few years, and never in C++, so my advice is worth every penny you're paying for it...

The templates utilized by STL are never going to generate code you wouldn't need to generate yourself, so I wouldn't worry about code bloat.

The STL doesn't throw exceptions on its own, so that shouldn't be a concern. If your classes don't throw, you should be safe. Divide your object initialization into two parts, let the constructor create a bare bones object and then do any initialization that could fail in a member function that returns an error code.

I think all of the container classes will let you define your own allocation function, so if you want to allocate from a pool you can make it happen.

The open source project "Embedded Template Library (ETL)" targets the usual problems with the STL used in Embedded Applications by providing/implementing a library:

deterministic behaviour
"Create a set of containers where the size or maximum size is determined at compile time. These containers should be largely equivalent to those supplied in the STL, with a compatible API."
no dynamic memory allocation
no RTTI required
little use of virtual functions (only when absolutely necessary)
set of fixed capacity containers
cache friendly storage of containers as continously allocated memory block
reduced container code size
typesafe smart enumerations
CRC calculations
checksums & hash functions
variants = sort of type safe unions
Choice of asserts, exceptions, error handler or no checks on errors
heavily unit tested
well documented source code
and other features...

You can also consider a commercial C++ STL for Embedded Developers provided by E.S.R. Labs.

for memory management, you can implement your own allocator, which request memory from the pool. And all STL container have a template for the allocator.
for exception, STL doesn't throw many exceptions, in generally, the most common are: out of memory, in your case, the system should reset, so you can do reset in the allocator. others are such as out of range, you can avoid it by the user.
so, i think you can use STL in embedded system :)

In addition to all comments, I would propose you reading of Technical Report on C++ Performance which specifically addresses topics that you are interested in: using C++ in embedded (including hard real-time systems); how exception-handling usually implemented and which overhead it has; free store allocation's overhead.

The report is really good as is debunks many popular tails about C++ performance.

It basically depends on your compiler and in the amount of memory you have. If you have more than a few Kb of ram, having dynamic memory allocation helps a lot. If the implementation of malloc from the standard library that you have is not tuned to your memory size you can write your own, or there are nice examples around such as mm_malloc from Ralph Hempel that you can use to write your new and delete operators on top.

I don't agree with those that repeat the meme that exceptions and stl containers are too slow, or too bloated etc. Of course it adds a little more code than a simple C's malloc, but judicious use of exceptions can make code much clear and avoid too much error checking blurb in C.

One has to keep in mind that STL allocators will increase their allocations in powers of two, which means sometimes it will do some reallocations until it reaches the correct size, which you can prevent with reserve so it becomes as cheap as one malloc of the desired size if you know the size to allocate anyway.

If you have a big buffer in a vector for example, at some point it might do a reallocation and ends using up 1.5x the memory size that you are intending it to use at some point while reallocating and moving data. (For example, at some point it has N bytes allocated, you add data via append or an insertion iterator and it allocates 2N bytes, copies the first N and releases N. You have 3N bytes allocated at some point).

So in the end it has a lot of advantages, and pays of if you know what you are doing. You should know a little of how C++ works to use it on embedded projects without surprises.

And to the guy of the fixed buffers and reset, you can always reset inside the new operator or whatever if you are out of memory, but that would mean you did a bad design that can exhaust your memory.

An exception being thrown with ARM realview 3.1:

--- OSD\#1504 throw fapi_error("OSDHANDLER_BitBlitFill",res);
   S:218E72F0 E1A00000  MOV      r0,r0
   S:218E72F4 E58D0004  STR      r0,[sp,#4]
   S:218E72F8 E1A02000  MOV      r2,r0
   S:218E72FC E24F109C  ADR      r1,{pc}-0x94 ; 0x218e7268
   S:218E7300 E28D0010  ADD      r0,sp,#0x10
   S:218E7304 FA0621E3  BLX      _ZNSsC1EPKcRKSaIcE       <0x21a6fa98>
   S:218E7308 E1A0B000  MOV      r11,r0
   S:218E730C E1A0200A  MOV      r2,r10
   S:218E7310 E1A01000  MOV      r1,r0
   S:218E7314 E28D0014  ADD      r0,sp,#0x14
   S:218E7318 EB05C35F  BL       fapi_error::fapi_error   <0x21a5809c>
   S:218E731C E3A00008  MOV      r0,#8
   S:218E7320 FA056C58  BLX      __cxa_allocate_exception <0x21a42488>
   S:218E7324 E58D0008  STR      r0,[sp,#8]
   S:218E7328 E28D1014  ADD      r1,sp,#0x14
   S:218E732C EB05C340  BL       _ZN10fapi_errorC1ERKS_   <0x21a58034>
   S:218E7330 E58D0008  STR      r0,[sp,#8]
   S:218E7334 E28D0014  ADD      r0,sp,#0x14
   S:218E7338 EB05C36E  BL       _ZN10fapi_errorD1Ev      <0x21a580f8>
   S:218E733C E51F2F98  LDR      r2,0x218e63ac            <OSD\#1126>
   S:218E7340 E51F1F98  LDR      r1,0x218e63b0            <OSD\#1126>
   S:218E7344 E59D0008  LDR      r0,[sp,#8]
   S:218E7348 FB056D05  BLX      __cxa_throw              <0x21a42766>

Doesn't seem so scary, and no overhead is added inside {} blocks or functions if the exception isn't thrown.

The biggest problem with STL in embedded systems is the memory allocation issue (which, as you said, causes a lot of problems).

I'd seriously research creating your own memory management, built by overriding the new/delete operators. I'm pretty sure that with a bit of time, it can be done, and it's almost certainly worth it.

As for the exceptions issue, I wouldn't go there. Exceptions are a serious slowdown of your code, because they cause every single block ({ }) to have code before and after, allowing the catching of the exception and the destruction of any objects contained within. I don't have hard data on this on hand, but every time I've seen this issue come up, I've seen overwhelming evidence of a massive slowdown caused by using exceptions.

Edit:
Since a lot of people wrote comments stating that exception handling is not slower, I thought I'd add this little note (thanks for the people who wrote this in comments, I thought it'd be good to add it here).

The reason exception handling slows down your code is because the compiler must make sure that every block ({}), from the place an exception is thrown to the place it is dealt with, must deallocate any objects within it. This is code that is added to every block, regardless of whether anyone ever throws an exception or not (since the compiler can't tell at compile time whether this block will be part of an exception "chain").

Of course, this might be an old way of doing things that has gotten much faster in newer compilers (I'm not exactly up-to-date on C++ compiler optimizations). The best way to know is just to run some sample code, with exceptions turned on and off (and which includes a few nested functions), and time the difference.

On our embedded scanner project we were developing a board with ARM7 CPU and STL didn't bring any issue. Surely the project details are important since dynamic memory allocation may not be an issue for many available boards today and type of projects.