All thread create methods like pthread_create() or CreateThread() in Windows expect the caller to provide a pointer to the arg for the thread. Isn't this inherently unsafe?
This can work 'safely' only if the arg is in the heap, and then again creating a heap variable adds to the overhead of cleaning the alloca开发者_C百科ted memory up. If a stack variable is provided as the arg then the result is at best unpredictable.
This looks like a half-cooked solution to me, or am I missing some subtle aspect of the APIs?
Context.
Many C APIs provide an extra void *
argument so that you can pass context through third party APIs. Typically you might pack some information into a struct and point this variable at the struct, so that when the thread initializes and begins executing it has more information than the particular function that its started with. There's no necessity to keep this information at the location given. For instance you might have several fields that tell the newly created thread what it will be working on, and where it can find the data it will need. Furthermore there's no requirement that the void *
actually be used as a pointer, its a typeless argument with the most appropriate width on a given architecture (pointer width), that anything can be made available to the new thread. For instance you might pass an int
directly if sizeof(int) <= sizeof(void *)
: (void *)3
.
As a related example of this style: A FUSE filesystem I'm currently working on starts by opening a filesystem instance, say struct MyFS
. When running FUSE in multithreaded mode, threads arrive onto a series of FUSE-defined calls for handling open
, read
, stat
, etc. Naturally these can have no advance knowledge of the actual specifics of my filesystem, so this is passed in the fuse_main
function void *
argument intended for this purpose. struct MyFS *blah = myfs_init(); fuse_main(..., blah);
. Now when the threads arrive at the FUSE calls mentioned above, the void *
received is converted back into struct MyFS *
so that the call can be handled within the context of the intended MyFS instance.
Isn't this inherently unsafe?
No. It is a pointer. Since you (as the developer) have created both the function that will be executed by the thread and the argument that will be passed to the thread you are in full control. Remember this is a C API (not a C++ one) so it is as safe as you can get.
This can work 'safely' only if the arg is in the heap,
No. It is safe as long as its lifespan in the parent thread is as long as the lifetime that it can be used in the child thread. There are many ways to make sure that it lives long enough.
and then again creating a heap variable adds to the overhead of cleaning the allocated memory up.
Seriously. That's an argument? Since this is basically how it is done for all threads unless you are passing something much more simple like an integer (see below).
If a stack variable is provided as the arg then the result is at best unpredictable.
Its as predictable as you (the developer) make it. You created both the thread and the argument. It is your responsibility to make sure that the lifetime of the argument is appropriate. Nobody said it would be easy.
This looks like a half-cooked solution to me, or am i missing some subtle aspects of the APIs?
You are missing that this is the most basic of threading API. It is designed to be as flexible as possible so that safer systems can be developed with as few strings as possible. So we now hove boost::threads which if I guess is build on-top of these basic threading facilities but provide a much safer and easier to use infrastructure (but at some extra cost).
If you want RAW unfettered speed and flexibility use the C API (with some danger). If you want a slightly safer use a higher level API like boost:thread (but slightly more costly)
Thread specific storage with no dynamic allocation (Example)
#include <pthread.h>
#include <iostream>
struct ThreadData
{
// Stuff for my thread.
};
ThreadData threadData[5];
extern "C" void* threadStart(void* data);
void* threadStart(void* data)
{
intptr_t id = reinterpret_cast<intptr_t>(data);
ThreadData& tData = threadData[id];
// Do Stuff
return NULL;
}
int main()
{
for(intptr_t loop = 0;loop < 5; ++loop)
{
pthread_t threadInfo; // Not good just makes the example quick to write.
pthread_create(&threadInfo, NULL, threadStart, reinterpret_cast<void*>(loop));
}
// You should wait here for threads to finish before exiting.
}
Allocation on the heap does not add a lot of overhead.
Besides the heap and the stack, global variable space is another option. Also, it's possible to use a stack frame that will last as long as the child thread. Consider, for example, local variables of main
.
I favor putting the arguments to the thread in the same structure as the pthread_t
object itself. So wherever you put the pthread record, put its arguments as well. Problem solved :v) .
This is a common idiom in all C programs that use function pointers, not just for creating threads.
Think about it. Suppose your function void f(void (*fn)())
simply calls into another function. There's very little you can actually do with that. Typically a function pointer has to operate on some data. Passing in that data as a parameter is a clean way to accomplish this, without, say, the use of global variables. Since the function f()
doesn't know what the purpose of that data might be, it uses the ever-generic void *
parameter, and relies on you the programmer to make sense of it.
If you're more comfortable with thinking in terms of object-oriented programming, you can also think of it like calling a method on a class. In this analogy, the function pointer is the method and the extra void *
parameter is the equivalent of what C++ would call the this
pointer: it provides you some instance variables to operate on.
The pointer is a pointer to the data that you intend to use in the function. Windows style APIs require that you give them a static or global function.
Often this is a pointer to the class you are intending to use a pointer to this or pThis if you will and the intention is that you will delete the pThis after the ending of the thread.
Its a very procedural approach, however it has a very big advantage which is often overlooked, the CreateThread C style API is binary compatible so that when you wrap this API with a C++ class (or almost any other language) you can do this actually do this. If the parameter was typed, you wouldn't be able to access this from another language as easily.
So yes, this is unsafe but there's a good reason for it.
精彩评论