I'm experimenting with memory management in Linux kernel modules, and I was wondering why a thread does not see the same memory as the module's functions. What I mean is: I have an int *x declared globally. I allocate space for it using kmalloc and assign 10 to it. When I then try to access it from within a thread, I get a totally different value.
Why does this happen? How can I get around this?
EDIT:
I run my programs on the x86 architecture, on a single core (in a VM).
Here is my code: http://pastebin.com/94qGc6ZQ
On SMP architectures, values that are cached are not updated across all cores, so a thread on another core can be using a stale value.
Another issue you can have is concurrent access between threads: Thread 1 reads x before Thread 2 has been able to write it. Thread 2 then carries on and sets x = 10, but Thread 1 is still using the old value from when x was uninitialized.
The way to solve the second problem (which seems more likely) is to use locking to control access to that variable, so that only one thread can modify or read it at a time. That avoids issues with stale values.
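As a rough illustration of the locking idea, here is a userspace analogue using pthreads, since a kernel module cannot be run standalone. All names (shared_x, x_lock, writer, read_shared_x, run_locked_demo) are made up for this sketch. In a module you would presumably use kmalloc() and a kernel lock such as a mutex (mutex_lock()/mutex_unlock()) instead of malloc() and pthread_mutex_t.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
static int *shared_x;
static pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer thread: allocate and initialize, then publish under the lock. */
static void *writer(void *unused)
{
    int *p;

    (void)unused;
    p = malloc(sizeof(*p));          /* kmalloc(sizeof(*p), GFP_KERNEL) in a module */
    if (!p)
        return NULL;
    *p = 10;

    pthread_mutex_lock(&x_lock);     /* mutex_lock() in a module */
    shared_x = p;
    pthread_mutex_unlock(&x_lock);
    return NULL;
}

/* Reader: returns the value, or -1 if nothing has been published yet. */
static int read_shared_x(void)
{
    int value = -1;

    pthread_mutex_lock(&x_lock);
    if (shared_x)
        value = *shared_x;
    pthread_mutex_unlock(&x_lock);
    return value;
}

/* Runs the writer to completion, then reads the value through the lock. */
static int run_locked_demo(void)
{
    pthread_t t;
    int rc, value;

    rc = pthread_create(&t, NULL, writer, NULL);
    assert(rc == 0);
    pthread_join(t, NULL);

    value = read_shared_x();
    free(shared_x);
    shared_x = NULL;
    return value;
}
```

Calling run_locked_demo() should give back 10 unless the allocation fails. Every read and write of the pointer goes through the same lock, so a reader either sees nothing yet or sees the fully initialized value.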
(This is not a hardware-facing kernel module, so don't use volatile ;P) Use the smp_wmb() and smp_rmb() suggestion below.
EDIT: Looks like my first suggestion was right. To solve this, you can use a write barrier (smp_wmb()) after the kmalloc and the assignment to x, and a read barrier (smp_rmb()) before attempting to print the value of x. This effectively tells the CPU to read the new value, because the one it has might be stale or the accesses could have been reordered. You may be able to get away with just a read barrier in the other thread, but for safety use barriers wherever the access is done.
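To make the barrier pairing concrete, here is a userspace sketch with C11 atomics rather than actual kernel code. A release store stands in for "write the data, then a write barrier (smp_wmb()), then publish the pointer", and an acquire load stands in for "read the pointer, then a read barrier (smp_rmb()), then read the data". The names published_x, reader and run_barrier_demo are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical name: the pointer both threads share. */
static _Atomic(int *) published_x;

/* Reader: wait until the pointer is published, then read through it. */
static void *reader(void *out)
{
    int *p;

    /* Acquire load: once p is non-NULL, the earlier write *p = 10 is visible. */
    while (!(p = atomic_load_explicit(&published_x, memory_order_acquire)))
        ;   /* busy-wait, acceptable only in a tiny demo */

    *(int *)out = *p;
    return NULL;
}

/* Starts the reader first, then initializes and publishes the value. */
static int run_barrier_demo(void)
{
    pthread_t t;
    int seen = -1;
    int rc;
    int *p = malloc(sizeof(*p));

    if (!p)
        return -1;

    rc = pthread_create(&t, NULL, reader, &seen);
    assert(rc == 0);

    *p = 10;
    /* Release store: the write to *p cannot be reordered after the publish. */
    atomic_store_explicit(&published_x, p, memory_order_release);

    pthread_join(t, NULL);
    atomic_store_explicit(&published_x, NULL, memory_order_relaxed);
    free(p);
    return seen;
}
```

The reader is deliberately started before the value is written, so the only thing ordering the two threads is the release/acquire pair. In the kernel, smp_store_release() and smp_load_acquire() express the same pairing more directly than separate barriers.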
You need some sort of lock (and a memory barrier that invalidates the cache).
On SMP kernels, there are lock mechanisms implemented (for the kernel) that take care of this:
Read http://www.mjmwired.net/kernel/Documentation/memory-barriers.txt, especially the section "Inter-CPU locking barrier effects".
What architecture are you running on? I don't believe the other answers that say you are hitting memory ordering problems or cache coherency problems, because:
- x86 is very strongly ordered, and kthread_run() internally takes so many locks etc. that I'm sure there is the equivalent of a memory barrier between the assignment to *x and the start of your thread. So even on more weakly ordered architectures, I don't think you are really missing a memory barrier.
- I don't believe there is any architecture where Linux runs that is cache-incoherent between CPUs. You have to be careful with external devices doing DMA into memory, but that's completely different from the issue here.
In other words I think the code as you have written it in your question looks fine. I suspect that this is boiled down from your real code, and in so doing you got rid of the real bug.
It certainly is true that if you have code that modifies a variable used in your thread after the kthread_run() call, then you have a race condition which could lead to a bug like the one you see here.
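To show the safe ordering, here is a small userspace analogue in which pthread_create() plays the role of kthread_run(). Like thread creation in the kernel, it orders everything written before the call ahead of the new thread's first instruction. The names x, worker and run_init_before_start are only for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Global pointer, as in the question. */
static int *x;

/* Thread body: copies *x into the int that 'out' points to. */
static void *worker(void *out)
{
    assert(x != NULL);       /* x was fully set up before the thread started */
    *(int *)out = *x;
    return NULL;
}

/* Initializes *x, then starts the thread; returns what the thread saw. */
static int run_init_before_start(void)
{
    pthread_t t;
    int seen = -1;

    x = malloc(sizeof(*x));  /* kmalloc(sizeof(*x), GFP_KERNEL) in a module */
    if (!x)
        return -1;
    *x = 10;                 /* all initialization happens before the thread exists */

    if (pthread_create(&t, NULL, worker, &seen) != 0) {
        free(x);
        x = NULL;
        return -1;
    }

    /*
     * Writing to x or *x here, without a lock or barrier, would be the race
     * described above: the thread might run before or after the write.
     */

    pthread_join(t, NULL);
    free(x);
    x = NULL;
    return seen;
}
```

With this ordering the thread should always see 10. If the real module assigns to x after starting the thread, or frees and reallocates it while the thread is running, that is where I would look for the bug.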