How can I allocate memory in Linux that meets paging and cacheability requirements?

Developer https://www.devze.com 2023-02-25 22:43 Source: web
I want to allocate space for a large array that will be write-only until the very end of the program. For that reason, I don't care if it's cached.

I also want to access this very frequently, so I don't want to have to do a page walk more than once. For that reason I want it to be allocated in a large page (e.g. 4M).

So how can I...

  • ...request the memory to be either uncacheable or write-through?
  • ...request the memory to be placed in a large page?

I am working in Linux.


Disabling caching sounds like it would make your writes slower if it forces a write all the way through to the RAM. I'm not sure I'd attempt that at all.

To actually use large pages, I suggest following HugeTLB - Large Page Support in the Linux Kernel. It contains an example of how you can use large pages via a shared memory segment.
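
For a rough idea of what explicit huge pages look like in code, here is a minimal sketch. Note that it uses mmap with MAP_HUGETLB rather than the shared-memory-segment route the linked document describes, and it falls back to ordinary pages when no huge pages are reserved. The helper name map_huge_or_normal and the 2 MiB ASSUMED_HUGE_PAGE constant are my own assumptions (2 MiB is the usual x86-64 size), not something from the kernel docs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages (typical on x86-64); adjust for your system. */
#define ASSUMED_HUGE_PAGE ((size_t)2 << 20)

/* Hypothetical helper: map `bytes` (rounded up to the assumed huge page
 * size) as anonymous memory, preferring explicit huge pages.
 * Returns NULL on failure. *mapped receives the mapped length and
 * *is_huge is 1 if MAP_HUGETLB succeeded, 0 if we fell back. */
static void *map_huge_or_normal(size_t bytes, size_t *mapped, int *is_huge)
{
    assert(mapped != NULL && is_huge != NULL);

    size_t len = (bytes + ASSUMED_HUGE_PAGE - 1) & ~(ASSUMED_HUGE_PAGE - 1);
    void *p = MAP_FAILED;

    if (len == 0)               /* zero request or overflow while rounding */
        return NULL;
    *is_huge = 0;

#ifdef MAP_HUGETLB
    /* Fails (typically ENOMEM) unless huge pages have been reserved,
     * e.g. via /proc/sys/vm/nr_hugepages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED)
        *is_huge = 1;
#endif
    if (p == MAP_FAILED)        /* fall back to ordinary pages */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *mapped = len;
    return p;
}
```

Release the region with munmap(p, mapped); for a huge page mapping, the length passed to munmap has to be a multiple of the huge page size, which is why the helper hands back the rounded length.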


With transparent hugepages, simply allocating a 4M-aligned buffer will work. Use aligned_alloc or posix_memalign to get a pointer you can free. (Note that aligned_alloc is required to fail if the buffer size isn't a multiple of the alignment. /facepalm).

Depending on your setting for /sys/kernel/mm/transparent_hugepage/defrag, you may need to use madvise(MADV_HUGEPAGE) on the buffer to strongly encourage the kernel to use hugepages.
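
Something like the following is what I have in mind, as a sketch rather than a definitive implementation. The helper name alloc_thp_buffer and the HUGE_ALIGN constant are made up for illustration. It uses posix_memalign (so the size restriction of aligned_alloc doesn't matter, although the size is rounded up anyway so the buffer covers whole huge pages) and treats madvise as a pure hint.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* 4 MiB: a multiple of both the 2M and the 4M huge page size. */
#define HUGE_ALIGN ((size_t)4 << 20)

/* Hypothetical helper: allocate a 4 MiB-aligned buffer of at least
 * `bytes` bytes and hint that it should be backed by transparent
 * hugepages. Returns NULL on failure; release with free().
 * *actual receives the rounded-up size. */
static void *alloc_thp_buffer(size_t bytes, size_t *actual)
{
    assert(actual != NULL);

    /* Round up so the buffer spans whole huge pages. */
    size_t rounded = (bytes + HUGE_ALIGN - 1) & ~(HUGE_ALIGN - 1);
    void *p = NULL;

    if (rounded == 0)           /* zero request or overflow while rounding */
        return NULL;
    if (posix_memalign(&p, HUGE_ALIGN, rounded) != 0)
        return NULL;

#ifdef MADV_HUGEPAGE
    /* Advisory only: the kernel may ignore it, and the call fails if
     * THP is not compiled in, so the result is deliberately ignored. */
    (void)madvise(p, rounded, MADV_HUGEPAGE);
#endif

    *actual = rounded;
    return p;
}
```

Whether the buffer really ends up on hugepages depends on the THP settings and on memory fragmentation; the AnonHugePages field in /proc/self/smaps shows what the kernel actually did.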

Also note that x86-64 uses 2M hugepages. x86-32 uses 4M hugepages. Aligning to 4M is fine if you want the easy solution for both.


request the memory to be either uncacheable or write-through?

AFAIK, you can't easily do that through the normal Linux APIs. NT (non-temporal) stores work on normal write-back memory, so use those instead. (They override the memory type and are weakly ordered and cache-bypassing.)

But if you're not writing full cache-lines at a time, you definitely want cached writes. Especially if there's any spatial or temporal locality, but even if not then letting the store buffer do its job (hiding the latency of cache-miss stores) is a good thing.
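
As an illustration of the full-cache-line case, here is a sketch using the SSE2 intrinsic _mm_stream_si128 for the NT stores and _mm_sfence afterwards because they are weakly ordered. The helper name fill_u32_nt and its alignment and size preconditions are my own assumptions for the example. On non-SSE2 targets it just falls back to plain stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si128, _mm_set1_epi32, _mm_sfence */
#endif

/* Hypothetical helper: fill `count` 32-bit elements with `value`.
 * Preconditions (assumed for this sketch): dst is 64-byte aligned and
 * count is a multiple of 16, so every iteration writes one full
 * 64-byte cache line. */
static void fill_u32_nt(uint32_t *dst, size_t count, uint32_t value)
{
    assert(((uintptr_t)dst % 64) == 0 && count % 16 == 0);

#if defined(__SSE2__)
    __m128i v = _mm_set1_epi32((int)value);
    for (size_t i = 0; i < count; i += 16) {
        /* Four 16-byte NT stores cover one whole cache line. */
        _mm_stream_si128((__m128i *)(dst + i),      v);
        _mm_stream_si128((__m128i *)(dst + i + 4),  v);
        _mm_stream_si128((__m128i *)(dst + i + 8),  v);
        _mm_stream_si128((__m128i *)(dst + i + 12), v);
    }
    /* NT stores are weakly ordered: fence before anything that
     * relies on the data being globally visible. */
    _mm_sfence();
#else
    /* Fallback without NT stores: ordinary cached writes. */
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
#endif
}
```

If your writes are scattered or smaller than a cache line, skip this and use ordinary stores, for the reasons given above.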
