
Virtual memory beyond page tables

开发者 https://www.devze.com 2023-02-19 03:15 Source: Internet

I am working on a research project to develop an OS for a many-core (1000+) chip. We are looking into implementing a virtual-memory-style system for memory permissions (read/write/execute) that would allow memory to be safely shared across cores.

Basically, we want a system that would allow us to mark a 'page' as readable by some subset of cores, writable by another, and so on. We are not going to be doing address translation (at least at this point), but we need a way to set and query permissions efficiently. It is going to be a software-filled data structure with a simple TLB-style cache.
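To make the requirement concrete, here is a minimal sketch of one possible shape for such a structure: a single shared table that stores, per page, one core bitmask for each of read, write, and execute, plus a small direct-mapped cache private to each core. Everything named here (`SharedPermTable`, `CorePermCache`, the 4 KiB page size, the 64-slot cache) is a hypothetical choice made for illustration, not something taken from the project.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages; the question does not give a page size

READ, WRITE, EXEC = 0, 1, 2  # index of each permission's core mask


class SharedPermTable:
    """One table shared by all cores: page number -> [read, write, exec] core masks."""

    def __init__(self):
        self._pages = {}

    def grant(self, page, perm, cores):
        masks = self._pages.setdefault(page, [0, 0, 0])
        for core in cores:
            masks[perm] |= 1 << core  # set this core's bit in the mask

    def revoke(self, page, perm, cores):
        masks = self._pages.get(page)
        if masks is None:
            return
        for core in cores:
            masks[perm] &= ~(1 << core)  # clear this core's bit

    def lookup(self, page, core):
        """Return 3 bits for one core: bit 0 read, bit 1 write, bit 2 execute."""
        masks = self._pages.get(page, (0, 0, 0))
        return sum(((masks[p] >> core) & 1) << p for p in (READ, WRITE, EXEC))


class CorePermCache:
    """Direct-mapped, software-filled cache private to one core (TLB-style)."""

    def __init__(self, core, table, slots=64):
        self.core = core
        self.table = table
        self.slots = [None] * slots  # each slot holds (page, 3-bit permissions)
        self.misses = 0

    def check(self, addr, perm):
        page = addr >> PAGE_SHIFT
        idx = page % len(self.slots)
        entry = self.slots[idx]
        if entry is None or entry[0] != page:
            # Miss: software fills the slot from the shared table.
            self.misses += 1
            entry = (page, self.table.lookup(page, self.core))
            self.slots[idx] = entry
        return bool((entry[1] >> perm) & 1)

    def invalidate(self, page):
        """Drop a cached entry; needed after a revoke, or the core sees stale bits."""
        idx = page % len(self.slots)
        if self.slots[idx] is not None and self.slots[idx][0] == page:
            self.slots[idx] = None
```

In this sketch a core only touches the shared table on a cache miss, and a revoke has to be followed by `invalidate` on every core that may have cached the page. Note that three dense masks still cost 3 bits per core per page, so this layout alone does not settle the memory question; it only shows the set/query interface.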

Our intuition is that simply replicating page tables for each core will be too expensive (in terms of memory usage).

What data structures would be efficient for this kind of problem?

Thanks


Have you looked at how common multi-core (2-12 core) CPUs address this problem?

Do you know where, when, why, and how the solution used in these common multi-core CPUs will not scale to 1,000+ cores?

In other words, can you quantify what's wrong with the existing solution, which works, and has been working, on common CPUs with 12 or fewer cores?

If you know that, then the answer is closer than you think, because it just requires understanding how AMD and Intel solved the problem on a smaller scale, and what's needed to make their solution work on a larger one (maybe more memory for tables, algorithm tweaks, etc.).

Look at AMD's and Intel's data structures, then build a software simulator for 1,000+ cores with those data structures, and see where, when, why, and how your simulation fails, if it fails.

Ideally, build your simulator with a user-selectable number of cores, then TEST, TEST, TEST with different numbers of cores, working your way up and noting bottlenecks along the way.
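As a very rough starting point for that kind of sweep, the sketch below only estimates the memory footprint of giving every core its own table, for a selectable list of core counts, and reports the first count that exceeds a budget. The numbers are assumptions, not measurements: 4 GiB of memory, 4 KiB pages, 8-byte entries (the size of an x86-64 page-table entry), and a dense flat table with one entry per page. Real multi-level tables are sparse, so treat the result as a ballpark figure. The function names are made up for this example.

```python
def replicated_table_bytes(cores, mem_bytes=4 << 30, page_bytes=4096, entry_bytes=8):
    """Footprint of one dense flat table per core, one entry per page.

    Defaults are illustrative assumptions: 4 GiB memory, 4 KiB pages, 8-byte entries.
    """
    pages = mem_bytes // page_bytes
    return cores * pages * entry_bytes


def sweep(core_counts, footprint=replicated_table_bytes):
    """Map each core count to its estimated footprint in bytes."""
    return {cores: footprint(cores) for cores in core_counts}


def first_core_count_over_budget(core_counts, budget_bytes,
                                 footprint=replicated_table_bytes):
    """Smallest core count in the list whose footprint exceeds the budget, or None."""
    for cores in sorted(core_counts):
        if footprint(cores) > budget_bytes:
            return cores
    return None


if __name__ == "__main__":
    counts = [2, 12, 128, 256, 1024]
    for cores, size in sweep(counts).items():
        print(f"{cores:5d} cores: {size / (1 << 20):8.0f} MiB of tables")
    print("first over a 1 GiB budget:",
          first_core_count_over_budget(counts, 1 << 30))
```

Under these assumptions each core's table is 8 MiB, so 12 cores need 96 MiB while 1024 cores need 8 GiB, which is twice the memory being described. Passing a different `footprint` function lets the same sweep be run against any candidate structure.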

Your simulator should work EXACTLY as well as AMD's (if you're using AMD data structures) or Intel's (if you're using Intel data structures) at the same core count as one of their chips. That should confirm that THEY (AMD/Intel) are doing what they're doing correctly (because they are), and it will help prove that your simulation program is doing its simulation correctly at a specific number of cores.

Wishing you luck!
