Can Intel processors delay TLB invalidations?_问答_开发者

Can Intel processors delay TLB invalidations?

开发者 https://www.devze.com 2023-03-23 07:43 出处：网络

This in reference to InteI\'s Software Developer’s Manual (Order Number: 325384-039US May 2011), the section 4.10.4.4 \"Delayed Invalidation\" desc开发者_如何学编程ribes a potential delay in invalida

This in reference to InteI's Software Developer’s Manual (Order Number: 325384-039US May 2011), the section 4.10.4.4 "Delayed Invalidation" desc开发者_如何学编程ribes a potential delay in invalidation of TLB entries which can cause unpredictable results while accessing memory whose paging-structure entry has been changed.

The manual says ... "Required invalidations may be delayed under some circumstances. Software devel- opers should understand that, between the modification of a paging-structure entry and execution of the invalidation instruction recommended in Section 4.10.4.2, the processor may use translations based on either the old value or the new value of the paging-structure entry. The following items describe some of the potential conse- quences of delayed invalidation: If a paging-structure entry is modified to change the R/W flag from 0 to 1, write accesses to linear addresses whose translation is controlled by this entry may or may not cause a page-fault exception."

Let us suppose a simple case, where a page-strucure entry is modified (r/w flag is flipped from 0 to 1) for a linear address and after that the corresponding TBL invalidation instruction is called immediately. My question is--as a consiquence of delayed invalidation of TLB s it possible that even after calling invalidation of TLB a write access to the linear address in question doesn't fault (page fault)?

Or is the "delayed invalidation" can only cause unpredictable results when "invalidate" instruction for the linear address whose page-structure has changed has not been issued?

TLBs are transparently optimisitically not uncached by CR3 changes. TLBs entries are marked with a unique identifier for the address-space and are left in the TLB until they are either touched by the wrong process (in which case the TLB entry is trashed) or the address-space is restored (in which case the TLB was preserved over the address-space changing).

This all happens transparently to the CPU. Your program (or OS) shouldn't be able to tell the difference between this behaviour and the TLB being actually invalidated by a TLB invalidation except via:

A) Timing - i.e. TLB optimisticly not uncaching is faster (which is why they do it)
B) There are edge cases where this behaviour is somewhat undefined. If you modify the code page on which you're sitting or touch memory you've just changed, the old value of the TLB might still be there (even across a CR3 change).

The solution to this is to either:

1) force a TLB update via a invlpg instruction. This purges the TLB entry, triggering a TLB read-in on the next touch of the page.
2) disable and re-enable paging via the CR0 register.
3) mark all pages as un-cachable via the cache-disable bit in CR0 or on all of the pages of the TLB (TLB entries marked uncachable are auto-purged after use).
4) Change the mode of the code-segment.

Note that this is genuinely undefined behaviour. Transitioning to SMM can invalidate the TLB, or might not, leaving this open to a race-condition. Don't depend on this behaviour.