
Last Level Page Table Walker Tertiary Module

The Last Level Page Table Walker refers to the following module:

  • LLPTW llptw

Design Specification

  1. Supports accessing the last-level page table.
  2. Supports parallel processing of multiple requests.
  3. Supports sending PTW requests to memory.
  4. Supports sending refill signals to the Page Cache.
  5. Supports exception handling mechanisms.
  6. Supports second-stage translation.

Functionality

Accessing the Last-Level Page Table

The Last Level Page Table Walker (LLPTW) accesses the last-level page table and increases the parallel access capability of the Page Table Walker. A Page Table Walker can process only one request at a time, whereas the LLPTW can process multiple requests simultaneously. If several in-flight requests are duplicates, the LLPTW does not merge them; instead, it records each request and shares the memory access result among them to avoid repeated memory accesses.

The LLPTW may receive requests from the Page Cache or the Page Table Walker. Requests from the Page Cache need to satisfy the conditions of hitting the second-level page table, missing the third-level page table, and not being a bypass request. Requests from the Page Table Walker already satisfy the condition of only missing the last-level page table, and therefore can access memory through the LLPTW. An arbiter is used to arbitrate requests from the Page Cache and the Page Table Walker and send them to the LLPTW.
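The entry conditions above can be sketched as a small Python model. This is an illustration only: the field names (`l2_hit`, `l3_hit`, `bypassed`) and the function names are hypothetical, and the arbitration priority (Page Table Walker first) is an assumption, since the text does not state which source wins.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CacheLookup:
    """Result of a Page Cache lookup (field names are hypothetical)."""
    l2_hit: bool    # second-level page table hit
    l3_hit: bool    # third-level (last-level) page table hit
    bypassed: bool  # request is a bypass request


def page_cache_to_llptw(r: CacheLookup) -> bool:
    """A Page Cache request goes to the LLPTW only if it hits the
    second level, misses the third level, and is not a bypass request."""
    return r.l2_hit and not r.l3_hit and not r.bypassed


def arbitrate(from_cache: Optional[str],
              from_ptw: Optional[str]) -> Optional[Tuple[str, str]]:
    """Pick at most one request per cycle.
    ASSUMPTION: the Page Table Walker has priority; the source does not say."""
    if from_ptw is not None:
        return ("ptw", from_ptw)
    if from_cache is not None:
        return ("cache", from_cache)
    return None
```

Requests from the Page Table Walker need no eligibility check in this sketch because, as stated above, they already miss only the last-level page table.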

The Page Table Walker and LLPTW cooperate to complete the entire Page Table Walk process. To increase memory access parallelism, the LLPTW assigns different IDs to requests, allowing multiple inflight requests simultaneously. Since the first two levels of the page table may be the same for different requests, and considering that the miss rate for the first two levels is lower than that for the last level, there is no need to consider increasing the access parallelism for the first two levels. The Page Table Walker handles single requests only to reduce design complexity.

Parallel Processing of Multiple Requests

The LLPTW can process multiple requests simultaneously, with the number of parallel requests equal to the number of entries in the LLPTW. If there are duplicate requests among multiple requests, the LLPTW will not merge them but will record these requests and share the memory access results to avoid repeated memory accesses. Each entry in the LLPTW maintains the state of memory access via a state machine. When the LLPTW receives a new request, it compares the address of the new request with the addresses of existing requests. If the addresses are the same, the state of the existing request is copied to the new request. Thus, requests with the same address can share memory access results, avoiding repeated access requests.
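As a rough illustration of how duplicate requests can share one memory access, the following Python sketch models the entry array. It is a simplification under stated assumptions: the class and method names are hypothetical, the PMP&PMA check is omitted (new entries start directly in `mem_req`), and requests are compared by a single address value.

```python
from dataclasses import dataclass
from typing import List, Optional

IDLE, MEM_REQ, MEM_WAITING, MEM_OUT = "idle", "mem_req", "mem_waiting", "mem_out"


@dataclass
class Entry:
    state: str = IDLE
    addr: Optional[int] = None


class LLPTWModel:
    """Toy model of the LLPTW entries (not the actual RTL)."""

    def __init__(self, n_entries: int):
        self.entries: List[Entry] = [Entry() for _ in range(n_entries)]
        self.mem_accesses = 0  # number of requests actually sent to memory

    def enqueue(self, addr: int) -> Optional[int]:
        """Allocate a free entry; copy the state of a duplicate if one exists."""
        free = next((i for i, e in enumerate(self.entries) if e.state == IDLE), None)
        if free is None:
            return None  # all entries are busy
        dup = next((e for e in self.entries
                    if e.state != IDLE and e.addr == addr), None)
        # SIMPLIFICATION: addr_check is skipped, so new requests start in mem_req.
        self.entries[free] = Entry(dup.state if dup else MEM_REQ, addr)
        return free

    def issue_mem_request(self) -> Optional[int]:
        """Send one memory request; every entry with the same address waits on it."""
        req = next((e for e in self.entries if e.state == MEM_REQ), None)
        if req is None:
            return None
        self.mem_accesses += 1
        for e in self.entries:
            if e.state == MEM_REQ and e.addr == req.addr:
                e.state = MEM_WAITING
        return req.addr

    def mem_response(self, addr: int) -> List[int]:
        """Deliver a memory response to all entries waiting on this address."""
        done = []
        for i, e in enumerate(self.entries):
            if e.state == MEM_WAITING and e.addr == addr:
                e.state = MEM_OUT
                done.append(i)
        return done
```

In this model, enqueuing the same address twice and then calling `issue_mem_request()` once moves both entries to `mem_waiting` while `mem_accesses` increases by only one.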

Sending PTW Requests to Memory

Similar to the Page Table Walker, the LLPTW can also send PTW requests to memory. Duplicate requests are not sent separately: they share the result of a single memory access, which avoids repeated memory accesses. Since the data returned by memory is large (512 bits each time), the results are not stored in the LLPTW. If a new PTW request arrives at the LLPTW while the LLPTW is receiving the result of a PTW request it previously sent to memory, and the physical address of the incoming request matches the physical address returned by memory, the incoming request is sent to the Miss queue to await its next access to the Page Cache.
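To make the 512-bit figure concrete: a 512-bit response is 64 bytes, which holds eight 64-bit page table entries. The sketch below shows one way to pick the requested entry out of such a response. The function names are hypothetical, and the layout (entry 0 in the lowest 64 bits) and the line-granularity address match are assumptions, not taken from the design.

```python
LINE_BYTES = 64  # 512 bits returned by memory per access
PTE_BYTES = 8    # a RISC-V Sv39/Sv48 page table entry is 64 bits wide


def line_addr(paddr: int) -> int:
    """Index of the 64-byte line that contains paddr."""
    return paddr // LINE_BYTES


def pte_index(paddr: int) -> int:
    """Position (0..7) of the page table entry inside its line."""
    return (paddr % LINE_BYTES) // PTE_BYTES


def select_pte(line: int, paddr: int) -> int:
    """Extract one 64-bit entry from a 512-bit line held as a Python int.
    ASSUMPTION: entry 0 occupies the lowest 64 bits of the line."""
    return (line >> (pte_index(paddr) * 64)) & ((1 << 64) - 1)


def same_line(a: int, b: int) -> bool:
    """ASSUMPTION: two physical addresses 'match' when they share a line."""
    return line_addr(a) == line_addr(b)
```

For example, physical address `0x80001018` has byte offset `0x18` within its line, so `pte_index` selects entry 3 of the eight.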

Sending Refill Signals to Page Cache

The logic for the Last Level Page Table Walker sending refill signals to the Page Cache is also similar to that of the Page Table Walker and will not be detailed here.

Exception Handling Mechanism

An access fault exception may occur in the Last Level Page Table Walker. This exception is delivered to the L1 TLB, which then handles it based on the request source. See Section 6 of this document: Exception Handling Mechanism.

Support for Second-Stage Translation

Four new states have been added: state_hptw_req, state_hptw_resp, state_last_hptw_req, and state_last_hptw_resp. When a two-stage translation request enters the LLPTW, it first undergoes a second-stage translation to obtain the real physical address of the third-level page table. Then, address checking and memory access are performed. After obtaining the third-level page table, before returning, another second-stage translation is performed to obtain the final physical address.

Each entry has a new hptw resp structure that saves the result of each second-stage translation. When the hptw returns during the first second-stage translation, all entries are checked. If a memory access request for the same cache line has already been sent, the entry enters the mem_waiting state directly.

The LLPTW adds several arbiters for second-stage translation. hyper_arb1 is used for the first second-stage address translation and corresponds to the hptw_req state. hyper_arb2 is used for the second second-stage address translation and corresponds to the last_hptw_req state. The inputs of hptw_req_arb are the outputs of hyper_arb1 and hyper_arb2, and its output is the LLPTW's output signal for hptw requests.
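The two-level arbitration can be pictured with the following Python sketch. It only illustrates the wiring described above. The fixed lowest-index-first priority, and the choice to favor hyper_arb1 over hyper_arb2 at the final stage, are assumptions; the text does not specify the arbitration policy.

```python
from typing import List, Optional, Tuple


def fixed_priority(requests: List[Optional[object]]) -> Optional[int]:
    """Return the index of the first valid (non-None) request, if any.
    ASSUMPTION: lowest index wins."""
    for i, r in enumerate(requests):
        if r is not None:
            return i
    return None


def hptw_request(entries: List[Tuple[str, str]]) -> Optional[Tuple[int, str]]:
    """entries: list of (state, payload), one per LLPTW entry.
    Returns (entry index, payload) of the hptw request sent this cycle."""
    first = [p if s == "hptw_req" else None for s, p in entries]
    last = [p if s == "last_hptw_req" else None for s, p in entries]

    i1 = fixed_priority(first)  # models hyper_arb1
    i2 = fixed_priority(last)   # models hyper_arb2

    winners = [
        None if i1 is None else (i1, first[i1]),
        None if i2 is None else (i2, last[i2]),
    ]
    w = fixed_priority(winners)  # models hptw_req_arb
    return None if w is None else winners[w]
```

With entries in states `mem_waiting`, `last_hptw_req`, and `hptw_req` (in that order), this sketch selects the third entry, because the first-translation arbiter is given priority here.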

Overall Block Diagram

Although the Last Level Page Table Walker can process multiple accesses to the last-level page table in parallel, the internal logic is also implemented through a state machine, similar to the Page Table Walker. Here, the state transition diagram and transition relationships of the state machine are introduced. For the connection relationship between the Last Level Page Table Walker and other modules in the L2 TLB, refer to Section 5.3.3.

The state machine transition diagram is shown in the figure below. This state machine depicts the state transitions for non-two-stage address translation requests.

Last Level Page Table Walker State Machine State Transition Diagram

After the virtualization extension is added, when the LLPTW receives a two-stage address translation request, the state machine is as shown in the figure below.

Last Level Page Table Walker State Machine State Transition Diagram for allStage Requests

Not all requests entering the LLPTW start from the idle state. Depending on the status of existing entries in the LLPTW, they may enter the idle, addr_check, mem_waiting, mem_out, or cache states. For two-stage address translation requests, they may enter the hptw_req, cache, mem_waiting, and last_hptw_req states.

  • idle: Initial state. When an LLPTW request finishes, its entry returns to the idle state, indicating that this entry in the LLPTW is empty. When a prefetch request enters the LLPTW and duplicates another LLPTW request, the prefetch request is not accepted, and the LLPTW entry remains idle. An entry returns to the idle state in three situations:
    1. Currently in the mem_out state, and an access fault occurred during the PMP&PMA check. The access fault is returned to the L1 TLB, and the state transitions to idle.
    2. Currently in the mem_out state, the last-level page table is found and returned to the L1 TLB. The state transitions to idle.
    3. Currently in the cache state, and the requested page table has been written into the Page Cache. It needs to be sent back to the Page Cache for further lookup, and the state transitions to idle.
  • hptw_req: Enters this state when the incoming request is a two-stage address translation request. In this state, an hptw request is sent to the L2 TLB.
  • hptw_resp: After the hptw request is sent, it enters this state, waiting for the hptw request to return. After the request returns, if it duplicates an existing LLPTW entry in the mem_waiting state, it enters mem_waiting; otherwise, it enters addr_check.
  • addr_check: Enters this state when the request entering the LLPTW does not duplicate existing requests in the LLPTW and is not a two-stage translation request. Also, for two-stage address translation requests, after the hptw request returns, it enters this state. The physical address also needs to be sent to the PMP module for PMP&PMA check. The PMP module needs to return the PMP&PMA check result in the same cycle. If no access fault occurs, it enters the mem_req state; otherwise, it enters the mem_out state.
  • mem_req: In this state, PMP&PMA checking is complete, and a request can be sent to memory (mem_arb). For each LLPTW entry, when the virtual page number corresponding to the memory access request sent by mem_arb is the same as the virtual page number in the LLPTW entry, it enters the mem_waiting state, waiting for the memory response.
  • mem_waiting: When the virtual page number of the incoming LLPTW request is the same as the virtual page number corresponding to a PTW request already sent to memory by LLPTW, the state of the new request's LLPTW entry is set to mem_waiting. This state waits for the memory response. When the page table entry returned by memory corresponds to this LLPTW entry, for non-two-stage address translation LLPTW entries, the state transitions to mem_out. For two-stage address translation LLPTW entries, the state transitions to last_hptw_req.
  • last_hptw_req: Used only for two-stage translation requests. An entry enters this state once memory access has obtained the final page table, which includes the case where the virtual page number of the incoming LLPTW request is the same as the virtual page number of a request that memory is currently responding to. In this state, an hptw request is sent to perform the last second-stage address translation.
  • last_hptw_resp: Waits for the hptw request to return. After the hptw request returns, it enters the mem_out state.
  • mem_out: When the virtual page number of the incoming LLPTW request is the same as the virtual page number of an LLPTW entry currently in mem_out/last_hptw_req/last_hptw_resp, and this request is not a two-stage translation request, the state of the new request's LLPTW entry is set to mem_out. Since the third-level page table lookup is complete at this point, the retrieved virtual address and page table entry are returned to the L1 TLB. Additionally, if an access fault occurs in the addr_check state, the entry also enters mem_out, and the access fault is reported to the L1 TLB. After the information is successfully returned to the L1 TLB, the state transitions to idle.
  • cache: When the virtual page number of the incoming LLPTW request is the same as the virtual page number of an LLPTW entry currently in mem_out/last_hptw_req/last_hptw_resp, the page table entry obtained from memory lookup has already been written back to the Cache. Therefore, a lookup request needs to be sent to the Cache. The state of the new request's LLPTW entry is set to cache. When the Cache (actually mq_arb) receives this request, the state transitions to idle.
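The per-entry transitions listed above can be summarized as a next-state function. The Python sketch below is a reading aid, not the RTL: the event names (`hptw_req_fire`, `hptw_resp`, `dup_mem_waiting`, `access_fault`, `mem_req_sent`, `mem_resp`, `out_fire`, `cache_fire`) are hypothetical labels for the conditions described in the list, and the choice of initial state for an incoming request is left out.

```python
from typing import FrozenSet, Iterable, List


def next_state(state: str, two_stage: bool, ev: FrozenSet[str]) -> str:
    """Return the next state of one LLPTW entry given this cycle's events."""
    if state == "hptw_req" and "hptw_req_fire" in ev:
        return "hptw_resp"
    if state == "hptw_resp" and "hptw_resp" in ev:
        # A duplicate already waiting on memory lets this entry wait too.
        return "mem_waiting" if "dup_mem_waiting" in ev else "addr_check"
    if state == "addr_check":
        # The PMP&PMA result is returned in the same cycle.
        return "mem_out" if "access_fault" in ev else "mem_req"
    if state == "mem_req" and "mem_req_sent" in ev:
        return "mem_waiting"
    if state == "mem_waiting" and "mem_resp" in ev:
        return "last_hptw_req" if two_stage else "mem_out"
    if state == "last_hptw_req" and "hptw_req_fire" in ev:
        return "last_hptw_resp"
    if state == "last_hptw_resp" and "hptw_resp" in ev:
        return "mem_out"
    if state == "mem_out" and "out_fire" in ev:
        return "idle"
    if state == "cache" and "cache_fire" in ev:
        return "idle"
    return state  # no matching event: hold the current state


def run(start: str, two_stage: bool, trace: Iterable[Iterable[str]]) -> List[str]:
    """Apply a sequence of per-cycle event sets and record visited states."""
    path = [start]
    for events in trace:
        path.append(next_state(path[-1], two_stage, frozenset(events)))
    return path
```

For a non-two-stage request that starts in `addr_check`, the trace `[[], ["mem_req_sent"], ["mem_resp"], ["out_fire"]]` walks through `mem_req`, `mem_waiting`, and `mem_out` before returning to `idle`.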

Interface Timing

The Last Level Page Table Walker interacts with other modules in the L2 TLB using a valid-ready handshake mechanism. The signals involved are numerous, and there are no particularly noteworthy timing relationships among them, so they are not elaborated upon here.
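For readers unfamiliar with the valid-ready handshake, the following Python sketch shows the basic rule: a transfer happens only in a cycle where the sender asserts valid and the receiver asserts ready. The function names are hypothetical, and the model is a generic illustration rather than a description of any specific LLPTW port.

```python
from typing import Iterable, List


def fires(valid: bool, ready: bool) -> bool:
    """A beat is transferred only when valid and ready are both high."""
    return valid and ready


def transfer(items: Iterable[str], ready_pattern: Iterable[bool]) -> List[str]:
    """Simulate one channel cycle by cycle.
    The sender keeps valid high while it has data to send;
    the receiver's ready signal follows ready_pattern."""
    pending = list(items)
    received = []
    for ready in ready_pattern:
        valid = bool(pending)
        if fires(valid, ready):
            received.append(pending.pop(0))
    return received
```

With two pending items and a receiver that is ready only every other cycle, the items are delivered in order, one per ready cycle, and nothing is lost while ready is low.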