Probe Queue
Function Description
The Probe Queue receives and processes coherence requests from L2. It contains 8 Probe Entries, each of which handles one Probe request. An entry converts the Probe request into internal signals and sends them to MainPipe, which modifies the permissions of the probed block. The Probe Entry is released once it receives the response from MainPipe.
The Probe Queue interacts with L2 only through the TileLink B channel, and on the other side it connects to MainPipe. Each of its 8 Probe Entries uses a set of state registers to control the reception, conversion, and sending of request signals.
Feature 1: Alias Problem
The Kunminghu architecture uses a 64KB VIPT DCache, which introduces the cache alias problem: the same physical block can be indexed under different alias bits. To solve this, the L2 Cache directory maintains the alias bits of every physical block held in the DCache. When the DCache needs to fetch a block at a physical address that it already holds under different alias bits, the L2 Cache issues a Probe request to remove the existing alias copy from the DCache, and carries the alias bits of that copy in the TileLink B channel. On receiving the request, the Probe Queue concatenates the alias bits with the page offset, converts the result into an internal signal, and sends it to MainPipe. MainPipe then accesses the DCache storage module to read the data.
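The concatenation step can be illustrated with a small behavioral sketch. The bit widths below are assumptions made for illustration (4 KiB pages, hence a 12-bit page offset, and 2 alias bits); the function name `probe_index_addr` is hypothetical and does not come from the design.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages
ALIAS_BITS = 2         # assumption: the cache index needs 2 bits above the page offset


def probe_index_addr(paddr: int, alias: int) -> int:
    """Build the low address bits used to index the DCache for a probe.

    The alias bits (taken from the B channel request) form the high part,
    and the page offset of the physical address forms the low part.
    """
    if not 0 <= alias < (1 << ALIAS_BITS):
        raise ValueError("alias value does not fit in ALIAS_BITS")
    page_offset = paddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return (alias << PAGE_OFFSET_BITS) | page_offset


if __name__ == "__main__":
    # Physical address 0x80001ABC probed with alias bits 0b10.
    print(hex(probe_index_addr(0x80001ABC, 0b10)))
```

Because the page offset is identical in the virtual and physical address, only the alias bits have to be supplied by L2 for the DCache to locate the copy it should probe.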
Feature 2: Blocking Caused by Atomic Instructions
Since atomic operations (including LR/SC) are completed in the DCache, executing an LR instruction ensures that the target address is already in the DCache. To simplify the design, LR registers a reservation set in MainPipe, which records the LR address and blocks Probes to that address. Probe requests therefore cannot operate on the DCache at that address from the moment LR registers the reservation set until the matching SC arrives. To avoid deadlock, MainPipe stops blocking Probes once it has waited for the SC for a certain period (determined by the parameters LRSCCycles and LRSCBackOff). Any SC instruction received after this period is treated as a failed SC.
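A minimal behavioral sketch of this blocking window follows. It assumes the reservation is tracked by a down-counter that starts at LRSCCycles and that the reservation counts as valid while the counter is above LRSCBackOff; this interpretation, the default values 64 and 8, and all class and method names are assumptions for illustration, not taken from the text above.

```python
class LrscReservation:
    """Toy model of the LR reservation set that blocks Probes in MainPipe."""

    def __init__(self, lrsc_cycles: int = 64, lrsc_backoff: int = 8):
        self.lrsc_cycles = lrsc_cycles    # assumed counter start value
        self.lrsc_backoff = lrsc_backoff  # assumed threshold that ends the window
        self.count = 0
        self.addr = None

    def on_lr(self, block_addr: int) -> None:
        # LR registers the reservation and starts the countdown.
        self.count = self.lrsc_cycles
        self.addr = block_addr

    def tick(self) -> None:
        # One clock cycle passes while waiting for the SC.
        if self.count > 0:
            self.count -= 1

    def reservation_valid(self) -> bool:
        return self.count > self.lrsc_backoff

    def blocks_probe(self, block_addr: int) -> bool:
        # Only Probes to the reserved address are held back.
        return self.reservation_valid() and block_addr == self.addr

    def on_sc(self, block_addr: int) -> bool:
        # SC succeeds only inside the window; either way the reservation ends.
        success = self.reservation_valid() and block_addr == self.addr
        self.count = 0
        return success
```

In this model a Probe to the reserved block is stalled for at most `lrsc_cycles - lrsc_backoff` cycles, which bounds how long L2 can be kept waiting and so prevents deadlock.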
Overall Block Diagram
The overall architecture of the Probe Queue is shown in this figure.
Interface Timing
Request Interface Timing Example
This figure shows the interface timing when the Probe Queue processes a probe request. The Probe Queue first receives a probe request from L2, converts it into an internal request, and allocates an empty Probe Entry for it. After one clock cycle of state transition, the entry is ready to send a probe request to MainPipe. For timing reasons, however, the request is delayed by one more cycle: there is an arbiter in the Probe Queue that selects an entry and another arbiter at the MainPipe entrance that selects among requests from various sources, and completing both arbitrations in one cycle is difficult, so the request is latched for one cycle in between. As a result, pipe_req_valid is asserted two cycles after the request is received. The Probe Entry is released after it receives the response from MainPipe.
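The two-cycle latency can be reproduced with a tiny cycle-level sketch. It models only two registers, the entry's own request and the latched output towards MainPipe, and assumes MainPipe accepts the request immediately; the function and variable names other than `pipe_req_valid` are hypothetical.

```python
def first_pipe_req_cycle(alloc_cycle: int, n_cycles: int = 16):
    """Return the first cycle in which pipe_req_valid is high, or None."""
    entry_req = False       # entry's request into the Probe Queue arbiter
    pipe_req_valid = False  # latched request seen by the MainPipe arbiter
    for cycle in range(n_cycles):
        if pipe_req_valid:
            return cycle
        # The B channel request is accepted and an entry is allocated.
        b_fire = (cycle == alloc_cycle)
        # Clock edge: each register samples the value of the stage before it.
        pipe_req_valid = entry_req
        entry_req = b_fire
    return None


if __name__ == "__main__":
    # A request accepted in cycle 0 reaches MainPipe in cycle 2.
    print(first_pipe_req_cycle(0))
```

The first register corresponds to the entry's state transition and the second to the latch placed between the two arbiters.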
ProbeEntry Module
Each Probe Entry is controlled by a set of state registers, and the execution of a Probe transaction is managed by a state machine. This table gives the meaning of the three states of each entry, and the state machine design is shown in this figure:
State | Description |
---|---|
s_invalid | Reset state, this Probe Entry is empty |
s_pipe_req | Probe request allocated, sending the request to MainPipe |
s_wait_resp | Request to MainPipe sent, waiting for the MainPipe response |
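
The three states in the table can be tied together in a small behavioral sketch. The transition conditions (`alloc`, `pipe_req_fire`, `pipe_resp`) are hypothetical names chosen for illustration, and the model ignores the extra latch cycle on the path to MainPipe.

```python
from enum import Enum


class State(Enum):
    S_INVALID = 0    # entry is empty
    S_PIPE_REQ = 1   # request allocated, being sent to MainPipe
    S_WAIT_RESP = 2  # request sent, waiting for the MainPipe response


class ProbeEntry:
    """Toy model of one Probe Entry's state machine."""

    def __init__(self):
        self.state = State.S_INVALID
        self.req = None

    def step(self, alloc=None, pipe_req_fire=False, pipe_resp=False):
        """Advance one clock cycle and return the new state."""
        if self.state is State.S_INVALID:
            if alloc is not None:      # a Probe from L2 is allocated here
                self.req = alloc
                self.state = State.S_PIPE_REQ
        elif self.state is State.S_PIPE_REQ:
            if pipe_req_fire:          # MainPipe accepted the request
                self.state = State.S_WAIT_RESP
        elif self.state is State.S_WAIT_RESP:
            if pipe_resp:              # MainPipe responded, release the entry
                self.req = None
                self.state = State.S_INVALID
        return self.state
```

An entry that is not in `S_INVALID` ignores new allocations, which is why a queue of several such entries needs an allocator that picks an empty one.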