
Probe Queue

Function Description

The Probe Queue receives and processes coherence requests from L2. It contains 8 Probe Entries, each of which handles one Probe request. An entry converts the Probe request into internal signals and sends them to MainPipe, which modifies the permissions of the probed block. After the response from MainPipe arrives, the Probe Entry is released.

On the L2 side, the Probe Queue interacts only through the B channel; on the DCache side, it connects to MainPipe. Internally, it consists of 8 Probe Entries, each of which uses a set of state registers to control the reception, conversion, and sending of request signals.

Feature 1: Alias Problem

The Kunminghu architecture uses a 64KB VIPT cache, which introduces the cache alias problem. To solve it, the L2 Cache directory maintains the alias bit of each physical block held in the DCache. When the DCache needs to fetch a block at some physical address with a different alias bit, the L2 Cache issues a Probe request that probes the existing aliased block out of the DCache, and it records that block's alias bit in the TileLink B channel. Upon receiving the request, the Probe Queue concatenates the alias bit with the page offset, converts the result into an internal signal, and sends it to MainPipe. MainPipe then accesses the DCache storage module to read the data.
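The concatenation described above can be illustrated with a small sketch. The source does not state the page size, the associativity, or the number of alias bits; the code assumes 4 KiB pages and a 4-way cache (16 KiB per way, hence 2 alias bits above the page offset). The function name `probe_vaddr_index` is hypothetical and is not taken from the actual RTL.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages
ALIAS_BITS = 2         # assumption: 64 KiB / 4 ways = 16 KiB per way -> 14 bits, 2 above the page offset


def probe_vaddr_index(paddr: int, alias: int) -> int:
    """Build the DCache lookup address for a probe.

    The low bits come from the page offset of the probed physical address
    (identical in the virtual and physical address); the alias bits carried
    with the probe are placed directly above them.
    """
    if not 0 <= alias < (1 << ALIAS_BITS):
        raise ValueError("alias value does not fit in ALIAS_BITS")
    page_offset = paddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return (alias << PAGE_OFFSET_BITS) | page_offset


if __name__ == "__main__":
    # Physical address 0x80001ABC probed with alias bits 0b10
    print(hex(probe_vaddr_index(0x8000_1ABC, 0b10)))
```

With these assumptions, the same physical block can sit at up to four different positions in the cache, and the alias bits tell MainPipe which one to access.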

Feature 2: Blocking Caused by Atomic Instructions

Atomic operations (including LR/SC) are completed in the DCache, so executing an LR instruction ensures that the target address is already in the DCache. To simplify the design, LR registers a reservation set in MainPipe that records the LR address and blocks Probes to that address. To avoid deadlock, MainPipe stops blocking Probes once it has waited for the SC for a certain period (determined by the parameters LRSCCycles and LRSCBackOff). Any SC instruction received after this period is treated as a failed SC. Probe requests must therefore be kept from operating on the DCache from the time LR registers the reservation set until the matching SC arrives or the waiting period ends.
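The blocking window can be modelled with a simple countdown. The sketch below is only an illustration: the source names the parameters LRSCCycles and LRSCBackOff but does not say how they combine, so the code assumes that the counter starts at LRSCCycles on LR and that the reservation counts as valid only while the counter is above LRSCBackOff. The class, its methods, and the default values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LrscReservation:
    lrsc_cycles: int = 64   # hypothetical value for LRSCCycles
    lrsc_backoff: int = 8   # hypothetical value for LRSCBackOff
    addr: Optional[int] = None
    count: int = 0

    def on_lr(self, block_addr: int) -> None:
        """LR registers the reservation and starts the countdown."""
        self.addr = block_addr
        self.count = self.lrsc_cycles

    def tick(self) -> None:
        """Advance one clock cycle; drop the reservation when the count reaches zero."""
        if self.count > 0:
            self.count -= 1
            if self.count == 0:
                self.addr = None

    def _valid_for(self, block_addr: int) -> bool:
        # Assumption: valid only while the counter is above the back-off threshold.
        return self.addr == block_addr and self.count > self.lrsc_backoff

    def blocks_probe(self, block_addr: int) -> bool:
        """True if a Probe to this block must wait."""
        return self._valid_for(block_addr)

    def on_sc(self, block_addr: int) -> bool:
        """SC succeeds only inside the valid window; it always clears the reservation."""
        success = self._valid_for(block_addr)
        self.addr = None
        self.count = 0
        return success
```

Under this assumption a Probe is held off for at most LRSCCycles minus LRSCBackOff cycles, which bounds how long L2 can be stalled by an LR that is never followed by an SC.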

Overall Block Diagram

The overall architecture of the Probe Queue is shown in the following figure.

ProbeSnoop Flowchart

Interface Timing

Request Interface Timing Example

The following figure shows the interface timing for the Probe Queue processing a probe request. The Probe Queue first receives a probe request from L2, converts it into an internal request, and allocates an empty Probe Entry for it. After one clock cycle of state transition, the entry could send a probe request to MainPipe, but for timing reasons the request is delayed by one more cycle: the Probe Queue contains an arbiter that selects an entry, and the MainPipe entrance has an arbiter that selects among requests from various sources. Completing both arbitrations in one cycle is difficult, so the request is latched for one cycle in between. As a result, pipe_req_valid is asserted two cycles after the request is received. Once the response from MainPipe arrives, the Probe Entry is released.
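The two-cycle latency described above can be reproduced with a minimal cycle-level model. It is a sketch only: `entry_req` stands for the request an entry presents to the arbiter inside the Probe Queue, and `pipe_req_valid` for the latched copy seen by MainPipe. The model assumes that MainPipe accepts the request immediately, so each signal is a one-cycle pulse; the function name is hypothetical.

```python
from typing import List


def pipe_req_valid_trace(accept_cycle: int, n_cycles: int) -> List[int]:
    """Return the value of pipe_req_valid in each cycle.

    accept_cycle is the cycle in which the probe from L2 is accepted and a
    free Probe Entry is allocated.
    """
    entry_req = False       # entry is in s_pipe_req and is selected by the queue arbiter
    pipe_req_valid = False  # registered request presented to the MainPipe arbiter
    trace = []
    for cycle in range(n_cycles):
        trace.append(int(pipe_req_valid))
        # Both registers update together at the clock edge ending this cycle.
        next_pipe_req_valid = entry_req
        next_entry_req = cycle == accept_cycle
        pipe_req_valid = next_pipe_req_valid
        entry_req = next_entry_req
    return trace


if __name__ == "__main__":
    # Probe accepted in cycle 0: one cycle for the state transition, one for the latch.
    print(pipe_req_valid_trace(accept_cycle=0, n_cycles=4))
```

The extra register means each arbiter has a full cycle to itself, at the cost of one cycle of added probe latency.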

ProbeSnoop Timing

ProbeEntry Module

Each Probe Entry is controlled by a series of state registers, and the execution of a Probe transaction is managed by a state machine. The following table gives the meaning of the three states of each entry, and the state machine design is shown in the figure after it:

ProbeEntry State Register Meaning
| State | Description |
| --- | --- |
| s_invalid | Reset state; this Probe Entry is empty |
| s_pipe_req | A Probe request has been allocated; the entry is sending the request to MainPipe |
| s_wait_resp | The request has been sent to MainPipe; the entry is waiting for the MainPipe response |

ProbeEntry State Machine
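The three states listed above form a simple cycle, which can be sketched as follows. The state names come from the table; the event names (`alloc_req`, `pipe_req_fire`, `pipe_resp`) and the class itself are hypothetical and only approximate the handshakes described in the text.

```python
from enum import Enum, auto


class State(Enum):
    S_INVALID = auto()    # entry is empty
    S_PIPE_REQ = auto()   # request allocated, being sent to MainPipe
    S_WAIT_RESP = auto()  # request sent, waiting for the MainPipe response


class ProbeEntry:
    def __init__(self) -> None:
        self.state = State.S_INVALID
        self.req = None

    def step(self, alloc_req=None, pipe_req_fire=False, pipe_resp=False) -> None:
        """Advance one clock cycle given this cycle's events."""
        if self.state is State.S_INVALID:
            if alloc_req is not None:
                self.req = alloc_req
                self.state = State.S_PIPE_REQ
        elif self.state is State.S_PIPE_REQ:
            if pipe_req_fire:  # MainPipe accepted the request
                self.state = State.S_WAIT_RESP
        elif self.state is State.S_WAIT_RESP:
            if pipe_resp:      # MainPipe finished the probe; release the entry
                self.req = None
                self.state = State.S_INVALID


if __name__ == "__main__":
    entry = ProbeEntry()
    entry.step(alloc_req={"paddr": 0x8000_1000, "alias": 0b01})
    entry.step(pipe_req_fire=True)
    entry.step(pipe_resp=True)
    print(entry.state)
```

An entry ignores events that do not apply to its current state, so a response arriving while the entry is empty has no effect in this model.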