Request Arbiter and Main Pipe
The Request Arbiter and the Main Pipe together form the overall five-stage pipeline of the CoupledL2. In sequence, these are abbreviated as the first stage s1, the second stage s2, the third stage s3, the fourth stage s4, and the fifth stage s5. The Request Arbiter (ReqArbiter) primarily constitutes s1 and s2, while the Main Pipe (MainPipe) primarily constitutes s3, s4, and s5.
S0 Pipeline Stage
s0 is located only within the Request Arbiter (ReqArbiter) and is not counted as a separate pipeline stage. s0 is only used to generate the task backpressure signal for each MSHR entry. In the following situations, the ReqArbiter will prevent tasks from the MSHR from leaving the MSHR and entering the pipeline:
- An MSHR task requiring a Directory read in the previous cycle was blocked.
- A blocking signal exists from the GrantBuffer.
- A blocking signal exists from the upstream TileLink C channel.
- A blocking signal exists from the downstream TXDAT channel.
- A blocking signal exists from the downstream TXRSP channel.
- A blocking signal exists from the downstream TXREQ channel.
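The backpressure conditions above can be sketched as a single predicate. This is a minimal illustration only: the class and field names are hypothetical, and treating the six conditions as a flat OR is a simplifying assumption (the actual design may qualify each condition by the kind of task an MSHR entry wants to issue).

```python
from dataclasses import dataclass


@dataclass
class S0BlockInputs:
    """Hypothetical flags, one per blocking condition in the list above."""
    dir_read_mshr_task_blocked_last_cycle: bool = False
    grant_buffer_block: bool = False
    tl_c_block: bool = False
    txdat_block: bool = False
    txrsp_block: bool = False
    txreq_block: bool = False


def mshr_task_blocked(inp: S0BlockInputs) -> bool:
    """Return True if s0 should hold MSHR tasks back this cycle.

    Assumption: any single active condition is enough to block.
    """
    return any((
        inp.dir_read_mshr_task_blocked_last_cycle,
        inp.grant_buffer_block,
        inp.tl_c_block,
        inp.txdat_block,
        inp.txrsp_block,
        inp.txreq_block,
    ))
```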
S1 Pipeline Stage
s1 is located only within the Request Arbiter (ReqArbiter).
In s1, the following request sources are arbitrated:
- MSHR
- Upstream TileLink C channel
- Upstream TileLink B channel
- Upstream TileLink A channel
In the list above, sources listed earlier have higher priority. When several sources present requests at the ReqArbiter's s1 simultaneously, the highest-priority request is selected for handshake and the other sources are blocked. That is, MSHR tasks have the highest priority, followed by the upstream TileLink C channel, the upstream TileLink B channel, and the upstream TileLink A channel.
In s1, the ReqArbiter also needs to consider blocking signals from the MainPipe. Furthermore, a request can only leave s1 when s2 is ready; otherwise, the request is blocked and buffered in s1.
After arbitration is completed, a read request is sent to the Directory in s1.
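The fixed-priority arbitration in s1 can be illustrated with a short sketch. The source labels and function name are hypothetical. The sketch also assumes that a source blocked by the MainPipe is masked out before arbitration, so a lower-priority source may win in that cycle; the text does not state whether the real design behaves this way.

```python
from typing import Dict, Optional, Set

# Priority order from the text: MSHR first, then TileLink C, B, A.
PRIORITY = ("MSHR", "TL_C", "TL_B", "TL_A")


def arbitrate_s1(
    valid: Dict[str, bool],
    s2_ready: bool = True,
    blocked_by_mainpipe: Optional[Set[str]] = None,
) -> Optional[str]:
    """Return the source that wins s1 this cycle, or None if nothing can leave s1."""
    if not s2_ready:
        # A request may only leave s1 when s2 is ready.
        return None
    blocked = blocked_by_mainpipe or set()
    for source in PRIORITY:
        if valid.get(source, False) and source not in blocked:
            return source
    return None
```

For example, `arbitrate_s1({"MSHR": True, "TL_A": True})` selects the MSHR task, and the TileLink A request waits.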
S2 Pipeline Stage
s2 is located within both the Request Arbiter (ReqArbiter) and the Main Pipe (MainPipe).
Due to frequency limitations of the CoupledL2's SRAM, a Multi-Cycle Path 2 (MCP2) is used, meaning that each SRAM read or write request must be held for at least two cycles. Therefore, in s2, the ReqArbiter delays any back-to-back request by one cycle so that the hold time and request interval of requests on the MainPipe meet the MCP2 requirements.
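The effect of the MCP2 constraint on request timing can be illustrated with a small scheduling sketch. It is a simplified model, not the actual ReqArbiter logic: the function name is hypothetical, and requests are assumed to be issued in arrival order with a minimum spacing of two cycles.

```python
from typing import List


def schedule_mcp2(arrival_cycles: List[int], min_interval: int = 2) -> List[int]:
    """Return the cycle in which each request is issued under an MCP2-style spacing rule."""
    issue_cycles: List[int] = []
    last_issue = None
    for cycle in sorted(arrival_cycles):
        if last_issue is None:
            issue = cycle
        else:
            # A request that would follow too closely is pushed back.
            issue = max(cycle, last_issue + min_interval)
        issue_cycles.append(issue)
        last_issue = issue
    return issue_cycles


print(schedule_mcp2([0, 1, 5]))  # [0, 2, 5]
```

In this example, the request arriving in cycle 1 follows the one in cycle 0 back to back, so it is delayed by one cycle; the request in cycle 5 is unaffected.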
In s2, the ReqArbiter decides whether to read from the ReleaseBuffer or RefillBuffer. It also sends a read request to the ReleaseBuffer or RefillBuffer in s2.
In one of the following situations, the ReqArbiter will send a read request to the RefillBuffer in s2:
- The task is a downstream cache line write-back or eviction task caused by a replacement task (at this point, the data being written back is no longer needed; instead, the data read as part of the replacement is written to DataStorage).
- The task is an upstream TileLink A channel request but does not use the data returned by an upstream Probe (if data from an upstream Probe response were used, the ReleaseBuffer should be read).
In one of the following situations, the ReqArbiter will send a read request to the ReleaseBuffer in s2:
- The task is an MSHR task, and the downstream request requires reading data from an upstream Probe response.
- The task is an MSHR task, and the upstream TileLink A channel request needs to use data from an upstream Probe response.
- The task is not an MSHR task, and a nesting occurred between a downstream Snoop and a downstream write-back request task.
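The two condition lists above can be summarized as a selection function. This is a sketch under stated assumptions: all field names are hypothetical labels for the conditions in the text, and checking the ReleaseBuffer conditions first when both lists match is an assumption, since the text does not define a precedence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class S2Task:
    """Hypothetical task flags mirroring the conditions listed above."""
    is_mshr_task: bool = False
    is_replacement_writeback: bool = False    # write-back/eviction caused by a replacement
    is_tl_a_request: bool = False             # upstream TileLink A channel request
    uses_probe_data: bool = False             # A request uses data from an upstream Probe
    downstream_needs_probe_data: bool = False
    snoop_nested_with_writeback: bool = False


def select_buffer(task: S2Task) -> Optional[str]:
    """Return which buffer s2 reads for this task, or None if neither is read."""
    read_release = (
        (task.is_mshr_task and task.downstream_needs_probe_data)
        or (task.is_mshr_task and task.is_tl_a_request and task.uses_probe_data)
        or (not task.is_mshr_task and task.snoop_nested_with_writeback)
    )
    read_refill = task.is_replacement_writeback or (
        task.is_tl_a_request and not task.uses_probe_data
    )
    if read_release:  # assumed precedence, not stated in the text
        return "ReleaseBuffer"
    if read_refill:
        return "RefillBuffer"
    return None
```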
The ReqArbiter sends the task into the MainPipe in s2.
The MainPipe generates blocking signals for s1 in s2 and sends them back to the ReqArbiter and RequestBuffer. The MainPipe needs to send blocking signals to various components and channels in the following situations:
- When a task arrives at s2, unless it is certain that the task will not write to the Directory, a signal is sent to the RequestBuffer to block requests for the same Set.
- When a task arrives at s2, unless it is certain that the task will not write to the Directory, a signal is sent to the ReqArbiter to block MSHR requests for the same Set.
- When a task arrives at s2, unless it is certain that the task will not write to the Directory, a signal is sent to the ReqArbiter to block upstream TileLink C channel requests for the same Set.
- When a task arrives at s2 (as well as s3, s4, and s5, i.e., including all tasks still on the MainPipe, which will not be repeated in subsequent sections), a signal is sent to the ReqArbiter to block downstream RXSNP channel requests for the same address.
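The blocking rules above reduce to two comparisons: a same-Set check against the task in s2, and a same-address check against every task still on the MainPipe. The sketch below illustrates this; the type and function names are hypothetical, and `surely_no_dir_write` stands in for "it is certain that the task will not write to the Directory".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PipeTask:
    """Hypothetical view of a task occupying a MainPipe stage."""
    set_index: int
    address: int
    surely_no_dir_write: bool


def same_set_blocked(s2_task: Optional[PipeTask], req_set: int) -> bool:
    """Block RequestBuffer, MSHR, and TileLink C requests that target the Set of the s2 task."""
    return (
        s2_task is not None
        and not s2_task.surely_no_dir_write
        and s2_task.set_index == req_set
    )


def snoop_blocked(stage_tasks: List[Optional[PipeTask]], snoop_addr: int) -> bool:
    """Block an RXSNP request whose address matches any task in s2 through s5."""
    return any(t is not None and t.address == snoop_addr for t in stage_tasks)
```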
S3 Pipeline Stage
s3 is located only within the Main Pipe (MainPipe). Most of the request judgment, distribution logic, and interactions with various other modules are located in the s3 stage.
Cache Line State Collection
The result of the Directory read request that the ReqArbiter sends in s1 is available in s3. If a request from the downstream RXSNP channel is nested with an MSHR, meaning the address of the downstream Snoop request is the same as an outstanding MSHR address, the cache line state from that MSHR entry is used to override the Directory read result.
MSHR Allocation
The MainPipe will allocate an MSHR in the s3 stage when one of the following conditions is met:
- The task originates from the upstream TileLink A channel, and one of the following applies:
  - An Acquire*, Hint, or Get request misses the cache line.
  - An Acquire* toT request hits a cache line in the BRANCH state.
  - The request is a CBO*-type CMO request.
  - The request is an alias replacement request.
  - The task requires sending a Probe request upstream:
    - A Get request hits a cache line in the TRUNK state that exists in the upstream L1.
    - A CBOClean request hits a cache line in the TRUNK state that exists in the upstream L1.
    - A CBOFlush request hits a cache line that exists in the upstream L1.
    - A CBOInval request hits a cache line that exists in the upstream L1.
- The task originates from the downstream RXSNP channel, and one of the following applies:
  - The corresponding Snoop type hits the corresponding cache line state.
  - A Forwarding Snoop hits the cache line.
For non-Forwarding Snoop type requests from downstream, the situations requiring MSHR allocation are shown in the following table:
Snoop Request Type | Hit State | Present in L1 |
---|---|---|
SnpOnce | TRUNK | Yes |
SnpClean | TRUNK | Yes |
SnpShared | TRUNK | Yes |
SnpNotSharedDirty | TRUNK | Yes |
SnpUnique | - | Yes |
SnpCleanShared | TRUNK | Yes |
SnpCleanInvalid | - | Yes |
SnpMakeInvalid | - | Yes |
SnpMakeInvalidStash | - | Yes |
SnpUniqueStash | - | Yes |
SnpStashUnique | TRUNK | Yes |
SnpStashShared | TRUNK | Yes |
SnpQuery | TRUNK | Yes |
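The table above can be encoded as a small predicate. This sketch assumes that "-" in the Hit State column means "any state, as long as the line hits"; that reading, the function name, and the use of `None` for a miss are assumptions rather than statements from the text.

```python
from typing import Optional

# Snoop types that need an MSHR only when the hit line is in the TRUNK state.
TRUNK_ONLY = {
    "SnpOnce", "SnpClean", "SnpShared", "SnpNotSharedDirty",
    "SnpCleanShared", "SnpStashUnique", "SnpStashShared", "SnpQuery",
}
# Snoop types whose Hit State column is "-" (assumed: any hit state).
ANY_HIT_STATE = {
    "SnpUnique", "SnpCleanInvalid", "SnpMakeInvalid",
    "SnpMakeInvalidStash", "SnpUniqueStash",
}


def snoop_needs_mshr(snp_type: str, state: Optional[str], in_l1: bool) -> bool:
    """Return True if a non-Forwarding Snoop needs an MSHR per the table.

    state is None when the Snoop misses; in_l1 says whether the line
    is present in the upstream L1.
    """
    if state is None or not in_l1:
        return False
    if snp_type in ANY_HIT_STATE:
        return True
    if snp_type in TRUNK_ONLY:
        return state == "TRUNK"
    return False
```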
Directory Write
In s3, the MainPipe sends write requests to the Directory according to the task requirements.
DataStorage Read/Write
In s3, the MainPipe sends read or write requests to DataStorage according to the task requirements.
Request and Message Distribution
In s3, the MainPipe sends requests in one of the following channel directions according to the task requirements:
- Upstream TileLink D channel
- Downstream TXREQ channel
- Downstream TXRSP channel
- Downstream TXDAT channel
The specific distribution direction is determined by the task itself. See MSHR for details.
Snoop Request Handling
Downstream Snoop requests may not allocate an MSHR and may instead complete the response action directly in the MainPipe. The state transitions for such Snoop requests are determined in s3 of the MainPipe. The Snoop requests occurring in s3 and their corresponding state transitions are shown in the following table:
Snoop Request Type | Initial State | Final State | RetToSrc | Snoop Response |
---|---|---|---|---|
SnpOnce | I | I | X | SnpResp_I |
SnpOnce | UC | UC | X | SnpRespData_UC |
SnpOnce | UD | UD | X | SnpRespData_UD_PD |
SnpOnce | SC | SC | 0 | SnpResp_SC |
SnpOnce | SC | SC | 1 | SnpRespData_SC |
SnpClean, SnpShared, SnpNotSharedDirty | I | I | X | SnpResp_I |
SnpClean, SnpShared, SnpNotSharedDirty | UC | SC | X | SnpResp_SC |
SnpClean, SnpShared, SnpNotSharedDirty | UD | SC | X | SnpRespData_SC_PD |
SnpClean, SnpShared, SnpNotSharedDirty | SC | SC | 0 | SnpResp_SC |
SnpClean, SnpShared, SnpNotSharedDirty | SC | SC | 1 | SnpRespData_SC |
SnpUnique | I | I | X | SnpResp_I |
SnpUnique | UC | I | X | SnpResp_I |
SnpUnique | UD | I | X | SnpRespData_I_PD |
SnpUnique | SC | I | 0 | SnpResp_I |
SnpUnique | SC | I | 1 | SnpRespData_I |
SnpCleanShared | I | I | 0 | SnpResp_I |
SnpCleanShared | UC | UC | 0 | SnpResp_UC |
SnpCleanShared | UD | UC | 0 | SnpRespData_UC_PD |
SnpCleanShared | SC | SC | 0 | SnpResp_SC |
SnpCleanInvalid | I | I | 0 | SnpResp_I |
SnpCleanInvalid | UC | I | 0 | SnpResp_I |
SnpCleanInvalid | UD | I | 0 | SnpRespData_I_PD |
SnpCleanInvalid | SC | I | 0 | SnpResp_I |
SnpMakeInvalid | - | I | 0 | SnpResp_I |
SnpMakeInvalidStash | - | I | 0 | SnpResp_I |
SnpUniqueStash | I | I | 0 | SnpResp_I |
SnpUniqueStash | UC | I | 0 | SnpResp_I |
SnpUniqueStash | UD | I | 0 | SnpRespData_I_PD |
SnpUniqueStash | SC | I | 0 | SnpResp_I |
SnpStashUnique, SnpStashShared | I | I | 0 | SnpResp_I |
SnpStashUnique, SnpStashShared | UC | UC | 0 | SnpResp_UC |
SnpStashUnique, SnpStashShared | UD | UD | 0 | SnpResp_UD |
SnpStashUnique, SnpStashShared | SC | SC | 0 | SnpResp_SC |
SnpOnceFwd | I | I | 0 | SnpResp_I |
SnpOnceFwd | UC | UC | 0 | SnpResp_UC_Fwded_I |
SnpOnceFwd | UD | UD | 0 | SnpResp_UD_Fwded_I |
SnpOnceFwd | SC | SC | 0 | SnpResp_SC_Fwded_I |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | I | I | X | SnpResp_I |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | UC | SC | 0 | SnpResp_SC_Fwded_SC |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | UC | SC | 1 | SnpRespData_SC_Fwded_SC |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | UD | SC | X | SnpRespData_SC_PD_Fwded_SC |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | SC | SC | 0 | SnpResp_SC_Fwded_SC |
SnpCleanFwd, SnpNotSharedDirtyFwd, SnpSharedFwd | SC | SC | 1 | SnpRespData_SC_Fwded_SC |
SnpUniqueFwd | I | I | 0 | SnpResp_I |
SnpUniqueFwd | UC | I | 0 | SnpResp_I_Fwded_UC |
SnpUniqueFwd | UD | I | 0 | SnpResp_I_Fwded_UD_PD |
SnpUniqueFwd | SC | I | 0 | SnpResp_I_Fwded_UC |
SnpQuery | I | I | 0 | SnpResp_I |
SnpQuery | UC | UC | 0 | SnpResp_UC |
SnpQuery | UD | UD | 0 | SnpResp_UD |
SnpQuery | SC | SC | 0 | SnpResp_SC |
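As an illustration of how the table is read, the sketch below encodes only the SnpOnce and SnpUnique rows as a lookup from (Snoop type, initial state, RetToSrc) to (final state, response). The data structure and function name are hypothetical; "X" marks rows where RetToSrc is a don't-care.

```python
from typing import Dict, Tuple, Union

RetToSrc = Union[int, str]

# (snoop type, initial state) -> (final state, {RetToSrc: response})
SNOOP_TABLE: Dict[Tuple[str, str], Tuple[str, Dict[RetToSrc, str]]] = {
    ("SnpOnce", "I"): ("I", {"X": "SnpResp_I"}),
    ("SnpOnce", "UC"): ("UC", {"X": "SnpRespData_UC"}),
    ("SnpOnce", "UD"): ("UD", {"X": "SnpRespData_UD_PD"}),
    ("SnpOnce", "SC"): ("SC", {0: "SnpResp_SC", 1: "SnpRespData_SC"}),
    ("SnpUnique", "I"): ("I", {"X": "SnpResp_I"}),
    ("SnpUnique", "UC"): ("I", {"X": "SnpResp_I"}),
    ("SnpUnique", "UD"): ("I", {"X": "SnpRespData_I_PD"}),
    ("SnpUnique", "SC"): ("I", {0: "SnpResp_I", 1: "SnpRespData_I"}),
}


def respond(snp_type: str, state: str, ret_to_src: int) -> Tuple[str, str]:
    """Return (final state, snoop response) for the rows encoded above."""
    final_state, by_ret_to_src = SNOOP_TABLE[(snp_type, state)]
    if "X" in by_ret_to_src:
        # RetToSrc does not affect the response for this row.
        return final_state, by_ret_to_src["X"]
    return final_state, by_ret_to_src[ret_to_src]
```

For example, `respond("SnpUnique", "SC", 1)` yields `("I", "SnpRespData_I")`, matching the SnpUnique row with initial state SC and RetToSrc set to 1.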
Early Task Completion
Tasks on the MainPipe can finish early in the s3 stage and not proceed to subsequent pipeline stages if one of the following conditions is met:
- The task does not need to move data from DataStorage to the ReleaseBuffer, and one of the following conditions is met:
  - The task's request to upstream or downstream channels (Upstream TileLink D, Downstream TXREQ, Downstream TXRSP, Downstream TXDAT) successfully leaves the MainPipe in s3.
  - The task needs to allocate an MSHR.
- The task's request to the upstream TileLink D channel (AccessAckData, HintAck, GrantData, Grant) is retried.
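The early-completion rule for s3 can be written as a boolean expression. The sketch reads the list as two top-level alternatives: (no ReleaseBuffer move AND (channel request left in s3 OR MSHR allocation needed)), or a retried TileLink D request. That grouping and all parameter names are assumptions made for illustration.

```python
def finishes_in_s3(
    needs_release_buf_move: bool,
    channel_fired_s3: bool,
    needs_mshr: bool,
    d_request_retried: bool,
) -> bool:
    """Return True if the task ends in s3 and does not advance to s4."""
    no_move_case = (not needs_release_buf_move) and (channel_fired_s3 or needs_mshr)
    return no_move_case or d_request_retried
```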
S4 Pipeline Stage
If a task on the MainPipe was not finished early in the s3 stage, it enters the s4 stage. A task can finish early in the s4 stage and not proceed to subsequent pipeline stages if all of the following conditions are met:
- The task does not need to move data from DataStorage to the ReleaseBuffer.
- The task's request to upstream or downstream channels (Upstream TileLink D, Downstream TXREQ, Downstream TXRSP, Downstream TXDAT) successfully leaves the MainPipe in s4.
If the task is not finished in s4, it proceeds to the s5 stage.
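Putting the stage hand-off together, the stage in which a task leaves the MainPipe can be sketched as follows. The function and parameter names are hypothetical; whether the task already finished in s3 is taken as an input rather than recomputed here.

```python
def retire_stage(
    finished_in_s3: bool,
    needs_release_buf_move: bool,
    channel_fired_s4: bool,
) -> int:
    """Return 3, 4, or 5: the stage in which the task finishes."""
    if finished_in_s3:
        return 3
    # In s4, both conditions must hold for the task to finish early.
    if not needs_release_buf_move and channel_fired_s4:
        return 4
    return 5
```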
S5 Pipeline Stage
If a task on the MainPipe was not finished early in the s4 stage, it enters the s5 stage.
If a read request to DataStorage was initiated in the s3 stage, the data for the corresponding cache line is available in s5.
In s5, the MainPipe writes data from DataStorage or from the MainPipe to the ReleaseBuffer according to the task requirements and the request nesting situation.