Data Cache (DCache)
- Version: V2R2
- Status: WIP
- Date: 2025/02/28
Terminology
Abbreviation | Full Name | Description |
---|---|---|
TODO | TODO | TODO |
Submodule List
Submodule | Description |
---|---|
BankedDataArray | Data and ECC SRAM |
MetaArray | Metadata Register File |
TagArray | Tag and ECC SRAM |
ErrorArray | Error Flag Register File |
PrefetchArray | Prefetch Metadata Register File |
AccessArray | Access Metadata Register File |
LoadPipe | Load Access DCache Pipeline |
StorePipe | Store Access DCache Pipeline |
MainPipe | DCache Main Pipeline |
MissQueue | DCache Miss Status Handling Queue |
WritebackQueue | DCache Data Writeback Request Handling Queue |
ProbeQueue | Probe/Snoop Request Handling Queue |
CtrlUnit | DCache ECC Injection Controller |
AtomicsUnits | Atomic Instruction Units |
DCache Design Specifications
Feature | Description |
---|---|
Data Cache | 64KB, 4-way set associative, 256 sets, 8 banks per set; Virtually Indexed, Physically Tagged (VIPT); tag and each bank use SEC-DED ECC |
Cacheline | 64 Bytes |
Replacement | Pseudo-Least Recently Used (PLRU) |
Read/Write Interface | 3*128 bits read pipelines; 1*512 bits write pipeline |
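The geometry in the table can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function name `split_address` and the exact bit positions used for the set and bank indices are assumptions (a conventional layout with the line offset in the low bits), not something the specification states.

```python
# Geometry taken from the specification table.
CACHE_BYTES = 64 * 1024
WAYS = 4
SETS = 256
LINE_BYTES = 64
BANKS = 8

BANK_BYTES = LINE_BYTES // BANKS            # 8 bytes, i.e. one 64-bit data word
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 6 bits of byte offset within a line
SET_BITS = SETS.bit_length() - 1            # 8 bits of set index

# 4 ways * 256 sets * 64 bytes = 64 KB
assert WAYS * SETS * LINE_BYTES == CACHE_BYTES


def split_address(addr: int) -> dict:
    """Split an address into set, bank, and byte-in-bank (assumed layout)."""
    offset = addr & (LINE_BYTES - 1)
    return {
        "set": (addr >> OFFSET_BITS) & (SETS - 1),
        "bank": offset // BANK_BYTES,
        "byte_in_bank": offset % BANK_BYTES,
    }
```

Under this assumed layout, the set index occupies address bits [13:6] and the bank index occupies bits [5:3].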
Data RAM
For each request accessing DCache Data, the returned data from DCache Data SRAM is in the format shown in the table below.
Bit Field | Description |
---|---|
[71, 64] | ECC encoding result for the 64-bit data |
[63, 0] | 64-bit data |
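As an illustration of the layout above, the following sketch packs and unpacks one 72-bit Data SRAM word. The helper names are hypothetical, and the ECC value is treated as an opaque 8-bit field because the table does not define how it is computed.

```python
from typing import Tuple

DATA_BITS = 64  # bits [63, 0]
ECC_BITS = 8    # bits [71, 64]


def pack_data_word(data: int, ecc: int) -> int:
    """Build a 72-bit word with the ECC field above the data field."""
    assert 0 <= data < (1 << DATA_BITS)
    assert 0 <= ecc < (1 << ECC_BITS)
    return (ecc << DATA_BITS) | data


def unpack_data_word(word: int) -> Tuple[int, int]:
    """Return (data, ecc) extracted from a 72-bit word."""
    data = word & ((1 << DATA_BITS) - 1)
    ecc = (word >> DATA_BITS) & ((1 << ECC_BITS) - 1)
    return data, ecc
```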
Tag RAM
For each request accessing DCache Tag, the returned data from DCache Tag SRAM is in the format shown in the table below.
Bit Field | Description |
---|---|
[42, 36] | ECC encoding result for the 36-bit tag |
[35, 0] | 36-bit tag |
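The ECC field widths in the Data and Tag tables (8 bits and 7 bits) match what a textbook SEC-DED code needs. The sketch below assumes an extended Hamming construction (Hamming check bits plus one overall parity bit); the code actually used by the implementation may be built differently, so this only checks the bit counts.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for an extended Hamming SEC-DED code over data_bits bits."""
    r = 1
    # Single-error correction needs 2**r >= data_bits + r + 1.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1
```

With this formula, `secded_check_bits(64)` gives 8, matching bits [71, 64] of the data word, and `secded_check_bits(36)` gives 7, matching bits [42, 36] of the tag word.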
Meta
For each request accessing DCache Meta, the returned data from DCache Meta is in the format shown in the table below.
Bit Field | Description |
---|---|
[1 : 0] | Cacheline coherence metadata: 2'b00 Nothing; 2'b01 Branch; 2'b10 Trunk; 2'b11 Dirty |
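The encoding above can be written as an enumeration. The permission helpers are assumptions based on the usual TileLink meaning of these state names (Branch is read-only, Trunk is writable and clean, Dirty is writable and modified); the table itself defines only the encodings.

```python
from enum import IntEnum


class Coh(IntEnum):
    """Two-bit coherence state stored in the Meta entry."""
    NOTHING = 0b00
    BRANCH = 0b01
    TRUNK = 0b10
    DIRTY = 0b11


def can_read(state: Coh) -> bool:
    # Assumption: any valid state allows reads.
    return state != Coh.NOTHING


def can_write(state: Coh) -> bool:
    # Assumption: only Trunk and Dirty carry write permission.
    return state >= Coh.TRUNK


def needs_writeback(state: Coh) -> bool:
    # Assumption: only a Dirty line holds data newer than L2.
    return state == Coh.DIRTY
```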
Overall Block Diagram
The overall architecture of the DCache module is shown in this figure.
Functional Description
Feature 1: Load Request Handling
For a standard load request, the DCache receives a load from the LoadUnit; three load pipelines are implemented, so up to three load requests can be processed in parallel. Using the calculated address, it queries the tagArray and metaArray to determine whether the access hits. On a hit, the data response is returned. On a miss, an MSHR (MissEntry) is allocated and the request is handed to the MissQueue, which sends an Acquire request to the L2 Cache to fetch the refill data and then waits for the hint signal returned by the L2 Cache. When the l2_hint arrives, a refill request is issued to the MainPipe, which selects a replacement way and writes the refilled data block into the storage unit. At the same time, the refilled data is forwarded to the LoadUnit to complete the response. If the replaced block needs to be written back, the WritebackQueue sends a Release request to L2. If MSHR allocation fails for a missed request, the DCache signals the allocation failure, and the LoadUnit and LoadQueueReplay reschedule the load request accordingly.
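The three possible outcomes of a load (hit, miss with an MSHR allocated, and replay after a failed allocation) can be sketched as a small behavioral model. Everything here is hypothetical: the class `ToyDCache`, its method names, and the MSHR count are illustrative only. The model also ignores way selection, timing, and writeback, and it assumes that a second miss to a line with an outstanding entry reuses that entry.

```python
LINE_BYTES = 64


class ToyDCache:
    """Behavioral toy model of the load path, not the real pipeline."""

    def __init__(self, mshr_entries: int = 2):
        self.lines = {}    # line address -> 64 bytes of resident data
        self.mshr = []     # line addresses with an outstanding miss
        self.mshr_entries = mshr_entries

    def load(self, addr: int):
        line = addr // LINE_BYTES
        if line in self.lines:
            return ("hit", self.lines[line][addr % LINE_BYTES])
        if line in self.mshr:
            return ("miss", None)      # assumed: reuse the existing entry
        if len(self.mshr) >= self.mshr_entries:
            return ("replay", None)    # allocation failed, retry later
        self.mshr.append(line)         # stands in for the Acquire to L2
        return ("miss", None)

    def refill(self, line: int, data: bytes) -> None:
        """Stands in for l2_hint arriving and the MainPipe refill."""
        assert len(data) == LINE_BYTES
        self.mshr.remove(line)
        self.lines[line] = data
```

A load that returns `"replay"` corresponds to the case where the LoadUnit and LoadQueueReplay reschedule the request.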
Feature 2: Store Request Handling
For a standard store request, the DCache receives a store from the StoreBuffer. It uses the MainPipe pipeline to calculate the address and query the tag and meta to determine whether the access hits. On a hit, the DCache data is updated directly and an acknowledgment is returned. On a miss, an MSHR is allocated, the request is handed to the MissQueue, and L2 is asked to provide the original target data line for refill into the DCache. The DCache then waits for the hint signal returned by the L2 Cache. When the l2_hint arrives, a refill request is issued to the MainPipe, which selects a replacement way and writes the refilled data block into the DCache storage unit. Once the store has been applied to this data, an acknowledgment is returned to the StoreBuffer. If the replaced block needs to be written back, the WritebackQueue sends a Release request to L2. If MSHR allocation fails for a missed request, the DCache signals the allocation failure, and the StoreBuffer reschedules the store request later.
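On a store miss, the line fetched from L2 is combined with the store data before it is written into the cache. A minimal sketch of that merge is shown below, assuming a per-byte write mask; the function name `merge_store` and the mask format are illustrative assumptions, not the actual interface.

```python
def merge_store(line: bytes, offset: int, data: bytes, mask: int) -> bytes:
    """Overlay store bytes onto a refilled line.

    Bit i of mask selects whether data[i] overwrites line[offset + i].
    """
    assert offset + len(data) <= len(line)
    out = bytearray(line)
    for i, byte in enumerate(data):
        if (mask >> i) & 1:
            out[offset + i] = byte
    return bytes(out)
```

Bytes whose mask bit is clear keep the value that came from L2, which is why the original line has to be fetched before the store can complete.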
Feature 3: Atomic Instruction Handling
Atomic instructions are completed by the DCache's MainPipe pipeline, which performs the instruction computation and the read/write operations and returns a response. If the access misses, a request is likewise issued to the MissQueue to fetch the data, after which the atomic instruction continues execution. For AMO instructions, the computation is completed first, and then the result is written. For LR/SC instructions, the reservation set is set (LR) or checked (SC). While an atomic instruction is executing, the core does not issue other requests to the DCache (see the Memblock document).
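The difference between AMO and LR/SC handling can be sketched as follows. This is a simplified model under stated assumptions: the class `ToyAtomics` is hypothetical, memory is a plain dictionary of 64-bit words, and the reservation is tracked per exact address, whereas the real reservation set granularity is not given here. The return values follow the RISC-V convention: an AMO returns the old value, and SC returns 0 on success and nonzero on failure.

```python
MASK64 = (1 << 64) - 1


class ToyAtomics:
    """Toy model of AMO and LR/SC semantics, not the MainPipe itself."""

    def __init__(self):
        self.mem = {}             # address -> 64-bit value
        self.reservation = None   # address reserved by the last LR

    def amo_add(self, addr: int, operand: int) -> int:
        old = self.mem.get(addr, 0)
        # Compute first, then write the result back.
        self.mem[addr] = (old + operand) & MASK64
        return old

    def lr(self, addr: int) -> int:
        self.reservation = addr   # set the reservation
        return self.mem.get(addr, 0)

    def sc(self, addr: int, value: int) -> int:
        ok = self.reservation == addr   # check the reservation
        self.reservation = None         # any SC clears it
        if ok:
            self.mem[addr] = value & MASK64
        return 0 if ok else 1
```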
Feature 4: Probe Request Handling
The DCache receives a Probe request from the L2 Cache, and the request enters the MainPipe pipeline, which modifies the permissions of the probed data block. On a hit, an acknowledgment is returned in the next cycle.
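The permission change performed by a Probe can be sketched with the two-bit coherence encoding from the Meta table. The sketch assumes TileLink-style probes that carry a permission cap (reduce to Trunk, Branch, or Nothing); the cap parameter and the function name `probe` are assumptions, since the text above says only that permissions are modified.

```python
# Encodings from the Meta table.
NOTHING, BRANCH, TRUNK, DIRTY = 0b00, 0b01, 0b10, 0b11


def probe(state: int, cap: int):
    """Return (new_state, has_dirty_data) after a probe with the given cap.

    cap is the highest state the line may keep (assumed: NOTHING, BRANCH,
    or TRUNK). A line that was DIRTY must hand its data back.
    """
    new_state = min(state, cap)
    has_dirty_data = state == DIRTY
    return new_state, has_dirty_data
```

The `min` works only because the encoding happens to order the states from least to most permission.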
Feature 5: Replacement and Writeback
The DCache uses a write-back, write-allocate policy. A replacer module determines which block is replaced when a missed request is refilled. The configurable replacement policies are random, LRU, and PLRU, with PLRU as the default. Once the victim block is selected, it is placed in the WritebackQueue, and a Release request is issued to the L2 Cache. After the target data block for the missed request has been read from L2, it is filled into the corresponding cacheline.
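For a 4-way set, a common way to realize PLRU is a three-bit binary tree. The sketch below shows that textbook tree-PLRU scheme; the class name and the bit encoding are assumptions, and the actual replacer may encode its state differently.

```python
class TreePLRU4:
    """Three-bit tree PLRU state for one 4-way set (textbook scheme)."""

    def __init__(self):
        # bits[0]: root, 0 means the victim is in ways 0-1, 1 means ways 2-3
        # bits[1]: left pair, 0 means way 0, 1 means way 1
        # bits[2]: right pair, 0 means way 2, 1 means way 3
        self.bits = [0, 0, 0]

    def touch(self, way: int) -> None:
        """Record an access: point each node on the path away from way."""
        if way < 2:
            self.bits[0] = 1
            self.bits[1] = 1 - way
        else:
            self.bits[0] = 0
            self.bits[2] = 1 - (way - 2)

    def victim(self) -> int:
        """Follow the tree bits to the way to replace next."""
        if self.bits[0] == 0:
            return self.bits[1]
        return 2 + self.bits[2]
```

Each access updates at most two bits, which is what makes PLRU cheaper than true LRU at the cost of only approximating the least recently used way.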