
Data Cache (DCache)

Terminology

Abbreviation Full Name Description
TODO TODO TODO

Submodule List

Submodule Description
BankedDataArray Data and ECC SRAM
MetaArray Metadata Register File
TagArray Tag and ECC SRAM
ErrorArray Error Flag Register File
PrefetchArray Prefetch Metadata Register File
AccessArray Access Metadata Register File
LoadPipe Load Access DCache Pipeline
StorePipe Store Access DCache Pipeline
MainPipe DCache Main Pipeline
MissQueue DCache Miss Status Handling Queue
WritebackQueue DCache Data Writeback Request Handling Queue
ProbeQueue Probe/Snoop Request Handling Queue
CtrlUnit DCache ECC Injection Controller
AtomicsUnits Atomic Instruction Units

DCache Design Specifications

Feature Description
Data Cache 64KB, 4-way set associative, 256 sets, 8 banks per set
Virtually Indexed, Physically Tagged (VIPT)
Tag and each bank use SEC-DED ECC
Cacheline 64 Bytes
Replacement Pseudo-Least Recently Used (PLRU)
Read/Write Interface 3 × 128-bit read pipelines
1 × 512-bit write pipeline
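
Given the geometry above (64-byte cachelines, 256 sets), the set index comes from the address bits just above the byte offset and is available before translation, while the tag is taken from the physical address above the index. Below is a minimal sketch of this decomposition, with widths derived from the table rather than from the RTL; all names are illustrative.

```scala
// Minimal address-decomposition sketch for the geometry above: 64 B cachelines,
// 256 sets, 4 ways, VIPT. Widths are derived from the table, not the RTL.
object DCacheAddrSketch {
  val BlockBytes = 64                       // cacheline size in bytes
  val NumSets    = 256
  def log2(x: Int): Int = 31 - Integer.numberOfLeadingZeros(x)
  val OffsetBits = log2(BlockBytes)         // 6: byte offset within the line
  val IndexBits  = log2(NumSets)            // 8: set index, taken from the vaddr (VIPT)

  // Set index from the (untranslated) low address bits.
  def setIdx(vaddr: Long): Long = (vaddr >>> OffsetBits) & ((1L << IndexBits) - 1)
  // Physical tag: the paddr above index + offset; its exact width depends on PAddrBits.
  def tag(paddr: Long): Long    = paddr >>> (OffsetBits + IndexBits)
}
```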

Data RAM

For each request accessing DCache Data, the returned data from DCache Data SRAM is in the format shown in the table below.

Bit Field Description
[71:64] ECC code computed over the 64-bit data
[63:0] 64-bit data
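
A data entry is therefore 72 bits wide: the 64-bit word plus its 8-bit SEC-DED code. The sketch below shows the field packing implied by the table; the ECC encoder itself is not shown, and the helper names are illustrative.

```scala
// Field packing of one Data SRAM entry as laid out in the table above:
// bits [71:64] hold the ECC code, bits [63:0] hold the data word.
object DataRamEntrySketch {
  private val DataMask = (BigInt(1) << 64) - 1

  def pack(data: BigInt, ecc: Int): BigInt = (BigInt(ecc & 0xff) << 64) | (data & DataMask)
  def dataOf(entry: BigInt): BigInt        = entry & DataMask
  def eccOf(entry: BigInt): Int            = ((entry >> 64) & 0xff).toInt
}
```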

Tag RAM

For each request accessing DCache Tag, the returned data from DCache Tag SRAM is in the format shown in the table below.

Bit Field Description
[42:36] ECC code computed over the 36-bit tag
[35:0] 36-bit tag
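
The check-bit widths in the two tables are consistent with SEC-DED (extended Hamming) coding, which needs r check bits satisfying 2^r >= k + r + 1 for single-error correction on k payload bits, plus one extra parity bit for double-error detection. A quick sanity check (the function name is illustrative):

```scala
// Sanity check that the check-bit widths above (8 bits for 64-bit data,
// 7 bits for a 36-bit tag) match SEC-DED: SEC needs r check bits with
// 2^r >= k + r + 1, and DED adds one overall parity bit.
object SecDedWidth {
  def checkBits(k: Int): Int = {
    var r = 1
    while ((1 << r) < k + r + 1) r += 1
    r + 1 // +1 parity bit for double-error detection
  }

  def main(args: Array[String]): Unit = {
    println(checkBits(64)) // 8 -> data entry is 64 + 8 = 72 bits
    println(checkBits(36)) // 7 -> tag entry is 36 + 7 = 43 bits
  }
}
```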

Meta

For each request accessing DCache Meta, the returned data from DCache Meta is in the format shown in the table below.

Bit Field Description
[1:0] Cacheline coherence metadata, encoded as:
2'b00 Nothing
2'b01 Branch
2'b10 Trunk
2'b11 Dirty
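
A minimal sketch of this 2-bit encoding; the state meanings in the comments follow the usual TileLink client-state interpretation and are an assumption of this sketch rather than a quote from the table.

```scala
// The 2-bit coherence encoding from the table above.
object CoherenceStateSketch {
  val Nothing = 0 // 2'b00: no copy / no permission
  val Branch  = 1 // 2'b01: read-only copy
  val Trunk   = 2 // 2'b10: writable copy, data still clean
  val Dirty   = 3 // 2'b11: writable copy, data modified
}
```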

Overall Block Diagram

The overall architecture of the DCache module is shown in the figure below.

Overall DCache Architecture

Functional Description

Feature 1: Load Request Handling

For a standard Load request, the DCache receives a load instruction from the LoadUnit (three load pipelines are implemented, so three load requests can be processed in parallel). Based on the computed address, it queries the tagArray and metaArray to determine whether the access hits: on a cacheline hit, the data response is returned; on a miss, an MSHR (MissEntry) is allocated and the request is handed over to the MissQueue. The MissQueue is responsible for sending an Acquire request to the L2 Cache to fetch the data for refill and for waiting for the hint signal returned by the L2 Cache. When the l2_hint arrives, a refill request is initiated to the MainPipe, which selects a replacement way and writes the refilled data block into the storage array; at the same time, the fetched refill data is forwarded to the LoadUnit to complete the response. If the replaced block needs to be written back, a Release request is sent to L2 through the WritebackQueue. If MSHR allocation for a missed request fails, the DCache asserts a signal indicating the allocation failure, and the LoadUnit and LoadQueueReplay reschedule the load request accordingly.
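
A behavioral sketch of the hit/miss/allocation-failure decision described above; the flat arrays and all names are illustrative and do not reflect the actual LoadPipe or MissQueue interfaces.

```scala
// Behavioral sketch of the load path: a tag hit returns data, a miss tries to
// allocate an MSHR, and a failed allocation is reported so the LoadUnit /
// LoadQueueReplay can replay the request later. Valid/coherence bits are omitted.
sealed trait LoadResp
case class  LoadHit(data: BigInt) extends LoadResp
case object LoadMissAllocated     extends LoadResp // handed to the MissQueue; refill comes later
case object LoadMissMshrFull      extends LoadResp // MSHR allocation failed: replay the load

class LoadPathSketch(tags: Array[Array[Long]], data: Array[Array[BigInt]], var freeMshrs: Int) {
  def load(setIdx: Int, tag: Long): LoadResp = {
    val way = tags(setIdx).indexWhere(_ == tag)
    if (way >= 0)            LoadHit(data(setIdx)(way))            // hit: respond directly
    else if (freeMshrs > 0) { freeMshrs -= 1; LoadMissAllocated }  // miss: allocate an MSHR entry
    else                     LoadMissMshrFull                      // no free MSHR entry
  }
}
```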

Feature 2: Store Request Handling

For a standard Store request, the DCache receives a store instruction from the StoreBuffer. It uses the MainPipe pipeline to calculate the address and query the tag and meta to determine if there is a hit. If a cacheline is hit, the DCache data is updated directly and an acknowledgment is returned; if it is a miss, an MSHR item is allocated, the request is handed over to the MissQueue, and L2 is requested to provide the original target data line to be refilled into the DCache. The DCache then waits for the hint signal returned by the L2 Cache. When the l2_hint arrives, a refill request is initiated to the MainPipe to select a replacement way and write the refilled data block into the DCache storage unit. After completing the store operation on this data, an acknowledgment is returned to the StoreBuffer. If the replaced block needs to be written back, a Release request is sent to L2 in the WritebackQueue to write it back. If the allocation of an MSHR item for a missed request fails, the DCache provides a signal indicating MSHR allocation failure, and the StoreBuffer will reschedule the store request later.
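
A sketch of the write-allocate merge step implied above: on a store miss the whole line is refilled from L2 first, and the store bytes are then merged into it under a per-byte mask before the line is written into the data array. The function and its flat 512-bit view of the line are illustrative assumptions, not the RTL datapath.

```scala
// Merge store bytes into a refilled 64 B line under a per-byte mask.
object StoreMergeSketch {
  def merge(refillLine: BigInt, storeData: BigInt, byteMask: BigInt): BigInt = {
    var line = refillLine
    for (b <- 0 until 64 if byteMask.testBit(b)) {
      val sel = BigInt(0xff) << (b * 8)      // select byte b of the 512-bit line
      line = (line & ~sel) | (storeData & sel)
    }
    line
  }
}
```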

Feature 3: Atomic Instruction Handling

Atomic instructions are executed by the DCache's MainPipe pipeline, which performs the instruction's computation and read/write operations and returns a response. If the data misses, a request is likewise initiated to the MissQueue to fetch it, after which the atomic instruction continues execution. For AMO instructions, the computation is completed first and the result is then written. For LR/SC instructions, the reservation set is set/checked. While an atomic instruction is executing, the core does not issue other requests to the DCache (see the Memblock document).
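
A behavioral sketch of the LR/SC reservation handling mentioned above; the class and method names are illustrative, and the invalidation hook is an assumption about when a reservation may be dropped.

```scala
// LR records a reservation for the accessed line, SC succeeds only if that
// reservation is still held, and any SC clears the reservation.
class ReservationSetSketch {
  private var reserved: Option[Long] = None          // reserved cacheline address, if any

  def lr(lineAddr: Long): Unit = { reserved = Some(lineAddr) }

  def sc(lineAddr: Long): Boolean = {
    val ok = reserved.contains(lineAddr)
    reserved = None                                  // every SC clears the reservation
    ok
  }

  def invalidate(lineAddr: Long): Unit =             // e.g. on a Probe or eviction of the line
    if (reserved.contains(lineAddr)) reserved = None
}
```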

Feature 4: Probe Request Handling

For a Probe request, the DCache receives the Probe request from the L2 Cache and enters the MainPipe pipeline to modify the permissions of the probed data block; after the hit, an acknowledgment is returned in the next cycle.
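
A sketch of the permission downgrade a Probe applies to the probed block, expressed over the 2-bit states of the Meta table; reducing the probe to "keep a read-only copy" versus "drop the copy" is a simplifying assumption, not the exact RTL behavior.

```scala
// Permission downgrade on a Probe, over the Meta states (0=Nothing, 1=Branch,
// 2=Trunk, 3=Dirty). Dirty data would additionally be written back via a Release.
object ProbeDowngradeSketch {
  val Nothing = 0
  val Branch  = 1

  def next(state: Int, keepReadOnlyCopy: Boolean): Int =
    if (keepReadOnlyCopy) math.min(state, Branch)  // Trunk/Dirty fall back to Branch
    else Nothing                                   // the copy is given up entirely
}
```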

Feature 5: Replacement and Writeback

The DCache adopts a write-back, write-allocate policy. A replacer module determines which block will be replaced once a missed request is refilled. Configurable replacement policies include random, LRU, and PLRU, with PLRU being the default. After the victim block is selected, it is placed in the WritebackQueue and a Release request is issued to the L2 Cache; the target data block fetched from L2 for the missed request is then refilled into the corresponding cacheline.
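
A minimal tree-PLRU sketch for one 4-way set, matching the default policy named above; this is an illustrative model, and the configurable replacer in the RTL is more general.

```scala
// Tree-PLRU for a 4-way set: three bits form a binary tree, each access points
// the tree away from the touched way, and the victim is found by following the bits.
class PlruSet4Sketch {
  private var b0 = false  // false: victim in ways {0,1}, true: victim in ways {2,3}
  private var b1 = false  // within {0,1}: false -> way 0, true -> way 1
  private var b2 = false  // within {2,3}: false -> way 2, true -> way 3

  def victim: Int =
    if (!b0) { if (!b1) 0 else 1 }
    else     { if (!b2) 2 else 3 }

  def touch(way: Int): Unit = {            // call on every hit and on refill
    if (way < 2) { b0 = true;  b1 = (way == 0) }
    else         { b0 = false; b2 = (way == 2) }
  }
}
```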