
MissUnit Submodule Documentation

MissUnit handles ICache miss requests. It tracks them in MSHRs and communicates with the L2 Cache over the TileLink bus. It also issues the write requests to MetaArray and DataArray and sends responses to MainPipe.

MissUnit Structure

MSHR Management

MissUnit uses separate MSHRs for fetch requests and prefetch requests. There are 4 fetch MSHRs and 10 prefetch MSHRs; the fetch MSHR count is set to 4 to cover the case where fetch MSHRs cannot all be released at the time of a flush. Data and address are stored separately: each MSHR holds only the address information of its request, and all MSHRs share a single set of data registers.
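The storage split described above can be pictured with a small behavioral model. This is only a sketch in Python, not the hardware description itself: the field names `blk_paddr` and `vset_idx` and the class `MissUnitState` are hypothetical, and only the MSHR counts (4 and 10) and the single shared data buffer come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

N_FETCH_MSHR = 4      # stated in the text
N_PREFETCH_MSHR = 10  # stated in the text


@dataclass
class MSHR:
    """Address-only entry; it carries no refill data of its own."""
    valid: bool = False
    blk_paddr: int = 0  # hypothetical name for the block address
    vset_idx: int = 0   # hypothetical name for the set index


@dataclass
class MissUnitState:
    fetch_mshrs: List[MSHR] = field(
        default_factory=lambda: [MSHR() for _ in range(N_FETCH_MSHR)])
    prefetch_mshrs: List[MSHR] = field(
        default_factory=lambda: [MSHR() for _ in range(N_PREFETCH_MSHR)])
    # One data buffer shared by all 14 MSHRs (None while empty).
    shared_data: Optional[bytes] = None


state = MissUnitState()
```

The point of the layout is that adding more MSHRs only adds address-sized entries; the data storage stays constant.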

Request Enqueueing

MissUnit receives fetch requests from MainPipe and prefetch requests from IPrefetchPipe. Fetch requests can only be allocated to fetchMSHRs, and prefetch requests only to prefetchMSHRs; in both cases the free MSHR with the lowest index is chosen. On enqueue, the existing MSHRs are also looked up. If the same request is already present in an MSHR, the new request is discarded: the external interface still reports 'fire', but nothing is written into an MSHR. Enqueueing also issues a request to the Replacer for the waymask to be written.
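The enqueue rules can be sketched as a short Python model. It is an illustration under assumptions, not the actual logic: the function `enqueue`, the `MSHR` class, and matching on a single `blk_paddr` value are hypothetical, the duplicate lookup is assumed to cover both MSHR groups, and the behavior when no MSHR is free (returning "not fired") is also an assumption, since the text does not describe it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MSHR:
    valid: bool = False
    blk_paddr: int = 0  # hypothetical name for the request address


def enqueue(fetch: List[MSHR], prefetch: List[MSHR],
            blk_paddr: int, is_fetch: bool) -> Tuple[bool, Optional[int]]:
    """Return (fired, allocated_index); index is None if nothing was allocated."""
    # Duplicate check: a request already held in an MSHR is dropped,
    # but the interface still reports 'fire'.
    if any(m.valid and m.blk_paddr == blk_paddr for m in fetch + prefetch):
        return True, None
    # Fetch requests use only fetch MSHRs, prefetch only prefetch MSHRs.
    pool = fetch if is_fetch else prefetch
    # Low-index priority: take the first free entry.
    for i, m in enumerate(pool):
        if not m.valid:
            m.valid = True
            m.blk_paddr = blk_paddr
            return True, i
    # Assumption: with no free entry the request is not accepted.
    return False, None
```

For example, enqueueing the same address twice fires both times but allocates only one entry.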

Acquire

When the bus to L2 is idle, one MSHR is selected for processing. fetchMSHRs have priority over prefetchMSHRs: a prefetchMSHR is processed only when no fetchMSHR is waiting. Among fetchMSHRs, the lowest index wins. Because at most two fetch requests need processing at the same time, and execution can only continue once both have completed, the order among fetchMSHRs is not critical. Among prefetchMSHRs, first-in, first-out (FIFO) order is used to respect the temporal order of prefetch requests: a FIFO records the enqueue order, and requests are processed in that order.
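The arbitration above can be sketched as follows. This is a simplified Python model under assumptions: `select_acquire` is a hypothetical name, fetch MSHRs are reduced to a list of "needs processing" flags, and the prefetch order FIFO is modeled as a `deque` of prefetch MSHR indices.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


def select_acquire(fetch_pending: List[bool],
                   prefetch_fifo: Deque[int]) -> Optional[Tuple[str, int]]:
    """Pick the MSHR to send next, or None if nothing is waiting.

    fetch_pending[i] is True when fetch MSHR i needs processing.
    prefetch_fifo holds prefetch MSHR indices in enqueue order.
    """
    # Fetch MSHRs always win; among them the lowest index is chosen.
    for i, pending in enumerate(fetch_pending):
        if pending:
            return ("fetch", i)
    # Prefetch MSHRs are served oldest-first, only when no fetch is waiting.
    if prefetch_fifo:
        return ("prefetch", prefetch_fifo.popleft())
    return None
```

Note that the prefetch FIFO is only popped when a prefetch is actually selected, so a waiting fetch request never disturbs the recorded prefetch order.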

Grant

MissUnit interacts with the TileLink D channel through a state machine. The data width to L2 is 32 bytes per transfer, so each refill takes 2 transfers. Because transfers belonging to different requests are never interleaved, a single set of registers is enough to hold the data. When the last transfer completes, the corresponding MSHR is selected by the transfer ID; the address, mask, and other information are read from that MSHR and written to SRAM, and the MSHR is released at the same time.
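A minimal Python sketch of the grant handling is shown below. It is a behavioral illustration only: the class `GrantCollector`, the dictionary-based MSHR entries, and the field names `blk_paddr` and `waymask` are hypothetical. The 32-byte transfer size and the two transfers per refill come from the text.

```python
from typing import Dict, List, Optional

BEAT_BYTES = 32  # stated in the text
NUM_BEATS = 2    # stated in the text


class GrantCollector:
    """Collects the transfers of one refill in a single shared buffer."""

    def __init__(self) -> None:
        self.beats: List[bytes] = []

    def on_beat(self, source_id: int, data: bytes,
                mshrs: Dict[int, dict]) -> Optional[dict]:
        """Consume one transfer; return the SRAM write on the last one."""
        assert len(data) == BEAT_BYTES
        self.beats.append(data)
        if len(self.beats) < NUM_BEATS:
            return None  # still waiting for the remaining transfer
        line = b"".join(self.beats)
        self.beats = []  # buffer is free for the next request
        entry = mshrs[source_id]  # MSHR selected by transfer ID
        entry["valid"] = False    # released together with the write
        return {"blk_paddr": entry["blk_paddr"],
                "waymask": entry["waymask"],
                "data": line}
```

A single buffer is sufficient here precisely because transfers of different requests do not interleave; if they could, the model would need one buffer per outstanding request.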