Store Misaligned Access Unit StoreMisalignBuffer
Function Description
The StoreMisalignBuffer holds a single misaligned Store instruction that crosses a 16-Byte boundary. Its control logic is a finite state machine with the six states listed under State Descriptions. When the StoreUnit detects that an instruction is misaligned and crosses a 16-Byte boundary, it requests entry into the StoreMisalignBuffer. The StoreMisalignBuffer latches the Store and splits it into two separate Store memory accesses (flows), which are then sent back into the StoreUnit.
The StoreMisalignBuffer collects the responses to the Store memory accesses it initiated. Once both accesses have completed, and provided the misaligned access does not cross a page boundary, the result is written back. A scalar misaligned write-back to the backend may proceed only while StoreUnit 1 has scalar write-back disabled; otherwise the write-back to the backend is blocked. Likewise, a vector misaligned write-back to the VSMergeBuffer may proceed only while StoreUnit 1 has vector write-back disabled; otherwise the write-back to the VSMergeBuffer is blocked.
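The write-back gating described above can be sketched as a small predicate. This is only an illustrative model, not the RTL: the function and parameter names are hypothetical, and it reads "write-back disabled" as "StoreUnit 1 is not writing back in this cycle", which is an assumption.

```python
def can_write_back(is_vector: bool,
                   su1_scalar_wb_active: bool,
                   su1_vector_wb_active: bool) -> bool:
    """Return True if the StoreMisalignBuffer may write back this cycle.

    Hypothetical model: a scalar write-back is blocked while StoreUnit 1
    is performing a scalar write-back, and a vector write-back is blocked
    while StoreUnit 1 is performing a vector write-back.
    """
    if is_vector:
        # Vector results go to the VSMergeBuffer.
        return not su1_vector_wb_active
    # Scalar results go to the backend.
    return not su1_scalar_wb_active
```

In this model the two paths are independent: a scalar write-back by StoreUnit 1 does not block a vector misaligned write-back, and vice versa.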
A Store that crosses a 4K page boundary may execute only once it reaches the head of the ROB. If an older Store enters the StoreMisalignBuffer during this wait, it evicts the current 4K page-crossing Store and sets the needFlushPipe flag to true. A redirect is then generated when the Store finally writes back.
For vector operations, if a Store flow belonging to a vector instruction is evicted, the VSMergeBuffer is notified to mark the entry for that flow as needRsReplay, causing the uop to be resent.
Feature 1: Supports splitting misaligned Stores that cross a 16-Byte boundary for memory access
The state machine advances as each flow completes: after the first flow writes back, it re-enters the s_req state and sends the second flow.
If the first flow carries an exception and writes back to the StoreMisalignBuffer, it directly writes back to the backend with the exception information, without needing to execute the second flow.
Either flow may cause a replay for any reason upon write-back. The StoreMisalignBuffer chooses to resend that flow to the StoreUnit, regardless of the reason for the replay.
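The response handling described above can be summarised as a small decision function. The sketch below is a behavioural illustration only: `Action`, `FlowResp`, and `on_flow_response` are hypothetical names, and checking the replay condition before the exception condition is an assumption, since the text does not state a priority.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RESEND_FLOW = "resend the same flow to the StoreUnit"
    SEND_SECOND_FLOW = "re-enter s_req and send the second flow"
    WRITE_BACK = "write back (with exception information, if any)"


@dataclass
class FlowResp:
    need_replay: bool = False    # the flow must be replayed, for any reason
    has_exception: bool = False  # the flow raised an exception


def on_flow_response(flow_index: int, resp: FlowResp) -> Action:
    """Decide what to do when flow 0 or flow 1 writes back to the buffer."""
    if resp.need_replay:
        # Any replay cause is handled the same way: send the flow again.
        return Action.RESEND_FLOW
    if flow_index == 0 and resp.has_exception:
        # An exception on the first flow skips the second flow entirely.
        return Action.WRITE_BACK
    if flow_index == 0:
        return Action.SEND_SECOND_FLOW
    # The second flow has completed, so both flows are done.
    return Action.WRITE_BACK
```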
- sb instructions can never be misaligned.
- sh is split into two corresponding sb operations.
- sw is split differently depending on the address.
- sd is split differently depending on the address.
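To make the address dependence concrete, the following sketch computes which bytes fall on each side of the 16-Byte boundary. It is arithmetic only: the helper names are hypothetical, and it does not model how the hardware encodes each flow (for example, which operation width each flow uses).

```python
def crosses_16_bytes(addr: int, size: int) -> bool:
    """True if a `size`-byte access starting at `addr` spans a 16-byte boundary."""
    return (addr % 16) + size > 16


def split_at_16_bytes(addr: int, size: int) -> list[tuple[int, int]]:
    """Return (start_address, byte_count) pairs, one per flow.

    An access that stays inside one 16-byte block is returned unsplit.
    """
    if not crosses_16_bytes(addr, size):
        return [(addr, size)]
    low_bytes = 16 - (addr % 16)  # bytes before the boundary
    return [(addr, low_bytes), (addr + low_bytes, size - low_bytes)]
```

For a 2-byte store at 0x100F this yields two 1-byte pieces, which matches the sh case above. For 4-byte and 8-byte stores the byte counts on each side vary with the low address bits, which is why sw and sd are split differently depending on the address.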
Feature 2: Supports Vector Misaligned Access
Vector misaligned flows are handled similarly to scalar misaligned flows. The difference is that vector write-back goes to the VSMergeBuffer, while scalar write-back goes directly to the backend.
Feature 3: Does not support misaligned Stores in non-Memory space
Misaligned Stores in non-Memory space are not supported. When a Store to non-Memory space is misaligned, a StoreAddrMisalign exception is raised.
Feature 4: Supports Page-Crossing Stores
Because Stores must write to the Sbuffer, a page-crossing Store involves two physical addresses. The physical address of the lower page can be kept in the StoreQueue, but the physical address of the higher page needs a separate home, and we keep it in the StoreMisalignBuffer. Consequently, for a page-crossing Store, the entry in the StoreMisalignBuffer can be cleared only after the instruction has been committed from the StoreQueue to the Sbuffer. To support this, the StoreMisalignBuffer provides its latched metadata and address to the StoreQueue for use in the StoreQueue's write-back. Signals from the ROB and the StoreQueue determine whether the current Store's metadata must remain latched.
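The reason a page-crossing Store needs two physical addresses can be illustrated with a toy translation. Everything here is hypothetical: `page_table` is a plain dictionary from virtual page number to physical page number standing in for the real translation, and the function names are invented for the example.

```python
PAGE_SIZE = 4096  # 4K pages


def crosses_4k_page(vaddr: int, size: int) -> bool:
    """True if a `size`-byte access starting at `vaddr` spans two 4K pages."""
    return (vaddr % PAGE_SIZE) + size > PAGE_SIZE


def split_physical(vaddr: int, size: int, page_table: dict[int, int]):
    """Return (low_paddr, high_paddr); high_paddr is None if no page is crossed."""
    vpn = vaddr // PAGE_SIZE
    low_paddr = page_table[vpn] * PAGE_SIZE + vaddr % PAGE_SIZE
    if not crosses_4k_page(vaddr, size):
        return low_paddr, None
    # The upper part starts at offset 0 of the next virtual page, which
    # may map to an unrelated physical page.
    high_paddr = page_table[vpn + 1] * PAGE_SIZE
    return low_paddr, high_paddr
```

With a mapping such as `{0x1: 0x80, 0x2: 0x33}`, an 8-byte store at virtual address 0x1FFD yields 0x80FFD for the lower part and 0x33000 for the upper part. The two addresses are not contiguous, so the second one has to be held somewhere until the Store reaches the Sbuffer.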
Overall Block Diagram
State Descriptions
State | Description |
---|---|
s_idle | Waits for a misaligned Store uop to enter |
s_split | Splits the misaligned Store |
s_req | Dispatches the split misaligned Store operations to the StoreUnit |
s_resp | Receives write-back response from the StoreUnit for the split flows |
s_wb | Writes back to the backend or VSMergeBuffer |
s_block | Blocks this instruction from dequeuing until the Store Queue writes the entry to the Sbuffer |
Key Ports
Port Name | Direction | Description |
---|---|---|
redirect | In | Redirect port |
req | In | Receives enqueue request from StoreUnit |
rob | In | Receives relevant metadata information from Rob |
splitStoreReq | Out | Sends the split flow's memory access request to the StoreUnit |
splitStoreResp | In | Receives the split flow's memory access response from the StoreUnit write-back |
writeBack | Out | Scalar misaligned write-back to the backend |
vecWriteBack | Out | Vector misaligned write-back to the VSMergeBuffer |
StoreOutValid | In | Indicates a Store instruction in the Store Unit is about to write back to the backend |
StoreVecOutValid | In | Indicates a Vector Store instruction in the Store Unit is about to write back to the VSMergeBuffer |
overwriteExpBuf | Out | Left unconnected (unused) |
sqControl | In/Out | Interface for interaction with the Store Queue |
toVecStoreMergeBuffer | Out | Sends flush-related information to the VSMergeBuffer |
Interface Timing
The interface timing is relatively simple, so only textual descriptions are provided.
Port Name | Description |
---|---|
redirect | Has Valid signal. Data is valid when Valid is high. |
req | Has Valid, Ready signals. Data is valid when Valid && Ready is high. |
rob | No Valid signal. Data is always considered valid, and a response occurs whenever the corresponding signal is generated. |
splitStoreReq | Has Valid, Ready signals. Data is valid when Valid && Ready is high. |
splitStoreResp | Has Valid signal. Data is valid when Valid is high. |
writeBack | Has Valid, Ready signals. Data is valid when Valid && Ready is high. |
vecWriteBack | Has Valid, Ready signals. Data is valid when Valid && Ready is high. |
StoreOutValid | No Valid signal. Data is always considered valid, and a response occurs whenever the corresponding signal is generated. |
StoreVecOutValid | No Valid signal. Data is always considered valid, and a response occurs whenever the corresponding signal is generated. |
overwriteExpBuf | Left unconnected (unused). |
sqControl | No Valid signal. Data is always considered valid, and a response occurs whenever the corresponding signal is generated. |
toVecStoreMergeBuffer | No Valid signal. Data is always considered valid, and a response occurs whenever the corresponding signal is generated. |
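
For the ports that carry both Valid and Ready (req, splitStoreReq, writeBack, vecWriteBack), a transfer happens only in cycles where both signals are high. A minimal cycle-level sketch, with a hypothetical helper name and per-cycle traces given as lists of 0/1 values:

```python
def fire_cycles(valid_trace: list[int], ready_trace: list[int]) -> list[int]:
    """Return the cycle indices in which Valid && Ready holds, i.e. data transfers."""
    return [cycle
            for cycle, (valid, ready) in enumerate(zip(valid_trace, ready_trace))
            if valid and ready]
```

For example, with Valid = [1, 1, 0, 1] and Ready = [0, 1, 1, 1], the sender is stalled in cycle 0, nothing is offered in cycle 2, and data transfers in cycles 1 and 3.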