
Load Misalign Buffer

Function Description

The LoadMisalignBuffer holds a single misaligned Load instruction that crosses a 16-Byte boundary. Its execution logic is a 7-state state machine. When the LoadUnit detects that a Load is misaligned and crosses a 16-Byte boundary, the Load requests entry into the LoadMisalignBuffer. The LoadMisalignBuffer latches the Load and splits it into two separate Load accesses (flows), which are then re-issued to the LoadUnit.
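The entry condition can be illustrated with a minimal sketch. It assumes naturally aligned access sizes of 1, 2, 4, or 8 bytes; the function names are hypothetical and are not taken from the RTL.

```python
def is_misaligned(addr: int, size: int) -> bool:
    """A load is misaligned when its address is not a multiple of its size."""
    return addr % size != 0


def crosses_16b_boundary(addr: int, size: int) -> bool:
    """True when bytes [addr, addr + size) span two 16-byte blocks."""
    return (addr & 0xF) + size > 16


def needs_misalign_buffer(addr: int, size: int) -> bool:
    """Hypothetical entry check: misaligned AND crossing a 16-byte boundary."""
    return is_misaligned(addr, size) and crosses_16b_boundary(addr, size)
```

For example, a 2-byte load at `0x100F` touches bytes `0x100F` and `0x1010`, so it qualifies, while a 2-byte load at `0x1001` is misaligned but stays inside one 16-byte block.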

The LoadMisalignBuffer collects the responses to the Load accesses it issues. After both accesses complete, it concatenates their data and sends a wakeup operation back to the LoadUnit. This operation does not actually execute in the LoadUnit pipeline; it only triggers a wakeup signal and takes three cycles. After those three cycles, the LoadMisalignBuffer receives a write-back request from the LoadUnit, marked as originating from the wakeup operation. At this point, the LoadMisalignBuffer dequeues the entry and performs the actual write-back to the backend and the bypass paths.
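The data concatenation step can be sketched as follows. This is an illustration only: it assumes little-endian byte order (as in RISC-V) and that each flow's useful bytes have already been shifted down to bit 0. The function name and parameters are hypothetical.

```python
def concat_flows(low_data: int, low_bytes: int,
                 high_data: int, high_bytes: int) -> int:
    """Join the bytes below the 16-byte boundary (low flow) with the
    bytes above it (high flow) into one little-endian value."""
    low = low_data & ((1 << (8 * low_bytes)) - 1)     # keep only the low flow's bytes
    high = high_data & ((1 << (8 * high_bytes)) - 1)  # keep only the high flow's bytes
    return low | (high << (8 * low_bytes))            # high bytes sit above the low bytes
```

With memory byte `0x11` at address `0x100F` and `0x22` at `0x1010`, a 2-byte load at `0x100F` combines as `concat_flows(0x11, 1, 0x22, 1)`, which yields `0x2211`.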

A scalar misaligned write-back to the backend can only occur when LoadUnit 1's scalar write-back is not enabled; otherwise the LoadMisalignBuffer's write-back to the backend is blocked. Likewise, a vector misaligned write-back to the VLMergeBuffer can only occur when LoadUnit 1's vector write-back is not enabled; otherwise the LoadMisalignBuffer's write-back to the VLMergeBuffer is blocked.
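A minimal sketch of this blocking rule is shown below. It assumes that the `loadOutValid` and `loadVecOutValid` inputs listed under Main Ports carry the "LoadUnit 1 write-back enabled" conditions; the function itself is hypothetical and does not model the ready side of the handshake.

```python
def misalign_writeback_allowed(is_vector: bool,
                               load_out_valid: bool,
                               load_vec_out_valid: bool) -> bool:
    """Return True when the LoadMisalignBuffer may write back this cycle."""
    if is_vector:
        # Vector result goes to the VLMergeBuffer; blocked by a vector write-back.
        return not load_vec_out_valid
    # Scalar result goes to the backend; blocked by a scalar write-back.
    return not load_out_valid
```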

Feature 1: Supports Splitting Misaligned Loads Crossing 16-Byte Boundaries

Behavior depends on which flow has completed. After the first flow writes back, the state machine re-enters the s_req state to send the second flow. If the first flow carries an exception when it writes back to the LoadMisalignBuffer, the buffer writes back to the backend directly with that exception information and does not execute the second flow. Any flow may be replayed when it writes back; whatever the cause of the replay, the LoadMisalignBuffer resends that flow to the LoadUnit.

  • lb instructions can never be misaligned.

  • lh is split into two corresponding lb operations:

[Figure: lh split into two lb operations]

  • lw split varies depending on the address splitting method:

[Figure: lw split cases]

  • ld split varies depending on the address splitting method:

[Figure: ld split cases]
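As a rough illustration of the split, the sketch below computes only the byte ranges that fall on each side of the 16-Byte boundary. The hardware chooses specific access widths for each flow according to the address, which this sketch does not reproduce; the function name is hypothetical.

```python
def split_at_16b_boundary(addr: int, size: int):
    """Return ((low_addr, low_len), (high_addr, high_len)) for an access
    that crosses a 16-byte boundary."""
    boundary = (addr | 0xF) + 1  # first address of the next 16-byte block
    if addr + size <= boundary:
        raise ValueError("access does not cross a 16-byte boundary")
    low_len = boundary - addr    # bytes that lie below the boundary
    return (addr, low_len), (boundary, size - low_len)
```

For an lw at `0x100D`, this gives three bytes in the lower block starting at `0x100D` and one byte in the upper block at `0x1010`.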

Feature 2: Supports Vector Misalignment

Vector misaligned flows are handled the same way as scalar misalignment, with the difference being that vector write-back goes to VLMergeBuffer, while scalar write-back goes directly to the backend.

Feature 3: Does Not Support Misaligned Loads from Non-Memory Space

Misaligned Loads from non-Memory space are not supported. When a Load from non-Memory space is misaligned, it will generate a LoadAddrMisalign exception.

Overall Block Diagram

[Figure: LoadMisalignBuffer overall block diagram]

State Description

| State | Description |
| --- | --- |
| s_idle | Waiting for a misaligned Load uop to enter |
| s_split | Splitting the misaligned Load |
| s_req | Issuing the split misaligned Load operations to LoadUnit |
| s_resp | LoadUnit write-back |
| s_comb_wakeup_rep | Combining results of the two misaligned Loads, issuing wakeup uop |
| s_wb | Writing back to backend or VLMergeBuffer |
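The transitions between the states listed above can be sketched as a small software model. This is not the RTL: the event names are hypothetical, and two transitions are assumptions, namely that an exception goes straight to s_wb and that s_comb_wakeup_rep advances once the wakeup uop has been sent.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    S_IDLE = auto()
    S_SPLIT = auto()
    S_REQ = auto()
    S_RESP = auto()
    S_COMB_WAKEUP_REP = auto()
    S_WB = auto()


@dataclass
class Events:
    enq: bool = False          # a misaligned Load entered the buffer
    issued: bool = False       # LoadUnit accepted the split flow
    resp: bool = False         # LoadUnit wrote the flow back
    replay: bool = False       # the response asks for a replay
    exception: bool = False    # the response carries an exception
    last_flow: bool = False    # the response belongs to the second flow
    wakeup_sent: bool = False  # wakeup uop handed to the LoadUnit
    wb_done: bool = False      # write-back accepted


def next_state(s: State, ev: Events) -> State:
    if s is State.S_IDLE:
        return State.S_SPLIT if ev.enq else s
    if s is State.S_SPLIT:
        return State.S_REQ
    if s is State.S_REQ:
        return State.S_RESP if ev.issued else s
    if s is State.S_RESP:
        if not ev.resp:
            return s
        if ev.replay:
            return State.S_REQ  # resend the same flow, whatever the cause
        if ev.exception:
            return State.S_WB   # assumption: skip the remaining flow
        return State.S_COMB_WAKEUP_REP if ev.last_flow else State.S_REQ
    if s is State.S_COMB_WAKEUP_REP:
        return State.S_WB if ev.wakeup_sent else s  # assumption
    # Remaining case: State.S_WB
    return State.S_IDLE if ev.wb_done else s
```

A normal run with one replay of the first flow visits s_split, s_req, s_resp, s_req, s_resp, s_req, s_resp, s_comb_wakeup_rep, s_wb, and then returns to s_idle.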

Main Ports

| Port | Direction | Description |
| --- | --- | --- |
| redirect | In | Redirect port |
| req | In | Receives enqueue requests from LoadUnit |
| rob | In | Internally unused |
| splitLoadReq | Out | Sends split flow access requests to LoadUnit |
| splitLoadResp | In | Receives split flow access responses from LoadUnit |
| writeBack | Out | Scalar misaligned write-back to backend |
| vecWriteBack | Out | Vector misaligned write-back to VLMergeBuffer |
| loadOutValid | In | Load Unit has a Load instruction about to write back to backend |
| loadVecOutValid | In | Load Unit has a Vector Load instruction about to write back to VLMergeBuffer |
| overwriteExpBuf | Out | Unused |
| loadMisalignFull | Out | LoadMisalignBuffer full flag |

Interface Timing

Interface timing is relatively simple, so only textual descriptions are provided.

| Port | Description |
| --- | --- |
| redirect | Has Valid. Data is valid synchronously with Valid. |
| req | Has Valid, Ready. Data is valid synchronously with Valid && Ready. |
| rob | Internally unused. |
| splitLoadReq | Has Valid, Ready. Data is valid synchronously with Valid && Ready. |
| splitLoadResp | Has Valid. Data is valid synchronously with Valid. |
| writeBack | Has Valid, Ready. Data is valid synchronously with Valid && Ready. |
| vecWriteBack | Has Valid, Ready. Data is valid synchronously with Valid && Ready. |
| loadOutValid | Does not have Valid. Data is always considered valid; the signal is acted on as soon as it is asserted. |
| loadVecOutValid | Does not have Valid. Data is always considered valid; the signal is acted on as soon as it is asserted. |
| overwriteExpBuf | Unused. |
| loadMisalignFull | Does not have Valid. Data is always considered valid; the signal is acted on as soon as it is asserted. |