
WbDataPath

  • Version: V2R2
  • Status: OK
  • Date: 2025/01/20
  • commit: xxx

Terminology Explanation

Abbreviation | Full Name | Description
v0 | Vector Mask Register | Vector mask register
vl | Vector Length Register | Vector length register
ROB | Reorder Buffer | Reorder buffer

Submodule List

Submodule | Description
VldMergeUnit | Vector Load Data Merging Unit
RealWBCollideChecker | Writeback Arbiter

Function

The WbDataPath module contains the Vector Load Data Merging Unit (VldMergeUnit) and the Writeback Arbiter (RealWBCollideChecker). It receives the output signals of the execution units, merges the outputs of the execution units that can execute vector loads, and outputs the processed signals.

Data Processing

WbDataPath receives the output signals of the execution units (Integer Execution Unit, Floating-point Execution Unit, Vector Execution Unit, Memory Access Unit) as the input signal fromExuPre. From fromExuPre it selects the signals related to the vector load operation (VLoad) and collects them into a new sequence.

One Vector Load Data Merging Unit module is instantiated for each index of this sequence. Each module receives the redirect signal, the corresponding vector load signals, and the value of the vstart register.

Note: Because XiangShan flushes the pipeline, a vector memory access instruction that executes while vstart is not 0 uses the vstart value in the CSR as its first element. When an exception occurs, the vstart carried with the writeback data is a new value, so that value cannot be used as the starting element of the vector memory access operation.

From the input signal fromExuPre, select the indices of the ports whose parameters match those of the Vector Load Data Merging Unit modules. At these indices, replace the original vector load data in fromExuPre with the results produced by the Vector Load Data Merging Unit modules. The result is fromExu, the processed data from the execution units.
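The filter, merge, and write-back-by-index flow can be sketched as a small software model. This is only an illustration under stated assumptions: ExuOutput, has_vload, build_from_exu, and toy_merge are hypothetical names, and toy_merge is a placeholder for whatever the Vector Load Data Merging Unit actually computes.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class ExuOutput:
    """Hypothetical stand-in for one execution-unit writeback port."""
    name: str
    has_vload: bool  # assumption: marks units with vector load capability
    data: int


def build_from_exu(from_exu_pre: List[ExuOutput],
                   merge: Callable[[ExuOutput], ExuOutput]) -> List[ExuOutput]:
    # Step 1: remember the indices of the ports that can execute vector loads.
    vld_idx = [i for i, port in enumerate(from_exu_pre) if port.has_vload]
    # Step 2: one merge step per filtered port (stands in for one merge unit each).
    merged = {i: merge(from_exu_pre[i]) for i in vld_idx}
    # Step 3: substitute merged results at the same indices; other ports pass through.
    return [merged.get(i, port) for i, port in enumerate(from_exu_pre)]


def toy_merge(port: ExuOutput) -> ExuOutput:
    # Placeholder merge: NOT the real merging logic, just a visible change.
    return replace(port, data=port.data + 100)


ports = [
    ExuOutput("alu", False, 1),
    ExuOutput("vldu0", True, 2),
    ExuOutput("fpu", False, 3),
    ExuOutput("vldu1", True, 4),
]
from_exu = build_from_exu(ports, toy_merge)
```

In this model only the two vector load ports are changed, and every port keeps its original position in the sequence.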

Arbiter

Since separate register files are provided for the vector v0 and vl registers, writes to v0 and vl must also be arbitrated.

For each register file (integer, floating-point, vector, v0, vl), the input of the corresponding write arbiter is valid when the input signal's valid is active and the write enable of that register file is active.

If the Vector Execution Unit writes back to the integer register file, the integer write arbiter input is delayed by one cycle. Only execution units with undetermined latency require the arbiter's result; their result data can be held until arbitration succeeds. For execution units with determined latency, the result data is permanently lost if the request fails in the arbiter. Ports that do not write back to the physical register file are always ready, and the port with the highest priority is always ready.

Output

For each register file (integer, floating-point, vector, v0, vl), if the input of its write arbiter is valid and the write arbiter is ready, the data passes through that write arbiter and is output.

The output data is sent to three destinations: to DataPath, to be written to the register file in the next cycle; to the dispatch unit, to set the status of the physical register file to ready for instruction dispatch; and to the scheduler, for writeback wakeup.
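The input-valid and output conditions described in the Arbiter and Output sections follow the same pattern for all five register files, which a short model can make explicit. The names REG_FILES, arbiter_in_valid, and arbiter_out_fire are hypothetical, and the one-cycle delay for vector-to-integer writeback is not modeled.

```python
REG_FILES = ("int", "fp", "vec", "v0", "vl")


def arbiter_in_valid(valid: bool, wen: dict) -> dict:
    """Write arbiter input valid = input valid AND that register file's write enable."""
    return {rf: valid and wen.get(rf, False) for rf in REG_FILES}


def arbiter_out_fire(in_valid: dict, ready: dict) -> dict:
    """Data passes through a write arbiter when its input is valid and it is ready."""
    return {rf: in_valid[rf] and ready.get(rf, False) for rf in REG_FILES}


# Example: a uop that writes both the vector register file and v0,
# while only the vector write arbiter is ready in this cycle.
in_valid = arbiter_in_valid(True, {"vec": True, "v0": True})
fire = arbiter_out_fire(in_valid, {"vec": True, "v0": False})
```

Here both the vector and v0 arbiter inputs are valid, but only the vector data is output.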

Writeback to ROB

Only functional units whose output completes a successful handshake can have their output data written back to the ROB. This data is delayed by one cycle in the CtrlBlock, which checks whether the writeback data flushes the pipeline, raises an exception, fires a trigger, or requests a replay before sending it to the ROB for writeback.

Overall Block Diagram

WbDataPath Overall Block Diagram

Interface List

See interface documentation

Secondary Module VldMergeUnit

Function

The VldMergeUnit module is mainly used to handle the merging logic for vector load operations. It receives writeback data from the execution units, processes it through the VldMgu module for merging, and finally outputs the merged writeback data. This module uses the register wbReg to store intermediate data and selects whether to use the writeback data directly or the merged data based on the vlWen signal.

For uops in which vl is modified by a fault-only-first instruction, the writeback data can be used directly.
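The selection between direct and merged writeback data can be illustrated as follows. This sketch assumes that an active vlWen selects the direct path, which is a reading of the two paragraphs above rather than a confirmed detail; vld_merge_select is a hypothetical name, and neither the wbReg timing nor the internals of VldMgu are modeled.

```python
def vld_merge_select(wb_data: int, merged_data: int, vl_wen: bool) -> int:
    """Pick the value written back by the merge unit.

    Assumption: vl_wen is active for uops that modify vl (fault-only-first),
    and in that case the unmerged writeback data is used directly.
    """
    return wb_data if vl_wen else merged_data


direct = vld_merge_select(wb_data=0x11, merged_data=0x22, vl_wen=True)
merged = vld_merge_select(wb_data=0x11, merged_data=0x22, vl_wen=False)
```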

Overall Block Diagram

Vector Load Functional Unit Merging Module

Secondary Module RealWBCollideChecker

Function

The main function of the RealWBCollideChecker module is to perform conflict checking and arbitration for write ports of writeback operations. It groups input ports, instantiates an arbiter for each output port, and connects the input and output ports to the arbiters to achieve write port arbitration.

Input/Output Mapping

First, the input elements are grouped by port, then the elements within each group are sorted by priority, and finally the grouped and sorted mapping inGroup is returned.
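A minimal model of this grouping step is sketched below. It assumes each input is described by a (name, port, priority) tuple and that a smaller priority number means higher priority; group_inputs and the tuple layout are hypothetical.

```python
from collections import defaultdict


def group_inputs(inputs):
    """Group inputs by write port, then sort each group by priority.

    inputs: list of (name, port, priority) tuples (hypothetical layout).
    Returns {port: [inputs sorted so the highest priority comes first]}.
    """
    groups = defaultdict(list)
    for item in inputs:
        groups[item[1]].append(item)
    # Assumption: smaller priority value = higher priority.
    return {port: sorted(items, key=lambda it: it[2])
            for port, items in groups.items()}


in_group = group_inputs([("a", 0, 1), ("b", 1, 0), ("c", 0, 0)])
```

In the example, inputs a and c share port 0, and c comes first because of its smaller priority value.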

Arbiter Instantiation

Each arbiter is responsible for arbitrating one output port. If the mapping contains the current output port number x, a RealWBArbiter module is instantiated; otherwise, there is no corresponding arbiter for this output port.

Arbiter Input

Iterate through each arbiter. If the arbiter is not empty, connect the arbiter's input ports to the corresponding input groups in the mapping.

Arbiter

The arbiter is a priority arbiter: from multiple input requests, it selects the one with the highest priority to respond to.

  1. By default, the lowest priority request is selected: when all requests are invalid, the last input (index n-1) is chosen as the default output.
  2. Iterate through the input ports from the second lowest priority (n-2) to the highest priority (0). When the valid signal of request i is active, update chosen to i and set the output data to the data of that request. A smaller index means a higher priority (0 is the highest). Because the iteration ends at index 0, a valid request with a smaller index overwrites the assignments made for requests with larger indices.
  3. Generate the grant control signals from the valid signals of all requests; each grant bit indicates whether the corresponding request is granted. If the valid sequence has length 0, there are no requests. If it has length 1, the only request is granted directly. If it is longer than 1, the first element is granted unconditionally (the highest priority request does not need to check any preceding request), and the grant for each subsequent position is !(OR of all preceding requests): a request is granted only if no higher priority request is active.
  4. If a request is granted, its ready signal is determined by the downstream module's out.ready; if a request is invalid, its ready is always active.
  5. If the lowest priority request is granted, the output is valid only if the last request is valid; otherwise, the output is valid directly, because a higher priority request is active.
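The five steps above can be sketched as a behavioral model. It is not the RTL: arbiter_ctrl and real_wb_arbiter are hypothetical function names, the first grant bit is assumed to be unconditionally true, and the ready rule follows a literal reading of step 4 (a valid request that is not granted is treated as not ready).

```python
def arbiter_ctrl(valids):
    """Step 3: grant[i] is True only if no higher-priority request is valid."""
    if not valids:
        return []
    # Index 0 needs no check (assumed always granted); others look at all predecessors.
    return [True] + [not any(valids[:i]) for i in range(1, len(valids))]


def real_wb_arbiter(valids, data, out_ready=True):
    n = len(valids)  # assumes n >= 1
    # Step 1: default to the lowest-priority input (index n-1).
    chosen, out_data = n - 1, data[n - 1]
    # Step 2: walk from n-2 down to 0 so the smallest valid index wins.
    for i in range(n - 2, -1, -1):
        if valids[i]:
            chosen, out_data = i, data[i]
    grant = arbiter_ctrl(valids)
    # Step 4: invalid requests are always ready; valid ones need grant and out.ready.
    ready = [True if not v else (g and out_ready) for v, g in zip(valids, grant)]
    # Step 5: output is valid unless only the (invalid) last request holds the grant.
    out_valid = (not grant[-1]) or valids[-1]
    return chosen, out_data, grant, ready, out_valid


result = real_wb_arbiter([False, True, True], ["a", "b", "c"])
idle = real_wb_arbiter([False, False], ["x", "y"])
```

With requests 1 and 2 both valid, request 1 is chosen and request 2 is not granted; with no valid request, the default output is index n-1 and the output is not valid.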

Arbiter Output

Iterate through each output port. If the arbiter is not empty, connect the arbiter's output port to the corresponding output port and tie the arbiter output's ready signal high, so it is always ready. If the arbiter is empty, the output port is driven to 0.
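How arbiters map onto output ports can be sketched in software as well. The model below assumes in_group maps an output port number to a priority-sorted list of (valid, data) requests; connect_outputs and pick_first_valid are hypothetical names, and pick_first_valid is a simplified stand-in for the priority arbiter.

```python
def pick_first_valid(requests):
    """Simplified arbiter: return (valid, data) of the first valid request."""
    for valid, data in requests:
        if valid:
            return (True, data)
    # No valid request: default to the last input, output not valid.
    return (False, requests[-1][1])


def connect_outputs(in_group, num_out_ports, arbitrate=pick_first_valid):
    outputs = []
    for x in range(num_out_ports):
        if x in in_group:
            # An arbiter exists for this port; its output drives the port.
            outputs.append(arbitrate(in_group[x]))
        else:
            # No arbiter for this port: the port is driven to 0.
            outputs.append((False, 0))
    return outputs


outs = connect_outputs({0: [(False, 11), (True, 22)], 2: [(True, 33)]}, 3)
```

Port 1 has no entry in the mapping, so no arbiter is created for it and it outputs 0.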