Structs
- Loop through all control statements under the “par” block to find the number of barriers needed and the number of members of each barrier; add all cells and groups needed; then loop through all control statements, find the statements with the @sync attribute, and replace them with seq { ; incr_barrier_0_; write_barrier_0_; wait_; restore_; } or seq { ; incr_barrier__; write_barrier__; wait_; wait_restore_; }.
- Compiles @sync without use of std_sync_reg. Upon encountering @sync, it first instantiates a std_reg(1) for each thread (bar) and a std_wire(1) for each barrier (s). It then continuously assigns the value of s.in to 1’d1, guarded by the expression that all values of bar for threads under the barrier are set to 1’d1. Then it replaces the @sync control operator with seq { barrier; clear; }. barrier simply sets the value of bar to 1’d1 and then waits for s.out to be up; clear resets the value of bar to 1’d0 so the barrier can be reused. Using this method, each thread incurs only 3 cycles of latency overhead for the barrier, and there is theoretically no limit on the number of threads under one barrier.
- A pass to detect cells that have been inlined into the top-level component and turn them into real cells marked with ir::BoolAttr::External.
- Turns memory cell primitives with the @external(1) attribute into ref memory cells without the @external attribute.
- Removes all groups and inlines reads and writes from holes.
- Metadata stores a Map between each group name and data used in the metadata table (specified in PR #2022)
- Transforms all par into seq. When the correctness-checking option is on, uses analysis::ControlOrder to get a sequentialization of par such that the program still computes the same value, and errors out when there is no such sequentialization.
- Unsharing registers reduces the number of multiplexers used in the final design, trading them off for more memory.
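
The register-and-wire barrier described above for the @sync pass (one bar flag per thread, one wire s per barrier, and the seq { barrier; clear; } replacement) can be modeled in plain software. The following is only a minimal sketch under stated assumptions: BarrierModel, its sync method, and the arrivals parameter are hypothetical names invented for illustration. They are not part of the Calyx API, and the model does not reproduce the 3-cycle hardware latency.

```rust
/// Hypothetical software model of one barrier shared by several threads.
/// `bar[i]` stands in for thread i's 1-bit register; this is not Calyx code.
struct BarrierModel {
    bar: Vec<bool>,
}

impl BarrierModel {
    fn new(threads: usize) -> Self {
        BarrierModel { bar: vec![false; threads] }
    }

    /// Stands in for the wire `s`: high only while every `bar` flag is set.
    fn s(&self) -> bool {
        self.bar.iter().all(|&b| b)
    }

    /// Runs one barrier round. `arrivals[i]` is the (assumed) cycle at which
    /// thread i reaches its sync point. Returns the first cycle at which `s`
    /// is high, i.e. the cycle in which every thread may proceed.
    fn sync(&mut self, arrivals: &[u32]) -> u32 {
        assert_eq!(arrivals.len(), self.bar.len(), "one arrival per thread");
        let mut cycle = 0;
        loop {
            // `barrier` step: every thread that has arrived raises its flag.
            for (flag, &arrival) in self.bar.iter_mut().zip(arrivals) {
                if cycle >= arrival {
                    *flag = true;
                }
            }
            if self.s() {
                // `clear` step: reset every flag so the barrier can be reused.
                self.bar.iter_mut().for_each(|flag| *flag = false);
                return cycle;
            }
            cycle += 1;
        }
    }
}
```

In this model, calling sync on a three-thread barrier with arrival cycles 2, 5, and 3 releases all threads at cycle 5, the arrival time of the slowest thread, and leaves every flag cleared so a second round can run on the same barrier.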