Working with vendor EDA toolchains is never a fun experience. Something will almost certainly go wrong. If you're at Cornell, you can at least avoid installing the tools yourself by using our lab servers, Gorgonzola or Havarti. Broadly, there are three things you can do with these tools:
- Synthesize Calyx-generated RTL designs to collect area and resource estimates.
- Compile Dahlia programs via C++ and Vivado HLS for comparison with the Calyx backend.
- Compile Calyx programs for actual execution in Xilinx emulation modes or on real FPGA hardware.
You can set
fud up to use either a local installation of the Xilinx tools or one on a remote server, via SSH.
The simplest way to use the Xilinx tools is to synthesize RTL or HLS designs to collect statistics about them. This route will not produce actual, runnable executables; see the next section for that.
fud uses extra dependencies to invoke the Xilinx toolchains.
Run the following command to install all required dependencies:
```
cd fud && flit install -s --deps all
```
Follow these instructions if you're attempting to run
vivado-hls on a server from your local machine. If you are working directly on a server with these tools, skip to the run instructions.
To set up fud to invoke the Xilinx tools over SSH, first tell it your username and hostname for the server:
```
# Vivado
fud config stages.synth-verilog.ssh_host <hostname>
fud config stages.synth-verilog.ssh_username <username>

# Vivado HLS
fud config stages.vivado-hls.ssh_host <hostname>
fud config stages.vivado-hls.ssh_username <username>
```
The following commands tell fud to use the remote server for these stages by default:
```
fud config stages.synth-verilog.remote 1
fud config stages.vivado-hls.remote 1
```
The remote machine must have vivado and vivado_hls available on its path. (If you need the executable names to be something else, please file an issue.)
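To confirm the remote setup before running a long job, it can help to check that the executables are actually visible on the remote path. A quick sanity check, using the same hostname and username you gave fud above:

```
# Check that the Xilinx executables are on the remote machine's path.
# <username> and <hostname> are the values you configured above.
ssh <username>@<hostname> 'which vivado vivado_hls'
```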
To instead invoke the Xilinx tools locally, just let fud run the vivado and vivado_hls commands itself. You can optionally tell fud where these commands exist on your machine:

```
fud config stages.synth-verilog.exec <path>  # update vivado path
fud config stages.vivado-hls.exec <path>  # update vivado_hls path
```

Setting the remote option for these stages to 0 ensures that fud will always try to run the commands locally:

```
fud config stages.synth-verilog.remote 0
fud config stages.vivado-hls.remote 0
```
To run the entire toolchain and extract statistics from RTL synthesis, use the resource-estimate target state:

```
fud e --to resource-estimate examples/futil/dot-product.futil
```
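If you want to keep the report around instead of just printing it, fud's -o flag (used later in this guide when generating an xclbin) should write the output to a file. A hypothetical invocation, with an illustrative output filename:

```
# Write the resource estimate to a file instead of printing it (filename is illustrative)
fud e --to resource-estimate examples/futil/dot-product.futil -o resources.json
```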
To instead obtain the raw synthesis results, use the synth-files target state.
To run the analogous toolchain for Dahlia programs via HLS, use the hls-estimate target state:

```
fud e --to hls-estimate examples/dahlia/dot-product.fuse
```
There is also an
hls-files state for the raw results of Vivado HLS.
fud can also compile Calyx programs for actual execution, either in the Xilinx toolchain's emulation modes or for running on a physical FPGA.
This route involves generating an AXI interface wrapper for the Calyx program and invoking it using Xilinx's PYNQ interface.
As above, you can invoke the Xilinx toolchain locally or remotely, via SSH.
To set up SSH execution, you can edit your
config.toml to add settings like this:
```
[stages.xclbin]
ssh_host = "havarti"
ssh_username = "als485"
remote = 1
```
To use local execution, just leave off the remote = 1 line.
You can also set the Xilinx mode and target device:
```
[stages.xclbin]
mode = "hw_emu"
device = "xilinx_u50_gen3x16_xdma_201920_3"
```
The options for mode are hw_emu (simulation) and hw (on-FPGA execution).
The device string above is for the Alveo U50 card, which we have at Cornell. The platform files for installed Xilinx cards typically live under /opt/xilinx/platforms, which is where you can find the device name for your hardware.
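For example, assuming the platforms are installed in that default location, you can list the available device names directly; the subdirectory names there generally correspond to the device strings fud expects:

```
# List installed platform (device) names; this path is the typical default install location
ls /opt/xilinx/platforms
```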
The first step in the Xilinx toolchain is to generate an xclbin executable file.
Here's an example of going all the way from a Calyx program to that:

```
fud e examples/futil/dot-product.futil -o foo.xclbin --to xclbin
```
On our machines, compiling even a simple example like the above for simulation takes about 5 minutes, end to end. A failed run takes about 2 minutes to produce an error.
By default, the Xilinx tools run in a temporary directory that is deleted when fud finishes.
To instead keep the sandbox directory, pass -s xclbin.save_temps true.
You can then find the results in a directory named fud-out-N for some number N.
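Putting these together, a run that keeps its Xilinx sandbox around for inspection might look like this (same example program and output name as above):

```
# Generate the xclbin and keep the Xilinx sandbox directory for later inspection
fud e examples/futil/dot-product.futil -o foo.xclbin --to xclbin -s xclbin.save_temps true
```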
Now that you have an
xclbin, the next step is to run it.
Roughly speaking, the command you need is just a fud invocation that goes from the xclbin stage to the fpga state:

```
fud e foo.xclbin --from xclbin --to fpga -s fpga.data examples/dahlia/dot-product.fuse.data
```
Contrary to the name, the fpga stage works for both emulation and on-FPGA execution; fud's mode config option chooses which to use.
The fpga.data config option provides a normal fud-style JSON data input file for the run.
Currently, you will need to have a bunch of environment variables set up to point to the Xilinx tools before running this fud command. For example, on our group's havarti server, you can do this:
```
source /scratch/opt/Xilinx/Vitis/2020.2/settings64.sh
source /opt/xilinx/xrt/setup.sh
export EMCONFIG_PATH=`pwd`
```
To prepare for hardware emulation of an xclbin compiled in hw_emu mode, also run:
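```
# Tell the Xilinx runtime (and PYNQ) to use hardware emulation rather than a real device
export XCL_EMULATION_MODE=hw_emu
```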
If preparing for actual hardware execution, ensure the
XCL_EMULATION_MODE environment variable is unset:
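```
# For on-FPGA execution, the emulation-mode variable must not be set
unset XCL_EMULATION_MODE
```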
These steps source the setup scripts for both Vitis and XRT
and set EMCONFIG_PATH to your current directory so that fud can generate the special JSON configuration file that Xilinx emulation requires.
You also need to tell PYNQ whether you want
hw_emu (emulation) or the default on-device execution to occur by exporting or unsetting the
XCL_EMULATION_MODE environment variable.
Of course, it would be better if all this could come from fud's configuration itself instead of requiring you to set it up ahead of time;
issue #872 covers this work.
To get a waveform dump when emulating your program, use the fud options -s fpga.waveform true -s fpga.save_temps true.
The first option instructs XRT to use its batch debug mode and dump a VCD; the second asks fud not to delete the directory where the waveform files will appear.
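Putting this together with the execution command from above, an emulation run that also captures waveforms might look like this (same example data file as before):

```
# Run the xclbin under emulation, dump a VCD, and keep the output directory around
fud e foo.xclbin --from xclbin --to fpga \
  -s fpga.data examples/dahlia/dot-product.fuse.data \
  -s fpga.waveform true -s fpga.save_temps true
```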
Then, look in the resulting directory, which will be named fud-out-N for some number N.
In there, among the Xilinx trace files, the VCD you want is at .run/*/hw_em/device0/binary_0/behav_waveform/xsim/dump.vcd or similar.
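Any ordinary VCD viewer can open that dump; for example, assuming you have GTKWave (an open-source viewer, not part of the Xilinx tools) installed:

```
# Inspect the dumped waveform; gtkwave is just one option, any VCD viewer will do
gtkwave dump.vcd
```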
Here is how the fud Xilinx flow works under the hood.
The first step is to generate the input files. We need to generate:

- The RTL for the design itself, using the compile command-line flags -b verilog --synthesis -p external.
- A Verilog interface wrapper, using -b xilinx.
- An XML document describing the interface, using -b xilinx-xml; this becomes the kernel.xml file discussed below.

The fud driver gathers these files together in a sandbox directory.
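Concretely, these amount to three invocations of the Calyx compiler, which fud runs for you inside its sandbox. A rough sketch, assuming the compiler binary is called calyx and using illustrative input and output filenames:

```
# Sketch only: fud performs these steps automatically in its sandbox directory.
# The "calyx" binary name and the input/output filenames are illustrative.
calyx dot-product.futil -b verilog --synthesis -p external > main.sv  # the RTL itself
calyx dot-product.futil -b xilinx > toplevel.v                        # Verilog AXI interface wrapper
calyx dot-product.futil -b xilinx-xml > kernel.xml                    # XML interface description
```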
The next step is to run the Xilinx tools.
```
source <Vitis_install_path>/Vitis/2020.1/settings64.sh
source /opt/xilinx/xrt/setup.sh
```
On some Ubuntu setups, you may need to make additional adjustments to your environment before the tools will run.
You can check that everything is working by asking one of the tools for its version, e.g., by typing vitis -version.
In the Xilinx toolchain, compilation to an executable bitstream (or simulation blob) appears to require two steps:
taking your Verilog sources and creating an
.xo file, and then taking that and producing an
.xclbin “executable” file.
The idea appears to be a kind of metaphor for a standard C compilation workflow in software-land:
.xo is like a
.o object file, and
.xclbin contains actual executable code (bitstream or emulation equivalent), like a software executable binary.
Going from Verilog to
.xo is like “compilation” and going from
.xo to .xclbin is like “linking.”
However, this analogy is kind of a lie.
The step that produces the .xo file actually does very little work:
it just packages up the Verilog source code and some auxiliary files.
An .xo file is literally a zip file with that stuff packed up inside.
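If you're curious, you can see this for yourself with any unzip tool; for example, on the kernel.xo file produced by the commands below:

```
# An .xo file is just a zip archive; listing it shows the packaged Verilog and XML files
unzip -l xclbin/kernel.xo
```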
All the actual work happens during “linking,” i.e., going from
.xo to .xclbin using the v++ tool.
This situation is a poignant reminder of how impossible separate compilation is in the EDA world.
A proper analogy would involve separately compiling the Verilog into some kind of low-level representation, and then linking would properly smash together those separately-compiled objects.
Instead, in Xilinx-land, “compilation” is just simple bundling and “linking” does all the compilation in one monolithic step.
It’s kind of cute that the Xilinx toolchain is pretending the world is otherwise, but it’s also kind of sad.
Anyway, the only way to produce a
.xo file from RTL code appears to be to use Vivado (i.e., the actual vivado command).
Nothing from the newer Vitis package currently appears capable of producing
.xo files from Verilog (although
v++ can produce these files during HLS compilation, presumably by invoking Vivado).
The main components in an
.xo file, aside from the Verilog source code itself, are two XML files:
- kernel.xml, a short file describing the argument interfaces to the hardware design, and
- component.xml, a much longer and more complicated IP-XACT file that also has to do with the interface to the RTL.
We currently generate
kernel.xml ourselves (with the
xilinx-xml backend described above) and then use Vivado, via a Tcl script, to generate the IP-XACT file.
In the future, we could consider trying to route around using Vivado by generating the IP-XACT file ourselves, using a tool such as DUH.
The first step is to produce an .xo file from the generated Verilog.
We also use a static Tcl script,
gen_xo.tcl, which is a simplified version of a script from Xilinx's Vitis tutorials.
The gist of this script is that it creates a Vivado project, adds the source files, twiddles some settings, and then uses the
package_xo command to read stuff from this project as an "IP directory" and produce an .xo file.
The Vivado command line looks roughly like this:
```
vivado -mode batch -source gen_xo.tcl -tclargs xclbin/kernel.xo
```
That output filename after
-tclargs, unsurprisingly, gets passed along as an argument to the gen_xo.tcl script.
Then, we take this
.xo and turn it into an .xclbin.
This step uses the
v++ tool, with a command line that looks like this:
```
v++ -g -t hw_emu --platform xilinx_u50_gen3x16_xdma_201920_3 --save-temps \
    --profile.data all:all:all --profile.exec all:all:all \
    -lo xclbin/kernel.xclbin xclbin/kernel.xo
```
Unlike Vivado, the v++ tool doesn't need any Tcl to drive it; all the action happens on the command line.