---+ Virtex-4 FX12 Design Usage

The following reports were generated from the hardware tree as of git commit 31006e22845fb18d84880b1a5ed93d94874fe337 (Sept. 30, 2009).

Four device utilization reports follow (in sequence of CAD flow):

  • Synthesis report: Sources are synthesized into binary netlists; hierarchy is established
  • Detailed Synthesis report: same as above, but broken out into individual components
  • Mapping report: All logic is tech-mapped into a netlist of components that actually exist on the target silicon
  • Placement & Routing report: After locations for each block are established

The most interesting part is the table in the Detailed Synthesis Report, which breaks down contended resource usage for each component in the design. The rest is useful for summarizing overall utilization, and for motivating the focus on fabric (4-LUTs and FFs) and block RAMs.

There's also a table at the bottom that summarizes what IP is missing from the design.

---++ Synthesis Report

Device utilization summary:
---------------------------

Selected Device : 4vfx12sf363-10

 Number of Slices:                     7052  out of   5472   128% (*)
 Number of Slice Flip Flops:           7710  out of  10944    70%
 Number of 4 input LUTs:               9974  out of  10944    91%
    Number used as logic:              9288
    Number used as Shift registers:     398
    Number used as RAMs:                288
 Number of IOs:                         152
 Number of bonded IOBs:                 139  out of    240    57%
    IOB Flip Flops:                      44
 Number of FIFO16/RAMB16s:               33  out of     36    91%
    Number used as RAMB16s:              33
 Number of GCLKs:                        14  out of     32    43%
 Number of PPC405s:                       1  out of      1   100%
 Number of DCM_ADVs:                      4  out of      4   100%
 Number of DSP48s:                        4  out of     32    12%
 Number of EMACs:                         1  out of      1   100%

Note that the overcommitted resource, marked with (*), is the slice allocation. Slices consist of 4-LUTs, flip-flops, and local routing resources. If even a single 4-LUT is used in a given slice, the entire slice is marked as used. The synthesis report is just a dumb agglomeration of each submodule in the design; during mapping, underused slices are merged and the design becomes implementable again.
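To see why the 128% figure is not alarming, here is a rough lower bound on slice demand worked out from the report above. This is only a sketch: it assumes each slice holds two 4-LUTs and two flip-flops (consistent with the 5472 slices and 10944 LUTs/FFs in the report), the function name is made up for illustration, and it ignores real packing constraints, so the mapped figure will be higher.

```python
import math

# Assumption: two 4-LUTs and two FFs per slice, consistent with
# 10944 LUTs (and FFs) spread over 5472 slices in the synthesis report.
LUTS_PER_SLICE = 2
FFS_PER_SLICE = 2


def min_slices(luts, ffs):
    """Lower bound on slices if LUTs and FFs could be packed perfectly."""
    return max(math.ceil(luts / LUTS_PER_SLICE), math.ceil(ffs / FFS_PER_SLICE))


slices_available = 5472
synth_slices = 7052  # per-submodule slice estimates, summed by synthesis

bound = min_slices(luts=9974, ffs=7710)

print("synthesis estimate:", synth_slices,
      f"({100 * synth_slices // slices_available}% of device)")
print("perfect-packing lower bound:", bound,
      f"({100 * bound // slices_available}% of device)")
print("fits in principle:", bound <= slices_available)
```

The bound comes out below the number of available slices, which is why merging underused slices during mapping can bring the design back within the device.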

---++ Detailed Synthesis Report

What follows are synthesis reports for each IP block in the design. Since these are only synthesis reports, I've removed the slice utilization and focused on FFs, 4-LUTs, and block RAMs, and sorted the blocks roughly in order of utilization.

| IP | Slice FFs | 4-LUTs | Block RAMs | Notes |
| clock_generator_0 | 10 / 10944 0% | 6 / 10944 0% | 0/36 0% |  |
| dcm_module_0 | 5 / 10944 0% | 2 / 10944 0% | 0/36 0% |  |
| proc_sys_reset_0 | 67 / 10944 0% | 52 / 10944 0% | 0/36 0% |  |
| com1 | 143 / 10944 1% | 132 / 10944 1% | 0/36 0% |  |
| xps_intc_0 | 166 / 10944 1% | 153 / 10944 1% | 0/36 0% |  |
| xps_bram_if_cntlr_1 | 224 / 10944 2% | 181 / 10944 1% | 0/36 0% |  |
| ppc405_0 | 223 / 10944 2% | 138 / 10944 1% | 0/36 0% |  |
| xps_spi_microsd | 255 / 10944 2% | 315 / 10944 2% | 0/36 0% |  |
| xps_spi_hc | 260 / 10944 2% | 331 / 10944 3% | 0/36 0% |  |
| plb | 162 / 10944 1% | 650 / 10944 5% | 0/36 0% |  |
| plb_bram_if_cntlr_1_bram | 0 / 10944 0% | 0 / 10944 0% | 2/36 5% |  |
| flash_2mx16 | 712 / 10944 6% | 926 / 10944 8% | 0/36 0% |  |
| cryo_sequencer_0 | 1002 / 10944 9% | 1376 / 10944 12% | 2/36 5% |  |
| ddr_sdram_32mx16 | 1829 / 10944 16% | 1849 / 10944 16% | 5/36 13% |  |
| trimode_mac_gmii | 1392 / 10944 12% | 1702 / 10944 15% | 0/36 0% | MAC wrapper |
| xps_ll_fifo_0 | 737 / 10944 6% | 1398 / 10944 12% | 2/36 5% | Used to connect the MAC to the processor |
| system | 7710 / 10944 70% | 9974 / 10944 91% | 0/36 0% | Total |

The "Total" row at the bottom is taken from the overall synthesis report; its first two entries almost (but don't quite) match the sums of the columns above them. The exception is the number of BRAMs, which clearly doesn't add up: the per-block counts sum to 11 and the "system" row shows 0, while the overall synthesis report lists 33. (I know the tools will map ordinary logic onto unused BRAMs; perhaps that's what's happening here.)
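The "almost adds up" claim is easy to check by summing the table. The sketch below copies the per-block figures from the table above (the "system" row is left out and used as the reference). The variable names are just for illustration, and the two fabric columns are referred to by position rather than by name.

```python
# (name, first numeric column, second numeric column, block RAMs),
# copied from the per-IP table; the "system" row is excluded.
blocks = [
    ("clock_generator_0", 10, 6, 0),
    ("dcm_module_0", 5, 2, 0),
    ("proc_sys_reset_0", 67, 52, 0),
    ("com1", 143, 132, 0),
    ("xps_intc_0", 166, 153, 0),
    ("xps_bram_if_cntlr_1", 224, 181, 0),
    ("ppc405_0", 223, 138, 0),
    ("xps_spi_microsd", 255, 315, 0),
    ("xps_spi_hc", 260, 331, 0),
    ("plb", 162, 650, 0),
    ("plb_bram_if_cntlr_1_bram", 0, 0, 2),
    ("flash_2mx16", 712, 926, 0),
    ("cryo_sequencer_0", 1002, 1376, 2),
    ("ddr_sdram_32mx16", 1829, 1849, 5),
    ("trimode_mac_gmii", 1392, 1702, 0),
    ("xps_ll_fifo_0", 737, 1398, 2),
]

# Reference totals: the "system" row for the fabric columns, and the
# overall synthesis report for block RAMs.
system_col1, system_col2, report_brams = 7710, 9974, 33

sum_col1 = sum(b[1] for b in blocks)
sum_col2 = sum(b[2] for b in blocks)
sum_brams = sum(b[3] for b in blocks)

for label, got, ref in [
    ("first column", sum_col1, system_col1),
    ("second column", sum_col2, system_col2),
    ("block RAMs", sum_brams, report_brams),
]:
    print(f"{label}: blocks sum to {got}, total is {ref}, unaccounted {ref - got}")
```

The two fabric columns come up roughly 7% and 8% short of the totals, whereas the block RAM count is off by a factor of three.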

---++ Mapping Report

Logic Utilization:
  Number of Slice Flip Flops:         7,151 out of  10,944   65%
    Number of Slice FFs used for
    DCM autocalibration logic:         28 out of   7,151    1%
  Number of 4 input LUTs:             8,633 out of  10,944   78%
    Number of LUTs used for
    DCM autocalibration logic:         16 out of   8,633    1%
      *See INFO below for an explanation of the DCM autocalibration logic
       added by Map
Logic Distribution:
  Number of occupied Slices:          5,445 out of   5,472   99%
    Number of Slices containing only related logic:   5,445 out of   5,445 100%
    Number of Slices containing unrelated logic:          0 out of   5,445   0%
      *See NOTES below for an explanation of the effects of unrelated logic.
  Total Number of 4 input LUTs:       8,944 out of  10,944   81%
    Number used as logic:             8,036
    Number used as a route-thru:        311
    Number used for Dual Port RAMs:     288
      (Two LUTs used per Dual Port RAM)
    Number used as Shift registers:     309

These numbers are more reflective of the final design. Note that FF usage is 65% and total 4-LUT usage is about 80%, yet 99% of slices are occupied with at least some logic.
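One way to read the gap between 99% slice occupancy and 65-81% FF/LUT usage is as packing density. The sketch below uses the mapping report figures and assumes two LUT sites and two FF sites per slice (consistent with the device totals above); the variable names are illustrative only.

```python
# Figures from the mapping report.
occupied_slices = 5445
slice_ffs = 7151
total_luts = 8944  # includes route-thrus, dual-port RAMs and shift registers

SITES_PER_SLICE = 2  # assumption: two 4-LUTs and two FFs per slice

luts_per_slice = total_luts / occupied_slices
ffs_per_slice = slice_ffs / occupied_slices

# Sites left empty inside slices that are already counted as occupied.
spare_lut_sites = SITES_PER_SLICE * occupied_slices - total_luts
spare_ff_sites = SITES_PER_SLICE * occupied_slices - slice_ffs

print(f"LUTs per occupied slice: {luts_per_slice:.2f} of {SITES_PER_SLICE}")
print(f"FFs per occupied slice:  {ffs_per_slice:.2f} of {SITES_PER_SLICE}")
print(f"spare LUT sites in occupied slices: {spare_lut_sites}")
print(f"spare FF sites in occupied slices:  {spare_ff_sites}")
```

Spare sites in occupied slices are not necessarily usable, since the mapper can only pack logic together when the slice's shared resources allow it.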

---++ Post-Routing Report

Device Utilization Summary:

   Number of BUFGs                          13 out of 32     40%
   Number of BUFGCTRLs                       1 out of 32      3%
   Number of DCM_ADVs                        4 out of 4     100%
   Number of DSP48s                          4 out of 32     12%
   Number of EMACs                           1 out of 1     100%
   Number of IDELAYCTRLs                     5 out of 12     41%
      Number of LOCed IDELAYCTRLs            5 out of 5     100%

   Number of ILOGICs                        44 out of 320    13%
   Number of External IOBs                 151 out of 240    62%
      Number of LOCed IOBs                 151 out of 151   100%

   Number of JTAGPPCs                        1 out of 1     100%
   Number of OLOGICs                       110 out of 320    34%
   Number of PPC405_ADVs                     1 out of 1     100%
   Number of RAMB16s                        33 out of 36     91%
   Number of Slices                       5445 out of 5472   99%
      Number of SLICEMs                    339 out of 2736   12%

Here, again, nearly all slices are occupied (99%). We're low on block RAM primitives as well (33 of 36, or 91%, are used).

---++ Missing IP

The following is a laundry-list of stuff that we would like to have in the design, but currently don't. (We can live without most of it.)

  • DMA support for the Ethernet MAC (currently using a slow FIFO driver that will have limited driver support in the future)
  • Timestamp receivers for IRIG-B and Timesync
  • Anything connected to VME (e.g. support for uSD cards on the time distribution board)
  • Support for COM2
  • Watchdog support

From the DFMUX design synthesis, here are the approximate sizes of the missing blocks:

| IP | Slice FFs | 4-LUTs | Block RAMs | Notes |
| wtl_irig_decoder_0 | 1119 / 10944 10% | 527 / 10944 5% | 0/36 0% | Yikes! |
| wtl_timesync_client_0 | 488 / 10944 4% | 435 / 10944 4% | 0/36 0% |  |
| wtl_watchdog_wrapper_0 | 841 / 10944 8% | 486 / 10944 4% | 0/36 0% | Ouch! |
| com2 | 143 / 10944 1% | 132 / 10944 1% | 0/36 0% |  |

The killer here seems to be large bit vectors. For example, the IRIG decoder uses a number of 64-bit counters; 1000 flip-flops disappear in a hurry that way. I have little doubt that we could optimize these blocks if needed.
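As a back-of-the-envelope illustration of how quickly wide counters eat flip-flops, here is a small sketch. It assumes one flip-flop per counter bit and nothing else. The count of 16 counters is purely hypothetical (the actual number in the IRIG decoder is not stated here), and the function names are made up for this example.

```python
DEVICE_FFS = 10944  # slice flip-flops on the FX12, from the reports above


def counter_ffs(width_bits, count):
    """Flip-flops needed for `count` counters, one FF per counter bit."""
    return width_bits * count


def counters_within(ff_budget, width_bits):
    """How many whole counters of the given width fit in an FF budget."""
    return ff_budget // width_bits


# How many 64-bit counters does it take to burn about 1000 flip-flops?
print(counters_within(1000, 64))

# Hypothetical: 16 such counters, as a share of the whole device.
ffs = counter_ffs(64, 16)
print(ffs, f"{100 * ffs / DEVICE_FFS:.1f}% of device FFs")
```

Narrowing counters to the range actually needed is the obvious first optimization if these blocks have to fit.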


Topic revision: r2 - 2009-10-09 - GraemeSmecher