---+ Virtex-4 FX12 Design Usage
The following reports were generated from the hardware tree as of git commit
31006e22845fb18d84880b1a5ed93d94874fe337
(Sept. 30, 2009).
Four device utilization reports follow (in CAD-flow order):
- Synthesis report: Sources are synthesized into binary netlists; hierarchy is established
- Detailed Synthesis report: same as above, but broken out into individual components
- Mapping report: All logic is tech-mapped into a netlist of components that actually exist on the target silicon
- Placement & Routing report: Generated after locations for each block are established
The interesting part is the table, below, focusing on contended resource usage for each component in the design. The rest is useful for summarizing utilization, and for motivating the focus on fabric (4-LUTs and FFs) and block RAMs.
There's also a table at the bottom that summarizes what IP is missing from the design.
---++ Synthesis Report
Device utilization summary:
---------------------------

Selected Device : 4vfx12sf363-10

Number of Slices: 7052 out of 5472 128% (*)
Number of Slice Flip Flops: 7710 out of 10944 70%
Number of 4 input LUTs: 9974 out of 10944 91%
Number used as logic: 9288
Number used as Shift registers: 398
Number used as RAMs: 288
Number of IOs: 152
Number of bonded IOBs: 139 out of 240 57%
IOB Flip Flops: 44
Number of FIFO16/RAMB16s: 33 out of 36 91%
Number used as RAMB16s: 33
Number of GCLKs: 14 out of 32 43%
Number of PPC405s: 1 out of 1 100%
Number of DCM_ADVs: 4 out of 4 100%
Number of DSP48s: 4 out of 32 12%
Number of EMACs: 1 out of 1 100%
Note that the overcommitted resource, marked with (*), is the Slice allocation. Slices consist of 4-LUTs, flip-flops, and local routing resources. If even a single 4-LUT in a given slice is used, that entire slice is marked as used. The synthesis report simply adds up the slice counts of each submodule in the design; during mapping, underused slices are merged and the design becomes implementable again.
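As a rough sanity check on the claim that merging makes the design fit, the sketch below computes a lower bound on slice usage from the synthesis figures above. It assumes two 4-LUTs and two flip-flops per slice (consistent with the report listing 10944 of each across 5472 slices) and perfect packing, which the mapper will not achieve in practice. The function name is illustrative only.

```python
import math

# Figures copied from the synthesis report above.
SLICES_AVAILABLE = 5472
SLICES_REPORTED = 7052
LUTS_USED = 9974
FFS_USED = 7710

# Assumption: each slice holds two 4-LUTs and two flip-flops
# (10944 / 5472 = 2 for both resources in the report).
LUTS_PER_SLICE = 2
FFS_PER_SLICE = 2


def min_slices(luts: int, ffs: int) -> int:
    """Lower bound on slices if every slice were packed completely."""
    return max(math.ceil(luts / LUTS_PER_SLICE), math.ceil(ffs / FFS_PER_SLICE))


bound = min_slices(LUTS_USED, FFS_USED)
print(f"synthesis estimate: {SLICES_REPORTED} slices "
      f"({100 * SLICES_REPORTED // SLICES_AVAILABLE}% of device)")
print(f"perfect-packing bound: {bound} slices "
      f"({100 * bound // SLICES_AVAILABLE}% of device)")
```

Under these assumptions the bound is 4987 slices, which fits within the 5472 available, so the 128% figure reflects unmerged per-module slice counts rather than a shortage of LUTs or flip-flops.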
---++ Detailed Synthesis Report
What follows are synthesis reports for each IP block in the design. Since these are only synthesis reports, I've removed the slice utilization and focused on FFs, 4-LUTs, and block RAMs, and sorted the blocks roughly in order of utilization.
| *IP* | *4-LUTs* || *Slice FFs* || *Block RAMs* || *Notes* |
| clock_generator_0 | 10 / 10944 | 0% | 6 / 10944 | 0% | 0/36 | 0% |  |
| dcm_module_0 | 5 / 10944 | 0% | 2 / 10944 | 0% | 0/36 | 0% |  |
| proc_sys_reset_0 | 67 / 10944 | 0% | 52 / 10944 | 0% | 0/36 | 0% |  |
| com1 | 143 / 10944 | 1% | 132 / 10944 | 1% | 0/36 | 0% |  |
| xps_intc_0 | 166 / 10944 | 1% | 153 / 10944 | 1% | 0/36 | 0% |  |
| xps_bram_if_cntlr_1 | 224 / 10944 | 2% | 181 / 10944 | 1% | 0/36 | 0% |  |
| ppc405_0 | 223 / 10944 | 2% | 138 / 10944 | 1% | 0/36 | 0% |  |
| xps_spi_microsd | 255 / 10944 | 2% | 315 / 10944 | 2% | 0/36 | 0% |  |
| xps_spi_hc | 260 / 10944 | 2% | 331 / 10944 | 3% | 0/36 | 0% |  |
| plb | 162 / 10944 | 1% | 650 / 10944 | 5% | 0/36 | 0% |  |
| plb_bram_if_cntlr_1_bram | 0 / 10944 | 0% | 0 / 10944 | 0% | 2/36 | 5% |  |
| flash_2mx16 | 712 / 10944 | 6% | 926 / 10944 | 8% | 0/36 | 0% |  |
| cryo_sequencer_0 | 1002 / 10944 | 9% | 1376 / 10944 | 12% | 2/36 | 5% |  |
| ddr_sdram_32mx16 | 1829 / 10944 | 16% | 1849 / 10944 | 16% | 5/36 | 13% |  |
| trimode_mac_gmii | 1392 / 10944 | 12% | 1702 / 10944 | 15% | 0/36 | 0% | MAC wrapper |
| xps_ll_fifo_0 | 737 / 10944 | 6% | 1398 / 10944 | 12% | 2/36 | 5% | Used to connect the MAC to the processor |
| system | 7710 / 10944 | 70% | 9974 / 10944 | 91% | 0/36 | 0% | Total |
The "Total" row at the bottom is taken from the overall synthesis report; the per-block figures in each column almost (but don't quite) add up to it. The exception is the number of BRAMs, which clearly doesn't add up. (I know the tools will map ordinary logic onto unused BRAMs; perhaps that's what's happening here.)
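To put numbers on "almost", the sketch below sums each column of the table and compares it with the reported totals. Column labels and the "system" row are taken from the table as printed, and the block RAM total (33) comes from the overall synthesis report. The variable names are illustrative.

```python
# Per-block counts copied from the table above: (4-LUTs, Slice FFs, Block RAMs).
blocks = {
    "clock_generator_0": (10, 6, 0),
    "dcm_module_0": (5, 2, 0),
    "proc_sys_reset_0": (67, 52, 0),
    "com1": (143, 132, 0),
    "xps_intc_0": (166, 153, 0),
    "xps_bram_if_cntlr_1": (224, 181, 0),
    "ppc405_0": (223, 138, 0),
    "xps_spi_microsd": (255, 315, 0),
    "xps_spi_hc": (260, 331, 0),
    "plb": (162, 650, 0),
    "plb_bram_if_cntlr_1_bram": (0, 0, 2),
    "flash_2mx16": (712, 926, 0),
    "cryo_sequencer_0": (1002, 1376, 2),
    "ddr_sdram_32mx16": (1829, 1849, 5),
    "trimode_mac_gmii": (1392, 1702, 0),
    "xps_ll_fifo_0": (737, 1398, 2),
}

# "system" row of the table, as printed.
table_total = (7710, 9974)
# Block RAM count from the overall synthesis report.
report_brams = 33

labels = ("4-LUTs", "Slice FFs", "Block RAMs")
sums = tuple(sum(row[i] for row in blocks.values()) for i in range(3))
reported = table_total + (report_brams,)

for label, got, want in zip(labels, sums, reported):
    print(f"{label}: per-block sum {got}, reported total {want}, shortfall {want - got}")
```

The per-block sums come to 7187 and 9211 against reported totals of 7710 and 9974, while the block RAM rows account for only 11 of the 33 RAMB16s in the overall report.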
---++ Mapping Report
Logic Utilization:
Number of Slice Flip Flops: 7,151 out of 10,944 65%
Number of Slice FFs used for
DCM autocalibration logic: 28 out of 7,151 1%
Number of 4 input LUTs: 8,633 out of 10,944 78%
Number of LUTs used for
DCM autocalibration logic: 16 out of 8,633 1%
*See INFO below for an explanation of the DCM autocalibration logic
added by Map
Logic Distribution:
Number of occupied Slices: 5,445 out of 5,472 99%
Number of Slices containing only related logic: 5,445 out of 5,445 100%
Number of Slices containing unrelated logic: 0 out of 5,445 0%
*See NOTES below for an explanation of the effects of unrelated logic.
Total Number of 4 input LUTs: 8,944 out of 10,944 81%
Number used as logic: 8,036
Number used as a route-thru: 311
Number used for Dual Port RAMs: 288
(Two LUTs used per Dual Port RAM)
Number used as Shift registers: 309
These numbers are more reflective of the final design. Note that 4-LUT usage is about 80% and FF usage is 65%, and nearly all slices (99%) are occupied with at least some logic.
---++ Post-Routing Report
Device Utilization Summary:

Number of BUFGs 13 out of 32 40%
Number of BUFGCTRLs 1 out of 32 3%
Number of DCM_ADVs 4 out of 4 100%
Number of DSP48s 4 out of 32 12%
Number of EMACs 1 out of 1 100%
Number of IDELAYCTRLs 5 out of 12 41%
Number of LOCed IDELAYCTRLs 5 out of 5 100%

Number of ILOGICs 44 out of 320 13%
Number of External IOBs 151 out of 240 62%
Number of LOCed IOBs 151 out of 151 100%

Number of JTAGPPCs 1 out of 1 100%
Number of OLOGICs 110 out of 320 34%
Number of PPC405_ADVs 1 out of 1 100%
Number of RAMB16s 33 out of 36 91%
Number of Slices 5445 out of 5472 99%
Number of SLICEMs 339 out of 2736 12%
Here, again, nearly all slices are occupied (99%). We're also low on block RAM primitives (91% are used).
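For tracking these figures across builds, a small parser for the "N out of M P%" lines can help. The sketch below is a minimal example, not part of the tool flow: the regular expression, the `Utilization` type, and `parse_line` are all hypothetical names, and the pattern is only known to handle the line shapes quoted on this page (with or without a leading log line number, and with or without thousands separators).

```python
import re
from typing import NamedTuple, Optional


class Utilization(NamedTuple):
    name: str
    used: int
    available: int
    reported_pct: int


# Matches e.g. "Number of Slices 5445 out of 5472 99%".
# An optional leading log line number and an optional colon are tolerated.
LINE_RE = re.compile(
    r"^\s*(?:\d+\s+)?(?P<name>.*?):?\s+(?P<used>[\d,]+)\s+out of\s+"
    r"(?P<avail>[\d,]+)\s+(?P<pct>\d+)%"
)


def _to_int(text: str) -> int:
    """Convert a count such as '5,445' to an integer."""
    return int(text.replace(",", ""))


def parse_line(line: str) -> Optional[Utilization]:
    """Return the parsed utilization, or None if the line has no 'out of' figure."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return Utilization(m["name"].strip(), _to_int(m["used"]),
                       _to_int(m["avail"]), int(m["pct"]))


sample = """\
Number of RAMB16s 33 out of 36 91%
Number of Slices 5445 out of 5472 99%
Number of SLICEMs 339 out of 2736 12%
Number of occupied Slices: 5,445 out of 5,472 99%
"""

for line in sample.splitlines():
    u = parse_line(line)
    if u is not None:
        # Integer division drops the fractional part of the percentage.
        pct = 100 * u.used // u.available
        print(f"{u.name}: {u.used}/{u.available} = {pct}% (reported {u.reported_pct}%)")
```

For the lines sampled here, the reported percentages agree with the truncated ratio; the DCM autocalibration lines in the mapping report (reported as 1%) do not follow that rule, so the reported value is kept alongside the computed one.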
---++ Missing IP
The following is a laundry-list of stuff that we would like to have in the design, but currently don't. (We can live without most of it.)
- DMA support for the Ethernet MAC (currently using a slow FIFO driver that will have limited driver support in the future)
- Timestamp receivers for IRIG-B and Timesync
- Anything connected to VME (e.g. support for uSD cards on the time distribution board)
- Support for COM2
- Watchdog support
From the DFMUX design synthesis, here are the approximate sizes of the missing blocks:
| *IP* | *4-LUTs* || *Slice FFs* || *Block RAMs* || *Notes* |
| wtl_irig_decoder_0 | 1119 / 10944 | 10% | 527 / 10944 | 5% | 0/36 | 0% | Yikes! |
| wtl_timesync_client_0 | 488 / 10944 | 4% | 435 / 10944 | 4% | 0/36 | 0% |  |
| wtl_watchdog_wrapper_0 | 841 / 10944 | 8% | 486 / 10944 | 4% | 0/36 | 0% | Ouch! |
| com2 | 143 / 10944 | 1% | 132 / 10944 | 1% | 0/36 | 0% |  |
The killer here seems to be large bit vectors. For example, the IRIG decoder uses a number of 64-bit counters; 1000 flip-flops disappear in a hurry that way. I have little doubt that we could optimize these blocks if needed.
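To see what the missing blocks would cost in aggregate, the sketch below totals the table above and expresses each column as a share of the device. Column labels follow the table as printed. The counter arithmetic at the end assumes one flip-flop per counter bit, which ignores any additional logic a real counter needs.

```python
DEVICE_CAPACITY = 10944  # 4-LUTs (and, equally, slice FFs) on the FX12

# Approximate sizes copied from the table above: (4-LUTs, Slice FFs).
missing = {
    "wtl_irig_decoder_0": (1119, 527),
    "wtl_timesync_client_0": (488, 435),
    "wtl_watchdog_wrapper_0": (841, 486),
    "com2": (143, 132),
}

total_first = sum(first for first, _ in missing.values())
total_second = sum(second for _, second in missing.values())

print(f"4-LUTs column:    {total_first} ({100 * total_first / DEVICE_CAPACITY:.1f}% of device)")
print(f"Slice FFs column: {total_second} ({100 * total_second / DEVICE_CAPACITY:.1f}% of device)")

# Assumption: a 64-bit counter needs at least one flip-flop per bit.
COUNTER_WIDTH = 64
counters_per_1000_ffs = 1000 // COUNTER_WIDTH
print(f"1000 flip-flops hold at most {counters_per_1000_ffs} counters of {COUNTER_WIDTH} bits")
```

The columns total 2591 and 1580, so adding every missing block unoptimized would claim a substantial further share of a device whose slices are already 99% occupied.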
Topic revision: r2 - 2009-10-09 - GraemeSmecher