FPGAの部屋

Trying out FINN 10 (tfc_end2end_example.ipynb, part 5)

This is a continuation of "Trying out FINN 9 (tfc_end2end_example.ipynb, part 4)".

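Last time, I ran tfc_end2end_example.ipynb from end2end_example up through "Synthesizing HLS to IP Blocks" in "3. Vivado HLS and IPI", entered the container with docker, and started looking into the Vivado HLS projects. This time I will pick up from there and keep going through the Vivado HLS projects.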

We start from inside code_gen_ipgen_StreamingFCLayer_Batch_0_av35euos.
I moved into the sol1 directory and looked at the contents of the syn, syn/report, and impl/ip directories.
In the impl/ip directory I could see xilinx_com_hls_StreamingFCLayer_Batch_0_1_0.zip, which is the IP in compressed form.
finn_76_200604.png
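
The same directories can also be listed from a script instead of walking them by hand. The following is a minimal sketch, assuming the layout seen in this post (code_gen_ipgen_*/project_*/sol1/impl/ip under the build directory /tmp/finn_dev_masaaki); the helper name find_hls_ip_zips is hypothetical and is not part of FINN.

```python
from pathlib import Path


def find_hls_ip_zips(build_dir):
    """Return the packaged IP zip files found under build_dir, sorted by path.

    Assumes the directory layout observed in this post:
    <build_dir>/code_gen_ipgen_*/project_*/sol1/impl/ip/*.zip
    """
    root = Path(build_dir)
    return sorted(root.glob("code_gen_ipgen_*/project_*/sol1/impl/ip/*.zip"))


# Build directory used in this post (inside the docker container).
for zip_path in find_hls_ip_zips("/tmp/finn_dev_masaaki"):
    print(zip_path)
```

If the build directory does not exist, the function simply returns an empty list.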

Here is StreamingFCLayer_Batch_0_csynth.rpt from syn/report.

================================================================
== Vivado HLS Report for 'StreamingFCLayer_Batch_0'
================================================================
* Date:           Tue Jun  2 19:58:23 2020

* Version:        2019.2 (Build 2698951 on Thu Oct 24 19:15:34 MDT 2019)
* Project:        project_StreamingFCLayer_Batch_0
* Solution:       sol1
* Product family: zynq
* Target device:  xc7z020-clg400-1


================================================================
== Performance Estimates
================================================================
+ Timing: 
    * Summary: 
    +--------+----------+----------+------------+
    |  Clock |  Target  | Estimated| Uncertainty|
    +--------+----------+----------+------------+
    |ap_clk  | 10.00 ns | 8.488 ns |   1.25 ns  |
    +--------+----------+----------+------------+

+ Latency: 
    * Summary: 
    +---------+---------+----------+----------+-----+-----+---------+
    |  Latency (cycles) |  Latency (absolute) |  Interval | Pipeline|
    |   min   |   max   |    min   |    max   | min | max |   Type  |
    +---------+---------+----------+----------+-----+-----+---------+
    |       70|       70| 0.700 us | 0.700 us |   70|   70|   none  |
    +---------+---------+----------+----------+-----+-----+---------+

    + Detail: 
        * Instance: 
        +--------------------------------+----------------------+---------+---------+----------+----------+-----+-----+---------+
        |                                |                      |  Latency (cycles) |  Latency (absolute) |  Interval | Pipeline|
        |            Instance            |        Module        |   min   |   max   |    min   |    max   | min | max |   Type  |
        +--------------------------------+----------------------+---------+---------+----------+----------+-----+-----+---------+
        |grp_Matrix_Vector_Activa_fu_28  |Matrix_Vector_Activa  |       67|       67| 0.670 us | 0.670 us |   67|   67|   none  |
        +--------------------------------+----------------------+---------+---------+----------+----------+-----+-----+---------+

        * Loop: 
        N/A



================================================================
== Utilization Estimates
================================================================
* Summary: 
+-----------------+---------+-------+--------+-------+-----+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  | URAM|
+-----------------+---------+-------+--------+-------+-----+
|DSP              |        -|      -|       -|      -|    -|
|Expression       |        -|      -|       0|      2|    -|
|FIFO             |        -|      -|       -|      -|    -|
|Instance         |        -|      -|    2530|  25464|    -|
|Memory           |        -|      -|       -|      -|    -|
|Multiplexer      |        -|      -|       -|     45|    -|
|Register         |        -|      -|       5|      -|    -|
+-----------------+---------+-------+--------+-------+-----+
|Total            |        0|      0|    2535|  25511|    0|
+-----------------+---------+-------+--------+-------+-----+
|Available        |      280|    220|  106400|  53200|    0|
+-----------------+---------+-------+--------+-------+-----+
|Utilization (%)  |        0|      0|       2|     47|    0|
+-----------------+---------+-------+--------+-------+-----+

+ Detail: 
    * Instance: 
    +--------------------------------+----------------------+---------+-------+------+-------+-----+
    |            Instance            |        Module        | BRAM_18K| DSP48E|  FF  |  LUT  | URAM|
    +--------------------------------+----------------------+---------+-------+------+-------+-----+
    |grp_Matrix_Vector_Activa_fu_28  |Matrix_Vector_Activa  |        0|      0|  2530|  25464|    0|
    +--------------------------------+----------------------+---------+-------+------+-------+-----+
    |Total                           |                      |        0|      0|  2530|  25464|    0|
    +--------------------------------+----------------------+---------+-------+------+-------+-----+

    * DSP48E: 
    N/A

    * Memory: 
    N/A

    * FIFO: 
    N/A

    * Expression: 
    +-----------------------------------------------+----------+-------+---+----+------------+------------+
    |                 Variable Name                 | Operation| DSP48E| FF| LUT| Bitwidth P0| Bitwidth P1|
    +-----------------------------------------------+----------+-------+---+----+------------+------------+
    |grp_Matrix_Vector_Activa_fu_28_out_V_V_TREADY  |    and   |      0|  0|   2|           1|           1|
    +-----------------------------------------------+----------+-------+---+----+------------+------------+
    |Total                                          |          |      0|  0|   2|           1|           1|
    +-----------------------------------------------+----------+-------+---+----+------------+------------+

    * Multiplexer: 
    +------------------------+----+-----------+-----+-----------+
    |          Name          | LUT| Input Size| Bits| Total Bits|
    +------------------------+----+-----------+-----+-----------+
    |ap_NS_fsm               |  27|          5|    1|          5|
    |in0_V_V_TREADY_int      |   9|          2|    1|          2|
    |weights_V_V_TREADY_int  |   9|          2|    1|          2|
    +------------------------+----+-----------+-----+-----------+
    |Total                   |  45|          9|    3|          9|
    +------------------------+----+-----------+-----+-----------+

    * Register: 
    +---------------------------------------------+---+----+-----+-----------+
    |                     Name                    | FF| LUT| Bits| Const Bits|
    +---------------------------------------------+---+----+-----+-----------+
    |ap_CS_fsm                                    |  4|   0|    4|          0|
    |grp_Matrix_Vector_Activa_fu_28_ap_start_reg  |  1|   0|    1|          0|
    +---------------------------------------------+---+----+-----+-----------+
    |Total                                        |  5|   0|    5|          0|
    +---------------------------------------------+---+----+-----+-----------+



================================================================
== Interface
================================================================
* Summary: 
+--------------------+-----+-----+--------------+--------------------------+--------------+
|      RTL Ports     | Dir | Bits|   Protocol   |       Source Object      |    C Type    |
+--------------------+-----+-----+--------------+--------------------------+--------------+
|ap_clk              |  in |    1| ap_ctrl_none | StreamingFCLayer_Batch_0 | return value |
|ap_rst_n            |  in |    1| ap_ctrl_none | StreamingFCLayer_Batch_0 | return value |
|in0_V_V_TDATA       |  in |   56|     axis     |          in0_V_V         |    pointer   |
|in0_V_V_TVALID      |  in |    1|     axis     |          in0_V_V         |    pointer   |
|in0_V_V_TREADY      | out |    1|     axis     |          in0_V_V         |    pointer   |
|weights_V_V_TDATA   |  in |  784|     axis     |        weights_V_V       |    pointer   |
|weights_V_V_TVALID  |  in |    1|     axis     |        weights_V_V       |    pointer   |
|weights_V_V_TREADY  | out |    1|     axis     |        weights_V_V       |    pointer   |
|out_V_V_TDATA       | out |   16|     axis     |          out_V_V         |    pointer   |
|out_V_V_TVALID      | out |    1|     axis     |          out_V_V         |    pointer   |
|out_V_V_TREADY      |  in |    1|     axis     |          out_V_V         |    pointer   |
+--------------------+-----+-----+--------------+--------------------------+--------------+
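
The headline numbers in this report can be cross-checked by hand. The absolute latency matches the cycle count multiplied by the 10.00 ns target clock (70 cycles gives 0.700 us), and the utilization row matches used / available with the fraction dropped: 25511 / 53200 is roughly 47.95 %, and the report shows 47. The sketch below reproduces that arithmetic; truncating rather than rounding is inferred from these figures only, and the function names are hypothetical.

```python
def latency_us(cycles, clock_ns):
    """Absolute latency in microseconds for a cycle count and a clock period in ns."""
    return cycles * clock_ns / 1000.0


def utilization_percent(used, available):
    """Whole-number utilization, truncated (not rounded) to match the report above."""
    return used * 100 // available


# Values copied from the StreamingFCLayer_Batch_0 report above.
print(latency_us(70, 10.0))               # latency at the 10.00 ns target clock
print(utilization_percent(25511, 53200))  # LUT
print(utilization_percent(2535, 106400))  # FF
```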


Next, here is the function definition from top_StreamingFCLayer_Batch_0.cpp, the source code in /tmp/finn_dev_masaaki/code_gen_ipgen_StreamingFCLayer_Batch_0_av35euos.

void StreamingFCLayer_Batch_0(
                    hls::stream<ap_uint<49>> &in0,
                    hls::stream<ap_uint<784>> &weights,
                    hls::stream<ap_uint<16>> &out
                    )
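
The three stream widths in this signature are consistent with one another if the layer is folded with SIMD = 49 and PE = 16 on 1-bit inputs, weights, and outputs: 49 x 16 = 784. None of these parameters appear in this post, so they are assumptions inferred from the widths, and the 784 x 64 weight matrix used below is an assumption as well. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def mvau_stream_widths(simd, pe, input_bits=1, weight_bits=1, output_bits=1):
    """Stream widths in bits for a folded matrix-vector unit (assumed model)."""
    return {
        "in0": simd * input_bits,
        "weights": simd * pe * weight_bits,
        "out": pe * output_bits,
    }


def mvau_fold_cycles(matrix_w, matrix_h, simd, pe):
    """Cycles needed to walk the whole weight matrix once with the given folding."""
    if matrix_w % simd or matrix_h % pe:
        raise ValueError("SIMD must divide matrix_w and PE must divide matrix_h")
    return (matrix_w // simd) * (matrix_h // pe)


# Assumed folding: SIMD = 49, PE = 16, all values 1 bit wide.
print(mvau_stream_widths(49, 16))
# Assumed 784 x 64 weight matrix for this layer.
print(mvau_fold_cycles(784, 64, 49, 16))
```

Under these assumptions the fold takes 64 cycles, which is close to the 67 cycles the report gives for Matrix_Vector_Activa; the remaining few cycles would be pipeline overhead.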


Here is one of the VHDL files from the HDL synthesized from this.
I quote only the entity declaration of StreamingFCLayer_Batch_0_StreamingFCLayer_Batch_0.vhd in the /tmp/finn_dev_masaaki/code_gen_ipgen_StreamingFCLayer_Batch_0_av35euos/project_StreamingFCLayer_Batch_0/sol1/syn/vhdl directory.

entity StreamingFCLayer_Batch_0_StreamingFCLayer_Batch_0 is
port (
    ap_clk : IN STD_LOGIC;
    ap_rst_n : IN STD_LOGIC;
    in0_V_V_TDATA : IN STD_LOGIC_VECTOR (55 downto 0);
    in0_V_V_TVALID : IN STD_LOGIC;
    in0_V_V_TREADY : OUT STD_LOGIC;
    weights_V_V_TDATA : IN STD_LOGIC_VECTOR (783 downto 0);
    weights_V_V_TVALID : IN STD_LOGIC;
    weights_V_V_TREADY : OUT STD_LOGIC;
    out_V_V_TDATA : OUT STD_LOGIC_VECTOR (15 downto 0);
    out_V_V_TVALID : OUT STD_LOGIC;
    out_V_V_TREADY : IN STD_LOGIC );
end;
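
Note that in0 is declared as ap_uint<49> in the C++ source, while in0_V_V_TDATA is 56 bits wide in both the report and the entity. This is consistent with an AXI4-Stream TDATA width being a whole number of bytes, so the 49-bit payload is padded up to the next multiple of 8. A small sketch of that rounding (the function name is hypothetical):

```python
def axis_tdata_bits(payload_bits):
    """Round a payload width in bits up to the next multiple of 8."""
    if payload_bits <= 0:
        raise ValueError("payload_bits must be positive")
    return (payload_bits + 7) // 8 * 8


# Payload widths taken from the function signature above.
for name, bits in (("in0", 49), ("weights", 784), ("out", 16)):
    print(name, bits, "->", axis_tdata_bits(bits))
```

The weights (784) and out (16) streams are already byte-aligned, so only in0 changes width.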


Next, let's look at code_gen_ipgen_StreamingDataWidthConverter_Batch_0_nzm_awqb.
finn_77_200605.png

It contains hls_syn_StreamingDataWidthConverter_Batch_0.tcl, the source file top_StreamingDataWidthConverter_Batch_0.cpp, and so on.

I moved to /tmp/finn_dev_masaaki/code_gen_ipgen_StreamingDataWidthConverter_Batch_0_nzm_awqb/project_StreamingDataWidthConverter_Batch_0/sol1 and looked at the syn and syn/report directories.
finn_78_200605.png

Let's look at StreamingDataWidthConverter_Batch_0_csynth.rpt.

================================================================
== Vivado HLS Report for 'StreamingDataWidthConverter_Batch_0'
================================================================
* Date:           Tue Jun  2 19:55:07 2020

* Version:        2019.2 (Build 2698951 on Thu Oct 24 19:15:34 MDT 2019)
* Project:        project_StreamingDataWidthConverter_Batch_0
* Solution:       sol1
* Product family: zynq
* Target device:  xc7z020-clg400-1


================================================================
== Performance Estimates
================================================================
+ Timing: 
    * Summary: 
    +--------+----------+----------+------------+
    |  Clock |  Target  | Estimated| Uncertainty|
    +--------+----------+----------+------------+
    |ap_clk  | 10.00 ns | 5.723 ns |   1.25 ns  |
    +--------+----------+----------+------------+

+ Latency: 
    * Summary: 
    +---------+---------+-----------+-----------+-----+-----+---------+
    |  Latency (cycles) |   Latency (absolute)  |  Interval | Pipeline|
    |   min   |   max   |    min    |    max    | min | max |   Type  |
    +---------+---------+-----------+-----------+-----+-----+---------+
    |        7|        7| 70.000 ns | 70.000 ns |    7|    7|   none  |
    +---------+---------+-----------+-----------+-----+-----+---------+

    + Detail: 
        * Instance: 
        +----------------------------------+------------------------+---------+---------+-----------+-----------+-----+-----+---------+
        |                                  |                        |  Latency (cycles) |   Latency (absolute)  |  Interval | Pipeline|
        |             Instance             |         Module         |   min   |   max   |    min    |    max    | min | max |   Type  |
        +----------------------------------+------------------------+---------+---------+-----------+-----------+-----+-----+---------+
        |grp_StreamingDataWidthCo_1_fu_26  |StreamingDataWidthCo_1  |        4|        4| 40.000 ns | 40.000 ns |    4|    4|   none  |
        +----------------------------------+------------------------+---------+---------+-----------+-----------+-----+-----+---------+

        * Loop: 
        N/A



================================================================
== Utilization Estimates
================================================================
* Summary: 
+-----------------+---------+-------+--------+-------+-----+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  | URAM|
+-----------------+---------+-------+--------+-------+-----+
|DSP              |        -|      -|       -|      -|    -|
|Expression       |        -|      -|       0|      2|    -|
|FIFO             |        -|      -|       -|      -|    -|
|Instance         |        -|      -|      65|    241|    -|
|Memory           |        -|      -|       -|      -|    -|
|Multiplexer      |        -|      -|       -|     36|    -|
|Register         |        -|      -|       5|      -|    -|
+-----------------+---------+-------+--------+-------+-----+
|Total            |        0|      0|      70|    279|    0|
+-----------------+---------+-------+--------+-------+-----+
|Available        |      280|    220|  106400|  53200|    0|
+-----------------+---------+-------+--------+-------+-----+
|Utilization (%)  |        0|      0|   ~0   |   ~0  |    0|
+-----------------+---------+-------+--------+-------+-----+

+ Detail: 
    * Instance: 
    +----------------------------------+------------------------+---------+-------+----+-----+-----+
    |             Instance             |         Module         | BRAM_18K| DSP48E| FF | LUT | URAM|
    +----------------------------------+------------------------+---------+-------+----+-----+-----+
    |grp_StreamingDataWidthCo_1_fu_26  |StreamingDataWidthCo_1  |        0|      0|  65|  241|    0|
    +----------------------------------+------------------------+---------+-------+----+-----+-----+
    |Total                             |                        |        0|      0|  65|  241|    0|
    +----------------------------------+------------------------+---------+-------+----+-----+-----+

    * DSP48E: 
    N/A

    * Memory: 
    N/A

    * FIFO: 
    N/A

    * Expression: 
    +-------------------------------------------------+----------+-------+---+----+------------+------------+
    |                  Variable Name                  | Operation| DSP48E| FF| LUT| Bitwidth P0| Bitwidth P1|
    +-------------------------------------------------+----------+-------+---+----+------------+------------+
    |grp_StreamingDataWidthCo_1_fu_26_out_V_V_TREADY  |    and   |      0|  0|   2|           1|           1|
    +-------------------------------------------------+----------+-------+---+----+------------+------------+
    |Total                                            |          |      0|  0|   2|           1|           1|
    +-------------------------------------------------+----------+-------+---+----+------------+------------+

    * Multiplexer: 
    +--------------------+----+-----------+-----+-----------+
    |        Name        | LUT| Input Size| Bits| Total Bits|
    +--------------------+----+-----------+-----+-----------+
    |ap_NS_fsm           |  27|          5|    1|          5|
    |in0_V_V_TREADY_int  |   9|          2|    1|          2|
    +--------------------+----+-----------+-----+-----------+
    |Total               |  36|          7|    2|          7|
    +--------------------+----+-----------+-----+-----------+

    * Register: 
    +-----------------------------------------------+---+----+-----+-----------+
    |                      Name                     | FF| LUT| Bits| Const Bits|
    +-----------------------------------------------+---+----+-----+-----------+
    |ap_CS_fsm                                      |  4|   0|    4|          0|
    |grp_StreamingDataWidthCo_1_fu_26_ap_start_reg  |  1|   0|    1|          0|
    +-----------------------------------------------+---+----+-----+-----------+
    |Total                                          |  5|   0|    5|          0|
    +-----------------------------------------------+---+----+-----+-----------+



================================================================
== Interface
================================================================
* Summary: 
+----------------+-----+-----+--------------+-------------------------------------+--------------+
|    RTL Ports   | Dir | Bits|   Protocol   |            Source Object            |    C Type    |
+----------------+-----+-----+--------------+-------------------------------------+--------------+
|ap_clk          |  in |    1| ap_ctrl_none | StreamingDataWidthConverter_Batch_0 | return value |
|ap_rst_n        |  in |    1| ap_ctrl_none | StreamingDataWidthConverter_Batch_0 | return value |
|in0_V_V_TDATA   |  in |   16|     axis     |               in0_V_V               |    pointer   |
|in0_V_V_TVALID  |  in |    1|     axis     |               in0_V_V               |    pointer   |
|in0_V_V_TREADY  | out |    1|     axis     |               in0_V_V               |    pointer   |
|out_V_V_TDATA   | out |    8|     axis     |               out_V_V               |    pointer   |
|out_V_V_TVALID  | out |    1|     axis     |               out_V_V               |    pointer   |
|out_V_V_TREADY  |  in |    1|     axis     |               out_V_V               |    pointer   |
+----------------+-----+-----+--------------+-------------------------------------+--------------+


project_StreamingDataWidthConverter_Batch_0 takes only 7 clock cycles.
As I recall, this block was described as changing the width of the stream, so let's look at the function declaration in the source file top_StreamingDataWidthConverter_Batch_0.cpp.

void StreamingDataWidthConverter_Batch_0(hls::stream<ap_uint<16> > &in0, hls::stream<ap_uint<8> > &out)


It appears to convert from a 16-bit width to an 8-bit width.

Let's look at the entity part of the synthesized VHDL file StreamingDataWidthConverter_Batch_0_StreamingDataWidthConverter_Batch_0.vhd.

entity StreamingDataWidthConverter_Batch_0_StreamingDataWidthConverter_Batch_0 is
port (
    ap_clk : IN STD_LOGIC;
    ap_rst_n : IN STD_LOGIC;
    in0_V_V_TDATA : IN STD_LOGIC_VECTOR (15 downto 0);
    in0_V_V_TVALID : IN STD_LOGIC;
    in0_V_V_TREADY : OUT STD_LOGIC;
    out_V_V_TDATA : OUT STD_LOGIC_VECTOR (7 downto 0);
    out_V_V_TVALID : OUT STD_LOGIC;
    out_V_V_TREADY : IN STD_LOGIC );
end;


in0_V_V_TDATA is 16 bits wide and out_V_V_TDATA is 8 bits wide.
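
To picture what a 16-bit to 8-bit width conversion does to the data, here is a behavioral sketch in plain Python: each input word is split into narrower output words. It is not the generated HLS code, and the function name is hypothetical. Emitting the least significant byte first is an assumption; this post does not show the word order, so the sketch makes it a parameter.

```python
def split_words(words, in_bits=16, out_bits=8, lsb_first=True):
    """Yield each in_bits-wide word as in_bits // out_bits narrower words."""
    if in_bits % out_bits != 0:
        raise ValueError("in_bits must be a multiple of out_bits")
    mask = (1 << out_bits) - 1
    count = in_bits // out_bits
    for word in words:
        # Slice the word into out_bits-wide chunks, lowest chunk first.
        parts = [(word >> (i * out_bits)) & mask for i in range(count)]
        if not lsb_first:
            parts.reverse()
        yield from parts


# One 16-bit word becomes two 8-bit words.
print([hex(b) for b in split_words([0xABCD])])
```

Each 16-bit input word produces two 8-bit output words, so the output stream carries twice as many transfers as the input stream.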

In the /tmp/finn_dev_masaaki/code_gen_ipgen_StreamingDataWidthConverter_Batch_0_nzm_awqb/project_StreamingDataWidthConverter_Batch_0/sol1/impl/ip directory, the ZIP-compressed IP file xilinx_com_hls_StreamingDataWidthConverter_Batch_0_1_0.zip had been generated.
finn_79_200605.png
2020-06-05 06:23 | finn