From 48154cb6f452d3bdb4da36cc267b4b6c45588dc9 Mon Sep 17 00:00:00 2001
From: Clifford Wolf
+Project IceStorm aims at documenting the bitstream format of Lattice iCE40
+FPGAs and providing simple tools for analyzing and creating bitstream files.
+This is work in progress.
+
+The span-4 and span-12 wires are the main interconnect resource in iCE40 FPGAs. They "span" (have a length of)
+4 or 12 cells in horizontal or vertical direction.
+
+The bits marked routing in the bitstream do enable switches (transfer gates) that can
+be used to connect wire segments bidirectionally to each other in order to create larger
+segments. The bits marked buffer in the bitstream enable tristate buffers that drive
+the signal in one direction from one wire to another. Both types of bits exist for routing between
+span-wires. See the auto generated documentation for the LOGIC Tile configuration bits for details.
+
+Only directional tristate buffers are used to route signals between the span-wires and the logic cells.
+
+The image on the right shows the horizontal span-4 wires of a logic or ram cell (click to enlarge).
+
+On the left side of the cell there are 48 connections named sp4_h_l_0 to sp4_h_l_47. The lower 36 of those
+wires are connected to sp4_h_r_12 to sp4_h_r_47 on the right side of the cell. (IceStorm normalizes this
+wire names to sp4_h_r_0 to sp4_h_r_35. Note: the Lattice tools use a different normalization scheme
+for this wire names.) The wires connecting the left and right horizontal span-4 ports are pairwise crossed-out.
+
+The wires sp4_h_l_36 to sp4_h_l_47 terminate in the cell, so do the wires sp4_h_r_0 to sp4_h_r_11.
+
+This wires "span" 4 cells, i.e. they connect 5 cells if you count the cells on
+both ends of the wire.
+
+For example, the wire sp4_h_r_0 in cell (x, y) has the following names:
+
+Project IceStorm – LOGIC Tile Documentation
+
+Span-4 and Span-12 Wires
+
+Span-4 Horizontal
+
+
+
+
+
+
+Cell Coordinates sp4_h_l_* wire name sp4_h_r_* wire name
+x, y - sp4_h_r_0
+x+1, y sp4_h_l_0 sp4_h_r_13
+x+2, y sp4_h_l_13 sp4_h_r_24
+x+3, y sp4_h_l_24 sp4_h_r_37
+x+4, y sp4_h_l_37 -
+The image on the right shows the veritical span-4 wires of a logic or ram cell (click to enlarge). +
+ ++Similar to the horizontal span-4 wires there are 48 connections on the top (sp4_v_t_0 to sp4_v_t_47) and +48 connections on the bottom (sp4_v_b_0 to sp4_v_b_47). The wires sp4_v_t_0 to sp4_v_t_35 +are connected to sp4_v_b_12 to sp4_v_b_47 (with pairwise crossing out). Wire names are normalized +to sp4_v_b_12 to sp4_v_b_47. +
+ ++But in addition to that, each cell also has access to sp4_v_b_0 to sp4_v_b_47 of its right neighbour. +This are the wires sp4_r_v_b_0 to sp4_r_v_b_47. So over all a single veritical span-4 wire +connects 9 cells. For example, the wire sp4_v_b_0 in cell (x, y) has the following names: +
+ ++
| Cell Coordinates | sp4_v_t_* wire name | sp4_v_b_* wire name | sp4_r_v_b_* wire name |
|---|---|---|---|
| x, y | - | sp4_v_b_0 | - |
| x, y-1 | sp4_v_t_0 | sp4_v_b_13 | - |
| x, y-2 | sp4_v_t_13 | sp4_v_b_24 | - |
| x, y-3 | sp4_v_t_24 | sp4_v_b_37 | - |
| x, y-4 | sp4_v_t_37 | - | - |
| x-1, y | - | - | sp4_r_v_b_0 |
| x-1, y-1 | - | - | sp4_r_v_b_13 |
| x-1, y-2 | - | - | sp4_r_v_b_24 |
| x-1, y-3 | - | - | sp4_r_v_b_37 |
+Similar to the span-4 wires there are also longer horizontal and vertical span-12 wires. +
+ ++There are 24 connections sp12_v_t_0 to sp12_v_t_23 on the top of the +cell and 24 connections sp12_v_b_0 to sp12_v_b_23 on the bottom of the +cell. The wires sp12_v_t_0 to sp12_v_t_21 are connected to +sp12_v_b_2 to sp12_v_b_23 (with pairwise crossing out). The connections +sp12_v_b_0, sp12_v_b_1, sp12_v_t_22, and sp12_v_t_23 +terminate in the cell. Wire names are normalized to sp12_v_b_2 to sp12_v_b_23. +
+ ++There are also 24 connections sp12_h_l_0 to sp12_h_l_23 on the left of the +cell and 24 connections sp12_h_r_0 to sp12_h_r_23 on the right of the +cell. The wires sp12_h_l_0 to sp12_h_l_21 are connected to +sp12_h_r_2 to sp12_h_r_23 (with pairwise crossing out). The connections +sp12_h_r_0, sp12_h_r_1, sp12_h_l_22, and sp12_h_l_23 +terminate in the cell. Wire names are normalized to sp12_v_r_2 to sp12_h_r_23. +
+ ++The local tracks are the gateway to the logic cell inputs. Signals from the span-wires +and the logic cell ouputs of the eight neighbour cells can be routed to the local tracks and +signals from the local tracks can be routed to the logic cell inputs. +
+ ++Each logic tile has 32 local tracks. They are organized in 4 groups of 8 wires each: +local_g0_0 to local_g3_7. +
+ ++The span wires, global signals, and neighbour outputs can be routed to the local tracks. But not +every of those signals can be routed to every of the local tracks. Instead there is a different +mix of 16 signals for each local track. +
+ ++The buffer driving the local track has 5 configuration bits. One enable bit and 4 bits that select +the input wire. For example for local_g0_0 (copy&paste from the bitstream doku): +
+ ++
| B0[14] | +B1[14] | +B1[15] | +B1[16] | +B1[17] | +Function | Source-Net | Destination-Net |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | buffer | sp4_r_v_b_24 | local_g0_0 |
| 0 | 0 | 0 | 1 | 1 | buffer | sp12_h_r_8 | local_g0_0 |
| 0 | 0 | 1 | 0 | 1 | buffer | neigh_op_bot_0 | local_g0_0 |
| 0 | 0 | 1 | 1 | 1 | buffer | sp4_v_b_16 | local_g0_0 |
| 0 | 1 | 0 | 0 | 1 | buffer | sp4_r_v_b_35 | local_g0_0 |
| 0 | 1 | 0 | 1 | 1 | buffer | sp12_h_r_16 | local_g0_0 |
| 0 | 1 | 1 | 0 | 1 | buffer | neigh_op_top_0 | local_g0_0 |
| 0 | 1 | 1 | 1 | 1 | buffer | sp4_h_r_0 | local_g0_0 |
| 1 | 0 | 0 | 0 | 1 | buffer | lutff_0/out | local_g0_0 |
| 1 | 0 | 0 | 1 | 1 | buffer | sp4_v_b_0 | local_g0_0 |
| 1 | 0 | 1 | 0 | 1 | buffer | neigh_op_lft_0 | local_g0_0 |
| 1 | 0 | 1 | 1 | 1 | buffer | sp4_h_r_8 | local_g0_0 |
| 1 | 1 | 0 | 0 | 1 | buffer | neigh_op_bnr_0 | local_g0_0 |
| 1 | 1 | 0 | 1 | 1 | buffer | sp4_v_b_8 | local_g0_0 |
| 1 | 1 | 1 | 0 | 1 | buffer | sp12_h_r_0 | local_g0_0 |
| 1 | 1 | 1 | 1 | 1 | buffer | sp4_h_r_16 | local_g0_0 |
+Then the signals on the local tracks can be routed to the input pins of the logic cells. Like before, +not every local track can be routed to every logic cell input pin. Instead there is a different mix +of 16 local track for each logic cell input. For example for lutff_0/in_0: +
+ ++
| B0[26] | +B1[26] | +B1[27] | +B1[28] | +B1[29] | +Function | Source-Net | Destination-Net |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | buffer | local_g0_0 | lutff_0/in_0 |
| 0 | 0 | 0 | 1 | 1 | buffer | local_g2_0 | lutff_0/in_0 |
| 0 | 0 | 1 | 0 | 1 | buffer | local_g1_1 | lutff_0/in_0 |
| 0 | 0 | 1 | 1 | 1 | buffer | local_g3_1 | lutff_0/in_0 |
| 0 | 1 | 0 | 0 | 1 | buffer | local_g0_2 | lutff_0/in_0 |
| 0 | 1 | 0 | 1 | 1 | buffer | local_g2_2 | lutff_0/in_0 |
| 0 | 1 | 1 | 0 | 1 | buffer | local_g1_3 | lutff_0/in_0 |
| 0 | 1 | 1 | 1 | 1 | buffer | local_g3_3 | lutff_0/in_0 |
| 1 | 0 | 0 | 0 | 1 | buffer | local_g0_4 | lutff_0/in_0 |
| 1 | 0 | 0 | 1 | 1 | buffer | local_g2_4 | lutff_0/in_0 |
| 1 | 0 | 1 | 0 | 1 | buffer | local_g1_5 | lutff_0/in_0 |
| 1 | 0 | 1 | 1 | 1 | buffer | local_g3_5 | lutff_0/in_0 |
| 1 | 1 | 0 | 0 | 1 | buffer | local_g0_6 | lutff_0/in_0 |
| 1 | 1 | 0 | 1 | 1 | buffer | local_g2_6 | lutff_0/in_0 |
| 1 | 1 | 1 | 0 | 1 | buffer | local_g1_7 | lutff_0/in_0 |
| 1 | 1 | 1 | 1 | 1 | buffer | local_g3_7 | lutff_0/in_0 |
+The 8 global nets on the iCE40 can be routed to the local track via the glb2local_0 to glb2local_3 +nets using a similar two-stage process. The logic block clock-enable and set-reset inputs can be driven +directly from one of 4 global nets or from one of 4 local tracks. The logic block clock input can be driven +from any of the global nets and from a few local tracks. See the bitstream documentation for details. +
+ ++Each logic tile has a logic block containing 8 logic cells. Each logic cell contains a 4-input LUT, a carry +unit and a flip-flop. Clock, clock enable, and set/reset inputs are shared along the 8 logic cells. So is the +bit that configures positive/negative edge for the flip flops. But the three configuration bits that specify if +the flip flop should be used, if it is set or reset by the set/reset input, and if the set/reset is synchronous +or asynchrouns exist for each logic cell individually. +
+ ++Each LUT i has four input wires lutff_i/in_0 to lutff_i/in_3. Input +lutff_i/in_3 can be configured to be driven by the carry output of the previous logic cell, +or by carry_in_mux in case of i=0. Input lutff_i/in_2 can be configured to +be driven by the output of the previous LUT for i>0. The LUT uses its 4 input signals to +calculate lutff_i/out. +
+ ++The carry unit calculates lutff_i/cout = lutff_i/in_1 + lutff_i/in_2 + lutff_(i-1)/cout > 1. In case of i=0, carry_in_mux is used as third input. carry_in_mux can be configured to be constant 0, 1 or the lutff_7/cout signal from the logic tile below. +
+ ++Part of the functionality described above is documented as part of the routing +bitstream documentation (see the buffers for luttff_ inputs). The NegClk +bit switches all 8 FFs in the tile to negative edge mode. The CarryInSet +bit drives the carry_in_mux high (it defaults to low when not driven via the buffer from +carry_in). +
+ ++The remaining functions of the logic cell are configured via the LC_i bits. This +are 20 bit per logic cell. We have arbitrarily labeld those bits as follows: +
+ ++
| Label | LC_0 | LC_1 | LC_2 | LC_3 | LC_4 | LC_5 | LC_6 | LC_7 |
|---|---|---|---|---|---|---|---|---|
| LC_i[0] | B0[36] | B2[36] | B4[36] | B6[36] | B8[36] | B10[36] | B12[36] | B14[36] |
| LC_i[1] | B0[37] | B2[37] | B4[37] | B6[37] | B8[37] | B10[37] | B12[37] | B14[37] |
| LC_i[2] | B0[38] | B2[38] | B4[38] | B6[38] | B8[38] | B10[38] | B12[38] | B14[38] |
| LC_i[3] | B0[39] | B2[39] | B4[39] | B6[39] | B8[39] | B10[39] | B12[39] | B14[39] |
| LC_i[4] | B0[40] | B2[40] | B4[40] | B6[40] | B8[40] | B10[40] | B12[40] | B14[40] |
| LC_i[5] | B0[41] | B2[41] | B4[41] | B6[41] | B8[41] | B10[41] | B12[41] | B14[41] |
| LC_i[6] | B0[42] | B2[42] | B4[42] | B6[42] | B8[42] | B10[42] | B12[42] | B14[42] |
| LC_i[7] | B0[43] | B2[43] | B4[43] | B6[43] | B8[43] | B10[43] | B12[43] | B14[43] |
| LC_i[8] | B0[44] | B2[44] | B4[44] | B6[44] | B8[44] | B10[44] | B12[44] | B14[44] |
| LC_i[9] | B0[45] | B2[45] | B4[45] | B6[45] | B8[45] | B10[45] | B12[45] | B14[45] |
| LC_i[10] | B1[36] | B3[36] | B5[36] | B7[36] | B9[36] | B11[36] | B13[36] | B15[36] |
| LC_i[11] | B1[37] | B3[37] | B5[37] | B7[37] | B9[37] | B11[37] | B13[37] | B15[37] |
| LC_i[12] | B1[38] | B3[38] | B5[38] | B7[38] | B9[38] | B11[38] | B13[38] | B15[38] |
| LC_i[13] | B1[39] | B3[39] | B5[39] | B7[39] | B9[39] | B11[39] | B13[39] | B15[39] |
| LC_i[14] | B1[40] | B3[40] | B5[40] | B7[40] | B9[40] | B11[40] | B13[40] | B15[40] |
| LC_i[15] | B1[41] | B3[41] | B5[41] | B7[41] | B9[41] | B11[41] | B13[41] | B15[41] |
| LC_i[16] | B1[42] | B3[42] | B5[42] | B7[42] | B9[42] | B11[42] | B13[42] | B15[42] |
| LC_i[17] | B1[43] | B3[43] | B5[43] | B7[43] | B9[43] | B11[43] | B13[43] | B15[43] |
| LC_i[18] | B1[44] | B3[44] | B5[44] | B7[44] | B9[44] | B11[44] | B13[44] | B15[44] |
| LC_i[19] | B1[45] | B3[45] | B5[45] | B7[45] | B9[45] | B11[45] | B13[45] | B15[45] |
+LC_i[8] is the CarryEnable bit. This bit must be set if the carry logic is used. +
+ ++LC_i[9] is the DffEnable bit. It enables the output flip-flop for the LUT. +
+ ++LC_i[18] is the Set_NoReset bit. When this bit is set then the set/reset signal will set, not reset the flip-flop. +
+ ++LC_i[19] is the AsyncSetReset bit. When this bit is set then the set/reset signal is asynchronous to the clock. +
+ ++The LUT implements the following truth table: +
+ ++
| in_3 | in_2 | in_1 | in_0 | out |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | LC_i[4] |
| 0 | 0 | 0 | 1 | LC_i[14] |
| 0 | 0 | 1 | 0 | LC_i[15] |
| 0 | 0 | 1 | 1 | LC_i[5] |
| 0 | 1 | 0 | 0 | LC_i[6] |
| 0 | 1 | 0 | 1 | LC_i[16] |
| 0 | 1 | 1 | 0 | LC_i[17] |
| 0 | 1 | 1 | 1 | LC_i[7] |
| 1 | 0 | 0 | 0 | LC_i[3] |
| 1 | 0 | 0 | 1 | LC_i[13] |
| 1 | 0 | 1 | 0 | LC_i[12] |
| 1 | 0 | 1 | 1 | LC_i[2] |
| 1 | 1 | 0 | 0 | LC_i[1] |
| 1 | 1 | 0 | 1 | LC_i[11] |
| 1 | 1 | 1 | 0 | LC_i[10] |
| 1 | 1 | 1 | 1 | LC_i[0] |
+LUT inputs that are not connected to anything are driven low. The set/reset +signal is also driven low if not connected to any other driver, and the clock +enable signal is driven high when left unconnected. +
+ -- cgit v1.2.3