The use of system on chip (SoC) designs has grown rapidly because a large
number of IP cores can now be integrated on a single chip. A higher number of
IP cores needs a higher number of bus-based interconnections. Bus-based
interconnection leads to parallel communication, which is inefficient in terms
of bandwidth, latency and power consumption. To solve this problem a switching
network is used, called a Network on Chip (NoC). Growing complexity and
technology scaling increase the occurrence of intermittent and transient
faults.
To run a fault-tolerant system smoothly, the first step is to detect the
location of the faults. The fault detection mechanism should also be able to
distinguish transient faults from permanent faults. Transient link errors are
detected with error coding techniques, viz. cyclic redundancy check (CRC) and
parity codes. To detect permanent errors in a NoC there is an in-line test
method that tests each adjacent pair of wires, and a syndrome storing-based
error detection method that evaluates consecutive code syndromes at the
receiver. A few works also focus on detecting transient faults and permanent
faults at the same time.
There are three main techniques for handling transient faults in a NoC:
automatic repeat request (ARQ), forward error correction (FEC), and hybrid ARQ
(HARQ). Transient faults can be handled at both the link level and the
transport level. In ARQ-based error control, a packet that is found to contain
errors is retransmitted, and retransmission continues until the packet is
received error free. Error detection is usually implemented through a cyclic
redundancy check (CRC). For simple error detection, the code is applied to the
packet before transmission, and at the receiver side a checksum is calculated
to ensure that no error has occurred. If the checksum does not match the
expected value, the packet is retransmitted.
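The ARQ loop described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the 32-bit CRC from the standard
`zlib` module stands in for whatever CRC a real NoC link would use, and the
helper names (`make_frame`, `check_frame`, `flip_bit`, `send_with_arq`) and the
single-bit fault model are hypothetical, not taken from any particular design.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 to the payload."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def check_frame(frame: bytes):
    """Receiver side: return the payload if the CRC matches, else None."""
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return payload if zlib.crc32(payload) == received_crc else None


def flip_bit(frame: bytes, bit: int) -> bytes:
    """Hypothetical transient fault: flip one bit of the frame on the link."""
    corrupted = bytearray(frame)
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return bytes(corrupted)


def send_with_arq(payload: bytes, faulty_attempts: int, max_retries: int = 8):
    """Retransmit until the receiver sees an error-free frame.

    The first `faulty_attempts` transmissions are corrupted (assumed fault
    model). Returns (payload, number_of_transmissions).
    """
    frame = make_frame(payload)
    for attempt in range(1, max_retries + 1):
        on_wire = flip_bit(frame, 3) if attempt <= faulty_attempts else frame
        received = check_frame(on_wire)
        if received is not None:  # checksum matches: receiver accepts
            return received, attempt
        # checksum mismatch: receiver requests a retransmission
    raise RuntimeError("retry limit reached; the fault may be permanent")


data, transmissions = send_with_arq(b"flit", faulty_attempts=2)
print(data, transmissions)
```

The retry limit is a design choice of this sketch: a fault that survives every
retransmission is unlikely to be transient, which is where the distinction
between transient and permanent faults becomes relevant.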
· The packet is 114 bits and contains a 34-bit header and an 80-bit payload. A
valid bit (V) marks whether a packet is valid. Relative addressing is used for
the source and destination address fields (SA and DA), which are 12 bits each.
The HC field (9 bits) records the number of hops over which the packet has
been routed.
· The number of inputs should be equal to the number of outputs.
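The packet format listed above can be checked with a short packing sketch. The
field widths come from the text (V 1 bit, SA 12 bits, DA 12 bits, HC 9 bits,
payload 80 bits), and they add up to the stated 34-bit header and 114-bit
packet. The field order, the treatment of SA and DA as raw unsigned values, and
the `Packet` class itself are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Field widths in bits, as given in the text.
V_BITS, SA_BITS, DA_BITS, HC_BITS, PAYLOAD_BITS = 1, 12, 12, 9, 80
WIDTHS = (V_BITS, SA_BITS, DA_BITS, HC_BITS, PAYLOAD_BITS)
HEADER_BITS = V_BITS + SA_BITS + DA_BITS + HC_BITS  # 34
PACKET_BITS = HEADER_BITS + PAYLOAD_BITS            # 114


@dataclass
class Packet:
    valid: int    # V: 1 if the packet is valid
    sa: int       # source address (relative), stored as a raw 12-bit value
    da: int       # destination address (relative), raw 12-bit value
    hc: int       # hop count so far
    payload: int  # 80-bit payload

    def pack(self) -> int:
        """Pack the fields into one integer, V in the most significant bit
        (assumed order: V, SA, DA, HC, payload)."""
        word = 0
        values = (self.valid, self.sa, self.da, self.hc, self.payload)
        for value, width in zip(values, WIDTHS):
            if not 0 <= value < (1 << width):
                raise ValueError(f"value {value} does not fit in {width} bits")
            word = (word << width) | value
        return word

    @classmethod
    def unpack(cls, word: int) -> "Packet":
        """Inverse of pack(): peel fields off from the least significant end."""
        fields = []
        for width in reversed(WIDTHS):
            fields.append(word & ((1 << width) - 1))
            word >>= width
        payload, hc, da, sa, valid = fields
        return cls(valid, sa, da, hc, payload)


pkt = Packet(valid=1, sa=0x0A5, da=0x3C1, hc=2, payload=0xDEADBEEF)
assert Packet.unpack(pkt.pack()) == pkt
```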
A 2-hop fault information transmission mechanism is used to reduce the average
hop count. In this mechanism, four additional signals (fault from[d] (1 bit),
fault to[d] (1 bit), FoN from[d] (3 bits), FoN to[d] (3 bits)), 8 bits in
total for each direction of a switch, are used to transmit fault information.
Each switch is responsible not only for transmitting its own link status to
its four neighbours but also for collecting the link status from three of its
neighbours and transmitting it to the fourth neighbour. For example, switch A
can get the status of 16 links within 2 hops.
Figure: Fault information transmission mechanism
The signal FoN to[d] collected by the current switch is a 3-bit vector that
denotes the link status along the three directions other than d, and it is
transmitted to the neighbour along d.
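As a rough illustration of the signals described above, the sketch below builds
the 3-bit FoN to[d] vector and reproduces the two counts given in the text (8
signal bits per direction, 16 links visible within 2 hops). The direction
names, the bit order within the vector, the convention that 1 means "faulty",
and the assumption that the switch has all four neighbours are choices made
here for the example, not details from the source.

```python
DIRECTIONS = ("N", "E", "S", "W")  # assumed port naming for a 2D mesh switch

# Widths from the text: fault from[d], fault to[d], FoN from[d], FoN to[d].
SIGNAL_BITS_PER_DIRECTION = 1 + 1 + 3 + 3  # 8


def fon_to(link_ok: dict, d: str) -> int:
    """Build the 3-bit vector sent to the neighbour along direction d.

    Each bit reports the status of one link in the three directions other
    than d (assumed encoding: 1 = faulty, in DIRECTIONS order, MSB first).
    """
    vector = 0
    for other in DIRECTIONS:
        if other == d:
            continue
        vector = (vector << 1) | (0 if link_ok[other] else 1)
    return vector


def links_visible_within_two_hops() -> int:
    """A switch knows its own 4 links plus 3 links reported by each of its
    4 neighbours."""
    own = len(DIRECTIONS)
    via_neighbours = len(DIRECTIONS) * (len(DIRECTIONS) - 1)
    return own + via_neighbours


status = {"N": True, "E": False, "S": True, "W": True}  # east link is faulty
print(format(fon_to(status, "N"), "03b"))  # status of E, S, W sent northwards
print(links_visible_within_two_hops())
```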
(Research Associate at Silicon Mentor)