Static Networks
Hi all,
Previously, in class, we discussed some of the dynamic networks, including crossbar networks, multiported memories, and multistage networks. There are other dynamic networks too, such as the Benes network and the baseline network. Today, however, we discussed a static network. We have already seen one static network, the linear network (we used this model during prefix computation).
Hypercube Interconnection Network
Interconnection systems are designed to reduce the time processors spend on simple message routing, and the hypercube and the mesh are two of the most popular interconnection schemes. Here we discuss the hypercube interconnection network.
· A k-dimensional hypercube contains 2^k processors (nodes).
· Each processing node contains a switch.
· The nodes are numbered from 0 to 2^k - 1, usually in binary.
· Two nodes are connected by a single link (i.e., they are neighbours) if and only if their binary labels differ in exactly one bit.
The following figure shows hypercubes of various dimensions along with the binary labels of their nodes.
Note:- As can be seen from the figure, the number of bits in a node's binary label equals log(number of processors N), i.e., log N, which is also the degree of each processor.
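The labelling rule above can be sketched in a few lines: flipping any single bit of a node's label gives one of its neighbours, so a node in a k-dimensional hypercube has exactly k neighbours (this is a small illustration, not part of the original notes).

```python
# Sketch: enumerate the neighbours of each node in a k-D hypercube.
# Two nodes are neighbours iff their binary labels differ in one bit,
# so XOR-ing with each power of two yields the k neighbours.
def neighbors(node, k):
    """Nodes whose labels differ from `node` in exactly one bit."""
    return [node ^ (1 << d) for d in range(k)]

k = 3  # 3-D hypercube: 2**3 = 8 nodes, each of degree 3 = log N
for node in range(2 ** k):
    print(format(node, f"0{k}b"), "->",
          [format(n, f"0{k}b") for n in neighbors(node, k)])
```

Running this for k = 3 shows, for example, that node 000 neighbours 001, 010 and 100, confirming that the degree equals log N.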
We now discuss the hypercube network using the example of a 4-D network, analysing factors such as:
· Latency
· Scalability
· Diameter
· Bisection Width
· No. of Links
· Dimension
Later on we’ll calculate the overall complexity of the matrix multiplication algorithm.
· Latency:- log N (the maximum path length a message traverses).
· Scalability:- these kinds of static networks are not scalable, since the degree of each node grows with the network size, whereas linear networks have constant degree (except at the corner nodes).
· Diameter:- 2*log N (here N is the matrix dimension, so there are N*N processors; see the dimension calculation below). The diameter captures the communication overhead: it is the minimum number of steps it takes for one processor to send a message to the processor that is farthest away.
· Bisection width:- it tells how well two halves of an interconnection network are connected.
In a linear network the bisection width is constant, but here it grows with the number of processors: it equals half the number of nodes, i.e., N/2 for N nodes.
· No. of links:- (N*log N)/2 for N nodes.
· Dimension of a hypercube of NxN processors:-
total no. of nodes = N*N
dimension = log(N*N) = 2*log N
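The factors listed above can be tied together in a short check (a sketch of my own, assuming the convention used here that N is the matrix dimension, so the hypercube hosts N*N nodes):

```python
import math

# Hedged sketch: metrics of a hypercube hosting an N x N processor grid.
# With N*N nodes, the hypercube dimension is log2(N*N) = 2*log2(N).
def hypercube_metrics(N):
    nodes = N * N
    dim = int(math.log2(nodes))        # = 2 * log2(N)
    return {
        "nodes": nodes,
        "dimension": dim,
        "diameter": dim,               # farthest pair differs in every bit
        "bisection_width": nodes // 2, # half the nodes on each side
        "links": nodes * dim // 2,     # each node has `dim` links, each counted twice
    }

print(hypercube_metrics(4))
```

For the 4x4 matrix example this gives 16 nodes, dimension 4, diameter 4 (= 2*log 4), bisection width 8, and 32 links.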
Multiplication of two 4x4 matrices
Using the above interconnection scheme, we mainly perform the following 5 steps (the time required is given in brackets):-
1). Broadcast A (N*log N)
2). Transpose B (N)
3). Broadcast B (N*log N)
4). Multiplication and addition of rows and columns on the N*N processors
(specifically, N multiplications and N-1 additions per processor)
5). Rearrange the matrix to get the required elements (N*log N)
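The five steps above can be mirrored in a plain serial sketch (the array names and values are mine, not from the notes; each "processor" (i, j) ends up holding one dot product, as in step 4):

```python
# Serial sketch of the 5-step hypercube matrix multiplication for N = 4.
N = 4
A = [[i * N + j + 1 for j in range(N)] for i in range(N)]
B = [[(i + j) % N + 1 for j in range(N)] for i in range(N)]

# Steps 1-3: after the broadcasts, processor (i, j) holds row i of A
# and column j of B; the transpose makes the column contiguous.
Bt = [list(col) for col in zip(*B)]            # step 2: transpose B

# Step 4: N multiplications and N-1 additions per processor (i, j).
C = [[sum(a * b for a, b in zip(A[i], Bt[j])) for j in range(N)]
     for i in range(N)]

# Step 5 (rearrangement) is the identity in this serial sketch.
expected = [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]
assert C == expected
```

The sketch only checks the arithmetic of step 4; the communication pattern of the broadcasts is what the stages below describe.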
To broadcast A, we perform the following:
First place R1 and R2 at nodes 2 and 7 (a diagonal edge) and R3 and R4
at nodes 8 and 13, respectively.
1st stage - send the rows to their respective 4th neighbours (the other neighbours remain masked).
2nd stage - now simultaneously broadcast the information to the respective 3rd neighbours.
This simplifies the control unit, since a single control line serves all the processors.
Hence, in log N stages (in general) we have R1, R2, R3, R4 on the front face of the cube, i.e., processors 0, 1, 2 and 3 hold R1, R2, R3 and R4 respectively.
The same happens on the other faces.
After transposing the columns of matrix B, we broadcast the transposes of C1, C2, C3 and C4 to their respective processors. Here C1, C2, C3 and C4 denote the transposed columns.
First place C1 and C2 at nodes 0 and 1, and C3 and C4 at nodes 8 and 12, respectively.
1st stage - send the columns to their respective 1st neighbours (the other neighbours remain masked).
2nd stage - now simultaneously broadcast the information to the respective 2nd neighbours.
After this, C1 is present at all processors of the front face of the cube.
Now the multiplication and addition of R1 with the transpose of C1 is performed on processor 0, of R2 and C1 on processor 1, of R3 and C1 on processor 2, and of R4 and C1 on processor 3; this happens simultaneously on the other processors for the other columns.
Finally, we rearrange the elements of the hypercube to form the resultant matrix.
Conclusion
Considering all the above steps, the overall complexity is O(N*log N).
The value obtained when we considered N^3 processors was O(log N).
Here broadcasting is done in order to exploit parallelism, and hence the factor log N appears as an overhead.
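The conclusion can be checked by simply adding up the per-step costs listed earlier (a rough tally of my own; the local multiply-add is counted as roughly N steps):

```python
import math

# Rough cost tally for the 5 steps, with N the matrix side length.
def total_steps(N):
    log = math.log2(N)
    return (N * log      # 1) broadcast A
            + N          # 2) transpose B
            + N * log    # 3) broadcast B
            + N          # 4) ~N mults + (N-1) adds per processor
            + N * log)   # 5) rearrange

print(total_steps(4))    # dominated by the N*log N terms -> O(N log N)
```

For N = 4 the N*log N terms (8 steps each) dominate the linear terms, matching the O(N*log N) bound.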
Submitted by:-
Sweety Yadav and Uzma.
During the analysis of the PRAM model on the same matrix multiplication example, the log N term appeared because of the step that accumulates the products; here it appears because of the latency, which is log N.
An important issue in the design of interconnection networks for massively parallel computers is scalability. A new scalable interconnection network topology, called the Double-Loop Hypercube (DLH), has been proposed. The DLH network combines the positive features of the hypercube topology, such as small diameter, high connectivity, symmetry and simple routing, with the scalability and constant node degree of a new double-loop topology. The DLH network can maintain a constant node degree regardless of increases in the network size.
Hypercube network:- Classically this is a multidimensional mesh of processors with exactly two processors in each dimension. An example of such a system is the Intel iPSC/860 computer. Some newer projects place several processors in each node, which results in a fat hypercube, i.e., an indirect network topology. An example is the SGI/Cray Origin2000 computer.
Several modifications of hypercubes are:
Folded hypercubes: where more links are established between nodes.
Extended hypercubes: formed by adding nodes.
Cube-connected cycles: where another structure is mapped onto the hypercube.
Banyan hypercubes: a synthesis of banyans and hypercubes.
Modular hypercubes: where the degree of each node is (2n-1) compared to n in ordinary hypercubes.
http://www.ece.uic.edu/~ajohar/paper7.pdf
The diameter of a hypercube made of N processors should be log N, and not 2*log N as mentioned in the post.
I have 2 doubts:
1. How is the diameter 2*log N for N processors? The maximum length among the minimum-length paths is 4 = log N for N = 16 processors, which is between nodes 0 and 15.
2. In log N steps, why can't we broadcast the rows in parallel, i.e., R1 to its fourth neighbour and R2 to its fourth neighbour at the same time, so that the overall complexity is O(log N)?
Could somebody explain?
Ravi, I think I've cleared your first doubt, but your second doubt is the same as mine. Hope we'll get it cleared soon. Maybe it is done this way to reduce the complexity of the control lines.
By the definition of diameter, the diameter here is the length of the path taken to send information from processor 0 to processor 15 (which are farthest apart). One possible path is 0-1-5-7-15, since a processor can communicate directly only with its neighbours. Hence 2*log N (with N the matrix dimension), but if N denotes the number of processors we can say it is log N. This is what I've understood; corrections are welcome.
I want to make one thing clear: the diameter mentioned above, 2*log N, is specifically for the matrix multiplication setting, where NxN is the dimension of the matrix. If we take N as the total number of processors, it is log N; the way of finding the largest of the shortest paths stays the same.
A major drawback of the hypercube network is its lack of scalability, which limits its use in building large systems out of small ones with little change in configuration. Also,
it is more difficult to design and fabricate the nodes of a hypercube because of the large fan-out.
@uzma So isn't the diameter log N (N = 16, log N = 4, and the diameter is also 4, as mentioned by you)? I think you are confusing N with the matrix dimension, which is 4 here. If N is the number of processors in the hypercube, it equals the square of the matrix side n, N = n^2, so log N = 2*log n.
Yes, that's what I said in my previous comment, but what we have written isn't wrong: as you can see, we clearly mentioned that the total number of nodes = N*N, hence the diameter is 2*log N. And yes, I agree this is confusing; we should have used two symbols, n and N, for the purpose.
A little doubt: since cube-connected networks are restricted to have degree three, does this restriction degrade the performance of the cube network?
From this, one thing is clear: cube networks have higher scalability than hypercubes; furthermore, cube networks have a larger diameter than hypercubes.