Monday 13 January 2014

SUMMARY-13/01/14

FLYNN'S CLASSIFICATION

Flynn's classification distinguishes multi-processor computer architectures along two independent dimensions: Instruction Stream and Data Stream. Each of these dimensions can take only one of two possible states: Single or Multiple.

Today we discussed:

1.     Single instruction, single data (SISD)

2.     Single instruction, multiple data (SIMD)


1.     SISD

 

Single instruction: only one instruction stream is passed from the memory to the control unit (CU).

Single data: only one data stream is passed between the memory and the processing unit (PU), in both directions.

This is a serial computer.

A single time-shared bus can be used for both the instruction stream and the data stream.

 

Harvard Architecture: the CU and the PU each have their own memory, rather than sharing a single memory as in the basic model above. Another possibility is that only the cache memories are separate while the main memory is common. But a true Harvard architecture is one in which both the main memory and the cache memory are separate for the CU and the PU.

 

2.     SIMD

 

Single instruction: only one instruction stream is passed from the memory to the control unit.

Multiple data: each processing unit can operate on a different data element.

It is excellent for processing arrays. Element-wise addition and subtraction of arrays are faster in this type of architecture because the result of each processing unit is independent of the results of the other units. But in array multiplication of the inner-product kind (where the individual products have to be added together), the dependency between the different processing units increases and hence parallelism decreases.

Fine-grain parallelism: parallelizing array operations with no dependencies, as in the sketch below.
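To make the contrast concrete, here is a small C sketch (our own illustration, not from the lecture; the arrays a, b, c and the size N are made up). The element-wise addition has fully independent iterations, so each PU could compute one element; the dot product has to fold every product into a single accumulator, which is exactly the dependency described above.

/* Independent element-wise addition vs. a dependent reduction. */
#include <stdio.h>

#define N 8

int main(void) {
    int a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    int b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    int c[N];

    /* Element-wise addition: every iteration is independent,
       so N processing units could each compute one c[i] at once. */
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    /* Dot product: every partial product is summed into one
       accumulator, so the PUs' results depend on each other and
       the reduction serializes part of the work. */
    int dot = 0;
    for (int i = 0; i < N; i++)
        dot += a[i] * b[i];

    printf("c[0] = %d, dot = %d\n", c[0], dot);
    return 0;
}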

With a single memory, only one processing unit can access the memory at a time, as the memory has only one read/write port. To solve this problem, every PU can have its own private memory module (PMM), but then a host processor is needed alongside the control unit to manage these memory modules.

The host processor is connected to all the private memory modules, and if the PUs need to exchange data, the exchange has to go back through the host processor, as the sketch below illustrates.
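Here is a toy C model of that arrangement (entirely our own sketch, with made-up names such as pmm and host_transfer): each PU owns a private array standing in for its PMM, and a value can only move between PUs via the host.

/* Toy model: PUs with private memory modules, host-mediated exchange. */
#include <stdio.h>

#define NUM_PU   4
#define PMM_SIZE 4

/* One private memory module per processing unit. */
static int pmm[NUM_PU][PMM_SIZE];

/* The host reads from the source PU's module and writes the value
   into the destination PU's module; the PUs never touch each
   other's memory directly. */
void host_transfer(int src_pu, int src_addr, int dst_pu, int dst_addr) {
    int value = pmm[src_pu][src_addr];
    pmm[dst_pu][dst_addr] = value;
}

int main(void) {
    pmm[0][0] = 42;              /* data lives in PU 0's PMM      */
    host_transfer(0, 0, 3, 1);   /* route it to PU 3 via the host */
    printf("PU 3 now sees %d\n", pmm[3][1]);
    return 0;
}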

Another modification to the above model is to place an interconnection network between the PUs and the PMMs, so that instead of each processor having access to only one PMM, every PU can access all the PMMs. An alternative to this modification is to give each PMM several read/write ports.

 

Now the question is: who tells the host processor that a given piece of data has to be transferred to a particular PU?

The compiler and the operating system decide which data will go to which PU. Therefore, more network instructions are required in the ISA, and the OS has to be smarter in order to resolve memory conflicts. Parallelism decreases whenever a memory conflict has to be handled serially.

The compiler has to do the automatic vectorization, and language support is also needed, i.e. a language with parallel constructs (instead of "for i = 0 to n", here we would use "for all i = 0 to n"; see the sketch below). The compiler should be able to detect dependencies from the beginning, be aware of the hardware, and then do the vectorization. A retargetable compiler is one that takes both the source code and a description of the target hardware as its input.
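As a rough illustration of the "for all" idea (our own sketch; the lecture's hypothetical parallel language is not OpenMP, which is just used here as a real-world stand-in), C with OpenMP lets the programmer promise the compiler that the iterations are independent:

/* Serial intent vs. parallel intent. Compile with a vectorizing
   compiler, e.g. gcc -O2 -fopenmp-simd. */
#include <stdio.h>

#define N 1024

int main(void) {
    static float a[N], b[N], c[N];

    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* "for i = 0 to n": a serial loop the compiler must prove safe. */
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    /* "for all i = 0 to n": the pragma asserts the iterations are
       independent, so the compiler may map them onto SIMD lanes. */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}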

So, what all do we need to move from SISD to SIMD:

A new ISA, new hardware, new language design, new compiler design, a smarter operating system, and programmers who can think in parallel.

By Sonakshi and Saloni Kapoor

6 comments:

  1. Well, dependencies are an inevitable part of any program - they are in fact needed. Fine-grain parallelism implies that the size of the code that is parallelized is small - say tens of instructions. Sometimes even single instructions, as in the case of array addition - then it is ILP, or Instruction Level Parallelism (provided of course we had a large number of processors).
    The point I want to make is: the existence of dependencies does not necessarily render fine-grain parallelism invalid - though it will surely serialize some operations.

  2. Ok ma'am. I had one doubt: is there any other application except arrays where SIMD would be useful? And where is SISD used, and how is it beneficial?

  3. SISD is your normal single-CPU processor we have been using for years! It has served humanity well.

    No indeed! SIMD is used primarily for vector processing, because it is designed to work in a LOCK-STEP fashion: every PU does the same thing in perfect sync. So it is useful for multimedia processing. Say you want to convert a pic into its negative or dim the whole pic; then the computations are very much the same for each pixel of the image. SIMD processors are the perfect platform here (a small sketch follows).
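    To make the pixel example concrete, here is a tiny C sketch (our illustration; the 8-bit grayscale buffer and its size are assumptions) where the identical computation 255 - p is applied to every pixel, exactly the lock-step pattern SIMD exploits:

    /* Image negative: the same operation on every pixel. */
    #include <stdio.h>

    #define W 4
    #define H 2

    int main(void) {
        unsigned char pic[H][W] = {
            {  0, 64, 128, 255 },
            { 10, 20,  30,  40 }
        };

        /* Every pixel gets 255 - p, so each PU could handle one
           pixel (or one chunk of pixels) in lock-step. */
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                pic[y][x] = (unsigned char)(255 - pic[y][x]);

        printf("pic[0][3] = %d\n", pic[0][3]);   /* 255 becomes 0 */
        return 0;
    }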

    Historically, there has been another application (will discuss in class).

  4. Why is control memory needed? I mean, I understand maybe if the processor is microprogram-controlled... but why will it be needed if it's hardwired?

  5. There is no mention of a control memory here... where did you read it?

    In SIMD, there is a single control unit which fetches and decodes instructions and then passes on the control signals to all the PUs (like an orchestra master).

    You are right; it is usually hardwired control. It is somewhat more complex than an SISD control unit because it has to tackle extra instructions such as Gather, Scatter, and Mask (sketched below). But see the economy: there is a single control unit for so many PUs!
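    As a plain-C illustration of what those three instructions do (our sketch, written as loops that a SIMD machine would execute in one step across its PUs; all names are made up):

    /* Gather, Scatter, and Mask modelled with ordinary loops. */
    #include <stdio.h>

    #define N 4

    int main(void) {
        int data[8] = {10, 11, 12, 13, 14, 15, 16, 17};
        int idx[N]  = {6, 0, 3, 1};    /* per-lane indices  */
        int mask[N] = {1, 0, 1, 0};    /* 1 = lane enabled  */
        int v[N], out[8] = {0};

        /* Gather: each lane loads data[idx[i]] into its register. */
        for (int i = 0; i < N; i++)
            v[i] = data[idx[i]];

        /* Mask: disabled lanes simply skip the operation. */
        for (int i = 0; i < N; i++)
            if (mask[i])
                v[i] += 100;

        /* Scatter: each lane stores its register to out[idx[i]]. */
        for (int i = 0; i < N; i++)
            out[idx[i]] = v[i];

        printf("out[6] = %d, out[0] = %d\n", out[6], out[0]);  /* 116, 10 */
        return 0;
    }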

  6. There was something written about the Harvard architecture under SISD!!! I was reading that... so why would there be a control unit if it is hardwired-controlled?
