May 5, 2012

MPP

First of all, MPP stands for massively parallel processing. In the course we are building a cluster, but an MPP is another way to build a distributed-memory computer system that consists of many individual nodes.

MPP is a type of computing that uses many separate CPUs running in parallel to execute a single program. As noted above, an MPP is a distributed-memory computer system made up of many individual nodes. Each node is an independent computer with at least one processor, its own memory, and a link to the network that connects all the nodes together.

MPP systems have been designed to scale up to hundreds, and in some cases thousands, of processors.

Each node runs its own copy of the OS kernel or microkernel. The nodes communicate by passing messages, using standards such as the Message Passing Interface (MPI).
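To make the message-passing idea a bit more concrete, here is a minimal sketch in Python. It is not MPI: it only imitates the style on a single machine, with threads standing in for nodes and one `queue.Queue` per node standing in for the network link. The names `node` and `run_nodes` and the per-node values are made up for illustration.

```python
import queue
import threading


def node(rank, inboxes, results):
    """Simulated node: keeps its own data and talks to peers only via queues."""
    local_value = (rank + 1) * 10  # data that "lives" only on this node (made up)
    if rank == 0:
        # Node 0 collects one message from every other node and adds them up.
        total = local_value
        for _ in range(len(inboxes) - 1):
            _sender, value = inboxes[0].get()  # blocking receive
            total += value
        results["total"] = total
    else:
        # Every other node explicitly sends its value to node 0.
        inboxes[0].put((rank, local_value))


def run_nodes(num_nodes=4):
    """Start one thread per simulated node and return the sum gathered by node 0."""
    inboxes = [queue.Queue() for _ in range(num_nodes)]
    results = {}
    threads = [
        threading.Thread(target=node, args=(rank, inboxes, results))
        for rank in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["total"]


if __name__ == "__main__":
    print(run_nodes(4))
```

Node 0 never reads another node's `local_value` directly; it only sees what arrives in its inbox. On a real MPP the same pattern would be written with explicit send and receive calls from a message-passing library.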

In this class of computing, all of the processing elements are connected together to form one very large computer. A processor cannot directly access the physical memory located in a remote node; the programmer or the compiler has to instruct the machine to transfer data from one node to another as needed. Faster and well-controlled interconnects in MPPs have led to some attempts to provide a shared-memory look-alike programming model on these machines. However, these attempts suffer from scalability and availability concerns.

At this point you may be wondering about the difference between a cluster and an MPP. Here are the differences:

  • In a cluster various components or layers can change relatively independently of each other, while components in MPP systems are much more tightly integrated. For example, a cluster administrator can choose to upgrade the interconnect just by adding new network interface cards (NICs) and switches to the cluster. On the other hand, in most cases the administrator for an MPP system cannot do such upgrades without upgrading the whole machine.
  • Cluster management tools and parallel programming libraries can be optimized independently of changes in the node hardware itself. This results in more mature and reliable cluster middleware compared to the system software layer in an MPP-class system, which requires at least a major rewrite with each generation of the system hardware.
  • An MPP usually has a single system serial number used for software licensing and support tracking. Clusters have multiple serial numbers, one for each of their nodes.
  • In an MPP system, many computers are physically housed in the same chassis. A cluster system is physically dispersed.
  • The main difference is that a cluster is a set of computers, each with one or more processors, while an MPP is a single machine composed of many processors. In some ways an MPP can be regarded as a cluster of processors rather than a cluster of machines.

In MPP operation, the problem is broken up into separate pieces, which are processed simultaneously. In order to use MPP effectively, an information processing problem must be breakable into pieces that can all be solved simultaneously. In scientific environments, certain simulations and mathematical problems can be split apart and each part processed at the same time.
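A classic example of a mathematical problem that splits apart cleanly is estimating pi by numerical integration. The sketch below is only an illustration under my own assumptions: it uses a thread pool on one machine in place of real nodes, and the function names `partial_pi` and `estimate_pi` are made up. Each worker handles its own share of the strips, and the partial results are added at the end.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(rank, num_workers, steps):
    """One worker's midpoint-rule share of the integral of 4/(1+x^2) on [0, 1]."""
    width = 1.0 / steps
    total = 0.0
    # Worker `rank` takes strips rank, rank + num_workers, rank + 2*num_workers, ...
    for i in range(rank, steps, num_workers):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width


def estimate_pi(num_workers=4, steps=100_000):
    """Split the strips among the workers, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = pool.map(
            partial_pi,
            range(num_workers),
            [num_workers] * num_workers,
            [steps] * num_workers,
        )
        return sum(parts)


if __name__ == "__main__":
    print(abs(estimate_pi() - math.pi))
```

The pieces are independent: no worker needs another worker's partial sum until the final addition, which is what makes the problem a good fit for this kind of machine.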

In an MPP system, a file system is commonly shared across the network. This configuration allows program files to be shared instead of installed on individual nodes in the system.

In an MPP system, work distribution between nodes is a vital consideration when designing any application. The design should take into account the synchronization of data between nodes, and all communication between them must be done explicitly through calls to a message-passing mechanism.
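One common way to think about work distribution is a block partition: each node gets a contiguous range of items, and the sizes differ by at most one. The helper below is a minimal sketch of that idea; the name `block_range` is hypothetical and this is just one possible scheme, not something prescribed by MPP systems.

```python
def block_range(rank, num_nodes, num_items):
    """Return the half-open range [start, stop) of items owned by node `rank`."""
    base, extra = divmod(num_items, num_nodes)
    # The first `extra` nodes each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # 10 items over 4 nodes: two nodes get 3 items and two nodes get 2.
    for rank in range(4):
        print(rank, block_range(rank, 4, 10))
```

Because every node can compute its own range from its rank alone, no messages are needed just to decide who does what; communication is then only required for the data that actually has to move between nodes.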


Nowadays, MPPs are the world's largest computers, with hundreds or thousands of processors connected to hundreds of gigabytes of memory.



