CHAPTER-4
PARALLEL PROGRAMMING
LANGUAGES By Dr. Heman Pathak
Associate Professor
KGC - Dehradun
Parallel Processes
Multicomputer
• A multicomputer is constructed out of multiple
computers and an interconnection network. The
processors on different computers interact by
passing messages to each other.
Multiprocessor
• It is a computer system with two or more CPUs: a highly
integrated system in which all CPUs share access to a
single global memory.
Parallel Processes
Uniform Memory Access (UMA)
• It is a shared memory architecture in which
access time to a memory location is
independent of which processor makes the
request or which memory chip contains the
transferred data.
PROGRAMMING PARALLEL PROCESSES
Every parallel language must address certain issues, either explicitly
or implicitly:
 There must be a way to create parallel processes.
 There must be a way to coordinate the activities of these processes.
 When processes exchange results, they must communicate and
synchronize with each other.
 Communication and synchronization can be accomplished by sharing
variables or by message passing.
PROGRAMMING PARALLEL PROCESSES
Communication and synchronization can be accomplished by sharing
variables or by message passing. There are two methods of
synchronization:
Synchronization for precedence
It guarantees that one event does not begin until
another event has finished.
Synchronization for mutual exclusion
It guarantees that only one process at a time enters a
critical section of code where a shared data structure
is manipulated.
Barrier Synchronization
[Figure: four processes P1–P4 approach the barrier; all except P4 have
arrived and wait; once all four arrive, they continue past the barrier.]
Mutual Exclusion Synchronization
[Figure: Process 1 and Process 2 taking turns in a critical section.]
An Illustrative Example for UMA
Let us consider the problem of computing the variance of a list of real
numbers. The values given are r1, r2, r3, ..., rn.
Variance = Σ(ri − m)²/n, where m = (Σ ri)/n
 Parallel Architecture used is UMA with 4 Processors
 n real numbers are stored in shared memory
 Four variables are also stored in shared memory
 one will contain the grand total
 another the mean
 a third the global sum of squares
 a fourth the variance
An Illustrative Example for UMA
 Four processes are created, one for each processor.
 Each process has a local temporary variable.
 Each process adds its share of the n values
 Accumulating the sum in its local temporary variable.
 When the process is through computing its subtotal, it adds its
subtotal to the shared variable, accumulating the grand total.
 Since multiple processes are accessing the same global variable,
that portion of code is a critical section, and the processes must
enforce mutual exclusion.
An Illustrative Example for UMA
 A barrier synchronization step, inserted in the algorithm after the critical
section, ensures that no process continues until all processes have added
their subtotals to the grand total.
 In sequential code one process computes the average by dividing the
grand total by n.
 To compute the global sum of squares, the processes go through
steps similar to those used to compute the global sum.
 Again, the processes must enforce mutual exclusion.
 After another barrier synchronization, one process computes the variance
by dividing the sum of squares by the number of values.
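The algorithm just described can be summarized in sequential C: each loop below corresponds to the work the four processes share, and the comments mark where the critical sections and barriers fall. The function name is illustrative.

```c
#include <stddef.h>

/* Two-phase computation mirroring the UMA algorithm: first the grand
   total and mean, then (after a barrier) the sum of squares. */
double variance(const double *r, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)   /* each process adds its share;    */
        total += r[i];               /* updates are a critical section  */
    double mean = total / n;         /* after barrier: one process divides */

    double sumsq = 0.0;
    for (size_t i = 0; i < n; i++)   /* second phase, same pattern */
        sumsq += (r[i] - mean) * (r[i] - mean);
    return sumsq / n;                /* after second barrier */
}
```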
An Illustrative Example for Multicomputer
Solve the same problem on a four-node multicomputer
 There is no shared memory; the n values are
distributed among the local memories of the
nodes.
 Each node process has four variables:
 two to accumulate sums
 one to store the mean
 and another to store the variance
 Each node process initializes the two accumulators
to 0, then adds its share of the values to one of them.
 At this point every process has a subtotal; the
four subtotals must be combined to find the
grand total.
An Illustrative Example for Multicomputer
Exchange-and-add on four nodes:
Node:                 0    1    2    3
Initial subtotal:     7    8   11   10
After 1st exchange:  15   15   21   21
After 2nd exchange:  36   36   36   36
An Illustrative Example for Multicomputer
The four subtotals must be
combined to find the grand total.
 After two exchange-and-add
steps, every process has the grand
total.
 Every process can divide the
grand total by n to determine the
mean.
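The exchange-and-add steps can be simulated serially in C: in step d, each node adds the subtotal of the neighbor whose node number differs in bit d. This is a sketch for four nodes; the function name is illustrative.

```c
#define NODES 4
#define DIMS  2   /* log2(NODES) dimensions of the hypercube */

/* Serial simulation of exchange-and-add on a 2-D hypercube: after the
   loop, every node holds the grand total of the initial subtotals. */
void exchange_and_add(long total[NODES])
{
    for (int d = 0; d < DIMS; d++) {
        long recv[NODES];
        for (int node = 0; node < NODES; node++)
            recv[node] = total[node ^ (1 << d)]; /* value from neighbor */
        for (int node = 0; node < NODES; node++)
            total[node] += recv[node];
    }
}
```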
An Illustrative Example for Multicomputer
A similar set of steps allows the
processes to compute the variance of
the list values.
Every process has the result. One of the
processes can pass the answer back to the
program running on the front end, which
then de-allocates the hypercube,
surrendering access to the nodes.
A Sample Application
Integration: Find the area under the curve
4/(1 + x²) between 0 and 1, which equals π.
 The interval [0, 1] is divided into n
subintervals of width 1/n.
 For each of these subintervals the algorithm
computes the area of a rectangle whose
height is such that the curve intersects the top
of the rectangle at its midpoint.
 The sum of the areas of the n rectangles
approximates the area under the curve.
A Sample Application
 This algorithm is data-parallel, since the
areas of all the rectangles can be
computed simultaneously.
 Computing the area of each rectangle
requires the same amount of work;
hence load balancing is not an issue.
 If the language requires the programmer
to divide the work among the processors,
it can be done easily.
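A sequential C version of the rectangle rule makes the computation precise; a parallel version would simply divide the n iterations among the processes. The function name is illustrative.

```c
/* Midpoint-rectangle approximation of the integral of 4/(1+x^2)
   over [0, 1], which converges to pi as n grows. */
double approximate_pi(int n)
{
    double width = 1.0 / n, area = 0.0;
    for (int i = 0; i < n; i++) {
        double x = (i + 0.5) * width;        /* midpoint of subinterval i */
        area += width * 4.0 / (1.0 + x * x); /* rectangle area */
    }
    return area;
}
```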
FORTRAN 90
Parallel Programming Language
FORTRAN 90
In 1978 the ANSI-accredited technical
committee X3J3 began working on a
new version of the FORTRAN language.
In the early 1990s the resulting language,
Fortran 90, was adopted as an ISO and
ANSI standard.
FORTRAN 90
Fortran 90 is a superset of FORTRAN 77. It includes all the features of FORTRAN 77, plus
 Array operations
 Improved facilities for numerical computations
 Syntax to allow processors to support short integers, packed logicals, very large
character sets, and high-precision real and complex numbers
 User-defined data types, structures, and pointers
 Dynamic storage allocation
 Modules to support abstract data types
 Internal procedures and recursive procedures
 Improvements to input-output facilities
 New control structures
 New intrinsic procedures
 Terminal-oriented source form
FORTRAN 90
The committee also marked many language features as obsolete,
including
 the arithmetic IF
 some DO construct variations
 the assigned GO TO
 assigned formats
 the H edit descriptor
The next revision of the FORTRAN standard may
not contain these features.
FORTRAN 90 Programmer‘s Model
The Fortran 90 programmer has a model of parallel computation similar to a
PRAM. A CPU and a vector unit share a single memory.
 The CPU executes sequential
instructions, accessing variables
stored in the shared memory.
 To execute parallel operations,
the CPU controls the vector unit,
which also stores and fetches
data to and from the shared
memory.
FORTRAN 90 Language Features
Fortran 90 gives the programmer the ability to specify the type of variables through type
declaration statements such as
• REAL A, B, C
• INTEGER I
Each type may have several kinds. For example, a real variable may be stored in 4 bytes or
8 bytes. The Fortran 90 programmer may specify explicitly the kind, as well as the type of
the variable, as in the following example:
• REAL (KIND=LONG) PI
Fortran 90 introduces the notion of an array constant. For example,
• (/ 2, 3, 5, 7, 11 /)
denotes a one-dimensional array with five elements. It is possible to construct an array of
higher dimension by declaring an array constant, then changing its dimension with the
RESHAPE function.
FORTRAN 90 Language Features
An implied DO notation can simplify the specification of array constants.
For example, the array constant
• (/ 2, 4, 6, 8, 10 /) may be specified as (/ (I, I = 2, 10, 2) /)
Fortran 90 also allows operations on arrays. When applied to an array, the unary intrinsic
operators + and - return an array of the same dimensions, where the elements in the result
array are found by applying the operator to the corresponding elements in the operand
array.
Numerical, relational, and logical binary intrinsic operators can manipulate arrays having the
same dimensions. Each element in the result array is found by applying the operator to the
corresponding elements in the operand arrays.
A binary intrinsic operator can also manipulate an array and a scalar variable, resulting in
an array of the same dimensions as the array operand.
FORTRAN 90 Language Features
For example, given the array declarations
• REAL, DIMENSION(100,50) :: X, Y
• REAL, DIMENSION(100) :: Z
the following are examples of legal array expressions:
X + Y          Array of shape (100,50), elements X(I,J) + Y(I,J)
X + 1.0        Array of shape (100,50), elements X(I,J) + 1.0
X .EQ. Y       Logical array of shape (100,50), elements .TRUE. where X(I,J) equals Y(I,J), .FALSE. otherwise
X(1:100,3) + Z Array of shape (100), elements X(I,3) + Z(I)
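In C, each of these elementwise array expressions corresponds to a pair of nested loops. The sketch below shows two of them, with dimensions matching the declarations above; the function names are illustrative.

```c
#define ROWS 100
#define COLS 50

/* C equivalent of the Fortran 90 expression X + Y:
   the result is computed element by element. */
void array_add(double r[ROWS][COLS],
               const double x[ROWS][COLS], const double y[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            r[i][j] = x[i][j] + y[i][j];
}

/* C equivalent of X + 1.0: the scalar is applied to every element. */
void array_add_scalar(double r[ROWS][COLS],
                      const double x[ROWS][COLS], double s)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            r[i][j] = x[i][j] + s;
}
```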
FORTRAN 90 Language Features
Sometimes it is important to be able to perform an
operation on a subset of the array elements. The WHERE
statement allows the programmer to specify which array
elements are to be active. For example, the statement
WHERE (A > 0.0) A = SQRT(A)
replaces every positive element of A with its square root.
FORTRAN 90 Language Features
The WHERE statement divides the array elements into two sets, first performing
one or more array assignments on the elements for which the expression is true,
then performing one or more array assignments on the elements for which the
expression is false. The syntax of the most general form of the WHERE statement is:
WHERE(logical-array-expression)
array-assignment-statements
ELSEWHERE
array-assignment-statements
END WHERE
Finally, new transformational functions allow the reduction of an array to a scalar
value. For example, the function SUM returns the sum of the elements of the array
passed to it as an argument.
SEQUENT C
Parallel Programming Language
Sequent C
 Sequent computers run the DYNIX operating system, a version of UNIX
tailored for the multiprocessor environment.
 In addition to the operating-system calls typically found in a UNIX
system, DYNIX provides a set of routines to facilitate parallel processing.
 The commercial parallel programming languages used on Sequent
hardware are simple extensions of sequential languages: they allow
programmers to declare shared variables, and processes coordinate
through mutual exclusion and barrier synchronization.
 The resulting languages are primitive.
Sequent C - Shared Data
 Parallel processes on the Sequent coordinate their activities by accessing
shared data structures.
 The keyword shared, placed before a global variable declaration,
indicates that all processes are to share a single instance of that
variable.
 For example, if a 10-element global array is declared int a[10], then
every active process has its own copy of the array; if one process
modifies a value in its copy of a, no other process's value will change.
 On the other hand, if the array is declared shared int a[10], then all
active processes share a single instance of the array, and changes made
by one process can be detected by the other processes.
Sequent C - Parallel Processing Function
 A program begins execution as a single process. This process is
responsible for executing those parts of the program that are
inherently sequential.
 The original process forks a number of other processes, each
process performing its share of the work.
 The total number of processes accessing shared data cannot
exceed the number of physical processors less one. Because there
are at least as many CPUs as active processes, each process may
execute on its own CPU.
 This allows a major reduction in the execution time, assuming that
the computer is not executing any other jobs.
Sequent C - Parallel Processing Function
 When control reaches an inherently sequential
portion of the computation, only the original
process executes the code; the remaining processes
wait until control reaches another portion of the
computation that can be divided into pieces and
executed concurrently. The program cycles through
these two modes until termination.
 Parallel programs executing on the Sequent
alternate between sequential and parallel
execution.
 The transition from parallel to sequential execution
is always delimited by a barrier synchronization.
In addition, data dependencies may
require the insertion of barrier
synchronizations within parallel code.
Sequent C - Parallel Processing Function
Sequent's microtasking library has seven key functions:
1. m_set_procs(p):
The parent process initializes to value p a shared variable that
controls the number of processes created by a subsequent call to
m_fork.
The value of p cannot exceed the number of physical processors
in the system minus one.
The function also initializes barriers and locks.
Sequent C - Parallel Processing Function
2. m_fork(name[,arg,...]):
The parent process creates a number of child processes.
The parent process and the child processes begin executing
function name with the arguments (if any) also specified by the
call to m_fork.
After all the processes (the parent and all the children) have
completed execution of function name, the parent process
resumes execution with the code after m_fork, while the child
processes busy wait until the next call to m_fork.
Sequent C - Parallel Processing Function
2. m_fork(name[,arg,...]):
1. The parent process creates p child processes.
2. All processes begin executing function name.
3. The parent process resumes execution with the code after m_fork.
4. The child processes busy-wait until the next call to m_fork.
Therefore, the first call to m_fork is more expensive than subsequent
calls, because only the first call entails process creation.
Sequent C - Parallel Processing Function
3. m_get_myid: A process calls function m_get_myid to get its
unique process number. If the total number of active processes
is p, then the process number of the parent is 0, while the
process numbers of the child processes range from 1 to p-1.
4. m_get_numprocs: Function m_get_numprocs returns the
number of processes executing in parallel. Given the total
number of processes and its own process number, a process
can determine which portion of a computation is its
responsibility.
Sequent C - Parallel Processing Function
5. and 6. m_lock, m_unlock: Functions m_lock and m_unlock
ensure mutually exclusive execution of the code that
the two calls surround. Once a process has entered a
block of code delimited by m_lock and m_unlock, no
other process may enter until the first process has left.
7. m_kill_procs: Function m_kill_procs kills the child
processes created by the first call to m_fork.
Sequent C - Monitor
 Most parallel algorithms implemented on multiprocessors require a
process to perform a series of operations on a shared data structure, as
if it were an atomic operation.
 For example, a process may need to fetch the value at the beginning of
a linked list and advance the list pointer to the next list element.
 When the hardware cannot perform the entire series of operations as an
atomic operation, the process must have some way to enforce mutual
exclusion, keeping all other processes from referencing the resource while
it is being modified. The piece of code in which mutual exclusion must be
enforced is called a critical section.
Sequent C - Monitor
One way to structure accesses to shared resources is by using a monitor.
 A monitor consists of variables representing the state of some resource,
procedures that implement operations on it, and initialization code.
 The values of the variables are initialized before any procedure in the
monitor is called; these values are retained between procedure invocations
and may be accessed only by procedures in the monitor.
 Monitor procedures resemble ordinary procedures in the programming
language with one significant exception. The execution of the procedures in
the same monitor is guaranteed to be mutually exclusive. Hence monitors
are a structured way of implementing mutual exclusion.
Sequent C - Monitor
Programming languages that support monitors include Concurrent
Pascal (Brinch Hansen 1975, 1977) and Modula (Wirth 1977a,
1977b, 1977c). Even if your parallel programming language does
not support monitors, you can implement one yourself. For example, in
the Sequent C language, you can implement a monitor by declaring a
shared lock variable for each resource, putting an s_lock statement
that accesses the variable at the start of each procedure, and putting
an s_unlock statement at the end of each procedure. You also must
have enough self-discipline to use only these procedures to access the
shared resource.
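That recipe can be sketched in C, using a POSIX mutex in place of DYNIX's s_lock/s_unlock, with a shared counter as the hypothetical resource. The function names are illustrative.

```c
#include <pthread.h>

/* Monitor for a shared counter: the state and its lock are private to
   this file, and every access goes through the two procedures, which
   bracket their bodies with lock/unlock (the role s_lock and s_unlock
   play in Sequent C). */
static long count = 0;
static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;

void counter_add(long v)
{
    pthread_mutex_lock(&monitor_lock);    /* enter monitor */
    count += v;
    pthread_mutex_unlock(&monitor_lock);  /* leave monitor */
}

long counter_read(void)
{
    pthread_mutex_lock(&monitor_lock);
    long v = count;
    pthread_mutex_unlock(&monitor_lock);
    return v;
}
```

Because both procedures take the same lock, their executions are mutually exclusive, which is exactly the monitor guarantee described above.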
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 

Dernier (20)

Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 

Chapter 4: Parallel Programming Languages

  • 1. CHAPTER-4 PARALLEL PROGRAMMING LANGUAGES By Dr. Heman Pathak Associate Professor KGC - Dehradun
  • 2. Parallel Processes Multicomputer • A multicomputer is constructed out of multiple computers and an interconnection network. The processors on different computers interact by passing messages to each other. Multiprocessor • It is a computer system with two or more CPUs. It is a highly integrated system in which all CPUs share access to a single global memory. 2
  • 3. Parallel Processes Uniform Memory Access (UMA) • It is a shared memory architecture in which access time to a memory location is independent of which processor makes the request or which memory chip contains the transferred data. 3
  • 4. PROGRAMMING PARALLEL PROCESSES Every parallel language must address certain issues, either explicitly or implicitly:  There must be a way to create parallel processes  There must be a way to coordinate the activities of these processes  When processes exchange results, they must communicate and synchronize with each other  Communication and synchronization can be accomplished by sharing variables or by message passing.
  • 5. PROGRAMMING PARALLEL PROCESSES Communication and synchronization can be accomplished by sharing variables or by message passing. There are two methods of synchronization: synchronization for precedence It guarantees that one event does not begin until another event has finished synchronization for mutual exclusion. It guarantees that only one process at a time enters a critical section of code where a data structure to be shared is manipulated
  • 8. An Illustrative Example for UMA Let's consider the problem of computing the variance of a list of real numbers. The values given are r1, r2, r3, ..., rn Variance = Σ(ri − m)²/n, where m = Σri/n  Parallel Architecture used is UMA with 4 Processors  n real numbers are stored in shared memory  Four variables are also stored in shared memory  one will contain the grand total  another the mean  a third the global sum of squares  a fourth the variance
  • 9. An Illustrative Example for UMA  Four processes are created, one for each processor.  Each process has a local temporary variable.  Each process adds its share of the n values  Accumulating the sum in its local temporary variable.  When the process is through computing its subtotal, it adds its subtotal to the shared variable, accumulating the grand total.  Since multiple processes are accessing the same global variable, that portion of code is a critical section, and the processes must enforce mutual exclusion.
  • 11. An Illustrative Example for UMA  A barrier synchronization step, inserted in the algorithm after the critical section, ensures that no process continues until all processes have added their subtotals to the grand total.  In sequential code one process computes the average by dividing the grand total by n.  To compute the global sum of squares, the processes go through a process similar to that which computes the global sum.  Again, the processes must enforce mutual exclusion.  After another barrier synchronization, one process computes the variance by dividing the sum of squares by the number of values.
  • 12. An Illustrative Example for Multicomputer Solve the same problem on a four-node multicomputer  There is no shared memory; the n values are distributed among the local memories of the nodes.  Each node process has four variables:  two to accumulate sums  one to store the mean  and another to store the variance  Each node process initializes the two accumulators to 0, then adds up its share of the values.  At this point every process has a subtotal; the four subtotals must be combined to find the grand total.
  • 13. An Illustrative Example for Multicomputer [Figure: exchange-and-add on four nodes. Nodes 0–3 start with subtotals 7, 8, 11, and 10. After the first exchange-and-add step, nodes 0 and 1 each hold 15 and nodes 2 and 3 each hold 21; after the second step, all four nodes hold the grand total 36.]
  • 14. An Illustrative Example for Multicomputer The four subtotals must be combined to find the grand total.  After two exchange-and-add steps, every process has the grand total.  Every process can divide the grand total by n to determine the mean.
  • 15. An Illustrative Example for Multicomputer A similar set of steps allows the processes to compute the variance of the list values. Every process has the result. One of the processes can pass the answer back to the program running on the front end, which then de-allocates the hypercube, surrendering access to the nodes.
  • 16. A Sample Application Integration: Find the area under the curve 4/(1 + x²) between 0 and 1; this area equals π.  The interval [0, 1] is divided into n subintervals of width 1/n.  For each of these intervals the algorithm computes the area of a rectangle whose height is such that the curve intersects the top of the rectangle at its midpoint.  The sum of the areas of the n rectangles approximates the area under the curve.
  • 17. A Sample Application  This algorithm is data-parallel, since the areas of all the rectangles can be computed simultaneously.  Computing the area of each rectangle requires the same amount of work; hence load balancing is not a concern.  If the language requires the programmer to divide the work among the processors, it can be done easily.
  • 19. FORTRAN 90 In 1978 the ANSI-accredited technical committee X3J3 began working on a new version of the FORTRAN language. In the early 1990s the resulting language, Fortran 90, was adopted as an ISO and ANSI standard.
  • 20. FORTRAN 90 Fortran 90 is a superset of FORTRAN 77. It includes all the features of FORTRAN 77, plus  Array operations  Improved facilities for numerical computations  Syntax to allow processors to support short integers, packed logicals, very large character sets, and high-precision real and complex numbers  User-defined data types, structures, and pointers  Dynamic storage allocation  Modules to support abstract data types  Internal procedures and recursive procedures  Improvements to input-output facilities  New control structures  New intrinsic procedures  Terminal-oriented source form
  • 21. FORTRAN 90 The committee also marked many language features as obsolete, including  arithmetic IF  some DO construct variations  assigned GO TO  assigned formats  and the H edit descriptor The next revision of the FORTRAN standard may not contain these features.
  • 22. FORTRAN 90 Programmer's Model The Fortran 90 programmer has a model of parallel computation similar to a PRAM: a CPU and a vector unit share a single memory.  The CPU executes sequential instructions, accessing variables stored in the shared memory.  To execute parallel operations, the CPU controls the vector unit, which also stores and fetches data to and from the shared memory.
  • 23. FORTRAN 90 Language Features Fortran 90 gives the programmer the ability to specify the type of variables through type declaration statements such as • REAL A, B, C • INTEGER I Each type may have several kinds. For example, a real variable may be stored in 4 bytes or 8 bytes. The Fortran 90 programmer may specify explicitly the kind, as well as the type, of the variable, as in the following example: • REAL (KIND=LONG) PI Fortran 90 introduces the notion of an array constant. For example, • (/ 2, 3, 5, 7, 11 /) denotes a one-dimensional array with five elements. It is possible to construct an array of higher dimension by declaring an array constant, then changing its dimension with the RESHAPE function.
  • 24. FORTRAN 90 Language Features An implied DO notation can simplify the specification of array constants. For example, the array constant • (/ 2, 4, 6, 8, 10 /) may be specified as (/ (I, I = 2, 10, 2) /) Fortran 90 also allows operations on arrays. When applied to an array, the unary intrinsic operators + and - return an array of the same dimensions, where the elements in the result array are found by applying the operator to the corresponding elements in the operand array. Numerical, relational, and logical binary intrinsic operators can manipulate arrays having the same dimensions. Each element in the result array is found by applying the operator to the corresponding elements in the operand arrays. A binary intrinsic operator can also manipulate an array and a scalar variable, resulting in an array of the same dimensions as the array operand.
  • 25. FORTRAN 90 Language Features For example, given the array declarations • REAL, DIMENSION(100,50) :: X, Y • REAL, DIMENSION(100) :: Z the following are examples of legal array expressions: X + Y Array of shape (100,50), elements X(I,J) + Y(I,J) X + 1.0 Array of shape (100,50), elements X(I,J) + 1.0 X .EQ. Y Logical array of shape (100,50), elements .TRUE. if X(I,J) .EQ. Y(I,J) and .FALSE. otherwise X(1:100,3) + Z Array of shape (100), elements X(I,3) + Z(I)
  • 26. FORTRAN 90 Language Features Sometimes it is important to be able to perform an operation on a subset of the array elements. The WHERE statement allows the programmer to specify which array elements are to be active. For example, the statement WHERE (A > 0.0) A = SQRT(A) replaces every positive element of A with its square root.
  • 27. FORTRAN 90 Language Features The WHERE statement divides the array elements into two sets, first performing one or more array assignments on the elements for which the expression is true, then performing one or more array assignments on the elements for which the expression is false. The syntax of the most general form of the WHERE statement is: WHERE (logical-array-expression) array-assignment-statements ELSEWHERE array-assignment-statements END WHERE Finally, new transformational functions allow the reduction of an array into a scalar value. For example, the function SUM returns the sum of the elements of the array passed to it as an argument.
  • 30. Sequent C  Sequent computers run the DYNIX operating system, a version of UNIX tailored for the multiprocessor environment.  In addition to the operating-system calls typically found in a UNIX system, DYNIX provides a set of routines to facilitate parallel processing.  The commercial parallel programming languages the Sequent hardware uses are simple extensions of sequential languages that allow programmers to declare shared variables that interact via mutual exclusion and barrier synchronization.  The resulting languages are primitive.
  • 31. Sequent C - Shared Data  Parallel processes on the Sequent coordinate their activities by accessing shared data structures.  The keyword shared, placed before a global variable declaration, indicates that all processes are to share a single instance of that variable.  For example, if a 10-element global array is declared int a[10], then every active process has its own copy of the array; if one process modifies a value in its copy of a, no other process's value will change.  On the other hand, if the array is declared shared int a[10], then all active processes share a single instance of the array, and changes made by one process can be detected by the other processes.
  • 32. Sequent C - Parallel Processing Function  A program begins execution as a single process. This process is responsible for executing those parts of the program that are inherently sequential.  To execute parallel code, the original process forks a number of other processes, each performing its share of the work.  The total number of processes accessing shared data cannot exceed the number of physical processors less one. Because there are at least as many CPUs as active processes, each process may execute on its own CPU.  This allows a major reduction in the execution time, assuming that the computer is not executing any other jobs.
  • 33. Sequent C - Parallel Processing Function  When control reaches an inherently sequential portion of the computation, only the original process executes the code; the remaining processes wait until control reaches another portion of the computation that can be divided into pieces and executed concurrently. The program cycles through these two modes until termination.  Parallel programs executing on the Sequent alternate between sequential and parallel execution.  The transition from parallel to sequential execution is always delimited by a barrier synchronization. In addition, data dependencies may require the insertion of barrier synchronizations within parallel code.
  • 34. Sequent C - Parallel Processing Function
  • 35. Sequent C - Parallel Processing Function Sequent's microtasking library has seven key functions: 1. m_set_procs(p): The parent process initializes to value p a shared variable that controls the number of processes created by a subsequent call to m_fork. The value of p cannot exceed the number of physical processors in the system minus one. The function also initializes barriers and locks.
  • 36. Sequent C - Parallel Processing Function 2. m_fork(name[,arg,...]): The parent process creates a number of child processes. The parent process and the child processes begin executing function name with the arguments (if any) also specified by the call to m_fork. After all the processes (the parent and all the children) have completed execution of function name, the parent process resumes execution with the code after m_fork, while the child processes busy wait until the next call to m_fork.
  • 37. Sequent C - Parallel Processing Function 2. m_fork(name[,arg,...]): The parent process creates p number of child processes All processes begin executing function name The parent process resumes execution with the code after m_fork The child processes busy wait until the next call to m_fork. Therefore, the first call to m_fork is more expensive than subsequent calls, because only the first call entails process creation.
  • 38. Sequent C - Parallel Processing Function 3. m_get_myid: A process calls function m_get_myid to get its unique process number. If the total number of active processes is p, then the process number of the parent is 0, while the process numbers of the child processes range from 1 to p-1. 4. m_get_numprocs: Function m_get_numprocs returns the number of processes executing in parallel. Given the total number of processes and its own process number, a process can determine which portion of a computation is its responsibility.
  • 39. Sequent C - Parallel Processing Function 5, 6. m_lock, m_unlock: Functions m_lock and m_unlock ensure mutually exclusive execution of the code that the two calls surround. Once a process has entered a block of code delimited by m_lock and m_unlock, no other process may enter until the first process has left. 7. m_kill_procs: Function m_kill_procs kills the child processes created by the first call to m_fork.
  • 40. Sequent C - Monitor  Most parallel algorithms implemented on multiprocessors require a process to perform a series of operations on a shared data structure, as if it were an atomic operation.  For example, a process may need to fetch the value at the beginning of a linked list and advance the list pointer to the next list element.  When the hardware cannot perform the entire series of operations as an atomic operation, the process must have some way to enforce mutual exclusion, keeping all other processes from referencing the resource while it is being modified. The piece of code in which mutual exclusion must be enforced is called a critical section.
  • 41. Sequent C - Monitor One way to structure accesses to shared resources is by using a monitor.  A monitor consists of variables representing the state of some resource, procedures that implement operations on it, and initialization code.  The values of the variables are initialized before any procedure in the monitor is called; these values are retained between procedure invocations and may be accessed only by procedures in the monitor.  Monitor procedures resemble ordinary procedures in the programming language with one significant exception. The execution of the procedures in the same monitor is guaranteed to be mutually exclusive. Hence monitors are a structured way of implementing mutual exclusion.
  • 42. Sequent C - Monitor
  • 43. Sequent C - Monitor
  • 44. Sequent C - Monitor Programming languages that support monitors include Concurrent Pascal (Brinch Hansen 1975, 1977) and Modula (Wirth 1977a, 1977b, 1977c). Even if your parallel programming language does not support monitors, you can implement one yourself. For example, in the Sequent C language, you can implement a monitor by declaring a shared lock variable for each resource, putting an s_lock statement that accesses the variable at the start of each procedure, and putting an s_unlock statement at the end of each procedure. You also must have enough self-discipline to use only these procedures to access the shared resource.