ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Parallel & Distributed Computing
    COMP3139
    Progress0 / 33 topics
    Topics
    1. Introduction to Parallel and Distributed Systems2. Why Use Parallel and Distributed Systems?3. Speedup and Amdahl's Law4. Hardware Architectures: Multi Processors (Shared Memory)5. Hardware Architectures: Networks of Workstations (Distributed Memory)6. Hardware Architectures: Clusters (Latest Variation)7. Software Architectures: Threads and Shared Memory8. Software Architectures: Processes and Message Passing9. Software Architectures: Distributed Shared Memory (DSM)10. Software Architectures: Distributed Shared Data (DSD)11. Parallel Algorithms12. Concurrency and Synchronization13. Data and Work Partitioning14. Common Parallelization Strategies15. Granularity16. Load Balancing17. Examples of Parallel Algorithms: Parallel Search18. Examples of Parallel Algorithms: Parallel Sorting19. Shared-Memory Programming20. Threads in Shared-Memory Programming21. P Threads22. Locks and Semaphores23. Distributed-Memory Programming24. Message Passing25. Map Reduce26. Distributed-Memory Programming with PI27. Google's Map Reduce28. Hadoop29. Other Parallel Programming Systems30. Tread Marks31. Distributed Shared Memory32. Aurora: Scoped Behavior and Abstract Data Types33. S Enterprise: Process Templates
    COMP3139›Distributed-Memory Programming
    Parallel & Distributed ComputingTopic 23 of 33

    Distributed-Memory Programming

    7 minread
    1,256words
    Intermediatelevel

    Distributed-Memory Programming

    In distributed-memory systems, multiple computers or processors work together to solve a problem, but each has its own private memory. Unlike shared-memory systems where all processors can directly access the same memory space, in a distributed-memory system, each processor has its own memory, and communication between processors is necessary to share data.

    Distributed-memory systems are typically used in large-scale systems like clusters, supercomputers, or cloud computing platforms, where multiple nodes (computers) work together to perform parallel processing.

    Distributed-memory programming involves designing software that can effectively manage the exchange of information between processes running on different nodes, ensuring synchronization, load balancing, and minimizing communication overhead.


    Key Concepts in Distributed-Memory Programming

    1. Message Passing
      Since there is no shared memory in distributed systems, processes communicate via message passing. This is a fundamental aspect of distributed-memory programming. Each process sends and receives messages (data) to/from other processes.

    2. Processes vs. Threads
      In distributed-memory systems, programming is often done with processes rather than threads. Each process runs in its own address space and cannot access the memory of other processes directly. Processes need to explicitly send data to each other over a network.

    3. Communication Models
      The two main models for message passing in distributed-memory systems are:

      • Point-to-Point Communication: A message is sent from one process to another directly.
      • Collective Communication: Involves communication between a group of processes (e.g., broadcasting data to all processes or gathering data from all processes).

    1. Message Passing Interface (MPI)

    The Message Passing Interface (MPI) is the standard library for message-passing programming in distributed-memory systems. It provides a set of functions that allow processes to send and receive messages across a distributed network. MPI is widely used for high-performance computing (HPC) applications, where tasks are split across multiple nodes.

    MPI Basics:

    • Point-to-point communication: Sending messages from one process to another.

      • MPI_Send(): Send a message.
      • MPI_Recv(): Receive a message.
    • Collective communication: Communication that involves more than one process.

      • MPI_Bcast(): Broadcast a message to all processes in a communicator.
      • MPI_Gather(): Gather data from all processes and bring it to one process.
      • MPI_Scatter(): Distribute data from one process to all processes.
    • Synchronization: MPI provides mechanisms for synchronizing processes during communication.

      • MPI_Barrier(): Blocks all processes until every process reaches the barrier.

    Example: Simple MPI Program

    #include <mpi.h>
    #include <stdio.h>
    
    int main(int argc, char* argv[]) {
        int rank, size;
    
        // Initialize MPI environment
        MPI_Init(&argc, &argv);
    
        // Get the rank of the process and total number of processes
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
    
        // Example of point-to-point communication: send from rank 0, receive by rank 1
        if (rank == 0) {
            int data = 42;
            MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  // Send data to rank 1
            printf("Process 0 sent data: %d\n", data);
        } else if (rank == 1) {
            int received_data;
            MPI_Recv(&received_data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);  // Receive data from rank 0
            printf("Process 1 received data: %d\n", received_data);
        }
    
        // Finalize MPI environment
        MPI_Finalize();
        return 0;
    }
    

    In this example:

    • MPI_Send() sends data from process 0 to process 1.
    • MPI_Recv() receives the data in process 1.
    • MPI_Comm_rank() retrieves the rank (ID) of the process.
    • MPI_Comm_size() retrieves the total number of processes.

    2. Communication in Distributed-Memory Systems

    Communication in distributed-memory systems is typically achieved through sockets, MPI, or other specialized communication libraries.

    Types of Communication:

    • Unidirectional Communication: One-way communication from a sender to a receiver.
    • Bidirectional Communication: Two-way communication between sender and receiver, where both can send and receive messages.

    Important Considerations:

    • Latency: The time it takes for a message to travel from sender to receiver. High latency can reduce performance, so it is crucial to minimize communication delays.
    • Bandwidth: The amount of data that can be transmitted in a given time. Higher bandwidth allows for faster communication between processes.
    • Network Topology: The structure of the network connecting the processes. Optimizing the topology can reduce communication overhead.

    3. Data Partitioning and Distribution

    In a distributed-memory system, the data used by a program is often partitioned or distributed across multiple nodes. How you partition the data influences the efficiency of the program, as well as the amount of communication required between processes.

    Data Partitioning Strategies:

    • Block Partitioning: Each process gets a contiguous block of data.
    • Cyclic Partitioning: The data is distributed across processes in a round-robin manner.
    • Block-Cyclic Partitioning: A combination of block and cyclic, useful for handling both data locality and load balancing.

    Example of Data Partitioning:

    Consider a simple matrix multiplication where a large matrix is divided into blocks, and each block is processed by a different process.

    • Matrix A: Each process gets a block of rows.
    • Matrix B: Each process gets a block of columns.
    • Matrix C: Each process computes a partial result and then the results are combined at the end.

    Efficient data partitioning minimizes the amount of data that needs to be communicated between processes.


    4. Synchronization and Barrier

    In distributed-memory programming, synchronization is crucial to ensure that processes work together in a coordinated way.

    Common Synchronization Techniques:

    • Barriers: All processes wait at a barrier until all processes reach that point, at which time they can all proceed. This ensures that all processes are synchronized before moving to the next stage.

      • Example in MPI: MPI_Barrier(MPI_COMM_WORLD);
    • Locks and Semaphores: These synchronization mechanisms can also be used in distributed-memory systems, though they are more commonly associated with shared-memory programming. In distributed systems, more sophisticated mechanisms may be used to avoid deadlocks and race conditions.


    5. Challenges in Distributed-Memory Programming

    • Communication Overhead: Communication between nodes in a distributed system is slower than accessing shared memory in a single node. Minimizing the number of messages sent and the size of the messages is crucial for performance.

    • Load Balancing: In distributed systems, it’s essential to ensure that each node performs approximately the same amount of work. Uneven distribution of work can result in some nodes being idle while others are overloaded, reducing overall efficiency.

    • Fault Tolerance: Distributed systems are more prone to failures, so building fault-tolerant systems that can recover from node failures or communication problems is important.

    • Scalability: As the number of processes or nodes increases, managing communication and data distribution becomes more complex. Algorithms need to scale efficiently to ensure that performance does not degrade with larger systems.


    6. Use Cases for Distributed-Memory Programming

    • Supercomputing and High-Performance Computing (HPC): Large-scale scientific simulations and data analysis, like weather prediction, molecular dynamics simulations, and simulations of physical systems, often require distributed-memory systems.

    • Cloud Computing: In cloud computing, distributed-memory systems allow for processing large datasets across multiple machines or nodes. Services like Amazon EC2 and Google Cloud Compute rely on distributed-memory systems to provide parallel processing capabilities.

    • Big Data and Machine Learning: Distributed-memory systems are often used in big data processing (e.g., Hadoop, Spark) where massive datasets are divided across multiple machines for parallel processing.


    Conclusion

    Distributed-memory programming is essential for scaling applications that need to run on multiple machines or processors. It relies on message passing to exchange data between independent processes, and requires careful management of data partitioning, communication, and synchronization to achieve good performance. Tools like MPI are widely used to implement distributed-memory applications in high-performance computing environments, and understanding the challenges of communication overhead, load balancing, and fault tolerance is key to developing efficient and scalable distributed applications.

    Previous topic 22
    Locks and Semaphores
    Next topic 24
    Message Passing

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time7 min
      Word count1,256
      Code examples0
      DifficultyIntermediate