Multicore Parallelism
The power wall (dynamic power scales roughly as P ≈ C·V²·f, and since voltage rises with frequency, power grows close to the cube of clock speed) forced chip designers to switch to multiple processors (cores) instead of one ever-faster core
New challenges
- How to utilize all the cores
- How to keep all the cores busy
Calculating Performance
- Potential speedup: the speedup if 100% of the program were parallelizable (equal to the number of cores)
- Use Amdahl's Law for the actual speedup, or solve it for the parallel fraction needed to reach a given speedup: Speedup = 1 / ((1 − p) + p/n), where p is the parallelizable fraction and n is the number of cores
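The formulas above can be sketched numerically (a minimal illustration of Amdahl's Law; the helper names are my own):

```python
# Amdahl's Law: with parallel fraction p and n cores,
# speedup = 1 / ((1 - p) + p / n).
def speedup(p, n):
    """Overall speedup when fraction p runs perfectly parallel on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

def fraction_needed(target, n):
    """Parallel fraction p required for a target speedup on n cores
    (solve target = 1 / ((1 - p) + p / n) for p)."""
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / n)

print(round(speedup(0.9, 8), 2))        # 90% parallel on 8 cores -> 4.71
print(round(fraction_needed(4, 8), 3))  # need p ~= 0.857 for 4x on 8 cores
```

Note that even a 90%-parallel program gets nowhere near 8x on 8 cores: the serial 10% dominates.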
Types of Memory
- Centralized Shared-Memory
- Properties
- Processors mutate shared memory
- Each processor has its own private caches
- Processors share I/O
- Good for 4-16 processors
- Main memory has a symmetric relationship to all processors and uniform access time from any processor
- SMP: symmetric shared-memory multiprocessors
- UMA: uniform memory access architectures
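A minimal sketch of the shared-memory model (illustrative only, with Python threads standing in for cores): every thread reads and writes the same address space, so access to shared data must be coordinated by the programmer.

```python
import threading

counter = 0                      # one location in the shared address space
lock = threading.Lock()          # coordination is the programmer's job

def worker():
    global counter
    for _ in range(10_000):
        with lock:               # without the lock, updates can be lost
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                   # 4 workers x 10,000 increments -> 40000
```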
- Distributed-Memory
- Properties
- Each processor gets its own local memory and I/O
- Processors have to communicate with each other explicitly (message passing)
- Scales to very large numbers of processors
- Shared address space
- Same physical address refers to same memory location
- DSM: Distributed Shared-Memory Architectures
- NUMA: Non-uniform memory access since the access time depends on the location of the data
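By contrast, a distributed-memory sketch (illustrative, with Python processes standing in for nodes): each process owns private memory, and results are combined only by sending messages.

```python
from multiprocessing import Process, Queue

def node(rank, outbox):
    local = list(range(rank * 4, rank * 4 + 4))  # memory private to this node
    outbox.put(sum(local))       # explicit message; no shared loads/stores

if __name__ == "__main__":
    results = Queue()
    nodes = [Process(target=node, args=(r, results)) for r in range(4)]
    for p in nodes: p.start()
    for p in nodes: p.join()
    print(sum(results.get() for _ in nodes))     # combine partial sums -> 120
```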
- Cache coherence problems
- With a write-back cache, a modified value stays in the cache and is not written to main memory until the line is evicted
- So other processors can read a stale value from main memory (or keep stale copies in their own caches)
- Solutions
- Centralized: Snoopy based protocol
- Each cache watches (snoops on) the shared bus to see the other processors' memory traffic
- Distributed: Directory based protocol
- A directory tracks which caches hold a copy of each memory block, so messages go only to the caches that need them
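The stale-read problem and the snoopy fix can be modeled in a few lines (a toy sketch, not a real MSI implementation; all class and method names are mine):

```python
class Cache:
    """Write-back cache; attaches to a snooping bus if one is given."""
    def __init__(self, memory, bus=None):
        self.memory, self.bus = memory, bus
        self.lines = {}                          # addr -> [value, dirty]
        if bus:
            bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:               # miss
            if self.bus:                         # snoop: force dirty owners
                self.bus.flush_if_dirty(addr, self)  # to write back first
            self.lines[addr] = [self.memory[addr], False]
        return self.lines[addr][0]

    def write(self, addr, value):
        if self.bus:                             # snoop: invalidate other copies
            self.bus.invalidate(addr, self)
        self.lines[addr] = [value, True]         # write-back: memory untouched

class Bus:
    """Shared bus: every attached cache snoops the others' traffic."""
    def __init__(self):
        self.caches = []
    def attach(self, cache):
        self.caches.append(cache)
    def invalidate(self, addr, owner):
        for c in self.caches:
            if c is not owner:
                c.lines.pop(addr, None)          # drop now-stale copies
    def flush_if_dirty(self, addr, requester):
        for c in self.caches:
            if c is not requester and c.lines.get(addr, [0, False])[1]:
                c.memory[addr] = c.lines[addr][0]   # write dirty data back
                c.lines[addr][1] = False

# Without snooping: P1 keeps reading a stale copy after P0's write
mem = {0x10: 1}
p0, p1 = Cache(mem), Cache(mem)
p1.read(0x10)                 # P1 caches the old value 1
p0.write(0x10, 42)            # 42 stays dirty in P0's cache only
print(p1.read(0x10))          # -> 1 (stale!)

# With snooping: the write invalidates P1's copy, and P1's re-read
# forces P0 to write the dirty line back before memory is read
mem2 = {0x10: 1}
bus = Bus()
s0, s1 = Cache(mem2, bus), Cache(mem2, bus)
s1.read(0x10)
s0.write(0x10, 42)
print(s1.read(0x10))          # -> 42 (coherent)
```

The key design point: the bus is a broadcast medium, so every write is visible to every cache, which is exactly why snooping stops scaling past a modest core count and directory protocols take over.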