CS 473/573: Parallel Computing

Winter 2015

Meeting Times

Lect: 12:00 - 12:50 MTuW, HB 112

Labs: 12:00 - 12:50 Th, HB 203


Instructor

Dr. Razvan Andonie, HB 219-B, Office hours


Teaching Assistant

Joe Lemley


Textbook

Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2004



Course Description

Parallel computation is becoming pervasive at all levels of computing, from massively parallel supercomputers used in large-scale computational science, to multiprocessor servers supporting transaction processing and the World Wide Web. The major issues raised in each of the core areas of computer science (e.g., algorithms, systems, languages, architecture) become even more interesting when considered in the context of parallel computing. Hence, this course challenges students to apply in a new context the concepts and tools they have studied in earlier computer science courses. This hands-on course will also introduce students to a topic of fundamental importance to a wide variety of application areas.

Topics & Learning Outcomes

Motivations for parallel processing

Parallel computer architectures

Multicore programming with OpenMP

Programming on massively parallel architectures (GPUs) with CUDA

Message passing programming with MPI

Performance analysis

Fundamental algorithms: backtracking, branch-and-bound, divide and conquer, sorting, searching

Applications: data mining, artificial intelligence, computational intelligence, scientific computing

On completion of this course, the student will have: i) designed and analyzed algorithms that execute efficiently on parallel computers, ii) implemented distributed programs using the Message Passing Interface (MPI), iii) implemented multicore programs using OpenMP, and iv) implemented programs for GPUs in CUDA.


Grading

Exams (2, 40% each)

Final Project

The HWs and lab projects are not graded.

Grade Distribution

95 - 100   A

90 - 94     A-

87 - 89     B+

83 - 86     B

80 - 82     B-

77 - 79     C+

73 - 76     C

70 - 72     C-

67 - 69     D+

63 - 66     D

60 - 62     D-

0  - 59      F


If you must miss an exam, contact your instructor before the exam to schedule a make-up. Late submission of assignments is generally not accepted, and no partial credit will be offered for late assignments.

Lectures & Projects

The slides for lectures, additional materials (including several MPI examples from Quinn's textbook), and the projects can be found in the shared directory.


We are going to use MPICH2, OpenMP, and CUDA + Thrust.

1. MPICH2 is an implementation of the Message-Passing Interface (MPI). It is distributed as source code (with an open-source, freely available license) for several platforms, including Linux and Windows. Download MPICH and have a look at the MPICH documentation. Introduction to MPI is an online course created by the PACS Training Group; it can be downloaded and printed. The MPI standard itself is published by the MPI Forum.

2. The OpenMP C and C++ application program interface lets you write applications that effectively use multiple processors. Visual C++ supports the OpenMP 2.0 standard. If you want to program with OpenMP under Windows, read Microsoft's tutorial OpenMP and C++. More information is available on the OpenMP website.

3. Information on CUDA (the CUDA Toolkit and guides) can be downloaded from NVIDIA. CUDA runs only on NVIDIA GPUs. The GPUs in HB 209 are GeForce 9500 GT cards, which are CUDA compatible.

4. Thrust is a C++ template library for CUDA based on the Standard Template Library (STL). Thrust lets you implement high-performance parallel applications with minimal programming effort through a high-level interface that is fully interoperable with CUDA C. If you are using CUDA 4.0 or later, Thrust is already installed on your system.

5. Theano is a Python library that lets you define, optimize, and evaluate mathematical expressions, especially ones involving multi-dimensional arrays.

Lab Hardware in HB 203: Computers have Intel Core i5 quad-core processors, on which you can practice OpenMP multicore programming, and NVIDIA GeForce 9500 GT GPUs, which are CUDA compatible.


Useful Links

Parallel Computing Developer Center: Microsoft's parallel computing platform.

Nan's Parallel Computing Page: a list of links related to parallel computing.

Top 500 Supercomputer Sites.

Additional Reading

The Landscape of Parallel Computing Research: A View from Berkeley. A very important technical report.

Foster, I., Designing and Building Parallel Programs, Addison-Wesley, 1995 (available online).

Pacheco, P. S., An Introduction to Parallel Programming, Elsevier, 2011.

Wilkinson, B. and M. Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Prentice Hall, 2005.

Gropp, W., E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, MIT Press, 1999.

Grama, A., A. Gupta, G. Karypis, and V. Kumar, Introduction to Parallel Computing, 2nd ed., Addison-Wesley, 2003.

Ghosh, S., Distributed Systems: An Algorithmic Approach, Chapman & Hall/CRC, 2007. An excellent textbook on distributed systems, with an emphasis on algorithms.

Chapman, B., G. Jost, and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming, MIT Press, 2008.

Kirk, D. B. and W.-m. W. Hwu, Programming Massively Parallel Processors: A Hands-on Approach, Morgan Kaufmann, 2010.

Sanders, J. and E. Kandrot, CUDA by Example, Addison-Wesley, 2011.

Course Schedule










Topic | Reading | Assignments
Motivation and History | Ch. 1 |
Parallel Algorithm Design | |
Parallel Algorithm Design: Boundary Value Problem, Finding the Maximum | | HW 1: Ex. 3.12
Shared-Memory Programming with OpenMP | Ch. 17 | HW 1 due; HW 2: Ex. 3.13
No School, M. L. King Day | |
Shared-Memory Programming with OpenMP | Ch. 17 | HW 2 due; HW 3: Ex. 3.14
Shared-Memory Programming with OpenMP | Ch. 17 | HW 3 due; HW 4: Ex. 3.15
 | CUDA Resources on Neve | HW 4 due
 | Thrust Quick Start Guide |
Faculty Development Day | |
Message-Passing Programming | Ch. 4 |
Message-Passing Programming | Ch. 4 |
The Sieve of Eratosthenes | Ch. 5 |
The Sieve of Eratosthenes | Ch. 5 |
Floyd's Algorithm; Discuss HW 5 | Ch. 6 | HW 5: Ex. 6.1
Exam I | |
No School, Presidents Day | |
Performance Analysis | Ch. 7 | HW 5 due
Matrix-Vector Multiplication | Ch. 8 |
Document Classification | Ch. 9 |
Document Classification; Discuss Final Projects | Ch. 9 |
Combinatorial Search: Divide and Conquer, Backtracking | Ch. 16 |
Combinatorial Search: Branch and Bound | Ch. 16 |
Combinatorial Search: Game Trees, Alpha-Beta Search | Ch. 16 |
Sorting: Quicksort | |
Sorting: Hyperquicksort, Parallel Sorting by Regular Sampling | |
Monte Carlo Methods | Ch. 10 |
Monte Carlo Methods | Ch. 10 |
Final Exam (noon – 1:30) | |

Laboratory Schedule



Lab Activity | Item Due
Read OpenMP Setup on Neve. |
Project 1 (OpenMP): Ex. 17.12 |
Read CUDA Setup on Neve. |
In the NVIDIA folder, browse the CUDA samples, read the Getting Started Guide, and run the examples. |
Project 2 (CUDA): implement Project 1 in CUDA. | Project 1 due
Project 2 | Project 2 due
Read MPICH on Windows 7 with MS Visual Studio 2010. |
Project 3 (MPI): on Neve | Project 3 due
Project 4 (MPI): Ex. 4.4; use program SATI.c on Neve |
Project 5 (MPI): Ex. 4.8 | Project 4 due
Project 6 (MPI): Ex. 6.11 (a, c); use program Sieve.c on Neve | Project 5 due
Project 7 (MPI): on Neve (difficult!) |
Final Project: on Neve | Project 6 due
Final Project | Project 7 due
Final Project | Final Project due

Honor Code: All work turned in for credit, including exams and all components of the project, must be the work of the student whose name is on it. For all project components, students may receive assistance from individuals other than the instructor only to ascertain the cause of errors: you can get help figuring out why something doesn't work, but not, except from the instructor or TA, help figuring out how to make it work. All solutions turned in for credit must be your individual work and should demonstrate your problem-solving skills, not someone else's. Help each other understand and debug the programming assignments, but write the code for your programs yourself; writing it yourself is the only way you will learn. Since everyone is writing their own code, no two programs should be the same or so similar that one could be converted to the other by a simple mechanical transformation (e.g., changing variable names and comments). I consider that plagiarism and a violation of the academic code. The following text should appear on all assignments: "I pledge that I have neither given nor received help from anyone other than the instructor for all program components included here."

First violation: the students involved must meet with the instructor; in most cases, the grade will be split between the authors of the copied programs. Second violation: the students will receive no credit for the assignment, an incident letter will be placed on file in the Computer Science Department, and the matter will be referred to the Computer Science Department Chair.

Class Attendance: Class attendance is expected and recorded.

ADA Statement: Students with disabilities who wish to set up academic adjustments in this class should give me a copy of their "Confirmation of Eligibility for Academic Adjustment" from the Disability Support Services Office as soon as possible so we can discuss how the approved adjustments will be implemented in this class. Students without this form should contact the Disability Support Services Office, Bouillon 205, or 963-2171.

Caveat: The schedule and procedures for this course are subject to change. It is the student's responsibility to learn of and adjust to changes.