Parallelizing matrix multiplication to invert lower triangular matrices, and the Floyd-Warshall algorithm to solve the all-pairs shortest paths problem. These projects saw a speedup of 500x over their serial counterparts for 2048x2048 matrices.
en-tropyc/CUDA
simple_examples/README
========================
Subdirectories
---------------
1/ vector_add.cu -> vector addition
2/ matrix_mul.cu -> matrix multiplication
3/ moveArrays.cu
CUDA, Supercomputing for the Masses: Part 1
4/ incrementArray.cu
CUDA, Supercomputing for the Masses: Part 2
5/ reverseArray_multiblock.cu
CUDA, Supercomputing for the Masses: Part 3
Error handling and global memory performance limitations
6/ arrayReversal_multiblock_fast.cu
CUDA, Supercomputing for the Masses: Part 3
Error handling and global memory performance limitations
7/ memset.cu -> memory bandwidth test
8/ simpleCUDA.cu
This code sample demonstrates a simple linear algebra operation in
CUDA, single-precision axpy (saxpy):
y[i] = alpha*x[i] + y[i] for x, y in R^N and a scalar alpha
http://mags.acm.org/queue/20080304/
9/ atomic2.cu -> compute the index of the first nonzero entry of an array
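
As a rough illustration of the operations described in items 8 and 9, here is a minimal CUDA sketch. It is not the code from simpleCUDA.cu or atomic2.cu: the file name sketch.cu, the kernel names saxpy and first_nonzero, the CHECK macro, and the input values are all assumptions made for this example. The first kernel applies y[i] = alpha*x[i] + y[i] with one thread per element; the second has every thread that sees a nonzero entry propose its index through atomicMin, so the smallest index survives.

```cuda
// sketch.cu (hypothetical file name, not part of this repository)
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails (helper assumed for this sketch).
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "CUDA error: %s (line %d)\n",         \
                    cudaGetErrorString(err_), __LINE__);          \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Single-precision axpy: y[i] = alpha * x[i] + y[i], one thread per element.
__global__ void saxpy(int n, float alpha, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = alpha * x[i] + y[i];
}

// Each thread holding a nonzero entry proposes its own index;
// atomicMin keeps the smallest one. *result must start at n ("not found").
__global__ void first_nonzero(int n, const float *a, int *result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && a[i] != 0.0f)
        atomicMin(result, i);
}

int main(void)
{
    const int n = 1 << 20;
    const int zeros = 1000;          // leading entries of y that cancel to zero
    const float alpha = 2.0f;
    const size_t bytes = n * sizeof(float);

    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    if (!hx || !hy) {
        fprintf(stderr, "host allocation failed\n");
        return EXIT_FAILURE;
    }
    for (int i = 0; i < n; ++i) {
        hx[i] = 1.0f;
        hy[i] = (i < zeros) ? -alpha : 3.0f;  // alpha*1 + (-alpha) == 0
    }

    float *dx = NULL, *dy = NULL;
    int *dresult = NULL;
    int hresult = n;                 // sentinel: no nonzero entry found
    CHECK(cudaMalloc((void **)&dx, bytes));
    CHECK(cudaMalloc((void **)&dy, bytes));
    CHECK(cudaMalloc((void **)&dresult, sizeof(int)));
    CHECK(cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dresult, &hresult, sizeof(int), cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, alpha, dx, dy);
    CHECK(cudaGetLastError());
    first_nonzero<<<blocks, threads>>>(n, dy, dresult);
    CHECK(cudaGetLastError());

    // Blocking copies: they also wait for both kernels to finish.
    CHECK(cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&hresult, dresult, sizeof(int), cudaMemcpyDeviceToHost));

    printf("y[%d] = %f, first nonzero index = %d\n", zeros, hy[zeros], hresult);

    CHECK(cudaFree(dx));
    CHECK(cudaFree(dy));
    CHECK(cudaFree(dresult));
    free(hx);
    free(hy);
    return 0;
}
```

Assuming the CUDA toolkit is installed, this should build with `nvcc sketch.cu -o sketch`. With the inputs chosen here, the first 1000 entries of y cancel to zero after the saxpy step, so the reported index should be 1000. Using atomicMin on a single global counter is the simplest approach; a per-block reduction followed by one atomic per block would likely contend less on large arrays.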