The 13 Dwarves of Computing



Dense linear algebra (e.g. BLAS, ML Principal Component Analysis)

Data is stored in dense matrices or vectors and is typically accessed with unit strides. A typical algorithm would be Cholesky decomposition for symmetric positive-definite systems or Gaussian elimination for non-symmetric systems. In terms of mathematical operations, this covers the classic vector and matrix operations, traditionally divided into Level 1 (vector-vector), Level 2 (matrix-vector), and Level 3 (matrix-matrix) operations.

Applications are various:

  1. Linear algebra: LAPACK, ATLAS
  2. Clustering algorithms / Data-mining: StreamCluster, K-means

These kernels are normally implemented as loops and are in many cases easy to parallelize.
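
As a minimal illustration of this dwarf, the sketch below (Python, assuming NumPy is available) solves a small symmetric positive-definite system with a Cholesky factorization; the matrix and right-hand side are made-up example data.

    import numpy as np

    # Build a small symmetric positive-definite system (made-up example data).
    rng = np.random.default_rng(0)
    M = rng.standard_normal((4, 4))
    A = M @ M.T + 4 * np.eye(4)    # M M^T + 4I is symmetric positive-definite
    b = rng.standard_normal(4)

    # Level 3 style operation: factor A = L L^T (Cholesky).
    L = np.linalg.cholesky(A)

    # Two triangular solves (Level 2 style operations): L y = b, then L^T x = y.
    y = np.linalg.solve(L, b)
    x = np.linalg.solve(L.T, y)

    print(np.allclose(A @ x, b))   # True: x solves A x = b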

Sparse linear algebra (e.g. SpMV, ML Principal Component Analysis)

Data consists largely of zeros, so it is stored in a compressed format and accessed via indexed loads. A typical algorithm would be Conjugate Gradient or another of the Krylov subspace methods, all of which require repeated multiplications by matrices composed primarily of zeros. Computations can be made much more efficient by reordering the matrix so that the non-zero elements cluster around the diagonal.

Applications:

  1. Finite element analysis
  2. Partial differential equations
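
A minimal sketch of this dwarf, assuming SciPy is available: a matrix stored in compressed sparse row (CSR) format and a Conjugate Gradient solve that touches the matrix only through sparse matrix-vector products (SpMV). The 1-D Laplacian is illustrative example data.

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import cg

    # A sparse 1-D Laplacian in CSR format: only the non-zeros are stored,
    # and they are reached through index arrays rather than unit strides.
    n = 100
    A = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
    b = np.ones(n)

    # Conjugate Gradient, a Krylov method built around repeated SpMV.
    x, info = cg(A, b)
    print(info == 0, np.linalg.norm(A @ x - b))   # converged, small residual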

Spectral methods (e.g. FFT)

Data is in the frequency domain and requires a transform to convert it to the spatial or temporal domain. The class is typified by, but not restricted to, the Fast Fourier Transform (FFT). Spectral methods can be used to solve ordinary differential equations (ODEs), partial differential equations (PDEs) and eigenvalue problems involving differential equations.

Applications:

  1. Fluid Dynamics
  2. Quantum Mechanics
  3. Weather Prediction

There are various FFT implementations for each hardware architecture.
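
As an illustration, the sketch below (Python with NumPy assumed) solves the periodic 1-D Poisson equation u'' = f spectrally: the FFT turns differentiation into multiplication by (ik)^2, so the solve becomes a pointwise division in frequency space.

    import numpy as np

    n = 256
    x = np.linspace(0, 2 * np.pi, n, endpoint=False)
    f = np.sin(3 * x)                    # example right-hand side

    k = np.fft.fftfreq(n, d=1.0 / n)     # integer wavenumbers 0, 1, ..., -1
    f_hat = np.fft.fft(f)

    # Divide by (ik)^2 = -k^2; the k = 0 mode is fixed to zero (mean-free choice).
    u_hat = np.zeros_like(f_hat)
    nonzero = k != 0
    u_hat[nonzero] = f_hat[nonzero] / -(k[nonzero] ** 2)

    u = np.fft.ifft(u_hat).real
    print(np.allclose(u, -np.sin(3 * x) / 9))   # exact solution of u'' = sin(3x)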

N-body methods (e.g. Barnes-Hut)

Data consists of discrete particle bodies that interact with each other and/or with the environment. An N-body simulation is a simulation of a dynamical system of particles, usually under the influence of physical forces such as gravity. Pairwise interactions are symmetric (the force A exerts on B is equal and opposite to the force B exerts on A), so each pair need only be computed once, and the state of the whole system is updated after each round. The basic all-pairs algorithm is O(N^2). For larger systems, optimizations are possible by maintaining neighbour lists and leaving far-away particles out of the computation, or approximating them as in Barnes-Hut; selecting the best approach at run time is desirable.

Applications:

  1. Astronomy: cosmology (e.g. formation of galaxies)
  2. Computational Chemistry: Molecular Dynamics (e.g. protein folding), Molecular modeling
  3. Physics: Fluid Dynamics, Plasma Physics
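
The sketch below shows the basic O(N^2) all-pairs step in Python with NumPy (assumed available); the softening length eps and the explicit Euler integrator are simplifying assumptions of this sketch, not features of any particular production code.

    import numpy as np

    def accelerations(pos, mass, eps=1e-3):
        """Direct-summation gravitational accelerations in G = 1 units, O(N^2).
        eps is a softening length that avoids the singularity at zero separation."""
        d = pos[None, :, :] - pos[:, None, :]      # d[i, j] = pos[j] - pos[i]
        r2 = (d ** 2).sum(axis=-1) + eps ** 2      # squared (softened) distances
        inv_r3 = r2 ** -1.5
        np.fill_diagonal(inv_r3, 0.0)              # no self-interaction
        return (mass[None, :, None] * d * inv_r3[:, :, None]).sum(axis=1)

    rng = np.random.default_rng(0)
    pos = rng.standard_normal((100, 3))
    vel = np.zeros((100, 3))
    mass = np.ones(100)

    dt = 1e-3
    for _ in range(10):                            # a few explicit Euler steps
        vel += dt * accelerations(pos, mass)
        pos += dt * vel

Barnes-Hut replaces the inner sum with a tree-based approximation of far-away particles, bringing the cost down to O(N log N).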

Structured grids (e.g. Cactus)

Data is represented on a regular grid in which all elements have the same dimensions; think squares and blocks. Points on the grid are conceptually updated together via equations linking them to their neighbouring points, so there is high spatial locality. Updates may be in place or between two versions of the grid.

Applications:

  1. Image Processing: Gaussian image blurring
  2. Physics Simulations: transient thermal differential equation solver
  3. Finite Element Method
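
A minimal sketch of the pattern in Python with NumPy (assumed): Jacobi relaxation of the 2-D heat equation, where every interior point is updated from its four neighbours and the whole grid advances together between two buffers.

    import numpy as np

    n = 64
    grid = np.zeros((n, n))
    grid[0, :] = 100.0                   # hot top boundary (example data)

    for _ in range(500):
        new = grid.copy()
        # Five-point stencil: each interior point becomes the mean of its
        # four neighbours; slicing keeps the accesses unit-stride and local.
        new[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                                  grid[1:-1, :-2] + grid[1:-1, 2:])
        grid = new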

Unstructured grids (e.g. ABAQUS, FLUENT)

These include all grids that are not regular grids: different elements have different numbers of neighbours. Data is stored together with its locality and connectivity information, and points on the grid are conceptually updated together, but updates require multiple levels of indirection. This algorithmic class has a lot of overlap with graph traversal.

Applications:

  1. Computational fluid dynamics
  2. Belief propagation
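
The toy sketch below (Python with NumPy; the four-node "mesh" is made up) shows the characteristic indirection: connectivity is stored explicitly, so every update gathers values through index lookups rather than fixed strides.

    import numpy as np

    # Each node lists its neighbours explicitly (adjacency lists).
    neighbours = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
    values = np.array([1.0, 0.0, 0.0, 0.0])

    # One smoothing sweep: each node moves halfway toward its neighbour mean.
    new_values = values.copy()
    for node, nbrs in neighbours.items():
        new_values[node] = 0.5 * values[node] + 0.5 * values[nbrs].mean()
    values = new_values
    print(values)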

MapReduce methods (e.g. Monte Carlo)

These are embarrassingly parallel problems, such as Monte Carlo methods, in which calculations are independent of each other, so almost no communication is required between processes. In essence, a single function executes in parallel on independent data sets, and the outputs are eventually combined to form a single result or a small number of results. For huge data sets and compute-intensive algorithms, GPUs can be used in combination with big-data solutions like Hadoop to attack this class of problems.

Applications:

  1. Monte Carlo: computation of pi, portfolio analysis, collision simulation, sequence alignment
  2. Distributed searching

As communication between the processes is minimal, some of these problems can see tremendous speedups when implemented on GPUs.
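
A minimal map-reduce sketch in plain Python (standard library only): a Monte Carlo estimate of pi in which independent worker processes count hits inside the unit circle (map) and the partial counts are summed at the end (reduce).

    import random
    from multiprocessing import Pool

    def count_hits(n_samples):
        """Map phase: fully independent, no communication until the reduction."""
        rng = random.Random()
        hits = 0
        for _ in range(n_samples):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                hits += 1
        return hits

    if __name__ == "__main__":
        tasks = [250_000] * 8                          # 2M samples over 8 tasks
        with Pool() as pool:
            partial = pool.map(count_hits, tasks)      # map
        print(4.0 * sum(partial) / sum(tasks))         # reduce: ~3.14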

Combinational logic (e.g. encryption, ML hashing)

These algorithms generally perform simple operations on very large amounts of data, often exploiting bit-level parallelism. For example, computing cyclic redundancy checks (CRCs) is critical for ensuring data integrity, and RSA encryption is critical for data security.

Applications:

  1. Computing checksums, CRCs
  2. Encryption & decryption
  3. Hashing
  4. Hamming weight
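
Two of the listed operations are sketched below in plain Python: a Hamming weight (population count) via Kernighan's bit trick, and a CRC-32 checksum from the standard-library zlib module.

    import zlib

    def hamming_weight(x):
        """Kernighan's trick: x &= x - 1 clears the lowest set bit,
        so the loop runs once per set bit."""
        count = 0
        while x:
            x &= x - 1
            count += 1
        return count

    print(hamming_weight(0b1011_0101))            # 5 set bits
    print(hex(zlib.crc32(b"example payload")))    # CRC-32 checksum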

Graph traversal (e.g. Quicksort, ML decision trees)

Graph traversal is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way. Tree traversal is a special case of graph traversal. It typically involves indirect table lookups and little computation.

Applications:

  1. Search: depth-first search, breadth-first search, finding all nodes within one connected component
  2. Sorting: quick-sort
  3. Serialization/deserialization
  4. Maze generation
  5. Collision detection
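
A minimal breadth-first search in plain Python over a made-up adjacency-list graph; note that the work is almost entirely table lookups and queue operations, with hardly any arithmetic.

    from collections import deque

    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["E"], "E": []}

    def bfs(start):
        visited, order = {start}, []
        queue = deque([start])
        while queue:
            node = queue.popleft()           # visit nodes level by level
            order.append(node)
            for nbr in graph[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
        return order

    print(bfs("A"))   # ['A', 'B', 'C', 'D', 'E']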

Dynamic programming

This is an algorithmic technique that computes solutions by solving simpler overlapping subproblems. Many dynamic programming problems operate by filling in a grid that represents the solution space, where one location in the grid holds the final answer.

Applications:

  1. Graph problems: Floyd-Warshall all-pairs shortest paths, Bellman-Ford algorithm
  2. Sequence alignment: Needleman-Wunsch, Smith-Waterman

Better performance is often achieved when the implementation is tuned at run time, since the size and shape of the subproblem grid depend on the input.
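
The grid-filling pattern is sketched below in plain Python with Levenshtein edit distance (the same pattern underlies Needleman-Wunsch and Smith-Waterman): cell (i, j) holds the answer for the prefixes a[:i] and b[:j], and the final answer sits in the corner cell.

    def edit_distance(a, b):
        rows, cols = len(a) + 1, len(b) + 1
        d = [[0] * cols for _ in range(rows)]
        for i in range(rows):
            d[i][0] = i                      # delete everything
        for j in range(cols):
            d[0][j] = j                      # insert everything
        for i in range(1, rows):
            for j in range(1, cols):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution
        return d[-1][-1]

    print(edit_distance("kitten", "sitting"))   # 3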

Backtrack and branch-and-bound

These involve solving search and global optimization problems over intractably large spaces. Branch-and-bound algorithms work on the divide-and-conquer principle: the search space is subdivided into smaller subregions (“branching”), and bounds are computed on all the solutions contained in each subregion under consideration. Subregions whose bounds show they cannot contain an optimal solution are pruned, and the optimum is found by recursing on the rest.

Put another way, these methods build up candidate solutions step by step and eliminate invalid ones as early as possible, since there is no overview of all possible solutions at the outset. Backtracking is effectively a depth-first search of the problem space; it resembles dynamic programming in that both explore a structured solution space, but the accent here is on the step-wise creation and pruning of candidate solutions rather than on reusing overlapping subproblems.

Applications:

  1. Puzzles: N-queens, crosswords, verbal arithmetic, Sudoku, Peg Solitaire
  2. Traveling salesman
  3. Knapsack, subset sum, and partition problems
  4. Integer Linear Programming
  5. Boolean Satisfiability
  6. Combinatorial Optimisation
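
A minimal backtracking sketch in plain Python for the N-queens puzzle from the list above: one queen is placed per row, and any branch whose column or diagonal is already attacked is pruned immediately.

    def n_queens(n, cols=(), diag1=frozenset(), diag2=frozenset()):
        """Count N-queens solutions by depth-first search with pruning."""
        row = len(cols)
        if row == n:
            return 1                         # a complete, valid placement
        count = 0
        for col in range(n):
            if col not in cols and row + col not in diag1 and row - col not in diag2:
                count += n_queens(n, cols + (col,),
                                  diag1 | {row + col}, diag2 | {row - col})
        return count

    print(n_queens(8))   # 92 solutions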

Probabilistic graphical models (e.g. Neural networks, Bayesian networks)

A probabilistic graphical model is a graph that combines uncertainty (probabilities) and logical structure (independence constraints) to compactly represent complex, real-world phenomena. Nodes represent random variables and edges represent conditional dependencies between them.

Applications:

  1. Bayesian networks: belief networks, probabilistic networks, causal networks, knowledge maps
  2. Hidden Markov models
  3. Neural networks
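
A toy two-node Bayesian network in plain Python (Rain -> WetGrass, with made-up probabilities) illustrates the idea: the edge encodes a conditional dependency, and a posterior is obtained by enumerating the joint distribution.

    # P(Rain) and P(WetGrass | Rain); all numbers are invented for illustration.
    p_rain = {True: 0.2, False: 0.8}
    p_wet_given_rain = {True: {True: 0.9, False: 0.1},
                        False: {True: 0.2, False: 0.8}}

    # Query P(Rain | WetGrass = True) by enumerating the joint distribution.
    joint = {r: p_rain[r] * p_wet_given_rain[r][True] for r in (True, False)}
    evidence = sum(joint.values())           # P(WetGrass = True)
    print(joint[True] / evidence)            # posterior, approx. 0.529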

Finite state machine

A mathematical model of computation used to design both computer programs and sequential logic circuits. It is conceived as an abstract machine that can be in exactly one of a finite number of states at any given time, and that changes state in response to inputs.

Applications:

  1. Video Decoding, Parsing, Compression
  2. Data Mining
  3. Finding Repeating Patterns
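
A small sketch in plain Python: a finite state machine whose states record progress through the bit pattern "101", used here to count (possibly overlapping) occurrences in a bit string. The state names are invented for illustration.

    # States record how much of the pattern "101" has been seen so far.
    transitions = {
        ("start", "1"): "saw_1",    ("start", "0"): "start",
        ("saw_1", "0"): "saw_10",   ("saw_1", "1"): "saw_1",
        ("saw_10", "1"): "saw_1",   ("saw_10", "0"): "start",
    }

    def count_pattern(bits):
        state, matches = "start", 0
        for bit in bits:
            if state == "saw_10" and bit == "1":
                matches += 1                 # completed a "101"
            state = transitions[(state, bit)]
        return matches

    print(count_pattern("1010101"))   # 3 overlapping matches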