
An Application of Staging and Dependent Types to Sparse Matrix Codes

Efficient routines for both dense and sparse linear algebra have been studied extensively. In dense linear algebra, BLAS and LAPACK have standardized the interfaces for many kinds of operations, and most hardware vendors provide highly optimized implementations of these routines. Unfortunately, no comparable standard exists for sparse matrices.

Sparse matrices are far trickier to compute with than dense matrices. Adding two dense matrices, for example, is simple: both operands have the same dimensions and occupy the same amount of memory, and their sum is a third matrix with the same dimensions and memory footprint. With sparse matrices, by contrast, all three may require different amounts of memory; two matrices with three nonzeros each may sum to a matrix with anywhere from three to six nonzeros. This uncertainty prevents us from optimizing many routines. However, with a clever use of dependent types it may be possible to determine some of this information at compile time.

Before we begin, let us define two common ways to store a sparse matrix. Consider the following matrix:

1.1 2.2 0   0   0
0   3.3 0   0   4.4
5.5 0   6.6 0   0
0   0   0   7.7 0
0   8.8 0   0   0
The triplet format uses three arrays to store the matrix: one holds the row indices, one the column indices, and the last the actual values. For our example above, we would store the matrix as
i j x
-------
0 0 1.1
2 0 5.5
0 1 2.2
1 1 3.3
4 1 8.8
2 2 6.6
3 3 7.7
1 4 4.4
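To make the format concrete, here is a minimal Haskell sketch of the triplet representation holding the example matrix. The type and field names are our own invention, not taken from any library, and plain lists stand in for the unboxed arrays a real implementation would use.

-- Triplet (coordinate) storage: three parallel arrays.
data Triplet = Triplet
  { rowInd :: [Int]     -- row index of each stored entry
  , colInd :: [Int]     -- column index of each stored entry
  , values :: [Double]  -- numerical value of each stored entry
  } deriving Show

-- The example matrix above, with entries sorted by column.
example :: Triplet
example = Triplet
  { rowInd = [0, 2, 0, 1, 4, 2, 3, 1]
  , colInd = [0, 0, 1, 1, 1, 2, 3, 4]
  , values = [1.1, 5.5, 2.2, 3.3, 8.8, 6.6, 7.7, 4.4]
  }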
Notice that the order in which we store the elements is not unique, although for performance reasons it is important to sort by either the row or the column indices. This format is simple to use but slightly inefficient. A more compact representation is compressed column form. As before, there are three arrays. The first stores the row indices, ordered by column. The second stores the actual values. The third stores, for each column, the position in the row-index array where that column begins. For example, our above matrix is stored as
i p x
-------
0 0 1.1
2   5.5
0 2 2.2
1   3.3
4   8.8
2 5 6.6
3 6 7.7
1 7 4.4
  8
The advantage of this format is that it uses slightly less memory than the triplet format: the column-pointer array has only n + 1 entries rather than one per nonzero. It also ensures that our indices are sorted in a sane way, so many algorithms will be faster.
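Continuing the sketch, here is the same matrix in compressed column form, with a helper that extracts one column; again the names are ours and lists stand in for arrays.

-- Compressed column storage: row indices and values ordered by
-- column, plus a pointer array marking where each column begins.
data CSC = CSC
  { rowIx  :: [Int]     -- row index of each entry, ordered by column
  , colPtr :: [Int]     -- n + 1 entries; colPtr !! j is where column j begins
  , vals   :: [Double]  -- numerical values, in the same order as rowIx
  } deriving Show

exampleCSC :: CSC
exampleCSC = CSC
  { rowIx  = [0, 2, 0, 1, 4, 2, 3, 1]
  , colPtr = [0, 2, 5, 6, 7, 8]   -- the final entry is the nonzero count
  , vals   = [1.1, 5.5, 2.2, 3.3, 8.8, 6.6, 7.7, 4.4]
  }

-- The entries of column j sit between colPtr !! j and colPtr !! (j + 1).
column :: CSC -> Int -> [(Int, Double)]
column m j = zip (slice (rowIx m)) (slice (vals m))
  where
    start    = colPtr m !! j
    end      = colPtr m !! (j + 1)
    slice xs = take (end - start) (drop start xs)

For example, column exampleCSC 1 yields [(0,2.2),(1,3.3),(4,8.8)], the nonzeros of the second column.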

If we wanted to type matrices in compressed column form, how much information would we need? At first glance, we would need to store the size of the matrix (m, n) and the number of nonzero elements. Now consider algebraic operations on these matrices, such as addition. If we add two matrices together, what is the resulting type? How many nonzeros will it have? That, of course, depends on the specific sparsity patterns. For example, ignoring the values of the individual elements,

[x 0 x]   [x x x]   [x x x]
[x 0 0] + [x 0 0] = [x 0 0]
[0 x 0]   [x 0 0]   [x x 0]
For our purposes we will ignore the possibility that elements cancel and become exactly zero; in practice, cancellation is impossible to predict reliably in the presence of roundoff error. Thus, in order to type sparse matrices, we must track their size, their number of nonzeros, and their sparsity pattern.
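To see how mechanical this bookkeeping is, here is a small Haskell sketch that treats a sparsity pattern as a set of positions; the names are ours, and cancellation is again ignored.

import qualified Data.Set as Set

-- A sparsity pattern: the set of (row, column) positions that may
-- hold a nonzero value.
type Pattern = Set.Set (Int, Int)

-- Ignoring cancellation, the pattern of a sum is the union of the
-- operands' patterns, so the result's nonzero count is known before
-- any arithmetic is performed.
addPattern :: Pattern -> Pattern -> Pattern
addPattern = Set.union

-- The 3-by-3 example above:
a, b :: Pattern
a = Set.fromList [(0,0), (0,2), (1,0), (2,1)]
b = Set.fromList [(0,0), (0,1), (0,2), (1,0), (2,0)]

-- Set.size (addPattern a b) evaluates to 6, matching the six
-- nonzeros in the result pattern shown above.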

Why would we want to do this? When writing sparse algorithms, we frequently need to iterate over the nonzero elements. If we do not know their count a priori, can our compilers optimize as effectively? With staging and dependent types, we should be able to determine these values at compile time and optimize accordingly.
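One way to see what such types might look like is GHC's type-level naturals. The following is a hypothetical encoding of our own, not an existing library: the dimensions and nonzero count become phantom type parameters, so they are available at compile time.

{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- A sparse matrix indexed by its dimensions and its nonzero count.
-- The payload is elided; only the type-level information matters here.
newtype Matrix (m :: Nat) (n :: Nat) (nnz :: Nat) = Matrix [Double]

-- The nonzero count can be read off the type alone, so a staged code
-- generator could unroll or specialize loops over the nonzeros.
nonzeroCount :: forall m n k. KnownNat k => Matrix m n k -> Integer
nonzeroCount _ = natVal (Proxy :: Proxy k)

Note that addition cannot be given a type like Matrix m n k1 -> Matrix m n k2 -> Matrix m n k3 for any k3 computed from k1 and k2 alone: as the example above shows, the result's nonzero count depends on the patterns themselves, so a full encoding must index the type by the sparsity pattern.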

Of course, determining the sparsity pattern of more complicated operations is difficult, but possible. For example, what is the resulting sparsity pattern of a matrix multiplication? What about a Cholesky factorization?
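For multiplication, at least, the symbolic computation is a direct transcription of the definition. A sketch under the same conventions as before (and, for brevity, quadratic in the entry counts, where real codes would proceed column by column):

import qualified Data.Set as Set

-- Pattern of C = A * B: position (i, j) may be nonzero whenever some
-- k has (i, k) in A's pattern and (k, j) in B's. Cancellation is
-- again ignored.
mulPattern :: Set.Set (Int, Int) -> Set.Set (Int, Int) -> Set.Set (Int, Int)
mulPattern a b =
  Set.fromList
    [ (i, j) | (i, k)  <- Set.toList a
             , (k', j) <- Set.toList b
             , k == k' ]

The Cholesky case is harder: the pattern of the factor is characterized by the elimination tree of the input, but that too can be computed symbolically, before any floating-point work is done.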

