Tuesday, October 2nd, 2012
Colloquium – 4:15 p.m.
Big Data and Data Analytics: The Kronecker Product
Abstract

High-dimensional modeling is becoming ubiquitous across the sciences and engineering because of advances in sensor and storage technology. Computationally oriented researchers no longer have to avoid what were once intractably large, tensor-structured data sets. Collecting and storing large data sets of sensor data, social network data, and fMRI medical data is easier than ever with commodity terabyte disks. This data explosion creates deep research challenges that require scalable, tensor-based algorithms.

A tensor is an element of a tensor product of vector spaces. Up to a choice of bases, it can be represented as a multidimensional array of numerical values upon which algebraic operations generalizing matrix operations can be performed. In this representation, the entries of a k-th order tensor are identified by a k-tuple of subscripts, e.g., A(i1, i2, i3, i4). A matrix is a second-order tensor, a vector is a first-order tensor, and a scalar is a tensor of order zero.

Multidimensional FFT computations have much in common with tensor contractions: both are (a) rich in matrix-vector products, (b) highly parallelizable, and (c) plagued by data-locality obstacles. Moreover, in both computational settings the Kronecker product plays a prominent role. Currently, the design of high-performance tensor software requires (a) the ability to reason at the index level about the constituent contractions and the order of their evaluation, and (b) the ability to reason at the block-matrix level in order to expose fast, underlying Kronecker-product-like operations. Progress in numerical multilinear algebra will be inhibited without languages and systems that provide high-level support for this style of computational thinking.
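To make the Kronecker product's role in multidimensional FFTs concrete, the following NumPy sketch (an illustration added here, not material from the talk) verifies the classical identity vec(F_m X F_n) = (F_n ⊗ F_m) vec(X), which expresses a 2-D DFT as a single Kronecker-structured matrix-vector product:

```python
import numpy as np

def dft_matrix(n):
    # n x n DFT matrix: F[j, k] = exp(-2*pi*i*j*k / n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n)

m, n = 4, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((m, n))

# 2-D FFT via the library routine.
Y = np.fft.fft2(X)

# Same result via one Kronecker-structured matrix-vector product:
# vec(F_m X F_n) = (F_n kron F_m) vec(X), with column-major vec.
K = np.kron(dft_matrix(n), dft_matrix(m))
y = K @ X.flatten(order="F")

assert np.allclose(Y.flatten(order="F"), y)
```

In practice one never forms the m·n × m·n matrix K explicitly; fast FFT algorithms exploit exactly this Kronecker structure to apply it in O(mn log mn) operations instead.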
This talk will present a new theorem that optimizes multiple Kronecker products on advanced, heterogeneous high-performance computers. It will also reveal the need for new hardware to support the calculation of integer addresses, which relies on an outer-plus parallel integer operation.
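The flavor of that address arithmetic can be seen in the standard entry-level identity for the Kronecker product. The sketch below is an illustration of conventional Kronecker indexing only, not of the talk's theorem: for A (m × n) and B (p × q), (A ⊗ B)[i·p + k, j·q + l] = A[i, j]·B[k, l], so the flat row addresses i·p + k arise as an outer sum of scaled integer index vectors — the kind of parallel integer operation the abstract alludes to.

```python
import numpy as np

A = np.arange(6).reshape(2, 3) + 1.0   # m x n
B = np.arange(1, 5).reshape(2, 2) * 1.0  # p x q
m, n = A.shape
p, q = B.shape

# Row addresses i*p + k for all (i, k) pairs: an outer sum of
# the scaled row indices of A with the row indices of B.
rows = np.add.outer(np.arange(m) * p, np.arange(p)).ravel()
cols = np.add.outer(np.arange(n) * q, np.arange(q)).ravel()
assert np.array_equal(rows, np.arange(m * p))
assert np.array_equal(cols, np.arange(n * q))

# Build A kron B entry by entry from the address identity.
K = np.zeros((m * p, n * q))
for i in range(m):
    for j in range(n):
        for k in range(p):
            for l in range(q):
                K[i * p + k, j * q + l] = A[i, j] * B[k, l]

assert np.allclose(K, np.kron(A, B))
```

Chaining several Kronecker factors compounds these index computations, which is why hardware support for the outer-sum address pattern matters at scale.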
Speaker Biography

Dr. Mullin has over 30 years' experience in tensor/array-based high-performance computing. Her degrees in mathematics, physics, and computer science are complemented by professional experience in algorithms, languages, compilers, operating systems, and computer hardware in industry, government, and academia. She is an NSF Presidential Fellow and served at both NSF and DOE as a Program Officer promoting and funding tensor-based research. Her dissertation research, A Mathematics of Arrays and the Psi Calculus, has been used to design and verify both hardware and software. While at MIT Lincoln Laboratory she helped provide optimizations for codes used in standard mission-critical software. At IBM, her patent provides optimizations for sparse arrays used in LU decomposition. She has developed high-performance computing courses and curricula at numerous universities, including those presently taught at UA. She is a graduate of Syracuse University and studied with Turing Award winners Ken Iverson and Alan Perlis, and with J. Alan Robinson. Her recent accomplishments include organizing NSF workshops to promote awareness of the pervasive use of tensors and DOE workshops to discuss abstract machines for reasoning about optimal performance of tensor-based computing for big data. Her recent sabbatical was spent on a fellowship at the University of Glasgow and lecturing in Italy, Paris, and Hong Kong on her recent theorem for optimizing multiple Kronecker products. Her current research interests include identifying other important tensor algorithms that need similar optimizations.