Date of Award
2014
Document Type
Dissertation
Degree Name
Ph.D. in Engineering Science
Department
Computer and Information Science
First Advisor
Philip J. Rhodes
Second Advisor
Greg Easson
Third Advisor
Byunghyun Jang
Relational Format
dissertation/thesis
Abstract
The size of spatial scientific datasets is steadily increasing due to improvements in instruments and the availability of computational resources. Scientific datasets today are often far too large to fit into a single machine's memory or even onto a single disk. However, much of the research on efficient storage of and access to spatial datasets has focused on large multidimensional arrays. In contrast, unstructured grids consisting of collections of simplices (e.g., triangles or tetrahedra) present special challenges that have received less attention. Data values found at the vertices of the simplices may be dispersed throughout a data file, producing especially poor disk locality. Partitioning multidimensional arrays across several machines or disks has become increasingly necessary. However, relatively little work has been done for unstructured grids. We address this important problem of poor locality in two major ways. First, we reorganize the unstructured grid to improve locality in both the dataset space and the data file on disk, using a specialized chunking approach that maintains the spatial neighborhood relationships inherent in the unstructured grid. This reorganization produces significant gains in performance by reducing the number of accesses made to the data file. We examine the effects of different chunking configurations on data retrieval performance. A major motivation for reorganizing the unstructured grid is to allow the application of iteration-aware prefetching. Second, we describe a prefetching method that takes advantage of prior knowledge of the user's access pattern. Applying this prefetching method to unstructured grids produces further performance gains over and above the gains seen from reorganization alone. In addressing the poor locality, we investigated partitioning unstructured grids at the disk level and its effect on overall system performance.
We build upon this and investigate the effect of an in-core partitioning performed on top of the existing disk-level partitioning. We also examine the performance benefits of declustering unstructured grids across several disks. Given this declustered dataset, we describe and explore a parallel data retrieval method that takes advantage of prior knowledge of the user's access pattern. Our test results demonstrate substantial performance gains. Lastly, we present guidelines for choosing effective partitionings of datasets when the access pattern is known in advance.
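The general idea behind chunking and declustering can be illustrated with a minimal Python sketch. This is not the dissertation's algorithm: the uniform-grid chunking, the round-robin disk assignment, and all function names (`chunk_vertices`, `file_layout`, `blocks_touched`, `decluster`) are assumptions chosen only to show why storing spatially neighboring vertices together can reduce the number of disk blocks a local query touches.

```python
from collections import defaultdict

def chunk_vertices(points, cell_size):
    """Group vertex indices by the uniform grid cell that contains them.

    Assumption: a simple uniform grid stands in for the specialized
    chunking approach described in the abstract.
    """
    chunks = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell_size), int(y // cell_size))
        chunks[key].append(idx)
    return dict(chunks)

def file_layout(chunks):
    """Write vertices chunk by chunk; map original index -> file position."""
    order = [i for key in sorted(chunks) for i in chunks[key]]
    return {old: new for new, old in enumerate(order)}

def blocks_touched(vertex_ids, position, block_size):
    """Count the distinct disk blocks needed to read the given vertices."""
    return len({position[v] // block_size for v in vertex_ids})

def decluster(chunks, num_disks):
    """Assign chunks to disks round-robin (a hypothetical policy)."""
    return {key: i % num_disks for i, key in enumerate(sorted(chunks))}

# Sixteen vertices on a 4 x 4 lattice, stored in x-major order.
points = [(x, y) for x in range(4) for y in range(4)]
query = [i for i, (x, y) in enumerate(points) if x < 2 and y < 2]

original = {i: i for i in range(len(points))}    # unorganized layout
chunks = chunk_vertices(points, cell_size=2)
reorganized = file_layout(chunks)                # chunked layout

before = blocks_touched(query, original, block_size=4)
after = blocks_touched(query, reorganized, block_size=4)
disks = decluster(chunks, num_disks=2)
```

In this toy case the local query spans fewer blocks after reorganization, and neighboring chunks land on different disks so they could be read in parallel. Real unstructured grids would also need the simplex connectivity to be renumbered consistently with the new vertex order.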
Recommended Citation
Akande, Oyindamola, "An Efficient Storage And Retrieval Mechanism For Large Unstructured Grids" (2014). Electronic Theses and Dissertations. 952.
https://egrove.olemiss.edu/etd/952
Concentration/Emphasis
Emphasis: Computer Science