SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Poster 128: Identifying Time Series Similarity in Large-Scale Earth System Datasets

Authors: Payton Linton (Youngstown State University), William Melodia (Youngstown State University), Alina Lazar (Youngstown State University), Deborah Agarwal (Lawrence Berkeley National Laboratory), Ludovico Bianchi (Lawrence Berkeley National Laboratory), Devarshi Ghoshal (Lawrence Berkeley National Laboratory), Kesheng Wu (Lawrence Berkeley National Laboratory), Gilberto Pastorello (Lawrence Berkeley National Laboratory), Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory)

Abstract: Scientific data volumes are growing every day, and instrument configurations, quality control, and software updates all result in changes to the data. This study focuses on developing algorithms that detect changes in time series datasets in the context of the Deduce project. We propose a combination of methods, including dimensionality reduction and clustering, to evaluate similarity-measurement algorithms. This methodology can also be used to discover existing patterns and correlations within a dataset. Current results indicate that the Euclidean distance metric gives the best results, in terms of internal cluster validity measures, for multi-variable analyses of large-scale earth system datasets. The poster will include details on our methodology, results, and future work.
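
The evaluation loop described in the abstract (reduce dimensionality, cluster, then score the clustering with an internal validity measure for each candidate distance) can be illustrated with a minimal sketch. The abstract does not name the specific techniques used, so everything below is an assumption chosen for brevity: piecewise aggregate approximation (PAA) as the dimensionality reduction, a small k-medoids routine as the clustering method, the silhouette coefficient as the validity measure, Manhattan distance as the comparison metric, and synthetic series in place of earth system data. It is not the authors' implementation and does not reproduce their results.

```python
import math
import random

def paa(series, segments):
    """Piecewise aggregate approximation: mean of each equal-width segment."""
    size = len(series) // segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def k_medoids(points, k, dist, iters=20):
    """Small k-medoids with deterministic farthest-first initialization."""
    medoids = [0]
    while len(medoids) < k:
        # Pick the point farthest from its nearest already-chosen medoid.
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(dist(points[i], points[m]) for m in medoids)))
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist(p, points[medoids[c]]))
                  for p in points]
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # The medoid minimizes total distance to the rest of its cluster.
            new_medoids.append(
                min(members,
                    key=lambda i: sum(dist(points[i], points[j]) for j in members))
                if members else medoids[c])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels

def silhouette(points, labels, dist):
    """Mean silhouette coefficient, an internal cluster validity measure."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(sum(dist(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Synthetic stand-in data (hypothetical): noisy sine waves and noisy ramps.
random.seed(0)
length = 48
waves = [[math.sin(2 * math.pi * t / length) + random.gauss(0, 0.1)
          for t in range(length)] for _ in range(10)]
ramps = [[3.0 + 0.05 * t + random.gauss(0, 0.1)
          for t in range(length)] for _ in range(10)]
reduced = [paa(s, 8) for s in waves + ramps]  # 48 samples -> 8 segment means

# Score each candidate distance by the silhouette of the clustering it yields.
results = {}
for name, dist in [("euclidean", euclidean), ("manhattan", manhattan)]:
    labels = k_medoids(reduced, 2, dist)
    results[name] = (labels, silhouette(reduced, labels, dist))
    print(name, round(results[name][1], 3))
```

A silhouette value closer to 1 indicates tighter, better-separated clusters under that distance. Because silhouette values computed with different metrics live on different distance scales, a comparison like this is only indicative; a real study would likely combine several internal validity measures.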

Best Poster Finalist (BP): no

