Big Climate Data Computing and Analytics

Earth observations, model simulations, and climate reanalysis produce vast amounts of climate data. Investigating global problems such as climate change, natural disasters, diseases, and other environmental issues requires analyzing these data efficiently. However, the unprecedented data volume and the intrinsic complexity of geospatial statistics and analysis make this a grand challenge. Addressing it requires efficient data management strategies, complex parallel algorithms, and scalable computing systems.

This project addresses these challenges by developing an efficient spatiotemporal indexing approach (Li et al., 2016), an innovative query analytical framework (Li et al., 2017), and a scalable online visual analytical system called SOVAS for interactive big climate data analysis (Li et al., 2019). The Spatiotemporal Indexing Approach (SIA) was adopted by NASA as one of the key technologies in its Data Analytics and Storage System (DASS). The SIA has also been extended and applied to build a hierarchical indexing strategy for optimizing Apache Spark with HDFS to efficiently query big geospatial raster data (Hu et al., 2020).


Publications:

Li, Z., Hu, F., Schnase, J. L., Duffy, D. Q., Lee, T., Bowen, M. K., & Yang, C. (2016). A spatiotemporal indexing approach for efficient processing of big array-based climate data with MapReduce. International Journal of Geographical Information Science, 31(1), 17-35.

Li, Z., Huang, Q., Carbone, G., & Hu, F. (2017). A high performance query analytical framework for supporting data-intensive climate studies. Computers, Environment and Urban Systems, 62(3), 210-221.

Li, Z., Huang, Q., Jiang, Y., & Hu, F. (2019). SOVAS: A scalable online visual analytic system for big climate data analysis. International Journal of Geographical Information Science. doi: 10.1080/13658816.2019.1605073

Hu, F., Yang, C., Jiang, Y., Li, Y., Song, W., Duffy, D. Q., … & Lee, T. (2020). A hierarchical indexing strategy for optimizing Apache Spark with HDFS to efficiently query big geospatial raster data. International Journal of Digital Earth, 13(3), 410-428.

SOVAS Website: https://gidbusc.github.io/SCOVAS
