Introduction to Cloud Native Geospatial

Instructor(s):

Tyson Lee Swetnam PhD, Carlos Lizárraga-Celaya PhD, Jeffrey Gillan PhD

About This Course

The future of geographic information systems (GIS) and geoinformatics is in the cloud. Geospatial data are some of the largest and most important data produced today. Working with the massive volume of GIS data that are now hosted on commercial cloud requires GIS specialists and researchers to take a cloud-native approach to working with their data.

Geospatial data formats are evolving toward being completely cloud-native, meaning that they can be instantly searched, queried, and analyzed on the cloud using cyberGIS without the expectation that they be downloaded to a local workstation or laptop for analysis. There is already a growing ecosystem of software applications for working with these data which are open-source and open-access.


Here are the topics we will cover:


Throughout, we will emphasize Open Science principles such as FAIR and CARE, and highlight primarily Open Source tools.


Presentation Slides

This content was delivered at the 2023 Arizona Geographic Information Council meeting in Prescott, AZ in August 2023. Check out the presentation slides here.

Recorded Presentation

Let's Use the Cloud!

GIS and Remote Sensing have become essential tools in many industries and fields of study. The amount of data being collected, analyzed, and made available on the web is growing exponentially, so much so that the traditional model of downloading data to an individual desktop computer (for analysis and storage) is becoming a major limitation.

Cloud Native Geospatial aims to shift the 'Download' model by moving many aspects of data storage, sharing, and compute onto the web using cloud infrastructure. These advances hold the potential to foster collaborations, promote data-driven discovery, drive scientific innovation, increase transparency and improve reproducibility.

Data Storage

In a Cloud Native model, geospatial data should be stored in cloud object storage and be available to anyone through a public URL. Commercial object storage providers include Amazon S3, Google Cloud Storage, and Microsoft Azure Blob Storage. For the academic and research communities, storage options include CyVerse Data Store and Open Storage Network.

Using existing cloud storage infrastructure eliminates the need for data providers to maintain their own servers (and APIs) and allows them to focus more on their mission. It empowers individuals to easily share their data on the web while eliminating costly local storage. Cloud storage also protects against losing data to hardware failure. Abernathey et al. (2021) provide an excellent overview of the benefits of cloud storage.

Data Sharing

Sharing of geospatial data from cloud storage can be greatly improved with the use of Cloud Native Formats. These formats are designed to be used in the cloud and are built for HTTP streaming, so a client can request only the parts of a file it needs. This means that users can view and analyze data without downloading the entire dataset. By analogy, this is like going from the original Napster model of downloading music to the Spotify model of streaming music.
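The streaming idea can be illustrated with HTTP byte-range requests, the mechanism that lets a client fetch part of a file. The following is a minimal sketch, not how any particular library does it: `range_header` and `read_range` are hypothetical helper names, and the "remote" file is simulated with an in-memory byte string so the example runs without a network connection.

```python
def range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header asking for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def read_range(blob: bytes, header: dict) -> bytes:
    """Simulate a server answering a Range request (assumption: in-memory object)."""
    spec = header["Range"].split("=", 1)[1]
    start, end = (int(part) for part in spec.split("-"))
    return blob[start : end + 1]


# Stand-in for a large remote file: a 16-byte header followed by 1 MB of data.
remote_object = b"FAKE-FILE-HEADER" + bytes(1_000_000)

# Read only the header instead of transferring the whole object.
header = range_header(0, 16)
first_bytes = read_range(remote_object, header)
print(header["Range"], first_bytes)
```

Against a real public URL, the same header could be passed to `urllib.request.Request(url, headers=header)`; a server that supports range requests answers with status 206 (Partial Content) and only the requested bytes.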

There is a cloud native format to fit almost any geospatial data type. For example, GeoJSON is a cloud native format for vector data, Cloud Optimized GeoTIFF (COG) is a cloud native format for raster data, and Cloud Optimized Point Cloud (COPC) is a cloud native format for point cloud data. Zarr is a cloud native format that can be used for multi-dimensional raster data.
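To make one of these formats concrete, here is a small sketch that builds a GeoJSON FeatureCollection with only the Python standard library. The helper name `point_feature` is hypothetical, and the coordinates are approximate and purely illustrative. Note that GeoJSON stores positions as longitude, then latitude.

```python
import json


def point_feature(lon: float, lat: float, properties: dict) -> dict:
    """Return a GeoJSON Feature holding a single Point geometry."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        # Approximate location, used only as an illustration.
        point_feature(-112.4685, 34.5400, {"name": "Prescott, AZ"}),
    ],
}

# Serialize to text; this string is what would be stored at a public URL.
text = json.dumps(collection, indent=2)
print(text)
```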


Another effort to improve data sharing is the SpatioTemporal Asset Catalog (STAC). It is a JSON-based metadata and API standard for geospatial data. Its goal is to make geospatial data easier to work with, index, and discover.
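As a rough illustration of what STAC metadata looks like and why it makes data discoverable, the sketch below builds a few minimal STAC-style Items as Python dictionaries and filters them by bounding box and date. It is a simplified in-memory stand-in, not a STAC API client: the helper names (`make_item`, `bbox_intersects`, `search`), the item IDs, and the `example.com` asset URLs are all hypothetical.

```python
from datetime import datetime, timezone


def make_item(item_id, bbox, when, href):
    """Build a minimal STAC-style Item (a GeoJSON Feature plus STAC fields)."""
    west, south, east, north = bbox
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": list(bbox),
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"datetime": when},
        "links": [],
        "assets": {
            "data": {
                "href": href,  # hypothetical URL of a COG in object storage
                "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            }
        },
    }


def bbox_intersects(a, b):
    """True when two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(items, bbox, start, end):
    """Return the IDs of items that overlap `bbox` and fall within [start, end]."""
    hits = []
    for item in items:
        stamp = item["properties"]["datetime"].replace("Z", "+00:00")
        when = datetime.fromisoformat(stamp)
        if bbox_intersects(item["bbox"], bbox) and start <= when <= end:
            hits.append(item["id"])
    return hits


items = [
    make_item("scene-a", (-113.0, 34.0, -112.0, 35.0), "2023-08-15T18:00:00Z",
              "https://example.com/scene-a.tif"),
    make_item("scene-b", (-111.0, 32.0, -110.0, 33.0), "2023-08-20T18:00:00Z",
              "https://example.com/scene-b.tif"),
    make_item("scene-c", (-112.8, 34.2, -112.2, 34.8), "2022-01-05T18:00:00Z",
              "https://example.com/scene-c.tif"),
]

results = search(
    items,
    bbox=(-112.6, 34.4, -112.3, 34.7),
    start=datetime(2023, 8, 1, tzinfo=timezone.utc),
    end=datetime(2023, 8, 31, tzinfo=timezone.utc),
)
print(results)
```

Because every item carries its footprint, timestamp, and asset links in a predictable structure, a search like this needs only the metadata; the imagery itself is touched only after a match is found.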

Cloud Compute

Cloud Computing is all about moving geospatial analysis and computation from your local machine to a remote machine in the cloud. This approach has several advantages over traditional desktop computing.

Advantages of Cloud Computing

Advantages include:

  • With cloud computing, you can avoid the upfront cost and complexity of owning and maintaining your own IT infrastructure
  • Cloud computing allows groups or individuals to scale up (or down) their operations quickly as their computing needs change
  • Cloud computing allows users to access their data and applications from anywhere, on any device, at any time
  • Geospatial in the cloud empowers colleagues to work directly together on the same data, models, and applications, in the same way that Google Docs allows multiple people to work on the same document at the same time


The most prominent examples of geospatial cloud computing are Google Earth Engine and Microsoft Planetary Computer. Both of these platforms provide access to large amounts of geospatial data and have built-in tools for analysis and visualization. You can also bring your own data to these platforms by storing it in cloud storage and using cloud native formats.


For those who use the Esri ecosystem, there is ArcGIS Online. Proprietary licenses are required to use ArcGIS Online, but many universities and government agencies provide these to their GIS employees.

Check out the Cloud Computing section to learn more.



Resources

Abernathey, R. P., et al. (2021). "Cloud-Native Repositories for Big Scientific Data". Computing in Science & Engineering, 23(2), 26-35. https://doi.org/10.1109/MCSE.2021.3059437

Chris Holmes's blog on Cloud Native

Cloud-Native Geospatial Outreach Event - April 2022 - from Open Geospatial Consortium (OGC)

Gentemann, C. L., et al. (2021). "Science Storms the Cloud". AGU Advances, 2, e2020AV000354. https://doi.org/10.1029/2020AV000354

Mapscaping Podcast on Cloud Native Geospatial


Last update: 2023-12-05