{
"cells": [
{
"cell_type": "markdown",
"id": "ca57bfa8-586b-4e10-9260-f872d414f56d",
"metadata": {},
"source": [
"[](https://githubtocolab.com/tyson-swetnam/agic-2022/blob/main/docs/notebooks/xarray.ipynb)\n",
"[](https://tyson-swetnam.github.io/agic-2022/notebooks/xarray/)\n",
"\n",
"\n",
"## Xarray example Jupyter Notebook\n",
"\n",
"Installation using `pip`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bac5c5fc",
"metadata": {},
"outputs": [],
"source": [
"# to run this cell uncomment the line below by removing the # \n",
"!pip install xarray==0.20.2 dask netCDF4 bottleneck pooch"
]
},
{
"cell_type": "markdown",
"id": "45fd00d3-bee0-40b6-8a3a-dbbd6252fc06",
"metadata": {},
"source": [
"## Xarray installation\n",
"Install Xarray and some of its dependencies if not already installed using `conda`:\n",
"\n",
"``` conda install -c conda-forge xarray==0.20.2 dask netCDF4 bottleneck pooch```\n",
"\n",
"It may take a while resolving installation environments.\n",
"If it is successful, will install other package dependecies.\n",
"\n",
"Xarray comes with a collection of datasets to explore: [xarray.tutorial.open_dataset](https://docs.xarray.dev/en/stable/generated/xarray.tutorial.open_dataset.html)\n",
"\n",
"Available datasets:\n",
"\n",
"`\"air_temperature\"`: NCEP reanalysis subset\n",
"\n",
"`\"air_temperature_gradient\"`: NCEP reanalysis subset with approximate x,y gradients\n",
"\n",
"`\"basin_mask\"`: Dataset with ocean basins marked using integers\n",
"\n",
"`\"ASE_ice_velocity\"`: MEaSUREs InSAR-Based Ice Velocity of the Amundsen Sea Embayment, Antarctica, Version 1\n",
"\n",
"`\"rasm\"`: Output of the Regional Arctic System Model (RASM)\n",
"\n",
"`\"ROMS_example\"`: Regional Ocean Model System (ROMS) output\n",
"\n",
"`\"tiny\"`: small synthetic dataset with a 1D data variable\n",
"\n",
"`\"era5-2mt-2019-03-uk.grib\"`: ERA5 temperature data over the UK\n",
"\n",
"`\"eraint_uvz\"`: data from ERA-Interim reanalysis, monthly averages of upper level data\n",
"\n",
"`\"ersstv5\"`: NOAA’s Extended Reconstructed Sea Surface Temperature monthly averages\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "dd41fd10-1a13-4c6f-83e2-87d10597abba",
"metadata": {},
"outputs": [],
"source": [
"# Load required libraries\n",
"\n",
"%matplotlib inline\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import dask.array as da\n",
"import dask.dataframe as dd\n",
"import pooch\n",
"import xarray as xr\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
"plt.rcParams['figure.figsize'] = (8,5)\n"
]
},
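{
"cell_type": "markdown",
"id": "sketch-dataarray-md",
"metadata": {},
"source": [
"Before loading a real dataset, it can help to build a small labeled array by hand and try the basic indexing styles on it. The cell below is a minimal sketch under stated assumptions: the values are random, and the names `da_demo`, `first_two_days`, `iowa`, `first_day` and `jan_1_to_2` are illustrative only. It avoids the name `da` because that alias is already used for `dask.array` in the imports above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-dataarray-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import xarray as xr\n",
"\n",
"# Small synthetic array: 4 daily time steps x 3 locations, random values in [0, 1)\n",
"da_demo = xr.DataArray(\n",
"    np.random.rand(4, 3),\n",
"    dims=('time', 'space'),\n",
"    coords={\n",
"        'time': pd.date_range('2000-01-01', periods=4),\n",
"        'space': ['IA', 'IL', 'IN'],\n",
"    },\n",
")\n",
"\n",
"# Positional indexing, as with a NumPy array: first two time steps\n",
"first_two_days = da_demo[:2]\n",
"\n",
"# Label-based indexing with .sel(): all time steps for one location\n",
"iowa = da_demo.sel(space='IA')\n",
"\n",
"# Integer-based indexing by dimension name with .isel(): first time step\n",
"first_day = da_demo.isel(time=0)\n",
"\n",
"# Label-based slicing on the time coordinate (both end points are included)\n",
"jan_1_to_2 = da_demo.sel(time=slice('2000-01-01', '2000-01-02'))\n",
"\n",
"da_demo"
]
},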
{
"cell_type": "code",
"execution_count": 4,
"id": "97b19748-031b-481f-997f-6433d2470320",
"metadata": {},
"outputs": [],
"source": [
"# Load the air_temperature dataset and define a xarray datastructure\n",
"# 4 x Daily Air temperature in degrees K at sigma level 995 \n",
"# (2013-01-01 to 2014-12-31)\n",
"# Spatial Coverage\n",
"# 2.5 degree x 2.5 degree global grids (144x73) [2.5 degree = 172.5 miles]\n",
"# 0.0E to 357.5E, 90.0N to 90.0S\n",
"\n",
"ds = xr.tutorial.open_dataset('air_temperature')\n",
"#ds.info()\n"
]
},
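{
"cell_type": "markdown",
"id": "sketch-climatology-md",
"metadata": {},
"source": [
"With the dataset loaded, a common next step is to compute a monthly climatology and the anomalies relative to it. The cell below is a minimal sketch using `groupby` on the `time.month` component; the names `climatology` and `anomalies` are illustrative assumptions. It reopens the dataset so that the cell runs on its own (the file is cached locally after the first download)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-climatology-code",
"metadata": {},
"outputs": [],
"source": [
"import xarray as xr\n",
"\n",
"# Reopen the tutorial dataset so this cell is self-contained\n",
"ds = xr.tutorial.open_dataset('air_temperature')\n",
"\n",
"# Monthly climatology: mean over all time steps that fall in each calendar month\n",
"climatology = ds.groupby('time.month').mean('time')\n",
"\n",
"# Anomalies: subtract the matching month's climatology from every time step\n",
"anomalies = ds.groupby('time.month') - climatology\n",
"\n",
"anomalies"
]
},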
{
"cell_type": "code",
"execution_count": 5,
"id": "ff9dd0d7-be7c-4e1d-9d14-2a32bc07b435",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
<xarray.Dataset>\n", "Dimensions: (lat: 25, time: 2920, lon: 53)\n", "Coordinates:\n", " * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0\n", " * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0\n", " * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00\n", "Data variables:\n", " air (time, lat, lon) float32 ...\n", "Attributes:\n", " Conventions: COARDS\n", " title: 4x daily NMC reanalysis (1948)\n", " description: Data is from NMC initialized reanalysis\\n(4x/day). These a...\n", " platform: Model\n", " references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>\n", "[3869000 values with dtype=float32]\n", "Coordinates:\n", " * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0\n", " * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0\n", " * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00\n", "Attributes:\n", " long_name: 4xDaily Air temperature at sigma level 995\n", " units: degK\n", " precision: 2\n", " GRIB_id: 11\n", " GRIB_name: TMP\n", " var_desc: Air temperature\n", " dataset: NMC Reanalysis\n", " level_desc: Surface\n", " statistic: Individual Obs\n", " parent_stat: Other\n", " actual_range: [185.16 322.1 ]
<xarray.Dataset>\n", "Dimensions: (lat: 25, month: 12, lon: 53)\n", "Coordinates:\n", " * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0\n", " * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0\n", " * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12\n", "Data variables:\n", " air (month, lat, lon) float32 246.3 246.4 246.2 ... 297.6 297.6 297.5
<xarray.Dataset>\n", "Dimensions: (lat: 25, lon: 53, time: 2920)\n", "Coordinates:\n", " * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0\n", " * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0\n", " * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00\n", " month (time) int64 1 1 1 1 1 1 1 1 1 1 ... 12 12 12 12 12 12 12 12 12 12\n", "Data variables:\n", " air (time, lat, lon) float32 -5.15 -3.886 -2.715 ... -1.375 -1.848
<xarray.DataArray (time: 4, space: 3)>\n", "array([[0.17633122, 0.59377299, 0.30699365],\n", " [0.4243673 , 0.90601059, 0.2009795 ],\n", " [0.37157228, 0.28094102, 0.56942557],\n", " [0.60069197, 0.88338457, 0.07061534]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04\n", " * space (space) <U2 'IA' 'IL' 'IN'
<xarray.DataArray (time: 2, space: 3)>\n", "array([[0.17633122, 0.59377299, 0.30699365],\n", " [0.4243673 , 0.90601059, 0.2009795 ]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02\n", " * space (space) <U2 'IA' 'IL' 'IN'" ], "text/plain": [ "
<xarray.DataArray ()>\n", "array(0.17633122)\n", "Coordinates:\n", " time datetime64[ns] 2000-01-01\n", " space <U2 'IA'" ], "text/plain": [ "
<xarray.DataArray (time: 4, space: 2)>\n", "array([[0.30699365, 0.59377299],\n", " [0.2009795 , 0.90601059],\n", " [0.56942557, 0.28094102],\n", " [0.07061534, 0.88338457]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04\n", " * space (space) <U2 'IN' 'IL'" ], "text/plain": [ "
<xarray.DataArray (time: 2)>\n", "array([0.17633122, 0.4243673 ])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02\n", " space <U2 'IA'" ], "text/plain": [ "
<xarray.DataArray (time: 2, space: 3)>\n", "array([[0.17633122, 0.59377299, 0.30699365],\n", " [0.4243673 , 0.90601059, 0.2009795 ]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02\n", " * space (space) <U2 'IA' 'IL' 'IN'" ], "text/plain": [ "
<xarray.DataArray (time: 1, space: 1)>\n", "array([[0.17633122]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01\n", " * space (space) <U2 'IA'" ], "text/plain": [ "
<xarray.DataArray (space: 3)>\n", "array([0.17633122, 0.59377299, 0.30699365])\n", "Coordinates:\n", " time datetime64[ns] 2000-01-01\n", " * space (space) <U2 'IA' 'IL' 'IN'" ], "text/plain": [ "
<xarray.DataArray (time: 4, space: 3)>\n", "array([[0.17633122, 0.59377299, 0.30699365],\n", " [0.4243673 , 0.90601059, 0.2009795 ],\n", " [0.37157228, 0.28094102, 0.56942557],\n", " [0.60069197, 0.88338457, 0.07061534]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04\n", " * space (space) <U2 'IA' 'IL' 'IN'" ], "text/plain": [ "
<xarray.DataArray (time: 4, space: 1)>\n", "array([[0.17633122],\n", " [0.4243673 ],\n", " [0.37157228],\n", " [0.60069197]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04\n", " * space (space) <U2 'IA'" ], "text/plain": [ "
<xarray.DataArray (time: 3, space: 2)>\n", "array([[0.90601059, 0.2009795 ],\n", " [0.28094102, 0.56942557],\n", " [0.88338457, 0.07061534]])\n", "Coordinates:\n", " * time (time) datetime64[ns] 2000-01-02 2000-01-03 2000-01-04\n", " * space (space) <U2 'IL' 'IN'" ], "text/plain": [ "
<xarray.DataArray (time: 4, space: 3)>\n", "array([[0.17633122, 0.59377299, 0.30699365],\n", " [0.4243673 , 0.90601059, 0.2009795 ],\n", " [0.37157228, 0.28094102, 0.56942557],\n", " [0.60069197, 0.88338457, 0.07061534]])\n", "Coordinates:\n", " * space (space) <U2 'IA' 'IL' 'IN'\n", "Dimensions without coordinates: time" ], "text/plain": [ "
<xarray.Dataset>\n", "Dimensions: (n_prof: 163, n_param: 3, n_levels: 72, n_calib: 1, n_history: 0)\n", "Dimensions without coordinates: n_prof, n_param, n_levels, n_calib, n_history\n", "Data variables: (12/65)\n", " data_type object b'Argo profile '\n", " format_version object b'3.1 '\n", " handbook_version object b'1.2 '\n", " reference_date_time object b'19500101000000'\n", " date_creation object b'20100919031633'\n", " date_update object b'20190406013003'\n", " ... ...\n", " history_parameter (n_history, n_prof) object \n", " history_start_pres (n_history, n_prof) float32 \n", " history_stop_pres (n_history, n_prof) float32 \n", " history_previous_value (n_history, n_prof) float32 \n", " history_qctest (n_history, n_prof) object \n", " crs int32 -2147483647\n", "Attributes: (12/49)\n", " title: Argo float vertical profile\n", " institution: FR GDAC\n", " source: Argo float\n", " history: 2019-10-20T10:17:46Z boyer convAGDAC.f90...\n", " references: https://www.nodc.noaa.gov/argo/\n", " user_manual_version: 3.1\n", " ... ...\n", " time_coverage_end: 2015-01-11T22:56:00Z\n", " time_coverage_duration: point\n", " time_coverage_resolution: point\n", " gadr_ConventionVersion: GADR-3.0\n", " gadr_program: convAGDAC.f90\n", " gadr_programVersion: 1.2
<xarray.DataArray 'temp_adjusted' (n_prof: 5, n_levels: 5)>\n", "array([[13.237, 13.102, 12.747, 10.805, 9.631],\n", " [14.183, 14.107, 13.924, 12.084, 11.191],\n", " [13.845, 13.796, 13.748, 13.25 , 9.889],\n", " [13.567, 13.561, 13.525, 13.223, 10.316],\n", " [11.334, 10.086, 9.03 , 9.08 , 8.094]], dtype=float32)\n", "Dimensions without coordinates: n_prof, n_levels\n", "Attributes:\n", " long_name: Sea temperature in-situ ITS-90 scale\n", " standard_name: sea_water_temperature\n", " units: degree_Celsius\n", " valid_min: -2.5\n", " valid_max: 40.0\n", " C_format: %9.3f\n", " FORTRAN_format: F9.3\n", " resolution: 0.001