{ "cells": [ { "cell_type": "markdown", "id": "cd5fac1c", "metadata": {}, "source": [ "# Example to improve code design\n", "**Claire Carouge, CLEX CMS**\n", "\n", "We are often asked how to organise code for tasks such as:\n", "* performing the same analysis on different models or grid resolutions\n", "* performing an analysis on one dataset for different options (e.g. different seasons)\n", "\n", "The principle to apply in all cases is to avoid repeating code. In Python and most other programming languages, one can avoid repetition by using dictionaries, functions, classes, or a combination of them.\n", "\n", "In this blog post, we are going to use a small example to illustrate these techniques. They can often be used interchangeably. Usually, functions and classes result in more flexible and reusable code.\n", "\n", "Note: the analysis chosen here does not require dictionaries or functions. We use them only to illustrate the generic process. The direct way to perform the analysis is noted at the end of this notebook." ] }, { "cell_type": "code", "execution_count": 1, "id": "bf4138ab", "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import numpy as np\n", "from pathlib import Path\n", "import matplotlib.pyplot as plt\n", "import cartopy.crs as ccrs" ] }, { "cell_type": "markdown", "id": "d9999091", "metadata": {}, "source": [ "## Badly organised code\n", "The following code is correct in that it gives the right answer, but it is error-prone because the code is repeated. This increases the risk of typos. For more complex analyses, the lack of modularity can also make the code harder to test. It can also be time-consuming to add extra data to the analysis." ] }, { "cell_type": "code", "execution_count": 2, "id": "96c35c4b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'tsl' (time: 1980, lat: 96, lon: 96)>\n", "dask.array<getitem, shape=(1980, 96, 96), dtype=float32, chunksize=(1980, 96, 96), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * lat (lat) float32 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0\n", " * lon (lon) float32 360.0 3.75 7.5 11.25 15.0 ... 345.0 348.8 352.5 356.2\n", " solth float32 0.01419\n", " * time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00\n", "Attributes:\n", " long_name: Temperature of Soil\n", " units: K\n", " online_operation: average\n", " cell_methods: area: mean where land time: mean\n", " interval_operation: 1800 s\n", " interval_write: 1 month\n", " standard_name: soil_temperature\n", " description: Temperature of each soil layer. Reported as "missin...\n", " history: none\n", " cell_measures: area: areacella
\n", "
array([-90. , -88.10526 , -86.210526, -84.31579 , -82.42105 , -80.52631 ,\n", " -78.63158 , -76.73684 , -74.8421 , -72.947365, -71.052635, -69.1579 ,\n", " -67.26316 , -65.36842 , -63.473682, -61.57895 , -59.68421 , -57.789474,\n", " -55.894737, -54. , -52.105263, -50.210526, -48.31579 , -46.42105 ,\n", " -44.526318, -42.63158 , -40.736843, -38.842106, -36.94737 , -35.05263 ,\n", " -33.157894, -31.263159, -29.368422, -27.473684, -25.578947, -23.68421 ,\n", " -21.789474, -19.894737, -18. , -16.105263, -14.210526, -12.315789,\n", " -10.421053, -8.526316, -6.631579, -4.736842, -2.842105, -0.947368,\n", " 0.947368, 2.842105, 4.736842, 6.631579, 8.526316, 10.421053,\n", " 12.315789, 14.210526, 16.105263, 18. , 19.894737, 21.789474,\n", " 23.68421 , 25.578947, 27.473684, 29.368422, 31.263159, 33.157894,\n", " 35.05263 , 36.94737 , 38.842106, 40.736843, 42.63158 , 44.526318,\n", " 46.42105 , 48.31579 , 50.210526, 52.105263, 54. , 55.894737,\n", " 57.789474, 59.68421 , 61.57895 , 63.473682, 65.36842 , 67.26316 ,\n", " 69.1579 , 71.052635, 72.947365, 74.8421 , 76.73684 , 78.63158 ,\n", " 80.52631 , 82.42105 , 84.31579 , 86.210526, 88.10526 , 90. ],\n", " dtype=float32)
array([360. , 3.75, 7.5 , 11.25, 15. , 18.75, 22.5 , 26.25, 30. ,\n", " 33.75, 37.5 , 41.25, 45. , 48.75, 52.5 , 56.25, 60. , 63.75,\n", " 67.5 , 71.25, 75. , 78.75, 82.5 , 86.25, 90. , 93.75, 97.5 ,\n", " 101.25, 105. , 108.75, 112.5 , 116.25, 120. , 123.75, 127.5 , 131.25,\n", " 135. , 138.75, 142.5 , 146.25, 150. , 153.75, 157.5 , 161.25, 165. ,\n", " 168.75, 172.5 , 176.25, 180. , 183.75, 187.5 , 191.25, 195. , 198.75,\n", " 202.5 , 206.25, 210. , 213.75, 217.5 , 221.25, 225. , 228.75, 232.5 ,\n", " 236.25, 240. , 243.75, 247.5 , 251.25, 255. , 258.75, 262.5 , 266.25,\n", " 270. , 273.75, 277.5 , 281.25, 285. , 288.75, 292.5 , 296.25, 300. ,\n", " 303.75, 307.5 , 311.25, 315. , 318.75, 322.5 , 326.25, 330. , 333.75,\n", " 337.5 , 341.25, 345. , 348.75, 352.5 , 356.25], dtype=float32)
array(0.01418965, dtype=float32)
array([cftime.DatetimeNoLeap(1850, 1, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1850, 2, 15, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1850, 3, 16, 12, 0, 0, 0, has_year_zero=True),\n", " ...,\n", " cftime.DatetimeNoLeap(2014, 10, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(2014, 11, 16, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(2014, 12, 16, 12, 0, 0, 0, has_year_zero=True)],\n", " dtype=object)