As we shall see, this is relatively easy with two variables of the same rank, which is what the `where()` function is designed for. Depending on which Python package you are using, the problem may or may not be more complicated with variables of different ranks.
In this blog, we will look at both scenarios.

We first need to import some packages, read in some data and calculate the maximum in time at all spatial points. As usual, I'm going to use CMIP data as it's easily accessible.

In [1]:

```
%matplotlib inline
import xarray as xr
import numpy as np
```

In [2]:

```
# Let's get the 2m temperature and the sensible heat flux.
# We do not decode the time unit,
# as datetime objects can't be plotted.
ds = xr.open_dataset('/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/latest/tas/tas_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc',
                     decode_times=False)
ds1 = xr.open_dataset('/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/latest/hfss/hfss_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc',
                      decode_times=False)
tas = ds.tas
hfss = ds1.hfss
```

In [3]:

```
tas
```

Out[3]:

In [4]:

```
hfss
```

Out[4]:

In [5]:

```
# Calculate the maximum temperature at each spatial point along the time axis:
tas_max = tas.max('time')
tas_max
```

Out[5]:

In [6]:

```
# Get the values of the sensible heat flux when temperature is maximum
hfss_at_max = hfss.where(tas == tas_max)
hfss_at_max.mean('time').plot(size=8)
hfss_at_max
```
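If you don't have access to the CMIP files, the same pattern can be tried on a tiny made-up dataset. This is only a sketch: `tas_demo` and `hfss_demo` are hypothetical stand-ins for `tas` and `hfss`, with invented values chosen so the result is easy to check by eye.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for tas and hfss: same dimensions, same shape.
dims = ("time", "lat", "lon")
tas_demo = xr.DataArray(
    np.array([[[1.0, 5.0], [2.0, 8.0]],
              [[3.0, 4.0], [9.0, 1.0]],
              [[2.0, 6.0], [7.0, 0.0]]]),
    dims=dims,
)
hfss_demo = xr.DataArray(np.arange(12.0).reshape(3, 2, 2), dims=dims)

# Maximum along time, then a boolean mask that is True where the maximum occurs.
tas_demo_max = tas_demo.max("time")
mask = tas_demo == tas_demo_max

# Keep hfss_demo only where the mask is True; everything else becomes NaN.
hfss_demo_at_max = hfss_demo.where(mask)

# Collapse the time axis: NaNs are skipped, leaving one value per grid point.
collapsed = hfss_demo_at_max.max("time")
print(collapsed.values)
```

Each grid point keeps exactly one value here because the made-up maxima are unique. With real data, ties are possible, and several time steps could then survive at the same point.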

It really depends on which object you are using: it is easy with `xarray` arrays and not so easy with `numpy` arrays.

To illustrate this, we are going to look at the problem of finding the times at which the 2m temperature is at its maximum at each point.

`xarray`

In this case, it is as simple as the previous case with variables of the same rank. Note that the result is a full 3D array with lots of missing values (NaN): there are values only where `tas` is at its maximum for that point in space.

In [7]:

```
time_at_max = tas.time.where(tas == tas_max)
time_at_max.mean('time').plot(size=8)
time_at_max
```

Out[7]:

**Note:** The time unit is the number of days since 0001-01-01, so the values would need to be converted to a more usable format for scientific use.
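As a rough illustration of such a conversion, here is a minimal sketch using only the standard library. It assumes the file uses a proleptic Gregorian calendar, which I have not checked here; for other CF calendars (such as `noleap` or `360_day`) a dedicated library like `cftime` would be needed instead. The helper name `days_since_year1_to_datetime` is made up for this example.

```python
from datetime import datetime, timedelta


def days_since_year1_to_datetime(days):
    """Convert 'days since 0001-01-01' to a datetime.

    Assumes a proleptic Gregorian calendar (an assumption, not read from the file).
    """
    return datetime(1, 1, 1) + timedelta(days=float(days))


# Day 0 is the reference date itself; fractional days become hours.
print(days_since_year1_to_datetime(0))
print(days_since_year1_to_datetime(31.5))
```

Missing values (NaN) cannot be converted this way, so they would have to be dropped or masked first.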

`numpy` arrays

We'll use `tas` and `tas.time` again but without using the `xarray` built-in methods. If I try the `numpy` equivalent solution:

In [8]:

```
nptime_at_max = np.where(tas == tas_max, tas.time, np.nan)
```

This does not work because `numpy` (and `xarray`) can only perform a `where()` operation on arrays of the same shape. When given arrays of different shapes, as in this case, both `numpy` and `xarray` will try to make them conform by expanding the smaller arrays to the shape of the largest one, copying their values along the missing dimensions. This process is called broadcasting.

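The problem is that `numpy` is quite conservative in its broadcasting rules and cannot apply them in the case above: it matches dimensions by position, starting from the last one, so the 1D time array is lined up against the longitude axis rather than the time axis. `xarray` is much better at broadcasting because it uses the metadata stored in the DataArray to identify the dimensions of each array by name. That is why `DataArray.where()` worked above.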
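To make the difference concrete, here is a small pure-`numpy` sketch with made-up values (`tas_np` and `time_np` are hypothetical stand-ins, not the CMIP data). It shows the failure, then a manual workaround that reshapes the time array by hand. The workaround is shown only to illustrate how broadcasting works.

```python
import numpy as np

# Hypothetical data with shape (time=3, lat=2, lon=2) and a 1D time axis.
tas_np = np.array([[[1.0, 5.0], [2.0, 8.0]],
                   [[3.0, 4.0], [9.0, 1.0]],
                   [[2.0, 6.0], [7.0, 0.0]]])
time_np = np.array([10.0, 20.0, 30.0])
tas_np_max = tas_np.max(axis=0)

# numpy aligns shapes from the last axis: (3,) is compared with lon (size 2).
try:
    np.where(tas_np == tas_np_max, time_np, np.nan)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Manual fix: give time explicit length-1 lat and lon axes, i.e. shape (3, 1, 1).
time_3d = time_np[:, np.newaxis, np.newaxis]
time_at_max_3d = np.where(tas_np == tas_np_max, time_3d, np.nan)

# Collapse the time axis, ignoring NaNs, to get one time per grid point.
time_at_max_2d = np.nanmax(time_at_max_3d, axis=0)
print(broadcast_failed, time_at_max_2d)
```

Note that if the longitude axis happened to have the same length as the time axis, `numpy` would not raise an error at all and would silently align the wrong dimensions, which is one more reason to prefer named dimensions.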

If you have `numpy` arrays, the best solution is to quickly wrap them in `xarray` DataArrays. You only need to name the dimensions. Any name will do, as long as you give the same name to the common dimensions in the different arrays:

In [9]:

```
# DataArray.values returns only the numpy array of values,
# with none of the metadata stored in the DataArray.
new_tas = xr.DataArray(tas.values, dims=('t','l','L'))
new_time = xr.DataArray(tas.time.values, dims='t')
new_max = new_tas.max(dim='t')
new_time_at_max = new_time.where(new_tas == new_max)
new_time_at_max
```

Out[9]:

As you can see, you can keep your values and the time in separate arrays (`new_tas` and `new_time`). You don't need to add the time array as a coordinate to the 3D array, although it is generally a good idea to do so as it keeps the data self-describing.
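For completeness, here is a sketch of the self-describing variant, where the time values are attached as a coordinate when the DataArray is built. The data and the name `labelled` are made up for illustration; the dimension names are arbitrary, as before.

```python
import numpy as np
import xarray as xr

# Hypothetical values (t=3, l=2, L=2) and their time axis.
values = np.array([[[1.0, 5.0], [2.0, 8.0]],
                   [[3.0, 4.0], [9.0, 1.0]],
                   [[2.0, 6.0], [7.0, 0.0]]])
times = np.array([10.0, 20.0, 30.0])

# Attach the time values as the coordinate of dimension 't'.
labelled = xr.DataArray(values, dims=("t", "l", "L"), coords={"t": times})

# The coordinate is now reachable from the array itself.
labelled_time_at_max = labelled["t"].where(labelled == labelled.max("t"))

# Collapse along 't' (NaNs are skipped) to get one time per grid point.
print(labelled_time_at_max.max("t").values)
```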

The other advantage of using `xarray` is that it lets you extend the functionality beyond the built-in functions much more easily. The idea here is to use the `groupby().apply()` workflow to apply a user-defined function.
You could of course use it as a solution to the current problem:

In [10]:

```
def check_max(data):
    # data is the temperature field for a single time step
    return np.where(data == tas_max, data.time, np.nan)

tasmax_dates = tas.groupby('time').apply(check_max)
tasmax_dates.mean('time').plot(size=8)
tasmax_dates
```

Out[10]:

But this is slower than using the built-in functions directly, and it doesn't keep the attributes (like time_bnds, units, calendar, etc.)! It is therefore best to keep this approach for more complex problems that cannot easily be solved otherwise.