Welcome to skeletons’s documentation!
Geo-skeletons is an easy extendable way to build python classes to represent gridded and non-gridded geophysical data. It provides the basic structure to work with spherical and cartesian coordinates, and can be extended to data-specific objects by adding coordinates, data variables and logical masks. It also integrates with the geo-parameters module to provide easy access to metadata.
Quick Installation
To get started with geo-skeletons, you can install it with pip or conda:
$ pip install geo-skeletons
$ conda install -c conda-forge geo-skeletons
Using PointSkeletons
Creation
A point cloud can easily be represented as an unstructured skeleton:
from geo_skeletons import PointSkeleton
points = PointSkeleton(lon=(30.0,30.1,30.5), lat=(60.0,60.0,60.8))
Accessing coordinates
Coordinates are now accessible both in the (native) spherical and the (non-native) cartesian versions, where the best UTM-zone has automatically been deduced for the given lon/lat coordinates:
>>> points.lon()
array([30. , 30.1, 30.5])
>>> points.x()
array([332705.17887694, 338279.24910909, 363958.72298911])
>>> points.utm.zone()
(36, 'V')
>>> points.x(utm=(33,'W'))
array([1331808.13859715, 1337286.99102854, 1338117.44887216])
>>> points.utm.set((33,'W'))
Setting UTM (33, 'W')
>>> points.x()
array([1331808.13859715, 1337286.99102854, 1338117.44887216])
Both methods have a strict option that only returns the coordinates if matches the native standard for the skeleton and returns None otherwise:
>>> points.lon(strict=True)
array([30. , 30.1, 30.5])
>>> points.x(strict=True)
>>>
To get the native coordinates you can use the native option in either method:
>>> points.lon(native=True)
array([30. , 30.1, 30.5])
>>> points.x(native=True)
array([30. , 30.1, 30.5])
The strict and native options are implemented to make it easier to use skeletons inside larger modules, since it removes the need for a lot of checks.
The .lonlat() and .xy() methods gives a tuple with arrays of coordinates:
>>> points.lonlat()
(array([30. , 30.1, 30.5]), array([60. , 60. , 60.8]))
Underlying xarray Dataset structure
The skeleton information is stored in an xarray Dataset. This will be convenient when the skeleton is expanded by additional coordinates or variables:
>>> points
<PointSkeleton (Skeleton)>
------------------------------ Coordinate groups -------------------------------
Spatial: (inds)
Grid: (inds)
Gridpoint: *empty*
All: (inds)
------------------------------------ Xarray ------------------------------------
<xarray.Dataset> Size: 72B
Dimensions: (inds: 3)
Coordinates:
* inds (inds) int64 24B 0 1 2
Data variables:
lat (inds) float64 24B 60.0 60.0 60.8
lon (inds) float64 24B 30.0 30.1 30.5
Attributes:
name: LonelySkeleton
utm_zone: 36V
--------------------------------------------------------------------------------
>>> points.ds()
<xarray.Dataset> Size: 72B
Dimensions: (inds: 3)
Coordinates:
* inds (inds) int64 24B 0 1 2
Data variables:
lat (inds) float64 24B 60.0 60.0 60.8
lon (inds) float64 24B 30.0 30.1 30.5
Attributes:
name: LonelySkeleton
utm_zone: 36V
Since there is no gridded structure, these vectors are given as a function of indeces:
>>> points.inds()
array([0, 1, 2])
The size of the skeleton, defined by the indeces, is given by:
>>> points.size()
(3,)
However, the size of the x- and y-vectors are given by:
>>> points.nx()
3
>>> points.ny()
3
The core of the Skeleton will still keep track of lon/lat as coordinates (to differentiate from actual data variables that can be added)
Using GriddedSkeletons
Creation and setting spacing
Unlike a PointSkeleton, a GriddedSkeleton is defined on an area:
from geo_skeletons import GriddedSkeleton
grid = GriddedSkeleton(lon=(30.0,30.5), lat=(60.0,60.8))
A structure can be given gy setting a desired spacing. The basic method is to specify the number of grid points in each direction:
grid.set_spacing(nx=6, ny=9)
The spacing can also be set by defining a longitude/latitude spacing, and appoximate spacing in metres, or an approximate spacing in nautical miles:
grid.set_spacing(dlon=0.1, dlat=0.1)
grid.set_spacing(dx=6000, dy=8000) # 6 km resolution in longitude and 8 km resolution in latitude direction
grid.set_spacing(dm=8000) # Same as dx=dy=dm
grid.set_spacing(dnmi=0.5) # Half a nautical mile spacing
Since the grid has been defined by the edges, the desired spacing can sometimes only be approximated:
>>> grid.set_spacing(dlon=0.024, dlat=0.09)
>>> grid.dlon()
0.023809523809523808
>>> grid.dlat()
0.08888888888888857
If setting an exact spacing is more important than preserving the exact area, then this can be forced, and the area is changed slightly instead:
>>> grid = GriddedSkeleton(lon=(30.0,30.5), lat=(60.0,60.8))
>>> grid.edges('lon')
(30.0, 30.5)
>>> grid.edges('lat')
(60.0, 60.8)
>>> grid.set_spacing(dlon=0.024, dlat=0.09, floating_edge=True)
>>> grid.dlon()
0.024000000000000063
>>> grid.dlat()
0.09000000000000025
>>> grid.edges('lon')
(30.0, 30.504)
>>> grid.edges('lat')
(60.0, 60.81)
Accessing the coordinates
Setting the spacing creates longitude an latitude vectors:
>>> grid.set_spacing(nx=6, ny=9)
>>> grid.lon()
array([30. , 30.1, 30.2, 30.3, 30.4, 30.5])
>>> grid.lat()
array([60. , 60.1, 60.2, 60.3, 60.4, 60.5, 60.6, 60.7, 60.8])
Note, that these methods gives the vectors defining the grid, not the longitude and latitude coordinates of ALL the points (as for the PointSkeleton). Nonetheless, the .lonlat() method can be used:
>>> grid.lonlat()
(array([30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4,
30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3,
30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2,
30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1,
30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5]), array([60. , 60. , 60. , 60. , 60. , 60. , 60.1, 60.1, 60.1, 60.1, 60.1,
60.1, 60.2, 60.2, 60.2, 60.2, 60.2, 60.2, 60.3, 60.3, 60.3, 60.3,
60.3, 60.3, 60.4, 60.4, 60.4, 60.4, 60.4, 60.4, 60.5, 60.5, 60.5,
60.5, 60.5, 60.5, 60.6, 60.6, 60.6, 60.6, 60.6, 60.6, 60.7, 60.7,
60.7, 60.7, 60.7, 60.7, 60.8, 60.8, 60.8, 60.8, 60.8, 60.8]))
Therefore, a list of coordinates for all the points (regardless of which type of skeleton you have) can always be retrieved as:
lon, lat = grid.lonlat()
A longitude grid (meshgrid) can be retrieved both in spherical and cartesian coordinates:
>>> grid.longrid()
array([[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5],
[30. , 30.1, 30.2, 30.3, 30.4, 30.5]])
>>> grid.xgrid()
array([[332705.17887694, 338279.24910909, 343853.56603089,
349428.12110114, 355002.90578231, 360577.91154036],
[333210.54871541, 338767.76600796, 344325.23072873,
349882.93431135, 355440.86819304, 360999.02381454],
[333716.43160159, 339256.77897873, 344797.37450988,
350338.20960391, 355879.27567328, 361420.56413389],
[334222.82599447, 339746.28653178, 345269.9959361 ,
350793.94559198, 356318.12688754, 361842.53121431],
[334729.73035137, 340236.28717586, 345743.09356756,
351250.14088716, 356757.42049886, 362264.92377027],
[335237.14312789, 340726.77941807, 346216.66596286,
351706.79409952, 357197.15516879, 362687.74051482],
[335745.06277796, 341217.76176387, 346690.711679 ,
352163.90383761, 357637.32955744, 363110.98015957],
[336253.48775383, 341709.2327171 , 347165.2292714 ,
352621.46870845, 358077.9423234 , 363534.64141473],
[336762.41650608, 342201.19077999, 347640.21729394,
353079.48731754, 358518.99212383, 363958.72298911]])
As with the PointSkeleton, the GriddedSkeleton can also give its cartesian coordinates. However, since any UTM zone will be rotated in respect to the spherically defined structured grid, asking for the cartesian x-vector will cause a slight rotation. In other words, the same points can’t be reguratly gridded in both shperical and UTM spaces :
>>> grid.utm.zone()
(36, 'V')
>>> grid.x()
Regridding spherical grid to cartesian coordinates will cause a rotation! Use '_, y = skeleton.xy()' to get a list of all points.
array([334729.73035137, 340236.28717586, 345743.09356756, 351250.14088716,
356757.42049886, 362264.92377027])
To get the exact UTM coordinates of ALL the points, one can simply use:
x, y = grid.xy()
Underlying xarray Dataset structure
As with the PointSkeleton, the structure is in an xarray Dataset (but longitude and latitue vectors are now coordinates, not variables):
The size of the x- and y-vectors are given by:
>>> grid.nx()
6
>>> grid.ny()
9
The size of the skeleton, defined by the lon-lat vecotrs, is given by:
>>> grid.size()
(9, 6)
As an example, a cartesian PointSkeleton could be created from the spherical GriddedSkeleton as:
x, y = grid.xy()
points = PointSkeleton(x=x, y=y)
points.set_utm(grid.utm())
This now creates a new structure:
>>> points.ds()
<xarray.Dataset>
Dimensions: (inds: 54)
Coordinates:
* inds (inds) int64 0 1 2 3 4 5 6 7 8 9 ... 44 45 46 47 48 49 50 51 52 53
Data variables:
y (inds) float64 6.655e+06 6.655e+06 ... 6.743e+06 6.743e+06
x (inds) float64 3.327e+05 3.383e+05 3.439e+05 ... 3.585e+05 3.64e+05
Attributes:
utm_zone: 36V
Nonetheless, converting between different types of skeletons is usually not needed, since a list of all the points in UTM-coordinates can be extracted directly from the spherical GriddedSkeleton. In other words, the two following lines give the exact same result:
x, y = grid.xy()
x, y = points.xy()
The core of the Skeleton will still keep track of lon/lat as coordinates (to differentiate from actual data variables that can be added)
>>> grid.core.data_vars()
[]
>>> grid.core.coords()
['lat', 'lon']
>>> grid.core.coords('init') # A list of coordinates that are needed to initialize this class
['lat', 'lon']
Finding points
Skeleton classes are equipped with a dedicated method to find points:
>>> data = GriddedSkeleton(lon=(3,5), lat=(60,61))
>>> data.set_spacing(dm=1000)
>>> grid.yank_point(lon=2.98, lat=60.01)
{'inds_x': array([0]), 'inds_y': array([1]), 'dx': array([1120.6812202])}
>>> points = PointSkeleton(lon=(3.0, 4.0, 5.0), lat=(60.0, 60.0, 61.0))
>>> points.yank_point(lon=2.98, lat=60.01)
{'inds': array([0]), 'dx': array([1576.18628188])}
This gives the corresponding index and distance to nearest point (in metres). The method can also be used to find several points:
>>> points.yank_point(lon=(2.98, 4.1), lat=(60.01, 60.01))
{'inds': array([0, 1]), 'dx': array([1576.18628188, 5687.27546285])}
We can also find several closest points to one point:
>>> grid.yank_point(lon=2.98, lat=60.01, npoints=4)
{'inds_x': array([0, 0, 0, 1]), 'inds_y': array([1, 2, 0, 1]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801])}
If we know that a PointSkeleton structure has been raveled from a gridded structure, we can also find the gridded indeces after the fact:
>>> lon, lat = grid.lonlat()
>>> raveled_grid = PointSkeleton(lon=lon, lat=lat)
>>> raveled_grid.yank_point(lon=2.98, lat=60.01, npoints=4)
{'inds': array([111, 222, 0, 112]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801])}
>>> raveled_grid.yank_point(lon=2.98, lat=60.01, npoints=4, gridded_shape=grid.size())
{'inds': array([111, 222, 0, 112]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801]), 'inds_x': array([0, 0, 0, 1]), 'inds_y': array([1, 2, 0, 1])}
Expanding skeletons
Adding data variables
The real benefit from skeletons is that you can define your own objects while still retaining all the original methods that are defined to handle the spatial coordinates. As an example, lets define an object that contains gridded significant wave height (hs) data:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_datavar
@add_datavar(name='hs', default_value=0.)
class WaveHeight(GriddedSkeleton):
pass
The core of the Skeleton now keeps track of the data variable even though no data has been set.
A better way is to use the geo-parameters package to add data variables:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_datavar
import geo_parameters as gp
@add_datavar(name=gp.wave.Hs('hs'), default_value=0.)
class WaveHeight(GriddedSkeleton):
pass
>>> WaveHeight.core
------------------------------ Coordinate groups -------------------------------
Spatial: (y, x)
Grid: (y, x)
Gridpoint: *empty*
All: (y, x)
------------------------------------- Data -------------------------------------
Variables:
hs (y, x): 0.0 [m] sea_surface_wave_significant_height
Masks:
*empty*
Magnitudes:
*empty*
Directions:
*empty*
--------------------------------------------------------------------------------
- The
@add_datavardecorator does the following: adds a data variable names
hssets the default value for this variable to 0.0
creates a
.hs()method to access that variablecreates a method
.set_hs()that takes a numpy array
Using this new objects is now much like using the GriddedSkeleton, but the xarray Dataset now contains a data variable.
data = WaveHeight(lon=(3,5), lat=(60,61))
data.set_spacing(dm=1000)
>>> data.dx()
996.8080414102963
>>> data.dy()
1001.4167247779651
>>> data.hs() # Gives the default value
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
>>> data.hs(strict=True) # Returns 'None' if data has not been set
None
This new data variable is contained in the underlying xarray Dataset
>>> data.ds() # hs-not saved since it is not set
<xarray.Dataset> Size: 2kB
Dimensions: (lat: 112, lon: 111)
Coordinates:
* lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0
* lon (lon) float64 888B 3.0 3.018 3.036 3.055 ... 4.945 4.964 4.982 5.0
Data variables:
*empty*
Attributes:
name: LonelySkeleton
>>> data.set_hs(5.6)
>>> data.ds()
<xarray.Dataset> Size: 101kB
Dimensions: (lat: 112, lon: 111)
Coordinates:
* lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0
* lon (lon) float64 888B 3.0 3.018 3.036 3.055 ... 4.945 4.964 4.982 5.0
Data variables:
hs (lat, lon) float64 99kB 5.6 5.6 5.6 5.6 5.6 ... 5.6 5.6 5.6 5.6 5.6
Attributes:
name: LonelySkeleton
The newly created .hs() method works directly with the xarray Dataset, and same slicing etc. possibilities work out of the box
>>> data.hs(lon=slice(4,4.5))
array([[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6],
[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6],
[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6],
...,
[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6],
[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6],
[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6]])
>>> data.hs(lon=3)
array([5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6])
# Return xr.DataArray
# Not that metadata is added when using a geo-parameter to create the class
>>> data.hs(lon=2.98, method='nearest', data_array=True)
<xarray.DataArray 'hs' (lat: 112)> Size: 896B
array([5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6])
Coordinates:
* lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0
lon float64 8B 3.0
Attributes:
short_name: hs
long_name: significant_wave_height
standard_name: sea_surface_wave_significant_height
units: m
Adding additional coordinates
Although all skeletons will have the x-y or lon-lat spatial coordinates, decorators can be used to add additional coordinates that the possible data is defined on. As an example, lets create a structure to represent wind data on a spherical 3D grid:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_coord, add_datavar
@add_datavar(gp.wind.Wind, default_value=10.0)
@add_coord(name="z", grid_coord=True)
class WindSpeed(GriddedSkeleton):
pass
>>> WindSpeed.core
------------------------------ Coordinate groups -------------------------------
Spatial: (y, x)
Grid: (y, x, z)
Gridpoint: *empty*
All: (y, x, z)
------------------------------------- Data -------------------------------------
Variables:
ff (y, x, z): 10.0 [m/s] wind_speed
Masks:
*empty*
Magnitudes:
*empty*
Directions:
*empty*
--------------------------------------------------------------------------------
- The
@add_coorddecorator does the following: adds a coordinate named
zto signify the heightadds the requirement to provide the variable
zwhen initializing the skeletonadds a method
.z()to access the values of this coordinateadds a method
.set_z_spacing()to set the spacing of the coordinate (onlydxandnxkeywords possible)
import numpy as np
grid = WindSpeed(lon=(25, 30), lat=(58, 62), z=(0, 100))
grid.set_spacing(dnmi=1)
grid.set_z_spacing(dx=1)
new_data = np.random.rand(grid.ny(), grid.nx(), len(grid.z()))
grid.set_ff(new_data)
>>> grid
<WindSpeed (GriddedSkeleton)>
------------------------------ Coordinate groups -------------------------------
Spatial: (lat, lon)
Grid: (lat, lon, z)
Gridpoint: *empty*
All: (lat, lon, z)
------------------------------------ Xarray ------------------------------------
<xarray.Dataset> Size: 29MB
Dimensions: (lat: 241, lon: 151, z: 101)
Coordinates:
* lat (lat) float64 2kB 58.0 58.02 58.03 58.05 ... 61.95 61.97 61.98 62.0
* lon (lon) float64 1kB 25.0 25.03 25.07 25.1 ... 29.9 29.93 29.97 30.0
* z (z) float64 808B 0.0 1.0 2.0 3.0 4.0 ... 96.0 97.0 98.0 99.0 100.0
Data variables:
ff (lat, lon, z) float64 29MB 0.8337 0.1296 0.6254 ... 0.2336 0.1013
--------------------------------------------------------------------------------
If we want to have several variables that are defined on different coordinate, we have some flixibility by using coordinate groups. If we, e.g. want to add a data variable that defines the surface rougness we can let the z-coordinate default to beeing added as a gridpoint coordinate instead of grid coordinate. We can then define over which coordinate group the data variable is added:
@add_datavar("roughness", default_value=0.0, coord_group="grid")
@add_datavar(gp.wind.Wind, default_value=10.0, coord_group="all")
@add_coord(name="z", grid_coord=False)
class WindSpeed(GriddedSkeleton):
pass
>>> WindSpeed.core
------------------------------ Coordinate groups -------------------------------
Spatial: (y, x)
Grid: (y, x)
Gridpoint: (z)
All: (y, x, z)
------------------------------------- Data -------------------------------------
Variables:
ff (y, x, z): 10.0 [m/s] wind_speed
roughness (y, x): 0.0
Masks:
*empty*
Magnitudes:
*empty*
Directions:
*empty*
--------------------------------------------------------------------------------
Adding a time, frequency and direction variable
The adding of certain coordinates have dedicated method. As an example, let us use a wave spectrum:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_datavar, add_time, add_frequency, add_direction
import geo_parameters as gp
@add_datavar(gp.wave.Efth("spec"), coord_group="all")
@add_direction()
@add_frequency()
@add_time()
class Spectrum(GriddedSkeleton):
pass
Spectrum.core
------------------------------ Coordinate groups -------------------------------
Spatial: (y, x)
Grid: (time, y, x)
Gridpoint: (freq, dirs)
All: (time, y, x, freq, dirs)
------------------------------------- Data -------------------------------------
Variables:
spec (time, y, x, freq, dirs): 0.0 [m**2*s/rad] sea_surface_wave_directional_variance_spectral_density
Masks:
*empty*
Magnitudes:
*empty*
Directions:
*empty*
--------------------------------------------------------------------------------
We can see that by default time is added as a grid coordinate, while frequency and direction are added as gridpoint coordinates. This doesn’t affect the spectrum directly, since it is still defined over all coordinates. However, we can think of grid coordinates as “outer” coordinates, defining the grid on which we have a certain object that is defined over “inner” (gridpoint) coordinates.
data = Spectrum(
lon=(10, 20),
lat=(50, 60),
freq=np.arange(0, 1, 0.1),
dirs=np.arange(0, 360, 10),
time=("2020-01-01 00:00", "2020-01-02 00:00", "6h"),
)
The add_frequency and add_direction creates method that can give out angular frequencies or the directions in radians
>>> data.freq()
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
>>> data.freq(angular=True)
array([0. , 0.62831853, 1.25663706, 1.88495559, 2.51327412,
3.14159265, 3.76991118, 4.39822972, 5.02654825, 5.65486678])
>>> data.dirs()
array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120,
130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250,
260, 270, 280, 290, 300, 310, 320, 330, 340, 350])
>>> data.dirs(angular=True)
array([0. , 0.17453293, 0.34906585, 0.52359878, 0.6981317 ,
0.87266463, 1.04719755, 1.22173048, 1.3962634 , 1.57079633,
1.74532925, 1.91986218, 2.0943951 , 2.26892803, 2.44346095,
2.61799388, 2.7925268 , 2.96705973, 3.14159265, 3.31612558,
3.4906585 , 3.66519143, 3.83972435, 4.01425728, 4.1887902 ,
4.36332313, 4.53785606, 4.71238898, 4.88692191, 5.06145483,
5.23598776, 5.41052068, 5.58505361, 5.75958653, 5.93411946,
6.10865238])
Useing the add_time decorator adds some time-specific functionality to the Skeleton:
>>> data.time()
DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 06:00:00',
'2020-01-01 12:00:00', '2020-01-01 18:00:00',
'2020-01-02 00:00:00'],
dtype='datetime64[ns]', freq=None)
>>> data.time(datetime=False)
['2020-01-01 00:00:00', '2020-01-01 06:00:00', '2020-01-01 12:00:00', '2020-01-01 18:00:00', '2020-01-02 00:00:00']
>>> data.time(datetime=False, fmt='%Y%m%d')
['20200101', '20200101', '20200101', '20200101', '20200102']
To get only one time stamp for each day (or year or month or hour):
>>> data.days() # or .years(), .months(), .hours()
DatetimeIndex(['2020-01-01', '2020-01-02'], dtype='datetime64[ns]', freq=None)
>>> data.days(datetime=False)
['2020-01-01', '2020-01-02']
>>> data.days(datetime=False, fmt='%Y%m%d')
['20200101', '20200102']
We can now also get the size of a single spectral object by looking at the size defined by the gridpoint coordinates
>>> data.size('gridpoint')
(10, 36)
The coordinate groups also determines the order of iteration (iteration this way is easy, but slow for larger data sets):
for one_spec in data:
pass
>>> one_spec
<Spectrum (GriddedSkeleton)>
------------------------------ Coordinate groups -------------------------------
Spatial: (lat, lon)
Grid: (time, lat, lon)
Gridpoint: (freq, dirs)
All: (time, lat, lon, freq, dirs)
------------------------------------ Xarray ------------------------------------
<xarray.Dataset> Size: 392B
Dimensions: (time: 1, lat: 1, lon: 1, freq: 10, dirs: 36)
Coordinates:
* time (time) datetime64[ns] 8B 2020-01-02
* lat (lat) int64 8B 60
* lon (lon) int64 8B 20
* freq (freq) float64 80B 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
* dirs (dirs) int64 288B 0 10 20 30 40 50 60 ... 300 310 320 330 340 350
Data variables:
*empty*
Attributes:
name: LonelySkeleton
---------------------------------- Empty data ----------------------------------
Empty variables:
spec (time, lat, lon, freq, dirs): 0.0 [m**2*s/rad] sea_surface_wave_directional_variance_spectral_density
--------------------------------------------------------------------------------
>>> one_spec.spec().shape # The method automatically squeezes out trivial dimensions
(10, 36)
>>> spec.spec(squeeze=False).shape
(1, 1, 1, 10, 36)
>>> one_spec.set_spec(one_spec.spec()*2) # Automatically expands trivial dimensions
Setting data
The Skeletons set-method is helpful if the data at hand does not conform exactly to the format specified in the Skeleton. First, the Skeleton can squeeze and unsqueeze trivial dimensions at will. Let’s define a Skeleton for significant wave height from a wave forecast:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_datavar, add_time
import geo_parameters as gp
@add_datavar(gp.wave.Hs)
@add_time()
class WaveForecast(GriddedSkeleton):
pass
data = WaveForecast(
lon=(0, 10),
lat=(50, 60),
time=("2020-01-01 00:00", "2020-01-03 00:00", "3h"),
)
data.set_spacing(nx=11, ny=21)
The hs-variable now has a given shape:
>>> data.shape('hs')
(17, 21, 11) # time, lat, lon
Let us say we get some data that is taken from an ensemble forecast, but cut to only the deterministic member, but that dimension has not been squeezed out:
>>> eps_hs.shape
(17, 1, 21, 11)
The Skeleton set-method can automatically get rid of this extra trivial dimension:
data.set_hs(eps_hs) # Or data.set('hs', eps_hs)
This could have been done by np.squeeze, but doing that can cause troubles if we have other trivial dimensions, such as only one time.
The Skeleton can also re-arange non-trivial dimensions. Say that we get data from another forecast that has dimensions ['lon','lat','ensembe','time'] (Skeleton has ['time','lat','lon']). We can then specify the order in the set method:
>>> eps2_hs.shape
(11, 21, 1, 17)
>>> data.set_hs(eps2_hs)
*** geo_skeletons.errors.DataWrongDimensionError: Data has shape (11, 21, 1, 17), but coordinates define a shape (17, 21, 11)!!!
>>> data.set_hs(eps2_hs, coords=['lon', 'lat', 'time']) # Not that we only need to specify order of non-trivial dimensions
>>> data.shape('hs')
(17, 21, 11)
If we give in an xr.DataArray instead of a np.array, then the order of the coordinates wil be parsed from the metadata in the DataArray if possible, but this can always be overridden by using the coords keyword.
If we only have two dimensions, then we can allow the Skeleton to try a transpose of the data:
@add_datavar(gp.wave.Hs)
class WaveData(GriddedSkeleton):
pass
grid = WaveData(lon=(10, 20), lat=(50, 60))
grid.set_spacing(nx=10, ny=20)
data = np.ones((10, 20))
>>> grid.set_hs(data)
*** geo_skeletons.errors.DataWrongDimensionError: Data has shape (10, 20), but coordinates define a shape (20, 10)!!!
>>> grid.set_hs(data, allow_transpose=True)
>>> np.mean(grid.hs())
np.float64(1.0)
As previously, trivial dimensions are squeezed out prior to the transpose:
>>> data = np.full((1,10,1,20,1), 5.5)
>> grid.set_hs(data)
*** geo_skeletons.errors.DataWrongDimensionError: Data has shape (1, 10, 1, 20, 1), but coordinates define a shape (20, 10)!!!
>>> grid.set_hs(data, allow_transpose=True)
>>> np.mean(grid.hs())
np.float64(5.5)
Adding magnitudes and directions
The recommended way to add magnitudes and directions are to add the components as data variables and add a magnitude and direction connected to those components:
from geo_skeletons import GriddedSkeleton
from geo_skeletons.decorators import add_datavar, add_magnitude
import geo_parameters as gp
@add_magnitude(gp.wind.Wind("u"), x="ux", y="uy", direction=gp.wind.WindDir("ud"))
@add_datavar(gp.wind.YWind("uy"))
@add_datavar(gp.wind.XWind("ux"))
class Wind(GriddedSkeleton):
pass
>>> Wind.core
------------------------------ Coordinate groups -------------------------------
Spatial: (y, x)
Grid: (y, x)
Gridpoint: *empty*
All: (y, x)
------------------------------------- Data -------------------------------------
Variables:
ux (y, x): 0.1 [m/s] x_wind
uy (y, x): 0.1 [m/s] y_wind
Masks:
*empty*
Magnitudes:
u: magnitude of (ux,uy) [m/s] wind_speed
Directions:
ud: direction of (ux,uy) [deg] wind_from_direction
--------------------------------------------------------------------------------
- The
@add_magnitudedecorator does the following: adds a magnitude
ufor the componentsuxanduyadds a direction (optional)
udfor the componentsuxanduyparses the standard_name of gp.wind.WindDir to identify that this is a ‘from’ direction
creates
.u()and.ud()methods to access the magnitude and directioncreates
.set_u()and.set_ud()methods to set the magnitude and direction
Now we can still set the wind speed and direction, but only the components will be stored in the Dataset (obviously the components can also still be set and accessed normally):
>>> data = Wind(lon=(10, 14), lat=(50, 60))
>>> data.set_u(10)
>>> data.set_ud(45)
>>> data
<Wind (GriddedSkeleton)>
------------------------------ Coordinate groups -------------------------------
Spatial: (lat, lon)
Grid: (lat, lon)
Gridpoint: *empty*
All: (lat, lon)
------------------------------------ Xarray ------------------------------------
<xarray.Dataset> Size: 96B
Dimensions: (lat: 2, lon: 2)
Coordinates:
* lat (lat) int64 16B 50 60
* lon (lon) int64 16B 10 14
Data variables:
ux (lat, lon) float64 32B -7.071 -7.071 -7.071 -7.071
uy (lat, lon) float64 32B -7.071 -7.071 -7.071 -7.071
Attributes:
name: LonelySkeleton
-------------------------- Magnitudes and directions ---------------------------
u: magnitude of (ux,uy) [m/s] wind_speed
ud: direction of (ux,uy) [deg] wind_from_direction
--------------------------------------------------------------------------------
The directionality of the wind was parsed from the standard name of the geo-parameter gp.wind.WindDir. Therefore the components could be set properly and unambiguously. The direction can be set and retrieved in different directional conventions:
>>> data.set_ud(270, dir_type='to')
>>> data.ud(dir_type='from')
array([[90., 90.],
[90., 90.]])
>>> data.ud(dir_type='to')
array([[270., 270.],
[270., 270.]])
>>> data.ud(dir_type='math')
array([[3.14159265, 3.14159265],
[3.14159265, 3.14159265]])
This same functionality is also present for individual parameters of directional nature (that typically don’t use components):
# This is direction from, as determined by the standard name
# To get direction to, use gp.wave.DirpTo
@add_datavar(gp.wave.Dirp)
class Wave(GriddedSkeleton):
pass
>>> data.core
------------------------------ Coordinate groups -------------------------------
Spatial: (lat, lon)
Grid: (lat, lon)
Gridpoint: *empty*
All: (lat, lon)
------------------------------------- Data -------------------------------------
Variables:
dirp (lat, lon): 0.0 [deg] sea_surface_wave_from_direction_at_variance_spectral_density_maximum
Masks:
*empty*
Magnitudes:
*empty*
Directions:
*empty*
--------------------------------------------------------------------------------
data = Wave(lon=(10, 14), lat=(50, 60))
data.set_dirp(45, dir_type="to")
>>> data.dirp() # default dir_type is the one that the parameter is, in this case 'from'
array([[225., 225.],
[225., 225.]])
Adding masks
Logical masks (for example marking land points or points of interest) can be added to the skeletons. To for example add a land-sea mask to a gridded significant wave height:
from geo_skeleton.decorators import add_datavar, add_mask
import geo_parameters as gp
@add_datavar(gp.wave.Hs("hs"), default_value=0.0)
@add_mask(
name="sea",
default_value=0,
coord_group="grid",
opposite_name="land",
)
class WaveHeight(GriddedSkeleton):
pass
The geo-skeleton now has a land-sea mask. Only the sea mask is stored in the xr.Dataset and the land mask is calculated as the inverse of the sea mask.
>>> grid = WaveHeight(lon=(0, 1), lat=(60, 61), name="Wavegrid")
>>> grid
<WaveHeight (GriddedSkeleton)>
------------------------------ Coordinate groups -------------------------------
Spatial: (lat, lon)
Grid: (lat, lon)
Gridpoint: *empty*
All: (lat, lon)
------------------------------------ Xarray ------------------------------------
<xarray.Dataset> Size: 32B
Dimensions: (lat: 2, lon: 2)
Coordinates:
* lat (lat) int64 16B 60 61
* lon (lon) int64 16B 0 1
Data variables:
*empty*
Attributes:
name: Wavegrid
---------------------------------- Empty data ----------------------------------
Empty variables:
hs (lat, lon): 0.0 [m] sea_surface_wave_significant_height
Empty masks:
sea_mask (lat, lon): False
land_mask (lat, lon): True
--------------------------------------------------------------------------------
Both are still gettable, and also the points corresponding to the masks have methods:
>>> grid.sea_mask()
array([[False, False],
[False, False]])
>>> grid.land_mask()
array([[ True, True],
[ True, True]])
>>> grid.sea_points()
(array([], dtype=int64), array([], dtype=int64))
>>> grid.land_points()
(array([0, 1, 0, 1]), array([60, 60, 61, 61]))
We can also set either the land mask or the sea mask, and that defines boths masks, since they are per definition opposites of each other:
>>> grid.set_sea_mask([[True, False], [True, False]])
>>> grid.sea_mask()
array([[ True, False],
[ True, False]])
>>> grid.sea_points()
(array([0, 0]), array([60, 61]))
>>> grid.land_mask()
array([[False, True],
[False, True]])
>>> grid.land_points()
(array([1, 1]), array([60, 61]))
It is also possible to connect the masks to a certain data variable. E.g. to have only positive significant wave height values to be sea points:
@add_datavar(gp.wave.Hs("hs"), default_value=0.0)
@add_mask(
name="sea",
default_value=0,
coord_group="grid",
opposite_name="land",
triggered_by="hs",
valid_range=(0, None),
range_inclusive=False,
)
class WaveHeight(GriddedSkeleton):
pass
Here triggered_by means that setting the data variable hs sets the land-sea masks. valid_range means the hs values that define the primary mask (here sea_mask), and range_inclusive=False means that 0 does not become a sea point:
>>> grid.set_hs([[0,1],[2,-999]])
>>> grid.hs()
array([[ 0, 1],
[ 2, -999]])
>>> grid.sea_mask()
array([[False, True],
[ True, False]])
>>> grid.land_points()
(array([0, 1]), array([60, 61]))
The masks are still settable independently if needed (e.g. if onw wants to apply a separate land mask after the fact), but if they are not set then they are always “in sync” with the significant wave height data.
Plotting the data
Skeletons don’t have any plotting functionality built in, but since it wraps around xarray datasets, the xarray plotting functions can be used. We can take as an example the wind class defined in the “Adding magnitudes and directions” section. Note, that even though the wind speed (added as a magnitude of the components) is not stored in the underlying xr.Dataset, we can still get it as a DataArray and plot it:
import numpy as np
import matplotlib.pyplot as plt
data = Wind(lon=(0, 10), lat=(50, 60))
data.set_spacing(dm=1000)
new_data = np.random.rand(*data.size())
data.set_u(new_data)
data.u(data_array=True).plot()
plt.show()
Here, the data_array=True tells the method to return the xarray data array instead of a numpy array of the values.