Welcome to skeletons's documentation! ===================================== **Geo-skeletons** is an easy extendable way to build python classes to represent gridded and non-gridded geophysical data. It provides the basic structure to work with spherical and cartesian coordinates, and can be extended to data-specific objects by adding coordinates, data variables and logical masks. It also integrates with the geo-parameters module to provide easy access to metadata. Quick Installation ============================================= To get started with geo-skeletons, you can install it with pip or conda: .. code-block:: shell $ pip install geo-skeletons .. code-block:: shell $ conda install -c conda-forge geo-skeletons Using PointSkeletons ============================================= Creation --------------------------------------------- A point cloud can easily be represented as an unstructured skeleton: .. code-block:: python from geo_skeletons import PointSkeleton points = PointSkeleton(lon=(30.0,30.1,30.5), lat=(60.0,60.0,60.8)) Accessing coordinates --------------------------------------------- Coordinates are now accessible both in the (native) spherical and the (non-native) cartesian versions, where the best UTM-zone has automatically been deduced for the given lon/lat coordinates: .. code-block:: python >>> points.lon() array([30. , 30.1, 30.5]) >>> points.x() array([332705.17887694, 338279.24910909, 363958.72298911]) >>> points.utm.zone() (36, 'V') >>> points.x(utm=(33,'W')) array([1331808.13859715, 1337286.99102854, 1338117.44887216]) >>> points.utm.set((33,'W')) Setting UTM (33, 'W') >>> points.x() array([1331808.13859715, 1337286.99102854, 1338117.44887216]) Both methods have a ``strict`` option that only returns the coordinates if matches the native standard for the skeleton and returns ``None`` otherwise: .. code-block:: python >>> points.lon(strict=True) array([30. , 30.1, 30.5]) >>> points.x(strict=True) >>> To get the native coordinates you can use the ``native`` option in either method: .. code-block:: python >>> points.lon(native=True) array([30. , 30.1, 30.5]) >>> points.x(native=True) array([30. , 30.1, 30.5]) The ``strict`` and ``native`` options are implemented to make it easier to use skeletons inside larger modules, since it removes the need for a lot of checks. The .lonlat() and .xy() methods gives a tuple with arrays of coordinates: .. code-block:: python >>> points.lonlat() (array([30. , 30.1, 30.5]), array([60. , 60. , 60.8])) Underlying xarray Dataset structure -------------------------------------------- The skeleton information is stored in an xarray Dataset. This will be convenient when the skeleton is expanded by additional coordinates or variables: .. code-block:: python >>> points ------------------------------ Coordinate groups ------------------------------- Spatial: (inds) Grid: (inds) Gridpoint: *empty* All: (inds) ------------------------------------ Xarray ------------------------------------ Size: 72B Dimensions: (inds: 3) Coordinates: * inds (inds) int64 24B 0 1 2 Data variables: lat (inds) float64 24B 60.0 60.0 60.8 lon (inds) float64 24B 30.0 30.1 30.5 Attributes: name: LonelySkeleton utm_zone: 36V -------------------------------------------------------------------------------- >>> points.ds() Size: 72B Dimensions: (inds: 3) Coordinates: * inds (inds) int64 24B 0 1 2 Data variables: lat (inds) float64 24B 60.0 60.0 60.8 lon (inds) float64 24B 30.0 30.1 30.5 Attributes: name: LonelySkeleton utm_zone: 36V Since there is no gridded structure, these vectors are given as a function of indeces: .. code-block:: python >>> points.inds() array([0, 1, 2]) The size of the skeleton, defined by the indeces, is given by: .. code-block:: python >>> points.size() (3,) However, the size of the *x- and y-vectors* are given by: .. code-block:: python >>> points.nx() 3 >>> points.ny() 3 The core of the Skeleton will still keep track of lon/lat as coordinates (to differentiate from actual data variables that can be added) .. code-block:: python >>> points.core.data_vars() # 'lon' and 'lat' not included, since they are not "proper" data variables [] >>> points.core.coords() ['inds'] >>> points.core.coords('init') # A list of coordinates that are needed to initialize this class ['lat', 'lon'] Using GriddedSkeletons ============================================= Creation and setting spacing --------------------------------------------- Unlike a PointSkeleton, a GriddedSkeleton is defined on an area: .. code-block:: python from geo_skeletons import GriddedSkeleton grid = GriddedSkeleton(lon=(30.0,30.5), lat=(60.0,60.8)) A structure can be given gy setting a desired spacing. The basic method is to specify the number of grid points in each direction: .. code-block:: python grid.set_spacing(nx=6, ny=9) The spacing can also be set by defining a longitude/latitude spacing, and appoximate spacing in metres, or an approximate spacing in nautical miles: .. code-block:: python grid.set_spacing(dlon=0.1, dlat=0.1) grid.set_spacing(dx=6000, dy=8000) # 6 km resolution in longitude and 8 km resolution in latitude direction grid.set_spacing(dm=8000) # Same as dx=dy=dm grid.set_spacing(dnmi=0.5) # Half a nautical mile spacing Since the grid has been defined by the edges, the desired spacing can sometimes only be approximated: .. code-block:: python >>> grid.set_spacing(dlon=0.024, dlat=0.09) >>> grid.dlon() 0.023809523809523808 >>> grid.dlat() 0.08888888888888857 If setting an exact spacing is more important than preserving the exact area, then this can be forced, and the area is changed slightly instead: .. code-block:: python >>> grid = GriddedSkeleton(lon=(30.0,30.5), lat=(60.0,60.8)) >>> grid.edges('lon') (30.0, 30.5) >>> grid.edges('lat') (60.0, 60.8) >>> grid.set_spacing(dlon=0.024, dlat=0.09, floating_edge=True) >>> grid.dlon() 0.024000000000000063 >>> grid.dlat() 0.09000000000000025 >>> grid.edges('lon') (30.0, 30.504) >>> grid.edges('lat') (60.0, 60.81) Accessing the coordinates --------------------------------------------- Setting the spacing creates longitude an latitude vectors: .. code-block:: python >>> grid.set_spacing(nx=6, ny=9) >>> grid.lon() array([30. , 30.1, 30.2, 30.3, 30.4, 30.5]) >>> grid.lat() array([60. , 60.1, 60.2, 60.3, 60.4, 60.5, 60.6, 60.7, 60.8]) Note, that these methods gives the vectors defining the grid, **not** the longitude and latitude coordinates of ALL the points (as for the PointSkeleton). Nonetheless, the ``.lonlat()`` method can be used: .. code-block:: python >>> grid.lonlat() (array([30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5, 30. , 30.1, 30.2, 30.3, 30.4, 30.5]), array([60. , 60. , 60. , 60. , 60. , 60. , 60.1, 60.1, 60.1, 60.1, 60.1, 60.1, 60.2, 60.2, 60.2, 60.2, 60.2, 60.2, 60.3, 60.3, 60.3, 60.3, 60.3, 60.3, 60.4, 60.4, 60.4, 60.4, 60.4, 60.4, 60.5, 60.5, 60.5, 60.5, 60.5, 60.5, 60.6, 60.6, 60.6, 60.6, 60.6, 60.6, 60.7, 60.7, 60.7, 60.7, 60.7, 60.7, 60.8, 60.8, 60.8, 60.8, 60.8, 60.8])) Therefore, a list of coordinates for all the points (regardless of which type of skeleton you have) can always be retrieved as: .. code-block:: python lon, lat = grid.lonlat() A longitude grid (meshgrid) can be retrieved both in spherical and cartesian coordinates: .. code-block:: python >>> grid.longrid() array([[30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5], [30. , 30.1, 30.2, 30.3, 30.4, 30.5]]) >>> grid.xgrid() array([[332705.17887694, 338279.24910909, 343853.56603089, 349428.12110114, 355002.90578231, 360577.91154036], [333210.54871541, 338767.76600796, 344325.23072873, 349882.93431135, 355440.86819304, 360999.02381454], [333716.43160159, 339256.77897873, 344797.37450988, 350338.20960391, 355879.27567328, 361420.56413389], [334222.82599447, 339746.28653178, 345269.9959361 , 350793.94559198, 356318.12688754, 361842.53121431], [334729.73035137, 340236.28717586, 345743.09356756, 351250.14088716, 356757.42049886, 362264.92377027], [335237.14312789, 340726.77941807, 346216.66596286, 351706.79409952, 357197.15516879, 362687.74051482], [335745.06277796, 341217.76176387, 346690.711679 , 352163.90383761, 357637.32955744, 363110.98015957], [336253.48775383, 341709.2327171 , 347165.2292714 , 352621.46870845, 358077.9423234 , 363534.64141473], [336762.41650608, 342201.19077999, 347640.21729394, 353079.48731754, 358518.99212383, 363958.72298911]]) As with the PointSkeleton, the GriddedSkeleton can also give its cartesian coordinates. However, since any UTM zone will be rotated in respect to the spherically defined structured grid, asking for the cartesian x-vector will cause a slight rotation. In other words, the same points can't be reguratly gridded in both shperical and UTM spaces : .. code-block:: python >>> grid.utm.zone() (36, 'V') >>> grid.x() Regridding spherical grid to cartesian coordinates will cause a rotation! Use '_, y = skeleton.xy()' to get a list of all points. array([334729.73035137, 340236.28717586, 345743.09356756, 351250.14088716, 356757.42049886, 362264.92377027]) To get the **exact** UTM coordinates of ALL the points, one can simply use: .. code-block:: python x, y = grid.xy() Underlying xarray Dataset structure -------------------------------------------- As with the PointSkeleton, the structure is in an xarray Dataset (but longitude and latitue vectors are now coordinates, not variables): .. code-block:: python >>> grid ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (lat, lon) Gridpoint: *empty* All: (lat, lon) ------------------------------------ Xarray ------------------------------------ Size: 120B Dimensions: (lat: 9, lon: 6) Coordinates: * lat (lat) float64 72B 60.0 60.1 60.2 60.3 60.4 60.5 60.6 60.7 60.8 * lon (lon) float64 48B 30.0 30.1 30.2 30.3 30.4 30.5 Data variables: *empty* Attributes: name: LonelySkeleton -------------------------------------------------------------------------------- >>> grid.ds() Size: 120B Dimensions: (lat: 9, lon: 6) Coordinates: * lat (lat) float64 72B 60.0 60.1 60.2 60.3 60.4 60.5 60.6 60.7 60.8 * lon (lon) float64 48B 30.0 30.1 30.2 30.3 30.4 30.5 Data variables: *empty* Attributes: name: LonelySkeleton The size of the x- and y-vectors are given by: .. code-block:: python >>> grid.nx() 6 >>> grid.ny() 9 The size of the skeleton, defined by the lon-lat vecotrs, is given by: .. code-block:: python >>> grid.size() (9, 6) As an example, a cartesian PointSkeleton *could* be created from the spherical GriddedSkeleton as: .. code-block:: python x, y = grid.xy() points = PointSkeleton(x=x, y=y) points.set_utm(grid.utm()) This now creates a new structure: .. code-block:: python >>> points.ds() Dimensions: (inds: 54) Coordinates: * inds (inds) int64 0 1 2 3 4 5 6 7 8 9 ... 44 45 46 47 48 49 50 51 52 53 Data variables: y (inds) float64 6.655e+06 6.655e+06 ... 6.743e+06 6.743e+06 x (inds) float64 3.327e+05 3.383e+05 3.439e+05 ... 3.585e+05 3.64e+05 Attributes: utm_zone: 36V Nonetheless, converting between different types of skeletons is usually not needed, since a list of all the points in UTM-coordinates can be extracted directly from the spherical GriddedSkeleton. In other words, the two following lines give the exact same result: .. code-block:: python x, y = grid.xy() x, y = points.xy() The core of the Skeleton will still keep track of lon/lat as coordinates (to differentiate from actual data variables that can be added) .. code-block:: python >>> grid.core.data_vars() [] >>> grid.core.coords() ['lat', 'lon'] >>> grid.core.coords('init') # A list of coordinates that are needed to initialize this class ['lat', 'lon'] Finding points ============================================= Skeleton classes are equipped with a dedicated method to find points: .. code-block:: python >>> data = GriddedSkeleton(lon=(3,5), lat=(60,61)) >>> data.set_spacing(dm=1000) >>> grid.yank_point(lon=2.98, lat=60.01) {'inds_x': array([0]), 'inds_y': array([1]), 'dx': array([1120.6812202])} >>> points = PointSkeleton(lon=(3.0, 4.0, 5.0), lat=(60.0, 60.0, 61.0)) >>> points.yank_point(lon=2.98, lat=60.01) {'inds': array([0]), 'dx': array([1576.18628188])} This gives the corresponding index and distance to nearest point (in metres). The method can also be used to find several points: .. code-block:: python >>> points.yank_point(lon=(2.98, 4.1), lat=(60.01, 60.01)) {'inds': array([0, 1]), 'dx': array([1576.18628188, 5687.27546285])} We can also find several closest points to one point: .. code-block:: python >>> grid.yank_point(lon=2.98, lat=60.01, npoints=4) {'inds_x': array([0, 0, 0, 1]), 'inds_y': array([1, 2, 0, 1]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801])} If we know that a PointSkeleton structure has been raveled from a gridded structure, we can also find the gridded indeces after the fact: .. code-block:: python >>> lon, lat = grid.lonlat() >>> raveled_grid = PointSkeleton(lon=lon, lat=lat) >>> raveled_grid.yank_point(lon=2.98, lat=60.01, npoints=4) {'inds': array([111, 222, 0, 112]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801])} >>> raveled_grid.yank_point(lon=2.98, lat=60.01, npoints=4, gridded_shape=grid.size()) {'inds': array([111, 222, 0, 112]), 'dx': array([1120.6812202 , 1428.55452856, 1576.18628188, 2131.94091801]), 'inds_x': array([0, 0, 0, 1]), 'inds_y': array([1, 2, 0, 1])} Expanding **skeletons** ============================================= Adding data variables -------------------------------------------- The real benefit from skeletons is that you can define your own objects while still retaining all the original methods that are defined to handle the spatial coordinates. As an example, lets define an object that contains gridded significant wave height (hs) data: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_datavar @add_datavar(name='hs', default_value=0.) class WaveHeight(GriddedSkeleton): pass The core of the Skeleton now keeps track of the data variable even though no data has been set. .. code-block:: python >>> WaveHeight.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (y, x) Gridpoint: *empty* All: (y, x) ------------------------------------- Data ------------------------------------- Variables: hs (y, x): 0.0 Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- A better way is to use the geo-parameters package to add data variables: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_datavar import geo_parameters as gp @add_datavar(name=gp.wave.Hs('hs'), default_value=0.) class WaveHeight(GriddedSkeleton): pass .. code-block:: python >>> WaveHeight.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (y, x) Gridpoint: *empty* All: (y, x) ------------------------------------- Data ------------------------------------- Variables: hs (y, x): 0.0 [m] sea_surface_wave_significant_height Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- The ``@add_datavar`` decorator does the following: * adds a data variable names ``hs`` * sets the default value for this variable to 0.0 * creates a ``.hs()`` method to access that variable * creates a method ``.set_hs()`` that takes a numpy array Using this new objects is now much like using the GriddedSkeleton, but the xarray Dataset now contains a data variable. .. code-block:: python data = WaveHeight(lon=(3,5), lat=(60,61)) data.set_spacing(dm=1000) >>> data.dx() 996.8080414102963 >>> data.dy() 1001.4167247779651 >>> data.hs() # Gives the default value array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], >>> data.hs(strict=True) # Returns 'None' if data has not been set None This new data variable is contained in the underlying xarray Dataset .. code-block:: python >>> data.ds() # hs-not saved since it is not set Size: 2kB Dimensions: (lat: 112, lon: 111) Coordinates: * lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0 * lon (lon) float64 888B 3.0 3.018 3.036 3.055 ... 4.945 4.964 4.982 5.0 Data variables: *empty* Attributes: name: LonelySkeleton .. code-block:: python >>> data.set_hs(5.6) >>> data.ds() Size: 101kB Dimensions: (lat: 112, lon: 111) Coordinates: * lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0 * lon (lon) float64 888B 3.0 3.018 3.036 3.055 ... 4.945 4.964 4.982 5.0 Data variables: hs (lat, lon) float64 99kB 5.6 5.6 5.6 5.6 5.6 ... 5.6 5.6 5.6 5.6 5.6 Attributes: name: LonelySkeleton The newly created ``.hs()`` method works directly with the xarray Dataset, and same slicing etc. possibilities work out of the box .. code-block:: python >>> data.hs(lon=slice(4,4.5)) array([[5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6], [5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6], [5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6], ..., [5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6], [5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6], [5.6, 5.6, 5.6, ..., 5.6, 5.6, 5.6]]) >>> data.hs(lon=3) array([5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6]) # Return xr.DataArray # Not that metadata is added when using a geo-parameter to create the class >>> data.hs(lon=2.98, method='nearest', data_array=True) Size: 896B array([5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6]) Coordinates: * lat (lat) float64 896B 60.0 60.01 60.02 60.03 ... 60.98 60.99 61.0 lon float64 8B 3.0 Attributes: short_name: hs long_name: significant_wave_height standard_name: sea_surface_wave_significant_height units: m Adding additional coordinates -------------------------------------------- Although all skeletons will have the x-y or lon-lat spatial coordinates, decorators can be used to add additional coordinates that the possible data is defined on. As an example, lets create a structure to represent wind data on a spherical 3D grid: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_coord, add_datavar @add_datavar(gp.wind.Wind, default_value=10.0) @add_coord(name="z", grid_coord=True) class WindSpeed(GriddedSkeleton): pass .. code-block:: python >>> WindSpeed.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (y, x, z) Gridpoint: *empty* All: (y, x, z) ------------------------------------- Data ------------------------------------- Variables: ff (y, x, z): 10.0 [m/s] wind_speed Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- The ``@add_coord`` decorator does the following: * adds a coordinate named ``z`` to signify the height * adds the requirement to provide the variable ``z`` when initializing the skeleton * adds a method ``.z()`` to access the values of this coordinate * adds a method ``.set_z_spacing()`` to set the spacing of the coordinate (only ``dx`` and ``nx`` keywords possible) .. code-block:: python import numpy as np grid = WindSpeed(lon=(25, 30), lat=(58, 62), z=(0, 100)) grid.set_spacing(dnmi=1) grid.set_z_spacing(dx=1) new_data = np.random.rand(grid.ny(), grid.nx(), len(grid.z())) grid.set_ff(new_data) .. code-block:: python >>> grid ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (lat, lon, z) Gridpoint: *empty* All: (lat, lon, z) ------------------------------------ Xarray ------------------------------------ Size: 29MB Dimensions: (lat: 241, lon: 151, z: 101) Coordinates: * lat (lat) float64 2kB 58.0 58.02 58.03 58.05 ... 61.95 61.97 61.98 62.0 * lon (lon) float64 1kB 25.0 25.03 25.07 25.1 ... 29.9 29.93 29.97 30.0 * z (z) float64 808B 0.0 1.0 2.0 3.0 4.0 ... 96.0 97.0 98.0 99.0 100.0 Data variables: ff (lat, lon, z) float64 29MB 0.8337 0.1296 0.6254 ... 0.2336 0.1013 -------------------------------------------------------------------------------- If we want to have several variables that are defined on different coordinate, we have some flixibility by using coordinate groups. If we, e.g. want to add a data variable that defines the surface rougness we can let the z-coordinate default to beeing added as a ``gridpoint`` coordinate instead of ``grid`` coordinate. We can then define over which coordinate group the data variable is added: .. code-block:: python @add_datavar("roughness", default_value=0.0, coord_group="grid") @add_datavar(gp.wind.Wind, default_value=10.0, coord_group="all") @add_coord(name="z", grid_coord=False) class WindSpeed(GriddedSkeleton): pass .. code-block:: python >>> WindSpeed.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (y, x) Gridpoint: (z) All: (y, x, z) ------------------------------------- Data ------------------------------------- Variables: ff (y, x, z): 10.0 [m/s] wind_speed roughness (y, x): 0.0 Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- Adding a time, frequency and direction variable -------------------------------------------- The adding of certain coordinates have dedicated method. As an example, let us use a wave spectrum: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_datavar, add_time, add_frequency, add_direction import geo_parameters as gp @add_datavar(gp.wave.Efth("spec"), coord_group="all") @add_direction() @add_frequency() @add_time() class Spectrum(GriddedSkeleton): pass .. code-block:: python Spectrum.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (time, y, x) Gridpoint: (freq, dirs) All: (time, y, x, freq, dirs) ------------------------------------- Data ------------------------------------- Variables: spec (time, y, x, freq, dirs): 0.0 [m**2*s/rad] sea_surface_wave_directional_variance_spectral_density Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- We can see that by default time is added as a grid coordinate, while frequency and direction are added as gridpoint coordinates. This doesn't affect the spectrum directly, since it is still defined over all coordinates. However, we can think of grid coordinates as "outer" coordinates, defining the grid on which we have a certain object that is defined over "inner" (gridpoint) coordinates. .. code-block:: python data = Spectrum( lon=(10, 20), lat=(50, 60), freq=np.arange(0, 1, 0.1), dirs=np.arange(0, 360, 10), time=("2020-01-01 00:00", "2020-01-02 00:00", "6h"), ) The add_frequency and add_direction creates method that can give out angular frequencies or the directions in radians .. code-block:: python >>> data.freq() array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]) >>> data.freq(angular=True) array([0. , 0.62831853, 1.25663706, 1.88495559, 2.51327412, 3.14159265, 3.76991118, 4.39822972, 5.02654825, 5.65486678]) >>> data.dirs() array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350]) >>> data.dirs(angular=True) array([0. , 0.17453293, 0.34906585, 0.52359878, 0.6981317 , 0.87266463, 1.04719755, 1.22173048, 1.3962634 , 1.57079633, 1.74532925, 1.91986218, 2.0943951 , 2.26892803, 2.44346095, 2.61799388, 2.7925268 , 2.96705973, 3.14159265, 3.31612558, 3.4906585 , 3.66519143, 3.83972435, 4.01425728, 4.1887902 , 4.36332313, 4.53785606, 4.71238898, 4.88692191, 5.06145483, 5.23598776, 5.41052068, 5.58505361, 5.75958653, 5.93411946, 6.10865238]) Useing the add_time decorator adds some time-specific functionality to the Skeleton: .. code-block:: python >>> data.time() DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 06:00:00', '2020-01-01 12:00:00', '2020-01-01 18:00:00', '2020-01-02 00:00:00'], dtype='datetime64[ns]', freq=None) >>> data.time(datetime=False) ['2020-01-01 00:00:00', '2020-01-01 06:00:00', '2020-01-01 12:00:00', '2020-01-01 18:00:00', '2020-01-02 00:00:00'] >>> data.time(datetime=False, fmt='%Y%m%d') ['20200101', '20200101', '20200101', '20200101', '20200102'] To get only one time stamp for each day (or year or month or hour): .. code-block:: python >>> data.days() # or .years(), .months(), .hours() DatetimeIndex(['2020-01-01', '2020-01-02'], dtype='datetime64[ns]', freq=None) >>> data.days(datetime=False) ['2020-01-01', '2020-01-02'] >>> data.days(datetime=False, fmt='%Y%m%d') ['20200101', '20200102'] We can now also get the size of a single spectral object by looking at the size defined by the ``gridpoint`` coordinates .. code-block:: python >>> data.size('gridpoint') (10, 36) The coordinate groups also determines the order of iteration (iteration this way is easy, but slow for larger data sets): .. code-block:: python for one_spec in data: pass .. code-block:: python >>> one_spec ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (time, lat, lon) Gridpoint: (freq, dirs) All: (time, lat, lon, freq, dirs) ------------------------------------ Xarray ------------------------------------ Size: 392B Dimensions: (time: 1, lat: 1, lon: 1, freq: 10, dirs: 36) Coordinates: * time (time) datetime64[ns] 8B 2020-01-02 * lat (lat) int64 8B 60 * lon (lon) int64 8B 20 * freq (freq) float64 80B 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 * dirs (dirs) int64 288B 0 10 20 30 40 50 60 ... 300 310 320 330 340 350 Data variables: *empty* Attributes: name: LonelySkeleton ---------------------------------- Empty data ---------------------------------- Empty variables: spec (time, lat, lon, freq, dirs): 0.0 [m**2*s/rad] sea_surface_wave_directional_variance_spectral_density -------------------------------------------------------------------------------- >>> one_spec.spec().shape # The method automatically squeezes out trivial dimensions (10, 36) >>> spec.spec(squeeze=False).shape (1, 1, 1, 10, 36) >>> one_spec.set_spec(one_spec.spec()*2) # Automatically expands trivial dimensions Setting data -------------------------------------------- The Skeletons ``set``-method is helpful if the data at hand does not conform exactly to the format specified in the Skeleton. First, the Skeleton can squeeze and unsqueeze trivial dimensions at will. Let's define a Skeleton for significant wave height from a wave forecast: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_datavar, add_time import geo_parameters as gp @add_datavar(gp.wave.Hs) @add_time() class WaveForecast(GriddedSkeleton): pass data = WaveForecast( lon=(0, 10), lat=(50, 60), time=("2020-01-01 00:00", "2020-01-03 00:00", "3h"), ) data.set_spacing(nx=11, ny=21) The ``hs``-variable now has a given shape: .. code-block:: python >>> data.shape('hs') (17, 21, 11) # time, lat, lon Let us say we get some data that is taken from an ensemble forecast, but cut to only the deterministic member, but that dimension has not been squeezed out: .. code-block:: python >>> eps_hs.shape (17, 1, 21, 11) The Skeleton ``set``-method can automatically get rid of this extra trivial dimension: .. code-block:: python data.set_hs(eps_hs) # Or data.set('hs', eps_hs) This could have been done by ``np.squeeze``, but doing that can cause troubles if we have other trivial dimensions, such as only one time. The Skeleton can also re-arange non-trivial dimensions. Say that we get data from another forecast that has dimensions ``['lon','lat','ensembe','time']`` (Skeleton has ``['time','lat','lon']``). We can then specify the order in the set method: .. code-block:: python >>> eps2_hs.shape (11, 21, 1, 17) >>> data.set_hs(eps2_hs) *** geo_skeletons.errors.DataWrongDimensionError: Data has shape (11, 21, 1, 17), but coordinates define a shape (17, 21, 11)!!! >>> data.set_hs(eps2_hs, coords=['lon', 'lat', 'time']) # Not that we only need to specify order of non-trivial dimensions >>> data.shape('hs') (17, 21, 11) If we give in an ``xr.DataArray`` instead of a ``np.array``, then the order of the coordinates wil be parsed from the metadata in the ``DataArray`` if possible, but this can always be overridden by using the ``coords`` keyword. If we only have two dimensions, then we can allow the Skeleton to try a transpose of the data: .. code-block:: python @add_datavar(gp.wave.Hs) class WaveData(GriddedSkeleton): pass grid = WaveData(lon=(10, 20), lat=(50, 60)) grid.set_spacing(nx=10, ny=20) .. code-block:: python data = np.ones((10, 20)) >>> grid.set_hs(data) *** geo_skeletons.errors.DataWrongDimensionError: Data has shape (10, 20), but coordinates define a shape (20, 10)!!! >>> grid.set_hs(data, allow_transpose=True) >>> np.mean(grid.hs()) np.float64(1.0) As previously, trivial dimensions are squeezed out prior to the transpose: .. code-block:: python >>> data = np.full((1,10,1,20,1), 5.5) >> grid.set_hs(data) *** geo_skeletons.errors.DataWrongDimensionError: Data has shape (1, 10, 1, 20, 1), but coordinates define a shape (20, 10)!!! >>> grid.set_hs(data, allow_transpose=True) >>> np.mean(grid.hs()) np.float64(5.5) Adding magnitudes and directions -------------------------------------------- The recommended way to add magnitudes and directions are to add the components as data variables and add a magnitude and direction connected to those components: .. code-block:: python from geo_skeletons import GriddedSkeleton from geo_skeletons.decorators import add_datavar, add_magnitude import geo_parameters as gp @add_magnitude(gp.wind.Wind("u"), x="ux", y="uy", direction=gp.wind.WindDir("ud")) @add_datavar(gp.wind.YWind("uy")) @add_datavar(gp.wind.XWind("ux")) class Wind(GriddedSkeleton): pass .. code-block:: python >>> Wind.core ------------------------------ Coordinate groups ------------------------------- Spatial: (y, x) Grid: (y, x) Gridpoint: *empty* All: (y, x) ------------------------------------- Data ------------------------------------- Variables: ux (y, x): 0.1 [m/s] x_wind uy (y, x): 0.1 [m/s] y_wind Masks: *empty* Magnitudes: u: magnitude of (ux,uy) [m/s] wind_speed Directions: ud: direction of (ux,uy) [deg] wind_from_direction -------------------------------------------------------------------------------- The ``@add_magnitude`` decorator does the following: * adds a magnitude ``u`` for the components ``ux`` and ``uy`` * adds a direction (optional) ``ud`` for the components ``ux`` and ``uy`` * parses the standard_name of gp.wind.WindDir to identify that this is a 'from' direction * creates ``.u()`` and ``.ud()`` methods to access the magnitude and direction * creates ``.set_u()`` and ``.set_ud()`` methods to set the magnitude and direction Now we can still set the wind speed and direction, but only the components will be stored in the Dataset (obviously the components can also still be set and accessed normally): .. code-block:: python >>> data = Wind(lon=(10, 14), lat=(50, 60)) >>> data.set_u(10) >>> data.set_ud(45) >>> data ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (lat, lon) Gridpoint: *empty* All: (lat, lon) ------------------------------------ Xarray ------------------------------------ Size: 96B Dimensions: (lat: 2, lon: 2) Coordinates: * lat (lat) int64 16B 50 60 * lon (lon) int64 16B 10 14 Data variables: ux (lat, lon) float64 32B -7.071 -7.071 -7.071 -7.071 uy (lat, lon) float64 32B -7.071 -7.071 -7.071 -7.071 Attributes: name: LonelySkeleton -------------------------- Magnitudes and directions --------------------------- u: magnitude of (ux,uy) [m/s] wind_speed ud: direction of (ux,uy) [deg] wind_from_direction -------------------------------------------------------------------------------- The directionality of the wind was parsed from the standard name of the geo-parameter gp.wind.WindDir. Therefore the components could be set properly and unambiguously. The direction can be set and retrieved in different directional conventions: .. code-block:: python >>> data.set_ud(270, dir_type='to') >>> data.ud(dir_type='from') array([[90., 90.], [90., 90.]]) >>> data.ud(dir_type='to') array([[270., 270.], [270., 270.]]) >>> data.ud(dir_type='math') array([[3.14159265, 3.14159265], [3.14159265, 3.14159265]]) This same functionality is also present for individual parameters of directional nature (that typically don't use components): .. code-block:: python # This is direction from, as determined by the standard name # To get direction to, use gp.wave.DirpTo @add_datavar(gp.wave.Dirp) class Wave(GriddedSkeleton): pass .. code-block:: python >>> data.core ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (lat, lon) Gridpoint: *empty* All: (lat, lon) ------------------------------------- Data ------------------------------------- Variables: dirp (lat, lon): 0.0 [deg] sea_surface_wave_from_direction_at_variance_spectral_density_maximum Masks: *empty* Magnitudes: *empty* Directions: *empty* -------------------------------------------------------------------------------- .. code-block:: python data = Wave(lon=(10, 14), lat=(50, 60)) data.set_dirp(45, dir_type="to") >>> data.dirp() # default dir_type is the one that the parameter is, in this case 'from' array([[225., 225.], [225., 225.]]) Adding masks -------------------------------------------- Logical masks (for example marking land points or points of interest) can be added to the skeletons. To for example add a land-sea mask to a gridded significant wave height: .. code-block:: python from geo_skeleton.decorators import add_datavar, add_mask import geo_parameters as gp @add_datavar(gp.wave.Hs("hs"), default_value=0.0) @add_mask( name="sea", default_value=0, coord_group="grid", opposite_name="land", ) class WaveHeight(GriddedSkeleton): pass The geo-skeleton now has a land-sea mask. Only the sea mask is stored in the xr.Dataset and the land mask is calculated as the inverse of the sea mask. .. code-block:: python >>> grid = WaveHeight(lon=(0, 1), lat=(60, 61), name="Wavegrid") >>> grid ------------------------------ Coordinate groups ------------------------------- Spatial: (lat, lon) Grid: (lat, lon) Gridpoint: *empty* All: (lat, lon) ------------------------------------ Xarray ------------------------------------ Size: 32B Dimensions: (lat: 2, lon: 2) Coordinates: * lat (lat) int64 16B 60 61 * lon (lon) int64 16B 0 1 Data variables: *empty* Attributes: name: Wavegrid ---------------------------------- Empty data ---------------------------------- Empty variables: hs (lat, lon): 0.0 [m] sea_surface_wave_significant_height Empty masks: sea_mask (lat, lon): False land_mask (lat, lon): True -------------------------------------------------------------------------------- Both are still gettable, and also the points corresponding to the masks have methods: .. code-block:: python >>> grid.sea_mask() array([[False, False], [False, False]]) >>> grid.land_mask() array([[ True, True], [ True, True]]) >>> grid.sea_points() (array([], dtype=int64), array([], dtype=int64)) >>> grid.land_points() (array([0, 1, 0, 1]), array([60, 60, 61, 61])) We can also set either the land mask or the sea mask, and that defines boths masks, since they are per definition opposites of each other: .. code-block:: python >>> grid.set_sea_mask([[True, False], [True, False]]) >>> grid.sea_mask() array([[ True, False], [ True, False]]) >>> grid.sea_points() (array([0, 0]), array([60, 61])) >>> grid.land_mask() array([[False, True], [False, True]]) >>> grid.land_points() (array([1, 1]), array([60, 61])) It is also possible to connect the masks to a certain data variable. E.g. to have only positive significant wave height values to be sea points: .. code-block:: python @add_datavar(gp.wave.Hs("hs"), default_value=0.0) @add_mask( name="sea", default_value=0, coord_group="grid", opposite_name="land", triggered_by="hs", valid_range=(0, None), range_inclusive=False, ) class WaveHeight(GriddedSkeleton): pass Here ``triggered_by`` means that setting the data variable ``hs`` sets the land-sea masks. ``valid_range`` means the ``hs`` values that define the primary mask (here ``sea_mask``), and ``range_inclusive=False`` means that ``0`` does **not** become a sea point: .. code-block:: python >>> grid.set_hs([[0,1],[2,-999]]) >>> grid.hs() array([[ 0, 1], [ 2, -999]]) >>> grid.sea_mask() array([[False, True], [ True, False]]) >>> grid.land_points() (array([0, 1]), array([60, 61])) The masks are still settable independently if needed (e.g. if onw wants to apply a separate land mask after the fact), but if they are not set then they are always "in sync" with the significant wave height data. Plotting the data -------------------------------------------- Skeletons don't have any plotting functionality built in, but since it wraps around xarray datasets, the xarray plotting functions can be used. We can take as an example the wind class defined in the "Adding magnitudes and directions" section. Note, that even though the wind speed (added as a magnitude of the components) is not stored in the underlying xr.Dataset, we can still get it as a DataArray and plot it: .. code-block:: python import numpy as np import matplotlib.pyplot as plt data = Wind(lon=(0, 10), lat=(50, 60)) data.set_spacing(dm=1000) new_data = np.random.rand(*data.size()) data.set_u(new_data) data.u(data_array=True).plot() plt.show() .. image:: example_wind_plot.png Here, the ``data_array=True`` tells the method to return the xarray data array instead of a numpy array of the values.