6.1.2. Key IO Data Structures in SuPy#
6.1.2.1. Introduction#
The cell below demonstrates a minimal case of SuPy simulation with all key IO data structures included:
[1]:
import supy as sp
df_state_init, df_forcing = sp.load_SampleData()
df_output, df_state_final = sp.run_supy(df_forcing, df_state_init)
Input: SuPy requires two DataFrames to perform a simulation:
df_state_init: model initial states;
df_forcing: forcing data.
These input data can be loaded either by calling load_SampleData() as shown above or by using init_supy. Alternatively, you can modify the content of the loaded sample DataFrames to create new DataFrames for your specific needs.
Output: The output of SuPy consists of two DataFrames:
df_output: model output results; this is usually the basis for scientific analysis.
df_state_final: model final states; any of its entries can be used as a df_state_init to start another SuPy simulation.
6.1.2.2. Input#
6.1.2.2.1. df_state_init: model initial states#
[2]:
df_state_init.head()
[2]:
var | ah_min | ah_slope_cooling | ah_slope_heating | ahprof_24hr | ... | tair24hr | numcapita | gridiv | |||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ind_dim | (0,) | (1,) | (0,) | (1,) | (0,) | (1,) | (0, 0) | (0, 1) | (1, 0) | (1, 1) | (2, 0) | (2, 1) | (3, 0) | (3, 1) | (4, 0) | ... | (275,) | (276,) | (277,) | (278,) | (279,) | (280,) | (281,) | (282,) | (283,) | (284,) | (285,) | (286,) | (287,) | 0 | 0 |
grid | |||||||||||||||||||||||||||||||
98 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | 0.49 | 0.43 | 0.46 | 0.4 | 0.47 | 0.4 | ... | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 273.15 | 204.58 | 98 |
1 rows × 1200 columns
df_state_init is organised with grids in rows and their states in columns. The details of all state variables can be found in the description page.
Please note that the properties are stored as flattened values to fit the tabular format of DataFrame, although they may actually be of higher dimension (e.g. ahprof_24hr with the dimension {24, 2}). To indicate the dimensionality of these properties, SuPy uses the ind_dim level in columns for indices of values:
0 for scalars;
(ind_dim1, ind_dim2, ...) for arrays (in a generic sense, vectors are 1D arrays).
Take ohm_coef below as an example: it has a dimension of {8, 4, 3} according to the description, which means the values used by SuPy in simulations are passed in as an array of the dimension {8, 4, 3}. To get proper values passed in, users should therefore follow the dimensionality requirement when preparing or modifying df_state_init.
[3]:
df_state_init.loc[:,'ohm_coef']
[3]:
ind_dim | (0, 0, 0) | (0, 0, 1) | (0, 0, 2) | (0, 1, 0) | (0, 1, 1) | (0, 1, 2) | (0, 2, 0) | (0, 2, 1) | (0, 2, 2) | (0, 3, 0) | (0, 3, 1) | (0, 3, 2) | (1, 0, 0) | (1, 0, 1) | (1, 0, 2) | ... | (6, 3, 0) | (6, 3, 1) | (6, 3, 2) | (7, 0, 0) | (7, 0, 1) | (7, 0, 2) | (7, 1, 0) | (7, 1, 1) | (7, 1, 2) | (7, 2, 0) | (7, 2, 1) | (7, 2, 2) | (7, 3, 0) | (7, 3, 1) | (7, 3, 2) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
grid | |||||||||||||||||||||||||||||||
98 | 0.719 | 0.194 | -36.6 | 0.719 | 0.194 | -36.6 | 0.719 | 0.194 | -36.6 | 0.719 | 0.194 | -36.6 | 0.238 | 0.427 | -16.7 | ... | 0.5 | 0.21 | -39.1 | 0.25 | 0.6 | -30.0 | 0.25 | 0.6 | -30.0 | 0.25 | 0.6 | -30.0 | 0.25 | 0.6 | -30.0 |
1 rows × 96 columns
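The flattened layout can be mapped back to an array with plain pandas and NumPy. The sketch below uses a small synthetic frame rather than the real df_state_init: the variable name demo_var and its {2, 3} shape are hypothetical, and the ind_dim labels are assumed to be the string form of the index tuples, listed in row-major order as in the output above.

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical variable with a {2, 3} layout (not a real SuPy variable).
shape = (2, 3)

# ind_dim labels in row-major order: "(0, 0)", "(0, 1)", ..., "(1, 2)".
# Assumption: labels are the string form of the index tuples.
labels = [str(idx) for idx in itertools.product(*(range(n) for n in shape))]
columns = pd.MultiIndex.from_product(
    [["demo_var"], labels], names=["var", "ind_dim"]
)

# One grid (98) in rows, flattened values in columns, as in df_state_init.
df_demo = pd.DataFrame(
    [np.arange(6, dtype=float)],
    index=pd.Index([98], name="grid"),
    columns=columns,
)

# Rebuild the array from the flattened row of this grid.
arr = df_demo.loc[98, "demo_var"].to_numpy(copy=True).reshape(shape)

# Change a single element by addressing its ind_dim label.
df_demo.loc[98, ("demo_var", "(1, 2)")] = 42.0
```

If the same assumptions hold for the real df_state_init, the same pattern would apply to ohm_coef with shape (8, 4, 3); check the label type of the ind_dim level before relying on it.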
6.1.2.2.2. df_forcing: forcing data#
df_forcing is organised with temporal records in rows and forcing variables in columns. The details of all forcing variables can be found in the description page.
Missing values can be specified as -999, which is the default missing-value marker accepted by SuPy and its backend SUEWS.
[4]:
df_forcing.head()
[4]:
iy | id | it | imin | qn | qh | qe | qs | qf | U | RH | Tair | pres | rain | kdown | snow | ldown | fcld | Wuh | xsmd | lai | kdiff | kdir | wdir | isec | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2012-01-01 00:05:00 | 2012 | 1 | 0 | 5 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | 85.463333 | 11.77375 | 1001.5125 | 0.0 | 0.153333 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:10:00 | 2012 | 1 | 0 | 10 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | 85.463333 | 11.77375 | 1001.5125 | 0.0 | 0.153333 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:15:00 | 2012 | 1 | 0 | 15 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | 85.463333 | 11.77375 | 1001.5125 | 0.0 | 0.153333 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:20:00 | 2012 | 1 | 0 | 20 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | 85.463333 | 11.77375 | 1001.5125 | 0.0 | 0.153333 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
2012-01-01 00:25:00 | 2012 | 1 | 0 | 25 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 4.515 | 85.463333 | 11.77375 | 1001.5125 | 0.0 | 0.153333 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 |
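Because -999 is an ordinary number to pandas, it is often convenient to mask it before computing statistics and to restore it afterwards. The following is a minimal sketch on a synthetic frame; the values are made up, and only the column names Tair and kdown are taken from the table above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a few df_forcing columns (values are made up).
index = pd.date_range("2012-01-01 00:05", periods=3, freq="5min")
df_demo = pd.DataFrame(
    {"Tair": [11.8, -999.0, 11.6], "kdown": [0.15, 0.15, -999.0]},
    index=index,
)

# Mask the -999 markers as NaN so they do not distort statistics.
df_masked = df_demo.replace(-999.0, np.nan)
mean_tair = df_masked["Tair"].mean()  # uses the two valid values only

# Restore the -999 markers before handing the data back to the model.
df_restored = df_masked.fillna(-999.0)
```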
Note:
The index of df_forcing must be strictly of DatetimeIndex type if you want to create a df_forcing for a SuPy simulation. SuPy determines its runtime time-step size from the index information of df_forcing.
The information below indicates that SuPy will run at a 5 min (i.e. 300 s) time step if driven by this specific df_forcing:
[5]:
freq_forcing = df_forcing.index.freq
freq_forcing
[5]:
<300 * Seconds>
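When building a forcing frame by hand, the index type and spacing can be checked with pandas alone. The sketch below is illustrative only: the frame holds a single made-up Tair column and is not a complete df_forcing, and whether SuPy requires the freq attribute itself to be set is not stated here.

```python
import pandas as pd

# Regular 5 min time stamps; date_range records the frequency on the index.
index = pd.date_range("2012-01-01 00:05", periods=4, freq="5min")
df_demo = pd.DataFrame({"Tair": [11.8, 11.7, 11.7, 11.6]}, index=index)

is_datetime = isinstance(df_demo.index, pd.DatetimeIndex)
step_seconds = (df_demo.index[1] - df_demo.index[0]).total_seconds()

# An index built from a list of time stamps carries no frequency;
# asfreq attaches one when the records are regularly spaced.
df_nofreq = pd.DataFrame(
    {"Tair": [11.8, 11.7]},
    index=pd.to_datetime(["2012-01-01 00:05", "2012-01-01 00:10"]),
)
df_fixed = df_nofreq.asfreq("5min")
```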
6.1.2.3. Output#
6.1.2.3.1. df_output: model output results#
df_output is organised with temporal records of grids in rows and output variables of different groups in columns. The details of all output variables can be found in the description page.
[6]:
df_output.head()
[6]:
group | SUEWS | ... | DailyState | |||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
var | Kdown | Kup | Ldown | Lup | Tsurf | QN | QF | QS | QH | QE | QHlumps | QElumps | QHresis | Rain | Irr | ... | WU_Grass2 | WU_Grass3 | deltaLAI | LAIlumps | AlbSnow | DensSnow_Paved | DensSnow_Bldgs | DensSnow_EveTr | DensSnow_DecTr | DensSnow_Grass | DensSnow_BSoil | DensSnow_Water | a1 | a2 | a3 | |
grid | datetime | |||||||||||||||||||||||||||||||
98 | 2012-01-01 00:05:00 | 0.153333 | 0.018279 | 344.310184 | 371.986259 | 11.775615 | -27.541021 | 40.574001 | -46.53243 | 62.420064 | 3.576493 | 49.732605 | 9.832804 | 0.042327 | 0.0 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2012-01-01 00:10:00 | 0.153333 | 0.018279 | 344.310184 | 371.986259 | 11.775615 | -27.541021 | 39.724283 | -46.53243 | 61.654096 | 3.492744 | 48.980360 | 9.735333 | 0.042294 | 0.0 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:15:00 | 0.153333 | 0.018279 | 344.310184 | 371.986259 | 11.775615 | -27.541021 | 38.874566 | -46.53243 | 60.885968 | 3.411154 | 48.228114 | 9.637861 | 0.042260 | 0.0 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:20:00 | 0.153333 | 0.018279 | 344.310184 | 371.986259 | 11.775615 | -27.541021 | 38.024849 | -46.53243 | 60.115745 | 3.331660 | 47.475869 | 9.540389 | 0.042226 | 0.0 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2012-01-01 00:25:00 | 0.153333 | 0.018279 | 344.310184 | 371.986259 | 11.775615 | -27.541021 | 37.175131 | -46.53243 | 59.343488 | 3.254200 | 46.723623 | 9.442917 | 0.042192 | 0.0 | 0.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 218 columns
The records in df_output have the same temporal resolution as df_forcing:
[7]:
freq_out = df_output.index.levels[1].freq
(freq_out, freq_out == freq_forcing)
[7]:
(<300 * Seconds>, True)
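Since both the rows and the columns of df_output are multi-level, a typical first step is to select one group for one grid and then aggregate in time. The sketch below works on a synthetic frame with the same index and column level names as the output above; the values are made up, and only three columns are included.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df_output: rows are (grid, datetime) pairs,
# columns are (group, var) pairs. Values are made up.
times = pd.date_range("2012-01-01 00:05", periods=12, freq="5min")
rows = pd.MultiIndex.from_product([[98], times], names=["grid", "datetime"])
cols = pd.MultiIndex.from_tuples(
    [("SUEWS", "QH"), ("SUEWS", "QE"), ("DailyState", "a1")],
    names=["group", "var"],
)
data = np.column_stack(
    [np.full(12, 60.0), np.full(12, 3.0), np.full(12, np.nan)]
)
df_demo = pd.DataFrame(data, index=rows, columns=cols)

# Select one group for one grid: the result is indexed by datetime only.
df_suews = df_demo.loc[98, "SUEWS"]

# Aggregate the 5 min records to hourly means.
df_hourly = df_suews.resample("60min").mean()

# DailyState columns are NaN at most sub-daily steps (as in the output
# above), so drop the rows that carry no values at all.
df_daily = df_demo.loc[98, "DailyState"].dropna(how="all")
```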
6.1.2.3.2. df_state_final: model final states#
df_state_final has the same data structure as df_state_init except for the extra level datetime in the index, which stores the temporal information associated with model states. This structure facilitates its reuse as initial model states for other simulations (e.g., diagnostics of runtime model states with save_state=True set in run_supy, or simply as the initial conditions for future simulations starting at the ending times of previous runs).
The meanings of state variables in df_state_final can be found in the description page.
[8]:
df_state_final.head()
[8]:
var | aerodynamicresistancemethod | ah_min | ah_slope_cooling | ah_slope_heating | ahprof_24hr | ... | wuprofm_24hr | z | z0m_in | zdm_in | ||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ind_dim | 0 | (0,) | (1,) | (0,) | (1,) | (0,) | (1,) | (0, 0) | (0, 1) | (1, 0) | (1, 1) | (2, 0) | (2, 1) | (3, 0) | (3, 1) | ... | (18, 0) | (18, 1) | (19, 0) | (19, 1) | (20, 0) | (20, 1) | (21, 0) | (21, 1) | (22, 0) | (22, 1) | (23, 0) | (23, 1) | 0 | 0 | 0 | |
datetime | grid | |||||||||||||||||||||||||||||||
2012-01-01 00:05:00 | 98 | 2 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | 0.49 | 0.43 | 0.46 | 0.4 | 0.47 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 49.6 | 1.9 | 14.2 |
2013-01-01 00:05:00 | 98 | 2 | 15.0 | 15.0 | 2.7 | 2.7 | 2.7 | 2.7 | 0.57 | 0.65 | 0.45 | 0.49 | 0.43 | 0.46 | 0.4 | 0.47 | ... | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 49.6 | 1.9 | 14.2 |
2 rows × 1200 columns
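One way to pick a single entry of df_state_final for reuse is to select the states at the last time stamp and drop the datetime level, which leaves grids in rows as in df_state_init. The sketch below does this on a synthetic frame with three columns copied from the output above. Whether run_supy also accepts the two-level index directly is not stated here, so the explicit selection is shown as an assumption-free pandas step.

```python
import pandas as pd

# Synthetic stand-in for df_state_final: rows are (datetime, grid) pairs,
# columns are (var, ind_dim) pairs.
rows = pd.MultiIndex.from_product(
    [pd.to_datetime(["2012-01-01 00:05", "2013-01-01 00:05"]), [98]],
    names=["datetime", "grid"],
)
cols = pd.MultiIndex.from_tuples(
    [("ah_min", "(0,)"), ("ah_min", "(1,)"), ("z", "0")],
    names=["var", "ind_dim"],
)
df_demo = pd.DataFrame(
    [[15.0, 15.0, 49.6], [15.0, 15.0, 49.6]], index=rows, columns=cols
)

# Take the states stored at the last time stamp; xs drops the datetime
# level, so the result has grids in rows like df_state_init.
t_last = df_demo.index.get_level_values("datetime").max()
df_next_init = df_demo.xs(t_last, level="datetime")
```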