Toolkits

Toolkit modules provide abstract functions that operate on Pandas data frames and series. They can be used in isolation as a standalone library or through a higher-level Analysis method. Toolkit modules are organized by function and, in general, operate only on data types from one particular backend. Currently, every toolkit function is implemented with Pandas.

Filters

This module provides functions for flagging pandas data series based on a range of criteria. The functions are largely intended for application in wind plant operational energy analysis, particularly wind speed vs. power curves.

operational_analysis.toolkits.filters.bin_filter(bin_col, value_col, bin_width, threshold=2, center_type='mean', bin_min=None, bin_max=None, threshold_type='std', direction='all')[source]

Flag time stamps for which data in <value_col>, when binned by data in <bin_col> into bins of width <bin_width>, lie more than <threshold> from the bin center. The <center_type> of each bin can be either the median or the mean, and flagging can be applied directionally (i.e. above the center, below the center, or both).

Parameters

bin_col (pandas.Series) – data to be used for binning
value_col (pandas.Series) – data to be flagged
bin_width (float) – width of bin in units of bin_col
threshold (float) – outlier threshold (multiplicative factor of std of <value_col> in bin)
bin_min (float) – minimum bin value below which flag should not be applied
bin_max (float) – maximum bin value above which flag should not be applied
threshold_type (str) – option to apply a ‘std’ or ‘scalar’ based threshold
center_type (str) – option to use a ‘mean’ or ‘median’ center for each bin
direction (str) – option to apply flag only to data ‘above’ or ‘below’ the bin center; otherwise the default is ‘all’

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
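
As a rough illustration of the binning logic described above, here is a minimal sketch of the 'std' threshold with a 'mean' center. It is not the toolkit's implementation: the function name bin_filter_sketch is hypothetical, and the handling of bin edges and of bins with a single sample is an assumption.

```python
import numpy as np
import pandas as pd

def bin_filter_sketch(bin_col, value_col, bin_width, threshold=2.0):
    """Flag values more than `threshold` standard deviations from their bin mean."""
    # Assign each sample to a bin of width `bin_width` along bin_col
    bin_id = np.floor(bin_col / bin_width)
    grouped = value_col.groupby(bin_id)
    center = grouped.transform("mean")
    spread = grouped.transform("std")
    # Bins with a single sample have NaN spread, so their samples stay unflagged
    return (value_col - center).abs() > threshold * spread

wind = pd.Series([5.1, 5.2, 5.3, 5.4, 5.2, 5.3, 9.1, 9.2])
power = pd.Series([300.0, 310.0, 305.0, 295.0, 300.0, 900.0, 1500.0, 1510.0])
flags = bin_filter_sketch(wind, power, bin_width=1.0, threshold=1.5)
print(flags.tolist())
```

Only the 900 kW sample in the 5 m/s bin is flagged here; the two samples in the 9 m/s bin sit close to their own bin mean.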

operational_analysis.toolkits.filters.cluster_mahalanobis_2d(data_col1, data_col2, n_clusters=13, dist_thresh=3.0)[source]

K-means clustering of data into <n_clusters> clusters; the Mahalanobis distance is evaluated for each cluster, and points with distances outside of <dist_thresh> are flagged; distinguishes between asset ids

Parameters

data_col1 (pandas.Series) – first data column in 2D cluster analysis
data_col2 (pandas.Series) – second data column in 2D cluster analysis
n_clusters (int) – number of clusters to use
dist_thresh (float) – maximum Mahalanobis distance within each cluster for data to remain unflagged

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

operational_analysis.toolkits.filters.range_flag(data_col, below=-inf, above=inf)[source]

Flag data for which the specified data is outside a specified range

Parameters

data_col (pandas.Series) – data to be flagged
below (float) – lower threshold (inclusive) for data; default -np.inf
above (float) – upper threshold (inclusive) for data; default np.inf

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
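
A minimal sketch of a range flag follows. It assumes, as the default values in the signature suggest, that below is the lower bound and above is the upper bound of the accepted interval. The name range_flag_sketch is hypothetical, and the toolkit's treatment of NaN values may differ.

```python
import pandas as pd

def range_flag_sketch(data_col, below=float("-inf"), above=float("inf")):
    # True where the value lies outside the closed interval [below, above]
    return ~data_col.between(below, above)

power = pd.Series([-5.0, 0.0, 1200.0, 2100.0])
flags = range_flag_sketch(power, below=0.0, above=2000.0)
print(flags.tolist())
```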

operational_analysis.toolkits.filters.std_range_flag(data_col, threshold=2.0)[source]

Flag time stamps for which the measurement is outside of the threshold number of standard deviations from the mean across all passed columns; does not distinguish between asset ids

Parameters

data_col (pandas.Series) – data to be flagged
threshold (float) – multiplicative factor on standard deviation to use in flagging

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

operational_analysis.toolkits.filters.unresponsive_flag(data_col, threshold=3)[source]

Flag time stamps for which the reported data does not change for <threshold> repeated intervals. Function includes the option to group by a column in the data frame (e.g. turbine ID)

Parameters

data_col (pandas.Series) – data to be flagged
threshold (int) – number of intervals over which measurement does not change

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
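
One way to detect unchanging data is to label runs of identical consecutive values and flag every sample in a run of at least <threshold> samples. The sketch below does that; the name unresponsive_flag_sketch is hypothetical, and the toolkit may treat the first sample of a run differently.

```python
import pandas as pd

def unresponsive_flag_sketch(data_col, threshold=3):
    # Start a new run whenever the value differs from the previous sample
    run_id = (data_col.diff() != 0).cumsum()
    # Length of the run that each sample belongs to
    run_length = run_id.map(run_id.value_counts())
    return run_length >= threshold

signal = pd.Series([1.0, 2.0, 2.0, 2.0, 3.0, 3.0])
flags = unresponsive_flag_sketch(signal, threshold=3)
print(flags.tolist())
```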

operational_analysis.toolkits.filters.window_range_flag(window_col, window_start, window_end, value_col, value_min, value_max)[source]

Flag time stamps for which the measurement in <window_col> is within the range [window_start, window_end] and the measurement in <value_col> is outside the range [value_min, value_max]

Parameters

window_col (pandas.Series) – data used to define the window
window_start (float) – minimum value for window
window_end (float) – maximum value for window
value_col (pandas.Series) – data to be flagged
value_min (float) – lower threshold for data
value_max (float) – upper threshold for data

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

Power Curve

This module provides methods to fit power curve models and use them to make predictions about ‘ideal’ power generation.

operational_analysis.toolkits.power_curve.functions.IEC(windspeed_column, power_column, bin_width=0.5, windspeed_start=0, windspeed_end=30.0)[source]

Use IEC 61400-12-1-2 method for creating wind-speed binned power curve. Power is set to zero for windspeed values outside of the cutoff range specified by windspeed_start and windspeed_end, inclusive of the endpoints.

Parameters

windspeed_column (pandas.Series) – feature column
power_column (pandas.Series) – response column
bin_width (float) – width of windspeed bin, default is 0.5 m/s according to standard
windspeed_start (float) – left edge of first windspeed bin
windspeed_end (float) – right edge of last windspeed bin

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function
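
The binning idea can be sketched as follows: average power within each wind speed bin and return a function that looks up the bin mean, with zero power outside the cutoff range. This is only an illustration under simplifying assumptions (no interpolation between bins, empty bins set to zero); binned_power_curve_sketch and its argument names are hypothetical.

```python
import numpy as np

def binned_power_curve_sketch(windspeed, power, bin_width=0.5, ws_start=0.0, ws_end=30.0):
    n_bins = int(round((ws_end - ws_start) / bin_width))

    def to_bin(ws):
        # Bin index of each wind speed; the right-most edge joins the last bin
        return np.minimum(((ws - ws_start) // bin_width).astype(int), n_bins - 1)

    ws = np.asarray(windspeed, dtype=float)
    pw = np.asarray(power, dtype=float)
    inside = (ws >= ws_start) & (ws <= ws_end)
    idx = to_bin(ws[inside])
    sums = np.bincount(idx, weights=pw[inside], minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    # Mean power per bin; bins without data are left at zero
    bin_mean = np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)

    def curve(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        ok = (x >= ws_start) & (x <= ws_end)
        out = np.zeros_like(x)
        out[ok] = bin_mean[to_bin(x[ok])]
        return out

    return curve

curve = binned_power_curve_sketch([3.2, 3.8, 7.1, 7.9], [100.0, 200.0, 900.0, 1100.0], bin_width=1.0)
predicted = curve([3.5, 7.5, 35.0])
print(predicted)
```

The last query, 35 m/s, lies beyond windspeed_end and therefore maps to zero power.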

operational_analysis.toolkits.power_curve.functions.gam(windspeed_column, power_column, n_splines=20)[source]

Use a generalized additive model to fit power to wind speed.

Parameters

windspeed_column (pandas.Series) – Wind speed feature column
power_column (pandas.Series) – Power response column
n_splines (int) – number of splines to use in the fit

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function

operational_analysis.toolkits.power_curve.functions.gam_3param(windspeed_column, winddir_column, airdens_column, power_column, n_splines=20)[source]

Use a generalized additive model to fit power to wind speed, wind direction and air density.

Parameters

windspeed_column (pandas.Series) – Wind speed feature column
power_column (pandas.Series) – Power response column
winddir_column (pandas.Series) – Optional. Wind direction feature column
airdens_column (pandas.Series) – Optional. Air density feature column
n_splines (int) – number of splines to use in the fit

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function

operational_analysis.toolkits.power_curve.functions.logistic_5_parametric(windspeed_column, power_column)[source]

The present implementation follows the filtering method reported in:

M. Yesilbudak, Partitional clustering-based outlier detection for power curve optimization of wind turbines, 2016 IEEE International Conference on Renewable Energy Research and Applications (ICRERA), Birmingham, 2016, pp. 1080-1084.

and the power curve method developed and reviewed in:

M. Lydia, A.I. Selvakumar, S.S. Kumar, G.E.P. Kumar, Advanced algorithms for wind turbine power curve modeling, IEEE Trans. Sustainable Energy, 4 (2013), pp. 827-835.

M. Lydia, S.S. Kumar, I. Selvakumar, G.E. Prem Kumar, A comprehensive review on wind turbine power curve modeling techniques, Renew. Sust. Energy Rev., 30 (2014), pp. 452-460.

In this case, the function fits the 5-parameter logistic function to observed data via a least-squares optimization (i.e. minimizing the sum of the squares of the residuals between the values of the parameterized function and the observed data).

Parameters

windspeed_column (pandas.Series) – feature column
power_column (pandas.Series) – response column

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function

Imputing

This module provides methods for filling in null data with interpolated (imputed) values.

operational_analysis.toolkits.imputing.correlation_matrix_by_id_column(df, align_col, id_col, value_col)[source]

Create a correlation matrix between different assets in a data frame

Parameters

df (pandas.DataFrame) – input data frame
align_col (str) – name of column in <df> on which different assets are to be aligned
id_col (str) – the column distinguishing the different assets
value_col (str) – the column containing the data values to be used when assessing correlation

Returns

Correlation matrix with <id_col> as index and column names

Return type

pandas.DataFrame

operational_analysis.toolkits.imputing.impute_all_assets_by_correlation(data, input_col, ref_col, align_col, id_col, r2_threshold=0.7, method='linear')[source]

Imputes NaN data in a Pandas data frame to the best extent possible by considering available data across different assets in the data frame. Highest correlated assets are prioritized in the imputation process.

Steps include:

Establish correlation matrix of specified data between different assets
For each asset in the data frame, sort neighboring assets by correlation strength
Then impute asset data based on available data in the highest correlated neighbor
If NaN data still remains in asset, move on to next highest correlated neighbor, etc.
Continue until either:
1. There are no NaN data remaining in asset data
2. There are no more neighbors to consider
3. The neighboring asset does not meet the specified correlation threshold, <r2_threshold>

Parameters

data (pandas.DataFrame) – the data frame subject to imputation
input_col (str) – the name of the column in <data> to be imputed
ref_col (str) – the name of the column in <data> to be used in imputation
align_col (str) – the name of the column in <data> on which different assets are to be merged
id_col (str) – the name of the column in <data> distinguishing different assets
r2_threshold (float) – the correlation threshold for a neighboring asset to be considered valid for use in imputation

Returns

The imputation results

Return type

pandas.Series

operational_analysis.toolkits.imputing.impute_data(target_data, target_value_col, ref_data, ref_value_col, align_col, method='linear')[source]

Replaces NaN data in a target Pandas series with imputed data from a reference Pandas series based on a linear regression relationship.

Steps include:

Merge the target and reference data frames on <align_col>, which is shared between the two
Determine the linear regression relationship between the target and reference data series
Apply that relationship to NaN data in the target series for which there is finite data in the reference series
Return the imputed results as well as the index matching the target data frame

Parameters

target_data (pandas.DataFrame) – the data frame containing NaN data to be imputed
target_value_col (str) – the name of the column in <target_data> to be imputed
ref_data (pandas.DataFrame) – the data frame containing data to be used in imputation
ref_value_col (str) – the name of the column in <ref_data> to be used in imputation
align_col (str) – the name of the column, shared by <target_data> and <ref_data>, on which the two data frames are to be merged

Returns

Copy of the <target_value_col> series with NaN occurrences imputed where possible.

Return type

pandas.Series
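
The regression step can be sketched as below. It assumes the target and reference series are already aligned on a common index (i.e. the merge on <align_col> has been done) and uses an ordinary least-squares line; impute_linear_sketch is a hypothetical name, not the toolkit function.

```python
import numpy as np
import pandas as pd

def impute_linear_sketch(target, reference):
    """Fill NaN in `target` from a least-squares line fitted against `reference`."""
    # Fit only where both series have data
    both = target.notna() & reference.notna()
    slope, intercept = np.polyfit(reference[both], target[both], deg=1)
    # Impute only where the target is missing and the reference is available
    fillable = target.isna() & reference.notna()
    result = target.copy()
    result[fillable] = slope * reference[fillable] + intercept
    return result

ref = pd.Series([1.0, 2.0, 3.0, 4.0, np.nan])
tgt = pd.Series([2.1, np.nan, 6.1, 8.1, np.nan])
filled = impute_linear_sketch(tgt, ref)
print(filled.tolist())
```

The second value is filled from the fitted line, while the last value stays NaN because the reference has no data there.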

Timeseries

This module provides useful functions for processing timeseries data

operational_analysis.toolkits.timeseries.convert_local_to_utc(d, tz_string)[source]

Convert timestamps in local time to UTC. The function can only act on a single timestamp at a time, so for example use the .apply function in Pandas:

date_utc = df['time'].apply(convert_local_to_utc, args=('US/Pacific',))

Also note that this function does not resolve the end of DST, when local times between 1:00 and 2:00 are repeated in November. Those timestamps are left repeated in UTC time and need to be shifted manually.

The function does address the missing 2:00-3:00 times at the start of DST in March.

Parameters

d (datetime.datetime) – the local date, tzinfo must not be set
tz_string (str) – the local timezone

Returns

the local date converted to UTC time

Return type

datetime.datetime
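
For comparison, a single naive timestamp can be converted with pandas alone, as in the sketch below. This is not the toolkit's implementation: local_to_utc_sketch is a hypothetical helper, and by default pandas raises an error for ambiguous or nonexistent local times rather than applying the behavior described above.

```python
import pandas as pd

def local_to_utc_sketch(d, tz_string):
    # Attach the local zone to a naive timestamp, then convert to UTC.
    # With default arguments, pandas raises for ambiguous or nonexistent local times.
    return pd.Timestamp(d).tz_localize(tz_string).tz_convert("UTC")

utc_time = local_to_utc_sketch("2020-07-01 12:00", "US/Pacific")
print(utc_time)
```

Noon on 1 July in US/Pacific falls in daylight saving time (UTC-7), so the result is 19:00 UTC on the same day.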

operational_analysis.toolkits.timeseries.find_duplicate_times(t_series, freq)[source]

Find duplicate input data and report them. The first duplicated item is not reported, only subsequent duplicates.

Parameters

t_series (pandas.Series) – Pandas series of datetime objects
freq (string) – time series frequency

Returns

Duplicates from input data

Return type

pandas.Series

operational_analysis.toolkits.timeseries.find_time_gaps(t_series, freq)[source]

Find data gaps in input data and report them

Parameters

t_series (pandas.Series) – Pandas series of datetime objects
freq (string) – time series frequency

Returns

Series of missing time stamps in datetime format

Return type

pandas.Series
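
Both checks can be sketched with standard pandas operations: duplicates via Series.duplicated (which skips the first occurrence), and gaps by comparing against a complete date range. The helper names are hypothetical, and the sketch assumes the expected range runs from the first to the last time stamp in the input.

```python
import pandas as pd

def find_duplicate_times_sketch(t_series):
    # keep="first" is the default, so only repeats after the first are reported
    return t_series[t_series.duplicated()]

def find_time_gaps_sketch(t_series, freq):
    # Every time stamp expected between the first and last observation
    full = pd.date_range(t_series.min(), t_series.max(), freq=freq)
    return pd.Series(full.difference(pd.DatetimeIndex(t_series.unique())))

times = pd.Series(pd.to_datetime(
    ["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:10", "2020-01-01 00:40"]
))
dups = find_duplicate_times_sketch(times)
gaps = find_time_gaps_sketch(times, "10min")
print(dups.tolist(), gaps.tolist())
```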

operational_analysis.toolkits.timeseries.gap_fill_data_frame(df, time_col, freq)[source]

Find missing timestamps in the input data frame and add rows with NaN values for those missing rows. Return a new data frame that has no missing timestamps and that is sorted by time.

Parameters

df (pandas.DataFrame) – the input data frame
time_col (str) – name of the column in ‘df’ with time data
freq (str) – time series frequency

Returns

output data frame with NaN data for the data gaps

Return type

pandas.DataFrame

operational_analysis.toolkits.timeseries.num_days(s)[source]

Return number of days in ‘s’

Parameters: s (pandas.Series) – The data to be checked for number of days.
Returns: Number of days in the data
Return type: int

operational_analysis.toolkits.timeseries.num_hours(s)[source]

Return number of hours in ‘s’

Parameters: s (pandas.Series) – The data to be checked for number of hours
Returns: Number of hours in the data
Return type: int

operational_analysis.toolkits.timeseries.percent_nan(s)[source]

Return percentage of data that are NaN, or 1 if the series is empty.

Parameters: s (pandas.Series) – The data to be checked for ‘na’ values
Returns: Percentage of NaN data in the data series
Return type: float

Met Data Processing

This module provides methods for processing meteorological data.

operational_analysis.toolkits.met_data_processing.air_density_adjusted_wind_speed(wind_col, density_col)[source]

Apply air density correction to wind speed measurements following IEC-61400-12-1 standard

Parameters

wind_col (str) – array containing the wind speed data; units of m/s
density_col (str) – array containing the air density data; units of kg/m3

Returns

density-adjusted wind speeds; units of m/s

Return type

pandas.Series
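
The density correction in IEC 61400-12-1 scales wind speed by the cube root of the ratio of measured to reference air density. The sketch below assumes the reference density is the mean of the supplied densities; that choice and the function name are assumptions, not taken from this documentation.

```python
def density_adjusted_wind_speed_sketch(wind_speeds, densities):
    # Assumed reference density: mean of the supplied density values
    rho_ref = sum(densities) / len(densities)
    # v_adj = v * (rho / rho_ref) ** (1/3)
    return [v * (rho / rho_ref) ** (1.0 / 3.0) for v, rho in zip(wind_speeds, densities)]

adjusted = density_adjusted_wind_speed_sketch([8.0, 8.0], [1.0, 1.2])
print(adjusted)
```

Air that is less dense than the reference lowers the adjusted wind speed, and denser air raises it.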

operational_analysis.toolkits.met_data_processing.compute_air_density(temp_col, pres_col, humi_col=None)[source]

Calculate air density from the ideal gas law based on the definition provided by IEC 61400-12 given pressure, temperature and relative humidity.

This function assumes temperature and pressure are reported in standard units of measurement (i.e. Kelvin for temperature, Pascal for pressure, humidity has no dimension).

Humidity values are optional. According to the IEC, a humidity of 50% (0.5) is set as the default value.

Parameters

temp_col (array-like) – array with temperature values; units of Kelvin
pres_col (array-like) – array with pressure values; units of Pascals
humi_col (array-like) – optional array with relative humidity values; dimensionless (range 0 to 1)

Returns

Rho, calculated air density; units of kg/m3

Return type

pandas.Series
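
A scalar sketch of the humidity-corrected ideal gas law is given below, assuming the form and constants published in IEC 61400-12-1 (gas constants for dry air and water vapour, and an empirical vapour pressure fit). The function name is hypothetical, and the toolkit may use slightly different constants.

```python
import math

R_DRY = 287.05    # gas constant of dry air, J/(kg K) (assumed value)
R_VAPOR = 461.5   # gas constant of water vapour, J/(kg K) (assumed value)

def air_density_sketch(temp_k, pres_pa, rel_humidity=0.5):
    # Empirical vapour pressure in Pa as a function of temperature in K
    vapor_pressure = 0.0000205 * math.exp(0.0631846 * temp_k)
    # Dry-air term minus the humidity correction, divided by temperature
    return (1.0 / temp_k) * (
        pres_pa / R_DRY - rel_humidity * vapor_pressure * (1.0 / R_DRY - 1.0 / R_VAPOR)
    )

rho = air_density_sketch(288.15, 101325.0)
print(round(rho, 3))
```

At 288.15 K and 101325 Pa with 50% humidity, this yields a density slightly below the dry-air value of about 1.225 kg/m3.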

operational_analysis.toolkits.met_data_processing.compute_shear(df, windspeed_heights, ref_col='empty')[source]

Compute shear coefficient between wind speed measurements

Parameters

df (pandas.DataFrame) – dataframe with wind speed columns
windspeed_heights (dict) – keys are strings of columns in <df> containing wind speed data, values are associated sensor heights (m)
ref_col (str) – data column name for the data to use as the normalization value; only pertinent if optimizing over multiple measurements

Returns

shear coefficient (unitless)

Return type

pandas.Series
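
For two measurement heights, the shear coefficient of the power law profile has a closed form, sketched below. This covers only the two-height case, and shear_exponent_sketch is a hypothetical helper. The toolkit function accepts any number of heights through <windspeed_heights>.

```python
import math

def shear_exponent_sketch(speed_low, height_low, speed_high, height_high):
    # Power law: speed_high / speed_low = (height_high / height_low) ** alpha
    return math.log(speed_high / speed_low) / math.log(height_high / height_low)

# Synthetic example: speeds at 40 m and 80 m generated with alpha = 0.2
alpha = shear_exponent_sketch(6.0, 40.0, 6.0 * 2.0 ** 0.2, 80.0)
print(alpha)
```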

operational_analysis.toolkits.met_data_processing.compute_turbulence_intensity(mean_col, std_col)[source]

Compute turbulence intensity

Parameters

mean_col (array) – array containing the wind speed mean data; units of m/s
std_col (array) – array containing the wind speed standard deviation data; units of m/s

Returns

turbulence intensity, (unitless ratio)

Return type

array

operational_analysis.toolkits.met_data_processing.compute_u_v_components(wind_speed, wind_dir)[source]

Compute vector components of the horizontal wind given wind speed and direction

Parameters

wind_speed (pandas.Series) – horizontal wind speed; units of m/s
wind_dir (pandas.Series) – wind direction; units of degrees

Returns

u (pandas.Series) – the zonal component of the wind; units of m/s
v (pandas.Series) – the meridional component of the wind; units of m/s

Return type

(tuple)

operational_analysis.toolkits.met_data_processing.compute_veer(wind_a, height_a, wind_b, height_b)[source]

Compute veer between wind direction measurements

Parameters

wind_a, wind_b (array) – arrays containing the wind direction mean data; units of deg
height_a, height_b (array) – sensor heights (m)

Returns

veer (deg/m)

Return type

veer(array)

operational_analysis.toolkits.met_data_processing.compute_wind_direction(u, v)[source]

Compute wind direction given u and v wind vector components

Parameters

u (pandas.Series) – the zonal component of the wind; units of m/s
v (pandas.Series) – the meridional component of the wind; units of m/s

Returns

wind direction; units of degrees

Return type

pandas.Series
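
The conversion between speed/direction and u/v components can be sketched as below, assuming the meteorological convention in which direction is the bearing the wind blows from, measured clockwise from north. The convention and the function names are assumptions for illustration.

```python
import math

def u_v_components_sketch(wind_speed, wind_dir_deg):
    theta = math.radians(wind_dir_deg)
    # A wind from the west (270 degrees) blows toward the east, so u is positive
    return -wind_speed * math.sin(theta), -wind_speed * math.cos(theta)

def wind_direction_sketch(u, v):
    # Inverse of the mapping above, wrapped to the range [0, 360)
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0

u, v = u_v_components_sketch(10.0, 270.0)
direction = wind_direction_sketch(u, v)
print(round(u, 6), round(v, 6), round(direction, 6))
```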

operational_analysis.toolkits.met_data_processing.pressure_vertical_extrapolation(p0, temp_avg, z0, z1)[source]

Extrapolate pressure from height z0 to height z1 given the average temperature in the layer. The hydrostatic equation is used to perform the extrapolation.

Parameters

p0 (pandas.Series) – pressure at height z0; units of Pascals
temp_avg (pandas.Series) – mean temperature between z0 and z1; units of Kelvin
z0 (pandas.Series) – height above surface; units of meters
z1 (pandas.Series) – extrapolation height; units of meters

Returns

p1, extrapolated pressure at z1; units of Pascals

Return type

pandas.Series
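
Integrating the hydrostatic equation over a layer of constant mean temperature gives an exponential pressure profile. The sketch below shows that relationship for scalars; the constants and the function name are assumptions, and the toolkit may use slightly different values.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2 (assumed value)
R_DRY = 287.05  # gas constant of dry air, J/(kg K) (assumed value)

def pressure_extrapolation_sketch(p0, temp_avg, z0, z1):
    # p1 = p0 * exp(-g * (z1 - z0) / (R * T_avg))
    return p0 * math.exp(-G * (z1 - z0) / (R_DRY * temp_avg))

p1 = pressure_extrapolation_sketch(101325.0, 288.15, 0.0, 100.0)
print(round(p1, 1))
```

Moving up 100 m at 288.15 K lowers the pressure by roughly 1.2%.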

Metadata Fetch

This module fetches metadata of wind farms

operational_analysis.toolkits.metadata_fetch.add_eia_meta_to_project(project, api_key, plant_id, file_path)[source]

Assign EIA meta data to PlantData object.

Parameters

project (PlantData) – PlantData object for a particular project
api_key (string) – 32-character user-specific API key, obtained from EIA
plant_id (string) – 5-character EIA power plant code
file_path (string) – directory with EIA metadata .xlsx files

Returns

(None)

operational_analysis.toolkits.metadata_fetch.fetch_eia(api_key, plant_id, file_path)[source]

Read in EIA data for the wind farm of interest:

from the EIA API for monthly production, return the monthly net energy generation time series
from local Excel files for wind farm metadata, return a dictionary of metadata

Parameters

api_key (string) – 32-character user-specific API key, obtained from EIA
plant_id (string) – 5-character EIA power plant code
file_path (string) – directory with EIA metadata .xlsx files in 2017

Returns

pandas.Series: monthly net energy generation in MWh
dictionary: metadata of the wind farm with ‘plant_id’

Return type

pandas.Series

Unit Conversion

This module provides basic methods for unit conversion and calculation of basic wind plant variables

operational_analysis.toolkits.unit_conversion.compute_gross_energy(net_energy, avail_losses, curt_losses, avail_type='frac', curt_type='frac')[source]

This function computes gross energy for a wind plant or turbine by adding reported availability and curtailment losses to reported net energy. Account is made of whether availability or curtailment loss data is reported in energy (‘energy’) or fractional units (‘frac’). If in energy units, this function assumes that net energy, availability loss, and curtailment loss are all reported in the same units.

Parameters

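net_energy (numpy array or Pandas series) – reported net energy for wind plant or turbine
avail_losses (numpy array or Pandas series) – reported availability losses for wind plant or turbine
curt_losses (numpy array or Pandas series) – reported curtailment losses for wind plant or turbine
avail_type (str) – units of the availability losses, either ‘frac’ or ‘energy’
curt_type (str) – units of the curtailment losses, either ‘frac’ or ‘energy’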

Returns

calculated gross energy for wind plant or turbine

Return type

gross (numpy array or Pandas series)
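
The arithmetic can be sketched for scalar inputs as follows, assuming that fractional losses are expressed relative to gross energy. Under that assumption, gross = net + energy losses + fractional losses x gross, which rearranges to the formula in the code. The function name and the treatment of mixed units are assumptions and may not match the toolkit.

```python
def gross_energy_sketch(net, avail, curt, avail_type="frac", curt_type="frac"):
    # Collect losses given as fractions of gross and losses given as energy
    frac_total = (avail if avail_type == "frac" else 0.0) + (curt if curt_type == "frac" else 0.0)
    energy_total = (avail if avail_type == "energy" else 0.0) + (curt if curt_type == "energy" else 0.0)
    # gross = net + energy_total + frac_total * gross, solved for gross
    return (net + energy_total) / (1.0 - frac_total)

gross_from_frac = gross_energy_sketch(90.0, 0.05, 0.05)
gross_from_energy = gross_energy_sketch(90.0, 6.0, 4.0, avail_type="energy", curt_type="energy")
print(gross_from_frac, gross_from_energy)
```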

operational_analysis.toolkits.unit_conversion.convert_feet_to_meter(variable)[source]

Compute variable in [meter] from [feet] and return the data column

Parameters

variable (pandas.Series) – variable in feet

Returns

variable in meters

Return type

pandas.Series

operational_analysis.toolkits.unit_conversion.convert_power_to_energy(power_col, sample_rate_min='10T')[source]

Compute energy [kWh] from power [kW] and return the data column

Parameters

power_col (pandas.Series) – power data, in kW
sample_rate_min (str) – sampling rate of the power data as a frequency string; the default ‘10T’ denotes ten minutes

Returns

Energy in kWh, matching the length of <power_col>

Return type

pandas.Series
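
The conversion itself is a single multiplication: mean power over an interval times the interval length in hours. The sketch below takes the interval in minutes as a number, which is an assumption for illustration; the toolkit signature shows a frequency string instead.

```python
def power_to_energy_sketch(power_kw, sample_minutes=10.0):
    # Energy (kWh) = mean power (kW) x interval length (h)
    return [p * sample_minutes / 60.0 for p in power_kw]

energy = power_to_energy_sketch([600.0, 1200.0])
print(energy)
```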

Plotting

This module provides helpful functions for creating various plots

operational_analysis.toolkits.pandas_plotting.color_to_rgb(color)[source]

Converts named colors, hex and normalised RGB to 255 RGB values

Parameters: color (color) – RGB, HEX or named color
Returns: 255 RGB values
Return type: rgb(tuple)

Example

>>> color_to_rgb("Red")
(255, 0, 0)

>>> color_to_rgb((1,1,0))
(255,255,0)

>>> color_to_rgb("#ff00ff")
(255,0,255)

operational_analysis.toolkits.pandas_plotting.coordinateMapping(lon1, lat1, lon2, lat2)[source]

Map latitude and longitude to local cartesian coordinates

Parameters

lon1 (numpy array of shape (1, ) or scalar) – longitude of cartesian coordinate system origin
lat1 (numpy array of shape (1, ) or scalar) – latitude of cartesian coordinate system origin
lon2 (numpy array of shape (n, ) or scalar) – longitude(s) of points of interest
lat2 (numpy array of shape (n, ) or scalar) – latitude(s) of points of interest

Returns

Tuple representing cartesian coordinates (x, y); if arguments entered as scalars, returns scalars in tuple, if arguments entered as numpy arrays, returns numpy arrays each of shape (n,1)

operational_analysis.toolkits.pandas_plotting.luminance(rgb)[source]

Calculates the brightness of an rgb 255 color. See https://en.wikipedia.org/wiki/Relative_luminance

Parameters: rgb (tuple) – 255 (red, green, blue) tuple
Returns: relative luminance
Return type: luminance(scalar)

Example

>>> rgb = (255,127,0)
>>> luminance(rgb)
0.5687976470588235

>>> luminance((0,50,255))
0.21243529411764706
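
The example values above are consistent with the weighted sum from the linked Wikipedia article applied directly to channels scaled to [0, 1], without gamma correction. The sketch below reproduces them under that assumption; luminance_sketch is a hypothetical name.

```python
def luminance_sketch(rgb):
    # Scale each 0-255 channel to [0, 1]
    r, g, b = (channel / 255.0 for channel in rgb)
    # Relative luminance weights for red, green and blue
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(luminance_sketch((255, 127, 0)))
print(luminance_sketch((0, 50, 255)))
```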

operational_analysis.toolkits.pandas_plotting.plot_array(project)[source]

Plot locations of turbines and met towers, with labels, on latitude/longitude grid

Parameters: project (plant object) – project to be plotted
Returns: (None)

operational_analysis.toolkits.pandas_plotting.plot_windfarm(project, tile_name='OpenMap', plot_width=800, plot_height=800, marker_size=14, kwargs_for_figure={}, kwargs_for_marker={})[source]

Plot the windfarm spatially on a map using the Bokeh plotting library.

Parameters

project (plant object) – project to be plotted
tile_name (str) – tile set to be used for the underlay, e.g. OpenMap, ESRI, OpenTopoMap
plot_width (scalar) – width of plot
plot_height (scalar) – height of plot
marker_size (scalar) – size of markers
kwargs_for_figure (dict) – additional figure options for advanced users, see Bokeh docs
kwargs_for_marker (dict) – additional marker options for advanced users, see Bokeh docs. We have some custom behavior around the “fill_color” attribute. If “fill_color” is not defined, OpenOA will use an internally defined color palette. If “fill_color” is the name of a column in the asset table, OpenOA will use the value of that column as the marker color. Otherwise, “fill_color” is passed through to Bokeh.

Returns

windfarm map

Return type

Bokeh_plot(axes handle)

Example

import pandas as pd

from bokeh.plotting import figure, output_file, show

from operational_analysis.toolkits.pandas_plotting import plot_windfarm
from operational_analysis.types import PlantData

from examples.project_ENGIE import Project_Engie

# Load plant object
project = Project_Engie("../examples/data/la_haute_borne")

# Prepare data
project.prepare()

# Create the bokeh wind farm plot
show(plot_windfarm(project,tile_name="ESRI",plot_width=600,plot_height=600))

operational_analysis.toolkits.pandas_plotting.powerRose_array(project, fig, rect, tid, model_eval, shift=[0], direction=1)[source]

Plot power curve on polar coordinates overlaying plot of surrounding array (both local and further distance)

Parameters

project (plant object) – project to be plotted
fig (figure handle) – figure handle
rect (list of four scalars) – [left offset, bottom offset, width, height] as fractions of figure width/height
tid (string) – id of turbine to be plotted
model_eval (dict) – undocumented
shift (list of scalars) – number of degrees to rotate wind direction data, each plotted as new line
direction (-1, 1) – wind direction data measured clockwise (1) or counterclockwise (-1)

Returns:

operational_analysis.toolkits.pandas_plotting.subplot_powerRose_array(project, turbine_ids, shift=0, direction=1, columns=None, left_margin=0.1, bottom_margin=0.1, gap_w_frac=0.2, gap_h_frac=0.2, aspect=1)[source]

Wrapper for powerRose_array plotting for multiple subplots

Parameters

project (plant object) – project to be plotted
turbine_ids (list of strings) – ids of turbines to be plotted
shift (list of scalars) – number of degrees to rotate wind direction data, each plotted as new line
direction (-1, 1) – wind direction data measured clockwise (1) or counterclockwise (-1)
columns (scalar integer) – number of subplot columns
left_margin (scalar) – fraction of figure width to include as left margin
bottom_margin (scalar) – fraction of figure height to include as bottom margin
gap_w_frac (scalar) – fraction of figure width to include between subplots
gap_h_frac (scalar) – fraction of figure height to include as between subplots
aspect (scalar) – aspect ratio for subplots

Returns

(None)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2(turbine, axarr, c1, c2, c='Blues', xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2]

Parameters

turbine (pandas dataframe) – data to be plotted
axarr (axis handle) – axis handle
c1 (string) – column name of x axis
c2 (string) – column name of y axis
c (string or colormap handle) – colormap

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2_flagged(turbine, axarr, c1, c2, flag_cols, flag_value, cmap='Blues', xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2], showing only data for which <flag_cols> have <flag_value>

Parameters

turbine (pandas dataframe) – data to be plotted
axarr (axis handle) – axis handle
c1 (string) – column name of x axis
c2 (string) – column name of y axis
cmap (string or colormap handle) – colormap
flag_cols (list of strings) – column name(s) for flag columns
flag_value (string) – value in <flag_cols> for which data are plotted

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2_raw_flagged(turbine, axarr, c1, c2, flag_cols, flag_value, cmap='Blues', markers=['x'], colors=['r'], xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2], showing data for which <flag_cols> have <flag_value> as an overlaid scatter plot

Parameters

turbine (pandas dataframe) – data to be plotted
axarr (axis handle) – axis handle
c1 (string) – column name of x axis
c2 (string) – column name of y axis
cmap (string or colormap handle) – colormap
flag_cols (list of strings) – column name(s) for flag columns
flag_value (string) – value in <flag_cols> for which data are plotted

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_power_curve(turbine, axarr, fig, c3, pc)[source]

operational_analysis.toolkits.pandas_plotting.turbine_polar_4Dscatter(array, tid, theta, r, color, size, cmap='autumn_r')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters

array (pandas dataframe) – indexed by (string) labels of assets, ‘x’ and ‘y’ coordinate columns
tid (str) – index of asset on which to center Cartesian axes
theta (pandas series, np array, list) – angular coordinates of points, in degrees
r (pandas series, np array, list) – radial coordinates of points
color (pandas series, np array, list) – color of points
size (pandas series, np array, list) – size of points

Returns

ax_carthesian (axes handle) – Cartesian axes on which array is plotted
ax_polar (axes handle) – polar axes on which data are plotted

Return type

ax_carthesian(axes handle)

operational_analysis.toolkits.pandas_plotting.turbine_polar_contour(array, tid, theta, r, z, levels, colors, ax_carthesian=None, ax_polar=None, label='')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters

array (pandas dataframe) – indexed by (string) labels of assets, ‘x’ and ‘y’ coordinate columns
tid (str) – index of asset on which to center Cartesian axes
theta (pandas series, np array, list) – angular coordinates of points, in degrees
r (pandas series, np array, list) – radial coordinates of points
z (pandas series, np array, list) – colors of points
levels (list of float) – levels at which to draw contours
colors (list of colormap rows) – colors of drawn contours
ax_carthesian (axes handle) – Cartesian axes on which array is plotted
ax_polar (axes handle) – polar axes on which data plotted
label (string) – legend label

Returns

ax_carthesian (axes handle) – Cartesian axes on which array is plotted
ax_polar (axes handle) – polar axes on which data are plotted

Return type

ax_carthesian(axes handle)

operational_analysis.toolkits.pandas_plotting.turbine_polar_contourf(array, tid, theta, r, c, cmap='autumn_r')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters

array (pandas dataframe) – indexed by (string) labels of assets, ‘x’ and ‘y’ coordinate columns
tid (str) – index of asset on which to center Cartesian axes
theta (pandas series, np array, list) – angular coordinates of points, in degrees
r (pandas series, np array, list) – radial coordinates of points
c (pandas series, np array, list) – colors of points

Returns

ax_carthesian (axes handle) – Cartesian axes on which array is plotted
ax_polar (axes handle) – polar axes on which data are plotted

Return type

ax_carthesian(axes handle)

operational_analysis.toolkits.pandas_plotting.turbine_polar_line(array, theta, r, line_label, tid, color='b', ax_carthesian=None, ax_polar=None)[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters

array (pandas dataframe) – indexed by (string) labels of assets, ‘x’ and ‘y’ coordinate columns
theta (pandas series, np array, list) – angular coordinates of points, in degrees
r (pandas series, np array, list) – radial coordinates of points
line_label (str) – legend label
tid (str) – index of asset on which to center Cartesian axes
ax_carthesian (axes handle) – existing Cartesian axes on which to add array plot
ax_polar (axes handle) – existing polar axes on which to add plot

Returns

ax_carthesian (axes handle) – Cartesian axes on which array is plotted
ax_polar (axes handle) – polar axes on which data are plotted

Return type

ax_carthesian(axes handle)