Toolkits

Toolkit modules provide abstract functions that operate on Pandas data frames and series. They can be used in isolation as a standalone library or through a higher-level Analysis method. Toolkit modules are organized by function and, in general, operate only on data types from one particular backend. Currently, every toolkit function is implemented with Pandas.

Filters

This module provides functions for flagging pandas data series based on a range of criteria. The functions are largely intended for application in wind plant operational energy analysis, particularly wind speed vs. power curves.

operational_analysis.toolkits.filters.bin_filter(bin_col, value_col, bin_width, threshold=2, center_type='mean', bin_min=None, bin_max=None, threshold_type='std', direction='all')[source]

Flag time stamps for which data in <value_col>, when binned by data in <bin_col> into bins of width <bin_width>, lie outside the <threshold> of their bin. The <center_type> of each bin can be either the median or the mean, and flagging can be applied directionally (i.e. above the center, below the center, or both).

Parameters
  • bin_col (pandas.Series) – data to be used for binning

  • value_col (pandas.Series) – data to be flagged

  • bin_width (float) – width of bin in units of bin_col

  • threshold (float) – outlier threshold (multiplicative factor of std of <value_col> in bin)

  • bin_min (float) – minimum bin value below which flag should not be applied

  • bin_max (float) – maximum bin value above which flag should not be applied

  • threshold_type (str) – option to apply a ‘std’ or ‘scalar’ based threshold

  • center_type (str) – option to use a ‘mean’ or ‘median’ center for each bin

  • direction (str) – option to apply flag only to data ‘above’ or ‘below’ the mean, otherwise the default is ‘all’

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
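As a rough illustration of the binning logic described above, the sketch below flags values that lie more than `threshold` standard deviations from the mean of their bin. It is written directly against pandas and is not the library's implementation; the helper name `flag_by_bin_std` and the sample data are hypothetical, and only the 'std' threshold with a 'mean' center in both directions is covered.

```python
import numpy as np
import pandas as pd


def flag_by_bin_std(bin_col, value_col, bin_width, threshold=2.0):
    """Flag values farther than `threshold` std from their bin mean (hypothetical helper)."""
    # Assign each row to a bin of width `bin_width` along the binning variable
    bin_id = np.floor(bin_col / bin_width)
    grouped = value_col.groupby(bin_id)
    center = grouped.transform("mean")
    spread = grouped.transform("std")
    # Single-point bins have NaN std; comparisons with NaN evaluate to False
    return (value_col - center).abs() > threshold * spread


# Made-up wind speed (m/s) and power (kW) samples; index 6 is an obvious outlier
wind_speed = pd.Series([5.1, 5.2, 5.3, 5.4, 5.2, 5.3, 5.1, 5.4, 9.1, 9.2])
power = pd.Series([300.0, 310.0, 305.0, 295.0, 302.0, 298.0, 0.0, 307.0, 1500.0, 1510.0])
flags = flag_by_bin_std(wind_speed, power, bin_width=1.0, threshold=2.0)
print(flags.tolist())
```

Only the zero-power sample in the 5 m/s bin is flagged here; the two-point 9 m/s bin stays unflagged because both points sit well within two standard deviations of their mean.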

operational_analysis.toolkits.filters.cluster_mahalanobis_2d(data_col1, data_col2, n_clusters=13, dist_thresh=3.0)[source]

K-means clustering of data into <n_clusters> clusters; the Mahalanobis distance is evaluated within each cluster, and points with distances greater than <dist_thresh> are flagged; distinguishes between asset ids

Parameters
  • data_col1 (pandas.Series) – first data column in 2D cluster analysis

  • data_col2 (pandas.Series) – second data column in 2D cluster analysis

  • n_clusters (int) – number of clusters to use

  • dist_thresh (float) – maximum Mahalanobis distance within each cluster for data to remain unflagged

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

operational_analysis.toolkits.filters.range_flag(data_col, below=-inf, above=inf)[source]

Flag data for which the specified data is outside a specified range

Parameters
  • data_col (pandas.Series) – data to be flagged

  • below (float) – lower threshold (inclusive) for data; values below it are flagged; default -np.inf

  • above (float) – upper threshold (inclusive) for data; values above it are flagged; default np.inf

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

operational_analysis.toolkits.filters.std_range_flag(data_col, threshold=2.0)[source]

Flag time stamps for which the measurement is outside of the threshold number of standard deviations from the mean across all passed columns; does not distinguish between asset ids

Parameters
  • data_col (pandas.Series) – data to be flagged

  • threshold (float) – multiplicative factor on standard deviation to use in flagging

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)

operational_analysis.toolkits.filters.unresponsive_flag(data_col, threshold=3)[source]

Flag time stamps for which the reported data does not change for <threshold> repeated intervals. Function includes the option to group by a column in the data frame (e.g. turbine ID)

Parameters
  • data_col (pandas.Series) – data to be flagged

  • threshold (int) – number of intervals over which measurement does not change

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
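One possible reading of "does not change for <threshold> repeated intervals" is sketched below: every sample that belongs to a run of at least `threshold` identical consecutive values is flagged. The helper name `flag_unresponsive` and the sample signal are hypothetical, and the library may count or mark the run differently.

```python
import pandas as pd


def flag_unresponsive(data_col, threshold=3):
    """Flag samples inside runs of at least `threshold` identical values (hypothetical helper)."""
    # A new run starts whenever the value differs from the previous sample
    run_id = (data_col != data_col.shift()).cumsum()
    # Length of the run each sample belongs to (NaN samples count as zero)
    run_length = data_col.groupby(run_id).transform("count")
    return run_length >= threshold


signal = pd.Series([1.0, 2.0, 2.0, 2.0, 3.0, 4.0, 4.0, 5.0])
flags = flag_unresponsive(signal, threshold=3)
print(flags.tolist())  # [False, True, True, True, False, False, False, False]
```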

operational_analysis.toolkits.filters.window_range_flag(window_col, window_start, window_end, value_col, value_min, value_max)[source]

Flag time stamps for which the measurement in <window_col> is within the range [window_start, window_end] and the measurement in <value_col> is outside of the range [value_min, value_max]

Parameters
  • window_col (pandas.Series) – data used to define the window

  • window_start (float) – minimum value for window

  • window_end (float) – maximum value for window

  • value_col (pandas.Series) – data to be flagged

  • value_max (float) – upper threshold for data

  • value_min (float) – lower threshold for data

Returns

Array-like object with boolean entries.

Return type

pandas.Series(bool)
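The window logic can be expressed as two boolean masks combined with a logical AND. The following is a minimal sketch under that reading, not the library's code; `flag_window_range` and the numbers are made up for illustration.

```python
import pandas as pd


def flag_window_range(window_col, window_start, window_end, value_col, value_min, value_max):
    """Flag rows inside the window whose value falls outside [value_min, value_max] (hypothetical helper)."""
    in_window = (window_col >= window_start) & (window_col <= window_end)
    out_of_range = (value_col < value_min) | (value_col > value_max)
    return in_window & out_of_range


wind_speed = pd.Series([3.0, 8.0, 12.0, 15.0, 20.0])
power = pd.Series([50.0, 900.0, 1200.0, 2000.0, 100.0])
# Between 10 and 25 m/s, power is expected to lie between 1800 and 2200 kW (made-up limits)
flags = flag_window_range(wind_speed, 10.0, 25.0, power, 1800.0, 2200.0)
print(flags.tolist())  # [False, False, True, False, True]
```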

Power Curve

This module provides methods to fit power curve models and use them to make predictions about ‘ideal’ power generation.

operational_analysis.toolkits.power_curve.functions.IEC(windspeed_column, power_column, bin_width=0.5, windspeed_start=0, windspeed_end=30.0)[source]

Use IEC 61400-12-1-2 method for creating wind-speed binned power curve. Power is set to zero for windspeed values outside of the cutoff range specified by windspeed_start and windspeed_end, inclusive of the endpoints.

Parameters
  • windspeed_column (pandas.Series) – feature column

  • power_column (pandas.Series) – response column

  • bin_width (float) – width of windspeed bin, default is 0.5 m/s according to standard

  • windspeed_start (float) – left edge of first windspeed bin

  • windspeed_end (float) – right edge of last windspeed bin

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function
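To make the "returns a function" idea concrete, the sketch below builds a binned power curve with NumPy: power is averaged per wind speed bin, and the returned callable looks up the bin mean, giving zero outside the cutoff range. This is an illustration only; the helper name `binned_power_curve`, the left-closed bin convention, and the zero value for empty bins are assumptions rather than details taken from the library or the standard.

```python
import numpy as np


def binned_power_curve(windspeed, power, bin_width=0.5, ws_start=0.0, ws_end=30.0):
    """Return a function mapping wind speed to the mean observed power of its bin (hypothetical helper)."""
    ws = np.asarray(windspeed, dtype=float)  # pandas Series are accepted as well
    pw = np.asarray(power, dtype=float)
    n_bins = int(round((ws_end - ws_start) / bin_width))

    def bin_index(values):
        # Left-closed bins [start + i * width, start + (i + 1) * width)
        return np.clip(((values - ws_start) // bin_width).astype(int), 0, n_bins - 1)

    valid = (ws >= ws_start) & (ws <= ws_end)
    idx = bin_index(ws[valid])
    sums = np.bincount(idx, weights=pw[valid], minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    # Empty bins are given zero power in this sketch
    bin_mean = np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)

    def curve(values):
        values = np.asarray(values, dtype=float)
        inside = (values >= ws_start) & (values <= ws_end)
        return np.where(inside, bin_mean[bin_index(values)], 0.0)

    return curve


curve = binned_power_curve([4.1, 4.3, 4.4, 6.1, 6.2], [200.0, 220.0, 240.0, 600.0, 620.0])
print(curve([4.2, 6.3, 35.0, 10.0]))
```

The first two query points fall into populated bins and return the bin means; 35 m/s lies beyond `ws_end` and 10 m/s falls into an empty bin, so both return zero.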

operational_analysis.toolkits.power_curve.functions.gam(windspeed_column, power_column, n_splines=20)[source]

Use a generalized additive model to fit power to wind speed.

Parameters
  • windspeed_column (pandas.Series) – Wind speed feature column

  • power_column (pandas.Series) – Power response column

  • n_splines (int) – number of splines to use in the fit

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function

operational_analysis.toolkits.power_curve.functions.gam_3param(windspeed_column, winddir_column, airdens_column, power_column, n_splines=20)[source]

Use a generalized additive model to fit power to wind speed, wind direction and air density.

Parameters
  • windspeed_column (pandas.Series) – Wind speed feature column

  • power_column (pandas.Series) – Power response column

  • winddir_column (pandas.Series) – Optional. Wind direction feature column

  • airdens_column (pandas.Series) – Optional. Air density feature column

  • n_splines (int) – number of splines to use in the fit

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function

operational_analysis.toolkits.power_curve.functions.logistic_5_parametric(windspeed_column, power_column)[source]

The present implementation follows the filtering method reported in:

M. Yesilbudak, “Partitional clustering-based outlier detection for power curve optimization of wind turbines,” 2016 IEEE International Conference on Renewable Energy Research and Applications (ICRERA), Birmingham, 2016, pp. 1080-1084.

and the power curve method developed and reviewed in:

M. Lydia, A.I. Selvakumar, S.S. Kumar, G.E.P. Kumar, “Advanced algorithms for wind turbine power curve modeling,” IEEE Trans. Sustainable Energy, 4 (2013), pp. 827-835.

M. Lydia, S.S. Kumar, I. Selvakumar, G.E. Prem Kumar, “A comprehensive review on wind turbine power curve modeling techniques,” Renew. Sust. Energy Rev., 30 (2014), pp. 452-460.

The function fits the 5-parameter logistic function to observed data via a least-squares optimization (i.e. minimizing the sum of the squared residuals between the values of the parameterized function and the observed data points).

Parameters
  • windspeed_column (pandas.Series) – feature column

  • power_column (pandas.Series) – response column

Returns

Python function of type (Array[float] -> Array[float]) implementing the power curve.

Return type

function
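For orientation, a commonly cited form of the 5-parameter logistic function is P(u) = d + (a - d) / (1 + (u / c)^b)^g. The sketch below evaluates that form and the sum of squared residuals that a least-squares optimizer would minimize. The parameter names and their ordering are assumptions made for illustration; the library may parameterize the curve differently, and no optimizer is run here.

```python
def logistic_5p(u, a, b, c, d, g):
    """Evaluate d + (a - d) / (1 + (u / c) ** b) ** g (assumed parameterization)."""
    return d + (a - d) / (1.0 + (u / c) ** b) ** g


def sum_squared_residuals(params, windspeeds, powers):
    """Objective that a least-squares fit would minimize over `params`."""
    a, b, c, d, g = params
    return sum((logistic_5p(u, a, b, c, d, g) - p) ** 2 for u, p in zip(windspeeds, powers))


# Hypothetical parameters: 0 kW lower asymptote, 2000 kW upper asymptote, midpoint near 8 m/s
params = (0.0, 6.0, 8.0, 2000.0, 1.0)
print(logistic_5p(8.0, *params))  # 1000.0
print(sum_squared_residuals(params, [4.0, 8.0, 12.0], [50.0, 1000.0, 1900.0]))
```

With g = 1 the curve reaches exactly half of the upper asymptote at u = c, which is why the first call prints 1000.0.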

Imputing

This module provides methods for filling in null data with interpolated (imputed) values.

operational_analysis.toolkits.imputing.correlation_matrix_by_id_column(df, align_col, id_col, value_col)[source]

Create a correlation matrix between different assets in a data frame

Parameters
  • df (pandas.DataFrame) – input data frame

  • align_col (str) – name of column in <df> on which different assets are to be aligned

  • id_col (str) – the column distinguishing the different assets

  • value_col (str) – the column containing the data values to be used when assessing correlation

Returns

Correlation matrix with <id_col> as index and column names

Return type

pandas.DataFrame

operational_analysis.toolkits.imputing.impute_all_assets_by_correlation(data, input_col, ref_col, align_col, id_col, r2_threshold=0.7, method='linear')[source]

Imputes NaN data in a Pandas data frame to the best extent possible by considering available data across different assets in the data frame. Highest correlated assets are prioritized in the imputation process.

Steps include:

  1. Establish correlation matrix of specified data between different assets

  2. For each asset in the data frame, sort neighboring assets by correlation strength

  3. Then impute asset data based on available data in the highest correlated neighbor

  4. If NaN data still remains in asset, move on to next highest correlated neighbor, etc.

  5. Continue until either:
    1. There are no NaN data remaining in asset data

    2. There are no more neighbors to consider

    3. The neighboring asset does not meet the specified correlation threshold, <r2_threshold>

Parameters
  • data (pandas.DataFrame) – the data frame subject to imputation

  • input_col (str) – the name of the column in <data> to be imputed

  • ref_col (str) – the name of the column in <data> to be used in imputation

  • align_col (str) – the name of the column in <data> on which different assets are to be merged

  • id_col (str) – the name of the column in <data> distinguishing different assets

  • r2_threshold (float) – the correlation threshold for a neighboring asset to be considered valid for use in imputation

Returns

The imputation results

Return type

pandas.Series

operational_analysis.toolkits.imputing.impute_data(target_data, target_value_col, ref_data, ref_value_col, align_col, method='linear')[source]

Replaces NaN data in a target Pandas series with imputed data from a reference Pandas series based on a linear regression relationship.

Steps include:

  1. Merge the target and reference data frames on <align_col>, which is shared between the two

  2. Determine the linear regression relationship between the target and reference data series

  3. Apply that relationship to NaN data in the target series for which there is finite data in the reference series

  4. Return the imputed results as well as the index matching the target data frame

Parameters
  • target_data (pandas.DataFrame) – the data frame containing NaN data to be imputed

  • target_value_col (str) – the name of the column in <target_data> to be imputed

  • ref_data (pandas.DataFrame) – the data frame containing data to be used in imputation

  • ref_value_col (str) – the name of the column in <ref_data> to be used in imputation

  • align_col (str) – the name of the column, shared by <target_data> and <ref_data>, on which the two data frames are to be merged

Returns

Copy of the <target_value_col> series with NaN occurrences imputed where possible.

Return type

pandas.Series
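The four steps listed above can be sketched with pandas and NumPy as follows. This is an illustrative reimplementation under simplifying assumptions (a plain first-order `numpy.polyfit`, and unique `align_col` values in the reference frame), not the library's code; the helper name `impute_by_linear_regression` and the data are hypothetical.

```python
import numpy as np
import pandas as pd


def impute_by_linear_regression(target_data, target_value_col, ref_data, ref_value_col, align_col):
    """Fill NaNs in the target column from a linear fit against the reference column (hypothetical helper)."""
    # Rename the reference column so it cannot clash with the target column name
    ref = ref_data[[align_col, ref_value_col]].rename(columns={ref_value_col: "_ref"})
    merged = target_data[[align_col, target_value_col]].merge(ref, on=align_col, how="left")
    y, x = merged[target_value_col], merged["_ref"]

    # Fit y = slope * x + intercept on rows where both series are available
    both = y.notna() & x.notna()
    slope, intercept = np.polyfit(x[both], y[both], 1)

    # Apply the fit only where the target is missing and the reference is available
    fill = y.isna() & x.notna()
    imputed = y.copy()
    imputed[fill] = slope * x[fill] + intercept
    imputed.index = target_data.index  # assumes align_col is unique in ref_data
    return imputed


target = pd.DataFrame({"time": [1, 2, 3, 4], "power": [10.0, np.nan, 30.0, 40.0]})
reference = pd.DataFrame({"time": [1, 2, 3, 4], "power": [5.0, 10.0, 15.0, 20.0]})
result = impute_by_linear_regression(target, "power", reference, "power", "time")
print(result.round(6).tolist())
```

In this made-up data the target is exactly twice the reference, so the missing value at time 2 is filled with approximately 20.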

Timeseries

This module provides useful functions for processing timeseries data

operational_analysis.toolkits.timeseries.convert_local_to_utc(d, tz_string)[source]

Convert timestamps in local time to UTC. The function can only act on a single timestamp at a time, so for example use the .apply function in Pandas:

date_utc = df['time'].apply(convert_local_to_utc, args=('US/Pacific',))

Also note that this function does not resolve the ambiguity at the end of DST in November, when local times between 1:00 and 2:00 are repeated. Those times are left repeated in UTC and need to be shifted manually.

The function does address the missing 2:00-3:00 times at the start of DST in March.

Parameters
  • d (datetime.datetime) – the local date, tzinfo must not be set

  • tz_string (str) – the local timezone

Returns

the local date converted to UTC time

Return type

datetime.datetime

operational_analysis.toolkits.timeseries.find_duplicate_times(t_series, freq)[source]

Find duplicate input data and report them. The first duplicated item is not reported, only subsequent duplicates.

Parameters
  • t_series (pandas.Series) – Pandas series of datetime objects

  • freq (string) – time series frequency

Returns

Duplicates from input data

Return type

pandas.Series

operational_analysis.toolkits.timeseries.find_time_gaps(t_series, freq)[source]

Find data gaps in input data and report them

Parameters
  • t_series (pandas.Series) – Pandas series of datetime objects

  • freq (string) – time series frequency

Returns

Series of missing time stamps in datetime format

Return type

pandas.Series

operational_analysis.toolkits.timeseries.gap_fill_data_frame(df, time_col, freq)[source]

Find missing timestamps in the input data frame and add rows with NaN values for those missing rows. Return a new data frame that has no missing timestamps and that is sorted by time.

Parameters
  • df (pandas.DataFrame) – the input data frame

  • time_col (str) – name of the column in ‘df’ with time data

  • freq (string) – time series frequency

Returns

output data frame with NaN data for the data gaps

Return type

pandas.DataFrame
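Gap filling of this kind is typically a reindex against a complete date range. The sketch below shows that approach with plain pandas; it assumes the timestamps are unique, and the helper name `fill_time_gaps` and the sample frame are hypothetical.

```python
import pandas as pd


def fill_time_gaps(df, time_col, freq):
    """Insert NaN rows for missing timestamps and return a frame sorted by time (hypothetical helper)."""
    # Complete, sorted range of timestamps between the first and last observation
    full_index = pd.date_range(df[time_col].min(), df[time_col].max(), freq=freq)
    out = df.set_index(time_col).reindex(full_index)
    out.index.name = time_col
    return out.reset_index()


df = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:40"]),
    "power": [100.0, 110.0, 140.0],
})
filled = fill_time_gaps(df, "time", "10min")
print(filled)
```

The two missing ten-minute stamps (00:20 and 00:30) appear in the result with NaN power.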

operational_analysis.toolkits.timeseries.num_days(s)[source]

Return number of days in ‘s’

Parameters

s (pandas.Series) – The data to be checked for number of days.

Returns

Number of days in the data

Return type

int

operational_analysis.toolkits.timeseries.num_hours(s)[source]

Return number of hours in ‘s’

Parameters

s (pandas.Series) – The data to be checked for number of hours

Returns

Number of hours in the data

Return type

int

operational_analysis.toolkits.timeseries.percent_nan(s)[source]

Return percentage of data that are NaN (expressed as a fraction), or 1 if the series is empty.

Parameters

s (pandas.Series) – The data to be checked for ‘na’ values

Returns

Percentage of NaN data in the data series

Return type

float

Met Data Processing

This module provides methods for processing meteorological data.

operational_analysis.toolkits.met_data_processing.air_density_adjusted_wind_speed(wind_col, density_col)[source]

Apply air density correction to wind speed measurements following IEC-61400-12-1 standard

Parameters
  • wind_col (str) – array containing the wind speed data; units of m/s

  • density_col (str) – array containing the air density data; units of kg/m3

Returns

density-adjusted wind speeds; units of m/s

Return type

pandas.Series
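The density correction is usually written as U_adj = U * (rho / rho_ref)^(1/3). The sketch below applies that relation in plain Python and takes the mean of the supplied densities as the reference density; that choice of reference, like the helper name `density_adjusted_wind_speed`, is an assumption for illustration and should be checked against the standard and the library source.

```python
def density_adjusted_wind_speed(wind_speeds, densities):
    """Scale each wind speed by (rho / rho_mean) ** (1/3) (hypothetical helper)."""
    rho_mean = sum(densities) / len(densities)  # assumed reference density
    return [u * (rho / rho_mean) ** (1.0 / 3.0) for u, rho in zip(wind_speeds, densities)]


# Same measured wind speed (m/s) at a low and a high air density (kg/m3)
adjusted = density_adjusted_wind_speed([8.0, 8.0], [1.1, 1.3])
print(adjusted)
```

Denser air carries more kinetic energy at the same speed, so the sample at 1.3 kg/m3 is adjusted upward and the one at 1.1 kg/m3 downward.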

operational_analysis.toolkits.met_data_processing.compute_air_density(temp_col, pres_col, humi_col=None)[source]

Calculate air density from the ideal gas law based on the definition provided by IEC 61400-12 given pressure, temperature and relative humidity.

This function assumes temperature and pressure are reported in standard units of measurement (i.e. Kelvin for temperature, Pascal for pressure, humidity has no dimension).

Humidity values are optional. According to the IEC, a humidity of 50% (0.5) is used as the default value.

Parameters
  • temp_col (array-like) – array with temperature values; units of Kelvin

  • pres_col (array-like) – array with pressure values; units of Pascals

  • humi_col (array-like) – optional array with relative humidity values; dimensionless (range 0 to 1)

Returns

Rho, calculated air density; units of kg/m3

Return type

pandas.Series
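A scalar sketch of a humidity-corrected ideal gas density is given below. The gas constants and the vapour pressure approximation are the values commonly quoted from IEC 61400-12-1 as recalled here, so treat them as assumptions and verify them against the standard; the helper name `air_density` is hypothetical.

```python
import math

R_DRY = 287.05    # gas constant of dry air, J/(kg K) (assumed value)
R_VAPOUR = 461.5  # gas constant of water vapour, J/(kg K) (assumed value)


def air_density(temp_k, pres_pa, rel_humidity=0.5):
    """Air density in kg/m3 from temperature (K), pressure (Pa) and relative humidity (0 to 1)."""
    # Approximate vapour pressure in Pa as a function of temperature
    vapour_pressure = 0.0000205 * math.exp(0.0631846 * temp_k)
    return (1.0 / temp_k) * (
        pres_pa / R_DRY - rel_humidity * vapour_pressure * (1.0 / R_DRY - 1.0 / R_VAPOUR)
    )


rho_humid = air_density(288.15, 101325.0)      # default 50% relative humidity
rho_dry = air_density(288.15, 101325.0, 0.0)   # dry air reduces to p / (R T)
print(round(rho_humid, 3), round(rho_dry, 3))
```

Humid air comes out slightly less dense than dry air at the same temperature and pressure, as expected.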

operational_analysis.toolkits.met_data_processing.compute_shear(df, windspeed_heights, ref_col='empty')[source]

Compute shear coefficient between wind speed measurements

Parameters
  • df (pandas.DataFrame) – dataframe with wind speed columns

  • windspeed_heights (dict) – keys are strings of columns in <df> containing wind speed data, values are associated sensor heights (m)

  • ref_col (str) – data column name for the data to use as the normalization value; only pertinent if optimizing over multiple measurements

Returns

shear coefficient (unitless)

Return type

pandas.Series
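For two measurement heights, a shear coefficient is commonly obtained from the power-law profile U2 / U1 = (z2 / z1)^alpha. The sketch below solves that relation for alpha; it assumes the power-law definition and only covers the two-height case, whereas the library function also handles several heights via `ref_col`. The helper name `shear_exponent` is hypothetical.

```python
import math


def shear_exponent(speed_low, height_low, speed_high, height_high):
    """Power-law shear exponent between two heights (hypothetical helper)."""
    return math.log(speed_high / speed_low) / math.log(height_high / height_low)


# 6 m/s measured at 40 m and 7 m/s measured at 80 m (made-up values)
alpha = shear_exponent(6.0, 40.0, 7.0, 80.0)
print(round(alpha, 3))  # 0.222
```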

operational_analysis.toolkits.met_data_processing.compute_turbulence_intensity(mean_col, std_col)[source]

Compute turbulence intensity

Parameters
  • mean_col (array) – array containing the wind speed mean data; units of m/s

  • std_col (array) – array containing the wind speed standard deviation data; units of m/s

Returns

turbulence intensity, (unitless ratio)

Return type

array

operational_analysis.toolkits.met_data_processing.compute_u_v_components(wind_speed, wind_dir)[source]

Compute vector components of the horizontal wind given wind speed and direction

Parameters
  • wind_speed (pandas.Series) – horizontal wind speed; units of m/s

  • wind_dir (pandas.Series) – wind direction; units of degrees

Returns

u (pandas.Series): the zonal component of the wind; units of m/s. v (pandas.Series): the meridional component of the wind; units of m/s

Return type

(tuple)

operational_analysis.toolkits.met_data_processing.compute_veer(wind_a, height_a, wind_b, height_b)[source]

Compute veer between wind direction measurements

Parameters
  • wind_a, wind_b (array) – arrays containing the wind direction mean data; units of deg

  • height_a, height_b (array) – sensor heights (m)

Returns

veer (deg/m)

Return type

veer(array)

operational_analysis.toolkits.met_data_processing.compute_wind_direction(u, v)[source]

Compute wind direction given u and v wind vector components

Parameters
  • u (pandas.Series) – the zonal component of the wind; units of m/s

  • v (pandas.Series) – the meridional component of the wind; units of m/s

Returns

wind direction; units of degrees

Return type

pandas.Series
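The relation between (speed, direction) and (u, v) depends on the direction convention. The scalar sketch below assumes the meteorological convention, where direction is the compass bearing the wind blows from, measured clockwise from north; that convention and the helper names are assumptions for illustration, not statements about the library.

```python
import math


def u_v_components(wind_speed, wind_dir_deg):
    """Zonal (u) and meridional (v) components for a 'blowing from' direction."""
    rad = math.radians(wind_dir_deg)
    return -wind_speed * math.sin(rad), -wind_speed * math.cos(rad)


def wind_direction(u, v):
    """Direction in degrees (0 to 360) the wind blows from, given u and v."""
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0


u, v = u_v_components(10.0, 270.0)  # westerly wind: air moves toward the east
print(round(u, 6), round(v, 6))
print(round(wind_direction(u, v), 6))
```

A westerly wind gives a positive u and a v of essentially zero, and converting back recovers 270 degrees.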

operational_analysis.toolkits.met_data_processing.pressure_vertical_extrapolation(p0, temp_avg, z0, z1)[source]

Extrapolate pressure from height z0 to height z1 given the average temperature in the layer. The hydrostatic equation is used to perform the extrapolation.

Parameters
  • p0 (pandas.Series) – pressure at height z0; units of Pascals

  • temp_avg (pandas.Series) – mean temperature between z0 and z1; units of Kelvin

  • z0 (pandas.Series) – height above surface; units of meters

  • z1 (pandas.Series) – extrapolation height; units of meters

Returns

p1, extrapolated pressure at z1; units of Pascals

Return type

pandas.Series
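Integrating the hydrostatic equation over a layer with constant mean temperature gives p1 = p0 * exp(-g (z1 - z0) / (R T)). The sketch below evaluates that expression for scalars; the constants `GRAVITY` and `R_DRY` are typical values chosen for illustration and may differ from those used in the library, and the helper name is hypothetical.

```python
import math

GRAVITY = 9.81  # m/s2 (assumed value)
R_DRY = 287.05  # gas constant of dry air, J/(kg K) (assumed value)


def extrapolate_pressure(p0, temp_avg, z0, z1):
    """Pressure at height z1 from pressure p0 at height z0 and the mean layer temperature."""
    return p0 * math.exp(-GRAVITY * (z1 - z0) / (R_DRY * temp_avg))


# From the surface to an 80 m hub height at a mean temperature of 288 K
p1 = extrapolate_pressure(100000.0, 288.0, 0.0, 80.0)
print(round(p1, 1))
```

The pressure drops by a little under 1% over the 80 m layer in this example.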

Metadata Fetch

This module fetches metadata of wind farms

operational_analysis.toolkits.metadata_fetch.add_eia_meta_to_project(project, api_key, plant_id, file_path)[source]

Assign EIA meta data to PlantData object.

Parameters
  • project (PlantData) – PlantData object for a particular project

  • api_key (string) – 32-character user-specific API key, obtained from EIA

  • plant_id (string) – 5-character EIA power plant code

  • file_path (string) – directory with EIA metadata .xlsx files

Returns

(None)

operational_analysis.toolkits.metadata_fetch.fetch_eia(api_key, plant_id, file_path)[source]

Read in EIA data for the wind farm of interest:

  • from the EIA API for monthly production, return the monthly net energy generation time series

  • from local Excel files for wind farm metadata, return a dictionary of metadata

Parameters
  • api_key (string) – 32-character user-specific API key, obtained from EIA

  • plant_id (string) – 5-character EIA power plant code

  • file_path (string) – directory with EIA metadata .xlsx files in 2017

Returns

pandas.Series: monthly net energy generation in MWh. dictionary: metadata of the wind farm with ‘plant_id’

Return type

pandas.Series

Unit Conversion

This module provides basic methods for unit conversion and calculation of basic wind plant variables

operational_analysis.toolkits.unit_conversion.compute_gross_energy(net_energy, avail_losses, curt_losses, avail_type='frac', curt_type='frac')[source]

This function computes gross energy for a wind plant or turbine by adding reported availability and curtailment losses to reported net energy. The calculation accounts for whether availability or curtailment loss data are reported in energy (‘energy’) or fractional (‘frac’) units. If in energy units, this function assumes that net energy, availability loss, and curtailment loss are all reported in the same units.

Parameters
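  • net_energy (numpy array or pandas series) – reported net energy for wind plant or turbine

  • avail_losses (numpy array or pandas series) – reported availability losses for wind plant or turbine

  • curt_losses (numpy array or pandas series) – reported curtailment losses for wind plant or turbine

  • avail_type (str) – units of the availability losses, either ‘frac’ or ‘energy’

  • curt_type (str) – units of the curtailment losses, either ‘frac’ or ‘energy’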

Returns

calculated gross energy for wind plant or turbine

Return type

gross (numpy array or pandas series)
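How the two unit options combine can be sketched as below. The sketch assumes that fractional losses are expressed as fractions of gross energy, which is an interpretation rather than something stated in the text above; the helper name `gross_energy` is hypothetical and works on scalars for brevity.

```python
def gross_energy(net_energy, avail_losses, curt_losses, avail_type="frac", curt_type="frac"):
    """Gross energy from net energy plus availability and curtailment losses (hypothetical helper)."""
    if avail_type == "frac" and curt_type == "frac":
        return net_energy / (1.0 - avail_losses - curt_losses)
    if avail_type == "frac" and curt_type == "energy":
        return net_energy / (1.0 - avail_losses) + curt_losses
    if avail_type == "energy" and curt_type == "frac":
        return net_energy / (1.0 - curt_losses) + avail_losses
    if avail_type == "energy" and curt_type == "energy":
        return net_energy + avail_losses + curt_losses
    raise ValueError("avail_type and curt_type must be 'frac' or 'energy'")


# 900 MWh net with 5% availability and 5% curtailment losses, given both ways
gross_from_frac = gross_energy(900.0, 0.05, 0.05)
gross_from_energy = gross_energy(900.0, 50.0, 50.0, avail_type="energy", curt_type="energy")
print(round(gross_from_frac, 6), gross_from_energy)
```

Both calls describe the same situation and give a gross energy of about 1000 MWh.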

operational_analysis.toolkits.unit_conversion.convert_feet_to_meter(variable)[source]

Compute variable in [meter] from [feet] and return the data column

Parameters
  • variable (pandas.Series) – variable in feet

Returns

variable in meters

Return type

pandas.Series

operational_analysis.toolkits.unit_conversion.convert_power_to_energy(power_col, sample_rate_min='10T')[source]

Compute energy [kWh] from power [kW] and return the data column

Parameters
  • power_col (pandas.Series) – power data in kW

  • sample_rate_min (str) – sampling rate of the power data, given as a frequency string; the default ‘10T’ corresponds to ten minutes

Returns

Energy in kWh, with the same length as <power_col>

Return type

pandas.Series
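The conversion itself is average power multiplied by the sampling interval in hours. A minimal sketch follows; the lookup table `MINUTES_PER_SAMPLE` and the helper name are hypothetical and only cover a few frequency strings in the style of the '10T' default shown in the signature.

```python
# Hypothetical mapping from frequency strings to minutes per sample
MINUTES_PER_SAMPLE = {"1T": 1.0, "5T": 5.0, "10T": 10.0, "30T": 30.0, "1H": 60.0}


def power_to_energy_kwh(power_kw, sample_rate_min="10T"):
    """Energy in kWh for each power sample in kW (hypothetical helper)."""
    minutes = MINUTES_PER_SAMPLE[sample_rate_min]
    return [p * minutes / 60.0 for p in power_kw]


energy = power_to_energy_kwh([600.0, 1200.0])
print(energy)  # [100.0, 200.0]
```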

Plotting

This module provides helpful functions for creating various plots

operational_analysis.toolkits.pandas_plotting.color_to_rgb(color)[source]

Converts named colors, hex and normalised RGB to 255 RGB values

Parameters

color (color) – RGB, HEX or named color

Returns

255 RGB values

Return type

rgb(tuple)

Example

>>> color_to_rgb("Red")
(255, 0, 0)

>>> color_to_rgb((1,1,0))
(255,255,0)

>>> color_to_rgb("#ff00ff")
(255,0,255)

operational_analysis.toolkits.pandas_plotting.coordinateMapping(lon1, lat1, lon2, lat2)[source]

Map latitude and longitude to local cartesian coordinates

Parameters
  • lon1 (numpy array of shape (1, ) or scalar) – longitude of cartesian coordinate system origin

  • lat1 (numpy array of shape (1, ) or scalar) – latitude of cartesian coordinate system origin

  • lon2 (numpy array of shape (n, ) or scalar) – longitude(s) of points of interest

  • lat2 (numpy array of shape (n, ) or scalar) – latitude(s) of points of interest

Returns

Tuple representing cartesian coordinates (x, y); if arguments entered as scalars, returns scalars in tuple, if arguments entered as numpy arrays, returns numpy arrays each of shape (n,1)

operational_analysis.toolkits.pandas_plotting.luminance(rgb)[source]

Calculates the brightness of an rgb 255 color. See https://en.wikipedia.org/wiki/Relative_luminance

Parameters

rgb (tuple) – 255 (red, green, blue) tuple

Returns

relative luminance

Return type

luminance(scalar)

Example

>>> rgb = (255,127,0)
>>> luminance(rgb)
0.5687976470588235

>>> luminance((0,50,255))
0.21243529411764706

operational_analysis.toolkits.pandas_plotting.plot_array(project)[source]

Plot locations of turbines and met towers, with labels, on latitude/longitude grid

Parameters

project (plant object) – project to be plotted

Returns

(None)

operational_analysis.toolkits.pandas_plotting.plot_windfarm(project, tile_name='OpenMap', plot_width=800, plot_height=800, marker_size=14, kwargs_for_figure={}, kwargs_for_marker={})[source]

Plot the windfarm spatially on a map using the Bokeh plotting library.

Parameters
  • project (plant object) – project to be plotted

  • tile_name (str) – tile set to be used for the underlay, e.g. OpenMap, ESRI, OpenTopoMap

  • plot_width (scalar) – width of plot

  • plot_height (scalar) – height of plot

  • marker_size (scalar) – size of markers

  • kwargs_for_figure (dict) – additional figure options for advanced users, see Bokeh docs

  • kwargs_for_marker (dict) – additional marker options for advanced users, see Bokeh docs. We have some custom behavior around the “fill_color” attribute. If “fill_color” is not defined, OpenOA will use an internally defined color palette. If “fill_color” is the name of a column in the asset table, OpenOA will use the value of that column as the marker color. Otherwise, “fill_color” is passed through to Bokeh.

Returns

windfarm map

Return type

Bokeh_plot(axes handle)

Example

import pandas as pd

from bokeh.plotting import figure, output_file, show

from operational_analysis.toolkits.pandas_plotting import plot_windfarm
from operational_analysis.types import PlantData

from examples.project_ENGIE import Project_Engie

# Load plant object
project = Project_Engie("../examples/data/la_haute_borne")

# Prepare data
project.prepare()

# Create the bokeh wind farm plot
show(plot_windfarm(project,tile_name="ESRI",plot_width=600,plot_height=600))

operational_analysis.toolkits.pandas_plotting.powerRose_array(project, fig, rect, tid, model_eval, shift=[0], direction=1)[source]

Plot power curve on polar coordinates overlaying plot of surrounding array (both local and further distance)

Parameters
  • project (plant object) – project to be plotted

  • fig (figure handle) – figure handle

  • rect (list of four scalars) – [left offset, bottom offset, width, height] as fractions of figure width/height

  • tid (string) – id of turbine to be plotted

  • model_eval (dict) – not documented

  • shift (list of scalars) – number of degrees to rotate wind direction data, each plotted as new line

  • direction (-1, 1) – wind direction data measured clockwise (1) or counterclockwise (-1)

Returns:

operational_analysis.toolkits.pandas_plotting.subplot_powerRose_array(project, turbine_ids, shift=0, direction=1, columns=None, left_margin=0.1, bottom_margin=0.1, gap_w_frac=0.2, gap_h_frac=0.2, aspect=1)[source]

Wrapper for powerRose_array plotting for multiple subplots

Parameters
  • project (plant object) – project to be plotted

  • turbine_ids (list of strings) – ids of turbines to be plotted

  • shift (list of scalars) – number of degrees to rotate wind direction data, each plotted as new line

  • direction (-1, 1) – wind direction data measured clockwise (1) or counterclockwise (-1)

  • columns (scalar integer) – number of subplot columns

  • left_margin (scalar) – fraction of figure width to include as left margin

  • bottom_margin (scalar) – fraction of figure height to include as bottom margin

  • gap_w_frac (scalar) – fraction of figure width to include between subplots

  • gap_h_frac (scalar) – fraction of figure height to include between subplots

  • aspect (scalar) – aspect ratio for subplots

Returns

(None)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2(turbine, axarr, c1, c2, c='Blues', xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2]

Parameters
  • turbine (pandas dataframe) – data to be plotted

  • axarr (axis handle) – axis handle

  • c1 (string) – column name of x axis

  • c2 (string) – column name of y axis

  • c (string or colormap handle) – colormap

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2_flagged(turbine, axarr, c1, c2, flag_cols, flag_value, cmap='Blues', xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2], showing only data for which <flag_cols> have the value <flag_value>

Parameters
  • turbine (pandas dataframe) – data to be plotted

  • axarr (axis handle) – axis handle

  • c1 (string) – column name of x axis

  • c2 (string) – column name of y axis

  • cmap (string or colormap handle) – colormap

  • flag_cols (list of strings) – column name(s) for flag columns

  • flag_value (string) – value in <flag_cols> for which data are plotted

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_c1_c2_raw_flagged(turbine, axarr, c1, c2, flag_cols, flag_value, cmap='Blues', markers=['x'], colors=['r'], xlim=None, ylim=None, xlabel=None, ylabel=None)[source]

Hexbin plot of turbine[c1] vs turbine[c2], showing data for which <flag_cols> have the value <flag_value> as an overlaid scatter plot

Parameters
  • turbine (pandas dataframe) – data to be plotted

  • axarr (axis handle) – axis handle

  • c1 (string) – column name of x axis

  • c2 (string) – column name of y axis

  • cmap (string or colormap handle) – colormap

  • flag_cols (list of strings) – column name(s) for flag columns

  • flag_value (string) – value in <flag_cols> for which data are plotted

Returns

Return type

hb(plot handle)

operational_analysis.toolkits.pandas_plotting.subplt_power_curve(turbine, axarr, fig, c3, pc)[source]

operational_analysis.toolkits.pandas_plotting.turbine_polar_4Dscatter(array, tid, theta, r, color, size, cmap='autumn_r')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters
  • array (pandas dataframe) – index by (string) labels of assets, ‘x’ and ‘y’ coordinate columns

  • tid (str) – index of asset on which to center cartesian axes

  • theta (pandas series, np array, list) – angular coordinates of points, in degrees

  • r (pandas series, np array, list) – radial coordinates of points

  • color (pandas series, np array, list) – color of points

  • size (pandas series, np array, list) – size of points

Returns

ax_carthesian (axes handle): cartesian axes on which the array is plotted. ax_polar (axes handle): polar axes on which the data are plotted

Return type

(tuple)

operational_analysis.toolkits.pandas_plotting.turbine_polar_contour(array, tid, theta, r, z, levels, colors, ax_carthesian=None, ax_polar=None, label='')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters
  • array (pandas dataframe) – index by (string) labels of assets, ‘x’ and ‘y’ coordinate columns

  • tid (str) – index of asset on which to center cartesian axes

  • theta (pandas series, np array, list) – angular coordinates of points, in degrees

  • r (pandas series, np array, list) – radial coordinates of points

  • z (pandas series, np array, list) – colors of points

  • levels (list of float) – levels at which to draw contours

  • colors (list of colormap rows) – colors of drawn contours

  • ax_carthesian (axes handle) – cartesian axes on which the array is plotted

  • ax_polar (axes handle) – polar axes on which data plotted

  • label (string) – legend label

Returns

ax_carthesian (axes handle): cartesian axes on which the array is plotted. ax_polar (axes handle): polar axes on which the data are plotted

Return type

(tuple)

operational_analysis.toolkits.pandas_plotting.turbine_polar_contourf(array, tid, theta, r, c, cmap='autumn_r')[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters
  • array (pandas dataframe) – index by (string) labels of assets, ‘x’ and ‘y’ coordinate columns

  • tid (str) – index of asset on which to center cartesian axes

  • theta (pandas series, np array, list) – angular coordinates of points, in degrees

  • r (pandas series, np array, list) – radial coordinates of points

  • c (pandas series, np array, list) – colors of points

Returns

ax_carthesian (axes handle): cartesian axes on which the array is plotted. ax_polar (axes handle): polar axes on which the data are plotted

Return type

(tuple)

operational_analysis.toolkits.pandas_plotting.turbine_polar_line(array, theta, r, line_label, tid, color='b', ax_carthesian=None, ax_polar=None)[source]

Polar plot (<r>, <theta>) overlaying plot of surrounding array, centered on turbine <tid>

Parameters
  • array (pandas dataframe) – index by (string) labels of assets, ‘x’ and ‘y’ coordinate columns

  • theta (pandas series, np array, list) – angular coordinates of points, in degrees

  • r (pandas series, np array, list) – radial coordinates of points

  • line_label (str) – legend label

  • tid (str) – index of asset on which to center cartesian axes

  • ax_carthesian (axes handle) – existing cartesian axes on which to add array plot

  • ax_polar (axes handle) – existing polar axes on which to add plot

Returns

ax_carthesian (axes handle): cartesian axes on which the array is plotted. ax_polar (axes handle): polar axes on which the data are plotted

Return type

(tuple)