Utils

This chapter documents the Utils: functions and plots that aid exploratory analysis.

Analysis

One-off functions for various analyses.

first_top5_bottom_stats(doc_filter, col_lst):

Calculate mu, std, var, max, min, skew, and kurt across all matches, split by teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter. The calculations cover all matches, regardless of matchID.

Parameters
  • doc_filter (DocumentFilter) – Input DocumentFilter.

  • col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

Stats, related to the items in col_lst, for winners, top 5 or 10, and bottom.

Return type

pd.DataFrame

Example

None

Note

If Rebirth is selected in the DocumentFilter, will return top 5. If Verdansk, top 10 is returned.

bucket_stats(doc_filter, placement, col_lst):

Calculate mu, std, var, max, min, skew, and kurt for all matches with the given teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter. The calculations cover all matches, taking matchID into account.

Parameters
  • doc_filter (DocumentFilter) – Input DocumentFilter.

  • placement (List[int] or int) – Target placement.

  • col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

Stats, related to the items in col_lst, for placement value.

Return type

pd.DataFrame

Example

None

Note

The teamPlacement value is used to filter the data. If two ints are provided, the data is filtered to that range; the first value should be the lower one. For example, [0,6] returns the top 5 placements.
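The filter-then-summarize step described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the function name `placement_stats` and the sample data are hypothetical, and the exclusive range bounds are an assumption inferred from the [0,6] example.

```python
import pandas as pd


def placement_stats(df, placement, col_lst):
    """Hypothetical helper: summary stats for rows matching `placement`."""
    if isinstance(col_lst, str):
        col_lst = [col_lst]
    if isinstance(placement, int):
        mask = df["teamPlacement"] == placement
    else:
        low, high = placement
        # Assumption: both bounds are exclusive, so [0, 6] keeps placements 1-5.
        mask = (df["teamPlacement"] > low) & (df["teamPlacement"] < high)
    subset = df.loc[mask, col_lst]
    # One row per analyzed column, one column per statistic.
    return pd.DataFrame({
        "mu": subset.mean(),
        "std": subset.std(),
        "var": subset.var(),
        "max": subset.max(),
        "min": subset.min(),
        "skew": subset.skew(),
        "kurt": subset.kurt(),
    })


# Made-up match data for illustration only.
matches = pd.DataFrame({
    "teamPlacement": [1, 2, 3, 5, 8, 12],
    "kills": [10, 8, 6, 4, 2, 0],
})
stats = placement_stats(matches, [0, 6], "kills")
print(stats)
```

Here the rows with placements 1, 2, 3 and 5 survive the filter, so the mean of kills is 7.0.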

previous_next_placement(doc_filter):

Calculate the mean (mu) teamPlacement before and after a given teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

Parameters

doc_filter (DocumentFilter) – Input DocumentFilter.

Returns

Previous and next expected placement based on current placement.

Return type

pd.DataFrame

Example

None

Note

None
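One way to picture the "previous and next expected placement" idea is to shift a chronological series of placements and average by current placement. The sketch below assumes the matches are already in time order; the data and column names are made up, and the library may compute this differently.

```python
import pandas as pd

# Hypothetical chronological placements for one player or squad.
placements = pd.Series([1, 5, 1, 3, 5, 1], name="teamPlacement")

frame = pd.DataFrame({
    "current": placements,
    "previous": placements.shift(1),   # placement in the match before
    "next": placements.shift(-1),      # placement in the match after
})

# Mean previous and next placement for each current placement.
# The first match has no previous value and the last has no next value;
# those NaN entries are skipped by mean().
expected = frame.groupby("current")[["previous", "next"]].mean()
print(expected)
```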

match_difficulty(our_doc_filter, other_doc_filter, mu_lst, sum_lst, test):

Calculate the relative match difficulty based on player and player squad stats.

Parameters
  • our_doc_filter (DocumentFilter) – A DocumentFilter with squad and player data only.

  • other_doc_filter (DocumentFilter) – A DocumentFilter with all other players data.

  • mu_lst (List[str]) – A list of columns to take the mean (mu) of. Optional

  • sum_lst (List[str]) – A list of columns to take the sum of. Optional

  • test (bool) – If True, will use all columns for the analysis. Optional

Returns

Match difficulty.

Return type

pd.DataFrame

Example

None

Note

The intent is for a map_choice and mode_choice to be fed into both DocumentFilters.

get_daily_hourly_weekday_stats(doc_filter):

Calculate kills, deaths, wins, top 5s or 10s, match count, and averagePlacement for every day, weekday, and hour.

Parameters

doc_filter (DocumentFilter) – Input DocumentFilter.

Returns

3 pd.DataFrames and a dict

Return type

None

Example

None

Note

The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

get_weapons(doc_filter):

Calculate the kills, deaths, assists, headshots, averagePlacement, and count for each weapon.

Parameters

doc_filter (DocumentFilter) – Input DocumentFilter.

Returns

A DataFrame with a player's gun stats.

Return type

pd.DataFrame

Example

None

Note

The intent is for a username to be fed into the DocumentFilter and this will return the information for that specific player.

find_hackers(doc_filter, y_column, col_lst, std):

Identify suspected hackers using various outlier detection methods.

Parameters
  • doc_filter (DocumentFilter) – A DocumentFilter.

  • y_column (str) – A column to consider for Outlier analysis.

  • col_lst (List[str]) – A list of columns used for Outlier analysis.

  • std (int) – The number of standard deviations to use as a threshold, default is 3. Optional

Returns

Indexes of suspected hackers.

Return type

List[int]

Example

None

Note

The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

meta_weapons(doc_filter, top_5_or_10, top_1, col, mu):

Calculate the most popular weapons. A map_choice is required in the DocumentFilter if top_5_or_10 or top_1 is True. If neither top_5_or_10 nor top_1 is True, the calculation uses all team placements. Only loadouts with every attachment slot filled are included. The calculation is done on a daily interval.

Parameters
  • doc_filter (DocumentFilter) – A DocumentFilter.

  • top_5_or_10 (bool) – If True, will calculate using only the top 5 or 10 place teams, default is False. Optional

  • top_1 (bool) – If True, will calculate using only the 1st place or winning team, default is False. Optional

  • col (str) – If given will use a column as reference, default is None. None will count gun users per day. Optional

  • mu (bool) – If True, will calculate using mean, default is sum. Optional

Returns

The first DataFrame is filled with dicts of the form {kills: 0, deaths: 0, count: 0}. The second is the percent of the lobby using each weapon.

Return type

List[pd.DataFrame]

Example

None

Note

None

Base

General transformations.

normalize(arr, multi):

Normalize an Array.

Parameters
  • arr (np.ndarray) – Input array.

  • multi (bool) – If array has multiple columns, default is None. Optional

Returns

Normalized array.

Return type

np.ndarray

Example

None

Note

Set multi to True, if multiple columns.
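The normalization method is not stated here, so the following is only a sketch under the assumption that min-max scaling to the range [0, 1] is intended. The name `min_max_normalize` is hypothetical. With multi set to True, each column is scaled independently.

```python
import numpy as np


def min_max_normalize(arr, multi=False):
    """Hypothetical helper: scale values to [0, 1], per column if multi."""
    arr = np.asarray(arr, dtype=float)
    axis = 0 if multi else None
    low = arr.min(axis=axis)
    span = arr.max(axis=axis) - low
    # Guard against division by zero when a column is constant.
    span = np.where(span == 0, 1.0, span)
    return (arr - low) / span


single = min_max_normalize(np.array([2.0, 4.0, 6.0]))
columns = min_max_normalize(np.array([[1, 10], [2, 20], [3, 30]]), multi=True)
print(single)
print(columns)
```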

running_mean(arr, num):

Calculate the running mean over an interval of num values.

Parameters
  • arr (np.ndarray) – Input array.

  • num (int) – Input int, default is 50. Optional

Returns

Running mean for a given array.

Return type

np.ndarray

Example

None

Note

None
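A running mean over a window of num values can be sketched with a NumPy convolution. This is an illustration rather than the library's code: the name `running_mean_sketch` is hypothetical, and keeping only full windows (so the output is shorter than the input) is an assumption.

```python
import numpy as np


def running_mean_sketch(arr, num=50):
    """Hypothetical helper: mean of each full window of `num` values."""
    arr = np.asarray(arr, dtype=float)
    kernel = np.ones(num) / num
    # mode="valid" keeps only windows that fit entirely inside the array,
    # so the result has len(arr) - num + 1 entries.
    return np.convolve(arr, kernel, mode="valid")


values = np.array([1, 2, 3, 4, 5])
# Windows (1,2,3), (2,3,4), (3,4,5) give means close to 2, 3 and 4.
smoothed = running_mean_sketch(values, num=3)
print(smoothed)
```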

cumulative_mean(arr):

Calculate the cumulative mean.

Parameters

arr (np.ndarray) – Input array.

Returns

Cumulative mean for a given array.

Return type

np.ndarray

Example

None

Note

None
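The cumulative mean at position i is the mean of all values up to and including i. A minimal NumPy sketch follows; the function name is hypothetical.

```python
import numpy as np


def cumulative_mean_sketch(arr):
    """Hypothetical helper: mean of arr[0..i] for every index i."""
    arr = np.asarray(arr, dtype=float)
    # Running total divided by the number of values seen so far.
    return np.cumsum(arr) / np.arange(1, arr.size + 1)


means = cumulative_mean_sketch([2, 4, 6, 8])
print(means)
```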

Build

These functions are used when building the CallofDuty class.

CallofDuty

Outlier

Various outlier detection functions.

stack(x_arr, y_arr, multi):

Stacks x_arr and y_arr.

Parameters
  • x_arr (np.ndarray) – An array to stack.

  • y_arr (np.ndarray) – An array to stack.

  • multi – If True, will stack based on multiple x_arr columns, default is False. Optional

Returns

An array with an x column and a y column.

Return type

np.ndarray

Example

None

Note

None
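Stacking x and y values into columns can be illustrated with `np.column_stack`, which accepts both a one-dimensional x array and an x array with several columns. Whether the library uses this call is an assumption; the arrays are made up.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])

# Single x column: each row becomes an (x, y) pair.
pairs = np.column_stack((x, y))

# Several x columns: y is appended as the last column.
x_multi = np.array([[1, 4], [2, 5], [3, 6]])
wide = np.column_stack((x_multi, y))

print(pairs.shape, wide.shape)
```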

_cent(x_lst, y_lst):

Calculate Centroid from x and y value(s).

Parameters
  • x_lst (List[float]) – A list of values.

  • y_lst (List[float]) – A list of values.

Returns

A list of x and y values representing the centroid of two lists.

Return type

List[float]

Example

None

Note

None

_dis(cent1, cent2):

Calculate Distance between two centroids.

Parameters
  • cent1 (List[float]) – An x, y coordinate representing a centroid.

  • cent2 – An x, y coordinate representing a centroid.

Returns

A distance measurement.

Return type

float

Example

None

Note

None
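The two helpers above can be sketched in plain Python. The sketch assumes the centroid is the arithmetic mean of the x values and of the y values, and that the distance is Euclidean; the names `centroid` and `distance` are hypothetical stand-ins for `_cent` and `_dis`.

```python
import math


def centroid(x_lst, y_lst):
    """Hypothetical stand-in for _cent: mean x and mean y."""
    return [sum(x_lst) / len(x_lst), sum(y_lst) / len(y_lst)]


def distance(cent1, cent2):
    """Hypothetical stand-in for _dis: Euclidean distance (assumed)."""
    return math.hypot(cent1[0] - cent2[0], cent1[1] - cent2[1])


c1 = centroid([0, 2], [0, 2])   # [1.0, 1.0]
c2 = centroid([4, 4], [4, 6])   # [4.0, 5.0]
print(distance(c1, c2))         # 5.0
```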

outlier_std(arr, data, y_column, _std, plus):

Calculate Outliers using a simple std value.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • y_column (str) – A target column. Optional

  • _std (int) – A std threshold, default is 3. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.
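A standard-deviation threshold can be sketched as follows: flag every value more than a chosen number of standard deviations above (or below) the mean. The function name and sample data are hypothetical, and the use of the population standard deviation is an assumption.

```python
import numpy as np


def std_outliers(arr, std=3, plus=True):
    """Hypothetical helper: indexes beyond mean +/- std * sigma."""
    arr = np.asarray(arr, dtype=float)
    mu, sigma = arr.mean(), arr.std()
    if plus:
        mask = arr > mu + std * sigma   # unusually high values
    else:
        mask = arr < mu - std * sigma   # unusually low values
    return np.flatnonzero(mask)


# Made-up kill counts with one extreme value at index 11.
kills = np.array([3, 4, 5, 4, 3, 5, 4, 4, 3, 5, 4, 40])
flagged = std_outliers(kills, std=3)
print(flagged)
```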

outlier_var(arr, data, y_column, per, plus):

Calculate Outliers using a simple var value.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • y_column (str) – A target column. Optional

  • per (float) – A percent threshold, default is 0.95. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_regression(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using regression.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • x_column (str) – A column for x variables. Optional

  • y_column (str) – A column for y variables. Optional

  • _std (int) – A std threshold, default is 3. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.
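Regression-based outlier detection can be sketched by fitting a straight line and flagging points whose residual exceeds a multiple of the residual standard deviation. This is an illustration under that assumption, not the library's method; the function name and data are hypothetical.

```python
import numpy as np


def regression_outliers(x, y, std=3, plus=True):
    """Hypothetical helper: indexes with unusually large residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Least-squares fit of y = slope * x + intercept.
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    limit = std * residuals.std()
    mask = residuals > limit if plus else residuals < -limit
    return np.flatnonzero(mask)


# A perfect line with one point pushed far above it.
x = np.arange(20)
y = 2.0 * x + 1.0
y[10] += 50.0
flagged = regression_outliers(x, y, std=3)
print(flagged)
```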

outlier_distance(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using distance measurements.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • x_column (str) – A column for x variables. Optional

  • y_column (str) – A column for y variables. Optional

  • _std (int) – A std threshold, default is 3. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_hist(arr, data, x_column, per, plus):

Calculate Outliers using Histogram.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • x_column (str) – A column for x variables. Optional

  • per (float) – A percent threshold, default is 0.75. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_knn(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using KNN.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • x_column (str) – A column for x variables. Optional

  • y_column (str) – A column for y variables. Optional

  • _std (int) – A std threshold, default is 3. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_cooks_distance(arr, data, x_column, y_column, plus, return_df):

Calculate Outliers using Cook's distance.

Parameters
  • arr (np.ndarray) – An Array to get data from. Optional

  • data (pd.DataFrame) – A DataFrame to get data from. Optional

  • x_column (str) – A column for x variables. Optional

  • y_column (str) – A column for y variables. Optional

  • _std (int) – A std threshold, default is 3. Optional

  • plus (bool) – If True, will grab all values above the threshold, default is True. Optional

  • return_df (bool) – If True, will return a DataFrame, default is False. Optional

Returns

An array of indexes.

Return type

np.ndarray or pd.DataFrame

Example

None

Note

If arr is not passed, data and the respective column names are required.
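Cook's distance measures how much a fitted regression changes when a single observation is removed. The sketch below computes it for a simple linear regression with NumPy alone. The function name is hypothetical, and the 4/n cutoff is a common rule of thumb used here as an assumption, not necessarily the threshold the library applies.

```python
import numpy as np


def cooks_distance(x, y):
    """Hypothetical helper: Cook's distance for y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack((np.ones_like(x), x))   # design matrix with intercept
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    # Leverage: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    mse = residuals @ residuals / (n - p)
    return (residuals ** 2 / (p * mse)) * leverage / (1 - leverage) ** 2


# A perfect line with one point pushed far above it.
x = np.arange(20)
y = 2.0 * x + 1.0
y[10] += 50.0
d = cooks_distance(x, y)
flagged = np.flatnonzero(d > 4 / len(x))   # rule-of-thumb cutoff (assumed)
print(flagged)
```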

Plots

Various one-off plots.

personal_plot(doc_filter):

Returns a series of plots.

Parameters

doc_filter (DocumentFilter) – A DocumentFilter.

Returns

None

Example

None

Note

This is intended to be used with map_choice, mode_choice and a Gamertag inputted into the DocumentFilter.

lobby_plot(doc_filter):

Returns a series of plots.

Parameters

doc_filter (DocumentFilter) – A DocumentFilter.

Returns

None

Example

None

Note

This is intended to be used with map_choice and mode_choice inputted into the DocumentFilter.

squad_plot(doc_filter, col_lst):

Build a Polar plot for visualizing squad stats.

Parameters
  • doc_filter (DocumentFilter) – A DocumentFilter.

  • col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

None

Example

None

Note

This is intended to be used with map_choice and mode_choice inputted into the DocumentFilter.

Scrape

Functions for getting and dealing with new data.

connect_to_api(_id: str):

Connect to Call of Duty API.

Parameters

_id (str) – A matchID str.

Returns

A JSON object of lobby data related to the specified matchID.

Return type

Json

Example

None

Note

Connect to Cod API to receive lobby information.

clean_api_data(json_object):

Cleans the JSON output from connect_to_api.

Parameters

json_object (Json) – Json object.

Returns

Match information in a table.

Return type

pd.DataFrame

Example

None

Note

Takes a JSON object related to a matchID and constructs a pd.DataFrame with all relevant information. The result needs to be saved (or concatenated to an existing CSV) and loaded through _evaulate_df() to work properly in this model.