Utils¶

This chapter documents the Utils: functions and plots to aid in exploratory analysis.

Analysis¶

One-off functions for various analyses.

first_top5_bottom_stats(doc_filter, col_lst):

Calculate mu, std, var, max, min, skew, kurt for all matches depending on teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter. Does calculations for all matches, regardless of matchID.

Parameters

doc_filter (DocumentFilter) – Input DocumentFilter.
col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

Stats, related to the items in col_lst, for winners, top 5 or 10, and bottom.

Return type

pd.DataFrame

Example

None

Note

If Rebirth is selected in the DocumentFilter, will return top 5. If Verdansk, top 10 is returned.

bucket_stats(doc_filter, placement, col_lst):

Calculate mu, std, var, max, min, skew, kurt for all matches depending on teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter. Does calculations for all matches, considering matchID.

Parameters

doc_filter (DocumentFilter) – Input DocumentFilter.
placement (List[int] or int) – Target placement.
col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

Stats, related to the items in col_lst, for placement value.

Return type

pd.DataFrame

Example

None

Note

The teamPlacement value is used to filter the data. If two ints are provided, the data is filtered within that range; the first value should be the lower one. For example, [0, 6] returns the top 5 placements.
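
The range behaviour described in this note can be sketched with plain pandas. This is only an illustration, assuming both bounds are exclusive (which is what makes [0, 6] yield placements 1 through 5). The helper filter_by_placement and the sample data are hypothetical and are not part of the package.

```python
import pandas as pd


def filter_by_placement(data, placement):
    """Hypothetical helper: keep rows matching one placement or an exclusive range."""
    if isinstance(placement, int):
        return data[data["teamPlacement"] == placement]
    low, high = placement  # the lower value comes first
    # Assumption: both bounds are exclusive, so [0, 6] keeps placements 1..5.
    mask = (data["teamPlacement"] > low) & (data["teamPlacement"] < high)
    return data[mask]


# Made-up sample data, for illustration only.
sample = pd.DataFrame({"teamPlacement": [1, 3, 5, 6, 12], "kills": [9, 4, 2, 1, 0]})
top_five = filter_by_placement(sample, [0, 6])
```

Here top_five keeps only the rows with placements 1, 3, and 5; passing a single int such as 6 would keep only that placement.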

previous_next_placement(doc_filter):

Calculate mu teamPlacement before and after a teamPlacement. The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

Parameters: doc_filter (DocumentFilter) – Input DocumentFilter.
Returns: Previous and next expected placement based on current placement.
Return type: pd.DataFrame
Example: None
Note: None

match_difficulty(our_doc_filter, other_doc_filter, mu_lst, sum_lst, test):

Calculate the relative match difficulty based on player and player squad stats.

Parameters

our_doc_filter (DocumentFilter) – A DocumentFilter with squad and player data only.
other_doc_filter (DocumentFilter) – A DocumentFilter with all other players data.
mu_lst (List[str]) – A list of columns to consider the mu. Optional
sum_lst (List[str]) – A list of columns to consider the sum. Optional
test (bool) – If True, will use all columns for the analysis. Optional

Returns

Match difficulty.

Return type

pd.DataFrame

Example

None

Note

The intent is for a map_choice and mode_choice to be fed into both DocumentFilters.

get_daily_hourly_weekday_stats(doc_filter):

Calculate kills, deaths, wins, top 5s or 10s, match count, and averagePlacement for every day, weekday, and hour.

Parameters: doc_filter (DocumentFilter) – Input DocumentFilter.
Returns: 3 pd.DataFrames and a dict
Return type: None
Example: None
Note: The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

get_weapons(doc_filter):

Calculate the kills, deaths, assists, headshots, averagePlacement, and count for each weapon.

Parameters: doc_filter (DocumentFilter) – Input DocumentFilter.
Returns: A DataFrame with a player's gun stats.
Return type: pd.DataFrame
Example: None
Note: The intent is for a username to be fed into the DocumentFilter and this will return the information for that specific player.

find_hackers(doc_filter, y_column, col_lst, std):

Identify suspected hackers using various outlier detection methods.

Parameters

doc_filter (DocumentFilter) – A DocumentFilter.
y_column (str) – A column to consider for Outlier analysis.
col_lst (List[str]) – A list of columns used for Outlier analysis.
std (int) – The std to use as a threshold, default is 3. Optional

Returns

Returns an index of suspected hackers.

Return type

List[int]

Example

None

Note

The intent is for a map_choice and mode_choice to be fed into the DocumentFilter.

meta_weapons(doc_filter, top_5_or_10, top_1, col, mu):

Calculate the most popular weapons. map_choice is required in the DocumentFilter if top_5_or_10 or top_1 is True. If neither top_5_or_10 nor top_1 is True, the calculation uses all team placements. Only loadouts where all attachment slots are filled are included. Calculations are made on a daily interval.

Parameters

doc_filter (DocumentFilter) – A DocumentFilter.
top_5_or_10 (bool) – If True, will calculate using only the top 5 or 10 place teams, default is False. Optional
top_1 (bool) – If True, will calculate using only the 1st place or winning team, default is False. Optional
col (str) – If given will use a column as reference, default is None. None will count gun users per day. Optional
mu (bool) – If True, will calculate using mean, default is sum. Optional

Returns

The first DataFrame is filled with dicts of the form {kills: 0, deaths: 0, count: 0}. The second is the percent of the lobby using each weapon.

Return type

List[pd.DataFrame]

Example

None

Note

None

Base¶

General transformations.

normalize(arr, multi):

Normalize an Array.

Parameters

arr (np.ndarray) – Input array.
multi (bool) – If array has multiple columns, default is None. Optional

Returns

Normalized array.

Return type

np.ndarray

Example

None

Note

Set multi to True if the array has multiple columns.
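
The documentation does not say which normalization is applied, so the following is only a sketch of one common choice, min-max scaling to the range [0, 1]. The function name min_max_normalize is hypothetical; the package's normalize may use a different method. The multi flag is read here as "scale each column separately", which is also an assumption.

```python
import numpy as np


def min_max_normalize(arr, multi=False):
    """Hypothetical min-max scaling to [0, 1]; the package may use another method."""
    arr = np.asarray(arr, dtype=float)
    axis = 0 if multi else None  # scale each column separately when multi is True
    low = arr.min(axis=axis)
    span = arr.max(axis=axis) - low
    # Guard against constant input, which would otherwise divide by zero.
    span = np.where(span == 0, 1.0, span)
    return (arr - low) / span


single = min_max_normalize([0, 5, 10])
columns = min_max_normalize([[0, 10], [5, 20], [10, 30]], multi=True)
```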

running_mean(arr, num):

Calculate the running mean over an interval of num values.

Parameters

arr (np.ndarray) – Input array.
num (int) – Input int, default is 50. Optional

Returns

Running mean for a given array.

Return type

np.ndarray

Example

None

Note

None
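
As an illustration of what a running mean over num values looks like, here is a minimal NumPy sketch. It is an assumption about the technique, not the package's code: running_mean_sketch is a hypothetical name, and this version returns only the means of full windows, so its output is shorter than the input. The package may pad or handle the edges differently.

```python
import numpy as np


def running_mean_sketch(arr, num=50):
    """Mean of each full window of `num` consecutive values (hypothetical sketch)."""
    arr = np.asarray(arr, dtype=float)
    # Prepend 0 so that differences of the cumulative sum give window totals.
    csum = np.cumsum(np.insert(arr, 0, 0.0))
    return (csum[num:] - csum[:-num]) / num


result = running_mean_sketch([1, 2, 3, 4, 5], num=2)  # [1.5, 2.5, 3.5, 4.5]
```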

cumulative_mean(arr):

Calculate the cumulative mean.

Parameters: arr (np.ndarray) – Input array.
Returns: Cumulative mean for a given array.
Return type: np.ndarray
Example: None
Note: None
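
A cumulative mean is the mean of all values seen so far at each position. The sketch below shows the idea with NumPy; cumulative_mean_sketch is a hypothetical name, and this is not necessarily how the package computes it.

```python
import numpy as np


def cumulative_mean_sketch(arr):
    """Mean of arr[0..i] for every index i (hypothetical sketch)."""
    arr = np.asarray(arr, dtype=float)
    counts = np.arange(1, arr.size + 1)  # number of values seen so far
    return np.cumsum(arr) / counts


result = cumulative_mean_sketch([2, 4, 6, 8])  # [2.0, 3.0, 4.0, 5.0]
```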

Build¶

These functions are used when building the CallofDuty class.

CallofDuty

Outlier¶

Various outlier detection functions.

stack(x_arr, y_arr, multi):

Stacks x_arr and y_arr.

Parameters

x_arr (np.ndarray) – An array to stack.
y_arr (np.ndarray) – An array to stack.
multi (bool) – If True, will stack based on multiple x_arr columns, default is False. Optional

Returns

Array with an x column and a y column.

Return type

np.ndarray

Example

None

Note

None
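
The kind of stacking described for stack can be illustrated with NumPy's column_stack. This sketch assumes the result places the x values and y values side by side as columns; the package's own stack function may arrange the data differently.

```python
import numpy as np

x_arr = np.array([1.0, 2.0, 3.0])
y_arr = np.array([10.0, 20.0, 30.0])

# Single x column: the result has one x column and one y column.
stacked = np.column_stack((x_arr, y_arr))

# Multiple x columns (the multi=True case, as assumed here): y is appended last.
x_multi = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
stacked_multi = np.column_stack((x_multi, y_arr))
```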

_cent(x_lst, y_lst):

Calculate Centroid from x and y value(s).

Parameters

x_lst (List[float]) – A list of values.
y_lst (List[float]) – A list of values.

Returns

A list of x and y values representing the centroid of two lists.

Return type

List[float]

Example

None

Note

None

_dis(cent1, cent2):

Calculate Distance between two centroids.

Parameters

cent1 (List[float]) – An x, y coordinate representing a centroid.
cent2 (List[float]) – An x, y coordinate representing a centroid.

Returns

A distance measurement.

Return type

float

Example

None

Note

None
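
To make the two helpers above concrete, here is a small sketch of a centroid and a distance calculation. It assumes the centroid is the mean of the x values and the mean of the y values, and that the distance is Euclidean. The names centroid and distance are hypothetical stand-ins, not the package's _cent and _dis.

```python
import math


def centroid(x_lst, y_lst):
    """Assumed definition: the mean x and mean y of the given points."""
    return [sum(x_lst) / len(x_lst), sum(y_lst) / len(y_lst)]


def distance(cent1, cent2):
    """Assumed definition: Euclidean distance between two x, y coordinates."""
    return math.hypot(cent1[0] - cent2[0], cent1[1] - cent2[1])


first = centroid([0.0, 2.0], [0.0, 2.0])   # [1.0, 1.0]
second = centroid([4.0, 4.0], [5.0, 5.0])  # [4.0, 5.0]
gap = distance(first, second)              # 5.0
```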

outlier_std(arr, data, y_column, _std, plus):

Calculate Outliers using a simple std value.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
y_column (str) – A target column. Optional
_std (int) – A std threshold, default is 3. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.
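
A standard deviation threshold of this kind can be sketched as follows. This is an assumption about the technique rather than the package's implementation: std_outlier_indexes is a hypothetical name, values are compared against the mean plus _std standard deviations, and plus=False is taken to mean "values below the mean minus the threshold".

```python
import numpy as np


def std_outlier_indexes(arr, _std=3, plus=True):
    """Indexes of values beyond `_std` standard deviations from the mean."""
    arr = np.asarray(arr, dtype=float)
    mean, spread = arr.mean(), arr.std()
    if plus:
        mask = arr > mean + _std * spread
    else:
        # Assumption: plus=False selects the low side instead.
        mask = arr < mean - _std * spread
    return np.flatnonzero(mask)


# Made-up data: twenty ordinary values and one extreme value at index 20.
values = np.array([1.0] * 20 + [50.0])
idx = std_outlier_indexes(values)
```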

outlier_var(arr, data, y_column, per, plus):

Calculate Outliers using a simple var value.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
y_column (str) – A target column. Optional
per (float) – A percent threshold, default is 0.95. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_regression(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using regression.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
x_column (str) – A column for x variables. Optional
y_column (str) – A column for y variables. Optional
_std (int) – A std threshold, default is 3. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.
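
One way to find outliers with regression is to fit a line and flag points whose residuals exceed a multiple of the residual standard deviation. The sketch below shows that approach with np.polyfit. It is an assumption about the general technique; regression_outlier_indexes is a hypothetical name and the package may fit or threshold differently.

```python
import numpy as np


def regression_outlier_indexes(x, y, _std=3, plus=True):
    """Indexes whose residual from a fitted line exceeds `_std` residual stds."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # least-squares straight line
    residuals = y - (slope * x + intercept)
    limit = _std * residuals.std()
    # Assumption: plus=True looks above the line, plus=False below it.
    mask = residuals > limit if plus else residuals < -limit
    return np.flatnonzero(mask)


# Made-up data: a clean line with one point pushed far above it.
x = np.arange(30, dtype=float)
y = 2.0 * x + 1.0
y[10] += 40.0
idx = regression_outlier_indexes(x, y)
```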

outlier_distance(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using distance measurements.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
x_column (str) – A column for x variables. Optional
y_column (str) – A column for y variables. Optional
_std (int) – A std threshold, default is 3. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_hist(arr, data, x_column, per, plus):

Calculate Outliers using Histogram.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
x_column (str) – A column for x variables. Optional
per (float) – A percent threshold, default is 0.75. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.

outlier_knn(arr, data, x_column, y_column, _std, plus):

Calculate Outliers using KNN.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
x_column (str) – A column for x variables. Optional
y_column (str) – A column for y variables. Optional
_std (int) – A std threshold, default is 3. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional

Returns

An array of indexes.

Return type

np.ndarray

Example

None

Note

If arr is not passed, data and the respective column names are required.
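
A nearest-neighbour outlier score can be sketched with NumPy alone: score each point by its mean distance to its k nearest neighbours, then flag scores above a standard deviation threshold. This is only one possible reading of "outliers using KNN". The function knn_outlier_indexes and its k parameter are hypothetical and do not appear in the package's signature.

```python
import numpy as np


def knn_outlier_indexes(points, k=3, _std=3):
    """Flag points whose mean distance to their k nearest neighbours is unusually large."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))  # pairwise Euclidean distances
    # After sorting, column 0 is each point's zero distance to itself.
    nearest = np.sort(dist, axis=1)[:, 1 : k + 1]
    score = nearest.mean(axis=1)
    return np.flatnonzero(score > score.mean() + _std * score.std())


# Made-up data: a 5 x 4 grid of points plus one far-away point at index 20.
grid = [(i, j) for i in range(5) for j in range(4)] + [(100, 100)]
idx = knn_outlier_indexes(grid)
```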

outlier_cooks_distance(arr, data, x_column, y_column, plus, return_df):

Calculate Outliers using Cook's distance.

Parameters

arr (np.ndarray) – An Array to get data from. Optional
data (pd.DataFrame) – A DataFrame to get data from. Optional
x_column (str) – A column for x variables. Optional
y_column (str) – A column for y variables. Optional
_std (int) – A std threshold, default is 3. Optional
plus (bool) – If True, will grab all values above the threshold, default is True. Optional
return_df (bool) – If True, will return a DataFrame, default is False. Optional

Returns

An array of indexes.

Return type

np.ndarray or pd.DataFrame

Example

None

Note

If arr is not passed, data and the respective column names are required.
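
For reference, Cook's distance for a simple linear regression can be computed directly from the residuals and leverages. The sketch below uses the standard formula with NumPy and flags points above 4/n, a common rule of thumb. The function name cooks_distance_sketch and the 4/n cutoff are assumptions for illustration; the package may rely on a statistics library and a different threshold.

```python
import numpy as np


def cooks_distance_sketch(x, y):
    """Cook's distance of each observation for a straight-line least-squares fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack((np.ones_like(x), x))  # intercept and slope columns
    n, p = design.shape
    # Diagonal of the hat matrix gives the leverage of each observation.
    hat = design @ np.linalg.inv(design.T @ design) @ design.T
    leverage = np.diag(hat)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    mse = residuals @ residuals / (n - p)
    return residuals**2 / (p * mse) * leverage / (1.0 - leverage) ** 2


# Made-up data: a clean line with one point pushed far above it.
x = np.arange(30, dtype=float)
y = 2.0 * x + 1.0
y[10] += 40.0
dist = cooks_distance_sketch(x, y)
idx = np.flatnonzero(dist > 4.0 / len(x))  # assumed 4/n rule of thumb
```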

Plots¶

Various one-off plots.

personal_plot(doc_filter):

Returns a series of plots.

Parameters: doc_filter (DocumentFilter) – A DocumentFilter.
Returns: None
Example: None
Note: This is intended to be used with map_choice, mode_choice and a Gamertag inputted into the DocumentFilter.

lobby_plot(doc_filter):

Returns a series of plots.

Parameters: doc_filter (DocumentFilter) – A DocumentFilter.
Returns: None
Example: None
Note: This is intended to be used with map_choice and mode_choice inputted into the DocumentFilter.

squad_plot(doc_filter, col_lst):

Build a polar plot for visualizing squad stats.

Parameters

doc_filter (DocumentFilter) – A DocumentFilter.
col_lst (List[str] or str) – Input List of Columns to analyze.

Returns

None

Example

None

Note

This is intended to be used with map_choice and mode_choice inputted into the DocumentFilter.

Scrape¶

Functions for getting and dealing with new data.

connect_to_api(_id: str):

Connect to Call of Duty API.

Parameters: _id (str) – A matchID str.
Returns: A JSON object of lobby data related to the specified matchID.
Return type: Json
Example: None
Note: Connects to the COD API to receive lobby information.

clean_api_data(json_object):

Cleans the JSON output from connect_to_api.

Parameters: json_object (Json) – Json object.
Returns: Match information in a table.
Return type: pd.DataFrame
Example: None
Note: Takes a JSON object related to a matchID and constructs a pd.DataFrame with all relevant information. This will need to be saved (or concatenated to an existing CSV) and loaded through _evaulate_df() to work properly in this model.