dalio.util package

Submodules

dalio.util.level_utils module

Utilities for dealing with DataFrame index or column levels

dalio.util.level_utils.add_suffix(all_cols, cols, suffix)

Add suffix to appropriate level in a given column index.

Parameters
  • all_cols (pd.Index, pd.MultiIndex) – all columns from an index. This is only relevant when the columns at hand are a MultiIndex, as each tuple will contain elements from all levels (not only the selected ones).

  • cols (str, list, dict) – selected columns

  • suffix (str) – the suffix to add to the selected columns.
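
To make the role of all_cols concrete, here is a minimal sketch of the idea using plain pandas. The helper add_suffix_sketch is hypothetical and is not the library's implementation; it assumes cols is a {level: labels} dict for a MultiIndex and a list of labels for a flat index.

```python
import pandas as pd


def add_suffix_sketch(all_cols, cols, suffix):
    """Hypothetical stand-in: append `suffix` to the selected labels only."""
    if isinstance(all_cols, pd.MultiIndex):
        new_tuples = []
        for tup in all_cols:
            tup = list(tup)
            # Each tuple carries one label per level, so only the
            # selected level of a matching tuple is rewritten.
            for level, names in cols.items():
                if tup[level] in names:
                    tup[level] = f"{tup[level]}{suffix}"
            new_tuples.append(tuple(tup))
        return pd.MultiIndex.from_tuples(new_tuples, names=all_cols.names)
    # Flat index: suffix the selected labels and leave the rest untouched.
    return pd.Index([f"{c}{suffix}" if c in cols else c for c in all_cols])


columns = pd.MultiIndex.from_product([["AAPL", "MSFT"], ["close", "volume"]])
renamed = add_suffix_sketch(columns, {1: ["close"]}, "_adj")
print(list(renamed))
```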

dalio.util.level_utils.drop_cols(df, cols)

Drop selected columns from levels

Parameters
  • df (pd.DataFrame) – dataframe to have columns dropped.

  • cols (hashable, iterable, dict) – column selection

dalio.util.level_utils.extract_cols(df, cols)

Extract columns from a dataframe

Parameters
  • df (pd.DataFrame) – dataframe containing the columns

  • cols (hashable, iterable, dict) – single column, list of columns, or dict with the level as keys and column(s) as values.

Raises

KeyError – if columns are not in dataframe

dalio.util.level_utils.extract_level_names_dict(df)

Extract all column names in a dataframe as a (level: names) dictionary

Parameters

df (pd.DataFrame) – dataframe whose columns will be extracted
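
As an illustration of the (level: names) shape, the following sketch builds such a dictionary with plain pandas. The helper name is hypothetical, and integer level positions are assumed as keys; the exact output of the library function may differ.

```python
import pandas as pd


def level_names_dict_sketch(df):
    """Hypothetical stand-in: map each column level position to its labels."""
    cols = df.columns
    if isinstance(cols, pd.MultiIndex):
        return {
            level: list(cols.get_level_values(level).unique())
            for level in range(cols.nlevels)
        }
    # A flat column index is treated as a single level 0.
    return {0: list(cols)}


prices = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0]],
    columns=pd.MultiIndex.from_product([["AAPL", "MSFT"], ["close", "volume"]]),
)
levels = level_names_dict_sketch(prices)
print(levels)
```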

dalio.util.level_utils.filter_levels(levels, filters)

Filter columns in levels to either be equal to specified columns or a filtering function

Parameters
  • levels (dict) – all column names in a (level: names) dict

  • filters (str, list, callable, dict) – either columns to place on a specified level or filter functions to select columns there.
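
The two kinds of filter described above (explicit columns or a filtering function) could be applied roughly as follows. This is a sketch under assumptions: the helper is hypothetical, and it only handles filters given as a {level: filter} dict, not the other accepted types.

```python
def filter_levels_sketch(levels, filters):
    """Hypothetical stand-in: apply per-level filters to a (level: names) dict."""
    filtered = dict(levels)
    for level, flt in filters.items():
        if callable(flt):
            # Keep only the existing names accepted by the function.
            filtered[level] = [name for name in levels[level] if flt(name)]
        elif isinstance(flt, str):
            # A single explicit column name.
            filtered[level] = [flt]
        else:
            # An explicit collection of column names.
            filtered[level] = list(flt)
    return filtered


all_levels = {0: ["AAPL", "MSFT"], 1: ["close", "open", "volume"]}
selected = filter_levels_sketch(
    all_levels, {0: "AAPL", 1: lambda name: name != "volume"}
)
print(selected)
```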

dalio.util.level_utils.get_slice_from_dict(df, cols)

Get a tuple of slices that locate the specified (level: column) combination.

Parameters
  • df (pd.DataFrame) – dataframe with multiindex

  • cols (dict) – (level: column) dictionary

Raises
  • ValueError – if any of the level keys are not integers

  • KeyError – if any level key is out of bounds or if columns are not in the dataframe
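
A tuple of this kind can be passed to DataFrame.loc to select columns across levels. The sketch below shows one way to build it with plain pandas; the helper is hypothetical and only mirrors the level-key checks listed under Raises.

```python
import pandas as pd


def slice_from_dict_sketch(df, cols):
    """Hypothetical stand-in: build a per-level selector tuple for df.loc."""
    nlevels = df.columns.nlevels
    # Start by selecting everything on every level.
    selector = [slice(None)] * nlevels
    for level, names in cols.items():
        if not isinstance(level, int):
            raise ValueError(f"level keys must be integers, got {level!r}")
        if not 0 <= level < nlevels:
            raise KeyError(f"level {level} is out of bounds")
        selector[level] = names
    return tuple(selector)


quotes = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=pd.MultiIndex.from_product([["AAPL", "MSFT"], ["close", "volume"]]),
)
selector = slice_from_dict_sketch(quotes, {1: "close"})
closes = quotes.loc[:, selector]
print(closes)
```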

dalio.util.level_utils.insert_cols(df, new_data, cols)

Insert new data into specified existing columns

Parameters
  • df (pd.DataFrame) – dataframe to insert data into.

  • new_data (any) – new data to be inserted

  • cols (hashable, iterable, dict) – existing columns in data.

Raises
  • KeyError – if columns are not in dataframe

  • Exception – if new data doesn’t fit cols dimensions

dalio.util.level_utils.mi_join(df1, df2, *args, **kwargs)

Join two dataframes and sort their columns

Parameters
  • df1, df2 – dataframes to join

  • *args, **kwargs – arguments for the join function (called from df1)

Raises

ValueError – if the number of levels doesn't match
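
The described behaviour (check the level counts, join from df1, then sort the columns) can be sketched with plain pandas as below. The helper name is hypothetical and this is not the library's code; in particular, sorting with sort_index is an assumption about how the columns are ordered.

```python
import pandas as pd


def mi_join_sketch(df1, df2, *args, **kwargs):
    """Hypothetical stand-in: join two frames, then sort the columns."""
    if df1.columns.nlevels != df2.columns.nlevels:
        raise ValueError("both dataframes must have the same number of column levels")
    # Extra arguments are forwarded to DataFrame.join, called from df1.
    return df1.join(df2, *args, **kwargs).sort_index(axis=1)


left = pd.DataFrame(
    [[1.0], [2.0]], columns=pd.MultiIndex.from_tuples([("MSFT", "close")])
)
right = pd.DataFrame(
    [[3.0], [4.0]], columns=pd.MultiIndex.from_tuples([("AAPL", "close")])
)
joined = mi_join_sketch(left, right)
print(joined)
```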

dalio.util.plotting_utils module

Plotting utilities

Thank you to the creators of pypfopt for the wonderful code!

dalio.util.plotting_utils.plot_covariance(cov_matrix, plot_correlation=False, show_tickers=True, ax=None)

Generate a basic plot of the covariance (or correlation) matrix, given a covariance matrix.

Parameters
  • cov_matrix (pd.DataFrame, np.ndarray) – covariance matrix

  • plot_correlation (bool) – whether to plot the correlation matrix instead, defaults to False. Optional.

  • show_tickers (bool) – whether to use tickers as labels (not recommended for large portfolios). Optional. Defaults to True.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

Returns

matplotlib axis
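
Plotting a correlation matrix from a covariance matrix implies a conversion step. The sketch below shows the standard conversion with NumPy; the helper name is hypothetical, and how the library performs this step internally is not stated here.

```python
import numpy as np


def cov_to_corr_sketch(cov_matrix):
    """Hypothetical helper: scale a covariance matrix into a correlation matrix."""
    cov = np.asarray(cov_matrix, dtype=float)
    # Standard deviations are the square roots of the diagonal variances.
    std = np.sqrt(np.diag(cov))
    # Divide each entry (i, j) by std[i] * std[j].
    return cov / np.outer(std, std)


cov = np.array([[4.0, 2.0], [2.0, 9.0]])
corr = cov_to_corr_sketch(cov)
print(corr)
```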

dalio.util.plotting_utils.plot_dendrogram(hrp, show_tickers=True, ax=None, **kwargs)

Plot the clusters in the form of a dendrogram.

Parameters
  • hrp – HRPOpt object that has already been optimized.

  • show_tickers (bool) – whether to use tickers as labels (not recommended for large portfolios). Optional. Defaults to True.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

  • **kwargs – optional parameters for main graph.

Returns

matplotlib axis

dalio.util.plotting_utils.plot_efficient_frontier(cla, points=100, visible=25, show_assets=True, ax=None, **kwargs)

Plot the efficient frontier based on a CLA object

Parameters
  • points (int) – number of points to plot. Optional. Defaults to 100

  • show_assets (bool) – whether we should plot the asset risks/returns also. Optional. Defaults to True.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

  • **kwargs – optional parameters for main graph.

Returns

matplotlib axis

dalio.util.plotting_utils.plot_weights(weights, ax=None, **kwargs)

Plot the portfolio weights as a horizontal bar chart

Parameters
  • weights (dict) – the weights outputted by any PyPortfolioOpt optimiser.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

  • **kwargs – optional parameters for main graph.

Returns

matplotlib axis

dalio.util.processing_utils module

Data processing utilities

dalio.util.processing_utils.list_str(listi)

dalio.util.processing_utils.process_cols(cols)

Standardize input columns

dalio.util.processing_utils.process_date(date)

Standardize input date

Raises

TypeError – if the type of the date parameter cannot be converted to a pandas timestamp
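
A minimal sketch of this kind of date standardization, assuming the target type is pandas.Timestamp as the Raises entry suggests. The helper name is hypothetical, and mapping parsing failures to TypeError is an assumption of this sketch.

```python
import pandas as pd


def process_date_sketch(date):
    """Hypothetical stand-in: coerce an input date to a pandas Timestamp."""
    try:
        return pd.Timestamp(date)
    except (TypeError, ValueError) as err:
        # Report any failed conversion as a TypeError, as documented above.
        raise TypeError(
            f"cannot convert {type(date).__name__} to a pandas timestamp"
        ) from err


stamp = process_date_sketch("2020-01-31")
print(stamp)
```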

dalio.util.processing_utils.process_new_colnames(cols, new_cols)

Get new column names based on the column parameter

dalio.util.processing_utils.process_new_df(df1, df2, cols, new_cols)

Process new dataframe given columns and new column names

Parameters
  • df1 (pd.DataFrame) – first dataframe.

  • df2 (pd.DataFrame) – dataframe to join or get columns from

  • cols (iterable) – iterable of columns being targeted.

  • new_cols (iterable) – iterable of new column names.

dalio.util.transformation_utils module

dalio.util.transformation_utils.out_of_place_col_insert(df, series, loc, column_name=None)

Returns a new dataframe with given column inserted at given location.

Parameters
  • df (pandas.DataFrame) – The dataframe into which to insert the column.

  • series (pandas.Series) – The pandas series to be inserted.

  • loc (int) – The location into which to insert the new column.

  • column_name (str, default None) – The name to assign the new column. If None, the given series name attribute is attempted; if the given series is missing the name attribute a ValueError exception will be raised.

Returns

The resulting dataframe.

Return type

pandas.DataFrame

Example

>>> import pandas as pd; from dalio.util import out_of_place_col_insert
>>> df = pd.DataFrame([[1, 'a'], [4, 'b']], columns=['a', 'g'])
>>> ser = pd.Series([7, 5])
>>> out_of_place_col_insert(df, ser, 1, 'n')
   a  n  g
0  1  7  a
1  4  5  b

dalio.util.translation_utils module

Translation utilities

dalio.util.translation_utils.get_numeric_column_names(df)

Return the names of all columns of numeric type.

Parameters

df (pandas.DataFrame) – The dataframe to get numeric column names for.

Returns

The names of all columns of numeric type.

Return type

list of str

Example

>>> import pandas as pd; from dalio.util import get_numeric_column_names
>>> data = [[2, 3.2, "acd"], [1, 7.2, "alk"], [8, 12.1, "alk"]]
>>> df = pd.DataFrame(data, [1,2,3], ["rank", "ph","lbl"])
>>> sorted(get_numeric_column_names(df))
['ph', 'rank']

dalio.util.translation_utils.translate_df(translator, df, inplace=False)

Translate dataframe column and index names in accordance to translator dictionary.

Parameters
  • translator (dict) – dictionary of {original: translated} key value pairs.

  • df (pd.DataFrame) – dataframe to have rows and columns translated.

  • inplace (bool) – whether to perform operation inplace or return a translated copy. Optional. Defaults to False.
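
The translation described above resembles a rename of both axes with a single mapping. The following sketch expresses that with DataFrame.rename; the helper is hypothetical, and returning None for the in-place case is an assumption rather than documented behaviour.

```python
import pandas as pd


def translate_df_sketch(translator, df, inplace=False):
    """Hypothetical stand-in: rename index and column labels via one mapping."""
    if inplace:
        # Labels missing from the mapping are left unchanged by rename.
        df.rename(index=translator, columns=translator, inplace=True)
        return None
    return df.rename(index=translator, columns=translator)


raw = pd.DataFrame(
    [[10.0, 100], [20.0, 200]],
    index=["AAPL", "MSFT"],
    columns=["close", "volume"],
)
translated = translate_df_sketch({"close": "price", "AAPL": "Apple"}, raw)
print(translated)
```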

Module contents

dalio.util.extract_level_names_dict(df)

Extract all column names in a dataframe as a (level: names) dictionary

Parameters

df (pd.DataFrame) – dataframe whose columns will be extracted

dalio.util.filter_levels(levels, filters)

Filter columns in levels to either be equal to specified columns or a filtering function

Parameters
  • levels (dict) – all column names in a (level: names) dict

  • filters (str, list, callable, dict) – either columns to place on a specified level or filter functions to select columns there.

dalio.util.extract_cols(df, cols)

Extract columns from a dataframe

Parameters
  • df (pd.DataFrame) – dataframe containing the columns

  • cols (hashable, iterable, dict) – single column, list of columns, or dict with the level as keys and column(s) as values.

Raises

KeyError – if columns are not in dataframe

dalio.util.insert_cols(df, new_data, cols)

Insert new data into specified existing columns

Parameters
  • df (pd.DataFrame) – dataframe to insert data into.

  • new_data (any) – new data to be inserted

  • cols (hashable, iterable, dict) – existing columns in data.

Raises
  • KeyError – if columns are not in dataframe

  • Exception – if new data doesn’t fit cols dimensions

dalio.util.drop_cols(df, cols)

Drop selected columns from levels

Parameters
  • df (pd.DataFrame) – dataframe to have columns dropped.

  • cols (hashable, iterable, dict) – column selection

dalio.util.get_slice_from_dict(df, cols)

Get a tuple of slices that locate the specified (level: column) combination.

Parameters
  • df (pd.DataFrame) – dataframe with multiindex

  • cols (dict) – (level: column) dictionary

Raises
  • ValueError – if any of the level keys are not integers

  • KeyError – if any level key is out of bounds or if columns are not in the dataframe

dalio.util.mi_join(df1, df2, *args, **kwargs)

Join two dataframes and sort their columns

Parameters
  • df1, df2 – dataframes to join

  • *args, **kwargs – arguments for the join function (called from df1)

Raises

ValueError – if the number of levels doesn't match

dalio.util.add_suffix(all_cols, cols, suffix)

Add suffix to appropriate level in a given column index.

Parameters
  • all_cols (pd.Index, pd.MultiIndex) – all columns from an index. This is only relevant when the columns at hand are a MultiIndex, as each tuple will contain elements from all levels (not only the selected ones).

  • cols (str, list, dict) – selected columns

  • suffix (str) – the suffix to add to the selected columns.

dalio.util.out_of_place_col_insert(df, series, loc, column_name=None)

Returns a new dataframe with given column inserted at given location.

Parameters
  • df (pandas.DataFrame) – The dataframe into which to insert the column.

  • series (pandas.Series) – The pandas series to be inserted.

  • loc (int) – The location into which to insert the new column.

  • column_name (str, default None) – The name to assign the new column. If None, the given series name attribute is attempted; if the given series is missing the name attribute a ValueError exception will be raised.

Returns

The resulting dataframe.

Return type

pandas.DataFrame

Example

>>> import pandas as pd; from dalio.util import out_of_place_col_insert
>>> df = pd.DataFrame([[1, 'a'], [4, 'b']], columns=['a', 'g'])
>>> ser = pd.Series([7, 5])
>>> out_of_place_col_insert(df, ser, 1, 'n')
   a  n  g
0  1  7  a
1  4  5  b

dalio.util.translate_df(translator, df, inplace=False)

Translate dataframe column and index names in accordance to translator dictionary.

Parameters
  • translator (dict) – dictionary of {original: translated} key value pairs.

  • df (pd.DataFrame) – dataframe to have rows and columns translated.

  • inplace (bool) – whether to perform operation inplace or return a translated copy. Optional. Defaults to False.

dalio.util.get_numeric_column_names(df)

Return the names of all columns of numeric type.

Parameters

df (pandas.DataFrame) – The dataframe to get numeric column names for.

Returns

The names of all columns of numeric type.

Return type

list of str

Example

>>> import pandas as pd; from dalio.util import get_numeric_column_names
>>> data = [[2, 3.2, "acd"], [1, 7.2, "alk"], [8, 12.1, "alk"]]
>>> df = pd.DataFrame(data, [1,2,3], ["rank", "ph","lbl"])
>>> sorted(get_numeric_column_names(df))
['ph', 'rank']

dalio.util.process_cols(cols)

Standardize input columns

dalio.util.process_new_colnames(cols, new_cols)

Get new column names based on the column parameter

dalio.util.process_date(date)

Standardize input date

Raises

TypeError – if the type of the date parameter cannot be converted to a pandas timestamp

dalio.util.process_new_df(df1, df2, cols, new_cols)

Process new dataframe given columns and new column names

Parameters
  • df1 (pd.DataFrame) – first dataframe.

  • df2 (pd.DataFrame) – dataframe to join or get columns from

  • cols (iterable) – iterable of columns being targeted.

  • new_cols (iterable) – iterable of new column names.

dalio.util.plot_efficient_frontier(cla, points=100, visible=25, show_assets=True, ax=None, **kwargs)

Plot the efficient frontier based on a CLA object

Parameters
  • points (int) – number of points to plot. Optional. Defaults to 100

  • show_assets (bool) – whether we should plot the asset risks/returns also. Optional. Defaults to True.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

  • **kwargs – optional parameters for main graph.

Returns

matplotlib axis

dalio.util.plot_covariance(cov_matrix, plot_correlation=False, show_tickers=True, ax=None)

Generate a basic plot of the covariance (or correlation) matrix, given a covariance matrix.

Parameters
  • cov_matrix (pd.DataFrame, np.ndarray) – covariance matrix

  • plot_correlation (bool) – whether to plot the correlation matrix instead, defaults to False. Optional.

  • show_tickers (bool) – whether to use tickers as labels (not recommended for large portfolios). Optional. Defaults to True.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

Returns

matplotlib axis

dalio.util.plot_weights(weights, ax=None, **kwargs)

Plot the portfolio weights as a horizontal bar chart

Parameters
  • weights (dict) – the weights outputted by any PyPortfolioOpt optimiser.

  • ax (matplotlib.axis, None) – Axis to plot on. Optional. New axis will be created if none is specified.

  • **kwargs – optional parameters for main graph.

Returns

matplotlib axis