Here you can find exact details regarding the app’s internals.

Main Interface

This is the entry point of the application.

main.run(*, domain: str, reports_path: str, reports_list: list[dict[str, str]] = [], summary_filepath: str = '', log_path: str = '', report: str = '', path: str = '', threads: int = 0, stdout_loglevel: str = 'WARNING', file_loglevel: str = 'WARNING', verbose: bool = False) None[source]

Main function of the program.

Parameters:
  • domain (str) – SalesForce domain of your organization -> “https://corp.my.salesforce.com/”

  • reports_path (str) – Path to reports.csv file, template -> Template

  • reports_list (list[dict[str, str]]) – List of the reports as dictionaries -> [{'name': 'RaportName', 'id': '00O1V00000999GHES', 'path': WindowsPath('C:/downloads')}]

  • summary_filepath (str) – File path to summary report -> C:/downloads/summary.csv

  • log_path (str) – Path to log file -> C:/downloads/logs/

  • report (str) – Single report mode -> RaportName,00O1V00000999GHES,C:/downloads

  • path (str) – Save location path override -> C:/new_downloads

  • threads (int) – Number of threads to use. (Default: half of available threads of the machine)

  • stdout_loglevel (str) – Log level for stdout logging -> ['critical'|'error'|'warn'|'warning'|'info'|'debug'] (Default: WARNING)

  • file_loglevel (str) – Log level for file logging -> ['critical'|'error'|'warn'|'warning'|'info'|'debug'] (Default: WARNING)

  • verbose (bool) – Toggles between Progress Bar and stdout logging (Default: False)

Usage:

import sfrout
sfrout.run(domain="https://corp.my.salesforce.com/", reports_path="C:/path/to/reports.csv")
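
The level names accepted by stdout_loglevel and file_loglevel correspond to the constants of the standard logging module. A minimal sketch of such a mapping, assuming case-insensitive matching; the helper name to_logging_level is hypothetical and not part of the package:

```python
import logging

# Accepted names as listed in the parameter descriptions above.
_LEVELS = {
    "critical": logging.CRITICAL,
    "error": logging.ERROR,
    "warn": logging.WARNING,
    "warning": logging.WARNING,
    "info": logging.INFO,
    "debug": logging.DEBUG,
}


def to_logging_level(name: str) -> int:
    """Translate a level name into a logging constant (hypothetical helper)."""
    try:
        return _LEVELS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown log level: {name!r}") from None
```

Rejecting unknown names early keeps a typo in the level from silently falling back to some other verbosity.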

cli

SFrout is a scalable, asynchronous SalesForce report downloader for SAML/SSO clients. The app allows you to download reports based on their ID using your personal SFDC account. It supports asynchronous requests, threaded processing of the files, and logging to a rotating file and stdout, and it produces a summary report for the session.

Usage:

$ sfrout "https://corp.my.salesforce.com/" "C:\path\to\reports.csv"
cli [OPTIONS] DOMAIN [REPORTS_PATH]

Options

-s, --summary_filepath <summary_filepath>

Path to the summary report -> c:/summary_report.csv

-l, --log_path <log_path>

Path to the log file -> c:/log

-r, --report <report>

Run single report -> “name,id,path,optional_report_params”

-p, --path <path>

Override save location of the reports

-t, --threads <threads>

Number of threads to spawn

Default:

0

-ls, --stdout_loglevel <stdout_loglevel>

STDOUT logging level -> [DEBUG | INFO | WARN | WARNING | ERROR | CRITICAL]

Default:

WARNING

-lf, --file_loglevel <file_loglevel>

File logging level -> [DEBUG | INFO | WARN | WARNING | ERROR | CRITICAL]

Default:

INFO

-v, --verbose

Turn off progress bar

Default:

False

Arguments

DOMAIN

Required argument

REPORTS_PATH

Optional argument

Components

Config

class components.config.ConfigProtocol(*args, **kwargs)[source]

Protocol class for config object.

Parameters:
  • reports_list_path (str) – CLI argument for input report list path.

  • report (str) – CLI argument for single report params.

  • path (str) – CLI argument for save location path override.

  • threads (int) – CLI argument for number of threads to use.

class components.config.Config(*, domain: str, reports_csv_path: str, reports_list: list[dict[str, Any]] = [], summary_filepath: str | None = None, log_path: str | None = None, report: str = '', path: str = '', threads: int = 0, stdout_loglevel: str = 'WARNNING', file_loglevel: str = 'INFO', verbose=False)[source]

Concrete class representing Config object. Contains entire configuration required for a program.

Parameters:
  • reports_list_path (str) – CLI argument for input report list path.

  • report (str) – CLI argument for single report params.

  • path (str) – CLI argument for save location path override.

  • threads (int) – CLI argument for number of threads to use.

_define_number_of_threads()[source]

Defines the number of threads. By default, the number of threads is set to half of the available threads. If the number of available threads cannot be determined, the number of threads is set to 2. If a thread count has been defined in the CLI configuration, that number is used. If the CLI report argument is filled (single report mode), the number of threads is automatically set to 1.
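
The rules above can be sketched as a small function. This is an illustration only: the name resolve_thread_count, its parameters, the precedence of single report mode over an explicit thread count, and the lower bound of 1 are assumptions, not the package's actual code.

```python
import os
from typing import Optional


def resolve_thread_count(cli_threads: int, single_report: str,
                         available: Optional[int]) -> int:
    """Pick a worker-thread count following the rules described above."""
    if single_report:      # single report mode always uses one thread
        return 1
    if cli_threads > 0:    # an explicit CLI value replaces the default
        return cli_threads
    if not available:      # available thread count could not be determined
        return 2
    return max(1, available // 2)  # half of the available threads, at least 1


# os.cpu_count() may return None, which the function treats as "unknown".
threads = resolve_thread_count(0, "", os.cpu_count())
```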

_input_report_path_cast(object_kwargs: list[dict[str, Any]]) list[dict[str, str | os.PathLike]][source]

Casts value of path key into Path object.

Parameters:

object_kwargs (list[dict[str, Any]]) – Collection of object parameters

Returns:

Collection of object parameters with path cast to a Path object

Return type:

list[dict[str, str | PathLike]]

_input_report_single_mode_override() list[dict[str, str]][source]

Reads the parameters taken from the CLI and returns them parsed into object kwargs.

Returns:

Collection with a single set of object kwargs (parameters) based on the CLI argument

Return type:

list[dict[str, str]]
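
A minimal sketch of how a single report string in the “name,id,path,optional_report_params” form could be turned into object kwargs. The function name parse_single_report and the export_params key used for the optional part are assumptions made for illustration:

```python
def parse_single_report(report: str) -> list[dict[str, str]]:
    """Split 'name,id,path[,optional_report_params]' into one kwargs dict."""
    parts = [part.strip() for part in report.split(",", maxsplit=3)]
    if len(parts) < 3:
        raise ValueError("Expected at least 'name,id,path'")
    # zip stops after the three keys, so a fourth part is handled separately.
    kwargs = dict(zip(("name", "id", "path"), parts))
    if len(parts) == 4 and parts[3]:
        # Key name for the optional part is an assumption of this sketch.
        kwargs["export_params"] = parts[3]
    return [kwargs]
```

Using maxsplit=3 keeps any further commas inside the optional parameters intact.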

_input_report_csv_standard_file_mode() list[dict[str, str]][source]

Reads the parameters taken from the input CSV and returns them parsed into object kwargs.

Returns:

Collection of object kwargs (parameters) based on the input CSV

Return type:

list[dict[str, str]]
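
Reading report parameters from CSV can be sketched with the standard csv module. The column headers name, id and path are an assumption based on the keys shown for reports_list; the real template may differ, and read_report_rows is a hypothetical helper:

```python
import csv
import io


def read_report_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with a header row into one kwargs dict per report."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Assumed layout: one header row followed by one row per report.
sample = "name,id,path\nReportName,00O1V00000999GHES,C:/downloads\n"
rows = read_report_rows(sample)
```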

_input_report_path_override(object_kwargs: list[dict[str, str]]) list[dict[str, str]][source]

Replaces value of path kwarg parameter of the object with path value from CLI argument.

Parameters:

object_kwargs (list[dict[str, str]]) – Collection of object parameters

Returns:

Collection of object parameters with path replaced with value of path CLI argument

Return type:

list[dict[str, str]]

_parse_input_report() list[dict[str, Any]][source]

Orchestrating function for parsing parameters for input reports.

Returns:

Collection of ready to use object kwargs.

Return type:

list[dict[str, Any]]
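
The last two parsing steps, the path override and the cast to Path, might be combined roughly as follows. The function name prepare_report_kwargs and the order of the steps are assumptions; the sketch only illustrates the idea of producing ready-to-use kwargs:

```python
from pathlib import Path
from typing import Any


def prepare_report_kwargs(rows: list[dict[str, str]],
                          path_override: str = "") -> list[dict[str, Any]]:
    """Apply the optional path override, then cast each path to Path."""
    prepared = []
    for row in rows:
        target = path_override or row["path"]
        # Build a new dict so the input rows stay unchanged.
        prepared.append({**row, "path": Path(target)})
    return prepared


rows = [{"name": "ReportName", "id": "00O1V00000999GHES", "path": "C:/downloads"}]
overridden = prepare_report_kwargs(rows, path_override="C:/new_downloads")
```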

Connectors

class components.connectors.ConnectorProtocol(*args, **kwargs)[source]

Protocol class for connector object.

Parameters:
  • queue (Queue) – Shared queue object.

  • timeout (int) – Request’s timeout value in seconds.

  • headers (dict[str, str]) – Headers required to establish the connection.

check_connection() bool[source]

Checks connection with given domain.

Returns:

Flag, True if connection is established, False otherwise.

Return type:

bool

async report_gathering(reports: list[components.containers.ReportProtocol], session: ClientSession) None[source]

Collects asynchronous responses from the servers.

Parameters:
  • reports (list[ReportProtocol]) – Collection of ReportProtocol objects.

  • session (aiohttp.ClientSession) – HTTP client session object to handle request in transaction.

class components.connectors.SfdcConnector(queue: Queue, *, domain: str, verbose: bool = False, timeout: int = 900, headers: dict[str, str] = {'Content-Type': 'application/csv', 'X-PrettyPrint': '1'})[source]

Concrete class representing Connector object for SFDC

Parameters:
  • queue (Queue) – Shared queue object.

  • verbose (bool) – CLI parameter used as a switch between the progress bar and logging to stdout on INFO level. Defaults to False.

  • timeout (int) – Request’s timeout value in seconds. Defaults to 900.

  • headers (dict[str, str]) – Headers required to establish the connection. Defaults to {‘Content-Type’: ‘application/csv’, ‘X-PrettyPrint’: ‘1’}.

  • export_params (str) – Default parameters required by SFDC. Defaults to ‘?export=csv&enc=UTF-8&isdtp=p1’.

_convert_domain_for_cookies_lookup() str[source]

Converts the domain into the key used in the cookie jar for the sid lookup.

Returns:

Converted URL compliant with the cookie keys.

Return type:

str
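
The exact key format is not documented here. As an illustration only, the sketch below assumes the cookie key is the bare host name and derives it with urllib.parse; domain_to_cookie_key is a hypothetical name:

```python
from urllib.parse import urlparse


def domain_to_cookie_key(domain: str) -> str:
    """Reduce a full URL to the bare host name (assumed cookie key format)."""
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(domain if "://" in domain else f"https://{domain}")
    return parsed.hostname or ""
```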

_parse_headers() None[source]

Parses headers for request.

_intercept_sid() str[source]

Intercepts sid from MS Edge’s CookieJar.

Returns:

Intercepted sid or empty string if sid doesn’t exist.

Return type:

str

_open_sfdc_site() None[source]

Opens SFDC website on given domain url if sid is not present or not valid.

_sid_check() bool[source]

Checks SID validity for the given SFDC domain.

Returns:

Flag, True if the SID is valid, False otherwise.

Return type:

bool

check_connection() bool[source]

Checks the connection with given domain.

Returns:

Flag, True if the connection was successful, False otherwise.

Return type:

bool

_parse_report_url(report: ReportProtocol) str[source]

Parses report object url.

Parameters:

report (ReportProtocol) – Instance of ReportProtocol.

Returns:

Parsed url.

Return type:

str
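
One plausible way to assemble such a URL is to join the domain, the report id and the export parameters. The layout below is an assumption for illustration, and build_report_url is a hypothetical helper, not the package's method:

```python
def build_report_url(domain: str, report_id: str,
                     export_params: str = "?export=csv&enc=UTF-8&isdtp=p1") -> str:
    """Join domain, report id and export parameters into one URL."""
    # rstrip avoids a double slash when the domain ends with '/'.
    return f"{domain.rstrip('/')}/{report_id}{export_params}"
```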

async _request_report(report: ReportProtocol, session: ClientSession) None[source]

Sends an asynchronous request to the given domain with the given parameters within the shared session, then checks the response status:

  • 200: the response is saved in ReportProtocol.response, ReportProtocol.valid is set to True, and the ReportProtocol is put on the queue.

  • 404: error in response, ReportProtocol.valid is set to False, no retries.

  • 500: request timeout, ReportProtocol.valid is set to False, another attempt is made.

  • *: unknown error, ReportProtocol.valid is set to False, another attempt is made.

Parameters:
  • report (ReportProtocol) – Instance of ReportProtocol.

  • session (aiohttp.ClientSession) – Shared session object.
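
The status handling described for _request_report can be sketched without any network code. ReportStub and handle_status are hypothetical stand-ins that carry only the fields and logic needed for the illustration:

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class ReportStub:
    """Hypothetical stand-in carrying only the fields used here."""
    name: str
    response: str = ""
    valid: bool = False


def handle_status(report: ReportStub, status: int, body: str, queue: Queue) -> bool:
    """Update the report for one response; return True if a retry is due."""
    if status == 200:
        report.response = body
        report.valid = True
        queue.put(report)   # hand the report over to the worker threads
        return False
    report.valid = False
    return status != 404    # 404 is final; 500 and anything else retry
```

Returning a retry flag keeps the decision separate from the retry loop itself, which would live in the calling coroutine.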

async _toggle_progress_bar(tasks: list[_asyncio.Task]) None[source]

Toggles between showing progress bar and logging on INFO level.

Parameters:

tasks (list[asyncio.Task]) – Collection of asynchronous request tasks.

_create_async_tasks(reports: list[components.containers.ReportProtocol], session: ClientSession) list[_asyncio.Task][source]

Creates collection of asynchronous request tasks.

Parameters:
  • reports (list[ReportProtocol]) – Collection of ReportProtocol instances.

  • session (aiohttp.ClientSession) – Shared, asynchronous session.

Returns:

Collection of asynchronous request tasks.

Return type:

list[asyncio.Task]

async _report_request_all(reports: list[components.containers.ReportProtocol], session: ClientSession) None[source]

Orchestrates entire process of processing tasks.

Parameters:
  • reports (list[ReportProtocol]) – Collection of ReportProtocol instances.

  • session (aiohttp.ClientSession) – Shared aiohttp session.

async handle_requests(reports: list[components.containers.ReportProtocol]) None[source]

Creates a session and processes the asynchronous tasks.

Parameters:

reports (list[ReportProtocol]) – Collection of ReportProtocol instances.
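
The general pattern of creating one task per report and awaiting all of them can be sketched with plain asyncio. The fetch coroutine below is a stand-in that only simulates waiting; the real connector uses an aiohttp.ClientSession, which is left out so the example stays self-contained:

```python
import asyncio


async def fetch(report_id: str) -> str:
    """Stand-in for the real HTTP request; only simulates waiting."""
    await asyncio.sleep(0.01)
    return f"response for {report_id}"


async def handle_all(report_ids: list[str]) -> list[str]:
    """Create one task per report and wait until all of them finish."""
    tasks = [asyncio.create_task(fetch(report_id)) for report_id in report_ids]
    # gather returns the results in the order the tasks were passed in.
    return await asyncio.gather(*tasks)


results = asyncio.run(handle_all(["A", "B", "C"]))
```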

Containers

class components.containers.ReportProtocol(*args, **kwargs)[source]

Protocol class for report object.

Parameters:
  • name (str) – Report name, propagated to report file name

  • id (str) – Report id, identification number of the report in SFDC

  • path (PathLike) – Report path, save location for the report in form of Path object

  • type (str) – Report type, allowed options [‘SFDC’], type drives connector and report objects selection

  • export_params (str) – Default parameters required by SFDC. Defaults to ‘?export=csv&enc=UTF-8&isdtp=p1’.

  • downloaded (bool) – Flag indicating whether the report has been successfully downloaded or not

  • valid (bool) – Flag indicating whether the response has been successfully retrieved or not

  • created_date (datetime) – Report save completion date

  • pull_date (datetime) – Report response completion date

  • processing_time (timedelta) – The time it took to process the report in seconds

  • attempt_count (int) – Number of attempts to process the report

  • size (float) – Size of saved report file in Mb

  • response (str) – Container for request response

  • content (DataFrame) – Pandas DataFrame based on response

class components.containers.ReportsContainerProtocol(*args, **kwargs)[source]

Protocol class for report container object.

Parameters:
  • report_params_list (list[dict[str, Any]]) – Collection of dicts with parameters for object crafting.

  • summary_report_path (PathLike) – Path to save location of summary report.

create_reports() list[components.containers.ReportProtocol][source]

Orchestrating method to handle report objects factory

Returns:

Collection of Reports

Return type:

list[ReportProtocol]

create_summary_report() None[source]

Creates a summary report which consists of all important details regarding the Report objects. The summary report is generated once all the reports are completed.

class components.containers.SfdcReport(name: str, id: str, path: os.PathLike, type: str = 'SFDC', export_params: str = '?export=csv&enc=UTF-8&isdtp=p1', downloaded: bool = False, valid: bool = False, created_date: datetime.datetime = datetime.datetime(2023, 4, 5, 8, 48, 11, 310604), pull_date: datetime.datetime = datetime.datetime(2023, 4, 5, 8, 48, 11, 310604), processing_time: datetime.timedelta = datetime.timedelta(0), attempt_count: int = 0, size: float = 0.0, response: str = '', content: pandas.core.frame.DataFrame = <factory>)[source]

Concrete class representing Report object from SFDC.

Parameters:
  • name (str) – Report name, propagated to report file name.

  • id (str) – Report id, identification number of the report in SFDC.

  • path (PathLike) – Report path, save location for the report in form of Path object.

  • type (str) – Report type, allowed options [‘SFDC’], type drives connector and report objects selection. Defaults to ‘SFDC’.

  • export_params (str) – Default parameters required by SFDC. Defaults to ‘?export=csv&enc=UTF-8&isdtp=p1’.

  • downloaded (bool) – Flag indicating whether the report has been successfully downloaded or not. Defaults to False.

  • valid (bool) – Flag indicating whether the response has been successfully retrieved or not. Defaults to False.

  • created_date (datetime) – Report save completion date. Defaults to current datetime.

  • pull_date (datetime) – Report response completion date. Defaults to current datetime.

  • processing_time (timedelta) – The time it took to process the report in seconds. Defaults to 0 microseconds.

  • attempt_count (int) – Number of attempts to process the report. Defaults to 0.

  • size (float) – Size of saved report file in Mb. Defaults to 0.0.

  • response (str) – Container for request response. Defaults to empty string.

  • content (DataFrame) – Pandas DataFrame based on response. Defaults to empty Pandas DataFrame.
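
The fields and defaults listed above can be pictured as a dataclass. The reduced sketch below is hypothetical: the class name ReportSketch is invented, the content field is omitted so that pandas is not required, and using default_factory for the two dates is this sketch's way of getting a current datetime per instance, which may differ from the package's approach.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from pathlib import Path


@dataclass
class ReportSketch:
    """Hypothetical, reduced version of a report container."""
    name: str
    id: str
    path: Path
    type: str = "SFDC"
    export_params: str = "?export=csv&enc=UTF-8&isdtp=p1"
    downloaded: bool = False
    valid: bool = False
    # default_factory runs for every new instance, not once at import time.
    created_date: datetime = field(default_factory=datetime.now)
    pull_date: datetime = field(default_factory=datetime.now)
    processing_time: timedelta = timedelta(0)
    attempt_count: int = 0
    size: float = 0.0
    response: str = ""


report = ReportSketch("ReportName", "00O1V00000999GHES", Path("C:/downloads"))
```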

class components.containers.ReportsContainer(reports_params_list: list[dict[str, Any]], summary_path: PathLike | None)[source]

Concrete class representing ReportsContainer object.

_create_sfdc_reports() Generator[SfdcReport, None, None][source]

SFDC Report objects factory

Returns:

Generator with SFDC Report objects

Return type:

Generator[SfdcReport, None, None]

Yield:

SFDC Report instance based on parsed report parameters

Return type:

SfdcReport

create_reports() list[components.containers.ReportProtocol][source]

Orchestrating method to handle report objects factory

Returns:

Collection of Reports

Return type:

list[ReportProtocol]

create_summary_report() None[source]

Creates a summary report which consists of all important details regarding the reports. The report is generated once all the reports are completed.

print_summary_table() None[source]

Prints a summary report which consists of all important details regarding the reports. The report is generated once all the reports are completed.
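
A summary like the one produced by create_summary_report could be written with the standard csv module. The selection of columns and the helper name write_summary are assumptions for illustration; the actual summary layout is not documented here:

```python
import csv
import io


def write_summary(reports: list[dict], stream) -> None:
    """Write one summary row per report to an open text stream."""
    columns = ["name", "id", "downloaded", "valid", "attempt_count", "size"]
    # extrasaction="ignore" drops keys that are not summary columns.
    writer = csv.DictWriter(stream, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(reports)


buffer = io.StringIO()
write_summary([{"name": "ReportName", "id": "00O1V00000999GHES", "downloaded": True,
                "valid": True, "attempt_count": 1, "size": 0.4,
                "response": "not part of the summary"}], buffer)
```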

Handlers

class components.handlers.WorkerFactoryProtocol(*args, **kwargs)[source]

Protocol class for worker factory objects.

Parameters:
  • queue (Queue) – Shared, thread-safe queue.

  • threads (int) – Number of threads, equal to number of Workers to be deployed.

create_workers() None[source]

Creates workers on independent threads

static active_workers() int[source]

Counts the workers that are currently active.

Returns:

Number of active workers.

Return type:

int

class components.handlers.WorkerProtocol(*args, **kwargs)[source]

Protocol class for worker objects.

Parameters:

queue (Queue) – Shared, thread-safe queue.

_read_stream(report: ReportProtocol) None[source]

Reads the stream of data kept in Report object via Pandas read method. Deletes response content from the object.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

_save_to_csv(report: ReportProtocol) None[source]

Saves the read data to a CSV file using the Pandas save method.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

_erase_report(report: ReportProtocol) None[source]

Erases the report data.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

report_processing(report: ReportProtocol) None[source]

Orchestrates the report processing.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

run() NoReturn[source]

Starts the listener process on a separate thread and awaits objects in the queue.

Returns:

Method never returns.

Return type:

NoReturn

class components.handlers.WorkerFactory(queue: Queue, *, threads: int = 1)[source]

Concrete class representing WorkerFactory object.

create_workers() None[source]

Deploys given number of workers.

static active_workers() int[source]

Returns number of currently active workers.

Returns:

Number of workers.

Return type:

int

class components.handlers.Worker(queue: Queue)[source]

Concrete class representing Worker object.

_read_stream(report: ReportProtocol) None[source]

Reads the report’s response and saves it as the content attribute. Erases the saved response.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

_parse_save_path(report: ReportProtocol) PathLike[source]

Parses path to save location.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

Returns:

Path to save location

Return type:

os.PathLike

_save_to_csv(report: ReportProtocol) None[source]

Saves report content to CSV file. Sets object flags.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

_erase_report(report: ReportProtocol) None[source]

Deletes report content in ReportProtocol object.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

process_report(report: ReportProtocol) None[source]

Orchestrates the entire process of downloading the report.

Parameters:

report (ReportProtocol) – Instance of the ReportProtocol object.

run() NoReturn[source]

Begins to listen to the queue. Starts processing once it gets an item from the queue. Sends a signal to the queue once the task is done.

Returns:

Function never returns.

Return type:

NoReturn
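
The listen, process, signal cycle described for run() follows the usual queue consumer pattern. Below is a minimal sketch using only the standard library; worker_loop and the upper-casing stand-in for report processing are hypothetical:

```python
import threading
from queue import Queue


def worker_loop(queue: Queue, processed: list) -> None:
    """Block on the queue forever and handle items as they arrive."""
    while True:
        item = queue.get()     # waits until an item is available
        try:
            processed.append(item.upper())  # stand-in for report processing
        finally:
            queue.task_done()  # tell the queue this item has been handled


queue: Queue = Queue()
processed: list = []
for _ in range(2):
    # Daemon threads end with the main program, since the loop never returns.
    threading.Thread(target=worker_loop, args=(queue, processed), daemon=True).start()

for name in ("a", "b", "c"):
    queue.put(name)
queue.join()  # returns once task_done() was called for every item
```

Calling task_done() in a finally block keeps queue.join() from hanging if processing one item raises.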

Exceptions

exception components.exceptions.OutdatedSIDError(message: str = 'Your SID is outdate, please provide recent SID')[source]

Exception raised for errors in the SID value.

message

explanation of the error

Type:

str

exception components.exceptions.EnvFileNotPresent(message: str = '.env file not present in main directory')[source]

Exception raised if the .env file is not present in main directory.

message

explanation of the error

Type:

str
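
Both exceptions follow the same pattern: an Exception subclass with a default message stored in a message attribute. A hypothetical sketch of that pattern (the class name and message text below are invented for illustration):

```python
class OutdatedSidSketch(Exception):
    """Hypothetical exception following the pattern described above."""

    def __init__(self, message: str = "SID is outdated") -> None:
        self.message = message
        super().__init__(self.message)


try:
    raise OutdatedSidSketch()
except OutdatedSidSketch as error:
    caught = error.message  # the default message is kept on the instance
```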