pdmongo
pdmongo.read_mongo(collection, query, db, index_col=None, extra=None, columns=None, chunksize=None)
Read MongoDB query into a DataFrame.
Returns a DataFrame corresponding to the result set of the query. Optionally provide an index_col parameter to use one of the columns as the index; otherwise, a default integer index is used.
- Parameters
collection (str) – Mongo collection to select for querying
query (list) – Must be an aggregate query. The input is passed to pymongo's .aggregate method.
db (pymongo.database.Database or database string URI) – The database to use
index_col (str or list of str, optional, default: None) – Column(s) to set as the index (a MultiIndex when several columns are given).
extra (list, tuple or dict, optional, default: None) – Extra parameters to pass to the find/aggregate method.
chunksize (int, optional, default: None) – If specified, return an iterator where chunksize is the number of documents to include in each chunk.
- Returns
DataFrame
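The sketch below illustrates how the parameters above might fit together. The collection name `people`, its `name` and `age` fields, and the helper functions are hypothetical; only `read_mongo` and its documented parameters come from this page. `pdmongo` is imported inside the functions so that the pipeline helper can be used on its own, and it is assumed that each chunk yielded when `chunksize` is set supports `len()`.

```python
def build_pipeline(min_age):
    """Return an aggregate query (a list of stages) for a hypothetical 'people' collection."""
    return [
        {"$match": {"age": {"$gte": min_age}}},         # keep documents with age >= min_age
        {"$project": {"_id": 0, "name": 1, "age": 1}},  # keep only the name and age fields
    ]


def load_people(db_uri, min_age=18):
    """Read the matching documents into a DataFrame indexed by 'name'."""
    import pdmongo as pdm  # requires pdmongo and a reachable MongoDB server

    return pdm.read_mongo("people", build_pipeline(min_age), db_uri, index_col="name")


def count_people_in_chunks(db_uri, min_age=18, chunksize=1000):
    """Iterate over the result in chunks and count the rows seen."""
    import pdmongo as pdm

    chunks = pdm.read_mongo(
        "people", build_pipeline(min_age), db_uri, chunksize=chunksize
    )
    # Assumption: with chunksize set, read_mongo returns an iterator of chunks.
    return sum(len(chunk) for chunk in chunks)
```

With a server running, a call such as `load_people("mongodb://localhost:27017/mydb")` would pass the database as a string URI; a `pymongo.database.Database` object could be passed in its place.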
pdmongo.to_mongo(frame, name, db, if_exists='fail', index=True, index_label=None, chunksize=None)
Write records stored in a DataFrame to a MongoDB collection.
- Parameters
frame (DataFrame, Series) – The records to write.
name (str) – Name of collection.
db (pymongo.database.Database or database string URI) – The database to write to
if_exists ({‘fail’, ‘replace’, ‘append’}, default ‘fail’) – What to do if the collection already exists.
fail: If the collection exists, do nothing.
replace: If the collection exists, drop it, recreate it, and insert data.
append: If the collection exists, insert data. Create it if it does not exist.
index (boolean, default True) – Write DataFrame index as a column.
index_label (str or sequence, optional) – Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
chunksize (int, optional) – Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.
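As a rough illustration of the parameters above, the following sketch writes a small DataFrame to a collection. The `people` collection, the sample records, and the helpers `write_options` and `save_people` are hypothetical; only `to_mongo` and its documented parameters come from this page. `pandas` and `pdmongo` are imported inside `save_people` so that the option helper can be inspected without them.

```python
# Hypothetical sample records used only for illustration.
PEOPLE = [
    {"name": "Ada", "age": 36},
    {"name": "Grace", "age": 45},
]


def write_options(append=True, chunksize=500):
    """Build the keyword arguments passed to to_mongo."""
    return {
        "if_exists": "append" if append else "replace",  # behaviour when the collection exists
        "index": True,          # write the DataFrame index as a column
        "index_label": "name",  # column label used for the index
        "chunksize": chunksize, # number of rows written per batch
    }


def save_people(db_uri, append=True):
    """Write the sample records to the hypothetical 'people' collection."""
    import pandas as pd
    import pdmongo as pdm  # requires pdmongo and a reachable MongoDB server

    frame = pd.DataFrame(PEOPLE).set_index("name")
    pdm.to_mongo(frame, "people", db_uri, **write_options(append))
```

Calling `save_people("mongodb://localhost:27017/mydb", append=False)` would request the `replace` behaviour, while the default call appends to the collection or creates it.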