Module grab.document

The Document class represents the result of a network request made with a Grab instance.

class grab.document.Document(grab=None)[source]
A document. In most cases it is a network response,
i.e. the result of a network request.
browse()[source]

Save the response to a temporary file and open it in a GUI browser.

copy(new_grab=None)[source]

Clone the Response object.

detect_charset()[source]

Detect charset of the response.

Try the following methods:

  • meta[name="Http-Equiv"]
  • XML declaration
  • HTTP Content-Type header

Ignore unknown charsets.

Use utf-8 as fallback charset.
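The lookup described above can be sketched as follows. This is only an illustration, not Grab's implementation: the function name `detect_charset_sketch`, its parameters, the regular expressions, and the 4096-byte scan window are all assumptions. As a simplification, the sketch accepts any `charset=` value found inside a meta tag.

```python
import codecs
import re
from typing import Optional

# Hypothetical patterns; the real library may match these sources differently.
META_RE = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.I)
XML_RE = re.compile(rb'<\?xml[^>]+encoding=["\']([\w-]+)', re.I)
HEADER_RE = re.compile(r'charset=["\']?([\w-]+)', re.I)


def detect_charset_sketch(body: bytes, content_type: Optional[str] = None) -> str:
    """Return the first known charset found, or 'utf-8' as the fallback."""
    head = body[:4096]  # assumed scan window
    candidates = []

    match = META_RE.search(head)          # 1. meta tag
    if match:
        candidates.append(match.group(1).decode('ascii'))

    match = XML_RE.search(head)           # 2. XML declaration
    if match:
        candidates.append(match.group(1).decode('ascii'))

    if content_type:                      # 3. HTTP Content-Type header
        match = HEADER_RE.search(content_type)
        if match:
            candidates.append(match.group(1))

    for name in candidates:
        try:
            codecs.lookup(name)           # raises LookupError for unknown charsets
        except LookupError:
            continue                      # ignore unknown charsets
        return name.lower()
    return 'utf-8'                        # fallback charset
```

Checking each candidate with `codecs.lookup` is one simple way to skip charset names that Python cannot decode.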

json

Return the response body deserialized from JSON into a Python object.

parse(charset=None, headers=None)[source]

Parse headers.

This method is called after the Grab instance performs a network request.

query_param(key)[source]

Return the value of a parameter in the query string.
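A minimal sketch of such a lookup, using only the standard library. The helper `query_param_sketch` and the example URL are hypothetical. Returning the first value and raising KeyError for a missing key are assumptions and may not match the library.

```python
from urllib.parse import parse_qs, urlsplit


def query_param_sketch(url: str, key: str) -> str:
    """Return the first value of `key` in the query string of `url`."""
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values; take the first one.
    return parse_qs(query)[key][0]


value = query_param_sketch('http://example.com/search?q=grab&page=2', 'page')
```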

save(path, create_dirs=False)[source]

Save the response body to a file.

save_hash(location, basedir, ext=None)[source]

Save the response body to a file whose path is built from a hash. This keeps the number of files per directory low.

Parameters:
  • location – URL of the file, or any other string. It is used to build the SHA1 hash.
  • basedir – base directory in which to save the file. Note that the file is not saved directly in this directory but in a sub-directory of basedir.
  • ext – extension to append to the file name. The dot between the file name and the extension is inserted automatically.
Returns:

Path to the saved file, relative to basedir.

Example:

>>> url = 'http://yandex.ru/logo.png'
>>> g.go(url)
>>> g.response.save_hash(url, 'some_dir', ext='png')
'e8/dc/f2918108788296df1facadc975d32b361a6a.png'
# the file was saved to $PWD/some_dir/e8/dc/...

TODO: replace basedir with two options, root and save_to, and return save_to + path.
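The path layout in the example above (two 2-character directories followed by the rest of the 40-character SHA1 hex digest) can be sketched like this. The helper names `hashed_rel_path` and `save_hashed` are hypothetical, and the UTF-8 encoding of location is an assumption. The 2/2/36 split is inferred from the example output only, so treat it as an illustration rather than the library's code.

```python
import hashlib
import os
from typing import Optional


def hashed_rel_path(location: str, ext: Optional[str] = None) -> str:
    """Build a relative path such as 'ab/cd/<remaining 36 hex chars>.ext'."""
    digest = hashlib.sha1(location.encode('utf-8')).hexdigest()
    rel_path = '/'.join((digest[:2], digest[2:4], digest[4:]))
    if ext is not None:
        rel_path = '%s.%s' % (rel_path, ext)  # dot is inserted automatically
    return rel_path


def save_hashed(body: bytes, location: str, basedir: str,
                ext: Optional[str] = None) -> str:
    """Write `body` under `basedir` and return the path relative to it."""
    rel_path = hashed_rel_path(location, ext)
    full_path = os.path.join(basedir, rel_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, 'wb') as out:
        out.write(body)
    return rel_path
```

With two levels of 2-character hex directories, files are spread over up to 256 × 256 sub-directories, which is what keeps each directory small.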

url_details()[source]

Return the result of the urlsplit function applied to the response URL.
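For reference, this is roughly what the standard-library `urlsplit` returns. The URL below is a made-up example, not taken from Grab.

```python
from urllib.parse import urlsplit

# Hypothetical response URL, used only for illustration.
parts = urlsplit('http://example.com:8080/path/page.html?q=grab#top')

scheme = parts.scheme      # 'http'
netloc = parts.netloc      # 'example.com:8080'
path = parts.path          # '/path/page.html'
query = parts.query        # 'q=grab'
fragment = parts.fragment  # 'top'
```

The result is a named tuple, so the components can be read by attribute or by index.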

grab.document.read_bom(data)[source]

Read the byte order mark in the text, if present, and return the encoding represented by the BOM and the BOM.

If no BOM can be detected, (None, None) is returned.
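A minimal sketch of BOM detection with the standard `codecs` constants. The function name `read_bom_sketch`, the set of BOMs checked, and the encoding names returned are assumptions, not necessarily what grab.document.read_bom uses.

```python
import codecs

# UTF-32 BOMs are checked first: BOM_UTF32_LE begins with the same
# two bytes as BOM_UTF16_LE, so the longer marker must win.
_BOM_TABLE = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]


def read_bom_sketch(data: bytes):
    """Return (encoding, bom) if `data` starts with a known BOM, else (None, None)."""
    for bom, encoding in _BOM_TABLE:
        if data.startswith(bom):
            return encoding, bom
    return None, None
```

A caller can strip the marker before decoding, for example `data[len(bom):].decode(encoding)` when a BOM was found.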