Module netCDF4 :: Class Dataset

Class Dataset



object --+
         |
        Dataset
Known Subclasses:
Group

A netCDF Dataset is a collection of dimensions, groups, variables and attributes. Together they describe the meaning of data and relations among data fields stored in a netCDF file.

Constructor: Dataset(filename, mode="r", clobber=True)

Parameters:

filename - Name of netCDF file to hold dataset.

Keywords:

mode - access mode. r means read-only; no data can be modified. w means write; a new file is created, and any existing file with the same name is deleted. a and r+ mean append (by analogy with serial files); an existing file is opened for reading and writing.

clobber - if True (default), opening a file with mode='w' will clobber (overwrite) an existing file with the same name. If False, an exception is raised when a file with the same name already exists.

Returns:

a Dataset instance. All further operations on the netCDF Dataset are accomplished via Dataset instance methods.
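The access modes above can be sketched as follows. This is a minimal illustration, assuming the netCDF4 package is installed; the function name open_modes_demo and the file name example.nc are hypothetical. The type of exception raised for clobber=False is not stated here, so the sketch catches a generic Exception.

```python
import os
import tempfile


def open_modes_demo():
    """Create a file, reopen it read-only, then try to re-create it."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        filename = os.path.join(tmp, "example.nc")  # hypothetical file name

        # mode='w' creates a new file (an existing one would be deleted).
        ds = Dataset(filename, mode="w")
        ds.close()

        # mode='r' opens the existing file read-only.
        ds = Dataset(filename, mode="r")
        created = os.path.exists(filename)
        ds.close()

        # With clobber=False, mode='w' must refuse to overwrite the file.
        try:
            Dataset(filename, mode="w", clobber=False).close()
            refused = False
        except Exception:
            refused = True
    return created, refused
```

Calling open_modes_demo() returns two booleans: whether the file existed after creation, and whether the second mode='w' open was refused.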

A list of attribute names corresponding to global netCDF attributes defined for the Dataset can be obtained with the ncattrs() method. These attributes can be created by assigning to an attribute of the Dataset instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Dataset instance.

The instance variables dimensions, variables, groups, file_format and path are read-only (and should not be modified by the user).
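A short sketch of global attribute handling, assuming the netCDF4 package is installed. The function name, file name and attribute names (title, history) are illustrative only.

```python
import os
import tempfile


def global_attributes_demo():
    """Set two global attributes and read them back by name and as a dict."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        ds = Dataset(os.path.join(tmp, "attrs.nc"), mode="w")

        # Assigning to an attribute of the instance creates a global attribute.
        ds.title = "example dataset"
        ds.history = "created for illustration"

        names = sorted(ds.ncattrs())   # attribute names only
        values = dict(ds.__dict__)     # name/value pairs
        ds.close()
    return names, values
```

The dimensions, variables, groups, file_format and path instance variables are deliberately left untouched here, since they are read-only.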

Instance Methods
  __delattr__(...)
x.__delattr__('name') <==> del x.name
  __getattribute__(...)
x.__getattribute__('name') <==> x.name
  __init__(...)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
  __new__(T, S, ...)
  __setattr__(...)
x.__setattr__('name', value) <==> x.name = value
  close(...)
Close the Dataset.
  createDimension(...)
Creates a new dimension with the given dimname and size.
  createGroup(...)
Creates a new Group with the given groupname.
  createUserType(...)
Creates a new user-defined data type of type usertype, with a base data type of base_datatype and a name usertype_name.
  createVariable(...)
Creates a new variable with the given varname, datatype, and dimensions.
  ncattrs(...)
Returns the netCDF global attribute names for this Dataset or Group as a list.
  renameDimension(...)
Renames a Dimension named oldname to newname.
  renameVariable(...)
Renames a Variable named oldname to newname.
  set_fill_off(...)
Sets the fill mode for a Dataset open for writing to off.
  set_fill_on(...)
Sets the fill mode for a Dataset open for writing to on.
  sync(...)
Writes all buffered data in the Dataset to the disk file.

Inherited from object: __hash__, __reduce__, __reduce_ex__, __repr__, __str__


Class Variables
  _grpid = <member '_grpid' of 'netCDF4.Dataset' objects>
  parent = <member 'parent' of 'netCDF4.Dataset' objects>

Instance Variables
  dimensions = <member 'dimensions' of 'netCDF4.Dataset' objects>
The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.
  file_format = <member 'file_format' of 'netCDF4.Dataset' objects>
The file_format attribute describes the netCDF file format version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC or NETCDF3_64BIT.
  groups = <member 'groups' of 'netCDF4.Dataset' objects>
The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF file).
  path = <member 'path' of 'netCDF4.Dataset' objects>
The path attribute shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes).
  variables = <member 'variables' of 'netCDF4.Dataset' objects>
The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.

Properties

Inherited from object: __class__


Method Details

__delattr__(...)

 
x.__delattr__('name') <==> del x.name
Overrides: object.__delattr__

__getattribute__(...)

 
x.__getattribute__('name') <==> x.name
Overrides: object.__getattribute__

__init__(...)
(Constructor)

 
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
Overrides: object.__init__

__new__(T, S, ...)

 
Returns:
a new object with type S, a subtype of T

Overrides: object.__new__

__setattr__(...)

 
x.__setattr__('name', value) <==> x.name = value
Overrides: object.__setattr__

close(...)

 

Close the Dataset.

close()

createDimension(...)

 

Creates a new dimension with the given dimname and size.

createDimension(dimname, size=None)

size must be a positive integer or None, which stands for "unlimited" (default is None). The return value is the Dimension class instance describing the new dimension. To determine the current maximum size of the dimension, use the len function on the Dimension instance. To determine if a dimension is 'unlimited', use the isunlimited() method of the Dimension instance.
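The following sketch creates one unlimited and one fixed-size dimension and inspects them with len and isunlimited(). It assumes the netCDF4 package is installed; the function name and the dimension names time and lat are illustrative.

```python
import os
import tempfile


def dimensions_demo():
    """Create an unlimited and a fixed dimension and report their state."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        ds = Dataset(os.path.join(tmp, "dims.nc"), mode="w")

        time = ds.createDimension("time", None)  # None means unlimited
        lat = ds.createDimension("lat", 73)      # fixed size

        summary = {
            "names": sorted(ds.dimensions.keys()),
            "time_unlimited": time.isunlimited(),
            "lat_unlimited": lat.isunlimited(),
            "time_len": len(time),  # current size; no data written yet
            "lat_len": len(lat),
        }
        ds.close()
    return summary
```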

createGroup(...)

 

Creates a new Group with the given groupname.

createGroup(groupname)

The return value is a Group class instance describing the new group.
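As an illustration, groups can be nested by calling createGroup on a Group instance, and each group reports its location through the path attribute. This sketch assumes the netCDF4 package is installed; the function name and the group names forecasts and model1 are hypothetical.

```python
import os
import tempfile


def groups_demo():
    """Create a two-level group hierarchy and collect the group paths."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        root = Dataset(os.path.join(tmp, "groups.nc"), mode="w")

        forecasts = root.createGroup("forecasts")
        model1 = forecasts.createGroup("model1")  # Group inherits createGroup

        paths = (root.path, forecasts.path, model1.path)
        children = (sorted(root.groups.keys()), sorted(forecasts.groups.keys()))
        root.close()
    return paths, children
```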

createUserType(...)

 

Creates a new user-defined data type of type usertype, with a base data type of base_datatype and a name usertype_name.

createUserType(base_datatype, usertype, usertype_name)

The new datatype may be passed to the createVariable method to create new variables with this datatype in this Dataset or Group. The return value is a UserType class instance.

Parameters:

base_datatype - Base data type (the data type that the user-defined data type is composed of). For usertype='vlen', can be one of 'f4' (32-bit floating point), 'f8' (64-bit floating point), 'i4' (32-bit signed integer), 'i2' (16-bit signed integer), 'i8' (64-bit signed integer), 'i1' (8-bit signed integer), 'u1' (8-bit unsigned integer), 'u2' (16-bit unsigned integer), 'u4' (32-bit unsigned integer), 'u8' (64-bit unsigned integer), or 'S1' (single-character string). For usertype='compound', the base_datatype argument must be a list of 3-element tuples describing the type of each member of the compound type. Each 3-tuple must contain a string giving the name of the member, a string describing the primitive data type of the member ('i4', 'f8', etc.; 'S' is not allowed) and a tuple describing the member's shape. The same format can be used to create a dtype descriptor for a numpy record array.

usertype - The type of user-defined data type (either 'vlen' or 'compound'; 'opaque' and 'enum' are not yet supported). In netCDF 4 it is possible to have nested user-defined data types (e.g. compound types composed of vlens), but this is not yet supported. All user-defined data types must consist of collections of fixed-size primitive data types (as specified by the base_datatype argument).

usertype_name - a Python string containing a description of the user-defined data type.
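To illustrate the list-of-3-tuples format used for compound types, the sketch below builds a numpy record dtype from such a list, as the base_datatype description says is possible. It assumes numpy is installed and does not call createUserType itself; the function name and the member names position and counts are hypothetical.

```python
def compound_spec_demo():
    """Build a numpy record dtype from a compound-type member list."""
    import numpy  # requires the numpy package

    # Each entry: (member name, primitive type string, member shape).
    spec = [
        ("position", "f8", (3,)),
        ("counts", "i4", (2,)),
    ]
    record_type = numpy.dtype(spec)

    # A record array with four elements of this type.
    records = numpy.zeros(4, dtype=record_type)
    return record_type.names, records["position"].shape, records["counts"].shape
```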

createVariable(...)

 

Creates a new variable with the given varname, datatype, and dimensions. If dimensions are not given, the variable is assumed to be a scalar.

createVariable(varname, datatype, dimensions=(), zlib=True, complevel=6, shuffle=True, fletcher32=False, chunking='seq', least_significant_digit=None, fill_value=None)

The datatype can either be an instance of UserType (if the Variable is to have a user-defined data type) or a string with the same meaning as the dtype.str attribute of arrays in module numpy (if the Variable is to have one of the primitive data types). Supported primitive data types are: 'S1' (NC_CHAR), 'i1' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' (NC_SHORT), 'u2' (NC_USHORT), 'i4' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' (NC_FLOAT), 'f8' (NC_DOUBLE) and 'S' (NC_STRING).

Data from netCDF variables of a primitive data type are presented to python as numpy arrays with the corresponding data type, except for variables with datatype='S' (variable-length strings). Variables containing variable-length strings are presented to python as numpy object arrays. Numpy arrays of arbitrary python objects can be stored in variables with datatype='S'; if an object is not a python string, it is converted to one using the python cPickle module. Pickle strings are automatically converted back into python objects when they are read back in from the netCDF file.

Data from netCDF variables with a user-defined data type are presented to python as numpy object (for 'vlen') or record (for 'compound') arrays. See the docstrings for UserType for more information on user-defined data types.

In netCDF 4 it is possible to have nested user-defined data types (e.g. compound types composed of vlens), but this is not yet supported. All user-defined data types must consist of collections of fixed-size primitive data types (no 'S' allowed).

dimensions must be a tuple containing dimension names (strings) that have been defined previously using createDimension. The default value is an empty tuple, which means the variable is a scalar.

If the optional keyword zlib is True, the data will be compressed in the netCDF file using gzip compression (default True).

The optional keyword complevel is an integer between 1 and 9 describing the level of compression desired (default 6).

If the optional keyword shuffle is True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression.

If the optional keyword fletcher32 is True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

If the optional keyword chunking is 'seq' (default), HDF5 chunk sizes are set to favor sequential access. If chunking='sub', chunk sizes are set to favor subsetting equally in all dimensions.

The optional keyword fill_value can be used to override the default netCDF _FillValue (the value that the variable gets filled with before any data is written to it). If fill_value is set to False, then the variable is not pre-filled.

If the optional keyword parameter least_significant_digit is specified, variable data will be truncated (quantized). This produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From http://www.cdc.noaa.gov/cdc/conventions/cdc_netcdf_standard.shtml: "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is None, or no quantization, or 'lossless' compression.
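The quantization formula above can be sketched in plain Python. How bits is chosen is an assumption here: the sketch takes the smallest bits for which 2**-bits does not exceed 10**-least_significant_digit, which reproduces bits=4 for least_significant_digit=1. The built-in round stands in for numpy.around so the example runs without numpy, and the function names are hypothetical.

```python
import math


def quantization_scale(least_significant_digit):
    """Return scale = 2**bits, with 2**-bits <= 10**-least_significant_digit."""
    bits = math.ceil(math.log2(10.0 ** least_significant_digit))
    return 2.0 ** bits


def quantize(values, least_significant_digit):
    """Apply round(scale * value) / scale to each value in a sequence."""
    scale = quantization_scale(least_significant_digit)
    return [round(scale * value) / scale for value in values]


# least_significant_digit=1 gives bits=4, so scale is 16.
print(quantization_scale(1))            # 16.0
print(quantize([3.14159, 2.71828], 1))  # [3.125, 2.6875]
```

Both quantized values differ from the originals by less than 0.1, which is the precision requested by least_significant_digit=1.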

The return value is the Variable class instance describing the new variable.

A list of names corresponding to netCDF variable attributes can be obtained with the Variable method ncattrs(). A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Variable instance.

Variable instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the Variable instance. A Variable instance has seven standard attributes: dimensions, dtype, dtype_base, shape, least_significant_digit, usertype and usertype_name. Application programs should never modify these attributes. The dimensions attribute is a tuple containing the names of the dimensions associated with this variable. The dtype attribute describes the variable's data type. It is either a string describing one of the primitive data types (i4, f8, S1, etc.) or an instance of the class UserType. The dtype_base attribute (only relevant if dtype is an instance of UserType) is a string describing the primitive data type of which the user-defined data type is composed. The shape attribute is a tuple describing the current sizes of all the variable's dimensions. The least_significant_digit attribute describes the power of ten of the smallest decimal place in the data that contains a reliable value. Data is truncated to this decimal place when it is assigned to the Variable instance. If None, the data is not truncated. The usertype attribute describes the type of user-defined data type the Variable belongs to (False for a primitive data type, 'vlen' for a variable-length array, 'compound' for a compound data type). The usertype_name attribute is a Python string describing the user-defined data type (None if usertype is False).
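A minimal sketch of creating a variable along an unlimited dimension, attaching an attribute, and writing and reading data by slicing. It assumes the netCDF4 package is installed and relies on the default keyword values; the function name, the variable name temperature and the units attribute are illustrative.

```python
import os
import tempfile


def variable_demo():
    """Create a 1-D variable, write three values and read them back."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        ds = Dataset(os.path.join(tmp, "var.nc"), mode="w")
        ds.createDimension("time", None)  # unlimited dimension

        temp = ds.createVariable("temperature", "f4", ("time",))
        temp.units = "K"                      # variable attribute
        temp[0:3] = [280.0, 281.5, 279.25]    # grows the unlimited dimension

        result = {
            "dimensions": tuple(temp.dimensions),
            "shape": tuple(temp.shape),
            "attrs": list(temp.ncattrs()),
            "values": [float(v) for v in temp[:]],
            "time_len": len(ds.dimensions["time"]),
        }
        ds.close()
    return result
```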

ncattrs(...)

 

Returns the netCDF global attribute names for this Dataset or Group as a list.

ncattrs()

renameDimension(...)

 

Renames a Dimension named oldname to newname.

renameDimension(oldname, newname)

renameVariable(...)

 

Renames a Variable named oldname to newname.

renameVariable(oldname, newname)
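Both renameDimension and renameVariable can be sketched together. This assumes the netCDF4 package is installed; the function name and the old and new names are illustrative.

```python
import os
import tempfile


def rename_demo():
    """Rename a dimension and a variable and list the resulting names."""
    from netCDF4 import Dataset  # requires the netCDF4 package

    with tempfile.TemporaryDirectory() as tmp:
        ds = Dataset(os.path.join(tmp, "rename.nc"), mode="w")
        ds.createDimension("t", 4)
        ds.createVariable("temp", "f4", ("t",))

        ds.renameDimension("t", "time")
        ds.renameVariable("temp", "temperature")

        # The dimensions and variables dictionaries use the new names.
        names = (sorted(ds.dimensions.keys()), sorted(ds.variables.keys()))
        ds.close()
    return names
```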

set_fill_off(...)

 

Sets the fill mode for a Dataset open for writing to off.

set_fill_off()

This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.

set_fill_on(...)

 

Sets the fill mode for a Dataset open for writing to on.

set_fill_on()

This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's _FillValue attribute, but it is usually sufficient to use the netCDF default _FillValue (defined separately for each variable type). The default behavior of the netCDF library corresponds to set_fill_on. Data equal to the _FillValue indicate that the variable was created but never written to.

sync(...)

 

Writes all buffered data in the Dataset to the disk file.

sync()

Class Variable Details

_grpid

None
Value:
<member '_grpid' of 'netCDF4.Dataset' objects>                         
      

parent

None
Value:
<member 'parent' of 'netCDF4.Dataset' objects>                         
      

Instance Variable Details

dimensions

The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.
Value:
<member 'dimensions' of 'netCDF4.Dataset' objects>                     
      

file_format

The file_format attribute describes the netCDF file format version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC or NETCDF3_64BIT. This module can read all formats, but only writes NETCDF4. To write files in the other formats, use the netCDF4_classic module.
Value:
<member 'file_format' of 'netCDF4.Dataset' objects>                    
      

groups

The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF file).
Value:
<member 'groups' of 'netCDF4.Dataset' objects>                         
      

path

The path attribute shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A Dataset instance is the root group, so its path is simply '/'.
Value:
<member 'path' of 'netCDF4.Dataset' objects>                           
      

variables

The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.
Value:
<member 'variables' of 'netCDF4.Dataset' objects>