Module netCDF4
Introduction
Python interface to the netCDF version 4 library. netCDF version 4 has many features not found in
earlier versions of the library and is implemented on top of HDF5. This
module can read files created with netCDF versions 2 and 3, but writes
files which are only compatible with netCDF version 4. To create files
which are compatible with netCDF 3 clients use the companion netCDF4_classic
module. The API is modelled after Scientific.IO.NetCDF, and should be familiar to users
of that module.
Many new features of netCDF 4 are implemented, such as multiple
unlimited dimensions, groups and zlib data compression. All the new
primitive data types (such as 64-bit and unsigned integer types) are
implemented, including variable-length strings (NC_STRING). The 'vlen'
and 'compound' user-defined data types are supported. Vlen
types are variable-length, or 'ragged', arrays, while compound types are
similar to C structs (and numpy record arrays). Compound type support
is not complete, since only compound types containing primitive data
types (and not user-defined data types) can be read or written with
this module. In other words, you can't yet use this module to save
nested record arrays (record arrays with fields that are record
arrays), although you can save any record array containing fields with
any of the 'standard' fixed-size data types ('f4', 'f8', 'i1',
'i2', 'i4', 'i8', 'u1', 'u2', 'u4', 'u8' and 'S1').
Download
Requires
Install
- install the requisite python modules and C libraries (see above).
- set the HDF5_DIR environment variable to point to where HDF5 is installed
  (the libs in $HDF5_DIR/lib, the headers in $HDF5_DIR/include).
- set the NETCDF4_DIR environment variable to point to where the netCDF
  version 4 library and headers are installed.
- run 'python setup.py install'.
- run some of the tests in the 'test' directory.
Tutorial
1) Creating/Opening/Closing a netCDF file
To create a netCDF file from python, you simply call the Dataset
constructor. This is also the method used to open an existing netCDF
file. If the file is open for write access (w, r+ or a), you may write
any type of data including new dimensions, groups, variables and
attributes. netCDF files come in several flavors (NETCDF3_CLASSIC,
NETCDF3_64BIT, NETCDF4_CLASSIC, and NETCDF4). The first two
flavors are supported by version 3 of the netCDF library.
NETCDF4_CLASSIC files use the version 4 disk format
(HDF5), but do not use any features not found in the version 3 API.
They can be read by netCDF 3 clients only if they have been relinked
against the netCDF 4 library. They can also be read by HDF5 clients.
NETCDF4 files use the version 4 disk format (HDF5) and
use the new features of the version 4 API. The netCDF4
module can read files with any of these formats, but only writes
NETCDF4 formatted files. To write NETCDF4_CLASSIC, NETCDF3_CLASSIC
or NETCDF3_64BIT formatted files, use the netCDF4_classic
module. To see how a given file is formatted, you can examine
the file_format Dataset attribute. Closing the netCDF file is
accomplished via the close method of the Dataset instance.
Here's an example:
>>> import netCDF4
>>> rootgrp = netCDF4.Dataset('test.nc', 'w')
>>> print rootgrp.file_format
NETCDF4
>>>
>>> rootgrp.close()
2) Groups in a netCDF file
netCDF version 4 added support for organizing data in hierarchical
groups, which are analogous to directories in a filesystem. Groups
serve as containers for variables, dimensions and attributes, as well
as other groups. A netCDF4.Dataset creates a
special group, called the 'root group', which is similar to the root
directory in a unix filesystem. To create Group instances, use
the createGroup method of a Dataset or Group instance.
createGroup takes a single argument, a python string
containing the name of the new group. The new Group instances
contained within the root group can be accessed by name using the
groups dictionary attribute of the Dataset instance.
>>> rootgrp = netCDF4.Dataset('test.nc', 'a')
>>> fcstgrp = rootgrp.createGroup('forecasts')
>>> analgrp = rootgrp.createGroup('analyses')
>>> print rootgrp.groups
{'analyses': <netCDF4._Group object at 0x24a54c30>,
'forecasts': <netCDF4._Group object at 0x24a54bd0>}
>>>
Groups can exist within groups in a Dataset, just as
directories exist within directories in a unix filesystem. Each Group
instance has a 'groups' attribute dictionary containing all of the
group instances contained within that group. Each Group instance also
has a 'path' attribute that contains a simulated unix directory path
to that group.
Here's an example that shows how to navigate all the groups in a
Dataset. The function walktree is a Python generator that is used to
walk the directory tree.
>>> fcstgrp1 = fcstgrp.createGroup('model1')
>>> fcstgrp2 = fcstgrp.createGroup('model2')
>>> def walktree(top):
>>>     values = top.groups.values()
>>>     yield values
>>>     for value in top.groups.values():
>>>         for children in walktree(value):
>>>             yield children
>>> print rootgrp.path, rootgrp
>>> for children in walktree(rootgrp):
>>>     for child in children:
>>>         print child.path, child
/ <netCDF4.Dataset object at 0x24a54c00>
/analyses <netCDF4.Group object at 0x24a54c30>
/forecasts <netCDF4.Group object at 0x24a54bd0>
/forecasts/model2 <netCDF4.Group object at 0x24a54cc0>
/forecasts/model1 <netCDF4.Group object at 0x24a54c60>
>>>
3) Dimensions in a netCDF file
netCDF defines the sizes of all variables in terms of dimensions,
so before any variables can be created the dimensions they use must
be created first. A special case, not often used in practice, is that
of a scalar variable, which has no dimensions. A dimension is created
using the createDimension method of a Dataset or Group instance. A
Python string is used to set the name of the dimension, and an
integer value is used to set the size. To create an unlimited
dimension (a dimension that can be appended to), the size value is
set to None. In this example, both the time and level dimensions are
unlimited.
>>> rootgrp.createDimension('level', None)
>>> rootgrp.createDimension('time', None)
>>> rootgrp.createDimension('lat', 73)
>>> rootgrp.createDimension('lon', 144)
All of the Dimension instances are stored in a python
dictionary.
>>> print rootgrp.dimensions
{'lat': <netCDF4.Dimension object at 0x24a5f7b0>,
'time': <netCDF4.Dimension object at 0x24a5f788>,
'lon': <netCDF4.Dimension object at 0x24a5f7d8>,
'level': <netCDF4.Dimension object at 0x24a5f760>}
>>>
Calling the python len function with a Dimension
instance returns the current size of that dimension. The
isunlimited() method of a Dimension instance can be used to determine
if the dimension is unlimited, or appendable.
>>> for dimname, dimobj in rootgrp.dimensions.iteritems():
>>>     print dimname, len(dimobj), dimobj.isunlimited()
lat 73 False
time 0 True
lon 144 False
level 0 True
>>>
Dimension names can be changed using the renameDimension method of
a Dataset or Group instance.
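As a minimal sketch (assuming renameDimension takes the old and new names as its
two arguments; the intermediate name 'height' is purely illustrative, and the
dimension is renamed back so the rest of the tutorial is unaffected):
>>> rootgrp.renameDimension('level','height')
>>> rootgrp.renameDimension('height','level')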
4) Variables in a netCDF file
netCDF variables behave much like python multidimensional array
objects supplied by the numpy module. However, unlike numpy arrays, netCDF4
variables can be appended to along one or more 'unlimited'
dimensions. To create a netCDF variable, use the
createVariable method of a Dataset or Group instance. The
createVariable method has two mandatory arguments, the
variable name (a Python string), and the variable datatype. The
variable's dimensions are given by a tuple containing the dimension
names (defined previously with createDimension). To
create a scalar variable, simply leave out the dimensions keyword.
The variable primitive datatypes correspond to the dtype.str
attribute of a numpy array, and can be one of 'f4'
(32-bit floating point), 'f8' (64-bit floating point),
'i4' (32-bit signed integer), 'i2' (16-bit signed integer),
'i8' (64-bit signed integer), 'i1' (8-bit signed integer),
'u1' (8-bit unsigned integer), 'u2' (16-bit unsigned integer),
'u4' (32-bit unsigned integer), 'u8' (64-bit unsigned integer),
or 'S1' (single-character string). There is also a 'S' datatype for
variable-length strings, which have no corresponding numpy data type (they are
stored in numpy object arrays). Variables of datatype
'S' can be used to store arbitrary python objects, since
each element will be pickled into a string (if it is not already a
string) before being saved in the netCDF file (see section 10 for
more on storing arrays of python objects). Pickled strings will be
automatically un-pickled back into python objects when they are read
back in. There is also support for netCDF user-defined datatypes,
such as compound data types and variable length arrays. To create a
Variable with a user-defined datatype, set the datatype argument to an
instance of the class UserType. See section 9 for more on user-defined
data types. The dimensions themselves are usually also defined as
variables, called coordinate variables. The
createVariable method returns an instance of the Variable class
whose methods can be used later to access and set variable data and
attributes.
>>> times = rootgrp.createVariable('time','f8',('time',))
>>> levels = rootgrp.createVariable('level','i4',('level',))
>>> latitudes = rootgrp.createVariable('latitude','f4',('lat',))
>>> longitudes = rootgrp.createVariable('longitude','f4',('lon',))
>>> # two 4-d variables; 'pressure' is referenced again in later examples
>>> temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',))
>>> pressure = rootgrp.createVariable('pressure','f4',('time','level','lat','lon',))
All of the variables in the Dataset or Group are stored in a Python dictionary, in the same
way as the dimensions:
>>> print rootgrp.variables
{'temp': <netCDF4.Variable object at 0x24a61068>,
'level': <netCDF4.Variable object at 0x245f0f80>,
'longitude': <netCDF4.Variable object at 0x24a61030>,
'pressure': <netCDF4.Variable object at 0x24a610a0>,
'time': <netCDF4.Variable object at 0x245f0458>,
'latitude': <netCDF4.Variable object at 0x245f0fb8>}
>>>
Variable names can be changed using the renameVariable method of
a Dataset instance.
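As a minimal sketch (assuming renameVariable takes the old and new names as its
two arguments; the name 'temperature' is purely illustrative, and the variable is
renamed back so later examples still work):
>>> rootgrp.renameVariable('temp','temperature')
>>> rootgrp.renameVariable('temperature','temp')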
5) Attributes in a netCDF file
There are two types of attributes in a netCDF file, global and
variable. Global attributes provide information about a group, or the
entire dataset, as a whole. Variable attributes provide information about one of
the variables in a group. Global attributes are set by assigning
values to Dataset or Group instance variables. Variable
attributes are set by assigning values to Variable
instance variables. Attributes can be strings, numbers or sequences.
Returning to our example,
>>> import time
>>> rootgrp.description = 'bogus example script'
>>> rootgrp.history = 'Created ' + time.ctime(time.time())
>>> rootgrp.source = 'netCDF4 python module tutorial'
>>> latitudes.units = 'degrees north'
>>> longitudes.units = 'degrees east'
>>> pressure.units = 'hPa'
>>> temp.units = 'K'
>>> times.units = 'hours since January 1, 0001'
>>> times.calendar = 'proleptic_gregorian'
The ncattrs() method of a Dataset, Group or Variable instance
can be used to retrieve the names of all the netCDF attributes. This
method is provided as a convenience, since using the built-in
dir Python function will return a bunch of private
methods and attributes that cannot (or should not) be modified by the
user.
>>> for name in rootgrp.ncattrs():
>>> print 'Global attr', name, '=', getattr(rootgrp,name)
Global attr description = bogus example script
Global attr history = Created Mon Nov 7 10:30:56 2005
Global attr source = netCDF4 python module tutorial
The __dict__
attribute of a Dataset, Group or Variable instance
provides all the netCDF attribute name/value pairs in a python
dictionary:
>>> print rootgrp.__dict__
{'source': 'netCDF4 python module tutorial',
'description': 'bogus example script',
'history': 'Created Mon Nov 7 10:30:56 2005'}
Attributes can be deleted from a netCDF Dataset, Group or Variable using
the python del statement (i.e. del grp.foo removes the attribute
foo from the group grp).
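For example (the throwaway attribute name 'junk' is hypothetical):
>>> rootgrp.junk = 'an attribute that is about to be deleted'
>>> del rootgrp.junk
>>> print 'junk' in rootgrp.ncattrs()
False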
6) Writing data to and retrieving data from a netCDF variable
Now that you have a netCDF Variable instance, how do you put data into it? You
can just treat it like an array and assign data to a slice.
>>> import numpy as NP
>>> latitudes[:] = NP.arange(-90,91,2.5)
>>> print 'latitudes =\n',latitudes[:]
latitudes =
[-90. -87.5 -85. -82.5 -80. -77.5 -75. -72.5 -70. -67.5 -65. -62.5
-60. -57.5 -55. -52.5 -50. -47.5 -45. -42.5 -40. -37.5 -35. -32.5
-30. -27.5 -25. -22.5 -20. -17.5 -15. -12.5 -10. -7.5 -5. -2.5
0. 2.5 5. 7.5 10. 12.5 15. 17.5 20. 22.5 25. 27.5
30. 32.5 35. 37.5 40. 42.5 45. 47.5 50. 52.5 55. 57.5
60. 62.5 65. 67.5 70. 72.5 75. 77.5 80. 82.5 85. 87.5
90. ]
>>>
Unlike numpy array objects, netCDF Variable objects
with unlimited dimensions will grow along those dimensions if you
assign data outside the currently defined range of indices.
>>>
>>> nlats = len(rootgrp.dimensions['lat'])
>>> nlons = len(rootgrp.dimensions['lon'])
>>> print 'temp shape before adding data = ',temp.shape
temp shape before adding data = (0, 0, 73, 144)
>>>
>>> from numpy.random.mtrand import uniform
>>> temp[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons))
>>> print 'temp shape after adding data = ',temp.shape
temp shape after adding data = (5, 10, 73, 144)
>>> # data can also be appended to the 4-d 'pressure' variable
>>> pressure[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons))
>>> print 'levels shape after adding pressure data = ',levels.shape
levels shape after adding pressure data = (10,)
>>>
Note that the size of the levels variable grows when data is
appended along the level dimension of the variable temp, even though
no data has yet been assigned to levels.
Time coordinate values pose a special challenge to netCDF users.
Most metadata standards (such as CF and COARDS) specify that time
should be measured relative to a fixed date using a certain calendar,
with units specified like hours since YY-MM-DD hh:mm:ss.
These units can be awkward to deal with, without a utility to convert
the values to and from calendar dates. A module called netcdftime.netcdftime
is provided with this package to do just that. Here's an example of how
it can be used:
>>>
>>> from datetime import datetime, timedelta
>>> from netcdftime import utime
>>> cdftime = utime(times.units,calendar=times.calendar,format='%B %d, %Y')
>>> dates = [datetime(2001,3,1)+n*timedelta(hours=12) for n in range(temp.shape[0])]
>>> times[:] = cdftime.date2num(dates)
>>> print 'time values (in units %s): ' % times.units+'\n',times[:]
time values (in units hours since January 1, 0001):
[ 17533056. 17533068. 17533080. 17533092. 17533104.]
>>>
>>> dates = cdftime.num2date(times[:])
>>> print 'dates corresponding to time values:\n',dates
dates corresponding to time values:
[2001-03-01 00:00:00 2001-03-01 12:00:00 2001-03-02 00:00:00
2001-03-02 12:00:00 2001-03-03 00:00:00]
>>>
Values of time in the specified units and calendar are converted
to and from python datetime instances using the
num2date and date2num methods of the utime class. See the
netcdftime.netcdftime documentation for more details.
7) Efficient compression of netCDF variables
Data stored in netCDF Variable objects is compressed on disk by default.
The parameters for the compression are determined by the
zlib, complevel and shuffle keyword arguments to the createVariable
method. The default values are zlib=True, complevel=6
and shuffle=True. To turn off compression, set
zlib=False. complevel regulates the speed
and efficiency of the compression (1 being fastest, but lowest
compression ratio, 9 being slowest but best compression ratio).
shuffle=False will turn off the HDF5 shuffle filter,
which de-interlaces a block of data by reordering the bytes. The
shuffle filter can significantly improve compression ratios. Setting
the fletcher32 keyword argument to createVariable to True (it's
False by default) enables the Fletcher32 checksum
algorithm for error detection.
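As a sketch of how these keyword arguments might be combined (the variable names
and the settings shown here are purely illustrative and are not part of the
tutorial dataset):
>>> # hypothetical variable created with compression turned off
>>> temp_raw = rootgrp.createVariable('temp_raw','f4',('time','level','lat','lon',),
    zlib=False)
>>> # hypothetical variable with maximum zlib compression, the shuffle filter
>>> # and a fletcher32 checksum
>>> temp_packed = rootgrp.createVariable('temp_packed','f4',('time','level','lat','lon',),
    zlib=True, complevel=9, shuffle=True, fletcher32=True)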
If your data only has a certain number of digits of precision (say
for example, it is temperature data that was measured with a
precision of 0.1 degrees), you can dramatically improve compression
by quantizing (or truncating) the data using the
least_significant_digit keyword argument to
createVariable. The least significant digit is the power
of ten of the smallest decimal place in the data that is a reliable
value. For example if the data has a precision of 0.1, then setting
least_significant_digit=1 will cause the data to be
quantized using NP.around(scale*data)/scale, where scale = 2**bits,
and bits is determined so that a precision of 0.1 is retained (in
this case bits=4). Effectively, this makes the compression 'lossy'
instead of 'lossless', that is some precision in the data is
sacrificed for the sake of disk space.
In our example, try replacing the line
>>> temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',))
with
>>> temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',),
least_significant_digit=3)
and see how much smaller the resulting file is.
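One way to see the difference is to write the same data twice, once with and once
without least_significant_digit, and compare the resulting file sizes with
os.path.getsize (the filenames here are hypothetical):
>>> import os
>>> print os.path.getsize('test_lossless.nc'), os.path.getsize('test_lossy.nc')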
8) Converting netCDF 3 files to netCDF 4 files (with compression)
A command line utility (nc3tonc4) is provided which
can convert a netCDF 3 file (in NETCDF3_CLASSIC or
NETCDF3_64BIT format) to a NETCDF4_CLASSIC
file, optionally unpacking variables packed as short integers (with
scale_factor and add_offset) to floats, and adding zlib compression
(with the HDF5 shuffle filter and fletcher32 checksum). Data may also
be quantized (truncated) to a specified precision to improve
compression.
>>> os.system('nc3tonc4 -h')
nc3tonc4 [-h] [-o] [--zlib=(0|1)] [--complevel=(1-9)] [--shuffle=(0|1)]
[--fletcher32=(0|1)] [--unpackshort=(0|1)]
[--quantize=var1=n1,var2=n2,..] netcdf3filename netcdf4filename
-h -- Print usage message.
-o -- Overwrite destination file
(default is to raise an error if output file already exists).
--zlib=(0|1) -- Activate (or disable) zlib compression (default is activate).
--complevel=(1-9) -- Set zlib compression level (6 is default).
--shuffle=(0|1) -- Activate (or disable) the shuffle filter
(active by default).
--fletcher32=(0|1) -- Activate (or disable) the fletcher32 checksum
(not active by default).
--unpackshort=(0|1) -- Unpack short integer variables to float variables
using scale_factor and add_offset netCDF
variable attributes (active by default).
--quantize=(comma separated list of "variable name=integer" pairs) --
Truncate the data in the specified variables to a given decimal precision.
For example, 'speed=2, height=-2, temp=0' will cause the variable
'speed' to be truncated to a precision of 0.01,
'height' to a precision of 100 and 'temp' to 1.
This can significantly improve compression. The default
is not to quantize any of the variables.
If --zlib=1, the resulting NETCDF4_CLASSIC
file will take up less disk space than the original netCDF 3 file
(especially if the --quantize option is used), and will be readable by
netCDF 3 clients as long as they have been linked against the netCDF 4
library.
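A typical invocation might look like this (the filenames and the quantized
variable name are hypothetical; the flags are those listed in the usage message
above):
>>> os.system('nc3tonc4 -o --complevel=6 --shuffle=1 --quantize=temp=2 old3.nc new4.nc')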
9) Beyond homogeneous arrays of a fixed type - User-defined datatypes
User-defined data types make it easier to store data in a netCDF 4 file
that does not fit well into regular arrays of data with a homogeneous
type. NetCDF 4 supports compound types, variable length types, opaque
types and enum types. Currently, only the variable length (or
'vlen') type and the 'compound' type are supported.
A user-defined data type is created using the
createUserType method of a Dataset or Group instance. This
method returns an instance of the UserType class,
and takes 3 arguments: the base data type, the type of user-defined
data type ('vlen' or 'compound'), and an
identification string. The base data type for a 'vlen'
must be one of the fixed-size primitive data types ('S'
is not allowed). The base data type for a 'compound' is
a list of 3 element tuples. Each 3-tuple describes the type of one
member of the compound type, and contains a name, a fixed-size
primitive data type, and a shape. The UserType instance
may then be passed to createVariable (instead of a
string describing one of the primitive data types) to create a Variable with
that user-defined data type. For example,
>>> vleni4 = rootgrp.createUserType('i4', 'vlen', 'vlen_i4')
>>> ragged = rootgrp.createVariable('ragged', vleni4, ('lat','lon'))
creates a Variable which is a variable-length, or 'ragged', array of
4-byte integers, with dimensions lat and lon.
To fill the variable length array with data, create a numpy object
array of integer arrays and assign it to the variable with a
slice.
>>> import random
>>> data = NP.empty(nlats*nlons,'O')
>>> for n in range(nlats*nlons):
>>>     data[n] = NP.arange(random.randint(1,10))+1
>>> data = NP.reshape(data,(nlats,nlons))
>>> ragged[:] = data
>>> print 'ragged array variable =\n',ragged[0:3,0:3]
ragged array variable =
[[[1] [1 2 3 4 5 6 7] [1 2]]
[[1 2 3 4] [1 2 3 4 5 6 7 8] [1]]
[[1 2 3 4 5 6 7] [1 2 3] [1 2 3 4 5 6 7]]]
Compound types are similar to C structs. They can be used to
represent table-like structures composed of different primitive data
types (the netCDF4 library supports nested compound types, but this
module only supports fixed-size primitive data types within compound
types). For example, compound types might be useful for representing
multiple parameter values at each point on a grid, or at each time
and space location for scattered (point) data. You can then access
all the information for a point by reading one variable, instead of
reading different parameters from different variables. Variables of
compound type correspond directly to numpy record arrays. Here's a
simple example using a compound type to represent meteorological
observations at stations:
>>> # create an unlimited dimension to hold the station observations
>>> rootgrp.createDimension('station',False)
>>> # define a compound data type to hold the observations at each station:
>>> # latitude, longitude, surface pressure, 10-level temperature and
>>> # pressure soundings, and an 80-character station name
>>> datatype = [('latitude', 'f4',1), ('longitude', 'f4',1),
>>> ('sfc_press','i4',1),
>>> ('temp_sounding','f4',10),('press_sounding','i4',10),
>>> ('location_name','S1',80)]
>>> # create the user-defined compound data type from that description
>>> table = rootgrp.createUserType(datatype,'compound','station_data')
>>> # create a variable of the compound type along the 'station' dimension
>>> statdat = rootgrp.createVariable('station_obs', table, ('station',))
>>> # fill a numpy record array with data for the first station
>>> ra = NP.empty(1,statdat.dtype_base)
>>> ra['latitude'] = 40.
>>> ra['longitude'] = -105.
>>> ra['sfc_press'] = 818
>>> ra['temp_sounding'] = (280.3,272.,270.,269.,266.,258.,254.1,250.,245.5,240.)
>>> ra['press_sounding'] = range(800,300,-50)
>>> # the station name must be stored as an array of single characters,
>>> # so define a helper that converts a python string into a numpy 'S1'
>>> # array of length NUMCHARS
>>> def stringtoarr(string,NUMCHARS):
>>>     # copy the characters of the string into a fixed-length 'S1' array
>>>     arr = NP.zeros(NUMCHARS,'S1')
>>>     arr[0:len(string)] = tuple(string)
>>>     return arr
>>> ra['location_name'] = stringtoarr('Boulder, Colorado, USA',80)
>>> # assign the record array to the first element of the compound type variable
>>> statdat[0] = ra
>>> # data for a second station can also be assigned directly as a tuple
>>> statdat[1] = (40.78,-73.99,1002,
>>> (290.2,282.5,279.,277.9,276.,266.,264.1,260.,255.5,243.),
>>> range(900,400,-50),stringtoarr('New York, New York, USA',80))
This module doesn't support attributes of compound type. To assign
an attribute like units to each member of the compound type, do the
following:
- create a python dict with key/value pairs representing the name of
  each compound type member and its units.
- convert the dict to a string using the repr function.
- use that string as a variable attribute.
When this attribute is read in, it can be converted back to a
python dictionary using the eval function. It can also be
converted into hash-like objects in other languages (including C),
since the string closely resembles JSON (JavaScript Object Notation).
JSON is a lightweight, language-independent data serialization
format.
>>> units_dict = {'latitude': 'degrees north', 'longitude': 'degrees east',
'sfc_press': 'Pascals', 'temp_sounding': 'Kelvin',
'press_sounding': 'Pascals','location_name': None}
>>> statdat.units = repr(units_dict)
>>> # convert the attribute string back into a python dictionary
>>> statdat_units = eval(statdat.units)
>>> # print the data in the compound type variable, along with the units
>>> print 'data in a variable of compound type:\n----'
>>> for data in statdat[:]:
>>>     for item in statdat.dtype_base:
>>>         name = item[0]
>>>         type = item[1]
>>>         if type == 'S1':
>>>             print name,': value =',data[name].tostring(),'units =',statdat_units[name]
>>>         else:
>>>             print name,': value =',data[name],'units =',statdat_units[name]
>>>     print '----'
data in a variable of compound type:
----
latitude : value = 40.0 units = degrees north
longitude : value = -105.0 units = degrees east
sfc_press : value = 818 units = Pascals
temp_sounding : value = [ 280.29998779 272. 270. 269. 266.
258. 254.1000061 250. 245.5 240. ] units = Kelvin
press_sounding : value = [800 750 700 650 600 550 500 450 400 350] units = Pascals
location_name : value = Boulder, Colorado, USA units = None
----
latitude : value = 40.7799987793 units = degrees north
longitude : value = -73.9899978638 units = degrees east
sfc_press : value = 1002 units = Pascals
temp_sounding : value = [ 290.20001221 282.5 279. 277.8999939 276.
266. 264.1000061 260. 255.5 243. ] units = Kelvin
press_sounding : value = [900 850 800 750 700 650 600 550 500 450] units = Pascals
location_name : value = New York, New York, USA units = None
----
10) Storing arrays of arbitrary python objects using the 'S' datatype
Variables with datatype 'S' can be used to store variable-length
strings, or python objects. Here's an example.
>>> strvar = rootgrp.createVariable('strvar','S',('level',))
Typically, a string variable is used to hold variable-length
strings. They are represented in python as numpy object arrays
containing python strings. Below an object array is filled with
random python strings with random lengths between 2 and 12
characters.
>>> chars = '1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> data = NP.empty(10,'O')
>>> for n in range(10):
>>>     stringlen = random.randint(2,12)
>>>     data[n] = ''.join([random.choice(chars) for i in range(stringlen)])
Now, we replace the first element of the object array with a
python dictionary.
>>> data[0] = {'spam':1,'eggs':2,'ham':False}
When the data is assigned to the string variable, elements which
are not python strings are converted to strings using the python
cPickle module.
>>> strvar[:] = data
When the data is read back in from the netCDF file, strings which
are determined to be pickled python objects are unpickled back into
objects.
>>> print 'string variable with embedded python objects:\n',strvar[:]
string variable with embedded python objects:
[{'eggs': 2, 'ham': False, 'spam': 1} QnXTY8B nbt4zisk pMHIn1F wl3suHW0OquZ
wn5kxEzgE nk AGBL pe kay81]
Attributes can also be python objects, although the rules for
whether they are saved as pickled strings are different. Attributes
are converted to numpy arrays before being saved to the netCDF file.
If the attribute is cast to an object array by numpy, it is pickled
and saved as a text attribute (and then automatically unpickled when
the attribute is accessed). So, an attribute which is a list of
integers will be saved as an array of integers, while an attribute
that is a python dictionary will be saved as a pickled string, then
unpickled automatically when it is retrieved. For example,
>>> from datetime import datetime
>>> strvar.timestamp = datetime.now()
>>> print strvar.timestamp
2006-02-11 13:26:27.238042
Note that data saved as pickled strings will not be very useful if
the data is to be read by a non-python client (the data will appear
to the client as an ugly looking binary string). A more portable (and
human-readable) way of saving simple data structures like
dictionaries and lists is to serialize them into strings using a
cross-language interchange format such as JSON or YAML. An example of
this is given in the discussion of compound data types in section 9.
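As a sketch, a small dictionary could be stored as a JSON string attribute and
read back with a JSON library (the third-party simplejson package and the
attribute name 'flavors' are assumptions here; any JSON encoder would do):
>>> import simplejson
>>> strvar.flavors = simplejson.dumps({'spam': 1, 'eggs': 2, 'ham': False})
>>> flavors = simplejson.loads(strvar.flavors)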
All of the code in this tutorial is available in
examples/tutorial.py, along with several other examples. Unit tests
are in the test directory.
Contact:
Jeffrey Whitaker <jeffrey.s.whitaker@noaa.gov>
Copyright:
2006 by Jeffrey Whitaker.
License:
Permission to use, copy, modify, and distribute this software
and its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both the copyright notice and this permission
notice appear in supporting documentation. THE AUTHOR DISCLAIMS ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
Classes
Dataset -- A netCDF Dataset is a collection of dimensions, groups, variables and attributes.
Dimension -- A netCDF Dimension is used to describe the coordinates of a Variable.
Group -- Groups define a hierarchical namespace within a netCDF file.
UserType -- A UserType instance is used to describe some of the new data types supported in netCDF 4.
Variable -- A netCDF Variable is used to read and write netCDF data.
Functions
_get_att(...) -- Private function to get an attribute value given its name
_get_att_names(...) -- Private function to get all the attribute names in a group
_get_dims(...) -- Private function to create Dimension instances for all the dimensions in a Group or Dataset
_get_format(...) -- Private function to get the netCDF file format
_get_grps(...) -- Private function to create Group instances for all the groups in a Group or Dataset
_get_vars(...) -- Private function to create Variable instances for all the variables in a Group or Dataset
_set_att(...) -- Private function to set an attribute name/value pair
_set_default_format(...) -- Private function to set the netCDF file format
Variables
__version__ = '0.6.2'
_key = 'S'
_nctonptype = {1: 'i1', 2: 'S1', 3: 'i2', 4: 'i4', 5: 'f4', 6: 'f8', 7: 'u1', 8: 'u2', ...}
_nptonctype = {'B': 1, 'S': 12, 'S1': 2, 'b': 1, 'c': 2, 'd': 6, 'f': 5, 'f4': 5, ...}
_npversion = '1.0.1'
_private_atts = ['_grpid', '_grp', '_varid', 'groups', 'dimensions', 'variables', 'dtype', 'file_format', ...]
_supportedtypes = ['i8', 'f4', 'u8', 'i1', 'u4', 'S1', 'i2', 'u1', 'i4']