Version 5.11
11 September 2013
The CSIRO interface is used in a matlab session to retrieve data from either a local netCDF file or via an OPeNDAP/DODS server. The same commands are used for either type of access in almost every case (some small differences are discussed here).
The interface has options for automatically handling missing values, scalefactors, and the permutation of hyperslabs. It also has a simple syntax.
The method of installing the software is described here.
There are other ways of accessing netCDF files and OPeNDAP/DODS data and links to some of them are given here.
Basic functions
There are ten basic functions which are commonly used. They allow users to access locally held netCDF files or to retrieve data via an OPeNDAP/DODS server.
If dealing with a netCDF file then the first argument to each function will be a file name. For example, '/home/netcdf-data/sst_cac_recon_ltm.nc' is a full file name (including a path) to a test netCDF file at the CSIRO Marine Labs. The same file is available via an OPeNDAP/DODS server with the url 'http://www.marine.csiro.au/dods/nph-dods/dods-data/climatology-netcdf/sst_cac_recon_ltm.nc'. In the examples that follow we will use this test file.
The basic functions are:
attnc : reads attributes from the dataset
ddsnc : returns information about the dataset
get_csiro_access_functions : finds the access functions used in the matlab/netcdf interface
getnc : reads variables from the dataset
inqnc/enqnc : interactive inquiry of the dataset
putnc : puts variables into the dataset
set_csiro_access_functions : sets the default access functions used in the matlab/netcdf interface
set_getnc_repeats : specifies that if a call to getnc fails it should be repeated
whatnc : lists netCDF files in current directory
For a more detailed description of the functions and some examples just follow the links. For an introduction it is suggested that you look at the functions in the order that they are listed above. Of course documentation is also available using the matlab help facility.
Auxiliary functions
There are also some auxiliary functions which are listed below. The higher-level ones are in the main directory and you can use the matlab help facility for a more detailed description of them. You will not usually want to call these directly, although some of the time-related functions may be useful on occasion. There are also other functions in the private directory and these will be very rarely needed.
Those in the main directory:
attnc returns selected attributes of a netcdf file or DODS/OPeNDAP dataset. The general form of an attnc call is:
[att_val, att_name_list, access_function] = attnc(file, var_name, att_name, verbose, preserve_type, access_function_in);
Examples
In the following examples we use our standard OPeNDAP file test_1.nc.
Get all the attributes of a variable
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> [att_val, att_name_list] = attnc(file, 'u');
>> length(att_val)
ans =
10
>> att_val{1}
ans =
u,5_januarys
>> att_name_list{1}
ans =
long_name
Here we retrieve all of the attributes for the variable u. We see that there are 10 elements in each cell and that the first attribute has name long_name and is a string containing u,5_januarys.
Get all of the global attributes
By not giving a variable or attribute name we get information about all of the global attributes.
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> [att_val, att_name_list] = attnc(file);
>> att_name_list
att_name_list =
'source'
>> att_val
att_val =
'Test program'
In this case there is only one global attribute named source and it is a string containing Test program.
Get just one attribute of a variable
By giving the variable and attribute names we can get simply the value of the attribute.
>> [att_val, att_name_list] = attnc(file, 'u', '_FillValue');
>> att_val
att_val =
1.0000e+16
Get just one global attribute
A single global attribute can be retrieved by using the name 'global' in the call to attnc as below.
>> [att_val, att_name_list] = attnc(file, 'global', 'source');
>> att_val
att_val =
Test program
ddsnc returns information about a netcdf file or DODS/OPeNDAP dataset. The general form of a ddsnc call is:
desc = ddsnc(file, access_function_in)
Examples
Information about our standard test data set, test_1.nc, can be found as follows:
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> desc = ddsnc(file)
desc =
variable: [1x14 struct]
dimension: [1x5 struct]
desc has 2 fields - variable and dimension. Looking at one element we see
>> desc.variable(2)
ans =
type: 'Float32'
name: 'u'
dim_statement: {'depth1 = 12' 'depth2 = 11'}
dim_idents: [2x1 double]
The first 2 fields tell us that the variable is named 'u' and is a 32-bit float (single precision real). The dim_statement field tells us that the u variable has 2 dimensions in the order given. For dim_idents we see
>> desc.variable(2).dim_idents
ans =
2
3
These integers refer to the dimensions of the u array. Looking at desc.dimension(2) and desc.dimension(3) we see
>> desc.dimension(2)
ans =
name: 'depth1'
length: 12
>> desc.dimension(3)
ans =
name: 'depth2'
length: 11
That is, index 2 points us to the 2nd dimension, depth1, which has length 12. (We saw the same information in the dim_statement field earlier.) A generic program could then retrieve the information by setting:
>> ii = desc.variable(2).dim_idents(1);
and then referring to desc.dimension(ii).
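The lookup described above can be sketched as a short loop. This is only an illustration: the desc structure is built by hand as a stand-in for the output of ddsnc, using the field names shown in the example (name, length, dim_idents), and var_index is a hypothetical choice.

```matlab
% Hand-built stand-in for the structure returned by ddsnc (assumption:
% only the fields used below are filled in).
desc.dimension = struct('name',   {'dim_unlmited', 'depth1', 'depth2'}, ...
                        'length', {3, 12, 11});
desc.variable  = struct('name',       {'time', 'u'}, ...
                        'dim_idents', {1, [2; 3]});

var_index = 2;                                % hypothetical: the variable u
ids = desc.variable(var_index).dim_idents;    % indices into desc.dimension

dim_names   = cell(1, numel(ids));
dim_lengths = zeros(1, numel(ids));
for k = 1:numel(ids)
    ii = ids(k);
    dim_names{k}   = desc.dimension(ii).name;
    dim_lengths(k) = desc.dimension(ii).length;
end

disp(dim_names)
disp(dim_lengths)
```

For the stand-in above this lists depth1 and depth2 with lengths 12 and 11, matching the dim_statement field of u.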
get_csiro_access_functions finds the names of the access functions used in the matlab/netcdf interface. The user can change the access functions with a call to set_csiro_access_functions. The general form of a get_csiro_access_functions call is:
[access_function_local, access_function_opendap] = get_csiro_access_functions;
Output arguments:
getnc retrieves data in two ways. It can be used interactively to retrieve data from a netCDF file.
getnc is more commonly used as a function call - it can then retrieve data from both netCDF and OPeNDAP files. Because many options are available getnc can take up to 14 input arguments (although most have default values). To make things easier for the user there are various ways of specifying these arguments. Finally, a number of examples are given.
To retrieve data interactively the user simply types in
>> val = getnc(file);
where file is a string containing the name of the netCDF file. From there the user is prompted for more information.
There are 14 variables that getnc must know. Don't be frightened however as there are some easy ways to specify them and all but two have defaults. The variables are:
file: This is a string containing the name of the netCDF file or the URL to the OPeNDAP dataset. It does not have a default. If describing a netCDF file it is permissible to drop the ".nc" suffix but this is not recommended.
varid: This may be a string or an integer. If it is a string then it should be the name of the variable in the netCDF file or OPeNDAP dataset. The use of an integer is a deprecated way of accessing netCDF file data; if used the integer then must be the menu number of the n dimensional variable as shown by a call to inqnc.
bl_corner: This is a vector of length n specifying the hyperslab corner with the lowest index values (the bottom left-hand corner in a 2-space). The corners refer to the dimensions in the same order that these dimensions are listed in the inqnc description of the variable. For a netCDF file this is the same order that they are returned in a call to "ncdump". With an OPeNDAP dataset it is the same order as in the DDS. Note also that the indexing starts with 1 - as in matlab and fortran, NOT 0 as in C. A negative element means that all values in that direction will be returned. If a negative scalar (or an empty array) is used this means that all of the elements in the array will be returned. This is the default, i.e., all of the elements of varid will be returned.
tr_corner: This is a vector of length n specifying the hyperslab corner with the highest index values (the top right-hand corner in a 2-space). A negative element means that the returned hyperslab should run to the highest possible index (this is the default). Note, however, that the value of an element in the tr_corner vector will be ignored if the corresponding element in the bl_corner vector is negative.
stride: This is a vector of length n specifying the interval between accessed values of the hyperslab (sub-sampling) in each of the n dimensions. A value of 1 accesses adjacent values in the given dimension; a value of 2 accesses every other value; and so on. If no sub-sampling is required in any direction then it is allowable to just pass the scalar 1 (or -1 to be consistent with the bl_corner and tr_corner notation). Note, however, that the value of an element in the stride vector will be ignored if the corresponding element in the bl_corner vector is negative.
order:
order == -1 causes the n dimensions of the array to be returned in the same order as described by a call to inqnc(file) or "ncdump". It therefore corresponds to the order in which the indices are specified in bl_corner, tr_corner and stride. This is the default.
order == -2 will reverse the above order. Because matlab's array storage is column-major (the reverse of the row-major layout used in netCDF files) this is actually a little quicker but the difference is rarely significant.
change_miss: Missing data are indicated by the attributes _FillValue, missing_value, valid_range, valid_min and valid_max. The action to be taken with these data is determined by change_miss.
change_miss == 1 causes missing values to be returned unchanged.
change_miss == 2 causes missing values to be changed to NaN (the default).
change_miss == 3 causes missing values to be changed to new_miss (after rescaling if that is necessary).
change_miss < 0 produces the default (missing values to be changed to NaN).
new_miss: This is the value given to missing data if change_miss == 3.
squeeze_it: This specifies whether the matlab function "squeeze" should be applied to the returned array. This will eliminate any singleton array dimensions and possibly cause the returned array to have fewer dimensions than the full array.
squeeze_it ~= 0 causes the squeeze function to be applied. This is the default. Note also that a 1-d array is returned as a column vector.
squeeze_it == 0 causes squeeze not to be applied.
rescale_opts: This is a 2 element vector specifying whether or not rescaling is carried out on retrieved variables and certain attributes. The relevant attributes are '_FillValue', 'missing_value', 'valid_range', 'valid_min' and 'valid_max'; they are used to find missing values of the relevant variable. The option was put in to deal with files that do not follow the netCDF conventions (usually because the creator of the file has misunderstood the convention). For further discussion of the problem see here. Only use this option if you are sure that you know what you are doing.
rescale_opts(1) == 1 causes a variable read in by getnc.m to be rescaled by 'scale_factor' and 'add_offset' if these are attributes of the variable; this is the default.
rescale_opts(1) == 0 disables rescaling of the retrieved variable.
rescale_opts(2) == 1 causes the attributes '_FillValue', etc to be rescaled by 'scale_factor' and 'add_offset'; this is the default.
rescale_opts(2) == 0 disables the rescaling of the attributes '_FillValue', etc.
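As a rough illustration of the rescaling that rescale_opts controls, the sketch below applies the usual netCDF packing convention (unpacked = stored * scale_factor + add_offset) by hand. The scale_factor and add_offset values are those reported for u in the test file; the stored values are hypothetical, and this is not the code used inside getnc.

```matlab
scale_factor = 3;         % attribute of u in the test file
add_offset   = 0.5;       % attribute of u in the test file
stored       = [0 1 2];   % hypothetical values as stored in the file

% rescale_opts(1) == 1: rescale the retrieved variable.
unpacked = stored * scale_factor + add_offset;

% rescale_opts(2) == 1: rescale _FillValue and the related attributes in
% the same way, so that they can be compared with the rescaled data.
fill_value    = 1e16;
fill_rescaled = fill_value * scale_factor + add_offset;

disp(unpacked)
fprintf('%.4e\n', fill_rescaled)
```

With these attributes a stored fill value of 1e16 becomes roughly 3e16 after rescaling, which is consistent with the 3.0000e+16 returned in the missing value example further on.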
err_opt: This is an integer that controls the error handling in a call to getnc.
err_opt == 0 on error this prints an error message and aborts.
err_opt == 1 prints a warning message and then returns an empty array. This is the default.
err_opt == 2 returns an empty array. This is a dangerous option and should only be used with caution. It might be used when getnc is called in a loop and you want to do your own error handling without being bothered by warning messages.
output_type: This is a string that specifies the type of the returned array. The allowable values are:
'native': (the default) return the data in the type in which it was read.
'double_or_char': return a double or a char array according to whether the variable is a number or a character.
Any string on the list of allowable datatype strings for the matlab netcdf type. The list consists of 'double', 'single', 'uint64', 'int64', 'uint32', 'int32', 'uint16', 'int16', 'uint8', and 'int8'.
Specifying up to 14 arguments to getnc can be complicated and confusing. To make the process easier getnc will accept a variety of types of input. These are given as follows:
Specify all 14 arguments. Thus we could make a call like:
>> values = getnc(file, varid, bl_corner, tr_corner, stride, order, change_miss, new_miss, squeeze_it, rescale_opts, err_opt, output_type, file_status, access_function_in);
Use default arguments. Only the first 2 arguments are strictly necessary as the other arguments all have defaults. The following call would retrieve the entire contents of the named variable:
>> values = getnc(file, varid);
If you want non-default behaviour for one or more of the later arguments then you can do something like:
>> values = getnc(file, varid, -1, -1, -1, -1, change_miss, new_miss);
In this case 4 arguments are specified (the -1 values simply select the defaults) and default values are used for the other 10.
Use a structure as an argument. From version 3.3 onwards it is possible to pass a structure to getnc. This is illustrated below:
>> x.file = 'fred.nc';
>> x.varid = 'foo';
>> x.change_miss = 1;
>> values = getnc(x);
This specifies 3 arguments and causes defaults to be used for the other 11.
Note that it is possible to mix the usual arguments with the passing of a structure - it is only necessary that the structure be the last argument passed. We could achieve the same effect as above by doing:
>> x.change_miss = 1;
>> values = getnc('fred.nc', 'foo', x);
In the following examples we use our standard OPeNDAP file "http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc" to illustrate the usage of getnc.
Get an entire array
The simplest command line call to make is the following:
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> u = getnc(file, 'u');
The first argument specified is the file name or url. The second argument is the name of the variable - we could have found this by using inqnc. The result is that the entire contents of the u variable will be returned to the matlab session.
Alternatively we could have passed a structure to getnc to get the same answer.
>> x.file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> x.varid = 'u';
>> u = getnc(x);
Get part of an array
We may only want a part of the variable and that is what the 3 arguments (bl_corner, tr_corner, stride) are about. If we use inqnc to consider the u variable described in our example file we see that it has two dimensions (depth1 depth2) in that order. We say that the variable is in a 2-dimensional rectangle. We also saw that there are 12 and 11 points in each of the directions. Thus we can imagine extracting a subset of the data known as a hyperslab. The argument bl_corner specifies the bottom left-hand corner of the hyperslab, tr_corner specifies the top right-hand corner and stride specifies the sampling done. An example to illustrate this is shown below.
>> u = getnc(file, 'u', [-1 3], [-1 9], [-1 2]);
>> size(u)
ans =
12 4
The 1st element in each of these arguments is -1 to indicate that we want to retrieve every point in that direction. Hence the 1st dimension of u is of length 12 – the full number of elements in the depth1 dimension. Now bl_corner(2) = 3, tr_corner(2) = 9 and stride(2) = 2. This means that in the depth2 direction we want every second point from the 3rd to the 9th, i.e., points 3, 5, 7 and 9. Hence the 2nd dimension of u is of length 4.
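The index arithmetic behind this example can be checked without touching any file. The following is a minimal sketch that assumes the rules given in the argument descriptions (a negative bl_corner element selects the whole dimension, a negative tr_corner element runs to the highest index); treating a negative stride element as 1 is an assumption made here for illustration, and none of this is the code used inside getnc.

```matlab
dim_len   = [12 11];   % lengths of depth1 and depth2 for u
bl_corner = [-1 3];
tr_corner = [-1 9];
stride    = [-1 2];

n_points = zeros(1, numel(dim_len));
for k = 1:numel(dim_len)
    if bl_corner(k) < 0
        % Negative corner: the whole dimension is returned and the
        % corresponding tr_corner and stride elements are ignored.
        idx = 1:dim_len(k);
    else
        hi = tr_corner(k);
        if hi < 0
            hi = dim_len(k);   % run to the highest possible index
        end
        step = stride(k);
        if step < 0
            step = 1;          % assumption: negative stride means no sub-sampling
        end
        idx = bl_corner(k):step:hi;
    end
    n_points(k) = numel(idx);
end

disp(n_points)
```

The second dimension uses the indices 3:2:9, that is 3, 5, 7 and 9, so n_points comes out as 12 and 4, in agreement with size(u) above.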
Changing the order of the dimensions in the returned array
The next argument to discuss is order. In general it is best not to use this option and just use the default (-1). The option allows you to reverse the dimensions in the returned value. Since netCDF files store data in row-major order but matlab does the opposite, it is possible, in principle, to make some efficiencies when retrieving data from a local netCDF file. However this is rarely significant and the option is only retained for backwards compatibility with older versions of getnc. (For OPeNDAP files setting order = -2 is always less efficient than -1.)
The following example illustrates this.
>> u = getnc(file, 'u');
>> size(u)
ans =
12 11
>> ut = getnc(file, 'u', -1, -1, -1, -2);
>> size(ut)
ans =
11 12
Note that in the 2nd case we have used -1, -1, -1 for the bl_corner, tr_corner and stride arguments to indicate that we want the default case of getting all possible values. We could have passed a structure to get the same result, as below:
>> x.file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> x.varid = 'u';
>> x.order = -2;
>> ut = getnc(x);
>> size(ut)
ans =
11 12
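For readers who want to see what reversing the dimensions means in matlab terms, here is a small sketch using a stand-in array rather than data from the file. It assumes that the array returned with order = -2 is the default result with its dimension order reversed, which is how the option is described above; for a 2-dimensional variable that is simply the transpose.

```matlab
u  = reshape(1:132, 12, 11);       % stand-in for the 12 x 11 array u
ut = permute(u, ndims(u):-1:1);    % reverse the order of the dimensions

disp(size(ut))
```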
Missing Values
The default behaviour of getnc is to replace missing values in the data with NaNs. (By missing values we mean those values equal to the _FillValue or missing_value attribute or outside the range determined by the valid_min, valid_max or valid_range attribute. This is discussed in the netCDF user's guide at http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Attribute-Conventions.html#Attribute-Conventions.) The pair of arguments change_miss and new_miss can change this. If change_miss = 1 then any missing values are returned unchanged. If change_miss = 2 then they are changed to a NaN (the default, also available as change_miss = -1). If change_miss = 3 then any missing values are replaced by new_miss.
This is illustrated in the following example – note that we pass a structure, x, here and have made sure that x is empty at the start.
>> x = [];
>> x.file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> x.varid = 'u';
>> x.bl_corner = [12 11];
>> x.tr_corner = [12 11];
>> u = getnc(x)
u =
NaN
We use the simplest version of getnc to retrieve the last value of the array – we get a NaN because the value actually stored in the dataset is marked as a missing value. Next we try change_miss = 1,
>> x.change_miss = 1;
>> u = getnc(x)
u =
3.0000e+16
Now, 3.0000e+16, the value actually stored in the file, is returned. Finally, we use change_miss = 3 to cause the missing value to be replaced by 1.5 in our matlab array.
>> x.change_miss = 3;
>> x.new_miss = 1.5;
>> u = getnc(x)
u =
1.5000
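The three settings of change_miss can be mimicked on an ordinary matlab array. The sketch below is illustrative only: the data are hypothetical, only equality with a single fill value is tested (the valid_range, valid_min and valid_max checks and any rescaling are left out), and it is not the code used inside getnc.

```matlab
values      = [0.5 1e16 2.0];   % hypothetical data; 1e16 plays the role of _FillValue
fill_value  = 1e16;
change_miss = 3;
new_miss    = 1.5;

is_missing = (values == fill_value);
result = values;
switch change_miss
    case 1
        % Missing values are returned unchanged.
    case 3
        result(is_missing) = new_miss;
    otherwise
        % change_miss == 2 or a negative value: the default.
        result(is_missing) = NaN;
end

disp(result)
```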
Singleton dimensions
The next argument, squeeze_it, deals with singleton dimensions (i.e., those of length 1). If squeeze_it = 1 (the default behaviour) then any singleton dimension will be eliminated as if the matlab function squeeze had been applied. If squeeze_it = 0 then the singleton dimensions will remain. This is illustrated in the following examples.
>> big_var = getnc(file, 'big_var', [-1 2 2 5 -1], [-1 2 2 5 -1]);
>> size(big_var)
ans =
3 12
>> big_var = getnc(file, 'big_var', [-1 2 2 5 -1], [-1 2 2 5 -1], -1, -1, -1, -1, 0);
>> size(big_var)
ans =
3 1 1 1 12
This option is not really necessary any more because matlab has the squeeze function. It was originally put in to enable backwards compatibility with earlier versions of getnc written before matlab dealt with multi-dimensional arrays and so we are stuck with it.
Error handling
From version 3.3 onwards getnc has given the user some control over error handling. In the examples below we ask for a non-existent variable. The default behaviour (err_opt == 1) returns an empty array and prints a warning message as below.
>> junk = getnc(file, 'junk')
WARNING: junk is not a variable in http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc
junk =
[]
Setting err_opt == 0 causes getnc to abort because of the non-existent variable, as seen below.
>> x = [];
>> x.err_opt = 0;
>> junk = getnc(file, 'junk', x)
??? Error using ==> getnc_s>error_handle
ERROR: junk is not a variable in http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc
Error in ==> getnc_s at 872
values = error_handle([], mess_str, [], err_opt);
Error in ==> getnc at 211
values = getnc_s(varargin);
Finally, we can see the dangerous option err_opt == 2, which causes an empty array to be returned with no error message.
>> x.err_opt = 2;
>> junk = getnc(file, 'junk', x)
junk =
[]
This might be used when getnc is called in a loop and you don't want to get a large number of error messages. Of course you should be careful to handle the returned values properly.
inqnc and enqnc are two slightly different versions of an interactive function that is used to find out about the structure of a netCDF file or OPeNDAP dataset. (In the latter case you could use a web browser for the same purpose.)
enqnc is an older, command-line driven version that some people prefer and it is described below. inqnc returns the same information but uses pop-ups to ask the user questions. The general form of the calls is:
inqnc(file, access_function_in, menu_type)
enqnc(file, access_function_in)
Input arguments:
Try clicking on the url 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc' to see a typical structure. The same information is found in the matlab example below. Of course, the output from enqnc will be almost identical if we look at the netCDF file on a local disk.
Example
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> inqnc(file)
--- Global attributes ---
source: Test program
The 5 dimensions are 1) dim_unlmited = 3 2) depth1 = 12 3) depth2 = 11 4) dim3 = 3 5) dim4 = 4.
dim_unlmited is unlimited in length
----- Get further information about the following variables -----
-1) None of them (no further information)
0) All of the variables
1) time 2) u 3) ureverse
4) uchar1 5) uchar2 6) uchar3
7) ushort 8) ulong 9) udouble
10) no_atts 11) big_var 12) depth1
13) depth2 14) dim3
Select a menu number: 1
--- Information about time(dim_unlmited ) ---
*units: days since 1990-1-1 00:00:0.0 *long_name: Time
----- Get further information about the following variables -----
-1) None of them (no further information)
0) All of the variables
1) time 2) u 3) ureverse
4) uchar1 5) uchar2 6) uchar3
7) ushort 8) ulong 9) udouble
10) no_atts 11) big_var 12) depth1
13) depth2 14) dim3
Select a menu number: 2
--- Information about u(depth1 depth2 ) ---
*long_name: u,5_januarys *units: cm/sec
*ml__FillValue: 10000000270000000 *missing_value: 10000000270000000
*valid_range: -10000000270000000 10000000270000000
*test_double: 100 2000 *test_short: 25 -3 19
*test_long: -4 333 -17 *scale_factor: 3
*add_offset: 0.5
----- Get further information about the following variables -----
-1) None of them (no further information)
0) All of the variables
1) time 2) u 3) ureverse
4) uchar1 5) uchar2 6) uchar3
7) ushort 8) ulong 9) udouble
10) no_atts 11) big_var 12) depth1
13) depth2 14) dim3
Select a menu number: -1
>> global CSIRO_getnc_error_handling
The variable CSIRO_getnc_error_handling.num_failures tells you the total number of failures since set_getnc_repeats was last called (when the field num_failures is reset to zero).
timenc finds the time vector and the corresponding base date for a netCDF file or DODS/OPeNDAP dataset that follows the CF conventions (or the older COARDS conventions). In practice this means that the time-like variable should have a units attribute of a certain form. An example is:
'seconds since 1992-10-8 15:15:42.5 -6:00'.
This indicates seconds since October 8th, 1992 at 3 hours, 15 minutes and 42.5 seconds in the afternoon in the time zone which is six hours to the west of Coordinated Universal Time (i.e. Mountain Daylight Time). Instead of 'seconds' the string may contain 'minutes', 'hours', 'days' and 'weeks' and all of these may be singular or plural; they are not case-sensitive.
The time zone specification can also be written without a colon using one or two-digits (indicating hours) or three or four digits (indicating hours and minutes). The letters 'UTC' or 'UT' are allowed at the end of the string, but these are ignored. Or the time zone may be entirely omitted.
Over the years parsetnc (the matlab function that actually parses the string) has been extensively modified so that it can handle many variations of the unit string. These are not documented but are not believed to have any bugs in them.
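To make the role of the units string concrete, the following sketch converts a few time values by hand. It is a simplification and not the parsetnc code: it reads only the unit name and the year, month and day of the base date, ignores the time of day and any time zone, and uses datenum (so the proleptic Gregorian calendar applies). The units string is the one carried by the time variable in the test file; the time values are hypothetical.

```matlab
units       = 'days since 1990-1-1 00:00:0.0';
time_values = [0 1 40.5];   % hypothetical values of the time variable

% Pull out the unit name and the year, month and day of the base date.
tok = regexp(units, '^\s*(\w+)\s+since\s+(\d+)-(\d+)-(\d+)', 'tokens', 'once');
unit_name = lower(tok{1});
base_date = str2double(tok(2:4));   % [year month day]

% Length of one unit in days; singular and plural forms are both accepted.
switch regexprep(unit_name, 's$', '')
    case 'second'
        days_per_unit = 1 / 86400;
    case 'minute'
        days_per_unit = 1 / 1440;
    case 'hour'
        days_per_unit = 1 / 24;
    case 'day'
        days_per_unit = 1;
    case 'week'
        days_per_unit = 7;
    otherwise
        error('Unrecognised time unit: %s', unit_name);
end

serial_base = datenum(base_date(1), base_date(2), base_date(3));
serial_time = serial_base + time_values(:) * days_per_unit;

disp(datestr(serial_time))
```

For the values chosen here the last date comes out as noon on 10 February 1990.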
The general form of a timenc call is:
[gregorian_time, serial_time, gregorian_base, serial_base, sizem, serial_time_jd, serial_base_jd] = timenc(file, time_var, bl_corner, tr_corner, calendar)
Input arguments:
It is possible to have many different types of calendars but timenc only implements five at present.
These are necessary because there is some confusion with dates before October 15 1582 when the Gregorian calendar was introduced. A problem also arises when the reference date in the units attribute is before this. timenc deals with this by recognising some of the CF conventions and returns different answers depending on the value of the calendar attribute of the time-like variable. Also, some numerical models like to pretend that every year has the same number of days - 365, 366 and 360 are all used.
calendar = 'standard', 'gregorian' or is not specified. In this case the relevant calculations are done for the true Gregorian calendar as decreed by Pope Gregory XIII. This has a discontinuity so that the day after 4 October 1582 is 15 October 1582. This is the calendar almost universally used today and what udunits works with today. timenc has worked this way since revision 1.10 in 2000.
calendar = 'proleptic_gregorian'. In this case the relevant calculations are done using the matlab functions datenum and datevec which simply extend the way our present calendar works backwards into the past. This is called the proleptic Gregorian calendar. Accordingly, dates are continuous, i.e., the day after 4 October 1582 is 5 October 1582, but do NOT correspond to historical time anywhere. As well there is a year zero. timenc used to work this way before revision 1.10 in the year 2000 and I believe that udunits also did at some stage in the past.
calendar == 'noleap', '365_day'. Here it is assumed that every year has 365 days.
calendar == 'all_leap', '366_day'. Here it is assumed that every year has 366 days.
Note that other values of the calendar attribute produce an error message. This can usually be avoided by the user specifying the calendar explicitly in the call to timenc.
Examples
In the following examples we use our standard OPeNDAP file test_1.nc.
Get an entire array
The simplest command line call to make is the following:
>> file = 'http://www.marine.csiro.au/dods/nph-dods/dods-data/test_data/test_1.nc';
>> [gregorian_time, serial_time] = timenc(file);
Note that since the time-like variable is named 'time' we did not even have to put in its name. We now look at the matrix that contains the gregorian time.
>> gregorian_time(1, :)
ans =
1.0e+03 *
1.9900 0.0010 0.0010 0 0 0.0000
>> gregorian_time
ans =
1990 1 1 0 0 0
1990 1 2 0 0 0
1990 2 10 12 0 0
Each row of the matrix gregorian_time contains a time in year, month, day, hour, minute, second format. Thus the last date is for noon, 10 February, 1990. We can see the same thing by looking at the vector serial_time.
>> size(serial_time)
ans =
3 1
>> datestr(serial_time)
ans =
01-Jan-1990 00:00:00
02-Jan-1990 00:00:00
10-Feb-1990 12:00:00
serial_time gives the time in the format used by matlab's functions datenum, datevec and datestr. Thus we can use datestr to print out the dates.
Get part of an array
Here we get the 1st and 2nd dates.
>> [gregorian_time, serial_time] = timenc(file, 'time', 1, 2);
>> datestr(serial_time)
ans =
01-Jan-1990 00:00:00
02-Jan-1990 00:00:00
whatnc lists all of the netCDF files (including compressed ones) in the current directory. It also lists all of the netCDF files in the common data set.
Example
Below is a possible listing returned by whatnc.
>> whatnc
----- current directory netCDF files -----
bar.cdf foo.cdf mycdf.cdf test_1.nc test_timenc.nc
----- current directory compressed netCDF files -----
EMPTY
----- common data set of netCDF files -----
bath_agso_2002.nc soc_climatology.nc
bath_agso_98.nc sst.mnmean.1981-present.nc
The list under the 1st heading shows all of the files in the current directory that seem to be netCDF files. This is based simply on whether they end in .cdf or .nc. Note that the .cdf suffix was used in the past to indicate a netCDF file but is no longer recommended.
The list under the 2nd heading shows all of the files that end in nc.gz, nc.Z, cdf.gz or cdf.Z. These are presumed to be compressed netCDF files.
The 3rd list shows netCDF files in the area referred to as the common data directory. This directory will be searched by the inqnc, attnc and getnc commands and is set by the local system manager. This is done by simply editing the pos_cds.m file.
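The suffix rules used for the first two lists can be sketched with regular expressions. This is an illustration of the rules as stated above, applied to a hypothetical list of file names; it is not the code used inside whatnc.

```matlab
% Hypothetical file names.
names = {'bar.cdf', 'test_1.nc', 'old_run.nc.gz', 'notes.txt', 'sst.cdf.Z'};

% Plain netCDF files: names ending in .nc or .cdf.
is_netcdf = ~cellfun(@isempty, regexp(names, '\.(nc|cdf)$', 'once'));

% Compressed netCDF files: names ending in nc.gz, nc.Z, cdf.gz or cdf.Z.
is_compressed = ~cellfun(@isempty, regexp(names, '(nc|cdf)\.(gz|Z)$', 'once'));

disp(names(is_netcdf))
disp(names(is_compressed))
```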
The CSIRO interface has been installed on both unix and Windows pc systems. Installation is mostly a matter of copying the appropriate files to directories and then making them visible to matlab. Accordingly the experience should easily translate to other operating systems. Note that steps 4, 5 and 6 will improve the user experience, but are not necessary.
Download either matlab_netCDF_OPeNDAP.tar.gz or matlab_netCDF_OPeNDAP.zip (the files in each are identical). Copy the downloaded file to a chosen directory (let's call it $MATLAB) and expand it using either gunzip and tar or unzip as appropriate.
The directory $MATLAB needs to be in the matlab search path. One way to do this is to use the matlab command addpath in your startup.m file. Alternatively, see this discussion of the matlab search path.
The toolsUI.jar driver is required to read netcdf files with old versions of matlab (version 7.6 and earlier) and to read opendap files. Download the latest version of toolsUI.jar and copy it to the same $MATLAB directory as before.
The CSIRO interface is only a wrapper that makes it easier to get data. The actual retrieval of the data is carried out by either the native netcdf api (which is available in later versions of matlab) or the java interface which uses toolsUI.jar as described below. Note that earlier versions of the CSIRO netcdf interface were able to use other drivers (such as mexnc and loaddap) but we no longer support these as they don't add any extra functionality. To find out which drivers are used there is the function get_csiro_access_functions. The function set_csiro_access_functions allows the user to change the default drivers.
Disadvantages
Brief description: toolsUI.jar enables access to the netCDF and OPeNDAP libraries via the java virtual machine that comes included with matlab.
Advantages:
Disadvantages:
It has lower memory limits than the native api – resulting in “java.lang.OutOfMemoryError: Java heap space” messages when retrieving files larger than 147 MB. (This is discussed here).
It does not allow access to OPeNDAP data that requires username:password authentication.
It may be slower to read netCDF files than the native api (although this has not been tested).
In older versions of matlab the interface may fail due to a namespace clash. The solution to the problem is discussed here.
The software in this package is entirely made up of matlab script files and works for all versions of matlab later than and including matlab 7.5 (2007b).
There are no known bugs in the CSIRO interface although there are some problems with the netcdf api that is built in to matlab and also the java interface. There are also some common problems when using the interface.
The native netcdf api can be used to read opendap variables (for details type "help netcdf" in matlab). When retrieving character arrays it adds an extra dimension of length 64. In our test netcdf file there is a variable uchar2 that is dimensioned uchar2(depth1 depth2). However if the native api reads it from an opendap server it "seems" to have dimensions uchar2(depth1 depth2 maxStrlen64). Here maxStrlen64 is actually 64. The extra 63 slices are all nulls.
This seems to be a problem with the c code in the underlying DAP libraries since the other software developers have seen the same problem. See, for example, http://sourceforge.net/projects/nco/forums/forum/9829/topic/5474372 where the author says "In this case DAP translates scalar characters into NUL-terminated character arrays (of length 64).".
The problem does not occur with the java interface and so we will
continue to use that for opendap access until the Mathworks or unidata
fixes the problem. Neither body has shown any interest in doing so.
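The padding described above can be illustrated with a small synthetic array. This is only a sketch: the array sizes, the variable names and the assumption that the padded dimension appears last in matlab are all illustrative, and no opendap server is contacted.

```matlab
% Synthetic stand-in for what the native api is described as returning:
% a 2 x 3 character variable padded with an extra dimension of length 64.
% The sizes and the position of the padded dimension are assumptions.
true_chars = ['abc'; 'def'];              % what the variable should contain
padded = repmat(char(0), [2 3 64]);       % 64 slices, initially all nulls
padded(:, :, 1) = true_chars;             % only the first slice holds data

% Confirm that the extra 63 slices contain nothing but nulls before
% discarding them.
extra = padded(:, :, 2:end);
assert(all(extra(:) == char(0)))

% Keep the first slice to recover the intended 2-D character array.
recovered = padded(:, :, 1);
disp(recovered)
```

If the padded dimension turns out to be the first one rather than the last, the same idea applies with the indexing moved to that dimension.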
When using the java virtual machine to retrieve OPeNDAP data there may be “java.lang.OutOfMemoryError: Java heap space” errors due to running out of heap space. In some tests the limit has been around 147 MB. The mathworks has a web page here explaining how to increase the limit on heap space.
Using the java virtual machine to retrieve OPeNDAP data will fail if the OPeNDAP server requires username:password authentication. However, the native netcdf API handles this properly in matlab version 7.14 (2012a) and later.
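To see how much heap space the java virtual machine has in a given session, something like the following sketch may help. It uses the standard java.lang.Runtime class; the variable names are illustrative, and the value reported is the JVM's maximum heap size, which will not necessarily match the 147 MB figure seen in the tests mentioned above.

```matlab
% Report the maximum heap size available to matlab's java virtual machine.
% This needs matlab to be running with the jvm (i.e. not with -nojvm).
if usejava('jvm')
    rt = java.lang.Runtime.getRuntime();
    max_heap_mb = double(rt.maxMemory()) / 2^20;   % bytes -> megabytes
    fprintf('Maximum java heap space: %.0f MB\n', max_heap_mb);
else
    max_heap_mb = NaN;
    disp('The java virtual machine is not running.');
end
```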
In older versions of matlab there may be a namespace clash with the
mwucarunits.jar file. This results in java.io.IOException error
messages that mention ucar.nc2.dataset. You can check this by running
the following code fragment
% List any entries on the static java class path that mention mwucarunits.
p = javaclasspath('-static');
for ii = 1:length(p)
    if ~isempty(strfind(p{ii}, 'mwucarunits'))
        disp(p{ii})
    end
end
This would typically indicate that there is a file like
"/home/matlab7.6/java/jarext/mwucarunits.jar". The way to deal with this
is to edit the file classpath.txt and remove (or
comment out) the line that contains "mwucarunits.jar".
The default classpath.txt file resides in the toolbox/local
subdirectory of your MATLAB root directory.
Reference to the "mwucarunits.jar" file is eliminated because the file contains an old implementation of the Unidata udunits package that conflicts with the more recent version that NetCDF-Java uses. (It appears that "mwucarunits.jar" is only used by the Mathworks "Model-Based Calibration Toolbox" so this should not cause a problem.)
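Before editing classpath.txt it can be useful to see exactly which lines mention the jar file. The following is a minimal sketch that assumes the default location described above; it only reads the file and changes nothing, and the variable names are illustrative.

```matlab
% Build the path to the default classpath.txt and list any lines that
% mention mwucarunits. The file is only read, never modified.
cp_file = fullfile(matlabroot, 'toolbox', 'local', 'classpath.txt');
if exist(cp_file, 'file')
    txt = fileread(cp_file);
    cp_lines = regexp(txt, '\r?\n', 'split');
    for ii = 1:length(cp_lines)
        if ~isempty(strfind(cp_lines{ii}, 'mwucarunits'))
            fprintf('line %d: %s\n', ii, cp_lines{ii});
        end
    end
else
    disp(['No classpath.txt found at ' cp_file]);
end
```

If nothing is printed then either the file has no such line or your installation keeps its class path elsewhere.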
When reading some netCDF files getnc will return a missing value indicator (by default a NaN) in some places where there shouldn't be one. This is not due to a bug in getnc but occurs when the netCDF file is not following the attribute conventions (see http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html#Attribute-Conventions). Two relevant quotes from the documentation are:
The type of each valid_range, valid_min and valid_max attribute should match the type of its variable
(except that for byte data, these can be of a signed integral type to specify the intended range).
and
If _FillValue is defined then it should be scalar and of the same type as the variable.
To illustrate what this means and how a problem can occur consider the following extract from an example cdl file.
short airtemp(time, lat, lon) ;
airtemp:long_name = "Air temperature at surface" ;
airtemp:valid_range = -10000s, 10000s ;
airtemp:units = "degC" ;
airtemp:scale_factor = 0.01f ;
airtemp:_FillValue = 32766s ;
What has happened here is that the creator of the netCDF file has chosen to save space by storing the data as shorts (2 byte integers). The software reading the data will then multiply the integer values by the scale_factor of 0.01 to produce the floating point value of the air temperature. Since the integers can take values between -32768 and 32767, this can represent temperatures between -327.68 and 327.67 degrees with a resolution of 0.01 degrees.
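The arithmetic can be checked with a short sketch. The packed values below are made up for illustration; only the scale_factor of 0.01 comes from the cdl extract, which defines no add_offset.

```matlab
% Unpack short (int16) values using the scale_factor from the cdl extract.
scale_factor = 0.01;
packed = int16([-32768 -10000 0 2150 10000 32767]);   % illustrative values
unpacked = double(packed) * scale_factor;             % temperatures in degC
disp(unpacked)
```

The first and last entries show the extremes that a short can represent, about -327.68 and 327.67 degrees.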
Note, however, that the valid_range goes from -10000 to 10000. Generic software interprets values outside of this range as faulty in some way and the default behaviour of getnc is to replace such values with a NaN. The creator of the file can use this to mark missing or contaminated data. Since the temperatures implied by these limits are -100 and 100 Celsius then the limits are “safe” since they represent physically unreasonable data.
This way of defining the valid_range is what is specified in the earlier quote.
A problem arises when the creator of the netCDF file misunderstands the attribute convention. They choose an “intuitive” definition of the attribute like:
airtemp:valid_range = -100.0f, 100.0f ;
Here they are thinking in terms of the true air temperature rather than the scaled version stored as integers. When getnc reads the valid_range attribute it then multiplies it by 0.01 and concludes that any temperatures outside the range of -1.0 to 1.0 are to be replaced by NaNs. Note that the same problem occurs when the file's creator makes the same error with other attributes – valid_min, valid_max, _FillValue and missing_value.
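The effect can be imitated with a few lines of matlab. This is a sketch of the behaviour described above, not getnc's actual code; the packed values and variable names are invented for the illustration.

```matlab
% Imitate how generic software applies valid_range to packed data.
scale_factor = 0.01;
packed = int16([-150 -50 0 80 2150]);          % stored integers (made up)
temperature = double(packed) * scale_factor;   % true values in degC

% Correct file: valid_range is given in packed units, so after scaling
% the limits are -100 and 100 degrees.
limits_ok = [-10000 10000] * scale_factor;
masked_ok = temperature;
masked_ok(temperature < limits_ok(1) | temperature > limits_ok(2)) = NaN;

% Errant file: valid_range is given in unpacked units but is scaled anyway,
% so the limits collapse to -1 and 1 degrees.
limits_bad = [-100 100] * scale_factor;
masked_bad = temperature;
masked_bad(temperature < limits_bad(1) | temperature > limits_bad(2)) = NaN;

disp(masked_ok)
disp(masked_bad)
```

With the correct attribute every value survives, whereas with the errant attribute the perfectly reasonable temperatures of -1.5 and 21.5 degrees are replaced by NaNs.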
There are several workarounds for this problem. The simplest is to pass getnc the argument change_miss = 1. This will cause all values to be passed unchanged (apart from the rescaling implied by the scale_factor attribute). The disadvantage is that when very large values were used to indicate faulty data these will also be returned - in the example above you might end up with some temperatures greater than 100C.
The trickier, but more satisfactory, option is to use the rescale_opts option in getnc. It was designed to deal with errant netCDF files and is described here.
The following is a partial history of revisions. I intend to keep it more up-to-date from version 3.0 onwards. In particular, bug fixes will be recorded.
Version 5.11 : September 11 2013: Documentation improved and some minor new functionality added to attnc, inqnc and enqnc.
Version 5.1 : July 22 2013: New functions get_csiro_access_functions and set_csiro_access_functions were added.
Version 4.03 : October 17 2007: Made the way err_opt is handled consistent with the documentation. Changed the default setup for reading OPeNDAP files to work around a matlab bug with javaaddpath.
Version 4.02 : September 7 2007: Things now behave more gracefully if you run matlab with the -nojvm option but have things set to use toolsUI.jar to read OPeNDAP files.
Version 4.01 : September 3 2007: A minor bug in the getnc bounds checking was corrected.
Version 4.0 : August 20 2007: Can now use the toolsUI.jar package to read both netCDF and OPeNDAP files. inqnc and the interactive version of getnc had major re-writes.
Version 3.33 : January 12 2007: Handles the latest version of the Matlab Struct Tools under Windows (loaddap.dll, etc).
Version 3.32 : December 4 2006: get_dods_dds can now deal with spaces in directory names under windows.
Version 3.31 : November 29 2006: timenc can handle more calendar types.
Version 3.30 : June 30 2006: getnc error handling and input method made more versatile.
Version 3.22 : April 24 2006: Minor changes to timenc and getnc_s to work around a bug in loaddods (and maybe loaddap) which causes errors when retrieving an array of characters.
Version 3.21 : January 27 2006: Minor changes were made to enable it to work properly on matlab 6.1 (R12).
Version 3.2 : December 19 2005: The new function ddsnc was added to the interface.
Version 3.1 : December 6 2005: Handles OPeNDAP files other than those that are native netCDF files; works (mostly) on windows boxes.
Version 3.01 : October 27 2005: Bug fix; it caused attnc to fall over when 3 input arguments were passed for an OPeNDAP file.
Version 3.0 : October 24 2005: Able to access OPeNDAP data sets as well as netCDF ones. Calls mexnc. Minor bug fix in fill_var.m.
Version 2.4 : July 3 2000: timenc was generalised so that it could handle dates before the introduction of the Gregorian calendar on October 15, 1582; it now works back to the year -4712. The earlier version could give incorrect dates for files which used the pre-Gregorian calendar in either the time vector values or in its 'units' attribute.
Version 2.3 : April 22 1998: timenc was generalised so that when requested it will return only part of the time vector. It can also return the length of the time vector as a separate variable.
Version 2.2 : December 12 1997: attnc was generalised so that when the user does not specify the name of the variable's attribute then all of the attributes (and their names) will be returned in cells.
Version 2.1 : September 24 1997: Explanatory web page first made public.
Version 2.0 : April 28 1997: Functions renamed to getnc, timenc, etc.
Version 1.0: June 1 1993: Initial release.
There are a number of alternative ways of reading netCDF and OPeNDAP data into matlab. In most cases the time and computer resources taken to retrieve data will depend mostly on external factors such as internet bandwidth and disk access speed. Hence it would be surprising if one of these methods was significantly more efficient than any of the others.
The most up-to-date place to look for netcdf software is probably here. A good place for opendap software is here.
This software is provided "as is" without warranty of any kind. It is covered by a general CSIRO Legal Notice and Disclaimer.
The CSIRO matlab interface has been mostly written by Jim Mansbridge with some welcome input from Peter McIntosh and Rose O'Connor (all of CSIRO).
This web page is maintained by Jim Mansbridge, CSIRO Marine and Atmospheric Research.
Postal address: GPO Box 1538, Hobart, Tasmania 7001, Australia
Phone: +61-3-62 32 5416
Fax: +61-3-62 32 5123
This page is http://www.marine.csiro.au/sw/matlab-netcdf.html.
Further details on the research of the CSIRO Marine and Atmospheric Research are available through the CMAR Home Page.
For more information contact reception@marine.csiro.au or telephone +61-3-62325222. Unless otherwise indicated all contents in these web documents are copyright © 1997 CSIRO.