------ Filtering: (applied to any except MB TE BA or BO)
5 point binomial filter (ie coeffs 1 4 6 4 1) with some tricks (over-weighting
available data) to avoid distortions due to gaps and at ends of profiles.
To maintain integrity of results, where gap corrections are made the depth
is correspondingly corrected.
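
A minimal sketch of one way to handle gaps and profile ends in a binomial
filter. The actual "tricks" are not spelled out here, so this is an
assumption: the weights are renormalised over whatever data are available,
and the same weights are applied to depth so that each filtered value keeps
a consistent depth. The function name and the use of None for missing
values are hypothetical.

```python
def binomial_filter(depths, temps, weights=(1, 4, 6, 4, 1)):
    """Smooth temps with a 5-point binomial filter.

    None in temps marks a gap. Weights are renormalised over the
    points actually present (assumed approach), and the same weights
    are applied to depth so value and depth stay consistent.
    """
    half = len(weights) // 2
    n = len(temps)
    out_z, out_t = [], []
    for i in range(n):
        if temps[i] is None:
            continue  # nothing to filter at a gap
        wsum = zsum = tsum = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < n and temps[j] is not None:
                wsum += w
                zsum += w * depths[j]
                tsum += w * temps[j]
        out_z.append(zsum / wsum)
        out_t.append(tsum / wsum)
    return out_z, out_t
```

With this approach a linear profile stays linear after filtering, including
at the ends, because depth and value are shifted together.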
------- Interp to 2m: (only applied to MB TE BA or BO)
Interpolate to 2m in the range 0 to max depth of cast, using a 4 point
quadratic fit. Maximum gap size limits increase with depth.
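
A rough sketch of the interpolation step, assuming "4 point quadratic fit"
means a least-squares quadratic through the 4 observations nearest each
target depth (that reading is a guess). The depth-dependent gap limits are
not given here, so they are left out. Function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def quad_fit_eval(xs, ys, x0):
    """Least-squares quadratic through (xs, ys), evaluated at x0."""
    # Centre on x0 so the constant term of the fit is the wanted value.
    dx = [x - x0 for x in xs]
    s = [sum(d ** p for d in dx) for p in range(5)]
    b = [sum(y * d ** p for d, y in zip(dx, ys)) for p in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    m0 = [[b[0], s[1], s[2]], [b[1], s[2], s[3]], [b[2], s[3], s[4]]]
    return det3(m0) / det3(m)  # Cramer's rule for the constant term


def interp_to_grid(depths, temps, step=2.0):
    """Interpolate a cast onto 0, step, 2*step, ... up to max depth."""
    targets = []
    z = 0.0
    while z <= max(depths):
        targets.append(z)
        z += step
    out = []
    for z in targets:
        # Indices of the 4 observations nearest to the target depth.
        order = sorted(range(len(depths)), key=lambda i: abs(depths[i] - z))
        near = order[:4]
        out.append(quad_fit_eval([depths[i] for i in near],
                                 [temps[i] for i in near], z))
    return targets, out
```

This needs at least 4 observations with distinct depths per cast.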
------- Binning:
For the first 3 methods below, binning is applied to each cast to generate
binned profiles, from which the stats are then calculated.
-- Boxcar: for each cast, a simple average is made of all data in each depth
(or temperature) bin. The fastest algorithms cannot handle inversions in
the binning variable. Mine can, but it requires a constant bin size!
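
A minimal sketch of why a constant bin size makes inversions harmless: the
bin index comes straight from a division, so the order of the data does not
matter. This is an illustration, not the actual code; the function name and
arguments are hypothetical.

```python
import math


def boxcar_bin(z, t, bin_size, n_bins):
    """Average t in bins [k*bin_size, (k+1)*bin_size) of z.

    z need not be monotonic. Empty bins return None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for zi, ti in zip(z, t):
        k = int(math.floor(zi / bin_size))  # bin index by direct division
        if 0 <= k < n_bins:
            sums[k] += ti
            counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```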
-- Subsample: for each cast, the value nearest to the bin centre (within the
bin only) is used for each bin. Fastest method.
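
A small sketch of the subsample rule, written for clarity rather than speed
(it loops over all data for every bin, which the real method presumably
avoids). The function name and the bin-edge list are hypothetical.

```python
def subsample_bin(z, t, edges):
    """For each bin [lo, hi), return the t whose z is nearest the centre.

    Only values inside the bin are considered. Empty bins return None.
    """
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        centre = 0.5 * (lo + hi)
        best = None  # (distance to centre, value)
        for zi, ti in zip(z, t):
            d = abs(zi - centre)
            if lo <= zi < hi and (best is None or d < best[0]):
                best = (d, ti)
        out.append(best[1] if best else None)
    return out
```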
-- Central filter: a locally-weighted mean is evaluated for each bin of each
cast. The weights are nearly constant close to the bin centre and then fall
off; the lengthscale of the filter gives 93% of the weight inside the bin
and approx zero weight at 1 binlength beyond the bin. A total weight
threshold means that a value will not be returned for a bin centred on a
data gap of more than 1.3 binlengths.
- weight function: 1/(1+(dist/lscl)^4), with lscl = binlength/3
Speed is between boxcar and subsample, and is faster with constant bin size.
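
A sketch of the central filter using the weight function above. The actual
total-weight threshold is not stated here, so min_weight=0.5 is a
placeholder assumption, as are the function names.

```python
def central_weight(dist, bin_length):
    """Weight 1/(1+(dist/lscl)^4) with lscl = bin_length/3."""
    lscl = bin_length / 3.0
    return 1.0 / (1.0 + (dist / lscl) ** 4)


def central_filter_bin(z, t, centre, bin_length, min_weight=0.5):
    """Locally-weighted mean of t about one bin centre.

    Returns None when the total weight is below min_weight
    (hypothetical threshold value), e.g. over a large data gap.
    """
    wsum = 0.0
    tsum = 0.0
    for zi, ti in zip(z, t):
        w = central_weight(abs(zi - centre), bin_length)
        wsum += w
        tsum += w * ti
    if wsum < min_weight:
        return None
    return tsum / wsum
```

For a 10m bin the weight is 1 at the centre, about 0.16 at the bin edge and
about 0.002 at one binlength beyond the bin.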
-- All Values: no binning of individual profiles - stats done on ALL
observations in a given bin. This means that high res casts have much more
influence on stats than low res (esp. if latter are not interp to 2m).
This is also the slowest method.
--------- The stats:
The stats calculated are:
# mean (1 value [for given depth for given time period] required to calc)
# standard deviation (6 values required to calc)
# 1 2.5 97.5 & 99th percentiles (200 values required to calc)
Time periods are:
# no separate time periods if < 60 profiles
# yr/4, from start of year, if 60-120 profiles
# yr/6, from start of year, if 120-200 profiles
# monthly if >200 profiles
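
The rules above can be sketched as two small lookups. The listed ranges
overlap at exactly 120 and 200 profiles; this sketch assumes both of those
counts fall in the yr/6 case, which is a guess. Function names are
hypothetical.

```python
def periods_per_year(n_profiles):
    """Number of time periods the year is split into."""
    if n_profiles < 60:
        return 1   # no separate time periods
    if n_profiles < 120:
        return 4   # yr/4
    if n_profiles <= 200:
        return 6   # yr/6 (boundary handling at 120 and 200 is assumed)
    return 12      # monthly


def available_stats(n_values):
    """Stats that can be calculated from n_values in one bin and period."""
    stats = []
    if n_values >= 1:
        stats.append("mean")
    if n_values >= 6:
        stats.append("std")
    if n_values >= 200:
        stats.append("percentiles")
    return stats
```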
-- t0, t100, t250: simply finds the nearest value to the given depth, if
within 5.5m of that depth.
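
A minimal sketch of the nearest-value lookup. Whether "within 5.5m" includes
exactly 5.5m is not stated; this sketch assumes it does. The function name
is hypothetical.

```python
def value_near_depth(z, t, target, max_dist=5.5):
    """Return the t observed nearest to target depth, or None if the
    nearest observation is more than max_dist away."""
    best = None  # (distance, value)
    for zi, ti in zip(z, t):
        d = abs(zi - target)
        if d <= max_dist and (best is None or d < best[0]):
            best = (d, ti)
    return best[1] if best else None
```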
-- MLD1: interpolated depth at which T = T(ref) +/- 0.5, where T(ref) is the
first value between 5 and 10m. Note that it can be triggered by a T
inversion! Does not look below 20m gaps. There are all sorts of issues, like
what to do when the water is mixed to the bottom of the profile (eg inshore).
For now, just return a value where the test succeeds.
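
A sketch of the MLD1 test as described, assuming depths increase down the
cast, that a gap of exactly 20m also stops the search, and that None is
returned when the test never succeeds (e.g. mixed to the bottom). The
function name and arguments are hypothetical.

```python
def mld1(z, t, delta=0.5, max_gap=20.0):
    """Interpolated depth where T first differs from T(ref) by delta.

    T(ref) is the first value with 5 <= z <= 10. Returns None if there
    is no reference value, a large gap is met, or the test never succeeds.
    """
    ref = next((i for i, zi in enumerate(z) if 5.0 <= zi <= 10.0), None)
    if ref is None:
        return None
    t_ref = t[ref]
    for i in range(ref + 1, len(z)):
        if z[i] - z[i - 1] >= max_gap:
            return None  # do not look below large gaps
        diff_prev = t[i - 1] - t_ref
        diff_cur = t[i] - t_ref
        if abs(diff_cur) >= delta:
            # Interpolate to the signed threshold, so that both cooling
            # and inversions (warming) are located correctly.
            target = delta if diff_cur > 0 else -delta
            frac = (target - diff_prev) / (diff_cur - diff_prev)
            return z[i - 1] + frac * (z[i] - z[i - 1])
    return None
```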
-- MLD2: non-interpolated depth (ie depth of observation) where the gradient
first exceeds +/- 0.015 C/m below 9m. If true at the first obs below 9m,
work upwards until it fails. The idea is that we want to avoid near-surface
transients unless there is clearly a very shallow ML, in which case find
just how shallow it is. Does not look below 10m gaps. Again there are
issues with profiles mixed to the bottom.
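
A rough sketch of the MLD2 search. Several details are guesses: the gradient
between obs i-1 and i is attributed to obs i, the depth returned is that of
obs i, a gap of exactly 10m stops the search, and depths are assumed to
increase down the cast. The function name is hypothetical.

```python
def mld2(z, t, thresh=0.015, start_depth=9.0, max_gap=10.0):
    """Depth of the observation where |dT/dz| first exceeds thresh
    below start_depth, or None if not found."""
    n = len(z)

    def steep(i):
        # Gradient between obs i-1 and obs i (attributed to obs i).
        return abs((t[i] - t[i - 1]) / (z[i] - z[i - 1])) > thresh

    first = next((i for i in range(1, n) if z[i] > start_depth), None)
    if first is None:
        return None
    for i in range(first, n):
        if z[i] - z[i - 1] >= max_gap:
            return None  # do not look below large gaps
        if steep(i):
            if i == first:
                # Already steep at the first obs below start_depth:
                # work upwards until the test fails.
                while i > 1 and steep(i - 1):
                    i -= 1
            return z[i]
    return None
```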
-- t(z). Simple stats using one of the binning methods.
-- z(t). Simple stats using one of the binning methods.
-- integrated raw t: will crash on depth inversions!
Simply integrate t observations from the surface to the bin boundaries,
stopping at a data gap of 10+m.
-- integrated bin t: will crash on depth inversions!
Integrate t interpolated to bin boundaries, from the surface to the bin
boundaries. Presently no gap criterion, so linear interpolation could give
poor results for gappy or non-filled low-res casts.
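
A sketch of one reading of "integrated bin t": interpolate t linearly onto
the bin boundaries, then sum trapezoids between boundaries. It assumes the
first boundary is the surface and raises an error on depth inversions
instead of crashing. As in the description, there is no gap criterion.
Function names are hypothetical.

```python
def interp_linear(z, t, z0):
    """Linear interpolation of t at depth z0; None outside the cast."""
    for i in range(1, len(z)):
        if z[i - 1] <= z0 <= z[i]:
            frac = (z0 - z[i - 1]) / (z[i] - z[i - 1])
            return t[i - 1] + frac * (t[i] - t[i - 1])
    return None


def integrate_binned(z, t, boundaries):
    """Integral of t from boundaries[0] to each later boundary."""
    if any(b <= a for a, b in zip(z, z[1:])):
        raise ValueError("depth inversion: z must increase")
    tb = [interp_linear(z, t, zb) for zb in boundaries]
    totals = []
    running = 0.0
    for k in range(1, len(boundaries)):
        if running is None or tb[k - 1] is None or tb[k] is None:
            running = None  # boundary lies outside the cast
        else:
            dz = boundaries[k] - boundaries[k - 1]
            running += 0.5 * (tb[k - 1] + tb[k]) * dz
        totals.append(running)
    return totals
```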
-- dtdz: apply a binning method, then calc dt/dz between bin values (so dtdz
applies between bin centres and is allocated to the boundaries between
those bins; it's a bit tricky to work out how to define the bins to end up
with the desired result).
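
A minimal sketch of the gradient step, showing how each gradient ends up at
the midpoint between two bin centres (the shared boundary when bins are of
equal size). The function name is hypothetical, and empty bins are assumed
to be marked with None.

```python
def gradient_between_bins(centres, values):
    """Return (depth, dt/dz) pairs located midway between bin centres."""
    out = []
    for k in range(1, len(centres)):
        z_mid = 0.5 * (centres[k - 1] + centres[k])
        if values[k - 1] is None or values[k] is None:
            out.append((z_mid, None))  # no gradient next to an empty bin
        else:
            dz = centres[k] - centres[k - 1]
            out.append((z_mid, (values[k] - values[k - 1]) / dz))
    return out
```

So to get gradients at 10m and 20m, the bins would need centres at 5, 15
and 25m.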
-- dzdt: same as for dtdz, except z gradient in t space.