spike.plugins package¶
Submodules¶
spike.plugins.Bruker_NMR_FT module¶
This plugin implements the set of Fourier transforms used for NMR, and some other related utilities. Commands starting with bk_ emulate (kind of) the corresponding TopSpin commands.
MAD September 2015
class spike.plugins.Bruker_NMR_FT.Bruker_NMR_FT(methodName='runTest')[source]¶
Bases: unittest.case.TestCase
spike.plugins.Bruker_NMR_FT.bruker_corr(self)[source]¶
applies a correction on the spectrum for the time offset in the FID. The time offset is stored in the axis property zerotime.
spike.plugins.Bruker_NMR_FT.bruker_proc_phase(self)[source]¶
applies a correction on the spectrum for the time offset in the FID, using parameters from the proc file.
spike.plugins.Bruker_NMR_FT.ftF1(data)[source]¶
emulates Bruker ft of a 2D in F1, depending on FnMode:
None=0, QF=1, QSEQ=2, TPPI=3, States=4, States-TPPI=5, Echo-AntiEcho=6
spike.plugins.Bruker_NMR_FT.ftF2(data)[source]¶
emulates Bruker ft of a 2D in F2, depending on FnMode
spike.plugins.Bruker_NMR_FT.ft_n_p(data, axis='F1')[source]¶
F1 Fourier transform for N+P (echo/antiecho) 2D
spike.plugins.Bruker_NMR_FT.ft_phase_modu(data, axis='F1')[source]¶
F1 Fourier transform for phase-modulated 2D
spike.plugins.Bruker_NMR_FT.ft_seq(data)[source]¶
performs the Fourier transform of a data-set acquired on a Bruker in simultaneous mode (Bruker QSIM mode). Processing is performed only along the F2 (F3) axis if in 2D (3D).
spike.plugins.Bruker_NMR_FT.ft_sh_tppi(data, axis='F1')[source]¶
States-Haberkorn / TPPI F1 Fourier transform
spike.plugins.Bucketing module¶
A set of tools for computing bucketing of 1D and 2D NMR spectra
First version by DELSUC Marc-André on 2015-09-06, extended in 2017.
This plugin implements the bucketing routines developed in the work:
Automatic differential analysis of NMR experiments in complex samples. Laure Margueritte, Petar Markov, Lionel Chiron, Jean-Philippe Starck, Catherine Vonthron-Sénécheau, Mélanie Bourjot, and Marc-André Delsuc. Magn. Reson. Chem., (2018) 80 (5), 1387. http://doi.org/10.1002/mrc.4683
It implements 1D and 2D bucketing. Each bucket has a constant, programmable size in ppm; for each bucket, the following properties are computed:
center, normalized area, max, min, standard deviation, bucket_size
The results are printed in CSV format, either on screen or into a file.
class spike.plugins.Bucketing.BucketingTests(methodName='runTest')[source]¶
Bases: unittest.case.TestCase
spike.plugins.Bucketing.bucket1d(data, zoom=(0.5, 9.5), bsize=0.04, pp=False, sk=False, thresh=10, file=None)[source]¶
This tool performs a bucket integration of the current 1D data-set. You will have to give the following values (all spectral values are in ppm):
- zoom (low, high): the starting and ending ppm of the integration zone in the spectrum
- bsize: the size of the bucket
- pp: if True, the number of peaks in each bucket is also added; peaks are detected if their intensity is larger than thresh*noise
- sk: if True, skewness and kurtosis are computed for each bucket
- file: the filename to which the result is written
For a better bucket integration, you should be careful that:
- the bucket size is not too small; size is better than number!
- the baseline correction has been carefully done
- the spectral window is correctly determined to encompass the meaningful spectral zone.
spike.plugins.Bucketing.bucket2d(data, zoom=((0.5, 9.5), (0.5, 9.5)), bsize=(0.1, 0.1), pp=False, sk=False, thresh=10, file=None)[source]¶
This tool performs a bucket integration of the current 2D data-set. You will have to give the following values (all spectral values are in ppm):
- zoom (F1limits, F2limits): the starting and ending ppm of the integration zone in the spectrum
- bsize (F1, F2): the sizes of the bucket
- pp: if True, the number of peaks in each bucket is also added; peaks are detected if their intensity is larger than thresh*noise
- sk: if True, skewness and kurtosis are computed for each bucket
- file: the filename to which the result is written
For a better bucket integration, you should be careful that:
- the bucket size is not too small; size is better than number!
- the baseline correction has been carefully done
- the spectral window is correctly determined to encompass the meaningful spectral zone.
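A minimal usage sketch for bucket1d()/bucket2d(), assuming a processed, baseline-corrected dataset d with the plugins loaded (zones and file name are illustrative; file handling follows the description above):

    import spike                 # importing spike loads the plugins
    # d is a 1D NPKData, phased and baseline corrected, current unit in ppm
    d.bucket1d(zoom=(0.5, 9.5), bsize=0.04, pp=True, file="buckets.csv")
    # the 2D version takes a pair of values per parameter, one per axis:
    # d2.bucket2d(zoom=((0.5, 9.5), (0.5, 9.5)), bsize=(0.1, 0.1))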
spike.plugins.FTMS_calib module¶
A utility for the calibration of MS experiments, based on a list of experimentally measured m/z values and the corresponding theoretical ones. The method will reduce the difference between the two lists.
It adds the ppm(), ppm_error() and display_icalib() methods, and the imzmeas and mzref attributes, to the FTMSAxis object.
Adds the following methods to FTICR datasets:
set_calib(mzmeas, mzref, axis=1), calib(axis=1, method='l1', verbose=False), display_calib(axis=1, compare=False)
and to FTICR axes:
display_icalib(xref, mzref, symbol='bo'), ppm_error(xref, mzref), ppm(xref, mzref)
and the following attributes, to FTICR datasets:
RefAxis: a backup FTICRAxis, used to store the previous calibration
and to FTICR axes:
mzref: list of m/z of the reference values; imzmeas: list of peak indices of the reference peaks (to be matched with mzref)
spike.plugins.FTMS_calib.calib(npk, axis=1, method='l1', verbose=False)[source]¶
the current FTMS experiment is recalibrated optimally along its axis 'axis' (useful only in 2D), using parameters provided with set_calib(); uses the current (2 or 3 parameter) calibration
method is either 'l1' (robust) or 'l2' (classic)
The current calibration is copied to a new unused axis called RefAxis
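A minimal sketch of the calibration workflow described above, assuming an FTICR dataset d (the m/z values are placeholders):

    mzmeas = [118.0861, 322.0483]          # measured m/z of known peaks
    mzref  = [118.086255, 322.048121]      # their theoretical values
    d.set_calib(mzmeas, mzref, axis=1)     # attach the reference lists
    d.calib(axis=1, method='l1')           # robust recalibration
    d.display_calib(axis=1, compare=True)  # compare with the previous calibration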
spike.plugins.FTMS_calib.calib_loadref(npkd, fname, recalibrate=False, axis=1)[source]¶
Reads a *.ref Bruker file holding a set of calibrating values for MS, with the following format:

    # comments
    # TuneMixPos m/z charge
    C5H12O2N 118.086255 +1
    C6H19O6N3P3 322.048121 +1
    …

and associates the reference list to the experiment, preparing for recalibration.
if recalibrate is True, the values will be recomputed, with entries interpreted as formulas or peptides (one-letter code)
spike.plugins.FTMS_calib.display_calib(npkd, axis=1, compare=False)[source]¶
generates a plot of the current calibration; if compare is True, will try to draw the previous calibration curve along with the current one
spike.plugins.FTMS_calib.distcalib(param, xref, mzref, axis)[source]¶
computes the residual to minimize when calibrating; basically a wrapper around axis.ppm_error
spike.plugins.FTMS_calib.icalib(npk, xind, ref, axis=1, method='l1', verbose=False)[source]¶
given a list of locations in index 'xind' and of theoretical values 'ref', the current FTMS experiment is recalibrated optimally along its axis 'axis' (useful only in 2D); uses the current (2 or 3 parameter) calibration
method is either 'l1' (robust) or 'l2' (classic)
The current calibration is copied to a new unused axis called RefAxis
spike.plugins.FTMS_calib.l1calib(param, xref, mzref, axis)[source]¶
computes the sum of residuals to minimize when calibrating; basically a wrapper around axis.ppm
spike.plugins.FTMS_calib.mzcalib(xind, ref, axis, method='l1')[source]¶
fits the axis parameters so that points located at xind lie closest to the ref values; fits two parameters if the 3rd one (ML3, i.e. axis.calibC) is zero
method = 'l2' uses Levenberg-Marquardt on the l2 norm: the classical method
method = 'l1' uses Powell on the l1 norm: the robust method
spike.plugins.FTMS_calib.ppm(axis, xref, mzref)[source]¶
computes the mean error in ppm from an array of positions (xref) and the theoretical m/z (mzref); uses the l1 norm!
xref: array of point coordinates of the reference points
mzref: array of reference m/z
spike.plugins.FTMS_calib.ppm_error(axis, xref, mzref)[source]¶
computes the error from an array of positions (xref) and the theoretical m/z (mzref); returns an array with errors in ppm
xref: array of point coordinates of the reference points
mzref: array of reference m/z
spike.plugins.Fitter module¶
set of functions for the peak fitter
Very first functional version - not finished!
requires the Peaks plugin to be installed
July 2016 M-A Delsuc
class spike.plugins.Fitter.FitTests(methodName='runTest')[source]¶
Bases: unittest.case.TestCase
Tests for the fitter; assumes the Peaks plugin is loaded
spike.plugins.Fitter.Lor(Amp, Pos, Width, x)[source]¶
One Lorentzian; Param contains in sequence Amp_i, Pos_i, Width_i
spike.plugins.Fitter.Spec(Param, x)[source]¶
x is the spectral coordinates; Param contains in sequence Amp_i, Pos_i, Width_i; all coordinates are in index
spike.plugins.Fitter.dSpec(Param, x, y=None)[source]¶
Param contains in sequence Amp_i, Pos_i, Width_i
spike.plugins.Fitter.display_fit(npkd, **kw)[source]¶
displays the result of the fit; accepts the same arguments as display()
spike.plugins.Fitter.fit(npkd, zoom=None)[source]¶
fits the 1D npkd data-set with Lorentzian line-shapes; the current peak list is used as initial values for the fit; only peaks within the zoom window are fitted
the fit is constrained around the initial values:
- intensity is not allowed to change by more than x0.5 to x2
- positions by more than 5 points
- width by more than x5
(constraints work only for scipy version >= 0.17)
It may help to use centroid() to pre-optimize the peak list before calling fit(), or to call fit() twice (slower).
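A minimal sketch, assuming a 1D dataset d with the Peaks plugin loaded (threshold and zoom values are placeholders):

    d.pp(threshold=1e5)         # the peak list provides the initial values
    d.centroid()                # optional pre-optimization of the peak list
    d.fit(zoom=(4000, 5000))    # fit Lorentzian line-shapes in the window
    d.display_fit()             # overlay the result of the fit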
spike.plugins.Fitter.residu(Params, x, y)[source]¶
the residue function; returns the vector Ycalc(Params) - y_experimental; can be used by leastsq
spike.plugins.Integrate module¶
A set of tools for computing integrals of 1D NMR spectra
If present, it can guess integral zones from an existing peak list. Adds .integrals to NPKDataset, an object with its own methods.
First version by DELSUC Marc-André in May 2019.
class spike.plugins.Integrate.Integrals(data, *args, calibration=None, bias=0.0, separation=3, wings=5, compute=True, **kwds)[source]¶
Bases: list
the class to hold a list of Integral items
an item is [start, end, curve as np.array(), value]; start and end are in index!
calibrate(calibration=None)[source]¶
computes the integration values from the curves; either uses calibration as a scale, or, if calibration is None, sets the largest integral to 100.0
display(integoff=0.3, integscale=0.5, color='red', label=False, labelyposition=None, regions=False, zoom=None, figure=None)[source]¶
displays the integrals
property integvalues¶
the list of calibrated values
property integzones¶
the list of (start, end) of the integral zones
peakstozones()[source]¶
computes the integral zones from the peak list
separation: if two peaks are closer than separation x width, they are aggregated; default = 3
wings: integral sides are extended by wings x width; default = 5
spike.plugins.Integrate.calibrate(npkd, entry, calib_value)[source]¶
on a dataset already integrated, the integrals are adapted so that the given entry is set to the given value.
spike.plugins.Integrate.display(npkd, integoff=0.3, integscale=0.5, color='red', label=False, labelyposition=None, regions=False, zoom=None, figure=None)[source]¶
spike.plugins.Integrate.integrate(npkd, **kw)[source]¶
computes integral zones and values from the peak list
separation: if two peaks are closer than separation x width, they are aggregated; default = 3
wings: integral sides are extended by wings x width; default = 5
bias: this value is subtracted from the data before integration
calibration: a coefficient by which all integrals are multiplied; if None (default), the largest is set to 100
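A minimal sketch, assuming a phased, baseline-corrected 1D dataset d (parameter values are the defaults):

    d.pp()                                   # the peak list is used to guess the zones
    d.integrate(separation=3, wings=5)       # creates d.integrals
    print(d.integrals.integzones)            # the (start, end) pairs, in index
    d.integrals.calibrate(calibration=None)  # largest integral set to 100.0
    d.integrals.display(label=True)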
spike.plugins.Linear_prediction module¶
plugin for the Linear Prediction algorithms in NPKData
class spike.plugins.Linear_prediction.LinpredicTests(methodName='runTest')[source]¶
Bases: unittest.case.TestCase
spike.plugins.Linear_prediction.lpext(npkd, final_size, lprank=10, algotype='burg')[source]¶
extends a 1D FID, or a 2D FID in F1, up to final_size, using lprank coefficients and the algotype algorithm
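A minimal sketch, assuming a truncated 1D FID d:

    # extend the FID to twice its size by Burg linear prediction
    d.lpext(2*d.size1, lprank=10, algotype='burg')
    # then apodise, zero-fill and Fourier transform as usual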
spike.plugins.PALMA module¶
complete DOSY processing, using the PALMA algorithm
This program uses the PALMA algorithm, presented in the manuscript
Cherni, A., Chouzenoux, E., & Delsuc, M.-A. (2017). PALMA, an improved algorithm for DOSY signal processing. Analyst, 142(5), 772–779. http://doi.org/10.1039/c6an01902a
see manuscript for details.
Authors: Afef Cherni, Emilie Chouzenoux, and Marc-André Delsuc
Licence: CC-BY-NC-SA https://creativecommons.org/licenses/by-nc-sa/4.0/
spike.plugins.PALMA.Import_DOSY(fname, nucleus=None, verbose=False)[source]¶
Import and calibrate a DOSY data-set from a Bruker ser file
spike.plugins.PALMA.Import_DOSY_proc(fname, nucleus='1H', verbose=False)[source]¶
Import and calibrate a DOSY data-set from a Bruker 2rr file
spike.plugins.PALMA.PPXAplus(K, Binv, y, eta, nbiter=1000, lamda=0.1, prec=1e-12, full_output=False)[source]¶
performs the PPXA+ algorithm
K: a MxN matrix which transforms from data space to image space
Binv: inverse of (Id + K.t K)
y: a M-vector containing the data
a: an estimate of sum(x), where x is the final image - used as a Bayesian prior on x
eta: an estimate of the standard deviation of the noise in y
nbiter: maximum number of iterations to perform
lamda: in [0..1], the weight of l1 vs MaxEnt regularisation; lamda = 0 is full l1, lamda = 1 is full MaxEnt
prec: precision of the result; the algorithm will stop if steps are below this level
full_output: if True, additional terms are computed during convergence (slower): parameters = (lcrit, lent, lL1, lresidus), with lcrit the optimized criterion, lent the evolution of -entropy, lL1 the evolution of L1(x), and lresidus the evolution of the distance ||Kx-y||; if False, parameters is the number of performed iterations
returns (x, parameters), where x is the computed optimal image
spike.plugins.PALMA.approx_lambert(x)[source]¶
approximation of W(exp(x)); no error below 50, and less than 0.2% error for x>50; converges toward 0 for x -> -inf; does not overflow, does not produce NaN
spike.plugins.PALMA.auto_damp_width(d)[source]¶
uses the tab buffer to determine the optimum dmin and dmax for ILT processing
spike.plugins.PALMA.calibdosy(litdelta, bigdelta, recovery=0.0, seq_type='ste', nucleus='1H', maxgrad=50.0, maxtab=50.0, gradshape=1.0, unbalancing=0.2, os_tau=None, os_version=1)[source]¶
returns the DOSY calibrating factor from the parameters
bigdelta (float): "Big Delta", the diffusion delay in msec
litdelta (float): "little delta", the gradient duration in msec
seq_type (enum "pgse", "ste", "bpp_ste", "ste_2echoes", "bpp_ste_2echoes", "oneshot" / default "ste"): the type of DOSY sequence used
- pgse: the standard Hahn echo sequence
- ste: the standard stimulated echo sequence
- bpp_ste: ste with bipolar gradient pulses
- ste_2echoes: ste compensated for convection
- bpp_ste_2echoes: bpp_ste compensated for convection
- oneshot: the Oneshot sequence from Pelta, Morris, Stchedroff, Hammond, 2002, Magn. Reson. Chem. 40, p147; uses unbalancing=0.2, os_tau=None, os_version=1; unbalancing is called alpha in the publication, os_tau is called tau in the publication, os_version=1 corresponds to equation (1), os_version=2 to equation (2)
nucleus (enum "1H", "2H", "13C", "15N", "17O", "19F", "31P" / default "1H"): the observed nucleus
recovery (float): the gradient recovery delay
maxgrad (float): the maximum amplifier gradient intensity, in G/cm / default 50.0
maxtab (float): the maximum tabulated gradient value in the tabulated file / default 50.0; Bruker users with a gradient list in G/cm (difflist) use maxgrad here; Bruker users with a gradient list in % use 100 here; Varian users use 32768 here
gradshape (float): an integral factor depending on the gradient shape used / default 1.0; typical values are 1.0 for rectangular gradients, 0.6366 = 2/pi for sine-bell gradients, 0.4839496 for a 4% truncated gaussian (Bruker gauss.100 file); Bruker users using difflist use 1.0 here, as it is already included in difflist
spike.plugins.PALMA.criterion(x, K, y, lamda, a)[source]¶
Computes the regularization function (not used during iteration without full_output)
spike.plugins.PALMA.dcalibdosy(npk, nucleus='1H')[source]¶
uses the stored parameters to determine the correct DOSY calibration
spike.plugins.PALMA.determine_seqtype(pulprog)[source]¶
given the PULPROG name, determines which seq_type is to be used; PULPROG should follow the standard Bruker naming scheme
spike.plugins.PALMA.do_palma(npkd, miniSNR=32, mppool=None, nbiter=1000, lamda=0.1, uncertainty=1.2, precision=1e-08)[source]¶
performs the PALMA computation on each column of a 2D dataset; the dataset should have been prepared with prepare_palma()
the noise in the initial spectrum is analysed on the first DOSY increment, then each column is processed with palma() if its intensity is sufficient
miniSNR: the minimum signal-to-noise ratio of the signal for allowing the processing
mppool: if passed as a multiprocessing.Pool, it will be used for parallel processing
the other parameters are transparently passed to palma()
spike.plugins.PALMA.eval_dosy_noise(x, window_size=9, order=3)[source]¶
estimates the noise in x by computing the difference from a polynomial fit
input: x - a real vector
return: the noise level
spike.plugins.PALMA.palma(npkd, N, nbiter=1000, uncertainty=1.0, lamda=0.1, precision=1e-08, full_output=False)[source]¶
performs the PALMA computation on a 1D dataset containing a decay; the dataset should have been prepared with prepare_palma; the noise is estimated, then the PPXA+ algorithm is applied
nbiter: maximum number of iterations of the PALMA algorithm
uncertainty: the noise estimated on the dataset is multiplied by this value; uncertainty=1 is full confidence in the noise evaluation algorithm, uncertainty>1 allows more room for the algorithm when data quality is poor
lamda: the balance between entropy and l1; lamda = 0 is full l1, lamda = 1 is full entropy
precision: the required precision for convergence
full_output: used for debugging purposes, do not use in production
check the PPXAplus() doc for details
spike.plugins.PALMA.prepare_palma(npkd, finalsize, Dmin, Dmax)[source]¶
prepares a DOSY dataset for processing: computes the experimental values from the imported parameter file, and prepares the DOSY transformation matrix
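A minimal sketch of the complete workflow, assuming a Bruker DOSY experiment on disk and that the plugin registers prepare_palma() and do_palma() as dataset methods, as in the other plugins (path, sizes, and diffusion limits are placeholders):

    from spike.plugins.PALMA import Import_DOSY
    d = Import_DOSY("expname/1/ser")       # import and calibrate the 2D DOSY
    # ... process the F2 (spectral) axis as usual: FT, phasing, baseline ...
    d.prepare_palma(finalsize=256, Dmin=10.0, Dmax=10000.0)  # build the transform matrix
    d.do_palma(nbiter=1000, lamda=0.1)     # inverse Laplace along F1, column by column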
spike.plugins.PALMA.prox_l1_Sent(x, lamda, a)[source]¶
Computes the proximity operator of l1 + Shannon entropy
spike.plugins.Peaks module¶
set of functions for peak detection and display - 1D and 2D
Very first functional version - not finished!
Peak1D and Peak2D are simple objects, with attributes like Id, label, intens(ity), pos(ition), or width; the only added method is report() (returns a string)
Peak1DList and Peak2DList are python lists, with a few added methods:
- report (to stdio or to a file)
- largest: sorts in decreasing order of intensity
- other sorts can simply be done with peaklist.sort(key=lambda p: p.XXX), where XXX is any peak attribute (see the largest code)
Example of usage:

    # assuming d is a 2D NPKData / 1D will be just as simple
    d.pp()    # computes a peak picking over the whole spectrum,
              # using 3 x standard_deviation(d)
              # This is just a detection of all local maxima
    # We can be more specific:
    d.pp(threshold=5E5, zoom=((700, 1050), (300, 350)))
              # zoom is always in the currently active unit, defined with d.unit
    # this attaches the peak list to the dataset as d.peaks,
    # it is a list of Peak2D objects, with some added properties
    print("number of detected peaks: %d" % len(d.peaks))
    p0 = d.peaks[0]      # peaks have label, intensity and position attributes
    print(p0.report())   # and a report method
    # report has an additional format parameter which enables control of the output
    # we can call centroid to improve the accuracy and move the position
    # to the center of a fitted (2D) parabola
    d.centroid()
    # The peak list can be displayed on screen as simple crosses
    d.display_peaks()
    # The labels can be modified for specific purposes:
    for p in d.peaks:
        if 150 < p.posF2 < 1500:
            p.label = "%.2f x %.f" % (p.posF1, p.posF2)  # e.g. the coordinates in a certain zone
        else:
            p.label = ""                                 # and removed elsewhere
    d.display_peaks(peak_label=True)
    # peak lists can also be reported
    d.report_peak()
    # but also as a formatted stream, redirected to a file:
    output = open("my_peak_list.csv", "w")     # open the file
    output.write("# LABEL, INTENSITY, F1, Width, F2, width")
    d.report_peak(file=output, format="{1}, {4:.2f}, {2:.7f}, {5:.2f}, {3:.7f}, {6:.2f}")
    # argument order is id, label, posF1, posF2, intensity, widthF1, widthF2
    output.close()
Sept 2015 M-A Delsuc
class spike.plugins.Peaks.Peak(Id, label, intens)[source]¶
Bases: object
a generic class to store peaks. Defines:
Id: a unique integer
intens: the intensity (the height of the largest point)
area: the area/volume of the peak
label: a string
intens_err: the uncertainty of the previous value
area_err: ...
class spike.plugins.Peaks.Peak1D(Id, label, intens, pos)[source]¶
Bases: spike.plugins.Peaks.Peak
a class to store a single 1D peak. Defines, in addition to Peak:
pos: position of the peak in index, relative to the typed (real/complex) buffer
width: width of the peak in index
pos_err: uncertainty of the previous value
width_err: ...
full_format = '{}, {}, {}, {}, {}, {}, {}, {}, '¶
report(f=<function _identity>, format=None)[source]¶
prints the peak list
f is a function used to transform the coordinate; the identity function is the default; for instance you can use something like peaks.report(f=s.axis1.itop) to get ppm values on an NMR dataset
parameters are: Id, label, position, intens, width, intens_err, pos_err, width_err, in that order
By default only the first 4 fields are returned, with 2 digits, but the format keyword can change that. format values:
- None or "report": the standard value is used: "{}, {}, {:.2f}, {:.2f}" (so only the first four parameters are shown)
- "full": all parameters at full resolution ("{}; " * 8)
- any other string following the format syntax will do.
you can use any formatting syntax; so for instance the format "{1} : {3:.2f} F1: {2:.7f} +/- {4:.2f}" will remove the Id, show the position with 7 digits after the comma, and show the width
you can change the report and full default values by setting the pk.__class__.report_format and pk.__class__.full_format class attributes
report_format = '{}, {}, {:.2f}, {:.2f}'¶
class spike.plugins.Peaks.Peak1DList(*arg, **kwds)[source]¶
Bases: spike.plugins.Peaks.PeakList
stores a list of 1D peaks; contains the array version of the Peak1D object: self.pos is the numpy array of the positions of all the peaks, and self[k] is the kth Peak1D object of the list
display(peak_label=False, peak_mode='marker', zoom=None, show=False, f=<function _identity>, color='red', markersize=None, figure=None, scale=1.0, NbMaxPeaks=1000)[source]¶
displays 1D peaks
zoom is in index
peak_mode is either "marker" or "bar"
NbMaxPeaks is the maximum number of peaks to display in the zoom window (shows only the largest)
f() should be a function which converts from points to the current display scale - typically npk.axis1.itoc
property pos¶
returns a numpy array of the positions in index
report(f=<function _identity>, file=None, format=None, NbMaxPeaks=1000)[source]¶
prints the peak list
f is a function used to transform the coordinates; the identity function is the default; for instance you can use something like d.peaks.report(f=d.axis1.itop) to get ppm values on an NMR dataset
check the documentation of Peak1D.report() for details on the output format
class spike.plugins.Peaks.Peak2D(Id, label, intens, posF1, posF2)[source]¶
Bases: spike.plugins.Peaks.Peak
a class to store a single 2D peak. Defines, in addition to Peak:
posF1, posF2: positions in F1 and F2 of the peak in index, relative to the typed (real/complex) axes
widthF1, widthF2: widths of the peak in index
posF1_err, ...: uncertainty of the previous values
widthF1_err, ...
full_format = '{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, '¶
report(f1=<function _identity>, f2=<function _identity>, format=None)[source]¶
prints the peak list
f1, f2 are two functions used to transform the coordinates in F1 and F2 respectively; the identity function is the default; for instance you can use something like peaks.report(f1=s.axis1.itop, f2=s.axis2.itop) to get ppm values on an NMR dataset
printed parameters are: Id, label, posF1, posF2, intens, widthF1, widthF2, posF1_err, posF2_err, intens_err, widthF1_err, widthF2_err, in that order
By default only the first 5 fields are returned, with 2 digits, but the format keyword can change that. format values:
- None or "report": the standard value is used: "{}, {}, {:.2f}, {:.2f}, {:.2f}" (so only the first five parameters are shown)
- "full": all parameters at full resolution ("{}; " * 12)
- any other string following the format syntax will do.
you can use any formatting syntax; so for instance the format "{1} : {4:.2f} F1: {2:.7f} +/- {5:.2f} X F2: {3:.7f} +/- {6:.2f}" will remove the Id, show positions with 7 digits after the comma, and show the widths
report_format = '{}, {}, {:.2f}, {:.2f}, {:.2f}'¶
class spike.plugins.Peaks.Peak2DList(*arg, **kwds)[source]¶
Bases: spike.plugins.Peaks.PeakList
stores a list of 2D peaks; contains the array version of the Peak2D object: self.posF1 is the numpy array of the positions of all the peaks, and self[k] is the kth Peak2D object of the list
display(axis=None, peak_label=False, zoom=None, show=False, f1=<function _identity>, f2=<function _identity>, color=None, markersize=6, figure=None, NbMaxPeaks=1000)[source]¶
displays the 2D peak list
zoom is in index
f1 and f2 should be functions which convert from points to the current display scale - typically npk.axis1.itoc and npk.axis2.itoc
property posF1¶
returns a numpy array of the F1 positions in index
property posF2¶
returns a numpy array of the F2 positions in index
report(f1=<function _identity>, f2=<function _identity>, file=None, format=None, NbMaxPeaks=1000)[source]¶
prints the peak list
f1, f2 are two functions used to transform the coordinates in F1 and F2 respectively; the identity function is the default; for instance you can use something like d.peaks.report(f1=s.axis1.itop, f2=s.axis2.itop) to get ppm values on an NMR dataset
the file keyword allows to redirect the output to a file object
check the documentation of Peak2D.report() for details on the output format
class spike.plugins.Peaks.PeakList(*arg, **kwds)[source]¶
Bases: list
the class generic to all peak lists
property intens¶
returns a numpy array of the intensities
property label¶
returns an array of the labels
spike.plugins.Peaks.center(x, xo, intens, width)[source]¶
the centroid definition, used to fit the spectrum; x can be an nparray; FWHM is sqrt(2) x width.
spike.plugins.Peaks.center2d(yx, yo, xo, intens, widthy, widthx)[source]¶
the 2D centroid, used to fit 2D spectra - same as center()
yx is [x_0, y_0, x_1, y_1, ..., x_n-1, y_n-1] - 2*n long for n points; returns [z_0, z_1, ..., z_n-1]
spike.plugins.Peaks.centroid1d(npkd, npoints=3, reset_label=True)[source]¶
from peak lists determined with peak(), performs a centroid fit of the peak summit and width, using npoints values around the center (npoints has to be odd); computes the full width at half maximum and updates the data peak list
reset_label: when True (default), resets the labels of FTMS datasets
TODO: update uncertainties
spike.plugins.Peaks.centroid2d(npkd, npoints_F1=3, npoints_F2=3)[source]¶
from peak lists determined with peak(), performs a centroid fit of the peak summit and width; computes the full width at half maximum and updates the data peak list
TODO: update uncertainties
spike.plugins.Peaks.display_peaks(npkd, peak_label=False, peak_mode='marker', zoom=None, show=False, color=None, markersize=6, figure=None, scale=1.0, NbMaxPeaks=1000)[source]¶
displays the content of the peak list
peak_mode is either "marker" (default) or "bar" (1D only)
zoom is in the current unit.
spike.plugins.Peaks.peak_aggreg(pklist, distance)[source]¶
aggregates peaks in pklist if they are closer than a given distance in pixels
distance: if two peaks are closer than distance (in points), they are aggregated
spike.plugins.Peaks.peakpick(npkd, threshold=None, zoom=None, autothresh=3.0, verbose=True)[source]¶
performs a peak picking of the current experiment
threshold is the level above which peaks are picked; None (default) means that autothresh*(noise level of the dataset) will be used - using d.robust_stats() as a proxy for the noise level
zoom defines the region on which the detection is made; zoom is in currentunit (same syntax as in display); None means the whole data
spike.plugins.Peaks.peaks1d(npkd, threshold, zoom=None)[source]¶
math code for the NPKData 1D peak picker
spike.plugins.Peaks.pk2pandas(npkd, full=False)[source]¶
exports an extract of the current peak list to a pandas DataFrame, in the current unit; if full is False (default), the uncertainties are not listed; uses the nmr or ms version depending on data_type
spike.plugins.Peaks.pk2pandas_ms(npkd, full=False)[source]¶
exports an extract of the current peak list to a pandas DataFrame - for MS datasets
spike.plugins.Peaks.pk2pandas_nmr(npkd, full=False)[source]¶
exports an extract of the current peak list to a pandas DataFrame - for NMR datasets
spike.plugins.Peaks.report_peaks(npkd, file=None, format=None, NbMaxPeaks=1000)[source]¶
prints the content of the peak list, using the current unit
file should be an already opened, writable file stream; if None, output will go to stdout
for documentation, check Peak1D.report() and Peak2D.report()
spike.plugins.apmin module¶
Automatic phase correction for 1D NMR spectra
based on an earlier version from NPK
works by minimizing the negative part of the spectrum; WILL NOT WORK on positive/negative spectra (JMOD, W-LOGSY, etc.)
Created by DELSUC Marc-André on 2016-05-23. Copyright (c) 2016 IGBMC. All rights reserved.
spike.plugins.apmin.apmin(d, first_order=True, inwater=False, baselinecorr=True, apt=False, debug=False)[source]¶
automatic 1D phase correction; phases by minimizing the negative wing of the 1D spectrum
first_order = False inhibits optimizing the 1st order phase
inwater = True ignores the central zone of the spectrum
baselinecorr = True applies an advanced baseline correction on the final steps
apt = True (attached proton test) performs the phasing on up-down spectra, such as APT / DEPT 13C spectra.
performs a grid/simplex search on P0 first, then on (P0, P1); the dataset is returned phased, and the values are stored in d.axis1.P0 and d.axis1.P1
P1 is kept at 0 if first_order=False
note that if baselinecorr is True, the algorithm becomes quite slow! A simple linear baseline correction is always applied in any case.
adapted from NPK v1 MAD, may 2016
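A minimal sketch, assuming a Fourier-transformed 1D 1H spectrum d:

    d.apmin()                        # phase by minimizing the negative wing
    print(d.axis1.P0, d.axis1.P1)    # the phase values that were applied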
spike.plugins.apmin.neg_wing(d, bcorr=False, inwater=False, apt=False)[source]¶
measures the negative wing power of the NPKData d
if bcorr == True, a baseline correction is applied
if inwater == True, the 10% central zone is zeroed
if apt == False, computes the std() of the negative points (distance to the mean); if True, computes the sum(abs()) of all the points (l1 norm)
spike.plugins.bcorr module¶
set of functions for baseline correction
First version - not finished!
improved July 2016
spike.plugins.bcorr.autopoints(npkd, Npoints=8)[source]¶
computes Npoints (default 8) positions for a spline baseline correction
spike.plugins.bcorr.bcorr(npkd, method='spline', xpoints=None, nsmooth=0)[source]¶
recapitulates all baseline correction methods, only 1D so far
method is either:
- auto: uses bcorr_auto, an automatic determination of the baseline; does not work with negative peaks.
- linear: simple 1D correction
- spline: a cubic spline correction
both linear and spline use an additional list of pivot points 'xpoints' used to calculate the baseline:
if xpoints is absent, pivots are estimated automatically;
if xpoints is an integer, it determines the number of computed pivots (default is 8 if xpoints is None);
if xpoints is a list of integers, they will be used as pivots.
if nsmooth > 0, the buffer is smoothed by a moving average over 2*nsmooth+1 positions around the pivots.
default is spline with automatic detection of 8 baseline points
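A minimal sketch, assuming a 1D spectrum d (pivot positions are placeholders, in points):

    d.bcorr()                                              # spline with 8 automatic pivots
    d.bcorr(method='linear', xpoints=[80, 12000, 15300])   # explicit pivots
    d.bcorr(method='spline', xpoints=12, nsmooth=2)        # 12 pivots, smoothed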
spike.plugins.bcorr.bcorr_auto(npkd, iterations=10, nbchunks=40, degree=1, nbcores=2, smooth=True)[source]¶
applies an automatic baseline correction
Finds the baseline by using a low norm value and then a high norm value to attract the baseline toward the small values.
Parameters:
iterations: number of iterations for convergence toward the small values.
nbchunks: number of chunks on which the minimization is done. Typically, each chunk must be larger than the peaks.
degree: degree of the polynomial used for approximating each signal chunk.
nbcores: number of cores used for minimizing in parallel on many chunks (if not None).
smooth: if True, applies a final Savitzky-Golay smoothing.
spike.plugins.bcorr.get_ypoints(buff, xpoints, nsmooth=0)[source]¶
from buff and xpoints, returns ypoints = buff[xpoints], optionally smoothed by a moving average over 2*nsmooth+1 positions
spike.plugins.bokeh_display module¶
displays interactive plots through bokeh, and allows saving them in html format.
This plugin uses bokeh (https://bokeh.pydata.org/en/latest/) to build and display interactive plots. It adds npkd.bokeh_fig, a python dictionary containing the styles (colors, lines, ...), and npkd.bokeh_plot, containing the plot itself (bokeh figure format).
To make it work in the notebook, add:

    from bokeh.io import show, output_notebook
    output_notebook()

And do:

    npkd.bokeh(show=True)

To save the plot in html format after doing npkd.bokeh(show=True):

    from bokeh.resources import CDN
    from bokeh.embed import file_html
    html_text = file_html(npkd.bokeh_plot, CDN, "Title")
    with open("bokehplot.html", "w") as file:
        file.write(html_text)
spike.plugins.bokeh_display.bokeh_display(npkd, scale=1.0, autoscalethresh=3.0, absmax=None, show=False, title=None, label=None, xlabel='_def_', ylabel='_def_', axis=None, image=False, mode3D=False, zoom=None, mpldic={}, dbkdic={}, dfigdic={}, linewidth=1, color=None, plot_width=600, plot_height=400, sizing_mode=None, redraw=False, tools='pan, box_zoom, box_select, reset, save')[source]¶
Display using bokeh instead of matplotlib
scale: allows to increase the vertical scale of the display
absmax: overwrites the value of the largest point, which will then not be computed; the display is scaled so that the largest point is first computed (and stored in absmax), and then the value at absmax/scale is set full screen
show: will call bk.show() at the end, allowing every declared display to be shown on-screen; useless in an ipython/jupyter notebook
title: adds a title to the bokeh plot
label: adds a label text to the plot
xlabel, ylabel: axis labels (default is self.currentunit - use None to remove)
axis: used as the axis if present; the axis length should match the experiment length; in 2D, should be a pair (xaxis, yaxis)
image: if True, the function will generate the 2D NMR FID of the data; if False (default), the function presents contour plots.
mode3D: not implemented
zoom: a tuple defining the zoom window, (left, right) or ((F1_limits), (F2_limits)), defined in the current axis unit (points, ppm, m/z, etc.)
mpldic: a dictionary passed as is to the matplotlib plot command
dbkdic: a dictionary passed as is to populate the parameters of the bokeh graph
dfigdic: a dictionary passed as is to populate the content of the bokeh figure
linewidth: linewidth for the plots (useful for example when using seaborn)
color: in 1D, the color of the curve; in 2D FID, the palette name to be used; in 2D contour, the color set to be used by matplotlib.
plot_width, plot_height: width and height of the plot
sizing_mode: if provided, resizes the plot according to the window's chosen sizes, e.g. "scale_width", "scale_height", "scale_both"
tools: a string containing the tools to be made available for bokeh interactivity, e.g. "pan, box_zoom, box_select, reset, save" (see the bokeh doc for more info)
spike.plugins.bokeh_display.get_contour_data(ax)[source]¶
Gets information about the contours created by matplotlib. ax is the input matplotlib contour ax (cf. fig, ax produced by matplotlib); xs and ys are the different contour lines extracted from matplotlib; col is the color corresponding to the lines.
spike.plugins.fastclean module¶
A utility to set to zero all points below a given ratio
spike.plugins.fastclean.fastclean(npkd, nsigma=2.0, nbseg=20, axis=0)[source]¶
sets to zero all points below nsigma times the noise level. This allows the corresponding data-set, once stored to file, to be considerably more compressible.
nsigma (float): the ratio used, typically 1.0 to 3.0 (higher means more compression)
nbseg (int): the number of segments used for noise evaluation, see util.signal_tools.findnoiselevel
axis (int): the axis on which the noise is evaluated; default is the fastest varying dimension
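A minimal sketch, assuming a processed dataset d about to be stored:

    d.fastclean(nsigma=2.0)   # zero every point below 2 x the noise level
    # the data-set, once stored to file, is now considerably more compressible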
spike.plugins.gaussenh module¶
Gaussian enhancement apodisation
d.gaussenh(width, enhancement=1.0, axis=0)
applies a gaussian enhancement; width is in Hz, enhancement is the strength of the effect, axis is either F1 or F2 in 2D (0 is the default axis); multiplies by gauss(width) * exp(-enhancement*width)
Created by DELSUC Marc-André in February 2019. Copyright (c) 2019 IGBMC. All rights reserved.
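A minimal usage sketch, assuming a 1D FID d (the width value is a placeholder):

    d.gaussenh(20.0, enhancement=1.0)   # 20 Hz gaussian enhancement
    # then zero-fill and Fourier transform as usual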
spike.plugins.pg_sane module¶
plugin for the PG-Sane algorithm, used for NUS processing
It takes a NUS-acquired transient, and fills it in by estimating the missing values.
associated publications
Lionel Chiron, Afef Cherni, Christian Rolando, Emilie Chouzenoux, Marc-André Delsuc Fast Analysis of Non Uniform Sampled DataSets in 2D-FT-ICR-MS. - in progress
Bray, F., Bouclon, J., Chiron, L., Witt, M., Delsuc, M.-A., & Rolando, C. (2017). Nonuniform Sampling Acquisition of Two-Dimensional Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Increased Mass Resolution of Tandem Mass Spectrometry Precursor Ions. Anal. Chem., acs.analchem.7b01850. http://doi.org/10.1021/acs.analchem.7b01850
Chiron, L., van Agthoven, M. A., Kieffer, B., Rolando, C., & Delsuc, M.-A. (2014). Efficient denoising algorithms for large experimental datasets and their applications in Fourier transform ion cyclotron resonance mass spectrometry. PNAS , 111(4), 1385–1390. http://doi.org/10.1073/pnas.1306700111
spike.plugins.pg_sane.HT(x, thresh)[source]¶
returns the hard thresholding of x, i.e. all points such that x_i <= thresh are set to 0.0
spike.plugins.pg_sane.HTproj(x, k)[source]¶
returns the hard thresholding of x on the l0 ball of radius k, i.e. the k largest values are kept, all others are set to 0
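A minimal numpy sketch of the two operations defined above (an illustration of the definitions, not the plugin source; thresholding is shown here on magnitudes):

    import numpy as np

    def ht(x, thresh):
        # hard thresholding: zero every point with magnitude <= thresh
        y = x.copy()
        y[np.abs(y) <= thresh] = 0.0
        return y

    def htproj(x, k):
        # projection on the l0 ball: keep only the k largest values
        y = np.zeros_like(x)
        idx = np.argsort(np.abs(x))[-k:]   # indices of the k largest entries
        y[idx] = x[idx]
        return y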
spike.plugins.pg_sane.pg_sane(npkd, HTmode='projection', axis=1, iterations=10, HTratio=None, rank=20, Lthresh=2.0, sampling=None, size=None, final='SANE')[source]¶
Papoulis-Gershberg algorithm - stabilized with SANE
This function takes a FID with partial sampling, and fills the holes with estimated values. The FID can then be processed normally, as if it had been completely acquired.
HTmode ('threshold' or 'projection'): determines the PG algorithm
iterations: the number of iterations used by the program; pg_sane converges quite rapidly, and usually 5 to 20 iterations are sufficient; when the SNR is high, or for large data-sets (>8k), more iterations might be needed
HTratio: the number of lines kept by HT/projection - usually a few % of the whole spectrum; default is 0.01 (1%)
rank: a rough estimate of the number of lines present in the FT spectrum of the FID - not stringent - if in doubt, use a slightly larger value
Lthresh: the multiplier for HT/threshold, usually 2.0; lower recovers more weak signals but requires more iterations; higher requires fewer iterations but looks only at the larger peaks
sampling: if provided, the sampling index list; npkd is then supposed to be zerofilled (missing values set to zero); if None, npkd.axis1.sampled should be True, and the sampling will be fetched via npkd.axis1.get_sampling()
final: the final step after the iterations; default is 'SANE': better (noise-less) data-sets, the best reconstruction quality in simulations; 'PG' reapplies a PG step - produces the cleanest/most compressible data-sets; 'Reinject': the closest to the acquired data - to use if there is very little noise
size: if different from None, pg_sane will also extrapolate to size
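A minimal sketch, assuming a 2D NUS dataset d whose F1 sampling information is already attached (npkd.axis1.sampled is True):

    d.pg_sane(axis=1, iterations=10, rank=20)   # fill the missing F1 values
    # d can then be processed as a regularly sampled data-set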
spike.plugins.rem_ridge module¶
removes ridges in 2D
Created by Marc-André on 2011-08-15. Copyright (c) 2011 IGBMC. All rights reserved.
spike.plugins.sane module¶
plugin for Sane denoising
This plugin implements the SANE denoising algorithm. SANE is inspired from the urQRd algorithm, but is improved in several points:
- faster on vector lengths != 2**n
- much more efficient on weak signals
- requires fewer iterations and less overestimation of the rank
- however, a non-productive iteration is always performed, so the processing time for I iterations of SANE should be compared with I+1 iterations of urQRd.
associated publications:
- Bray, F., Bouclon, J., Chiron, L., Witt, M., Delsuc, M.-A., & Rolando, C. (2017). Nonuniform Sampling Acquisition of Two-Dimensional Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Increased Mass Resolution of Tandem Mass Spectrometry Precursor Ions. Analytical Chemistry, acs.analchem.7b01850. http://doi.org/10.1021/acs.analchem.7b01850
- Chiron, L., van Agthoven, M. A., Kieffer, B., Rolando, C., & Delsuc, M.-A. (2014). Efficient denoising algorithms for large experimental datasets and their applications in Fourier transform ion cyclotron resonance mass spectrometry. PNAS, 111(4), 1385–1390. http://doi.org/10.1073/pnas.1306700111
spike.plugins.sane.sane_plugin(npkd, rank, orda=None, iterations=1, axis=0, trick=True, optk=False, ktrick=False)[source]¶
Applies "sane" denoising to the data. rank is about 2 x number_of_expected_lines. Manages real and complex cases; handles the hypercomplex case, e.g. for denoising of 2D FTICR.
sane algorithm: the name stands for Support Selection for Noise Elimination. From a data series, returns a denoised series.
data: the series to be denoised - a (normally complex) numpy buffer
rank: the rank of the analysis
orda: the order of the analysis; internally, a Hankel matrix (M, N) is constructed, with M = orda and N = len(data)-orda+1; if None (default), orda = (len(data)+1)/2
iterations: the number of times the operation should be repeated
optk: if set to True, will calculate the rank giving the best recovery for an automatically estimated noise level
trick: permits to enhance the denoising by using a cleaned signal as the projective space ("support selection")
ktrick: if a value is given, it permits to change the rank on the second pass; the idea is that for the first pass a rank large enough has to be used to compensate for the noise, while for the second pass a lower rank can be used.
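A minimal sketch, assuming the plugin registers under the name sane (as the module name suggests) and a noisy FID d with about 15 expected lines:

    d.sane(rank=30)   # rank is about 2 x the number of expected lines
    # then process (apodisation, FT, ...) as usual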
spike.plugins.sg module¶
set of functions for Savitzky-Golay smoothing
spike.plugins.sg.sg(npkd, window_size, order, deriv=0, axis=0)[source]¶
applies a Savitzky-Golay filter of the given order to the data
window_size (int): the length of the window; must be an odd integer
order (int): the order of the polynomial used in the filtering; must be less than window_size - 1
deriv (int): the order of the derivative to compute (default = 0 means only smoothing)
axis (int): the axis on which the filter is to be applied; default is the fastest varying dimension
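A minimal sketch, assuming a noisy 1D dataset d:

    d.sg(window_size=11, order=3)            # 11-point cubic smoothing
    d.sg(window_size=11, order=3, deriv=2)   # or the 2nd derivative instead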
spike.plugins.sg.sg2D(npkd, window_size, order, deriv=None)[source]¶
applies a 2D Savitzky-Golay filter of the given order to the data
window_size (int): the length of the square window; must be an odd integer
order (int): the order of the polynomial used in the filtering; must be less than window_size - 1
deriv (None, 'col', or 'row'; 'both' mode does not work): the direction of the derivative to compute (default = None means only smoothing)
can be applied to 2D data only.
spike.plugins.test module¶
Test procedure for plugins
spike.plugins.urQRd module¶
plugin for the urQRd denoising method
spike.plugins.wavelet module¶
A plugin which installs wavelet denoising
This plugin is based on the PyWavelets library, which should be installed independently before trying to use this plugin. It can be found at: http://www.pybytes.com/pywavelets/
M-A Delsuc, april 2016, from an idea by L Chiron
class spike.plugins.wavelet.WaveLetTest(methodName='runTest')[source]¶
Bases: unittest.case.TestCase
Testing the Wavelet plugin
spike.plugins.wavelet.denoise1D(data, noiseSigma, wavelet='db3')[source]¶
performs the 1D denoising
data: a 1D numpy array
wavelet: the wavelet basis used
spike.plugins.wavelet.denoise2D(data, noiseSigma, wavelet='db3')[source]¶
performs the 2D denoising
data: a 2D numpy array
wavelet: the wavelet basis used
spike.plugins.wavelet.wavelet(npkd, nsigma=1.0, wavelet='db3')[source]¶
Performs the wavelet denoising of a 1D or 2D spectrum.
nsigma: the threshold is nsigma times the estimated noise level; the default 1.0 corresponds to a relatively strong denoising
wavelet: the wavelet basis used, default 'db3' (Daubechies 3); check pywt.wavelist() for the list of possible wavelets
eg: d.wavelet(nsigma=0.5) # d is cleaned after execution
ref: Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans Inf Theory 41:613–621.
Based on the PyWavelets library
spike.plugins.zoom3D module¶
Module contents¶
Plug-ins for the Spike package
All the plugin files located in the spike/plugins folder are loaded automatically when importing spike for the first time.
The variable spike.plugins.plugins contains the list of the loaded plugin modules.
It is always possible to load a plugin afterwards, by importing the plugin definition at a later time during run-time.
Each plugin file should define the needed functions:

    def myfunc(npkdata, args):
        "myfunc doc"
        # ...do whatever, assuming npkdata is a NPKData
        return npkdata   # THIS is important, that is the standard NPKData mechanism

and register them into NPKData as follows:

    NPKData_plugin("myname", myfunc)

then, any NPKData will inherit the myname() method
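A complete minimal plugin following this mechanism might look as follows (a sketch: the file name, function name, and behavior are invented for illustration):

    # file spike/plugins/my_plugin.py (hypothetical)
    import numpy as np
    from spike.NPKData import NPKData_plugin

    def normalize(npkd, value=1.0):
        "rescales the data-set so that its largest point equals value"
        buf = npkd.get_buffer()
        npkd.set_buffer(value * buf / np.max(np.abs(buf)))
        return npkd   # returning the dataset is the standard NPKData mechanism

    NPKData_plugin("normalize", normalize)   # d.normalize(100.0) is now available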
For the moment, only NPKData plugins are handled.
spike.plugins.load(debug=True)[source]¶
the load() function is called at initialization, and loads all the files found in the plugins folder