PFChrom v5 Documentation Contents            AIST Software Home            AIST Software Support

Smooth


This option is only available from the main window's Smooth item in the Data menu.

The Smooth option is used to smooth noisy data. This is not normally needed. The placement algorithms within PFChrom use smoothing algorithms to create a smoothed versions of the data which are used only to detect and place peaks. You should not need to smooth data in order to isolate peaks of interest and fit them successfully. Smoothing can introduce bias and artifacts if care is not taken.

There are few instances where pre-smoothing may be desirable or necessary:

1. If you have cumulative data which you wish to fit with PFChrom, you will need to generate smooth first derivative information containing the peaks which can then be fit.

2. If you have very low signal data with digitization effects, those step effects can interfere with the baseline identification/fitting and IRF Deconvolution algorithms. These can also make hidden peak fitting more difficult. A separate smoothing will ensure adverse effects from integer signal digitization are minimized.

3. If you have a noisy baseline where very low signal peak information of interest is buried in that baseline, a separate visual smoothing will allow you to optimize the S/N extraction of those peaks. You may find one particular smoothing procedure extracts the peaks you feel are likely to be real. Although all of these smoothing methods will be available within the Hidden Peaks - Residuals procedure for peak identification, you may find it easier to visually extract the peaks from the baseline in a separate smoothing step. Preprocessing what is likely to be a rather delicate baseline will also be appreciably easier after such smoothing.

4. If you have noisy baselines with a sharp curvature and multiple peaks, especially those resting on oscillations or turns in that baseline, and you wish to use a non-parametric or spline baseline model to retain as much of the curvature as possible below the peaks, smoothing may allow a more accurate baseline.

PFChrom Graph

The raw and smoothed data are shown in a PFChrom Graph for rapid determination of the effectiveness of smoothing.

v5_SmoothingDlg.png

There are four methods of smoothing available:

FFT Filtering

Loess

Gaussian Convolution

Savitzky-Golay

The Savitzky-Golay algorithm can also be used to create a smooth first through fourth derivative directly from the raw data.

Level (% Smoothing)

The % smoothing for each of the algorithms defines the breadth of a smoothing or filtering window.

For the FFT Filtering, the % smoothing controls the channels that are zeroed in the frequency domain. A 10% smoothing level zeroes the upper 1/2 of the frequency channels. A value of 20% zeroes the upper 3/4 of the channels. A value of 50 zeroes the upper 9/10 of the channels. For peak-type data, the optimum smoothing level is generally between 25 and 40%.

For the Loess and Savitzky-Golay procedures, the % smoothing determines the actual smoothing window. A 10% smoothing level results in a smoothing window containing 10% of the data points.

For the Gaussian Convolution smoothing, the % smoothing corresponds with the Gaussian FWHM being convolved with the data. A 5% smoothing levels convolves a Gaussian having a FWHM 5 times the average X spacing (sampling interval) of the data set. The Gaussian convolution also includes an intrinsic frequency domain filtration which is automatically determined.

You can modify the % smoothing by entering the desired value, by changing the value with the up or down spin buttons, or you may right click and hold down the right mouse button on the edit field and spin controls, selecting the level from the popup menu. After a brief delay, the smoothed data will be reflected within the graph.

AI Expert

Generate/6994.gif In order to make the determination of smoothing levels as automatic as possible, PFChrom offers an AI option which seeks to automatically determine the optimum smoothing level. This is that level of smoothing which offers the greatest possible noise reduction without adversely affecting features within the data.

Depending on the algorithm, the AI Expert is based upon either a first derivative of the overall standard error (between raw and smoothed) relative to smoothing power, or a first derivative of the noise reduction relative to smoothing power.

The AI Expert option will work very well for most cases. It is most easily fooled on data sets where relatively few points determine an individual peak, and on data sets where the noise is not consistent across the X range of the data.

FFT Filtering

This is essentially the automated version of what you can do manually in the Fourier Domain Filtering option. This option removes any linear trend which might appear as a low frequency component in the FFT, assumes the data is linearly spaced, performs a forward FFT, zeroes the higher frequency components, performs an inverse FFT, restores the linear trend, and presents the smoothed data, all in a single automated step.

Because the signal tends to appear only at low frequencies, PFChrom uses the aforementioned non-linear smoothing scale.

Linearly-spaced X values are recommended but not always required for this algorithm to offer a very effective smoothing. Any non-uniformity of X values does introduce noise, but such may be small compared to the overall noise reduction achieved by the filtering.

Because of the effectiveness of the FFT, this algorithm is quite fast with large data sets.

In lowpass filtering, any low frequency harmonic oscillations that are present will pass without being filtered. If you see unwanted sinusoidal behavior within the baseline areas, you should probably use a time domain procedure (Savitzky-Golay or Loess).

Loess

The Loess procedure is a locally weighted regression smoothing algorithm that performs a full least-squares fit for each data point. Its strength lay in its ability to effectively manage data with non-uniformly spaced X values. This is most numerically intense of the smoothing algorithms and quite slow with large data sets.

The PFChrom implementation of Loess fits a linear model. In general, this means a tendency to see some attenuation of the peaks at higher smoothing levels. As such, you will probably find the Loess procedure the least attractive when you are working with data having equally spaced X-values. If you are working with data containing non-uniform X-values, and you do not wish to create a uniform set using PFChrom's Interpolate/Upsample option, Loess may be the only algorithm that produces acceptable results.

Gaussian Convolution

This algorithm is unique to PFChrom. It uses an automatic FFT Filtering for global smoothing, and convolves a narrow width Gaussian for local smoothing.

In the Gaussian Convolution smoothing, any linear trend in the data is removed, the data is transformed to the frequency domain, and there a Gaussian response function is convolved with the data. The frequency domain data is then automatically filtered, the inverse FFT is made, and any linear trend is restored.

The frequency domain filtering is automatically determined with this algorithm, and it will have some dependency on the Gaussian width used in the convolution. The % smoothing specifies only the FWHM of the Gaussian convolving the data, this as a multiple of the average X-spacing in the data set. As such, this algorithm will generally require % smoothing levels between 2 and 5, this corresponding with Gaussian response function FWHM of 2x to 5x the average sampling interval. Anything greater is likely to produce unwanted attenuation of the peaks. A response function convolution procedure does conserve peak area, however, so such attenuation is not as harmful as might be the case in a time domain procedure.

This algorithm will probably produce the highest degree of noise reduction of the four methods, but bears the same limitations as the FFT Filtering. As such, this algorithm is recommended only for uniformly spaced X-data, although very satisfactory results can sometimes be achieved lacking such. Since the FFT procedures are quite fast, the algorithm is efficient with large data sets, though somewhat slower than the simple FFT Filtering.

Savitzky-Golay

This time-domain method of smoothing is based on least squares quartic polynomial fitting across a moving window within the data. The method was originally designed to preserve the higher moments within spectral data. The algorithm has been modified by PFChrom and will offer a higher level smoothing than that traditionally associated with Savitzky-Golay. The PFChrom implementation of the Savitzky-Golay algorithm employs sequential internal smoothing passes to improve overall noise reduction.

All derivatives also use sequential smoothing passes. The number of such passes is automatically determined.

The higher order of polynomial makes it possible to achieve a high level of smoothing without peak attenuation. Because this algorithm relies on the linearity of an unweighted polynomial model, it is also quite efficient with large data sets.

Unlike the FFT procedures which have some tolerance for non-uniformly spaced X-values, the Savitzky-Golay procedure does require a constant X-spacing within the data. If you wish to use the Savitzky-Golay procedure to smooth your data, or if you wish to find hidden peaks using the Hidden Peaks - Second Derivative option, you should use PFChrom's Interpolate/Upsample option to create data with a constant X spacing.

PFChrom uses the Savitzky-Golay procedure exclusively to produce a smoothed second derivative in the Hidden Peaks - Second Derivative option.

Equivalent Noise %

PFChrom contains an algorithm which estimates the measure of Gaussian (white) noise present within a given data set. For the principal data set in the main window, PFChrom reports the equivalent noise levels for both the raw and smoothed data as well as an overall % noise reduction.

This estimate only determines how smooth given data is, and does not serve as an indicator for oversmoothing. In general, oversmoothing is easily observed visually in the attenuation of peak amplitudes.

Multiple Data Sets

If you have multiple data sets present, all will be shown in this procedure, irrespective of whether or not a data set is selected in the main window of the program. The selection in the main screen applies only to the View and Compare Data options and the Local Maxima Peaks, Hidden Peaks - Residuals, and the Hidden Peaks - Second Derivative fitting options.

When the OK is selected with all data sets shown, all of the data sets will be smoothed as shown and saved as a smoothed level in the overall PDF data file. Click Cancel to exit without altering the data.

There is no option to custom smooth a specific data set. If you wish to see one data set more closely, you can double click that graph (or right click and select the Plot This Data Set option from the popup menu) to see or smooth that specific data set. If you smooth this set, however, the global smoothing settings will update, and all data sets will be smoothed with these new settings.

After viewing or smoothing a given set, you can click OK, double click the graph, or right click the graph and choose Plot All Data Sets from the popup menu. Clicking OK when you are within a single plot returns you to the dialog with all data sets rendered.

Right Click Menu Options

When you right click a graph a popup menu will offer the following options:

Restore Scaling - Undo Zoom

If you have zoomed in the graphs, this option can be used to undo the zoom-in and restore the automatic scaling.

Plot this Data Set

Use this right-click menu option to view or independently smooth the data sets using just one of the sets as a point of reference.

Plot All Data Sets

You can use this right-click option to close a the local viewing step. You can also close the local viewing by clicking OK-the dialog is restored to where all data sets are shown. You can also double click the graph to restore all data plots.



c:\1pf\v5 help\home.gif Section Data View Function(X)