Data handler for MALDI mass spectrometry dilution experiments.
-HOW TO- GUIDE
1. Data source definition:
1a. Browse and select the folders that contain: a) MS intensity tables (for .xls* files, rows are m/z peak IDs and columns are intensities with sample IDs as headers; for .txt files, each row contains sample data with embedded intensities); b) lookup tables (sample definition tables with sample IDs in rows and associated information, such as nested local IDs and file information, in columns; accepted as .xls*, .txt, or .log files); c) m/z peak tables with peak IDs in rows and associated information in columns. Select 'Cancel' to set a folder to 'none'. Select 'use headers' if the files in a folder might not share the same column order.
1b. Paste the MS intensity table into sheet 1 and the sample lookup table into sheet 2. m/z data will be added automatically to the final output table if an m/z reference file has been selected.
The software will use either 1a or 1b depending on whether folders have been selected or not.
2. Define data layout:
2a. Enter the row number of the global sample IDs and the row number where the intensity data starts in the MS intensity table(s) (source data). These tables must contain the m/z peak IDs in the first column.
2b. Enter the column numbers that hold the same global sample IDs and the nested sub-ID (GlasgowID) in the lookup table(s).
2c. Define the structure of the nested sub-ID.
2d. Select 'acid' tag if you wish to extract this tag from your nested sub-ID.
2e. Enter the value of the normalisation factor (a value of 100 expresses the intensities as percentages).
3. Calculate the sum of intensities per sample: adds up all intensities per sample and reports the number of peaks with intensity values (i.e. values above zero) per sample.
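As an illustration of step 3, the following minimal Python sketch computes the intensity sum and the number of observed peaks for a single sample. It is not the tool's own code; the function name `sum_and_count` and the use of `None` for empty cells are assumptions made for this example.

```python
def sum_and_count(intensities):
    """Return (sum of intensities, number of peaks with an intensity above zero).

    Assumption: empty cells are represented as None and are skipped.
    """
    observed = [v for v in intensities if v is not None and v > 0]
    return sum(observed), len(observed)


# One sample (one column of the intensity table), hypothetical values.
sample = [120.0, 0.0, 35.5, None, 44.5]
total, n_peaks = sum_and_count(sample)
print(total, n_peaks)  # 200.0 3
```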
4. Take Glasgow ID apart: will extract information such as dilutions and localised sample IDs from the nested sub-ID.
5. Remove zeros in source data: removes null values in the intensity tables. This has no impact on any functions in the software (aesthetics only).
6. Normalise source data: tallies the intensities by sample (column) and divides each value by the sum total. Generates new table.
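The normalisation described in step 6 can be sketched as follows. This is a hedged example rather than the tool's implementation: the function name `normalise` is hypothetical, and it assumes that the normalisation factor from step 2e is applied as a multiplier after dividing by the column sum.

```python
def normalise(intensities, factor=100.0):
    """Divide each intensity of one sample by the sample total, then scale.

    Assumption: `factor` is the normalisation factor from step 2e
    (100 gives percentages). An all-zero column is returned as zeros.
    """
    total = sum(intensities)
    if total == 0:
        return [0.0 for _ in intensities]
    return [v / total * factor for v in intensities]


# Hypothetical column of intensities for one sample.
print(normalise([1.0, 1.0, 2.0]))  # [25.0, 25.0, 50.0]
```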
7. Merge by: select if and how you wish to merge the source data. Generates new table.
8. Average data after merge: select whether to average the data after the merging operation. Null values are omitted from the average. Choose to write the averaged data to a new worksheet, to merge it directly, or to leave the data side by side (using '|' as a data separator). Generates new table.
9. Minimum number of replicates: define the minimum number of replicates in which a peak must be observed to count as valid (lower boundary). If this number is not reached, the intensity of this peak will be set to zero or omitted. Enter '0' to accept all peaks.
10. Maximum number of replicates: define the maximum number of replicates used per peak (upper boundary). If there are more valid measurements for a specific peak, the software will either calculate the average and drop the value(s) furthest from it until the number of intensities reaches the value entered, or (if the number entered is preceded by a minus sign) drop the lowest values until the number of intensities matches the value entered (without the minus sign).
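The replicate limits in steps 9 and 10 can be illustrated with a short Python sketch. It is an interpretation of the description above, not the tool's code: the function name `trim_replicates` is hypothetical, the mean is assumed to be recalculated after every dropped value, and `max_reps=None` is an assumed way of saying "no upper limit".

```python
def trim_replicates(values, min_reps=0, max_reps=None):
    """Apply the lower and upper replicate boundaries to one peak.

    values   : replicate intensities for one peak (zeros count as not observed)
    min_reps : fewer observed replicates than this rejects the peak
    max_reps : positive -> drop the value furthest from the mean until the
                           limit is reached (mean recalculated each round)
               negative -> drop the lowest values until abs(max_reps) remain
               None     -> no upper limit (assumption for this sketch)
    """
    observed = [v for v in values if v > 0]
    if len(observed) < min_reps:
        return []  # peak rejected
    if max_reps is None or len(observed) <= abs(max_reps):
        return observed
    limit = abs(max_reps)
    if max_reps < 0:
        # Keep only the highest `limit` values.
        return sorted(observed)[len(observed) - limit:]
    kept = list(observed)
    while len(kept) > limit:
        mean = sum(kept) / len(kept)
        furthest = max(kept, key=lambda v: abs(v - mean))
        kept.remove(furthest)
    return kept


replicates = [10.0, 11.0, 12.0, 30.0]
print(trim_replicates(replicates, max_reps=3))   # [10.0, 11.0, 12.0]
print(trim_replicates(replicates, max_reps=-3))  # [11.0, 12.0, 30.0]
```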
11. Include number of replicates when averaging: adds the number of replicates with observed intensities after the averaged intensity in brackets.
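A minimal sketch of the averaging in steps 8 and 11 is shown below. The function name `average_with_count` and the exact output layout (`value (n)`) are assumptions; the tool's actual cell format may differ.

```python
def average_with_count(values):
    """Average the observed (non-zero) replicates and append their count.

    Assumption: zeros stand for 'not observed' and are left out of the
    average, and the count is appended in round brackets.
    """
    observed = [v for v in values if v > 0]
    mean = sum(observed) / len(observed) if observed else 0.0
    return f"{mean} ({len(observed)})"


print(average_with_count([10.0, 0.0, 20.0, 15.0]))  # 15.0 (3)
```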
12. Normalise after merging and averaging: tallies the intensities by sample (column) and divides each value by the sum total. Generates new table.
13. Sort data: select from the drop-down list how you want the data to be sorted. Generates new table.
14. Regression analysis: Select if you want to do regression analysis. This requires that the source data has been merged. Generates new table.
15. Use data: applies only to data which has not already been averaged. Either average it or use every single datapoint for curve fitting.
16. Transform data: choose to use the dilution data as is, or to transform it by square root (brings the data points closer together) or by squaring (spreads the data points apart). The final output will be back-transformed accordingly.
17. Curve fitting: Choose between linear, exponential or polynomial (n=2) curve fitting.
18. Drop outliers: Define how many dilutions can be removed per m/z peak (applies only to valid measurements). This is done by linear regression analysis, followed by determination and removal of the outlier, and re-analysis by linear regression. It will repeat this step until either the number stated is reached, or the datapoint threshold value is reached.
19. Datapoint threshold: Define the minimum number of dilution data points per m/z peak. If this number is not reached for an m/z peak, the intensity will be set to null.
20. Intensity output: Select how the reported output intensity for a dilution of 1 is calculated: either (a) by following the calculated regression curve, or (b) by calculating the average of the valid dilution values (after removal of outliers, if selected), calculating the associated intensity from the regression curve, and reporting the ratio of this intensity value to the average dilution value.
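The interplay of steps 18 and 19 can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's algorithm: the guide does not say how the outlier is determined, so the sketch assumes it is the point with the largest absolute residual from the fitted line. The names `fit_line` and `drop_outliers` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drop_outliers(dilutions, intensities, max_drops, min_points):
    """Repeat: fit a line, remove the worst point, fit again.

    Stops when `max_drops` points have been removed or when only
    `min_points` (the datapoint threshold) remain. Returns None if the
    peak has fewer points than the threshold to begin with.
    Assumption: the outlier is the point with the largest absolute residual.
    """
    points = list(zip(dilutions, intensities))
    if len(points) < min_points:
        return None  # peak intensity would be set to null
    drops = 0
    # A line fit needs at least two points, so never go below that.
    while drops < max_drops and len(points) > max(min_points, 2):
        slope, offset = fit_line(*zip(*points))
        worst = max(points, key=lambda p: abs(p[1] - (slope * p[0] + offset)))
        points.remove(worst)
        drops += 1
    return points


# Hypothetical dilution series with one obvious outlier at x = 4.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 6.0, 20.0, 10.1]
print(drop_outliers(xs, ys, max_drops=1, min_points=3))
```

With these values the point `(4, 20.0)` is removed, and the remaining four points would be used for the final fit.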
21. Dilution factor: the output will be calculated based on the defined dilution factor for all values.
22. Normalise after regression analysis: tallies the intensities by sample (column) and divides each value by the sum total.
23. Parameter output to report: Select which parameters the software should include in the output (this will be placed in the column next to the data output column).
23a. Number of dilutions: number of merged dilutions with data (after data point adjustments).
23b. N(sample depth): total number of datapoints used in calculation.
23c. R-square: the R-square value (coefficient of determination).
23d. average dilution: the average of all dilution-values used for the calculation (after data point adjustment).
23e. average intensity: the intensity value at the average dilution point.
23f. standard error intensity: the standard error of the intensity after calculation.
23g. slope: the slope of the fitted curve.
23h. offset: the offset (intercept) of the fitted curve.
23i. F-statistic: F-observed value. Used to assess whether the observed relationship between the dependent and independent variables could have occurred by chance.
23j. dF: degrees of freedom.
23k. F distribution: F probability distribution.
23l. all: selects all output options.
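For orientation, the sketch below shows how several of the parameters listed under step 23 are conventionally computed for a linear fit. It is a textbook least-squares calculation, not the tool's source code. The function name `regression_report` is hypothetical, and "standard error intensity" is assumed here to mean the standard error of the intensity estimate (residual standard error). The F probability is left out because it needs an F-distribution function (for example `scipy.stats.f.sf`).

```python
import math


def regression_report(dilutions, intensities):
    """Linear least-squares fit with common goodness-of-fit parameters."""
    n = len(dilutions)
    mean_x = sum(dilutions) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in dilutions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dilutions, intensities))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x

    ss_total = sum((y - mean_y) ** 2 for y in intensities)
    ss_resid = sum((y - (slope * x + offset)) ** 2
                   for x, y in zip(dilutions, intensities))
    df = n - 2  # two fitted parameters: slope and offset
    r_square = 1.0 - ss_resid / ss_total
    # F-observed: explained variance over residual variance.
    f_stat = math.inf if ss_resid == 0 else (ss_total - ss_resid) / (ss_resid / df)
    std_error = math.sqrt(ss_resid / df)  # assumed meaning of 23f

    return {
        "N": n,
        "average dilution": mean_x,
        "average intensity": slope * mean_x + offset,
        "slope": slope,
        "offset": offset,
        "R-square": r_square,
        "F-statistic": f_stat,
        "dF": df,
        "standard error intensity": std_error,
    }


# Hypothetical dilution series for one m/z peak.
report = regression_report([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
for name, value in report.items():
    print(name, value)
```

For a straight-line fit, the intensity at the average dilution equals the average of the intensities, which is why `average intensity` can be read off the fitted line at `average dilution`.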
24. Select either 'Cancel' to quit or 'Do it' to proceed. A summary window will pop up where you can review your settings and either go back to adjust them or continue with the data manipulations.
Source code is available upon request by sending us an email.