Processing utilities
append_results(results, fn)
Append results to a CSV file with file locking for thread safety.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
results | DataFrame | Results DataFrame to append. | required |
fn | str | Filename to append to. | required |
Source code in src/ms_mint/processing.py
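The append pattern described above can be sketched as follows. This is a minimal illustration and not the ms_mint implementation: the name append_results_sketch, the lock object, and the column names are hypothetical, and an in-process threading.Lock stands in for the file lock mentioned in the description (it does not protect against concurrent writes from other processes).

```python
import os
import tempfile
import threading

import pandas as pd

# Hypothetical module-level lock. It only serializes threads inside one
# process and is a stand-in for a real file lock.
_csv_lock = threading.Lock()


def append_results_sketch(results: pd.DataFrame, fn: str) -> None:
    """Append rows to a CSV file, writing the header only for a new or empty file."""
    with _csv_lock:
        write_header = not os.path.exists(fn) or os.path.getsize(fn) == 0
        results.to_csv(fn, mode="a", header=write_header, index=False)


# Usage: two appends yield one header line followed by two data rows.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "results.csv")
append_results_sketch(pd.DataFrame({"peak_label": ["A"], "peak_area": [1.0]}), csv_path)
append_results_sketch(pd.DataFrame({"peak_label": ["B"], "peak_area": [2.0]}), csv_path)
combined = pd.read_csv(csv_path)
```

Checking for an existing, non-empty file before writing keeps the header from being repeated on every append.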
extract_chromatogram_from_ms1(df, mz_mean, mz_width=10)
Extract single chromatogram of specific m/z value from MS-data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | MS-data with columns ['scan_time', 'mz', 'intensity']. | required |
mz_mean | float | Target m/z value. | required |
mz_width | float | m/z width in ppm. Default is 10. | 10 |
Returns:
Type | Description |
---|---|
Series | Chromatogram as a pandas Series with scan_time as index and intensity as values. |
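To make the ppm window concrete, here is a minimal sketch of extracting a chromatogram from a DataFrame with the documented columns. It rests on two assumptions that the description does not confirm: mz_width is treated as a half-width (mz_mean ± mz_mean · mz_width · 1e-6), and intensities that share a scan_time are summed. The function name is hypothetical.

```python
import pandas as pd


def extract_chromatogram_sketch(
    df: pd.DataFrame, mz_mean: float, mz_width: float = 10
) -> pd.Series:
    """Return intensity per scan_time for m/z values inside a ppm window."""
    # Assumption: mz_width is a half-width in ppm around mz_mean.
    dmz = mz_mean * mz_width * 1e-6
    in_window = (df["mz"] >= mz_mean - dmz) & (df["mz"] <= mz_mean + dmz)
    # Assumption: points within the same scan are summed.
    return df.loc[in_window].groupby("scan_time")["intensity"].sum()


# Usage: at m/z 100 and 10 ppm the window is 100 ± 0.001.
ms1 = pd.DataFrame(
    {
        "scan_time": [1.0, 1.0, 2.0, 2.0],
        "mz": [100.0005, 100.5, 100.0000, 99.9995],
        "intensity": [10.0, 99.0, 20.0, 5.0],
    }
)
chrom = extract_chromatogram_sketch(ms1, mz_mean=100.0, mz_width=10)
```

In this toy data the point at m/z 100.5 falls outside the window, so it does not contribute to the first scan.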
extract_ms1_properties(array, mz_mean)
Extract peak properties from an MS-1 data slice.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
array | ndarray | MS-1 data slice array with columns [scan_time, mz, intensity]. | required |
mz_mean | float | Mean m/z value for calculating mass accuracy. | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | Dictionary of extracted peak properties. |
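The exact properties in the returned dictionary are not listed here. As an illustration of how mz_mean can enter a mass accuracy calculation, the sketch below computes a ppm error from the intensity-weighted mean of the observed m/z values. The function name, the dictionary keys, and the choice of a weighted mean are all assumptions made for this example, not the library's definitions.

```python
import numpy as np


def mass_accuracy_sketch(array: np.ndarray, mz_mean: float) -> dict:
    """Compute a few illustrative properties from a [scan_time, mz, intensity] slice."""
    mz = array[:, 1]
    intensity = array[:, 2]
    # Assumption: the observed m/z is the intensity-weighted mean of the slice.
    observed_mz = float(np.average(mz, weights=intensity))
    return {
        # Hypothetical keys, chosen for this sketch only.
        "observed_mz": observed_mz,
        "mass_error_ppm": (observed_mz - mz_mean) / mz_mean * 1e6,
        "peak_max": float(intensity.max()),
    }


# Usage: three points near m/z 200, with the most intense one shifted by 0.002.
ms1_slice = np.array(
    [
        [1.0, 200.000, 10.0],
        [2.0, 200.002, 30.0],
        [3.0, 200.000, 10.0],
    ]
)
props = mass_accuracy_sketch(ms1_slice, mz_mean=200.0)
```

For this input the weighted mean is about 200.0012, which corresponds to an error of roughly 6 ppm relative to mz_mean.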
get_chromatogram_from_ms_file(ms_file, mz_mean, mz_width=10)
Get chromatogram data from an MS file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ms_file | Union[str, Path] | Path to the MS file. | required |
mz_mean | float | Mean m/z value to extract. | required |
mz_width | float | Width around the mean m/z in ppm to extract. | 10 |
Returns:
Type | Description |
---|---|
Series | Chromatogram data as a pandas Series with scan_time as index and intensity as values. |
process_ms1(df, targets)
Process MS-1 data with a target list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | MS-1 data with columns ['scan_time', 'mz', 'intensity']. | required |
targets | DataFrame | Target list DataFrame with required columns. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with peak integration results. |
process_ms1_file(filename, targets)
Perform peak integration using a filename as input.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filename | Union[str, Path] | Path to an mzXML or mzML file. | required |
targets | DataFrame | DataFrame in target list format. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with processed peak intensities. |
process_ms1_files_in_parallel(args)
Process MS files in parallel using the provided arguments.
This is a pickleable function for parallel peak integration that can be used with multiprocessing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
args | Dict[str, Any] | Dictionary with the keys filename (path to the MS file to process), targets (DataFrame with target compound information), output_fn (optional output filename for saving results), and queue (optional queue for progress reporting). | required |
Returns:
Type | Description |
---|---|
Optional[DataFrame] | DataFrame with processing results, or None if results were saved to a file. |
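Packing all inputs into one dictionary lets a worker be driven by map-style APIs that pass a single argument per job. The sketch below builds such dictionaries with the four documented keys and feeds them to a hypothetical stand-in worker that mirrors the documented return contract. The worker, the file names, and the placeholder target list are assumptions for illustration; no MS file is read.

```python
import queue
from typing import Any, Dict, List, Optional


def worker_sketch(args: Dict[str, Any]) -> Optional[List[str]]:
    """Hypothetical stand-in: returns results, or None when output_fn is set."""
    result = [f"processed {args['filename']}"]
    if args.get("queue") is not None:
        args["queue"].put("done")  # report progress for this file
    if args.get("output_fn") is not None:
        # A real worker would save the results to output_fn here.
        return None
    return result


# Placeholder standing in for the target list DataFrame.
targets = [{"peak_label": "A"}]
progress: "queue.Queue[str]" = queue.Queue()

jobs = [
    {"filename": fn, "targets": targets, "output_fn": None, "queue": progress}
    for fn in ["a.mzML", "b.mzML"]
]

# Serial map for clarity; a multiprocessing.Pool could consume the same
# list through pool.map, given a picklable worker and a process-safe queue.
outputs = list(map(worker_sketch, jobs))
saved = worker_sketch(
    {"filename": "c.mzML", "targets": targets, "output_fn": "out.csv", "queue": None}
)
```

Here outputs holds one result list per file, while saved is None because an output filename was given.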
process_ms1_from_numpy(array, peaks)
Process MS1 data in numpy array format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
array | ndarray | Input data array with columns [scan_time, mz, intensity]. | required |
peaks | ndarray | Peak data array with columns [mz_mean, mz_width, rt_min, rt_max, intensity_threshold, peak_label]. | required |
Returns:
Type | Description |
---|---|
List[List[Any]] | List of extracted data for each peak. |
score_peaks(mint_results)
Score the peak quality (experimental).
Calculates a score from 0 to 1, where 1 means a good peak shape and 0 means a bad peak shape.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mint_results | DataFrame | DataFrame in ms_mint results format. | required |
Returns:
Type | Description |
---|---|
Series | Series of scores for each peak. |
slice_ms1_array(array, rt_min, rt_max, mz_mean, mz_width, intensity_threshold)
Slice MS1 data by m/z, retention time, and intensity threshold.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
array | ndarray | Input MS-1 data array with columns [scan_time, mz, intensity]. | required |
rt_min | float | Minimum retention time for slice. | required |
rt_max | float | Maximum retention time for slice. | required |
mz_mean | float | Mean m/z value for slice. | required |
mz_width | float | Width of slice in ppm of mz_mean. | required |
intensity_threshold | float | Noise filter value. | required |
Returns:
Type | Description |
---|---|
ndarray | Filtered numpy array containing only data points meeting the criteria. |
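The three filters above can be combined into a single boolean mask. The following is a minimal sketch under stated assumptions, not the library's code: mz_width is treated as a ppm half-width around mz_mean, the retention time bounds are inclusive, and points with intensity equal to the threshold are kept. The function name is hypothetical.

```python
import numpy as np


def slice_ms1_array_sketch(
    array: np.ndarray,
    rt_min: float,
    rt_max: float,
    mz_mean: float,
    mz_width: float,
    intensity_threshold: float,
) -> np.ndarray:
    """Keep rows of [scan_time, mz, intensity] that pass all three filters."""
    scan_time, mz, intensity = array[:, 0], array[:, 1], array[:, 2]
    # Assumption: mz_width is a half-width in ppm of mz_mean.
    dmz = mz_mean * mz_width * 1e-6
    keep = (
        (scan_time >= rt_min)
        & (scan_time <= rt_max)
        & (np.abs(mz - mz_mean) <= dmz)
        & (intensity >= intensity_threshold)
    )
    return array[keep]


# Usage: at m/z 500 and 10 ppm the tolerance is 0.005.
ms1 = np.array(
    [
        [10.0, 500.000, 1000.0],  # passes all filters
        [10.0, 500.100, 1000.0],  # m/z outside the window
        [99.0, 500.000, 1000.0],  # outside the retention time range
        [12.0, 500.001, 5.0],     # below the intensity threshold
        [12.0, 500.002, 200.0],   # passes all filters
    ]
)
sliced = slice_ms1_array_sketch(
    ms1, rt_min=5, rt_max=20, mz_mean=500.0, mz_width=10, intensity_threshold=100
)
```

Building one mask and indexing once avoids creating intermediate copies of the array for each filter.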