Helper tools
df_diff(df1, df2, which='both')
Find differences between two dataframes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df1 | DataFrame | Reference DataFrame. | required |
df2 | DataFrame | DataFrame to compare. | required |
which | str | Direction in which to compare. Options are "both", "left_only", "right_only". | 'both' |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame containing only the rows that differ according to the specified direction. |
Source code in src/ms_mint/tools.py
find_peaks_in_timeseries(series, prominence=None, plot=False, rel_height=0.9, **kwargs)
Find peaks in a time series using scipy's peak finding algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
series | Series | Time series data to find peaks in. | required |
prominence | Optional[float] | Minimum prominence of peaks. If None, all peaks are detected. | None |
plot | bool | Whether to generate a plot of the detected peaks. | False |
rel_height | float | Relative height from the peak at which to determine peak width. | 0.9 |
**kwargs | | Additional arguments passed to scipy.signal.find_peaks. | {} |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame containing peak properties including retention times and heights. |
fn_to_label(fn)
Convert a filename to a label by removing the file extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Union[str, Path] | Filename or path. | required |
Returns:
Type | Description |
---|---|
str | Filename without extension. |
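As a rough illustration of the behaviour described above, the following sketch uses only the standard library. The name fn_to_label_sketch is hypothetical, and dropping the directory part of the path is an assumption that this section does not confirm for fn_to_label itself.

```python
from pathlib import Path


def fn_to_label_sketch(fn):
    """Return the last path component without its final extension.

    Hypothetical stand-in for illustration, not the ms_mint implementation.
    Assumption: the directory part of the path is dropped as well.
    """
    return Path(fn).stem


print(fn_to_label_sketch("data/run1/sample_A.mzXML"))  # sample_A
print(fn_to_label_sketch(Path("blank.mzML")))          # blank
```

Only the last suffix is removed in this sketch, so a name such as "sample.tar.gz" would become "sample.tar".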
formula_to_mass(formulas, ms_mode=None)
Calculate m/z values from molecular formulas for a specific ionization mode.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
formulas | Union[str, List[str]] | List of molecular formulas (e.g., ['H2O']) or a single formula. | required |
ms_mode | Optional[Literal['negative', 'positive', 'neutral']] | Ionization mode. One of "negative", "positive", "neutral", or None. | None |
Returns:
Type | Description |
---|---|
List[Optional[float]] | List of calculated masses. None values are included for invalid formulas. |
Raises:
Type | Description |
---|---|
AssertionError | If ms_mode is not one of the allowed values. |
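The idea behind the ionization mode can be sketched in plain Python. Everything here is an assumption made for illustration: the name formula_to_mass_sketch, the tiny element table, the simplified formula parser, the rounding to four decimals, and the convention that positive mode adds one proton mass while negative mode subtracts one. The actual function may compute masses differently.

```python
import re

# Monoisotopic masses (u) for a few elements. A tiny illustrative table,
# not the data source used by ms_mint.
MONOISOTOPIC = {"H": 1.00782503, "C": 12.0, "N": 14.003074, "O": 15.99491462}
PROTON_MASS = 1.00727647  # assumed mass shift for singly charged ions


def formula_to_mass_sketch(formulas, ms_mode=None):
    """Hypothetical sketch: one mass per formula, None for unparsable ones."""
    assert ms_mode in (None, "negative", "positive", "neutral"), ms_mode
    if isinstance(formulas, str):
        formulas = [formulas]
    masses = []
    for formula in formulas:
        # Split into (element, count) pairs, e.g. "H2O" -> [("H", "2"), ("O", "")].
        tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
        rebuilt = "".join(element + count for element, count in tokens)
        known = all(element in MONOISOTOPIC for element, _ in tokens)
        if not tokens or rebuilt != formula or not known:
            masses.append(None)  # invalid or unsupported formula
            continue
        mass = sum(MONOISOTOPIC[el] * int(count or 1) for el, count in tokens)
        if ms_mode == "positive":
            mass += PROTON_MASS  # assumed [M+H]+ ion
        elif ms_mode == "negative":
            mass -= PROTON_MASS  # assumed [M-H]- ion
        masses.append(round(mass, 4))
    return masses


print(formula_to_mass_sketch(["H2O", "Xx"], ms_mode="positive"))  # [19.0178, None]
```

With ms_mode set to None or "neutral", the sketch returns the unshifted monoisotopic mass.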
gaussian(x, mu, sig)
Generate values for a Gaussian function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Union[List[float], ndarray] | x-values at which to evaluate the function. | required |
mu | float | Mean of the Gaussian. | required |
sig | float | Standard deviation of the Gaussian. | required |
Returns:
Type | Description |
---|---|
ndarray | Array of Gaussian function values at the input x-values. |
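For orientation, a pure-Python sketch of a Gaussian evaluated on a list of x-values follows. It assumes the unnormalized form exp(-(x - mu)^2 / (2 sig^2)), which peaks at 1; whether gaussian normalizes its output is not stated in this section. The name gaussian_sketch is hypothetical, and the sketch returns a list instead of an ndarray.

```python
import math


def gaussian_sketch(x, mu, sig):
    """Evaluate an unnormalized Gaussian (peak height 1) at each value in x.

    Hypothetical stand-in for illustration; the normalization is an assumption.
    """
    return [math.exp(-((xi - mu) ** 2) / (2 * sig ** 2)) for xi in x]


values = gaussian_sketch([-1.0, 0.0, 1.0], mu=0.0, sig=1.0)
print(values[1])  # 1.0, the value at the mean
```

The values one standard deviation either side of the mean are equal, both exp(-0.5).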
get_ms_files_from_results(results)
Extract MS filenames from Mint results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
results | DataFrame | DataFrame in Mint results format. | required |
Returns:
Type | Description |
---|---|
List[Union[str, Path]] | List of MS filenames. |
get_targets_from_results(results)
Extract the targets DataFrame from an MS-MINT results table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
results | DataFrame | Mint results table. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame containing target information extracted from results. |
init_metadata()
Initialize an empty metadata DataFrame with the standard columns.
Returns:
Type | Description |
---|---|
DataFrame | Empty DataFrame with standard metadata columns and 'ms_file_label' as index. |
is_ms_file(fn)
Check if a file is a recognized MS file format based on its extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Union[str, Path] | Filename or path to check. | required |
Returns:
Type | Description |
---|---|
bool | True if the file has a recognized MS file extension, False otherwise. |
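An extension-based check of this kind might look like the sketch below. The name is_ms_file_sketch and the extension set are hypothetical: this section does not list which extensions is_ms_file recognizes, and the case-insensitive comparison is likewise an assumption.

```python
from pathlib import Path

# Hypothetical extension list for illustration only; the set recognized by
# ms_mint is not stated in this section.
ASSUMED_MS_EXTENSIONS = {".mzxml", ".mzml"}


def is_ms_file_sketch(fn):
    """Return True if the final suffix of fn is in the assumed extension set."""
    return Path(fn).suffix.lower() in ASSUMED_MS_EXTENSIONS


print(is_ms_file_sketch("data/sample_A.mzXML"))  # True
print(is_ms_file_sketch("notes.txt"))            # False
```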
lock(fn)
Create a file lock to ensure safe writing to a file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Union[str, Path] | Filename to lock. | required |
Returns:
Type | Description |
---|---|
FileLock | File lock object. |
log2p1(x)
Apply log2(x+1) transformation to numeric data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Union[float, ndarray, Series] | Numeric value or array to transform. | required |
Returns:
Type | Description |
---|---|
Union[float, ndarray, Series] | Transformed value(s). |
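The transformation itself is simple enough to show for a single number. This sketch covers scalars only and uses the hypothetical name log2p1_sketch; the array and Series cases in the table above are not covered here.

```python
import math


def log2p1_sketch(x):
    """Scalar-only illustration of log2(x + 1); hypothetical helper name."""
    return math.log2(x + 1)


print(log2p1_sketch(0))  # 0.0, so zero intensities stay at zero
print(log2p1_sketch(7))  # 3.0, since log2(8) = 3
```

Adding 1 before taking the logarithm keeps the result defined for zero values, which a plain log2 would not handle.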
mz_mean_width_to_min_max(mz_mean, mz_width)
Convert m/z mean and width (in ppm) to min and max m/z values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mz_mean | float | Mean m/z value. | required |
mz_width | float | Width in parts-per-million (ppm). | required |
Returns:
Type | Description |
---|---|
Tuple[float, float] | Tuple of (mz_min, mz_max) defining the m/z range. |
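The ppm conversion can be sketched as follows. The name mz_window_sketch is hypothetical. The sketch assumes the width is applied on each side of the mean, so the full window spans twice the ppm value; this section does not say whether mz_width is a half-width or a total width.

```python
def mz_window_sketch(mz_mean, mz_width_ppm):
    """Convert a mean m/z and a ppm tolerance into (mz_min, mz_max).

    Hypothetical stand-in for illustration. Assumption: the ppm tolerance is
    applied symmetrically, once below and once above the mean.
    """
    delta = mz_mean * mz_width_ppm * 1e-6  # ppm -> absolute m/z offset
    return mz_mean - delta, mz_mean + delta


mz_min, mz_max = mz_window_sketch(500.0, 10.0)
print(mz_min, mz_max)  # roughly 499.995 and 500.005
```

Because the offset scales with mz_mean, the same ppm value gives a wider absolute window at higher m/z.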
scale_dataframe(df, scaler='standard', **kwargs)
Scale all columns in a dense dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame to scale. | required |
scaler | Union[str, Any] | Scaler to use. Either a string ('robust', 'standard', 'minmax') or a scikit-learn scaler instance. | 'standard' |
**kwargs | | Additional arguments passed to the scaler constructor. | {} |
Returns:
Type | Description |
---|---|
DataFrame | Scaled DataFrame with the same shape as the input. |