ms_mint package

Submodules

ms_mint.Mint module

Main module of the ms-mint library.

class ms_mint.Mint.Mint(verbose: bool = False, progress_callback: Callable = None, time_unit: str = 's', wdir: str = None)[source]

Bases: object

Main class of the ms_mint package, which processes metabolomics files.

Parameters:
  • verbose (bool) – Sets verbosity of the instance.

  • progress_callback (Callable[]) – A callback for a progress bar.

  • wdir (str) – Working directory.

_scale_group(group, scaler)[source]

Helper function to scale groups individually.

clear_ms_files()[source]

Reset ms files.

clear_results()[source]

Reset results.

clear_targets()[source]

Reset target list.

crosstab(var_name: str = None, index: str = None, column: str = None, aggfunc: str = 'mean', apply: Callable = None, scaler: Callable = None, groupby: str = None)[source]

Create a condensed representation of the results. More specifically, a cross-table with filenames as index and target labels as columns. The values in the cells are determined by var_name.

Parameters:
  • var_name (str, optional) – Name of the column from mint.results table that is used for the cell values. If None, defaults to ‘peak_area_top3’.

  • index (str, optional) – Name of the column to be used as index in the resulting cross-tabulation. If None, defaults to ‘ms_file_label’.

  • column (str, optional) – Name of the column to be used as columns in the resulting cross-tabulation. If None, defaults to ‘peak_label’.

  • aggfunc (str, optional) – Aggregation function to be used for aggregating values. Defaults to ‘mean’.

  • apply (Callable, optional) – Function to be applied to the resulting cross-tabulation. If None, no function is applied.

  • scaler (Callable, optional) – Function to scale the data in the resulting cross-tabulation. If None, no scaling is performed.

  • groupby (str, optional) – Name of the column to group data before scaling. If None, scaling is applied to the whole data, not group-wise.

Returns:

DataFrame representing the cross-tabulation.

Return type:

pandas.DataFrame
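
The cross-tabulation described above can be illustrated without ms_mint or pandas. The following is only a minimal sketch of the idea: the function name crosstab_sketch and the example rows are hypothetical, and the real method works on the mint.results DataFrame and additionally supports apply, scaler and groupby.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format results: one row per (file, peak) measurement.
rows = [
    {"ms_file_label": "A", "peak_label": "glucose", "peak_area_top3": 10.0},
    {"ms_file_label": "A", "peak_label": "lactate", "peak_area_top3": 4.0},
    {"ms_file_label": "B", "peak_label": "glucose", "peak_area_top3": 20.0},
    {"ms_file_label": "B", "peak_label": "glucose", "peak_area_top3": 30.0},
]


def crosstab_sketch(rows, var_name="peak_area_top3", index="ms_file_label",
                    column="peak_label", aggfunc=mean):
    """Pivot long-format rows into {index value: {column value: aggregate}}."""
    cells = defaultdict(list)
    for row in rows:
        # Collect every value that falls into the same (index, column) cell.
        cells[(row[index], row[column])].append(row[var_name])
    table = defaultdict(dict)
    for (idx, col), values in cells.items():
        # Reduce the collected values with the aggregation function.
        table[idx][col] = aggfunc(values)
    return {idx: dict(cols) for idx, cols in table.items()}


table = crosstab_sketch(rows)
```

Cells without any measurement (here, lactate in file B) are simply absent in this sketch, whereas a DataFrame would hold a missing value.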

digest_results()[source]
export(fn=None)[source]

Export current results to file.

Parameters:
  • fn (str, optional) – Filename, defaults to None

  • filename (str, optional) – deprecated

Returns:

File buffer if fn is None, otherwise None.

Return type:

io.BytesIO

get_chromatograms(fns=None, peak_labels=None, filters=None, **kwargs)[source]
get_target_params(peak_label)[source]
load(fn)[source]

Load results into Mint instance.

Parameters:

fn (str) – Filename (csv, xlsx)

Returns:

self

Return type:

ms_mint.Mint.Mint

load_files(obj)[source]

Load MS files. Returns the Mint instance so that calls can be chained.

Parameters:

obj (str or list[str]) – Filename or list of file names.

Returns:

self

Return type:

ms_mint.Mint.Mint

load_metadata(fn=None)[source]
load_targets(list_of_files)[source]

Load targets from a file (csv, xlsx).

Parameters:

list_of_files (str or list[str]) – Filename or list of file names.

Returns:

self

Return type:

ms_mint.Mint.Mint

property ms_files

Get/set ms-files to process.

Getter:

Returns:

List of filenames.

Return type:

list[str]

Setter:

Parameters:

list_of_files (str or list[str]) – Filename or list of file names of MS-files.

property n_files

Number of currently stored ms filenames.

Returns:

Number of files stored in self.ms_files

Return type:

int

property peak_labels
property progress

Shows the current progress.

Getter:

Returns the current progress value.

Setter:

Sets the progress to a value between 0 and 100 and calls the progress callback function.
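
The getter/setter behaviour described above can be sketched with a plain Python property. This is an illustration only: the class name ProgressSketch is hypothetical, and rejecting out-of-range values with a ValueError is an assumption, not documented behaviour of Mint.

```python
class ProgressSketch:
    """Toy object with a progress property that notifies a callback."""

    def __init__(self, progress_callback=None):
        self._progress = 0
        self._progress_callback = progress_callback

    @property
    def progress(self):
        # Getter: return the current progress value.
        return self._progress

    @progress.setter
    def progress(self, value):
        # Assumption: values outside 0..100 are rejected.
        if not 0 <= value <= 100:
            raise ValueError("progress must be between 0 and 100")
        self._progress = value
        # Setter: forward the new value to the callback, if one was given.
        if self._progress_callback is not None:
            self._progress_callback(value)


seen = []
tracker = ProgressSketch(progress_callback=seen.append)
tracker.progress = 50
tracker.progress = 100
```

Any callable that accepts a single number can serve as the callback, for example the update function of a progress bar.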

reset()[source]

Reset Mint instance. Removes targets, MS-files and results.

Returns:

self

Return type:

ms_mint.Mint.Mint

property results

Get/Set the Mint results.

Getter:

Returns:

Results

Return type:

pandas.DataFrame

Setter:

Parameters:

df (pandas.DataFrame) – DataFrame with MINT results.

run(nthreads=None, rt_margin=0.5, mode='standard', fn=None, **kwargs)[source]

Main routine to run MINT and process MS-files with current target list.

Parameters:
  • nthreads (int) – Number of cores to use, defaults to None. None: run with min(n_cpus, c_files) CPUs. 1: run without multiprocessing on one CPU. >1: run with multiprocessing enabled using nthreads threads.

  • mode (str) – Compute mode (‘standard’ or ‘express’), defaults to ‘standard’. ‘standard’: calculates peak shapes projected to the RT dimension. ‘express’: omits calculation of other features and computes only peak areas.

  • fn (str) – Output filename to not keep results in memory.

  • kwargs – Arguments passed to the processing function.
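
The nthreads rule can be written down as a small helper. This is a sketch under stated assumptions: choose_n_workers is a hypothetical name, the CPU count is taken from os.cpu_count(), and falling back to one worker when no files are loaded is an assumption.

```python
import os


def choose_n_workers(nthreads, n_files):
    """Return the number of worker processes for a given nthreads setting."""
    if nthreads is None:
        # None: use as many CPUs as are available, but not more than files.
        n_cpus = os.cpu_count() or 1
        return max(1, min(n_cpus, n_files))
    if nthreads < 1:
        raise ValueError("nthreads must be None or a positive integer")
    # 1: no multiprocessing; >1: multiprocessing with nthreads workers.
    return nthreads


single = choose_n_workers(1, n_files=10)
auto = choose_n_workers(None, n_files=1)
```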

save_metadata(fn=None)[source]
property status

Returns current status of Mint instance.

Returns:

One of ‘waiting’, ‘running’, ‘done’.

Return type:

str

property targets

Set/get target list.

Getter:

Returns:

Target list

Return type:

pandas.DataFrame

Setter:

Parameters:

targets (pandas.DataFrame) – Sets the target list of the instance.

version = '1.0.1.dev20'

ms_mint.Chromatogram module

class ms_mint.Chromatogram.Chromatogram(scan_times: List[float] | ndarray | None = None, intensities: List[float] | ndarray | None = None, filters: List[Filter] | None = None, expected_rt: float | None = None)[source]

Bases: object

__init__(scan_times: List[float] | ndarray | None = None, intensities: List[float] | ndarray | None = None, filters: List[Filter] | None = None, expected_rt: float | None = None)[source]

Initialize a Chromatogram object.

Parameters:
  • scan_times – Array-like object containing the scan times.

  • intensities – Array-like object containing the intensities.

  • filters – List of filters to be applied.

  • expected_rt – Expected retention time.

apply_filters()[source]
property data
estimate_noise_level(window=20)[source]
find_peaks(prominence=None, rel_height=0.9, **kwargs)[source]
from_file(fn, mz_mean, mz_width=10, expected_rt=None)[source]
optimise_peak_times_with_diff(rolling_window=20, plot=False)[source]
plot(label=None, **kwargs)[source]
select_peak_by_highest_intensity()[source]
select_peak_by_rt(expected_rt=None)[source]
select_peak_with_gaussian_weight(expected_rt=None, sigma=50)[source]
property selected_peaks

ms_mint.filelock module

A platform independent file lock that supports the with-statement.

class ms_mint.filelock.BaseFileLock(lock_file, timeout=-1)[source]

Bases: object

Implements the base class of a file lock.

__init__(lock_file, timeout=-1)[source]
_acquire()[source]

Platform dependent. If the file lock could be acquired, self._lock_file_fd holds the file descriptor of the lock file.

_release()[source]

Releases the lock and sets self._lock_file_fd to None.

acquire(timeout=None, poll_intervall=0.05)[source]

Acquires the file lock or fails with a Timeout error.

# You can use this method in the context manager (recommended)
with lock.acquire():
    pass

# Or use an equivalent try-finally construct:
lock.acquire()
try:
    pass
finally:
    lock.release()

Parameters:
  • timeout (float) – The maximum time waited for the file lock. If timeout < 0, there is no timeout and this method will block until the lock could be acquired. If timeout is None, the default timeout is used.

  • poll_intervall (float) – We check once in poll_intervall seconds if we can acquire the file lock.

Raises:

Timeout – if the lock could not be acquired in timeout seconds.

Changed in version 2.0.0: This method now returns a proxy object instead of self, so that it can be used in a with statement without side effects.

property is_locked

True, if the object holds the file lock.

Changed in version 2.0.0: This was previously a method and is now a property.

property lock_file

The path to the lock file.

release(force=False)[source]

Releases the file lock.

Please note that the lock is only completely released if the lock counter is 0.

Also note, that the lock file itself is not automatically deleted.

Parameters:

force (bool) – If true, the lock counter is ignored and the lock is released in every case.

property timeout

You can set a default timeout for the filelock. It will be used as fallback value in the acquire method, if no timeout value (None) is given.

If you want to disable the timeout, set it to a negative value.

A timeout of 0 means, that there is exactly one attempt to acquire the file lock.

Added in version 2.0.0.

ms_mint.filelock.FileLock

Alias for the lock, which should be used for the current platform. On Windows, this is an alias for WindowsFileLock, on Unix for UnixFileLock and otherwise for SoftFileLock.

class ms_mint.filelock.SoftFileLock(lock_file, timeout=-1)[source]

Bases: BaseFileLock

Simply watches the existence of the lock file.
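
A lock that relies only on the existence of a file can be sketched with the standard library. This is not the implementation of SoftFileLock: the class SoftLockSketch is hypothetical, it has no lock counter, and it deletes the lock file on release because the file's existence is the lock in this sketch.

```python
import os
import tempfile
import time


class SoftLockSketch:
    """Existence-based lock: whoever creates the file holds the lock."""

    def __init__(self, lock_file, timeout=-1):
        self.lock_file = lock_file
        self.timeout = timeout
        self._fd = None

    def acquire(self, poll_interval=0.05):
        start = time.monotonic()
        while True:
            try:
                # O_EXCL makes creation fail if the file already exists.
                flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
                self._fd = os.open(self.lock_file, flags)
                return self
            except FileExistsError:
                # A negative timeout means: wait until the lock is free.
                waited = time.monotonic() - start
                if self.timeout >= 0 and waited >= self.timeout:
                    raise TimeoutError(self.lock_file)
                time.sleep(poll_interval)

    def release(self):
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None
            os.remove(self.lock_file)

    def __enter__(self):
        return self.acquire()

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()


path = os.path.join(tempfile.mkdtemp(), "demo.lock")
first = SoftLockSketch(path, timeout=0)
first.acquire()
try:
    # timeout=0: exactly one attempt, which fails while `first` holds the lock.
    SoftLockSketch(path, timeout=0).acquire()
    second_acquired = True
except TimeoutError:
    second_acquired = False
first.release()
```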

exception ms_mint.filelock.Timeout(lock_file)[source]

Bases: TimeoutError

Raised when the lock could not be acquired in timeout seconds.

__init__(lock_file)[source]
lock_file

The path of the file lock.

class ms_mint.filelock.UnixFileLock(lock_file, timeout=-1)[source]

Bases: BaseFileLock

Uses fcntl.flock() to hard lock the lock file on Unix systems.

class ms_mint.filelock.WindowsFileLock(lock_file, timeout=-1)[source]

Bases: BaseFileLock

Uses the msvcrt.locking() function to hard lock the lock file on Windows systems.

ms_mint.filters module

class ms_mint.filters.Filter[source]

Bases: object

transform(t: List[float], x: List[float]) Tuple[List[float], List[float]][source]
class ms_mint.filters.GaussFilter(sigma=5)[source]

Bases: Filter

Filter for time series that applies a Gaussian filter.

__init__(sigma=5)[source]

Filter for time series that applies a Gaussian filter.

Parameters:

sigma (int, optional) – Sigma value for Gaussian function, defaults to 5

transform(t, x)[source]

Transformation method.

Parameters:
  • t (Array or List) – Time points of series

  • x (Array or List) – Data points of series

Returns:

Resampled time series (x, t)

Return type:

tuple

class ms_mint.filters.Resampler(tau='500ms', input_unit='seconds')[source]

Bases: Filter

Filter for time series that resamples the data at a certain frequency.

__init__(tau='500ms', input_unit='seconds')[source]

Filter for time series that resamples the data at a certain frequency. The default is 500ms.

Parameters:
  • tau (str, optional) – Sampling frequency, defaults to “500ms”

  • input_unit – Time unit of input series (t)

transform(t, x)[source]

Transformation method.

Parameters:
  • t (Array or List) – Time points of series

  • x (Array or List) – Data points of series

Returns:

Resampled time series (x, t)

Return type:

tuple
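
Resampling onto a regular time grid can be illustrated as follows. This is a sketch, not the Resampler implementation: the function resample_mean is hypothetical, tau is given in seconds as a float rather than as a string such as “500ms”, values are averaged per bin, and empty bins are skipped instead of being interpolated.

```python
import math


def resample_mean(t, x, tau=0.5):
    """Average the x values that fall into consecutive bins of width tau."""
    bins = {}
    for ti, xi in zip(t, x):
        # Index of the bin [k * tau, (k + 1) * tau) containing ti.
        key = math.floor(ti / tau)
        bins.setdefault(key, []).append(xi)
    keys = sorted(bins)
    new_t = [k * tau for k in keys]
    new_x = [sum(bins[k]) / len(bins[k]) for k in keys]
    return new_t, new_x


new_t, new_x = resample_mean([0.0, 0.2, 0.6, 1.1], [1.0, 3.0, 5.0, 7.0])
```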

class ms_mint.filters.Smoother(windows=None)[source]

Bases: Filter

Filter for time series that smooths the x values by running one or more rolling averages.

__init__(windows=None)[source]

Filter for time series that smooths the x values by running one or more rolling averages.

Parameters:

windows (List[int], optional) – Window sizes of rolling averages applied to time series, defaults to [30, 20]

transform(t, x)[source]

Transformation method.

Parameters:
  • t (Array or List) – Time points of series

  • x (Array or List) – Data points of series

Returns:

Resampled time series (x, t)

Return type:

tuple
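
Applying several rolling averages in sequence, as Smoother is described to do, might look like the sketch below. The helper names are hypothetical, and several details are assumptions: a trailing window, averaging over partial windows at the start of the series, and returning the series as (t, x).

```python
def rolling_mean(values, window):
    """Trailing rolling mean; early points average over the values seen so far."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def smooth(t, x, windows=(30, 20)):
    """Apply one rolling mean per window size, one after the other."""
    for window in windows:
        x = rolling_mean(x, window)
    return t, x


t, smoothed = smooth([0, 1, 2, 3, 4], [0.0, 0.0, 6.0, 0.0, 0.0], windows=(3,))
```

Chaining two windows, as in the default [30, 20], flattens sharp spikes more strongly than a single pass.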

ms_mint.io module

Functions to read and write MINT files.

ms_mint.io.convert_ms_file_to_feather(fn, fn_out=None)[source]

Convert MS file to feather format.

Parameters:
  • fn (str or PosixPath) – Filename to convert

  • fn_out (str or PosixPath, optional) – Output filename, defaults to None

Returns:

Filename of generated file

Return type:

str

ms_mint.io.convert_ms_file_to_parquet(fn, fn_out=None)[source]

Convert MS file to parquet format.

Parameters:
  • fn (str or PosixPath) – Filename to convert

  • fn_out (str or PosixPath, optional) – Output filename, defaults to None

Returns:

Filename of generated file

Return type:

str

ms_mint.io.df_to_numeric(df)[source]

Converts dataframe to numeric types if possible.

ms_mint.io.export_to_excel(mint, fn=None)[source]

Export MINT state to Excel file.

Parameters:

mint (ms_mint.Mint.Mint) – Mint instance

Returns:

None, or file buffer (if fn is None)

Return type:

None or io.BytesIO

ms_mint.io.format_thermo_raw_file_reader_parquet(df)[source]
ms_mint.io.ms_file_to_df(fn, read_only: bool = False)[source]

Read MS file and convert it to a pandas.DataFrame.

Parameters:
  • fn (str or PosixPath) – Filename

  • read_only (bool, optional) – Whether or not to convert to dataframe (for testing purposes), defaults to False

Returns:

MS data as DataFrame

Return type:

pandas.DataFrame

ms_mint.io.mzml_to_df(fn, read_only=False)[source]

Reads mzML file and returns a pandas.DataFrame using the mzML library.

Parameters:
  • fn (str or PosixPath) – Filename

  • explode (bool, optional) – Whether to explode the DataFrame, defaults to True

Returns:

MS data

Return type:

pandas.DataFrame

ms_mint.io.mzml_to_pandas_df_pyteomics(fn, **kwargs)[source]
ms_mint.io.mzmlb_to_df__pyteomics(fn, read_only=False)[source]

Reads mzMLb file and returns a pandas.DataFrame using the pyteomics library.

Parameters:
  • fn (str or PosixPath) – Filename

  • read_only (bool, optional) – Whether or not to convert to dataframe, defaults to False

Returns:

MS data

Return type:

pandas.DataFrame

ms_mint.io.mzxml_to_df(fn: str | Path, read_only: bool = False, time_unit_in_file: str = 'min') DataFrame | None[source]

Read mzXML file and convert it to pandas.DataFrame.

Parameters:
  • fn (Union[str, pathlib.Path]) – Filename

  • read_only (bool, optional) – Whether or not to convert to dataframe (for testing purposes), defaults to False

  • time_unit_in_file (str, optional) – The time unit used in the mzXML file (either ‘sec’ or ‘min’), defaults to ‘min’

Returns:

MS data

Return type:

Optional[pd.DataFrame]

ms_mint.io.read_parquet(fn, read_only=False)[source]

Reads parquet file and returns a pandas.DataFrame.

Parameters:
  • fn (str or PosixPath) – Filename

  • read_only (bool, optional) – Whether or not to convert to dataframe, defaults to False

Returns:

MS data

Return type:

pandas.DataFrame

ms_mint.io.set_dtypes(df)[source]

ms_mint.matplotlib_tools module

ms_mint.matplotlib_tools.hierarchical_clustering(df, vmin=None, vmax=None, figsize=(8, 8), top_height=2, left_width=2, xmaxticks=None, ymaxticks=None, metric='cosine', cmap=None)[source]

Performs and plots hierarchical clustering on a dataframe in dense format.

Parameters:
  • df (pandas.DataFrame) – Input data.

  • vmin (int, optional) – Minimum value to anchor the colormap; otherwise it is inferred from the data and other keyword arguments.

  • vmax (int, optional) – Maximum value to anchor the colormap; otherwise it is inferred from the data and other keyword arguments.

  • figsize (tuple, optional) – Size of the main figure in inches, defaults to (8, 8)

  • top_height (int, optional) – Height of the top dendrogram, defaults to 2

  • left_width (int, optional) – Width of the left dendrogram, defaults to 2

  • xmaxticks (int, optional) – Maximum number of x-ticks to display, defaults to None

  • ymaxticks (int, optional) – Maximum number of y-ticks to display, defaults to None

  • metric (str, optional) – Metric to be used for distance calculation (both axes), defaults to “cosine”

  • cmap (str, optional) – Matplotlib color map name, defaults to None

Returns:

Matplotlib figure

Return type:

matplotlib.pyplot.Figure

ms_mint.matplotlib_tools.plot_metabolomics_hist2d(df, figsize=(4, 2.5), dpi=300, set_dim=True, cmap='jet', rt_range=None, mz_range=None, mz_bins=100, **kwargs)[source]
ms_mint.matplotlib_tools.plot_peak_shapes(mint_results, mint_metadata=None, fns=None, peak_labels=None, height=3, aspect=1.5, legend=False, col_wrap=4, hue='ms_file_label', title=None, dpi=None, sharex=False, sharey=False, kind='line', **kwargs)[source]

Plot peak shapes of mint results.

Parameters:
  • mint_results (pandas.DataFrame) – DataFrame in Mint results format.

  • mint_metadata (pandas.DataFrame) – DataFrame in Mint metadata format.

  • fns (list, optional) – Filenames to include, defaults to None

  • peak_labels (list, optional) – Peak-labels to include, defaults to None

  • height (int, optional) – Height of the figure facets, defaults to 3

  • aspect (float, optional) – Aspect ratio of the figure facets, defaults to 1.5

  • legend (bool, optional) – Whether or not to add a legend, defaults to False

  • col_wrap (int, optional) – Number of columns for sub-plots, defaults to 4

  • hue (str, optional) – Column name for color groups, defaults to “ms_file_label”

  • title (str, optional) – Title to add, defaults to None

  • dpi (int, optional) – Resolution of generated image, defaults to None

  • sharex (bool, optional) – Whether or not to share x-axis range between subplots, defaults to False

  • sharey (bool, optional) – Whether or not to share y-axis range between subplots, defaults to False

  • kind (str, optional) – Kind of seaborn relplot

Returns:

Generated figure object.

Return type:

matplotlib.pyplot.Figure

ms_mint.matplotlib_tools.plot_peaks(series, peaks, highlight=None, expected_rt=None, weights=None, legend=True, label=None, **kwargs)[source]

ms_mint.notebook module

Experimental module to run Mint interactively inside the Jupyter notebook.

code-block:

from ms_mint.notebook import Mint

mint = Mint()

mint.display()
class ms_mint.notebook.Mint(*args, **kwargs)[source]

Bases: Mint

MINT with added functions for interactive use in Jupyter Notebook (experimental).

display()[source]

Display control elements in Jupyter notebook.

Returns:

IPython Widgets elements.

property messages

ms_mint.pca module

class ms_mint.pca.PCA_Plotter(pca)[source]

Bases: object

Class for plotting Mint PCA results.

__init__(pca)[source]

Class for plotting Mint PCA results.

Parameters:

pca (ms_mint.pca.PrincipalComponentsAnalyser) – PrincipalComponentsAnalyser instance

cumulative_variance(interactive=False, **kwargs)[source]
cumulative_variance_px(**kwargs)[source]

After running mint.pca() this function can be used to plot the cumulative variance of the principal components.

Returns:

Returns a plotly express figure.

Return type:

plotly.graph_objs._figure.Figure

cumulative_variance_sns(**kwargs)[source]

After running mint.pca() this function can be used to plot the cumulative variance of the principal components.

Returns:

Returns a matplotlib figure.

Return type:

matplotlib.figure.Figure

loadings(interactive=False, **kwargs)[source]
loadings_plotly(**kwargs)[source]
loadings_sns(**kwargs)[source]
pairplot(n_components=3, hue=None, fig_kws=None, interactive=False, **kwargs)[source]

After running mint.pca() this function can be used to plot a scatter matrix of the principal components.

Parameters:
  • n_components (int, optional) – Number of principal components to plot, defaults to 3.

  • hue (List[str] or str, optional) – Labels used for hue. If string, the data will be taken from the mint.meta dataframe.

Returns:

Returns a matplotlib figure.

Return type:

seaborn.axisgrid.PairGrid

pairplot_plotly(df, color_col=None, **kwargs)[source]
pairplot_sns(df, fig_kws=None, **kwargs)[source]
class ms_mint.pca.PrincipalComponentsAnalyser(mint=None)[source]

Bases: object

Class for applying PCA to Mint instance.

__init__(mint=None)[source]

Class for applying PCA to Mint instance.

Parameters:

mint (ms_mint.Mint.Mint, optional) – Mint instance, defaults to None

run(n_components=3, on=None, var_name='peak_max', fillna='median', apply=None, groupby=None, scaler='standard')[source]

Run Principal Component Analysis on current results. Results are stored in self.decomposition_results.

Parameters:
  • on (str, optional) – Column name to use for pca, defaults to “peak_max”

  • n_components (int, optional) – Number of PCA components to return, defaults to 3

  • fillna (str, optional) – Method to fill missing values, defaults to “median”

  • scaler (str, optional) – Method to scale the columns, defaults to “standard”

ms_mint.plotly_tools module

ms_mint.plotly_tools.get_palette_colors(palette_name, num_colors)[source]
ms_mint.plotly_tools.plotly_heatmap(df, normed_by_cols=False, transposed=False, clustered=False, add_dendrogram=False, name='', x_tick_colors=None, height=None, width=None, correlation=False, call_show=False, verbose=False)[source]

Creates an interactive heatmap from a dense-formatted dataframe.

Parameters:
  • df (pandas.DataFrame) – Input data

  • normed_by_cols (bool, optional) – Whether or not to normalize column vectors, defaults to False

  • transposed (bool, optional) – Whether or not to transpose the generated image, defaults to False

  • clustered (bool, optional) – Whether or not to apply hierarchical clustering on rows, defaults to False

  • add_dendrogram (bool, optional) – Whether or not to show a dendrogram (only with clustered=True), defaults to False

  • name (str, optional) – Title for figure, defaults to “”

  • x_tick_colors (str, optional) – Color of x-ticks, defaults to None

  • height (int, optional) – Image height in pixels, defaults to None

  • width (int, optional) – Image width in pixels, defaults to None

  • correlation (bool, optional) – Whether or not to convert the table to a correlation matrix, defaults to False

  • call_show (bool, optional) – Whether or not to call fig.show() to show image in new browser tab, defaults to False

  • verbose (bool, optional) – Whether or not to be loud, defaults to False

Returns:

Returns a plotly image object.

Return type:

plotly.Figure

ms_mint.plotly_tools.plotly_peak_shapes(mint_results, mint_metadata=None, color='ms_file_label', fns=None, col_wrap=1, peak_labels=None, legend=True, verbose=False, legend_orientation='v', call_show=False, palette='Plasma')[source]

Plot peak shapes of mint results.

Parameters:
  • mint_results (pandas.DataFrame) – DataFrame in Mint results format.

  • mint_metadata (pandas.DataFrame, optional) – DataFrame in Mint metadata format, defaults to None.

  • color (str, optional) – Column name determining color-coding of plots, defaults to ‘ms_file_label’.

  • fns (list, optional) – Filenames to include, defaults to None.

  • col_wrap (int, optional) – Maximum number of subplot columns, defaults to 1.

  • peak_labels (list, optional) – Peak-labels to include, defaults to None.

  • legend (bool, optional) – Whether to display legend, defaults to True.

  • verbose (bool, optional) – If True, prints additional details, defaults to False.

  • legend_orientation (str, optional) – Legend orientation, defaults to ‘v’.

  • call_show (bool, optional) – If True, displays the plot immediately, defaults to False.

  • palette (str, optional) – Color palette to use, defaults to ‘Plasma’.

Returns:

Plotly Figure object or None if call_show is True.

Return type:

plotly.graph_objs._figure.Figure or None

ms_mint.plotly_tools.set_template()[source]

A function that sets a template for plotly figures.

ms_mint.plotting module

ms_mint.processing module

ms_mint.processing.append_results(results, fn)[source]

Appends results to file.

Parameters:
  • results (pandas.DataFrame) – New results.

  • fn (str) – Filename to append to.

ms_mint.processing.extract_chromatogram_from_ms1(ms1: DataFrame, mz_mean: float, mz_width: float = 10) DataFrame[source]

Extract chromatogram from MS1 data.

Parameters:
  • ms1 – MS1 data as a DataFrame.

  • mz_mean – Mean m/z value.

  • mz_width – Width around the mean m/z to extract.

Returns:

Chromatogram data as a DataFrame.

ms_mint.processing.extract_ms1_properties(array, mz_mean)[source]

Process MS-1 data in array format.

Parameters:
  • array (numpy.array) – MS-1 data slice.

  • mz_mean (float) – mz_mean value to calculate mass accuracy.

Returns:

Extracted data.

Return type:

dict

ms_mint.processing.get_chromatogram_from_ms_file(ms_file: str, mz_mean: float, mz_width: float = 10) DataFrame[source]

Get chromatogram data from an MS file.

Parameters:
  • ms_file – Path to the MS file.

  • mz_mean – Mean m/z value.

  • mz_width – Width around the mean m/z to extract.

Returns:

Chromatogram data as a DataFrame.

ms_mint.processing.process_ms1(df, targets)[source]

Process MS-1 data with a target list.

Parameters:
  • df (pandas.DataFrame) – MS-1 data.

  • targets (pandas.DataFrame) – Target list

Returns:

Mint results.

Return type:

pandas.DataFrame

ms_mint.processing.process_ms1_file(filename, targets)[source]

Peak integration using a filename as input.

Parameters:
  • filename (str or PosixPath) – Path to mzXML or mzML file

  • targets (pandas.DataFrame) – DataFrame in target list format.

Returns:

DataFrame with processed peak intensities.

Return type:

pandas.DataFrame

ms_mint.processing.process_ms1_files_in_parallel(args)[source]

Pickleable function for (parallel) peak integration.

ms_mint.processing.process_ms1_from_numpy(array, peaks)[source]

Process MS1 data in numpy array format.

Parameters:
  • array (numpy.Array) – Input data.

  • peaks (numpy.Array) – Peak data np.array([[mz_mean_1, mz_width_1, rt_min_1, rt_max_1, intensity_threshold_1, peak_label_1], …])

Returns:

Extracted data.

Return type:

list

ms_mint.processing.score_peaks(mint_results)[source]

Score the peak quality (experimental).

1 - means a good shape

0 - means a bad shape

Parameters:

mint_results (pandas.DataFrame) – DataFrame in ms_mint results format.

Returns:

Score

Return type:

float

ms_mint.processing.slice_ms1_array(array: array, rt_min, rt_max, mz_mean, mz_width, intensity_threshold)[source]

Slice MS1 data by m/z, mz_width, rt_min, rt_max

Parameters:
  • array (np.array) – Input MS-1 data.

  • rt_min (float) – Minimum retention time for slice

  • rt_max (float) – Maximum retention time for slice

  • mz_mean (float) – Mean m/z value for slice

  • mz_width (float (>0)) – Width of slice in [ppm] of mz_mean

  • intensity_threshold (float (>0)) – Noise filter value

Returns:

Slice of numpy array

Return type:

np.Array
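
Since mz_width is given in ppm of mz_mean, the slice boundaries follow from a short calculation. The sketch below uses hypothetical helper names and plain tuples of (retention time, m/z, intensity). Treating mz_width as the half-width on each side of mz_mean, using inclusive boundaries, and keeping intensities equal to the threshold are all assumptions.

```python
def ppm_window(mz_mean, mz_width_ppm):
    """Return (mz_min, mz_max) for a window of +/- mz_width_ppm around mz_mean."""
    delta = mz_mean * mz_width_ppm * 1e-6
    return mz_mean - delta, mz_mean + delta


def slice_scans(scans, rt_min, rt_max, mz_mean, mz_width, intensity_threshold=0.0):
    """Keep (rt, mz, intensity) tuples inside the RT and m/z window."""
    mz_min, mz_max = ppm_window(mz_mean, mz_width)
    return [
        (rt, mz, intensity)
        for rt, mz, intensity in scans
        if rt_min <= rt <= rt_max
        and mz_min <= mz <= mz_max
        and intensity >= intensity_threshold
    ]


# Hypothetical data points: (retention time, m/z, intensity).
scans = [
    (10.0, 200.0000, 500.0),  # inside both windows
    (12.0, 200.0015, 800.0),  # inside both windows
    (12.5, 200.0100, 900.0),  # m/z too far from mz_mean
    (30.0, 200.0000, 700.0),  # retention time outside the window
]
selected = slice_scans(scans, rt_min=5.0, rt_max=20.0, mz_mean=200.0, mz_width=10)
```

For mz_mean = 200 and mz_width = 10 ppm the window in this sketch spans roughly 199.998 to 200.002.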

ms_mint.standards module

Contains standard column names and other values.

ms_mint.targets module

Everything related to target lists.

ms_mint.targets.check_targets(targets)[source]

Check if targets are formatted correctly.

Parameters:

targets (pandas.DataFrame) – Target list

Returns:

Returns True if all checks pass, else False

Return type:

bool

ms_mint.targets.convert_to_seconds(targets)[source]

Convert time units to seconds.

Parameters:

targets (pandas.DataFrame) – Mint target list to modify.

ms_mint.targets.diff_targets(old_pklist, new_pklist)[source]

Get the difference between two target lists.

Parameters:
  • old_pklist (pandas.DataFrame) – Old target list

  • new_pklist (pandas.DataFrame) – New target list

Returns:

Target list with new/changed targets

Return type:

pandas.DataFrame

ms_mint.targets.fill_missing_rt_values(targets)[source]

If rt values are missing, fill them with the mean of rt_min and rt_max.

Parameters:

targets (pandas.DataFrame) – Mint target list to modify.
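
The fill rule can be illustrated on a list of dictionaries instead of a DataFrame. This is a sketch only: fill_missing_rt is a hypothetical helper, the target entries are made up, and it assumes rt_min and rt_max are always present.

```python
def fill_missing_rt(targets):
    """Return copies of the targets with missing rt set to the window midpoint."""
    filled = []
    for target in targets:
        target = dict(target)  # copy, so the input is left untouched
        if target.get("rt") is None:
            target["rt"] = (target["rt_min"] + target["rt_max"]) / 2
        filled.append(target)
    return filled


# Hypothetical target entries; times share one unit.
targets = [
    {"peak_label": "glucose", "rt": None, "rt_min": 100.0, "rt_max": 140.0},
    {"peak_label": "lactate", "rt": 65.0, "rt_min": 50.0, "rt_max": 90.0},
]
filled = fill_missing_rt(targets)
```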

ms_mint.targets.gen_target_grid(masses, dt, rt_max=10, mz_ppm=10, intensity_threshold=0)[source]

Creates a target list from a list of masses.

Parameters:
  • masses – Target m/z values.

  • dt – Size of peak windows in time dimension [min]

  • rt_max – Maximum time

  • mz_ppm – Width of peak window in m/z dimension [ppm].

ms_mint.targets.read_targets(fns, ms_mode='negative')[source]

Extracts peak data from csv files that contain peak definitions.

Parameters:
  • fns – List of filenames of target lists.

  • ms_mode – “negative” or “positive”

ms_mint.targets.standardize_targets(targets, ms_mode='neutral')[source]

Standardize target list.

  • updates the target lists to newest format

  • ensures peak labels are strings

  • replaces np.nan with None

Parameters:
  • targets (pandas.DataFrame) – DataFrame in target-list format.

  • ms_mode (str, optional) – Ionization mode, defaults to “neutral”

Returns:

DataFrame in formatted target-list format

Return type:

pandas.DataFrame

ms_mint.tools module

ms_mint.tools.df_diff(df1, df2, which='both')[source]

Difference between two dataframes.

Parameters:
  • df1 (pandas.DataFrame) – Reference dataframe

  • df2 (pandas.DataFrame) – Dataframe to compare

  • which (str, optional) – Direction in which to compare, defaults to “both”

Returns:

DataFrame that contains unique rows.

Return type:

pandas.DataFrame

ms_mint.tools.find_peaks_in_timeseries(series, prominence=None, plot=False, rel_height=0.9, **kwargs)[source]

Find peaks in a time series.

Parameters:
  • series (_type_) – _description_

  • prominence (_type_, optional) – _description_, defaults to None

  • plot (bool, optional) – _description_, defaults to False

Returns:

_description_

Return type:

_type_

ms_mint.tools.fn_to_label(fn)[source]
ms_mint.tools.formula_to_mass(formulas, ms_mode=None)[source]

Calculate mz-mean value from formulas for specific ionization mode.

Parameters:
  • formulas (list[str]) – List of molecular formulas e.g. [‘H2O’]

  • ms_mode (str, optional) – Ionization mode, defaults to None

Returns:

List of calculated masses

Return type:

list
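
How a formula maps to an m/z value can be sketched with a small table of monoisotopic masses. This is not the library's implementation: the function formula_to_mass_sketch is hypothetical, it handles a single formula without brackets, it knows only a handful of elements, and it assumes that ‘positive’ mode adds and ‘negative’ mode removes one proton.

```python
import re

# Monoisotopic masses (u) of a few common elements.
MONOISOTOPIC_MASS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}
PROTON_MASS = 1.007276466


def formula_to_mass_sketch(formula, ms_mode=None):
    """Monoisotopic mass of a simple formula such as 'C6H12O6'."""
    mass = 0.0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    if ms_mode == "positive":
        return mass + PROTON_MASS  # assumed [M+H]+ ion
    if ms_mode == "negative":
        return mass - PROTON_MASS  # assumed [M-H]- ion
    return mass


water = formula_to_mass_sketch("H2O")
glucose_negative = formula_to_mass_sketch("C6H12O6", ms_mode="negative")
```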

ms_mint.tools.gaussian(x, mu, sig)[source]

Simple gaussian function generator.

Parameters:
  • x (np.array) – x-values to generate function values

  • mu (float) – Mean of gaussian

  • sig (float) – Sigma of gaussian

Returns:

f(x)

Return type:

np.array
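
A Gaussian of this form can be written in a few lines. The sketch below uses plain lists instead of np.array, and it assumes an unnormalised curve with a peak height of 1 at x = mu; the library may scale it differently.

```python
import math


def gaussian_sketch(x, mu, sig):
    """Evaluate exp(-(x - mu)^2 / (2 * sig^2)) for every value in x."""
    return [math.exp(-((xi - mu) ** 2) / (2 * sig ** 2)) for xi in x]


values = gaussian_sketch([-1.0, 0.0, 1.0], mu=0.0, sig=1.0)
```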

ms_mint.tools.get_ms_files_from_results(results)[source]

Extract MS-filenames from Mint results.

Parameters:

results (pandas.DataFrame) – DataFrame in Mint results format

Returns:

List of filenames

Return type:

list

ms_mint.tools.get_targets_from_results(results)[source]

Extract targets dataframe from ms-mint results table.

Parameters:

results (pandas.DataFrame) – Mint results table

Returns:

Mint targets table

Return type:

pandas.DataFrame

ms_mint.tools.init_metadata()[source]
ms_mint.tools.is_ms_file(fn)[source]

Check if file is an MS-file based on filename.

Parameters:

fn (str or PosixPath) – Filename

Returns:

Whether or not the file is recognized as MS-file

Return type:

bool
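
A filename-based check usually comes down to comparing the file extension. In the sketch below both the function name looks_like_ms_file and the set of extensions are assumptions, chosen from the formats mentioned in ms_mint.io; the actual list used by is_ms_file may differ.

```python
from pathlib import Path

# Assumed set of recognised extensions (lower case).
MS_SUFFIXES = {".mzxml", ".mzml", ".mzmlb", ".parquet", ".feather"}


def looks_like_ms_file(fn):
    """Return True if the filename ends with one of the assumed MS suffixes."""
    return Path(fn).suffix.lower() in MS_SUFFIXES


is_ms = looks_like_ms_file("data/sample_01.mzXML")
is_not_ms = looks_like_ms_file("targets.csv")
```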

ms_mint.tools.lock(fn)[source]

File lock to ensure safe writing to file.

Parameters:

fn (str or PosixPath) – Filename to lock.

Returns:

File lock object.

Return type:

FileLock

ms_mint.tools.log2p1(x)[source]
ms_mint.tools.mz_mean_width_to_min_max(mz_mean, mz_width)[source]
ms_mint.tools.scale_dataframe(df, scaler='standard', **kwargs)[source]

Scale all columns in a dense dataframe.

Parameters:
  • df (pandas.DataFrame) – Dataframe to scale

  • scaler (str, optional) – Scaler to use [‘robust’, ‘standard’], defaults to “standard”

Returns:

Scaled dataframe

Return type:

pandas.DataFrame

Module contents