Developer Notes

Building and Uploading the Package

To build the source distribution and wheel for ms-mint-app2, follow these steps:

  1. Navigate to the root directory of the project.
  2. Run the following command to create the source distribution (sdist) and the built distribution (wheel):
python -m build

This will generate distribution archives in the dist directory.

  3. To upload the built package, use twine. Ensure twine is installed (pip install twine if not), then run:
python -m twine upload dist/*

If you use a custom repository, configure it in ~/.pypirc and pass --repository <name>.
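For orientation, a ~/.pypirc with a custom repository entry typically looks like the following sketch. The repository name `internal` and its URL are placeholders for illustration, not this project's actual configuration:

```ini
[distutils]
index-servers =
    pypi
    internal

[pypi]
username = __token__
password = <your PyPI API token>

[internal]
repository = <your repository upload URL>
username = __token__
password = <your token>
```

With such a section in place, the upload command becomes python -m twine upload --repository internal dist/*.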

Executables

To create executables for the ms-mint-app2 application, use pyinstaller. Follow the full guide in pyinstaller/BUILD_GUIDE.md. The short version is:

cd pyinstaller
python create_asari_env.py
python prebuild_matplotlib_cache.py
pyinstaller Mint.spec

This will generate a standalone executable based on pyinstaller/Mint.spec.

Documentation Deployment

To build and deploy the documentation using mkdocs, follow these steps:

  1. Ensure you have mkdocs and the Material theme installed (pip install mkdocs mkdocs-material if not), or use pip install -r requirements-dev.txt.
  2. Run the following commands to build the documentation and deploy it to GitHub Pages:
mkdocs build && mkdocs gh-deploy

The mkdocs build command generates the static site in the site directory, and mkdocs gh-deploy pushes it to the gh-pages branch of your GitHub repository.
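As a point of reference, a minimal mkdocs.yml using the Material theme might look like the sketch below. The site name and nav entries are illustrative assumptions, not the project's actual configuration:

```yaml
site_name: ms-mint-app2
theme:
  name: material
nav:
  - Home: index.md
  - Developer Notes: develop.md
```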

DuckDB Performance Tuning

MINT includes an automated resource detection system to optimize DuckDB performance:

  • CPU & RAM Auto-detection: The application automatically detects physical CPU cores and available RAM. It targets 50% of available RAM and caps the thread count so that each core has at least 1.5 GB of RAM to work with.
  • Batch Processing: Chromatogram extraction and results processing run in batches. The optimal batch size is calculated dynamically (typically between 1000 and 5000 pairs for MS1 data; MS2 data starts from a default of 1000 pairs) to balance memory pressure against throughput.
  • Progressive Loading: Large chromatogram datasets use LTTB downsampling and pre-calculated envelopes to ensure the UI remains responsive.
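The sizing heuristic described above can be sketched as follows. This is a minimal illustration of the stated rules (50% RAM target, at least 1.5 GB per thread), not the actual implementation; the function name and return shape are invented for this example:

```python
MIN_RAM_PER_THREAD_GB = 1.5  # per-core RAM floor stated in the notes
RAM_FRACTION = 0.5           # target: 50% of available RAM


def suggest_duckdb_settings(available_ram_gb: float, physical_cores: int) -> dict:
    """Suggest a DuckDB memory limit and thread count.

    Targets half of the available RAM, then caps the thread count so
    each thread has at least 1.5 GB of that budget.
    """
    target_ram_gb = available_ram_gb * RAM_FRACTION
    # How many threads can the RAM budget sustain?
    max_threads_by_ram = max(1, int(target_ram_gb // MIN_RAM_PER_THREAD_GB))
    threads = max(1, min(physical_cores, max_threads_by_ram))
    return {"memory_limit": f"{target_ram_gb:.1f}GB", "threads": threads}


# Example: 32 GB of RAM and 16 physical cores yields a 16 GB budget,
# which sustains 10 threads at 1.5 GB each.
print(suggest_duckdb_settings(32.0, 16))
```

In DuckDB itself, these values would map onto the `SET memory_limit` and `SET threads` configuration options.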