{ "cells": [ { "cell_type": "markdown", "id": "19917597", "metadata": {}, "source": [ "# Quick Start\n", "\n", "This Python script describes how to use the `LiRTMaTS` Python package. The\n", "input data and retention time reference files used here are in\n", "https://github.com/wanchanglin/lirtmats/tree/master/examples/data.\n", "\n", "## Setup\n", "\n", "Users need to load the Python package `LAMP` before using `LiRTMaTS`. Its\n", "functions are used here for loading the data set and summarising the matching\n", "results. For details, see https://github.com/wanchanglin/lamp." ] }, { "cell_type": "code", "execution_count": 1, "id": "b045fb5c", "metadata": { "execution": { "iopub.execute_input": "2025-12-04T12:29:41.718675Z", "iopub.status.busy": "2025-12-04T12:29:41.718675Z", "iopub.status.idle": "2025-12-04T12:29:42.611230Z", "shell.execute_reply": "2025-12-04T12:29:42.610197Z" } }, "outputs": [], "source": [ "import sqlite3\n", "import pandas as pd\n", "from lamp import anno\n", "import lirtmats.lirtmats as rtm" ] }, { "cell_type": "markdown", "id": "f4c99a20", "metadata": {}, "source": [ "## Data Loading\n", "\n", "`LiRTMaTS` supports text files separated by comma (`,`) or tab (`\\t`).\n", "Microsoft XLSX files are also supported; use the argument `sheet_name` to\n", "indicate which sheet holds the input data. The default is 0 for the\n", "first sheet.\n", "\n", "Here we use a small example data set in `tsv` format. This data set\n", "includes a peak list and an intensity data matrix. `LiRTMaTS` requires each\n", "peak's name, m/z value and retention time. The user needs to indicate the\n", "column positions of the feature name, m/z value and retention time, and the\n", "starting column of the data matrix. 
Here they are 1, 2, 3 and 4, respectively.\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "aae9685b", "metadata": { "execution": { "iopub.execute_input": "2025-12-04T12:29:42.611230Z", "iopub.status.busy": "2025-12-04T12:29:42.611230Z", "iopub.status.idle": "2025-12-04T12:29:42.662591Z", "shell.execute_reply": "2025-12-04T12:29:42.662591Z" } }, "outputs": [ { "data": { "text/html": [ "
|  | name | mz | rt | D121 | A122 | A125 | A126 | A127 | A128 | B131 | ... | E214 | E215 | E216 | H234 | H235 | H236 | H237 | H238 | H239 | H240 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | M102T899 | 102.034153 | 898.850160 | 1.404584e+07 | 3.689953e+06 | 3.598363e+06 | 1.138875e+07 | 4.887524e+06 | 2.104782e+06 | 7.288258e+06 | ... | 3.125203e+06 | 3.608369e+06 | NaN | 4.763811e+06 | 2.281365e+06 | NaN | 3.404450e+06 | 3.720441e+06 | 4.539032e+05 | NaN |
| 1 | M102T849 | 102.034154 | 849.085350 | 1.473961e+07 | NaN | 5.934387e+06 | NaN | 4.607624e+06 | 5.969186e+06 | 3.367949e+06 | ... | 1.276006e+07 | 1.490770e+07 | 2.880142e+06 | 4.263577e+06 | NaN | NaN | NaN | 4.437697e+06 | 6.777076e+06 | 6.341930e+06 |
| 2 | M105T45 | 105.042677 | 45.353942 | 5.520865e+05 | 1.813279e+05 | 2.734923e+05 | 2.342655e+05 | 6.241395e+04 | 1.068277e+05 | 1.192451e+05 | ... | 3.092946e+04 | 1.788324e+05 | 1.810794e+05 | 3.225256e+05 | NaN | 3.734778e+05 | 1.935349e+05 | NaN | 1.094705e+05 | 1.946732e+05 |
| 3 | M105T54 | 105.054961 | 54.350049 | 6.669635e+05 | 4.833251e+06 | 2.137479e+06 | 1.552473e+06 | 1.753294e+06 | 2.301363e+06 | NaN | ... | 1.186390e+06 | 3.001167e+06 | 2.558921e+06 | NaN | NaN | 1.695460e+06 | NaN | 1.834140e+06 | 1.029692e+06 | 4.382618e+05 |
| 4 | M105T48_1 | 105.074216 | 47.538626 | 6.310113e+05 | NaN | 5.199302e+05 | 4.302566e+05 | 5.650141e+05 | 3.635406e+05 | 1.096530e+06 | ... | 7.882748e+05 | NaN | 9.822090e+05 | 4.974403e+05 | 3.604541e+05 | 1.340656e+06 | NaN | NaN | 6.020203e+05 | 3.597655e+05 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1995 | M299T296 | 299.233645 | 295.569540 | 8.125150e+04 | 1.020165e+05 | 2.209362e+05 | 3.557402e+05 | 6.039153e+05 | NaN | 2.330915e+05 | ... | 3.872671e+05 | 1.632064e+05 | 7.224218e+04 | 3.678394e+04 | 9.526812e+04 | 5.785549e+04 | 6.183749e+05 | NaN | 2.915690e+04 | NaN |
| 1996 | M300T43_1 | 299.919504 | 42.832066 | 5.042924e+04 | NaN | NaN | 2.222376e+05 | 3.763288e+05 | 2.094474e+05 | 1.163715e+05 | ... | 4.035525e+05 | 2.032260e+05 | 2.700920e+05 | NaN | 2.675647e+05 | 2.695188e+05 | 2.750383e+05 | 2.882957e+05 | 6.720465e+04 | 3.352428e+05 |
| 1997 | M300T62 | 300.119720 | 62.428854 | NaN | 3.914945e+05 | 5.182468e+05 | 7.492101e+05 | 1.546338e+06 | 5.741346e+05 | 9.712791e+05 | ... | 8.554399e+05 | 7.431820e+05 | 8.878200e+05 | NaN | 3.625514e+05 | 4.987110e+05 | 1.393237e+06 | 5.217566e+05 | NaN | 1.257126e+05 |
| 1998 | M300T285_2 | 300.124255 | 285.061758 | NaN | 4.602130e+05 | 4.559729e+05 | 9.718658e+05 | 3.864969e+05 | 3.877729e+05 | 1.315307e+06 | ... | 2.418197e+06 | 2.917536e+06 | 9.108396e+05 | 4.583314e+05 | 4.022556e+05 | 2.673259e+05 | NaN | NaN | 8.926295e+04 | 2.126753e+04 |
| 1999 | M300T288 | 300.181271 | 287.944377 | 7.880306e+05 | 1.738638e+06 | 1.113482e+06 | 4.063701e+06 | 3.788191e+06 | 1.201084e+06 | 2.988076e+06 | ... | 2.907005e+06 | 3.365814e+06 | 2.761628e+06 | 1.865813e+06 | 1.956308e+06 | NaN | 2.918514e+06 | NaN | NaN | NaN |

2000 rows × 40 columns
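
The column positions given in the Data Loading text (1, 2, 3 and 4) are 1-based. As a rough illustration of what they select, the sketch below uses plain pandas rather than the `LAMP` loader, whose exact call is not reproduced here. The in-memory `tsv_text` and the names `name_col`, `mz_col`, `rt_col` and `data_start` are assumptions made for this example only.

```python
import io

import pandas as pd

# Tiny stand-in for a tab-separated peak table. Values are copied from the
# rows shown above; only two of the sample columns are kept.
tsv_text = (
    "name\tmz\trt\tD121\tA122\n"
    "M102T899\t102.034153\t898.850160\t1.404584e+07\t3.689953e+06\n"
    "M105T45\t105.042677\t45.353942\t5.520865e+05\t1.813279e+05\n"
)

# 1-based positions: feature name, m/z, retention time, first data column.
# These variable names are hypothetical, not LAMP or LiRTMaTS arguments.
name_col, mz_col, rt_col, data_start = 1, 2, 3, 4

raw = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# pandas indexes columns from 0, so subtract 1 from each 1-based position.
peaks = raw.iloc[:, [name_col - 1, mz_col - 1, rt_col - 1]].copy()
peaks.columns = ["name", "mz", "rt"]

# Everything from the starting column onwards is the intensity matrix.
intensities = raw.iloc[:, data_start - 1:]
```

Keeping the peak list and the intensity matrix in separate data frames mirrors the split described in the text: only the name, m/z value and retention time are needed for matching.
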

|  | identifier | metabolite_name | rt_lib | inchikey | ion_mod |
|---|---|---|---|---|---|
| 0 | ACMG_aqC18_POS_0001 | MS5029_Isovaleraldehyde | 24.6 | QPUYECUOLPXSFR-UHFFFAOYSA-N | positive |
| 1 | ACMG_aqC18_POS_0002 | LO57_Dihydroxyfumaric acid hydrate | 27.0 | SEKGMJVHSBBHRD-WZHZPDAFSA-M | positive |
| 2 | ACMG_aqC18_POS_0003 | LO61_Benzoic acid | 27.0 | DMBUODUULYCPAK-UHFFFAOYSA-N | positive |
| 3 | ACMG_aqC18_POS_0004 | LO52_Spermine | 28.2 | XDSPGKDYYRNYJI-IUPFWZBJSA-N | positive |
| 4 | ACMG_aqC18_POS_0005 | LO21_Spermidine | 30.0 | HELXLJCILKEWJH-NCGAPWICSA-N | positive |
| ... | ... | ... | ... | ... | ... |
| 2827 | ACMG_aqC18_POS_1412 | LIM3312_Cholesterol | 659.4 | ASOSVCXGWPDUGN-UHFFFAOYSA-N | negative |
| 2828 | ACMG_aqC18_POS_1413 | LO13_5alpha-Cholestan-3-one | 672.6 | XQCZBXHVTFVIFE-UHFFFAOYSA-N | negative |
| 2829 | ACMG_aqC18_POS_1414 | LIM3310_5alpha-Cholest-7-en-3beta-ol | 675.0 | WLFXSECCHULRRO-UHFFFAOYSA-N | negative |
| 2830 | ACMG_aqC18_POS_1415 | LO302_5alpha-Cholestanol | 681.6 | YCIMNLLNPGFGHC-UHFFFAOYSA-N | negative |
| 2831 | ACMG_aqC18_POS_1416 | LO45_10Z-Nonadecenoic acid | 723.6 | QIGBRXMKCJKVMJ-UHFFFAOYSA-N | negative |

2832 rows × 5 columns

|  | id | rt | identifier | metabolite_name | rt_lib | inchikey | ion_mod | rt_range |
|---|---|---|---|---|---|---|---|---|
| 0 | M105T45 | 45.353942 | ACMG_aqC18_POS_0280 | LO309_Asymmetric dimethylarginine | 40.5 | ZDLDXNCMJBOYJV-YFKPBYRVSA-N | positive | 5 |
| 1 | M105T45 | 45.353942 | ACMG_aqC18_POS_0281 | MS5037_Ribonic acid gamma-lactone | 40.5 | DAUAQNGYDSHRET-UHFFFAOYSA-N | positive | 5 |
| 2 | M105T45 | 45.353942 | ACMG_aqC18_POS_0282 | LO18_L-Dihydroorotic acid | 40.5 | KCDXJAYRVLXPFO-UHFFFAOYSA-N | positive | 5 |
| 3 | M105T45 | 45.353942 | ACMG_aqC18_POS_0283 | LO30_Stachydrine | 40.5 | ITECRQOOEQWFPE-UHFFFAOYSA-N | positive | 5 |
| 4 | M105T45 | 45.353942 | ACMG_aqC18_POS_0284 | LO72_Aminoadipic acid | 40.5 | JYPHNHPXFNEZBR-UHFFFAOYSA-N | positive | 5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 150065 | M300T288 | 287.944377 | ACMG_aqC18_POS_0942 | MS5008_Ethyl crotonate | 291.0 | OZWKMVRBQXNZKK-UHFFFAOYSA-N | negative | 5 |
| 150066 | M300T288 | 287.944377 | ACMG_aqC18_POS_0943 | MS5032_2-Phenyl-1-propanol | 291.0 | DKYWVDODHFEZIM-UHFFFAOYSA-N | negative | 5 |
| 150067 | M300T288 | 287.944377 | ACMG_aqC18_POS_0944 | LO15_Methyl indole-3-acetate | 291.0 | RTIXKCRFFJGDFG-UHFFFAOYSA-N | negative | 5 |
| 150068 | M300T288 | 287.944377 | ACMG_aqC18_POS_0945 | LO03_Cinnamic aldehyde | 291.6 | FNYLWPVRPXGIIP-UHFFFAOYSA-N | positive | 5 |
| 150069 | M300T288 | 287.944377 | ACMG_aqC18_POS_0945 | LO03_Cinnamic aldehyde | 291.6 | FNYLWPVRPXGIIP-UHFFFAOYSA-N | negative | 5 |

150070 rows × 8 columns
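
The rows shown above pair a peak with library entries whose `rt_lib` lies within `rt_range` of the peak's `rt` (for example 45.35 against 40.5 with a range of 5). A minimal pandas sketch of that kind of window match follows, assuming an absolute-difference rule. It illustrates the idea only and is not the `LiRTMaTS` implementation; the toy data frames reuse values shown in this notebook.

```python
import pandas as pd

# Two peaks and two library entries taken from the tables in this notebook.
peaks = pd.DataFrame({
    "id": ["M105T45", "M102T899"],
    "rt": [45.353942, 898.850160],
})
library = pd.DataFrame({
    "identifier": ["ACMG_aqC18_POS_0280", "ACMG_aqC18_POS_0001"],
    "metabolite_name": [
        "LO309_Asymmetric dimethylarginine",
        "MS5029_Isovaleraldehyde",
    ],
    "rt_lib": [40.5, 24.6],
})

# Assumed matching rule: keep a pair when |rt - rt_lib| <= rt_range.
rt_range = 5

# Pair every peak with every library entry (needs pandas >= 1.2),
# then keep only the pairs inside the retention time window.
pairs = peaks.merge(library, how="cross")
in_window = (pairs["rt"] - pairs["rt_lib"]).abs() <= rt_range
matches = pairs[in_window].copy()
matches["rt_range"] = rt_range
matches = matches.reset_index(drop=True)
```

A cross join is simple to read but grows with the product of the two table sizes, so for large peak lists a sorted or interval-based lookup would be a more economical choice.
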

|  | name | mz | rt | rt_range | identifier | metabolite_name | rt_lib | inchikey | ion_mod |
|---|---|---|---|---|---|---|---|---|---|
| 0 | M100T54 | 100.075925 | 53.810924 | 5.0 | ACMG_aqC18_POS_0389::ACMG_aqC18_POS_0389::ACMG... | LO488_Maleic acid::LO488_Maleic acid::LO321_L-... | 48.9::48.9::49.2::49.2::50.4::50.4::50.4::50.4... | JJVNINGBHGBWJH-UHFFFAOYSA-N::JJVNINGBHGBWJH-UH... | positive::negative::positive::negative::positi... |
| 1 | M1015T254 | 1014.985384 | 253.626177 | 5.0 | ACMG_aqC18_POS_0782::ACMG_aqC18_POS_0783::ACMG... | LO481_3-Hydroxydecanedioic acid::LIM3308_Suber... | 249.00000000000003::249.00000000000003::249.00... | FVWJYYTZTCVBKE-ROUWMTJPSA-N::TVZGACDUOSZQKY-UH... | positive::positive::negative::negative::positi... |
| 2 | M101T228 | 101.060060 | 228.125403 | 5.0 | ACMG_aqC18_POS_0654::ACMG_aqC18_POS_0654::ACMG... | MS5018_Dimethyl maleate::MS5018_Dimethyl malea... | 223.2::223.2::223.8::223.8::223.8::223.8::223.... | KIWQWJKWBHZMDT-UHFFFAOYSA-N::KIWQWJKWBHZMDT-UH... | positive::negative::positive::positive::positi... |
| 3 | M102T849 | 102.034154 | 849.085350 | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | M102T899 | 102.034153 | 898.850160 | NaN | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1995 | M865T700 | 865.244172 | 700.365420 | NaN | NaN | NaN | NaN | NaN | NaN |
| 1996 | M919T647 | 918.701782 | 646.988220 | 5.0 | ACMG_aqC18_POS_1407::ACMG_aqC18_POS_1407::ACMG... | LO05_Vitamin K1::LO05_Vitamin K1::LIM3314_Phyl... | 642.6::642.6::643.2::643.2::647.1::647.1 | ZFDIRQKJPRINOQ-HYXAFXHYSA-N::ZFDIRQKJPRINOQ-HY... | positive::negative::positive::negative::positi... |
| 1997 | M925T237_1 | 924.898294 | 236.964462 | 5.0 | ACMG_aqC18_POS_0690::ACMG_aqC18_POS_0691::ACMG... | LO306_Syringic acid::LO315_ortho-Hydroxyphenyl... | 232.2::232.2::232.2::232.2::232.2::232.2::232.... | AFBPFSWMIHJQDM-UHFFFAOYSA-N::OISVCGZHLKNMSJ-UH... | positive::positive::positive::positive::positi... |
| 1998 | M933T267 | 933.410460 | 266.976471 | 5.0 | ACMG_aqC18_POS_0839::ACMG_aqC18_POS_0840::ACMG... | MS5012_Diethyl malonate::MS5019_Trimethylaceti... | 262.2::262.2::262.2::262.2::262.2::262.2::262.... | KEVYVLWNCKMXJX-UHFFFAOYSA-N::WTTJVINHCBCLGX-ZD... | positive::positive::positive::negative::negati... |
| 1999 | M934T242 | 933.932365 | 242.395371 | 5.0 | ACMG_aqC18_POS_0720::ACMG_aqC18_POS_0721::ACMG... | LIM3312_Aspartame::MS5023_Ethyl levulinate::MS... | 237.6::237.6::237.6::237.6::237.6::237.6::237.... | MBDOYVRWFFCFHM-SNAWJCMRSA-N::XPFVYQJUAUNWIW-UH... | positive::positive::positive::positive::negati... |

2000 rows × 9 columns
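
In the summary table above, each peak appears once, multiple matches are joined with `::`, and unmatched peaks carry `NaN`. One way to build such a layout with pandas is a group-wise string join followed by a left merge onto the peak list. This is a sketch under those assumptions, not the code `LAMP` uses for its summary; the toy data reuse values shown in this notebook.

```python
import pandas as pd

# Peak list: one matched peak and one unmatched peak.
peaks = pd.DataFrame({
    "name": ["M100T54", "M102T849"],
    "mz": [100.075925, 102.034154],
    "rt": [53.810924, 849.085350],
})

# Long-format matches: one row per (peak, library entry) pair.
matches = pd.DataFrame({
    "name": ["M100T54", "M100T54"],
    "identifier": ["ACMG_aqC18_POS_0389", "ACMG_aqC18_POS_0390"],
    "metabolite_name": ["LO488_Maleic acid", "LO321_L-Theanine"],
    "rt_lib": [48.9, 49.2],
})

value_cols = ["identifier", "metabolite_name", "rt_lib"]

# Cast every value column to string so "::".join can concatenate it,
# then collapse each peak's matches into one "::"-separated cell.
collapsed = (
    matches.astype({col: str for col in value_cols})
    .groupby("name")[value_cols]
    .agg("::".join)
    .reset_index()
)

# A left merge keeps every peak; peaks without matches get NaN.
summary = peaks.merge(collapsed, on="name", how="left")
```

The left merge is what keeps unmatched peaks such as `M102T849` in the result, so the summary has the same number of rows as the peak list.
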

|  | name | mz | rt | identifier | metabolite_name | rt_lib | inchikey | ion_mod | rt_range |
|---|---|---|---|---|---|---|---|---|---|
| 0 | M100T54 | 100.075925 | 53.810924 | ACMG_aqC18_POS_0389 | LO488_Maleic acid | 48.9 | JJVNINGBHGBWJH-UHFFFAOYSA-N | positive | 5.0 |
| 1 | M100T54 | 100.075925 | 53.810924 | ACMG_aqC18_POS_0389 | LO488_Maleic acid | 48.9 | JJVNINGBHGBWJH-UHFFFAOYSA-N | negative | 5.0 |
| 2 | M100T54 | 100.075925 | 53.810924 | ACMG_aqC18_POS_0390 | LO321_L-Theanine | 49.2 | SULYEHHGGXARJS-UHFFFAOYSA-N | positive | 5.0 |
| 3 | M100T54 | 100.075925 | 53.810924 | ACMG_aqC18_POS_0390 | LO321_L-Theanine | 49.2 | SULYEHHGGXARJS-UHFFFAOYSA-N | negative | 5.0 |
| 4 | M100T54 | 100.075925 | 53.810924 | ACMG_aqC18_POS_0391 | LO310_Dihydrothymine | 50.4 | YPTJKHVBDCRKNF-UHFFFAOYSA-N | positive | 5.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 150218 | M934T242 | 933.932365 | 242.395371 | ACMG_aqC18_POS_0775 | LO35_2-Methoxybenzoic acid | 247.2 | RFKITWRHKUYMRJ-UHFFFAOYSA-N | positive | 5.0 |
| 150219 | M934T242 | 933.932365 | 242.395371 | ACMG_aqC18_POS_0772 | MS5015_Phenylglyoxal | 247.2 | QWIZNVHXZXRPDR-WSCXOGSTSA-N | negative | 5.0 |
| 150220 | M934T242 | 933.932365 | 242.395371 | ACMG_aqC18_POS_0773 | MS5021_Ethyl 2-methylacetoacetate | 247.2 | BHTRKEVKTKCXOH-LBSADWJPSA-N | negative | 5.0 |
| 150221 | M934T242 | 933.932365 | 242.395371 | ACMG_aqC18_POS_0774 | LO12_Homoveratrumic acid | 247.2 | SEBFKMXJBCUCAI-UHFFFAOYSA-N | negative | 5.0 |
| 150222 | M934T242 | 933.932365 | 242.395371 | ACMG_aqC18_POS_0775 | LO35_2-Methoxybenzoic acid | 247.2 | RFKITWRHKUYMRJ-UHFFFAOYSA-N | negative | 5.0 |

150223 rows × 9 columns