Publication:
b-move: faster lossless approximate pattern matching in a run-length compressed index
| cris.virtual.department | #PLACEHOLDER_PARENT_METADATA_VALUE# | |
| cris.virtual.department | #PLACEHOLDER_PARENT_METADATA_VALUE# | |
| cris.virtual.department | #PLACEHOLDER_PARENT_METADATA_VALUE# | |
| cris.virtual.orcid | 0000-0001-8517-0479 | |
| cris.virtual.orcid | 0000-0002-2244-1427 | |
| cris.virtual.orcid | 0000-0002-9994-8269 | |
| cris.virtualsource.department | c4cd9ad3-10fc-4ea8-b7f3-19c50b10d7a7 | |
| cris.virtualsource.department | cafc39a2-8610-45ab-befa-b3f04ef3481d | |
| cris.virtualsource.department | 4a97cd9e-e619-4718-8c1e-9168bc19ef13 | |
| cris.virtualsource.orcid | c4cd9ad3-10fc-4ea8-b7f3-19c50b10d7a7 | |
| cris.virtualsource.orcid | cafc39a2-8610-45ab-befa-b3f04ef3481d | |
| cris.virtualsource.orcid | 4a97cd9e-e619-4718-8c1e-9168bc19ef13 | |
| dc.contributor.author | Depuydt, Lore | |
| dc.contributor.author | Renders, Luca | |
| dc.contributor.author | Van de Vyver, Simon | |
| dc.contributor.author | Veys, Lennart | |
| dc.contributor.author | Gagie, Travis | |
| dc.contributor.author | Fostier, Jan | |
| dc.contributor.imecauthor | Depuydt, Lore | |
| dc.contributor.imecauthor | Renders, Luca | |
| dc.contributor.imecauthor | Fostier, Jan | |
| dc.contributor.orcidimec | Depuydt, Lore::0000-0001-8517-0479 | |
| dc.contributor.orcidimec | Renders, Luca::0000-0002-2244-1427 | |
| dc.contributor.orcidimec | Fostier, Jan::0000-0002-9994-8269 | |
| dc.date.accessioned | 2025-08-23T03:59:02Z | |
| dc.date.available | 2025-08-23T03:59:02Z | |
| dc.date.issued | 2025-AUG 12 | |
| dc.description.abstract | Background Due to the increasing availability of high-quality genome sequences, pan-genomes are gradually replacing single consensus reference genomes in many bioinformatics pipelines to better capture genetic diversity. Traditional bioinformatics tools using the FM-index face memory limitations with such large genome collections. Recent advancements in run-length compressed indices like Gagie et al.’s r-index and Nishimoto and Tabei’s move structure, alleviate memory constraints but focus primarily on backward search for MEM-finding. Arakawa et al.’s br-index initiates complete approximate pattern matching using bidirectional search in run-length compressed space, but with significant computational overhead due to complex memory access patterns. Results We introduce b-move, a novel bidirectional extension of the move structure, enabling fast, cache-efficient, lossless approximate pattern matching in run-length compressed space. It achieves bidirectional character extensions up to 7 times faster than the br-index, closing the performance gap with FM-index-based alternatives. For locating occurrences, b-move performs and operations up to 7 times faster than the br-index. At the same time, it maintains the favorable memory characteristics of the br-index, for example, all available complete E. coli genomes on NCBI’s RefSeq collection can be compiled into a b-move index that fits into the RAM of a typical laptop. Conclusions b-move proves practical and scalable for pan-genome indexing and querying. We provide a C++ implementation of b-move, supporting efficient lossless approximate pattern matching including locate functionality, available at https://github.com/biointec/b-move under the AGPL-3.0 license | |
| dc.description.wosFundingText | The authors thank Ben Langmead, Nathaniel Brown, and Mohsen Zakeri for their helpful feedback and suggestions. | |
| dc.identifier.doi | 10.1186/s13015-025-00281-x | |
| dc.identifier.issn | 1748-7188 | |
| dc.identifier.pmid | MEDLINE:40796877 | |
| dc.identifier.uri | https://imec-publications.be/handle/20.500.12860/46099 | |
| dc.publisher | BMC | |
| dc.source.beginpage | 15 | |
| dc.source.issue | 1 | |
| dc.source.journal | ALGORITHMS FOR MOLECULAR BIOLOGY | |
| dc.source.numberofpages | 19 | |
| dc.source.volume | 20 | |
| dc.subject.keywords | READ ALIGNMENT | |
| dc.title | b-move: faster lossless approximate pattern matching in a run-length compressed index | |
| dc.type | Journal article | |
| dspace.entity.type | Publication | |
| Files | Original bundle
| |
| Publication available in collections: |