Learning Right Monotone Permutation Matrices for Neural Subsequence Search
Abstract
Subsequence retrieval seeks relevant segments in a large corpus given a short query. Existing pairwise metric-based methods are computationally intensive, hard to parallelize, and tied to domain-specific metrics. In this work, we introduce a neural framework that casts subsequence matching as end-to-end alignment, using monotone permutation matrices as differentiable approximate subsequence selectors. Our framework yields fixed-dimensional embeddings for variable-length inputs, and we prove these embeddings are compatible with standard approximate nearest neighbor (ANN) search methods such as locality-sensitive hashing (LSH), enabling scalable retrieval. We further impose structural priors on admissible subsequences and integrate them directly into the scoring function. The approach is domain-agnostic and operates on pre-trained representations across modalities. Experiments on real-world datasets from two domains show strong retrieval performance, robustness, and substantial speedups, with high parallelism on GPU-accelerated hardware.
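To make the selector mechanism concrete, the following is a minimal NumPy sketch, not the paper's implementation: a hard 0/1 selection matrix whose one-hot rows fire at strictly increasing column indices, so that multiplying it against a sequence extracts a subsequence. The helper name `selection_matrix` is hypothetical; the learned, differentiable version described in the abstract would replace the hard 0/1 entries with a soft row-stochastic relaxation.

```python
import numpy as np

def selection_matrix(indices, n):
    """Build a binary m x n matrix whose i-th row is one-hot at indices[i].

    Monotonicity constraint: `indices` must be strictly increasing, so the
    matrix selects items in their original order, i.e. a subsequence.
    (Hypothetical helper for illustration only.)
    """
    assert all(a < b for a, b in zip(indices, indices[1:])), \
        "indices must be strictly increasing (monotone)"
    m = len(indices)
    P = np.zeros((m, n))
    P[np.arange(m), indices] = 1.0
    return P

# A toy "corpus sequence" of 6 items, each a 2-dimensional feature vector.
X = np.arange(6 * 2, dtype=float).reshape(6, 2)

# Monotone selector picking items 1, 3, and 4 (in order).
P = selection_matrix([1, 3, 4], n=6)

# Matrix multiplication extracts the corresponding subsequence.
Y = P @ X
print(Y)  # rows of X at indices 1, 3, 4
```

A differentiable relaxation of `P` (e.g. row-wise softmax over alignment scores, with a penalty enforcing the monotone ordering) keeps this same matrix-product form, which is what makes the alignment trainable end-to-end and cheap to batch on GPUs.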