Recent advancements in statistical and reinforcement learning methods have contributed to superior patient care strategies. However, these methods face substantial challenges in high-stakes contexts, including missing data, stochasticity, and the need for interpretability and patient safety. Our work operationalizes a safe and interpretable approach for optimizing treatment regimes by matching patients with similar medical and pharmacological profiles. This allows us to construct optimal policies via interpolation. Our comprehensive simulation study demonstrates our method's effectiveness in complex scenarios. We use this approach to study seizure treatment in critically ill patients, advocating for personalized strategies based on medical history and pharmacological features. Our findings recommend reducing medication doses for mild, brief seizure episodes and adopting aggressive treatment strategies for severe cases, leading to improved outcomes.