2026-08-09

Differentiably Discovering Sets of Rules

Summary

Association rule mining is a powerful way to gain insight into complex data, such as single-cell gene expression data, because it discovers conditional dependencies of the kind ‘In samples where high expression of genes A and B is observed, C and D tend to be expressed too’. At least, that is the theory. In practice, traditional methods for mining association rules tend to overwhelm the user with far too many, highly redundant, and mostly spurious results, while modern approaches that fix those problems do not scale to high-dimensional data. In this paper, we propose an end-to-end differentiable approach for mining high-quality sets of rules that scales to hundreds of thousands of features. In particular, we propose an inherently interpretable autoencoder that we call RuleNaps. Its hidden layer consists of two modules that together encode the discovered association rules. The first module encodes the most important associations between features (e.g., ABCD), while the second splits these into conditional dependencies (e.g., AB → CD). By binarizing the weights in the forward pass, we give each neuron a symbolic interpretation that lets us read out the rules directly. By treating the weights as continuous-valued in the backward pass, we can use back-propagation to learn good rules efficiently, even from very high-dimensional data. Experiments on synthetic and real-world data show that RuleNaps performs very well, outperforming the state of the art by a wide margin. In a case study, we confirm that it also delivers in practice, revealing valuable insight into high-dimensional breast cancer data.
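To make the idea of binary weights in the forward pass and continuous weights in the backward pass concrete, the following is a minimal sketch in plain Python. It is not the RuleNaps implementation: the single linear neuron, the squared-error loss, the 0.5 threshold, the clipping to [0, 1], and all function names (binarize, ste_step, read_rule) are assumptions made purely for illustration. The gradient trick shown is what is commonly called a straight-through estimator, and the rule read-out assumes a hypothetical encoding in which one binary mask selects the features of an association and a second mask marks which of them form the consequent.

```python
def binarize(weights, threshold=0.5):
    """Forward-pass view: continuous weights become hard 0/1 selectors."""
    return [1 if w > threshold else 0 for w in weights]


def ste_step(weights, x, target, lr=0.1):
    """One squared-error update using a straight-through gradient (illustrative)."""
    hard = binarize(weights)
    # Forward pass uses the binarized weights only.
    y = sum(h * xi for h, xi in zip(hard, x))
    err = y - target
    # Backward pass: pretend d(hard)/d(weights) = 1, so the gradient of the
    # loss with respect to the binary weights is applied to the continuous ones.
    grads = [2.0 * err * xi for xi in x]
    # Clip to [0, 1] so the continuous weights stay in the binarization range.
    return [min(1.0, max(0.0, w - lr * g)) for w, g in zip(weights, grads)]


def read_rule(itemset_bits, consequent_bits, names):
    """Read 'antecedent -> consequent' from two binary masks (hypothetical encoding)."""
    triples = list(zip(names, itemset_bits, consequent_bits))
    head = [n for n, in_set, in_tail in triples if in_set and not in_tail]
    tail = [n for n, in_set, in_tail in triples if in_set and in_tail]
    return f"{', '.join(head)} -> {', '.join(tail)}"


# Toy sample: features 0 and 1 are present and both should be selected.
weights = [0.45, 0.9, 0.2]
updated = ste_step(weights, x=[1, 1, 0], target=2.0)
print(binarize(weights), "->", binarize(updated))  # [0, 1, 0] -> [1, 1, 0]

# Reading out a rule from two already-binarized masks over features A..D.
print(read_rule([1, 1, 1, 1], [0, 0, 1, 1], ["A", "B", "C", "D"]))  # A, B -> C, D
```

In this toy example a single update pushes the first weight across the threshold, so the symbolic reading of the neuron changes even though the forward pass never saw a fractional weight. The design point is that the continuous weights only accumulate evidence, while everything that is read out stays discrete.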