Ensembles of decision rules extracted from tree ensembles, like RuleFit, promise a good trade-off between predictive performance and model simplicity. However, they are affected by competing interests: While a sufficiently large number of binary, non-smooth rules is necessary to fit smooth, well generalizing decision boundaries, a too high number of rules in the ensemble severely jeopardizes interpretability. As a way out of this dilemma, we propose to take an extra step in the rule extraction step and compress clusters of similar rules into ensemble rules. The outputs ofthe individual rules in each cluster are pooled to produce a single soft output, reflecting the original ensemble's marginal smoothing behaviour. The final model, that we call Compressed Rule Ensemble (CRE), fits a linear combination of ensemble rules. We empirically show that CRE is both sparse and accurate on various datasets, carrying over the ensemble behaviour while remaining interpretable.