BEST : A decision tree algorithm that handles missing values

Cédric Beaulac and Jeffrey S. Rosenthal

Submitted to Machine Learning

Abstract :

In this article we propose a new decision tree algorithm. The proposed approach allows the algorithm to interact with some predictors that are only defined in subspaces of the predictor space. One way to utilize this new algorithm is to create or use a predictor to keep track of missing values. This predictor can later be used to define the subspace where predictors with missing values are available for the data partitioning process. By doing so, this new classification tree can handle missing values for both modelling and prediction. The algorithm is tested against simulated and real data. The result is a classification procedure that efficiently handles missing values and produces results that are more accurate and more interpretable than most common procedures.

Download here