This article describes Boruta, an R package that implements a novel feature selection
algorithm for finding all relevant variables. The algorithm is designed as a wrapper around
a Random Forest classification algorithm. It iteratively removes the features which are
proved by a statistical test to be less relevant than random probes. The Boruta package
provides a convenient interface to the algorithm. A short description of the algorithm
and examples of its application are presented.
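
The idea of comparing features against random probes can be illustrated with a minimal sketch. It is not the Boruta implementation and makes several simplifying assumptions: it is written in Python with the standard library only, the random probes are modeled as shuffled copies of the features, absolute Pearson correlation stands in for Random Forest importance, and a plain binomial tail test with no multiple-comparison correction decides each feature. All function and variable names are hypothetical.

```python
import math
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation (stand-in for Random Forest importance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def upper_tail(hits, rounds):
    """P(X >= hits) for X ~ Binomial(rounds, 0.5)."""
    return sum(math.comb(rounds, k) for k in range(hits, rounds + 1)) / 2 ** rounds


def lower_tail(hits, rounds):
    """P(X <= hits) for X ~ Binomial(rounds, 0.5)."""
    return sum(math.comb(rounds, k) for k in range(hits + 1)) / 2 ** rounds


def shadow_selection(features, target, max_rounds=50, alpha=0.01, seed=0):
    """Confirm or reject features by comparing them with shuffled probes.

    features: dict mapping a feature name to a list of numeric values.
    target:   list of numeric class labels of the same length.
    """
    rng = random.Random(seed)
    undecided = set(features)
    confirmed, rejected = set(), set()
    hits = {name: 0 for name in features}

    for rounds in range(1, max_rounds + 1):
        if not undecided:
            break
        # Build one shuffled probe per feature that has not been rejected
        # and record the best importance any probe achieves this round.
        probe_scores = []
        for name, values in features.items():
            if name in rejected:
                continue
            probe = values[:]
            rng.shuffle(probe)
            probe_scores.append(abs_corr(probe, target))
        threshold = max(probe_scores)

        for name in sorted(undecided):
            # A "hit" means the feature beat the best random probe.
            if abs_corr(features[name], target) > threshold:
                hits[name] += 1
            if upper_tail(hits[name], rounds) < alpha:
                confirmed.add(name)
                undecided.discard(name)
            elif lower_tail(hits[name], rounds) < alpha:
                rejected.add(name)
                undecided.discard(name)

    return sorted(confirmed), sorted(rejected)


# Toy data: "signal" follows the class label; "noise" is a balanced pattern
# constructed to have zero correlation with the label.
data_rng = random.Random(1)
labels = [i % 2 for i in range(200)]
toy_features = {
    "signal": [y + data_rng.gauss(0, 0.5) for y in labels],
    "noise": [(i // 2) % 2 for i in range(200)],
}
confirmed, rejected = shadow_selection(toy_features, labels)
print("confirmed:", confirmed)
print("rejected:", rejected)
```

On this toy data the informative feature beats every shuffled probe in each round, while the uncorrelated one never does, so both are decided as soon as the binomial tail drops below `alpha`. Because the stand-in importance is deterministic, only the probes vary between rounds; with a Random Forest the importance of the real features would also fluctuate from run to run.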