6 Tips for Optimizing TreeNet Gradient Boosting Models

Dan Steinberg
January 2013
Salford Systems
www.salford-systems.com

• While TreeNet (Stochastic Gradient Boosting) can work phenomenally well out
of the box, it almost always pays to tune your control parameters.
Devoting time to optimizing a TreeNet model can noticeably improve its
out-of-sample performance. Here are several things recommended for all
TreeNet users.

© Copyright Salford Systems 2013

• TreeNet grows 200 trees by default,
although you can change this default.

• In real-world modeling we often find that
1,000 or more trees perform better.


• Using a slow learn rate goes hand in hand with
growing enough trees, because the slower your
learn rate is, the more trees you will need.

• There is nothing wrong with using a learn
rate of 0.001 if you are willing to let your
machine run through all the trees you will
need.
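
The trade-off between learn rate and tree count can be illustrated with a minimal sketch in plain Python. This is not TreeNet code: `fit_stump` and `trees_needed` are hypothetical helpers that boost 2-node regression trees on a made-up sine curve, and the learn rates (0.5 and 0.05) and the target error are arbitrary choices for illustration.

```python
import math

def fit_stump(xs, residuals):
    """Find the single split (2-node tree) that minimizes squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every value except the largest is a candidate
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]

def trees_needed(xs, ys, learn_rate, target_mse, max_trees=5000):
    """Count the boosted stumps needed to push training MSE below target_mse."""
    preds = [sum(ys) / len(ys)] * len(ys)  # start from the mean
    for n in range(1, max_trees + 1):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        # Each tree contributes only a fraction (the learn rate) of its fit.
        preds = [p + learn_rate * (lv if x <= t else rv) for x, p in zip(xs, preds)]
        mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)
        if mse <= target_mse:
            return n
    return None  # target not reached within max_trees

# Made-up training data: 30 points on a sine curve.
xs = [i / 10 for i in range(30)]
ys = [math.sin(x) for x in xs]

fast = trees_needed(xs, ys, learn_rate=0.5, target_mse=0.02)
slow = trees_needed(xs, ys, learn_rate=0.05, target_mse=0.02)
print("trees needed at learn rate 0.5: ", fast)
print("trees needed at learn rate 0.05:", slow)
```

The sketch measures training error only. In practice the number of trees would be judged on test or cross-validated performance, which is where a slow learn rate tends to pay off.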


• The default value of 0.10 means that 10% of
the data could be ignored in each training
cycle.
• You ought to experiment with a value of 0.0
to see if it helps or hurts. You can also try
values such as 0.02, 0.05, etc.

Note: If the data are very clean, 0.0 should
work best.


• If 500 trees are needed when you generate
6-node trees, you might need 1,500 or more
when generating just 2-node trees.
• Sometimes moderately large trees work best:
12-node, 15-node, even 25-node trees could
do the trick.
• Since large trees learn more than smaller
trees, you might also need to dial down the
learn rate to prevent overfitting.


• Try Battery LOVO (leave one variable out), as this
might allow you to remove a variable from the
middle of the pack in terms of importance.

• Try Battery SHAVING to remove the least
important variables (shaving from the bottom of
the list). Shaving from the top of the list instead
tests the viability of dropping the "best" variables.

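The idea behind shaving from the bottom can be sketched as a generic loop: refit, record the score, drop the lowest-ranked variable, and repeat. The sketch below is plain Python, not the Battery SHAVING implementation. `shave_from_bottom` is a hypothetical helper, and `toy_fit_and_score` is a made-up stand-in for fitting a model and reading its error and variable importance ranking.

```python
from typing import Callable, List, Sequence, Tuple

def shave_from_bottom(
    variables: Sequence[str],
    fit_and_score: Callable[[List[str]], Tuple[float, List[str]]],
) -> List[Tuple[List[str], float]]:
    """Refit repeatedly, each time dropping the least important variable."""
    current = list(variables)
    history = []
    while current:
        error, ranked = fit_and_score(current)  # ranked: most important first
        history.append((list(current), error))
        current = [v for v in current if v != ranked[-1]]  # shave the bottom one
    return history

# Made-up importances: three useful variables; anything else is noise.
SIGNAL = {"age": 3.0, "income": 2.0, "tenure": 1.0}

def toy_fit_and_score(variables: List[str]) -> Tuple[float, List[str]]:
    """Hypothetical stand-in for 'fit a model, return (error, ranking)'."""
    missing = sum(w for v, w in SIGNAL.items() if v not in variables)
    noise = sum(1 for v in variables if v not in SIGNAL)
    error = missing + 0.1 * noise  # noise variables hurt a little
    ranked = sorted(variables, key=lambda v: SIGNAL.get(v, 0.0), reverse=True)
    return error, ranked

history = shave_from_bottom(
    ["age", "income", "tenure", "noise_a", "noise_b"], toy_fit_and_score
)
best_vars, best_error = min(history, key=lambda step: step[1])
print("best variable set:", best_vars, "error:", best_error)
```

With a real model, `fit_and_score` would report error on held-out data, and the step with the lowest error would suggest how many variables can safely be shaved.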

• First, run some completely additive models.
Even 2-node trees can actually allow
interactions because of the manner in which
TreeNet handles missing values. With the ICL
ADDITIVE command, by contrast, you guarantee
no possible interactions of any kind, including
interactions between missing value indicators
created by TreeNet and other variables.
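
What "additive" means here can be seen in a minimal sketch, assuming complete data with no missing values. It is plain Python, not TreeNet or the ICL ADDITIVE command: `fit_stump` and `boosted_stump_mse` are hypothetical helpers that boost 2-node trees on four made-up rows. Each tree splits on a single variable, so the sum of trees is a sum of one-variable functions. Such a model can fit a target like x1 + x2 but not a pure interaction such as XOR. The missing-value mechanism mentioned above is not modeled in this sketch.

```python
def fit_stump(rows, residuals):
    """Best single-split (2-node) tree over all features and thresholds."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [e for r, e in zip(rows, residuals) if r[j] <= t]
            right = [e for r, e in zip(rows, residuals) if r[j] > t]
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((e - lv) ** 2 for e in left) + sum((e - rv) ** 2 for e in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def boosted_stump_mse(rows, ys, n_trees=200, learn_rate=0.1):
    """Boost 2-node trees and return the final training MSE."""
    preds = [sum(ys) / len(ys)] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        j, t, lv, rv = fit_stump(rows, residuals)
        preds = [p + learn_rate * (lv if r[j] <= t else rv)
                 for r, p in zip(rows, preds)]
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
additive_target = [x1 + x2 for x1, x2 in rows]              # no interaction
interaction_target = [int(x1 != x2) for x1, x2 in rows]     # XOR: pure interaction

additive_mse = boosted_stump_mse(rows, additive_target)
xor_mse = boosted_stump_mse(rows, interaction_target)
print("MSE on additive target:   ", additive_mse)
print("MSE on interaction target:", xor_mse)
```

The additive target is fitted almost exactly, while the error on the XOR target never improves on predicting the mean, because no single-variable split reduces it.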


• Then, in the PRO EX version, you can run the
BATTERY ADDITIVE procedure, which starts
with a fully flexible model and searches for the one
variable that can most readily be made additive
(that is, interact with nothing).
• It then searches for a second variable to be made
additive, and so on, going step by step until all
variables are additive.
• Reviewing the performance curve of this
procedure reveals the optimal balance between
fully free interactivity and limited
interactivity.
• If a variable or variables really do not interact
with any others, then preventing chance
interactions from creeping into the model will
improve the model on future unseen data.


• For more on TreeNet, visit
http://www.salford-systems.com/en/products/treenet

