Extreme Gradient Boosting – Regression Algorithm

Table Of Contents:

  1. Example Of Extreme Gradient Boosting Regression.

Problem Statement:

  • Predict the package of a student based on the CGPA value.

Step-1: Build First Model

  • In the case of a boosting algorithm, the first model is a very simple one.
  • For the regression case, we take the mean of the target (Package) values as our first model.
  • Mean = (4.5 + 11 + 6 + 8)/4 = 29.5/4 = 7.375
  • Model 1's output will therefore be 7.375 for all the records (see the short sketch below).
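
A minimal sketch of this first step, assuming the four Package values from the example (4.5, 11, 6, 8):

```python
# Model 1: the first "model" is simply the mean of the target values
packages = [4.5, 11, 6, 8]                     # Package values from the example
model1_output = sum(packages) / len(packages)
print(model1_output)                           # 7.375 -- predicted for every record
```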

Step-2: Calculate Error Made By First Model

  • To calculate the error we will do a simple subtraction operation.
  • We will subtract the predicted value from the actual value: Residual = Actual - Predicted (see the sketch below).
  • For the record with Package = 4.5: Residual = 4.5 - 7.375 = -2.875
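
Continuing the sketch, the Model-1 residuals are simply actual minus predicted:

```python
# Residual = Actual - Predicted for every record of Model 1
packages = [4.5, 11, 6, 8]
model1_output = sum(packages) / len(packages)      # 7.375

residuals = [p - model1_output for p in packages]
print(residuals)                                   # [-2.875, 3.625, -1.375, 0.625]
```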

Step-3: Building Second Model

  • The second model in the XGBoost algorithm will be a Decision Tree.
  • The inputs for the decision tree will be CGPA and the Model-1 residuals (Residual1).
  • The way the Decision Tree is built is different in the XGBoost algorithm.

Step-4: Tree Building Process.

  • First we make a single leaf node and place all the residuals in it.

Similarity Score:

  • Second, we need to calculate the Similarity Score for each leaf node.
  • The higher the Similarity Score, the better the node.
  • λ = Regularization Parameter.
  • Considering λ = 0.
  • Similarity Score = (sum of residuals)² / (number of residuals + λ) = (-2.8 + 3.7 - 1.3 + 0.7)² / (4 + 0) ≈ 0.02 (with the residuals rounded to one decimal place).
  • Here the Similarity Score is small, which means the residuals inside the leaf node are not similar to each other.
  • Next we will try to increase this Similarity Score (see the sketch below).
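
A minimal sketch of this Similarity Score calculation for the single root leaf, assuming λ = 0 and the one-decimal rounded residuals used in the example:

```python
def similarity_score(residuals, lam=0.0):
    """XGBoost similarity score for squared-error loss:
    (sum of residuals)^2 / (number of residuals + lambda)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)

root_residuals = [-2.8, 3.7, -1.3, 0.7]    # residuals rounded to one decimal place
print(similarity_score(root_residuals))    # ~0.0225, i.e. about 0.02
```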

Create A Full Decision Tree:

  • With a single leaf node, our similarity score is quite low.
  • Hence we need to do more splitting on the single leaf node to increase the similarity score.
  • As we have here only one feature CGPA we will split it based on CGPA.
  • To choose a splitting criterion we will sort the CGPA values.

Steps To Create Decision Tree:

Step-1: Sort The CGPA In Ascending Order.

  • To find the best splitting criterion we first sort the CGPA values in ascending order and take the average of each pair of adjacent values.
  • We will use these average values as candidate splits for our original single leaf node (see the sketch below).
  • We will choose the value that gives the best improvement in similarity (the Gain) and make that split the root node.
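
A short sketch of the candidate split points, assuming the CGPA values implied by the example's splits (5.0, 6.7, 7.5 and 9.0):

```python
# CGPA values implied by the example's split points (assumed: 5.0, 6.7, 7.5, 9.0)
cgpa = sorted([6.7, 5.0, 9.0, 7.5])

# Candidate splitting criteria = average of each pair of adjacent sorted values
candidates = [(a + b) / 2 for a, b in zip(cgpa, cgpa[1:])]
print(candidates)   # ~[5.85, 7.1, 8.25]
```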

Step-2: Use Splitting Criteria-1: (5.85)

  • Let’s check for the first splitting criteria which is 5.85.
  • After that we will calculate the Similarity Score for each leaf node and find out the Gain.
  • We will find the Similarity Score for each leaf node using the formula shown above.

Step-3: Calculate Similarity Score For Each Leaf Node

  • Here you can notice that the leaf whose residuals are more similar to each other has the higher Similarity Score.

Step-4: Calculate Total Gain For The Tree

  • To find the Gain we use: Gain = Similarity Score(left leaf) + Similarity Score(right leaf) - Similarity Score(root).
  • Here 0.52 is the Gain; in other words, we have increased the similarity of the residuals by splitting them into two parts (see the sketch below).
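
A short sketch of the Gain calculation for the first criterion (CGPA < 5.85), assuming the CGPA–residual pairs implied by the worked numbers: (5.0, 0.7), (6.7, -2.8), (7.5, -1.3), (9.0, 3.7):

```python
def similarity_score(residuals, lam=0.0):
    return sum(residuals) ** 2 / (len(residuals) + lam)

data = [(5.0, 0.7), (6.7, -2.8), (7.5, -1.3), (9.0, 3.7)]   # (CGPA, residual), assumed pairing
root = [r for _, r in data]

split = 5.85
left  = [r for c, r in data if c < split]     # [0.7]
right = [r for c, r in data if c >= split]    # [-2.8, -1.3, 3.7]

gain = similarity_score(left) + similarity_score(right) - similarity_score(root)
print(round(gain, 2))   # ~0.52
```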

Step-5: Use Splitting Criteria-2: (7.1)

  • Let's use the second splitting criterion and find the Gain for the split.
  • We will use the same steps as above to find the gain.
  • Here you can notice that the Gain is higher than for the first criterion.

Step-6: Use Splitting Criteria-3: (8.25)

  • Let's use the third splitting criterion and find the Gain for the split.
  • We will use the same steps as above to find the gain.
  • Here you can notice that the Gain is higher than for the second criterion.

Step-7: Conclusion

  • Criteria-1(5.85) Gain = 0.52
  • Criteria-2(7.1) Gain = 5.06
  • Criteria-3(8.25) Gain = 17.51
  • Here we can conclude that the Gain for the third criterion is the highest (see the verification sketch below).
  • Hence we will make the third criterion (CGPA < 8.25) our root node split.
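
To double-check this conclusion, the same Gain calculation can be run for all three candidate criteria (again using the CGPA–residual pairs assumed above; the values match up to small rounding differences):

```python
def similarity_score(residuals, lam=0.0):
    return sum(residuals) ** 2 / (len(residuals) + lam)

data = [(5.0, 0.7), (6.7, -2.8), (7.5, -1.3), (9.0, 3.7)]   # (CGPA, residual), assumed pairing
root = [r for _, r in data]

for split in (5.85, 7.1, 8.25):
    left  = [r for c, r in data if c < split]
    right = [r for c, r in data if c >= split]
    gain  = similarity_score(left) + similarity_score(right) - similarity_score(root)
    print(split, round(gain, 2))   # gains ~0.52, ~5.06 and ~17.5; 8.25 is the largest
```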

Step-8: Again Split The Left Leaf Node.

  • Here we will do one more split to further increase the Similarity Score and, in turn, the Gain of the split.
  • Now we need to find a splitting criterion to split this leaf node.
  • We have already used one splitting criterion, which is 8.25.
  • We have two more splitting criteria left to use, which are 5.85 and 7.1.
  • We will try both splitting criteria and find the Gain for each.

Step-9: Splitting Criteria 5.85

  • Now we will split the Leaf Node using the criteria CGPA < 5.85.
  • The similarity score for the node being split (the left child of the root) is as follows.
  • Now we will find out the similarity score for the two leaf nodes and find out the Gain.

Step-10: Splitting Criteria 7.1

  • Now we will split the Leaf Node using the criteria CGPA < 7.1.
  • The similarity score for the node being split (the left child of the root) is as follows.
  • Now we will find out the similarity score for the two leaf nodes and find out the Gain.

Step-11: Conclusion

  • Criteria-1 (5.85) Gain = 5.04
  • Criteria-2 (7.1) Gain = 0.04
  • Here we can conclude that the Gain for the first criterion is the highest (see the verification sketch below).
  • Hence we will use the first criterion (CGPA < 5.85) as the splitting criterion for this node.
  • Our final Decision Tree will look like this.
  • We could split the tree further, but it would overfit; hence we will stop here.
  • The default MaxDepth = 6 for XGBoost.
  • As we have a small dataset we will keep MaxDepth = 2.
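
A similar sketch for this second-level split of the left child (the records with CGPA < 8.25), again under the assumed CGPA–residual pairs:

```python
def similarity_score(residuals, lam=0.0):
    return sum(residuals) ** 2 / (len(residuals) + lam)

# Records that fell into the left child of the root split (CGPA < 8.25)
left_node = [(5.0, 0.7), (6.7, -2.8), (7.5, -1.3)]   # (CGPA, residual), assumed pairing
parent = [r for _, r in left_node]

for split in (5.85, 7.1):
    left  = [r for c, r in left_node if c < split]
    right = [r for c, r in left_node if c >= split]
    gain  = similarity_score(left) + similarity_score(right) - similarity_score(parent)
    print(split, round(gain, 2))   # 5.85 -> ~5.04 (chosen), 7.1 -> ~0.04
```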

Step-12: Calculate Output Values For All Leaf Nodes.

  • As this is a regression task, we need to calculate the output value for each leaf node.
  • Keep λ = 0.
  • Output value of a leaf = (sum of residuals in the leaf) / (number of residuals + λ); with λ = 0 this is simply the average of the residuals (see the sketch below).
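
A short sketch of the leaf output values of the finished tree, assuming λ = 0 and the leaf groupings implied by the final splits:

```python
# Residuals grouped by the three leaves of the final tree (assumed grouping)
leaves = {
    "CGPA < 5.85":         [0.7],
    "5.85 <= CGPA < 8.25": [-2.8, -1.3],
    "CGPA >= 8.25":        [3.7],
}

lam = 0.0
for name, residuals in leaves.items():
    output = sum(residuals) / (len(residuals) + lam)   # with lambda = 0: plain average
    print(name, output)                                # 0.7, -2.05, 3.7
```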

Step-13: Calculate Stage-2 Combined Model Output

  • Now we need to calculate the combined output, which means Model1 + Model2 output.
  • The default value for Learning Rate = 0.3.
  • Now we will calculate the combined output for all the records.
  • For CGPA = 6.7,
  • Output = 7.3 + 0.3 * (-2.05) = 6.69 (7.3 is the Model-1 output rounded to one decimal place).
  • Here -2.05 is the output of the leaf in the second decision tree that CGPA = 6.7 falls into.
  • Like this we will calculate the combined output for all the records (see the sketch below).
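
A minimal sketch of the Stage-2 combined prediction for the CGPA = 6.7 record:

```python
model1_output = 7.3       # Model-1 prediction (the mean, rounded to one decimal place)
learning_rate = 0.3       # XGBoost's default learning rate (eta)
tree_output   = -2.05     # output of the leaf that CGPA = 6.7 falls into

stage2_output = model1_output + learning_rate * tree_output
print(stage2_output)      # 6.685, i.e. about 6.69
```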

Step-14: Calculate Residuals For Model-2

  • Residuals for Stage 2 are calculated as: Residual2 = Package - Stage-2 combined output.
  • For CGPA = 6.7, Residual2 = 4.5 - 6.69 = -2.19.
  • Like this, we will do it for all the records.
  • Here you can notice that the residual values are getting smaller, e.g. from -2.8 to -2.19.
  • The residuals/error should keep moving toward zero, which means we are heading in the right direction (see the sketch below).
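
And the corresponding Stage-2 residual for that record, showing the error shrinking toward zero:

```python
package       = 4.5       # actual Package for the CGPA = 6.7 record
stage2_output = 6.685     # combined Model 1 + Model 2 prediction from above

residual2 = package - stage2_output
print(residual2)          # -2.185, i.e. about -2.19 (was -2.8 after Model 1 alone)
```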

Step-15: Building Third Decision Tree.

  • You can build the third Decision Tree, by following the same steps.
  • The inputs for the 3rd Decision Tree will be the CGPA and Residual2 columns.

Step-16: Final Output

  • The final output will be the combination of the Model1 + Model2 + Model3 outputs.
  • Using this combination you can predict the final output value for new records (see the sketch below).
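
As a sketch, the final boosted prediction is the Model-1 output plus the learning rate times each subsequent tree's leaf output (the third tree's leaf values are not worked out here):

```python
def boosted_prediction(model1_output, tree_leaf_outputs, learning_rate=0.3):
    """Final output = Model-1 output + learning_rate * (leaf output of each later tree)."""
    return model1_output + learning_rate * sum(tree_leaf_outputs)

# For CGPA = 6.7 after two stages (Model 1 plus the second tree's leaf output of -2.05):
print(boosted_prediction(7.3, [-2.05]))   # 6.685, i.e. about 6.69
```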

Step-17: Conclusion

  • Here the goal is to reduce the Residual/Error value.
  • The closer the residuals are to zero, the better the model (a library-based sketch of this example follows below).
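
For comparison, the same setup can be reproduced with the xgboost library; this is a rough sketch that assumes the four (CGPA, Package) records implied by the worked numbers and hyperparameters matching this walkthrough:

```python
import numpy as np
from xgboost import XGBRegressor

# (CGPA, Package) records assumed from the worked example
X = np.array([[5.0], [6.7], [7.5], [9.0]])
y = np.array([8.0, 4.5, 6.0, 11.0])

model = XGBRegressor(
    n_estimators=2,      # two boosted trees on top of the base prediction
    max_depth=2,         # small trees, as in the walkthrough (default is 6)
    learning_rate=0.3,   # XGBoost's default eta
    reg_lambda=0,        # lambda = 0, matching the hand calculation
    base_score=7.375,    # start from the mean of the target
)
model.fit(X, y)
print(model.predict(X))
```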
