Suppose we decided to run gradient boosting for 700 iterations, setting the learning rate to 0.5:
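A minimal sketch of such a run, assuming a binary classification task; the synthetic Hastie dataset, the train/test split, and the choice of scikit-learn's `GradientBoostingClassifier` are illustrative assumptions, not part of the original setup:

```python
from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative data: a synthetic binary classification problem.
X, y = make_hastie_10_2(n_samples=4000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 700 boosting iterations with a learning rate of 0.5, no subsampling.
baseline = GradientBoostingClassifier(
    n_estimators=700, learning_rate=0.5, random_state=0
)
baseline.fit(X_train, y_train)
```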
We then repeated the experiment twice with subsampling enabled: first with the same learning rate (0.5), and then with a learning rate of 0.3, producing the black and the magenta curves shown below:
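A sketch of the two subsampled runs and their test-deviance curves, continuing from the data split above; the subsample fraction (0.5 here) is not stated in the text and is an assumption, as is the use of log loss via `staged_predict_proba` to track per-iteration deviance:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import log_loss

# Two subsampled runs: same learning rate as before (0.5), then 0.3.
runs = [
    ("subsampling, lr=0.5", "black",
     GradientBoostingClassifier(n_estimators=700, learning_rate=0.5,
                                subsample=0.5, random_state=0)),
    ("subsampling, lr=0.3", "magenta",
     GradientBoostingClassifier(n_estimators=700, learning_rate=0.3,
                                subsample=0.5, random_state=0)),
]

for label, color, model in runs:
    model.fit(X_train, y_train)
    # Test-set deviance (log loss) after each boosting iteration.
    deviance = [
        log_loss(y_test, proba)
        for proba in model.staged_predict_proba(X_test)
    ]
    plt.plot(np.arange(1, 701), deviance, color=color, label=label)

plt.xlabel("Boosting iterations")
plt.ylabel("Test deviance")
plt.legend()
plt.show()
```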
What can we say about the resulting performance, and about the effects of the learning rate and of subsampling?