1. Some materials are taken from the machine learning course of Victor Kitov.
Regression: $\widehat{y}(x)=F(x)$
Binary classification: $score(y|x)=F(x),\,\widehat{y}(x)= sign(F(x))$
Input: training dataset $(x_{i},y_{i}),\,i=1,2,...n$; number of additive weak classifiers $M$; a family of weak classifiers $h(x)\in\{+1,-1\}$, trainable on weighted datasets.
ALGORITHM:
Initialize observation weights $w_{i}=\frac{1}{n},\,i=1,2,...n$
For $m=1,2,...M$:
fit weak classifier $h^{m}(x)$ to the training set weighted by $w_{1},...,w_{n}$
compute the weighted error $\varepsilon_{m}=\sum_{i=1}^{n}w_{i}\mathbb{I}[h^{m}(x_{i})\ne y_{i}]$
set $\alpha_{m}=\frac{1}{2}\ln\frac{1-\varepsilon_{m}}{\varepsilon_{m}}$
update weights $w_{i}\leftarrow w_{i}e^{-\alpha_{m}y_{i}h^{m}(x_{i})}$ and normalize them to sum to one
Output: composite classifier $f(x)=sign\left(\sum_{m=1}^{M}\alpha_{m}h^{m}(x)\right)$
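The steps above can be sketched from scratch with NumPy and depth-1 decision trees as the weak classifiers. The function names `adaboost_fit` and `adaboost_predict` are illustrative, not from any library:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, M=10):
    """Minimal AdaBoost sketch; labels are assumed to be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                       # uniform initial weights
    stumps, alphas = [], []
    for _ in range(M):
        # fit a decision stump on the weighted dataset
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = h.predict(X)
        # weighted error, clipped away from 0 and 1 for numerical stability
        eps = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)     # classifier weight
        w = w * np.exp(-alpha * y * pred)         # upweight misclassified points
        w = w / w.sum()                           # renormalize to sum to one
        stumps.append(h)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # sign of the weighted vote of all weak classifiers
    F = sum(a * h.predict(X) for a, h in zip(alphas, stumps))
    return np.sign(F)
```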
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# toy dataset: class +1 occupies the inner band, class -1 the outer one
X = np.array([[-2, -1], [-2, 1], [2, -1], [2, 1],
              [-1, -1], [-1, 1], [1, -1], [1, 1]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
plt.scatter(X[:, 0], X[:, 1], c=y, s=500)

# note: in newer scikit-learn versions base_estimator is named estimator
ada = AdaBoostClassifier(n_estimators=3, algorithm='SAMME',
                         base_estimator=DecisionTreeClassifier(max_depth=1))
ada.fit(X, y)
plot_decision(ada)
ada.estimator_weights_
from sklearn.datasets import make_moons
from ipywidgets import interact, IntSlider, FloatSlider

X, y = make_moons(noise=0.1)
plt.figure(figsize=(17, 15))
plt.scatter(X[:, 0], X[:, 1], c=y)
interact(ada_demo, n_est=IntSlider(min=1, max=150, value=1, step=1))
Gradient descent algorithm:
Input: differentiable function $F(x)$; learning rate $\eta$, controlling the speed of convergence; number of iterations $M$; initial point $x^{0}$.
ALGORITHM:
For $m=1,2,...M$: $x^{m}=x^{m-1}-\eta\nabla F(x^{m-1})$
Output: $x^{M}$, an approximate minimizer of $F$.
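The update rule can be illustrated in a few lines (the helper name `gradient_descent` is ours, not a library function):

```python
import numpy as np

def gradient_descent(grad, x0, eta=0.1, M=100):
    """Plain gradient descent: x^m = x^{m-1} - eta * grad(x^{m-1})."""
    x = np.asarray(x0, dtype=float)
    for _ in range(M):
        x = x - eta * grad(x)
    return x

# minimize F(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3.0), x0=[0.0], eta=0.1, M=200)
```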
interact(grad_small_demo,
n_est=IntSlider(min=0, max=50, value=0, step=1),
learning_rate=FloatSlider(min=0.1, max=1., value=0.1, step=0.05),
max_depth=IntSlider(min=1, max=5, value=1, step=1))
Input: training dataset $(x_{i},y_{i}),\,i=1,2,...N$; loss function $\mathcal{L}(f,y)$; learning rate $\nu$ and the number $M$ of successive additive approximations.
ALGORITHM:
Initialize $f_{0}(x)$ with a constant: $f_{0}(x)=\arg\min_{\gamma}\sum_{i=1}^{N}\mathcal{L}(\gamma,y_{i})$
For each step $m=1,2,...M$:
compute pseudo-residuals $r_{i}=-\left.\frac{\partial\mathcal{L}(f,y_{i})}{\partial f}\right|_{f=f_{m-1}(x_{i})},\,i=1,2,...N$
fit weak learner $h_{m}(x)$ to the dataset $(x_{i},r_{i}),\,i=1,2,...N$
set $f_{m}(x)=f_{m-1}(x)+\nu h_{m}(x)$
Output: approximation function $f_{M}(x)=f_{0}(x)+\sum_{m=1}^{M}\nu h_{m}(x)$
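For squared loss $\mathcal{L}(f,y)=\frac{1}{2}(y-f)^{2}$ the pseudo-residuals reduce to ordinary residuals $y_{i}-f_{m-1}(x_{i})$, which gives a compact from-scratch sketch (the names `gb_fit` and `gb_predict` are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gb_fit(X, y, M=100, nu=0.1, max_depth=2):
    """Gradient boosting sketch for squared loss with fixed learning rate nu."""
    f0 = np.mean(y)                      # constant initial approximation f_0
    F = np.full(len(y), f0)
    trees = []
    for _ in range(M):
        r = y - F                        # pseudo-residuals = negative gradient
        h = DecisionTreeRegressor(max_depth=max_depth).fit(X, r)
        F = F + nu * h.predict(X)        # f_m = f_{m-1} + nu * h_m
        trees.append(h)
    return f0, trees

def gb_predict(X, f0, trees, nu=0.1):
    # f_M(x) = f_0(x) + sum_m nu * h_m(x)
    return f0 + nu * sum(h.predict(X) for h in trees)
```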
import numpy as np
import matplotlib.pyplot as plt

y = np.array([0, 0, 1, 1, 0])
X = np.array([
    [1, 1],
    [2, 1],
    [1, 2],
    [-1, 1],
    [-1, 0],
])
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='flag')
interact(grad_small_demo_class,
n_est=IntSlider(min=1, max=50, value=1, step=1),
learning_rate=FloatSlider(min=0.1, max=1., value=0.1, step=0.05),
max_depth=IntSlider(min=1, max=5, value=1, step=1))
interact(grad_demo, n_est=IntSlider(min=1, max=150, value=1, step=1))
Input: training dataset $(x_{i},y_{i}),\,i=1,2,...N$; loss function $\mathcal{L}(f,y)$ and the number $M$ of successive additive approximations.
ALGORITHM:
Initialize $f_{0}(x)$ with a constant: $f_{0}(x)=\arg\min_{\gamma}\sum_{i=1}^{N}\mathcal{L}(\gamma,y_{i})$
For each step $m=1,2,...M$:
compute pseudo-residuals $r_{i}=-\left.\frac{\partial\mathcal{L}(f,y_{i})}{\partial f}\right|_{f=f_{m-1}(x_{i})},\,i=1,2,...N$
fit weak learner $h_{m}(x)$ to the dataset $(x_{i},r_{i}),\,i=1,2,...N$
solve univariate optimization problem: $$ \sum_{i=1}^{N}\mathcal{L}\left(f_{m-1}(x_{i})+c_{m}h_{m}(x_{i}),y_{i}\right)\to\min_{c_{m}\in\mathbb{R}_{+}} $$
set $f_{m}(x)=f_{m-1}(x)+c_m h_{m}(x)$
Output: approximation function $f_{M}(x)=f_{0}(x)+\sum_{m=1}^{M}c_m h_{m}(x)$
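The univariate step-size search can be illustrated with `scipy.optimize.minimize_scalar`; this is a hedged sketch (the name `gb_fit_linesearch` is ours), not the implementation behind the demos above:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.tree import DecisionTreeRegressor

def gb_fit_linesearch(X, y, loss, neg_grad, M=50, max_depth=2):
    """Gradient boosting where each step size c_m solves a 1-D minimization."""
    f0 = np.mean(y)                          # constant initial approximation
    F = np.full(len(y), f0)
    trees, steps = [], []
    for _ in range(M):
        r = neg_grad(F, y)                   # pseudo-residuals
        h = DecisionTreeRegressor(max_depth=max_depth).fit(X, r)
        p = h.predict(X)
        # univariate problem: min over c >= 0 of sum_i L(f_{m-1} + c * h_m, y_i)
        res = minimize_scalar(lambda c: loss(F + c * p, y).sum(),
                              bounds=(0.0, 10.0), method='bounded')
        F = F + res.x * p                    # f_m = f_{m-1} + c_m * h_m
        trees.append(h)
        steps.append(res.x)
    return f0, trees, steps

# squared loss and its negative gradient as an example plug-in
loss = lambda f, y: 0.5 * (y - f) ** 2
neg_grad = lambda f, y: y - f
```

For squared loss the optimal $c_{m}$ even has a closed form; the bounded search above is only there to show the general recipe for arbitrary differentiable losses.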
Input: training dataset $(x_{i},y_{i}),\,i=1,2,...N$; loss function $\mathcal{L}(f,y)$ and the number $M$ of successive additive approximations.
Output: approximation function $f_{M}(x)$
Comments:
Subsampling: at each iteration the weak learner can be fit on a random subsample of the training set (stochastic gradient boosting), which regularizes the ensemble and speeds up training.
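In scikit-learn, subsampling is exposed through the `subsample` parameter of `GradientBoostingClassifier`; a quick sketch (the dataset and split here are our own choices for illustration):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# subsample < 1.0 turns this into stochastic gradient boosting:
# each tree is fit on a random 50% of the training set
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
gb = GradientBoostingClassifier(n_estimators=100, subsample=0.5,
                                random_state=0).fit(X_tr, y_tr)
acc = gb.score(X_te, y_te)
```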