1. Some materials are taken from the machine learning course of Victor Kitov.
1. (R)MSE ((Root) Mean Squared Error)
$$ L(\hat{y}, y) = \frac{1}{N}\sum\limits_{n=1}^N (y_n - \hat{y}_n)^2$$
RMSE is the square root of the MSE.
2. MAE (Mean Absolute Error)
$$ L(\hat{y}, y) = \frac{1}{N}\sum\limits_{n=1}^N |y_n - \hat{y}_n|$$
3. RRSE (Root Relative Squared Error)
$$ L(\hat{y}, y) = \sqrt{\frac{\sum\limits_{n=1}^N (y_n - \hat{y}_n)^2}{\sum\limits_{n=1}^N (y_n - \bar{y})^2}}$$
4. RAE (Relative Absolute Error)
$$ L(\hat{y}, y) = \frac{\sum\limits_{n=1}^N |y_n - \hat{y}_n|}{\sum\limits_{n=1}^N |y_n - \bar{y}|}$$
5. MAPE (Mean Absolute Percentage Error)
$$ L(\hat{y}, y) = \frac{100}{N} \sum\limits_{n=1}^N\left|\frac{ y_n - \hat{y}_n}{y_n}\right|$$
6. RMSLE (Root Mean Squared Logarithmic Error)
$$ L(\hat{y}, y) = \sqrt{\frac{1}{N}\sum\limits_{n=1}^N(\log(y_n + 1) - \log(\hat{y}_n + 1))^2}$$
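For quick reference, here is a minimal NumPy sketch of the metrics above (the function names are ours, chosen for illustration):

import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rrse(y, y_hat):
    # normalizes the squared error by the spread around the mean
    return np.sqrt(np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

def rae(y, y_hat):
    return np.sum(np.abs(y - y_hat)) / np.sum(np.abs(y - y.mean()))

def mape(y, y_hat):
    # assumes y contains no zeros
    return 100 * np.mean(np.abs((y - y_hat) / y))

def rmsle(y, y_hat):
    # assumes y, y_hat >= 0; np.log1p(x) = log(x + 1)
    return np.sqrt(np.mean((np.log1p(y) - np.log1p(y_hat)) ** 2))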
The plot below compares the logarithmic error with the (scaled) squared error for a fixed true value: RMSLE penalizes underprediction more than overprediction of the same magnitude.

import numpy as np
import matplotlib.pyplot as plt

y = 10000
y_hat = np.linspace(0, 30000, 151)
# log error for a single true value y
error1 = np.sqrt((np.log(y + 1) - np.log(y_hat + 1))**2)
# squared error, scaled down so both curves fit on one plot
error2 = (y - y_hat)**2 / 1000.
plt.plot(y_hat, error1, label='RMSLE')
plt.plot(y_hat, error2, label='MSE')
plt.xlabel(r'$\hat{y}$')
plt.ylabel('Error')
plt.title('true value y = %.1f' % y)
plt.legend()
plt.ylim(0, 10)
plt.show()
The confusion matrix $M=\{m_{ij}\}_{i,j=1}^{C}$ shows the number of objects of class $\omega_{i}$ predicted as belonging to class $\omega_{j}$.
Diagonal elements correspond to correct classifications, off-diagonal elements to misclassifications.
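A minimal sketch using scikit-learn's confusion_matrix (the labels are illustrative):

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
# rows correspond to true classes, columns to predicted classes
print(confusion_matrix(y_true, y_pred))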
Interactive demo: the $F_\beta$ measure for different values of $\beta$:
interact(demo_fscore, beta=FloatSlider(min=0.1, max=5, step=0.3, value=1))
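A standalone check of how $\beta$ trades precision against recall, using scikit-learn's fbeta_score (the labels are illustrative):

from sklearn.metrics import fbeta_score

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0]
# beta > 1 emphasizes recall, beta < 1 emphasizes precision
for beta in (0.5, 1.0, 2.0):
    print(beta, fbeta_score(y_true, y_pred, beta=beta))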
Decision rule based on discriminant functions: predict $\omega_{1}$ if $g(x) > \mu$, otherwise $\omega_{2}$.
Decision rule based on probabilities: predict $\omega_{1}$ if $\widehat{p}(\omega_{1}|x) > \mu$, otherwise $\omega_{2}$.
If $\mu \downarrow$, the algorithm predicts $\omega_{1}$ more often, so both the true positive rate and the false positive rate increase.
The ROC curve characterizes classification quality across all values of $\mu$.
Area under the ROC curve
A global quality characteristic, summarizing performance over all thresholds $\mu$.
AUC$\in[0,1]$
AUC property: it is equal to the probability that for two random objects $x_{1}\in\omega_{1}$ and $x_{2}\in\omega_{2}$ it holds that $\widehat{p}(\omega_{1}|x_{1})>\widehat{p}(\omega_{1}|x_{2})$.
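A minimal sketch checking this pairwise interpretation against scikit-learn's roc_auc_score (the data is synthetic):

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)              # true labels
scores = y + rng.normal(scale=1.0, size=200)  # noisy scores for class 1

# fraction of (positive, negative) pairs where the positive object scores higher
pos, neg = scores[y == 1], scores[y == 0]
print(np.mean(pos[:, None] > neg[None, :]), roc_auc_score(y, scores))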
What about the unbalanced case?
Let $TPR@K\%$ be the positive class rate in the top $K\%$ segment of the dataset, sorted by score. Then
$$ \text{Model Lift}@K\% = \frac{TPR@K\%}{r_{POS}}, $$
where $r_{POS}$ is the share of positive objects in the whole dataset.
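A minimal NumPy sketch of this computation (the helper name lift_at_k is ours):

import numpy as np

def lift_at_k(y, scores, k):
    # take the top-k fraction of objects by predicted score
    n_top = max(1, int(len(y) * k))
    top = np.argsort(scores)[::-1][:n_top]
    # positive rate in the top segment vs. the overall positive rate
    return y[top].mean() / y.mean()

y = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.65, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
print(lift_at_k(y, scores, k=0.2))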
$$ \tilde{\mathcal{L}}(X, \theta) = \sum_n w_n\mathcal{L}(x_n, \theta) $$
Usually $w_n$ is inversely proportional to the frequency of the class of object $x_n$.
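In scikit-learn this reweighting is exposed through the class_weight parameter; a minimal sketch on synthetic data:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 9:1 imbalanced binary problem
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
# 'balanced' sets w_n inversely proportional to class frequencies
clf = LogisticRegression(class_weight='balanced').fit(X, y)
print((clf.predict(X) == 1).mean())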
interact(demo_weight, class_weight=['balanced', None], ratio=FloatSlider(min=0.05, max=0.5, step=0.05))
Undersampling: reduce the number of majority-class objects.
interact(demo_under, ratio=FloatSlider(min=0.05, max=0.5, step=0.05), sampler=[None, 'rand', 'cluster', 'editnn', 'condnn', 'nearmiss'])
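A minimal sketch with the imbalanced-learn package, assuming it is installed:

from collections import Counter
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
# randomly drop majority-class objects until the classes are balanced
X_res, y_res = RandomUnderSampler(random_state=0).fit_resample(X, y)
print(Counter(y), Counter(y_res))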
Oversampling: increase the number of minority-class objects, e.g. by random duplication or SMOTE.
interact(demo_over, ratio=FloatSlider(min=0.05, max=0.5, step=0.05), sampler=[None, 'rand', 'smote'])
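The oversampling counterpart, again assuming imbalanced-learn is installed:

from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
# SMOTE synthesizes new minority-class objects by interpolating between neighbors
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y), Counter(y_res))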