
Metrics

  • For comparing models: train a model and evaluate its out-of-the-box performance
    • Regression: RMSE, \(R^2\)
    • Classification: Accuracy, Confusion Matrix, Precision, Recall, F1-score, ROC AUC

root_mean_squared_error

from sklearn.metrics import root_mean_squared_error

r_squared = reg.score(X_test, y_test)          # R^2 of the fitted regressor on the test set
rmse = root_mean_squared_error(y_test, y_pred) # RMSE, in the same units as the target
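
For context, a minimal sketch of how reg, X_test, y_test and y_pred might be produced, assuming synthetic data and a LinearRegression model (note: root_mean_squared_error requires scikit-learn ≥ 1.4; older versions use mean_squared_error(..., squared=False)):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_test)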

confusion_matrix and classification_report

from sklearn.metrics import classification_report, confusion_matrix

y_pred = knn.predict(X_test)

print(confusion_matrix(y_test, y_pred))       # rows = actual classes, columns = predicted classes
print(classification_report(y_test, y_pred))  # precision, recall, F1-score and support per class
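
The snippet assumes a fitted classifier named knn; a minimal setup sketch on synthetic data (KNeighborsClassifier with an assumed n_neighbors=5):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)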

roc_curve

  • A Logistic Regression model performs binary classification by applying a threshold to its predicted probability. E.g. with threshold \(p=0.5\), a predicted probability of 0.7 gives the output 1 since \(0.7 \gt p\).
  • The ROC (Receiver Operating Characteristic) curve visualizes how different thresholds \(p\) affect the true positive rate (TPR) and false positive rate (FPR).
    • \(p=0\) classifies everything as positive, so TPR = 100% and FPR = 100%.
    • \(p=1\) classifies everything as negative, so TPR = 0% and FPR = 0%.
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

fpr, tpr, thresholds = roc_curve(y_test, y_pred_probs)  # y_pred_probs: predicted probabilities of the positive class
plt.plot([0, 1], [0, 1], 'k--')  # diagonal reference: the ROC curve of a random classifier
plt.plot(fpr, tpr)
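
A minimal sketch of where y_pred_probs might come from, assuming synthetic data and a LogisticRegression model named logreg:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

logreg = LogisticRegression().fit(X_train, y_train)
y_pred_probs = logreg.predict_proba(X_test)[:, 1]  # column 1 = probability of the positive class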

roc_auc_score

To reduce the ROC curve to a single score, we take the area under the curve (AUC): 0.5 corresponds to a random classifier, 1.0 to a perfect one.

from sklearn.metrics import roc_auc_score

roc_auc_score(y_test, y_pred_probs)  # single score summarizing the ROC curve
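
A tiny standalone check of what the score means, with hypothetical hand-made labels and probabilities (AUC is the fraction of positive/negative pairs that are ranked in the correct order):

from sklearn.metrics import roc_auc_score

# 4 samples: true labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

print(roc_auc_score(y_true, y_score))  # 0.75: 3 of the 4 positive/negative pairs are ranked correctly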