Metrics
- Used to compare models: train a model, then evaluate its performance out-of-the-box
- Regression: RMSE, \(R^2\)
- Classification: accuracy, confusion matrix, precision, recall, F1-score, ROC AUC
root_mean_squared_error
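A minimal, self-contained sketch (the synthetic make_regression data and LinearRegression model are illustrative assumptions, not from the text; root_mean_squared_error requires scikit-learn >= 1.4, older versions use mean_squared_error(..., squared=False)):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import root_mean_squared_error
from sklearn.model_selection import train_test_split

# Illustrative synthetic regression problem
X, y = make_regression(n_samples=200, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

y_pred = LinearRegression().fit(X_train, y_train).predict(X_test)
print(root_mean_squared_error(y_test, y_pred))  # RMSE, in the same units as y
```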
confusion_matrix and classification_report
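A minimal sketch (again with illustrative synthetic data and a LogisticRegression classifier as assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Illustrative synthetic binary classification problem
X, y = make_classification(n_samples=200, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

y_pred = LogisticRegression().fit(X_train, y_train).predict(X_test)
print(confusion_matrix(y_test, y_pred))       # rows = true labels, columns = predicted labels
print(classification_report(y_test, y_pred))  # per-class precision, recall, F1-score, support
```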
roc_curve
- A logistic regression model lets us perform binary classification by setting a threshold \(p\). E.g. with threshold \(p = 0.5\), if the model's predicted probability is 0.7, then the output is 1 since \(0.7 \gt p\) (see the sketch after this list).
- The ROC (Receiver Operating Characteristic) curve visualizes how different thresholds \(p\) affect the TPR (true positive rate) and FPR (false positive rate).
- \(p=0\) will classify everything as positive and thus TPR = 100% and FPR = 100%.
- \(p=1\) will classify everything as negative and thus TPR = 0% and FPR = 0%.
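A minimal sketch of thresholding, assuming a fitted LogisticRegression called model and held-out X_test, y_test (hypothetical names, not from the text):

```python
p = 0.5                                           # decision threshold
y_pred_probs = model.predict_proba(X_test)[:, 1]  # predicted probability of the positive class
y_pred = (y_pred_probs >= p).astype(int)          # e.g. 0.7 >= 0.5 -> class 1
```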
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

fpr, tpr, thresholds = roc_curve(y_test, y_pred_probs)
plt.plot([0, 1], [0, 1], 'k--')  # dashed diagonal: the ROC curve of a random classifier
plt.plot(fpr, tpr)
roc_auc_score
To quantify the ROC curve with a single score, we take the "area under the curve" (AUC).
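A minimal sketch, reusing y_test and y_pred_probs from the roc_curve snippet above (pass probabilities, not hard class labels):

```python
from sklearn.metrics import roc_auc_score

print(roc_auc_score(y_test, y_pred_probs))  # 0.5 ~ random classifier, 1.0 = perfect ranking
```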
roc_curve for a random classifier (the dashed diagonal)