Logistic Regression pt.1

In a recent post I created a table that contained two classes of data: images that represent either the handwritten digit ‘5’ or the digit ‘6’.  In this post I’ll model the data using logistic regression.  I will also take the opportunity to look at the role of training and test datasets, and to highlight the distinction between testing and validation.
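To make the testing versus validation distinction concrete, here is a minimal sketch of one common convention, not necessarily the exact scheme used in this series: the training rows fit the model, the validation rows guide choices between candidate models, and the test rows are held back for a final check. The function name `three_way_split`, the 60/20/20 proportions, and the row count of 100 are all illustrative assumptions.

```python
import random


def three_way_split(n_rows, frac_train=0.6, frac_valid=0.2, seed=0):
    """Shuffle row indices and cut them into train / validation / test lists.

    The proportions are illustrative assumptions, not values from the post.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    n_train = int(n_rows * frac_train)
    n_valid = int(n_rows * frac_valid)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains
    return train, valid, test


# Hypothetical table of 100 rows (for example, 100 digit images)
train, valid, test = three_way_split(100)
print(len(train), len(valid), len(test))
```

Every row lands in exactly one of the three lists, so no image used to fit or tune the model is reused when it is finally assessed.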


Performance Profiling

In my last post I illustrated the performance boost generated by using matrix operations to conduct least squares regression calculations.  Matrices by their nature require numerical data.  So what about handling a categorical predictor variable?  To do this it’s necessary to create dummy variables: separate 0/1 indicator variables, one for each unique level of the predictor variable.  When the model also includes an intercept, one level is usually left out to act as the reference, since otherwise the dummy columns sum to the intercept column.
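As a rough illustration of dummy coding, the sketch below turns a categorical column into 0/1 indicator columns using plain Python lists. The helper `dummy_encode`, its `drop_first` flag, and the `colour` data are hypothetical names invented for this example; they are not taken from the original post's code.

```python
def dummy_encode(values, drop_first=True):
    """Return (kept_levels, rows) with one 0/1 column per kept level.

    With drop_first=True the first level (in sorted order) is omitted and
    acts as the reference level, which suits a model with an intercept.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1.0 if v == level else 0.0 for level in kept] for v in values]
    return kept, rows


# Hypothetical categorical predictor
colour = ["red", "green", "blue", "green"]

kept, dummies = dummy_encode(colour)

# Prepend a column of ones for the intercept to form a numeric design matrix
design = [[1.0] + row for row in dummies]

print(kept)
for row in design:
    print(row)
```

Here "blue" sorts first and becomes the reference level, so a blue row is encoded as all zeros in the dummy columns. The resulting numeric rows can be passed to whatever matrix routine performs the least squares calculation.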