Sam Jang's Blog

Posts

Showing posts from June, 2020

Overfitting vs Underfitting

Hello! In this post, we will explore the concepts of overfitting and underfitting.

Overfitting: Overfitting is a modelling error that occurs when the model fits the training data too closely, capturing its noise along with the underlying trend. In the post's overfitting image, the model passes almost perfectly through the training points. Its cost on the training data would be near 0, but its accuracy would be poor on testing data.

Underfitting: Underfitting occurs when the model is too simple to capture the trend or pattern in the training data. In the post's underfitting image, the model is just a straight line and does not fit the training data well.
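To make the contrast concrete, here is a minimal sketch in plain Python. It is not from the original post: the noisy sine data, the noise level, and the helper names (fit_line, fit_interpolant, mse) are all assumptions chosen for illustration. The straight line stands in for the underfitted model, and a polynomial forced through every training point stands in for the overfitted one.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares straight line y = m*x + c (the 'too simple' model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    c = mean_y - m * mean_x
    return lambda x: m * x + c

def fit_interpolant(xs, ys):
    """Lagrange polynomial through every training point (the 'too flexible' model)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict

def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Assumed toy data: a sine wave plus Gaussian noise.
random.seed(0)
true_fn = lambda x: math.sin(2 * math.pi * x)
train_x = [i / 9 for i in range(10)]
train_y = [true_fn(x) + random.gauss(0, 0.2) for x in train_x]
test_x = [(i + 0.5) / 9 for i in range(9)]  # points between the training points
test_y = [true_fn(x) + random.gauss(0, 0.2) for x in test_x]

line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
interp_train, interp_test = mse(interp, train_x, train_y), mse(interp, test_x, test_y)

print("straight line : train MSE", line_train, "test MSE", line_test)
print("interpolant   : train MSE", interp_train, "test MSE", interp_test)
```

The interpolant's training cost is essentially 0 because it reproduces every training point, yet its cost on the held-out points is not. The straight line has a high cost on both sets, which is the signature of underfitting.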

Support Vector Machine

How the algorithm works: As the post's image shows, an SVM separates different classes with a hyperplane. To keep the explanation simple, I will use a 2D graph.

In the 2D case, the SVM finds the optimal weights w and bias b that separate two classes. The separating line has the equation w*x - b = 0. When w*x - b ≥ 0 the model predicts class 1, and when w*x - b < 0 it predicts class 2. The SVM tries to maximize the margin between the two boundary lines w*x - b = 1 and w*x - b = -1. The points that lie on these boundary lines (the blue and green points in the image) are the support vectors.

Hyperplane Definition:
w*x_i - b ≥ +1 when y_i = +1 (class 1)
w*x_i - b ≤ -1 when y_i = -1 (class 2)

The two can be combined to form: y_i (w*x_i - b) ≥ 1

Finding the Separation of the Margin: We k...
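The decision rule and the combined constraint can be checked numerically with a short sketch. Everything concrete here is an assumption for illustration: the weights w, the bias b, and the four sample points are made up, not taken from the post. The margin width 2 / ||w|| used at the end is the standard result for this formulation, stated here independently of the post's own derivation.

```python
import math

# Hypothetical weights, bias, and labelled 2D points (not from the post).
w = (1.0, 1.0)
b = 3.0
points = [
    ((2.0, 2.0), +1),
    ((3.0, 3.0), +1),
    ((1.0, 1.0), -1),
    ((0.0, 0.0), -1),
]

def score(x):
    """Signed value w*x - b for a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) - b

def predict(x):
    """Class +1 when w*x - b >= 0, otherwise class -1."""
    return 1 if score(x) >= 0 else -1

# Combined constraint: y_i * (w*x_i - b) >= 1 for every training point.
constraints_ok = all(y * score(x) >= 1 for x, y in points)

# Support vectors sit exactly on w*x - b = +1 or w*x - b = -1.
support_vectors = [x for x, y in points if math.isclose(y * score(x), 1.0)]

# Distance between the two boundary lines: 2 / ||w||.
margin_width = 2 / math.hypot(*w)

print("constraints satisfied:", constraints_ok)
print("support vectors:", support_vectors)
print("margin width:", margin_width)
```

A smaller ||w|| gives a wider margin, which is why maximizing the margin is usually posed as minimizing ||w|| subject to the constraint above.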