Supervised Learning

Supervised Learning:
We give the Machine a dataset in which every example comes with the right answer, and the Machine trains on it. Once it has built a Model from that data, we ask it to predict the outcome for new, similar data.

In simple terms, the Machine first learns from the given data and then predicts the outcomes based on that.

1) Housing Price Prediction:
Andrew Ng uses the Housing Price Prediction problem to illustrate this. It is a Regression problem, because the value being predicted (the price) is continuous.

The dataset contains 2 columns: Size in sq. ft. and Price in $1000s. Once the Machine has been trained on this dataset, we can ask it what the Price of a house would be given its Size.

Here, Size in sq. ft. is the ONLY Feature/Attribute.
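
To make the "train, then predict" idea concrete, here is a minimal sketch in plain Python. It assumes a straight-line model fitted by ordinary least squares. The data values and the names fit_line and predict_price are made up for illustration and do not come from the course.

```python
# Hypothetical training data (made up for illustration, not from the course).
sizes = [1000, 1500, 2000, 2500, 3000]   # Size in sq. ft. (the only feature)
prices = [200, 280, 410, 490, 620]       # Price in $1000s (the "right answers")


def fit_line(xs, ys):
    """Fit price = slope * size + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the two parameters from the labelled examples.
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predict the price (in $1000s) of a house of the given size."""
    return slope * size + intercept


# "Prediction": ask about a size that was not in the training data.
print(predict_price(1750))
```

With this made-up data the prediction for a 1750 sq. ft. house comes out at roughly 347.5, i.e. about $347,500.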



The Pink Line denotes a Linear equation (a straight-line fit).
The Blue Line denotes a Quadratic equation (a polynomial of degree 2).
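
The two kinds of curve can be sketched in code as follows. This is only an illustration under stated assumptions: the data is made up, sizes are expressed in thousands of sq. ft. to keep the numbers small, and fit_polynomial, evaluate and sse are hypothetical helper names. The fit uses least squares via the normal equations, which is one common approach and not necessarily the one used in the course.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical data (made up): size in 1000s of sq. ft., price in $1000s.
sizes_k = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200, 330, 440, 530, 600]

linear = fit_polynomial(sizes_k, prices, 1)      # the "pink line"
quadratic = fit_polynomial(sizes_k, prices, 2)   # the "blue line"


def sse(coeffs):
    """Sum of squared errors of a fitted curve on the training data."""
    return sum((evaluate(coeffs, x) - y) ** 2 for x, y in zip(sizes_k, prices))


print("linear fit error:   ", sse(linear))
print("quadratic fit error:", sse(quadratic))
```

On this particular made-up dataset the quadratic curve follows the points more closely, but a lower training error alone does not settle which model is the better choice.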

Let's not worry for now about how to choose between the two.

2) Breast Cancer Prediction:
In the case below, we plot Tumor Size against Severity, i.e. Malignant (Cancer) or Benign (No Cancer). Blue crosses mean Benign and Red crosses mean Malignant. This is a Binary Classification problem: given the Tumor Size, we need to predict whether the Tumor is Malignant or Benign.

Tumor Size is the ONLY Feature / Attribute considered for Training the model.
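
A very small sketch of one-feature Binary Classification is shown below. It assumes the simplest possible model, a single cut-off on Tumor Size learned from labelled examples. The data, the size units and the names count_errors, fit_threshold and predict are all made up for illustration, and the course may use a different algorithm for this problem.

```python
# Hypothetical training data (made up for illustration).
tumor_sizes = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # the only feature
labels = [0, 0, 0, 0, 1, 1, 1, 1]                        # 0 = Benign, 1 = Malignant


def count_errors(threshold, xs, ys):
    """Number of training examples misclassified by the rule 'size > threshold'."""
    return sum((x > threshold) != bool(y) for x, y in zip(xs, ys))


def fit_threshold(xs, ys):
    """Try the midpoint between each pair of neighbouring sizes and keep the best."""
    ordered = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    return min(candidates, key=lambda t: count_errors(t, xs, ys))


# "Training": learn the cut-off from the labelled examples.
threshold = fit_threshold(tumor_sizes, labels)


def predict(size):
    """Return 1 (Malignant) if the size is above the learned cut-off, else 0 (Benign)."""
    return 1 if size > threshold else 0


print("learned threshold:", threshold)
print("prediction for size 1.2:", predict(1.2))
print("prediction for size 4.2:", predict(4.2))
```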



The above was a Binary Classification problem, but the same problem can also be a Multi-class Classification problem. For example, there can be 4 possible values for the Cancer level:
1) 0 - Benign
2) 1 - Type 1 Cancer
3) 2 - Type 2 Cancer
4) 3 - Type 3 Cancer


The above problem has only one feature, but in general a problem can have many features, potentially a very large number. With that in mind, let's look at the same "Breast Cancer Problem" with 2 features, namely Age and Tumor Size.


The Blue Circles denote patients of that Age and Tumor Size whose tumors were found to be Benign, and the Red Crosses denote patients whose tumors were found to be Malignant.

There can be other Features too, such as "Clump Thickness" and the others mentioned in the image above, which can also be used to build a better Model.

In this case the Machine tries to draw a line that separates the people with Benign tumors from those with Malignant tumors. This is a Classification problem, and to be exact a Binary Classification problem, since the output takes only two values (TRUE or FALSE), i.e. Malignant or Benign.
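
The idea of "drawing a line" between the two groups can be sketched with a perceptron, one simple algorithm for learning a separating line. This is an assumption made for illustration, not necessarily the method used in the course. The patient data, the scaling constants and the names scale, train_perceptron and predict are all hypothetical.

```python
# Hypothetical patients (made up for illustration): (age, tumor size), label.
# Label 0 = Benign, 1 = Malignant. Tumor size is in arbitrary units.
patients = [
    ((25, 1.0), 0), ((30, 1.5), 0), ((35, 1.2), 0),
    ((45, 1.8), 0), ((50, 1.0), 0), ((40, 2.0), 0),
    ((55, 3.5), 1), ((60, 4.0), 1), ((65, 3.0), 1),
    ((70, 4.5), 1), ((50, 4.2), 1), ((45, 3.8), 1),
]


def scale(age, size):
    """Bring both features to a similar range (assumed scaling constants)."""
    return age / 100.0, size / 5.0


def train_perceptron(data, max_epochs=1000):
    """Learn weights for the line w_age*age + w_size*size + bias = 0."""
    w_age, w_size, bias = 0.0, 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (age, size), label in data:
            x_age, x_size = scale(age, size)
            target = 1 if label == 1 else -1
            score = w_age * x_age + w_size * x_size + bias
            if target * score <= 0:
                # Misclassified: nudge the line towards this example.
                w_age += target * x_age
                w_size += target * x_size
                bias += target
                mistakes += 1
        if mistakes == 0:
            break  # every training example is on the correct side
    return w_age, w_size, bias


w_age, w_size, bias = train_perceptron(patients)


def predict(age, size):
    """Return 1 (Malignant) or 0 (Benign) depending on the side of the line."""
    x_age, x_size = scale(age, size)
    return 1 if w_age * x_age + w_size * x_size + bias > 0 else 0


print("prediction for age 60, size 3.8:", predict(60, 3.8))
print("prediction for age 38, size 1.4:", predict(38, 1.4))
```

The made-up data here can be separated perfectly by a straight line, which is what lets the perceptron stop with no mistakes. Real data usually overlaps, and other algorithms handle that case better.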

