From a20adde6bf2fb5de5cc738af161dd61ea419f7a0 Mon Sep 17 00:00:00 2001
From: Tyler Perkins
Date: Sat, 26 Feb 2022 23:00:01 -0500
Subject: [PATCH] Update for 26-02-22 23:00

---
 tech/machine_learning.wiki | 46 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/tech/machine_learning.wiki b/tech/machine_learning.wiki
index 3d6e09b..6ba67b9 100644
--- a/tech/machine_learning.wiki
+++ b/tech/machine_learning.wiki
@@ -1,3 +1,47 @@
 = Machine Learning =
 
-Machine learning
+Machine learning is a technique in which an algorithm is given data and, over
+time, 'learns' to approximate the underlying relationship in that data.
+
+== Types of learning ==
+
+=== Supervised ===
+
+Supervised learning is when an algorithm `F(x)` takes some non-target
+attributes `x` and attempts to approximate some known ground truth, `y`. It is
+supervised because the expected result for the data is known.
+
+=== Unsupervised ===
+
+Unsupervised learning is when no ground truth is provided to the algorithm. We
+instead focus on finding broad correlations.
+
+* Clustering is when a given data set is split into groups based on
+  similarities among the samples in each group. For example, a set of customer
+  profiles can be clustered based on possible interests.
+* Association is when an algorithm attempts to find items that co-occur in a
+  dataset. For example, finding what products tend to be purchased together.
+
+=== Semi-supervised ===
+
+Semi-supervised learning is used when only a small part of the data is labeled.
+We can generally either:
+
+* train the model on the limited set of labeled datapoints, then have it
+  perform unsupervised training on the rest of the data
+* cluster the unlabeled data, then use the labeled data to refine the
+  clusters
+
+== Output ==
+
+Machine learning models are divided into two types by their output:
+
+* Classification (discrete outputs)
+* Regression (continuous outputs)
+
+Classification is used to distinguish between two or more choices.
+
+For regression models, each sample is often a tuple of several attributes
+describing it. Each element of the tuple can be either categorical
+(discrete) or numeric (continuous). These elements are often called
+features.
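
As a minimal sketch of the feature idea above, the following Python turns one sample tuple into a purely numeric feature list by one-hot encoding its categorical field. The field names (`color`, `size_cm`, `weight_kg`), the `COLORS` list, and the helpers `one_hot` and `encode` are made up for illustration. One-hot encoding is only one common way to handle categorical features, not something these notes prescribe.

```python
# Hypothetical sample layout: (color, size_cm, weight_kg).
# "color" is categorical; the other two fields are numeric.
COLORS = ["red", "green", "blue"]  # assumed, fixed set of categories


def one_hot(value, categories):
    """Return a list with 1.0 at the position of value and 0.0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode(sample):
    """Turn a mixed (categorical, numeric, numeric) tuple into numeric features."""
    color, size_cm, weight_kg = sample
    return one_hot(color, COLORS) + [float(size_cm), float(weight_kg)]


features = encode(("green", 12, 0.4))
print(features)  # [0.0, 1.0, 0.0, 12.0, 0.4]
```

Numeric fields pass through unchanged, while the single categorical field expands into one slot per category, so a regression model would only ever see numbers.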