= Machine Learning =
Machine learning is a technique in which an algorithm is given data, and over
time 'learns' to approximate the underlying relationship in said data.
== Types of learning ==
=== Supervised ===
Supervised learning is when an algorithm `F(x)` takes some non-target attributes
`x` and attempts to approximate some known ground truth, `y`. It is called
supervised because the expected result for each sample is known.
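As a minimal sketch of the idea, assume `F(x)` is a straight line `a*x + b` fitted by least squares. The data, the helper `fit_line` and the linear form are assumptions made for illustration; real problems rarely have such a simple relationship.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training data: the ground truth y is known for every x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

a, b = fit_line(xs, ys)


def F(x):
    """The learned approximation of the relationship between x and y."""
    return a * x + b


print(a, b, F(5.0))
```

Because every `x` comes with its `y`, the fit can be checked directly against the known answers.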
=== Unsupervised ===
Unsupervised learning is when no ground truth is provided to the algorithm. We
instead focus on finding broad correlations.
* Clustering is when a given data set is divided into groups based on
similarities among the samples in each group. For example, a data set made up of
customer profiles can be clustered based on possible interests.
* Association is when an algorithm attempts to find out which items tend to
occur together in a dataset. For example, finding what products tend to be
purchased together.
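Both ideas can be sketched in a few lines. The example below assumes a one-dimensional k-means for clustering and simple pair counting for association; the data, the starting centers and the function name `kmeans_1d` are made up for illustration, and neither part is meant as a production algorithm.

```python
from collections import Counter
from itertools import combinations


def kmeans_1d(points, centers, iterations=10):
    """Tiny 1-D k-means: assign each point to the nearest center, then re-average."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Clustering: hypothetical customer spending values, no labels given.
spending = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(spending, centers=[0.0, 5.0])

# Association: count how often each pair of products shares a basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(centers, groups)
print(top_pair, top_count)
```

Note that no ground truth appears anywhere: the groups and the frequent pair come purely from structure in the data.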
=== Semi-supervised ===
Semi-supervised learning is used when only a small portion of the dataset is
labeled. We can generally either:
* train the model on the limited set of labeled datapoints then have it perform
unsupervised training on the rest of the data
* cluster the unlabeled data, then use the labeled data to refine the
clusters
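The first option can be sketched as self-training (pseudo-labeling), which is one possible reading of it rather than the only one. The example assumes a 1-D nearest-centroid classifier; the data and the helpers `centroids` and `predict` are hypothetical.

```python
def centroids(labeled):
    """Train: compute the mean feature value for each label."""
    sums = {}
    for x, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


# Hypothetical data: only two samples carry labels.
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 8.0, 8.5]

# Step 1: train on the small labeled set.
model = centroids(labeled)

# Step 2: let the model label the rest of the data itself.
pseudo_labeled = [(x, predict(model, x)) for x in unlabeled]

# Step 3: retrain on the labeled and pseudo-labeled samples together.
model = centroids(labeled + pseudo_labeled)

print(model, predict(model, 4.0))
```

In practice only confident predictions are usually kept in step 2, since wrong pseudo-labels reinforce themselves on retraining.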
== Output ==
Machine learning models are divided into two types based on their output:
* Classification (discrete outputs)
* Regression (continuous outputs)
Classification is used to distinguish between two or more choices.
For regression models, each sample is often a tuple of several attributes
describing it. Each element of the tuple can be either categorical (discrete)
or numeric (continuous). These elements are often called features.
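A rough sketch of how such a tuple might be handled: the categorical feature is one-hot encoded so that everything becomes numeric, then the same score is used once as a regression output and once as a classification output. The category list, weights, threshold and function names are all assumptions chosen for illustration.

```python
COLORS = ["red", "green", "blue"]  # hypothetical categories


def one_hot(value, categories):
    """Encode a categorical value as a list of 0.0/1.0 flags."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode(sample):
    """Turn a (numeric size, categorical color) tuple into a numeric vector."""
    size, color = sample
    return [float(size)] + one_hot(color, COLORS)


def regress(features, weights):
    """Regression: the output is a continuous number."""
    return sum(w * f for w, f in zip(weights, features))


def classify(features, weights, threshold=5.0):
    """Classification: the output is one of a fixed set of labels."""
    return "large" if regress(features, weights) > threshold else "small"


sample = (3.0, "green")           # one numeric and one categorical feature
features = encode(sample)         # [3.0, 0.0, 1.0, 0.0]
weights = [2.0, 1.0, 0.5, 0.0]    # made-up weights, not learned from data

score = regress(features, weights)
label = classify(features, weights)
print(features, score, label)
```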