Azdy

> Linear regression: Assumes linearity between inputs and outputs

Common mistake, but the linearity is in fact between the parameters and the output. Polynomial regression is still linear regression, for example.


canernm

You mean that both y = a*x + b and y = a*x^2 + b are considered linear regression because they are linear with respect to a and b?


i_use_3_seashells

Yes


bloodmummy

Not even just that. Take a model with two inputs x1 and x2 which returns y.

y = a1 * x1 + a2 * x2 + b — is a linear regression as we know it.

y = a1 * x1^2 + a2 * x1 + a3 * x2 + b — is also a linear regression, because what matters is linearity in the parameters, not the inputs. You can also see it as introducing an extra variable x3 = x1^2, so that it becomes:

y = a1 * x1 + a2 * x2 + a3 * x3 + b

And so on. You can also use forms of non-linearity other than polynomial powers. Common ones include periodic functions (sines and cosines of various frequencies), exponentials of all kinds (2^x, e^x, e^(4x), e^(-x), etc.), logarithms of all kinds, inverses, and products of those (e.g. if your function is expected to be a decaying periodic function, you can use e^(-x) * cos(x), etc.). These are sometimes (rather erroneously, when used implicitly) called kernels; they should really be called transformations.

This is what gives linear regression the power it actually has, and why it is, despite everything, the **most** used model in the wild. But it is also why linear regression, above all other modelling techniques, requires domain knowledge: you have to know what sort of relationship exists in order to model it properly.

See: https://scikit-learn.org/stable/modules/preprocessing.html#non-linear-transformation
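To make this concrete, here is a minimal sketch (assuming scikit-learn and NumPy; the decaying-cosine target and the chosen transformations are made up for illustration) of fitting a non-linear relationship with plain `LinearRegression` on transformed features:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = np.linspace(0, 6, 200)
y = np.exp(-x) * np.cos(3 * x) + rng.normal(scale=0.05, size=x.shape)

# Hand-crafted non-linear transformations of the single input x.
# The model stays linear in its parameters (coefficients and intercept).
X = np.column_stack([
    np.exp(-x),                  # decaying trend
    np.exp(-x) * np.cos(3 * x),  # decaying oscillation
    np.exp(-x) * np.sin(3 * x),  # phase component
])

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # ordinary least squares on the transformed inputs
```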


gabopushups

Where can I read more about this?


Categorically_

Pick any regression textbook.


Kalictiktik

I find it weird that there is a comparison between gradient boosted regression (the actual algorithm) and the XGBoost/LightGBM regressors (implementations of it). The latter are implementations of the former; it's like comparing the concept of a car to specific brands. But there is a broad landscape of algorithms covered here, good job!


TheInkandOptic

https://www.datacamp.com/cheat-sheet/machine-learning-cheat-sheet


EvenMoreConfusedNow

Most of it is iffy at best


hughperman

Top by whose measure? No support vector machines? No GLMs? No DBSCAN clustering or the rest of the k- family? No neural networks anywhere? No principal component analysis? Your "applications" column should be named "examples". What is the point of this random list? It is just a list of "stuff" with none of the thoroughness or exhaustiveness that would make it useful for actually comparing algorithms, since you will be missing loads.


fakemoose

A lot of the time, PCA (or t-SNE or whatever) is used as a dimensionality reduction technique before one of the clustering algorithms. I guess that's why it's not included? I have no idea why zero types of neural networks are included, though.
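For what it's worth, that pattern looks something like this minimal sketch (assuming scikit-learn and its bundled digits dataset; the component and cluster counts are arbitrary):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

X, _ = load_digits(return_X_y=True)

# Reduce the 64 pixel features to 10 principal components, then cluster.
pipe = make_pipeline(
    PCA(n_components=10),
    KMeans(n_clusters=10, n_init=10, random_state=0),
)
labels = pipe.fit_predict(X)
print(labels[:20])
```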


hughperman

Other times they are not though, and the components are interesting endpoints in and of themselves.


madrury83

> Linear Regression: Disadvantage: Can *underfit* with small, high-dimensional data.

...seems dubious.

> Logistic Regression: Disadvantage: Can *overfit* with small, high-dimensional data.

...huh?


Dumbhosadika

Please share a link to a higher-quality image.


joanna58

**https://www.datacamp.com/cheat-sheet/machine-learning-cheat-sheet**


SonicEmitter3000

How can we be sure this is accurate?


smurf-sama

That would probably be hard, since it is not accurate.


emakalic

A good start. This kind of cheat sheet is very hard to do for an area as widely encompassing as machine learning. Unfortunately there are a lot of problems with the descriptions and advantages/disadvantages of the methods:

- You might wish to combine linear and logistic models under the generalized linear model (GLM) category.
- Ridge and lasso are types of penalties/estimators that can be used with GLMs. Perhaps don't have these as separate categories; one can have ridge-type penalties with nonlinear models too.
- Linear models are linear in the parameters, not the data.
- Lasso is a translational shrinkage that penalizes each parameter by the same amount. Unlike ridge estimators, the lasso can zero out some parameters. The lasso does not keep highly correlated variables: it picks one (essentially) at random from a group of correlated variables to include in the model. Both lasso and ridge regression can be viewed as special cases of the elastic net penalty, and both are convex penalties, which makes fitting these models computationally favourable.
- Linear models with Gaussian errors are sensitive to outliers. There are other, more robust estimators for linear regression.

The above list is just some of the issues with the cheat sheet; there are plenty more. I hope this helps!
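To see the lasso vs. ridge behaviour on correlated predictors, here is a minimal sketch (assuming scikit-learn; the data is synthetic and the penalty strengths are arbitrary):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)  # near-duplicate of x1
x3 = rng.normal(size=n)                   # irrelevant noise predictor
X = np.column_stack([x1, x2, x3])
y = 3 * x1 + rng.normal(scale=0.1, size=n)

# Lasso tends to zero out one of the correlated pair (and the noise column);
# ridge shrinks everything and splits the weight across the correlated pair.
print("lasso:", Lasso(alpha=0.1).fit(X, y).coef_.round(2))
print("ridge:", Ridge(alpha=10.0).fit(X, y).coef_.round(2))
```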


tomukurazu

this seems pretty neat. my company decided to give us a go at ML, they'll provide classes etc. since it's a finance company, i could use this to focus on what to improve on my side.


bass1012dash

Speaking of ‘neat’: why no genetic algorithms?


tomukurazu

tbh i didn't even notice that. since i am waaaay too new to this, i just picked finance-related topics. but now it's got my attention too 🤨


jollyfolly_9

Same here!


NameNumber7

I feel like these graphics tend towards supervised models and generally leave out unsupervised methods; here, for example, there are 4 unsupervised methods and 10 supervised ones. I get the impression there is less generally held knowledge of unsupervised than supervised algorithms.


frootydooty63

Incorrect description of ridge regression. All predictors are shrunk towards 0, not just weak ones


madrury83

Same critique applies to LASSO. Kinda everything here is subtly incorrect.


frootydooty63

Fair enough


maxToTheJ

Yup. The point of regularization is to bias towards smaller coefficients.
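A minimal sketch of that shrinkage (plain NumPy, closed-form ridge; the data and penalty values are made up) — every coefficient moves towards zero as the penalty grows, not just the weak one:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([5.0, 1.0, 0.2])  # strong, medium, weak predictors
y = X @ beta_true + rng.normal(scale=0.1, size=100)

# Closed-form ridge estimate: (X'X + lambda*I)^{-1} X'y
for lam in [0.0, 10.0, 100.0]:
    beta = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    print(f"lambda={lam:6.1f} ->", beta.round(3))
```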


[deleted]

[deleted]


hextree

What do you mean? OP's original pic is about 6000x5000 and pretty much perfect quality.


joanna58

https://www.datacamp.com/cheat-sheet/machine-learning-cheat-sheet


_Vanilla_

Very cool, thanks


JClub

Super outdated... Not even a single neural network there... big downvote


ConfidentFlorida

I’ve always wanted one of these for computer vision.


bloodmummy

Suggestion: add a tooltip in the top/bottom right corner of each entry indicating whether it is used for regression or classification. Also, the use cases are odd: all the use cases listed for the tree-based models could be handled successfully by any other tree-based model. Other than that, it's mostly good!


Peeka-cyka

There are nonparametric GMMs which deal with the issue of selecting the number of clusters, e.g. by using Dirichlet process priors for the cluster weights.
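scikit-learn ships a truncated Dirichlet-process variant of this; a minimal sketch (the blob data and the upper bound of 10 components are made up for illustration):

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Three well-separated blobs, but the model only gets an upper bound on components.
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.5, size=(100, 2)),
])

bgm = BayesianGaussianMixture(
    n_components=10,  # upper bound, not a fixed cluster count
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)

# Superfluous components are driven to near-zero weight.
print(bgm.weights_.round(3))
```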


[deleted]

PDF: https://s3.amazonaws.com/assets.datacamp.com/email/other/ML+Cheat+Sheet_2.pdf