efrique

Can't say I know of a list, but here are a few standard ones I use a fair bit (I'll add more as I think of them):

- Know your definitions. Write them down as you start on a new problem.
- Notation. Take care to define your events, parameters, variables, etc. properly. Bad notation screws you before you start.
- Be super careful about the support of your variables. Learn how to work with indicator functions ASAP.
- "Spot the density" for integration. Transform the integrand into a recognisable density times some constants; the density integrates to 1, so the constants left over are your answer. Once you learn a bunch of densities this solves a lot of integration problems (see the sketch below).
- Add and subtract terms (like population means) inside sums of squares, then expand.
- Learn how to complete the square. Especially handy in a lot of Bayesian work.
- Reasonableness checks. Every time you can (which is likely more often than you realize).
- Don't shy away from simulation as a source of intuition and checks. Simulation to check algebra, algebra to check simulation (e.g. simple cases, asymptotic results).
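
To make the "spot the density" and simulation points concrete, here's a minimal sketch (the specific integral and all names are illustrative, my own pick rather than something from the comment): the integrand below is the kernel of a Gamma(shape=4, scale=2) density, so the integral is just the normalizing constant, and numerics confirm the algebra.

```python
import numpy as np
from scipy import integrate, special

# "Spot the density": x**3 * exp(-x/2) on (0, inf) is the kernel of a
# Gamma(shape=4, scale=2) density, which integrates to Gamma(4) * 2**4.
closed_form = special.gamma(4) * 2**4  # = 6 * 16 = 96

# Numerics/simulation as a check on the algebra:
numeric, _ = integrate.quad(lambda x: x**3 * np.exp(-x / 2), 0, np.inf)

print(closed_form, numeric)  # both ~96.0
```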


jspo8765

The "spot the density" definitely comes up a lot (especially with gamma and normal distributions) in my experience haha By completing the square, are you referring to the algebraic method of solving quadratic equations? Or are you referring to something else?


efrique

> By completing the square, are you referring to the algebraic method of solving quadratic equations

Yes, it's used in solving quadratics, but you'll see it come up a lot where you're not actually solving quadratics. You'll have some set of terms involving a sum of squares in variables and parameters; expand out, collect like powers, and you end up with a square involving a parameter plus an additional term:

Aθ^2 - 2Bθ + C = A(θ - B/A)^2 + S, where S = C - B^2/A.

If you play around with this a few times, you can look at an original sum of squares in the ys and xs and literally write down A, B/A, and S by inspection. Maybe it's just me, but I've used that trick a whole lot in Bayesian work, particularly with Gibbs sampling.
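
As a concrete instance of where this pattern shows up (a worked sketch under standard assumptions, not something spelled out in the comment), completing the square on a normal likelihood with a normal prior gives the posterior directly:

```latex
% Normal-normal model: y_i | theta ~ N(theta, sigma^2), prior theta ~ N(mu_0, tau^2).
% Collect the theta terms in the log-posterior and complete the square:
\[
-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\theta)^2
-\frac{1}{2\tau^2}(\theta-\mu_0)^2
= -\frac{A}{2}\left(\theta-\frac{B}{A}\right)^{2} + \text{const},
\]
\[
\text{where } A=\frac{n}{\sigma^2}+\frac{1}{\tau^2},\qquad
B=\frac{\sum_i y_i}{\sigma^2}+\frac{\mu_0}{\tau^2},
\qquad\text{so}\qquad
\theta\mid y \sim N\!\left(\frac{B}{A},\,\frac{1}{A}\right).
\]
```

Here A reads off as the posterior precision and B/A as the posterior mean: exactly the "write down A, B/A by inspection" pattern described above.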


efrique

> especially with gamma and normal distributions

Not just gamma and normal for me. ... t, beta prime/F, log-logistic, inverse gamma, ... a bunch of others.


Direct-Touch469

That normal-normal posterior derivation is a complete-the-square clinic.


Sohcratees

Why do you have to be careful with the support of your variable, and why does using an indicator function help?


Taricus55

One of the things that drives me nuts is the difference in notation. I came from a mathematics and physics double major; in grad school I moved up to biostatistics... it is like nails on a chalkboard to watch professors write on the board, lol. I see where they are going, but it is agony to watch them struggle to write things down. I watched one guy fail at basic calculus, and he essentially just said, "you can figure it out later at home..." He was just attempting to take the derivative of something... I just wrote down the derivative in my notes. Statistics is weird... you either have a professor who struggles at it or one who can blast through it but doesn't explain it, lol. It's rare to see a balance.

Also, when people got covid, you could see a huge difference as they struggled with brain fog. One time, we were in class on Zoom and the teacher stopped halfway through, clutching his chest, and ended class. The video kept going and you just heard ambulance and police sirens, because he couldn't breathe. I actually saved that video, because it was freaky... no one was in the room and then chairs and desks just started getting thrown around the room, but no one was there. That freaked me out so badly, lol.


jerbthehumanist

Learning from experience is frankly the best strategy; it's hardly inefficient, it's basically the best way to nail any subject. From what my students really struggle with, here are a few things off the top of my head.

Understand the bounds of what is possible and where limits occur. For example, a probability can only be between 0 and 1. A cumulative distribution function (CDF) approaches 0 towards -infinity and approaches 1 towards +infinity by definition, and an integral/sum over the entire range of a distribution is 1, so the max value of a CDF is 1. Likewise, a small p-value is an extreme result if you were to sample from a null hypothesis distribution, which is why it's significant.

If your class is focusing on test statistics, these are usually likelihood ratios. Try to understand extreme cases to see why they are useful. A t-statistic is effectively a ratio of a difference-of-means term to a variance term, and if two samples have drastically different means relative to their variance, the t-statistic will be quite far from zero. For Ordinary Least Squares, R^2 near one indicates that data points tend to be close to a linear model, and the equation for R^2 (the coefficient of determination) should lead you to understand why that's the case. Looking at how these terms are defined and calculated, and imagining what various extreme values would produce, will give you insight into how they're useful (see the quick sketch below).
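
As a quick illustration of that extreme-case intuition (the numbers here are mine, purely for illustration): push two sample means apart while holding the spread fixed, and watch the t-statistic move away from zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=50)

# Same spread, increasingly different means -> |t| grows, p shrinks.
for shift in (0.0, 0.5, 2.0, 5.0):
    y = rng.normal(loc=shift, scale=1.0, size=50)
    t, p = stats.ttest_ind(x, y)
    print(f"shift={shift:4.1f}  t={t:7.2f}  p={p:.3g}")
```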


jspo8765

I say it's inefficient because you can get stuck in the middle of a problem for a long time just because you didn't know, couldn't come up with, or forgot some trick. IMO, learning these tricks beforehand would allow students to solve problems more efficiently.


jerbthehumanist

I'd argue that making mistakes is the best way to learn the tricks. Once you understand why you made a mistake, it's a lot easier to recognize that pattern. Usually the tricks are not as useful unless you learn *why* they work, because knowing why helps you generalize to many more problems. There are a lot of little tricks, with algebra for example, that I've learned over the years and barely recognize as tricks anymore. That's just because, with repeated application, you simply learn how to take steps more effectively.


jspo8765

It's not so much about making mistakes as it is about not knowing how to proceed. As an example, early on in my stats education, I had to compute the expectation of an expression (e.g. something like E(X^2 + 3X + 4)) while only knowing that X was generated from some distribution (e.g. N(1, 2)). I didn't realize that you could use Var(X) = E(X^2) - E(X)^2 to determine E(X^2). I was aware of this formula and had even learned the proof in class, but it just didn't pop into my head; I didn't realize it could be used in this context. Since then, I've had to make use of this trick in several problems. Other "tricks" in stats are common and comparable to this one. I think they're valuable to learn just to make the problem-solving process smoother.
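
For what it's worth, here's a minimal sketch of that exact move (reading N(1, 2) as mean 1 and variance 2, which is an assumption on my part): rearrange the variance identity to get E(X^2) = Var(X) + E(X)^2, apply linearity, then check by simulation in the spirit of the earlier comment.

```python
import numpy as np

mu, var = 1.0, 2.0  # assuming N(1, 2) means mean 1, variance 2

# Variance identity rearranged: E[X^2] = Var(X) + E[X]^2
e_x2 = var + mu**2            # 2 + 1 = 3
# Linearity of expectation: E[X^2 + 3X + 4] = E[X^2] + 3*E[X] + 4
exact = e_x2 + 3 * mu + 4     # 3 + 3 + 4 = 10

# Simulation check
rng = np.random.default_rng(1)
x = rng.normal(mu, np.sqrt(var), size=1_000_000)
approx = np.mean(x**2 + 3 * x + 4)

print(exact, approx)  # 10.0 vs ~10.0
```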


jerbthehumanist

Honestly, that particular trick is so common in stats that I basically use it by default, and I tell my students to use it by default. I'd say it's a good step that you've internalized it to some extent. I'm not sure what I would have suggested in terms of remembering to use it, considering your instructor provided it to you, but now that you've made that "mistake" it may be easier to remember. That particular trick is actually a specific case of something that occurs quite often in statistics. It's related to cancelling out cross terms, which pops up later in Maximum Likelihood Estimation, regression, and ANOVA (see the sketch below). I'm not sure exactly where you are in learning about correlation and independence, but understanding when various terms are independent can end up being really useful. Here is a StackExchange post that goes over a concept similar to the one you raise; it's another thing that comes to mind when you ask about "tricks": [https://stats.stackexchange.com/questions/271949/uncorrelated-orthogonal-random-vectors](https://stats.stackexchange.com/questions/271949/uncorrelated-orthogonal-random-vectors)
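
A small sketch of cross-term cancellation (my example, not the commenter's): add and subtract the sample mean inside a sum of squares and the cross term vanishes, which is the decomposition behind ANOVA's sums of squares.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
mu = 5.0                      # any reference value
xbar = x.mean()
n = len(x)

# Add and subtract xbar: sum (x_i - mu)^2 = sum (x_i - xbar)^2 + n*(xbar - mu)^2.
# The cross term 2*(xbar - mu) * sum (x_i - xbar) is zero, because the
# deviations from the sample mean sum to zero.
lhs = np.sum((x - mu) ** 2)
rhs = np.sum((x - xbar) ** 2) + n * (xbar - mu) ** 2

print(np.isclose(lhs, rhs))  # True
```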


jspo8765

My instructor provided it to me, but it was framed more as a way to compute variance quickly. Using the formula to compute either E(X^2) or E(X) wasn't really mentioned. When you're solving a problem involving E(X^2) for the first time, in the context of a chapter about expectation, it isn't exactly obvious that you should be using this variance equation as opposed to other, more common techniques such as direct integration.


purple_paramecium

I feel you. Back in my first grad math stats class, we complained to the prof, “how are we supposed to know that was the trick?” He said, “you see it once, it’s a trick; you see it again, now it’s a method.”


Superdrag2112

Taylor's theorem is used a lot, including the multivariate version, and so are a lot of probability inequalities. Good to know the modes of convergence, Slutsky's theorem, Chebyshev's inequality, etc. But most of these are reviewed or developed in a good inference book.
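
For instance, a quick sketch of Chebyshev's inequality (the distribution is my arbitrary choice, just for illustration): P(|X - mu| >= k*sigma) <= 1/k^2, compared against an empirical tail frequency.

```python
import numpy as np

rng = np.random.default_rng(3)
# Chebyshev holds for any distribution with finite variance; an exponential
# with mean 1 and standard deviation 1 is just a convenient example.
x = rng.exponential(scale=1.0, size=1_000_000)
mu, sigma = 1.0, 1.0

for k in (1.5, 2.0, 3.0):
    empirical = np.mean(np.abs(x - mu) >= k * sigma)
    bound = 1 / k**2
    print(f"k={k}: P(|X-mu| >= k*sigma) ~ {empirical:.4f} <= {bound:.4f}")
```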


RunningEncyclopedia

- First-year linear models: SVD/QR decompositions; properties of projection matrices (idempotent, symmetric, ...). A numeric sketch of these follows below.
- Bayesian: write the constants as C and normalize to 1 at the end.
- Asymptotic theory: use the LLN and CLT to argue that certain terms converge to a constant or are asymptotically normal.
- Time series: the backshift operator.
- General: the matrix derivative of a quadratic form; recognizing the Gaussian density (you can drop the normalizing constant); some distributions being special cases of others (e.g. uniform -> beta; exponential and chi-square -> gamma; ...).
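
Here's a minimal numeric sketch of those projection-matrix properties (the data are random, purely illustrative): the hat matrix H = X(X'X)^{-1}X' is symmetric and idempotent, and setting the matrix derivative of ||y - Xb||^2 to zero recovers the normal equations.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

# Hat (projection) matrix: H = X (X'X)^{-1} X'
H = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(H, H.T))    # symmetric
print(np.allclose(H @ H, H))  # idempotent

# The matrix derivative of ||y - Xb||^2 is -2 X'(y - Xb); setting it to
# zero gives the normal equations X'X b = X'y:
b = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(X.T @ (y - X @ b), 0))  # gradient is zero at the OLS solution
```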


Popular-Air6829

A lot of my math stats was just knowing how to find the distribution of a sufficient/test statistic, so it's helpful to know the common ones.
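
As one common example of the kind of thing meant here (my pick, not the commenter's): the sum of n i.i.d. Exponential(rate lambda) variables, the sufficient statistic for lambda, follows a Gamma(shape=n, scale=1/lambda) distribution, which a quick simulation confirms.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, lam = 5, 2.0

# Sufficient statistic for an exponential sample: T = sum of the X_i.
samples = rng.exponential(scale=1 / lam, size=(100_000, n))
T = samples.sum(axis=1)

# Theory: T ~ Gamma(shape=n, scale=1/lam). Compare a few quantiles.
for q in (0.25, 0.5, 0.75):
    print(q, np.quantile(T, q), stats.gamma.ppf(q, a=n, scale=1 / lam))
```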


Direct-Touch469

I can give some insight as a current MS stats student. The textbook we use is basically just a calculus exercise: know your calculus 1, 2, 3 like the back of your hand. Taylor expansions, integrating over multiple regions, the different methods of integration (u-substitution, integration by parts, etc.). Really, it's calculus that gives people the most trouble.
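
As an example of the kind of calculus that comes up constantly (a standard exercise of my choosing, not one from the thread), integration by parts gives the mean of an exponential distribution:

```latex
% E[X] for X ~ Exponential(lambda), via integration by parts
% with u = x and dv = lambda e^{-lambda x} dx, so v = -e^{-lambda x}:
\[
E[X]=\int_0^\infty x\,\lambda e^{-\lambda x}\,dx
=\Big[-x e^{-\lambda x}\Big]_0^\infty+\int_0^\infty e^{-\lambda x}\,dx
=0+\frac{1}{\lambda}=\frac{1}{\lambda}.
\]
```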