Clearly understanding regression coefficients is perhaps the most useful and important knowledge you can have as a data analyst. When you don't understand your results, it's hard to draw compelling conclusions about your research.
But understanding the results of your analysis is about more than reaching the right conclusions. When you intuitively understand what each coefficient communicates, you can consciously choose the predictors that succinctly answer your research questions.
In 10 years of statistical consulting, misunderstanding linear regression results has been the single biggest source of questions and mistakes I've seen.
Sometimes it only makes the analysis take longer and feel more frustrating.
But not always. Sometimes it really affects your research. I have personally seen all of the following happen:
- Real, interesting, compelling results that were completely missed when the researcher didn't include a categorical, polynomial, or interaction term.
- Articles rejected on statistical grounds, even though the statistics were correct! The authors just didn't explain the results of the model accurately.
- Panic, and unnecessary stress, when results change in the presence of different kinds of predictors.
- Regression models with predictors that don't answer the intended research questions.
- Arbitrary practices like median splits in order to squeeze data into the statistical method the analyst understands.
Often it's one of these outcomes that compels researchers to seek out statistical help.
It all comes down to the same result--you can't do accurate and compelling data analysis if you don't understand what the different types of predictors will tell you.
The good news is that anyone with a basic background in regression or ANOVA can attain this level of intuitive understanding.
Master Linear Regression before trying to learn more sophisticated statistical modeling
The meaning of regression coefficients is largely determined by the form of the predictor variable. How they are interpreted is fundamentally the same in all kinds of regression models--logistic regression, multilevel models, analysis of covariance, survival models, Poisson regression--all of them.
But because these other models include logarithms, odds, extra sources of variance, and other mathematical complications, their coefficients are harder to understand--they're less intuitive.
You don't want to be struggling with understanding the meaning and consequences of concepts like centering, intercepts, polynomials, and interaction terms at the same time you're trying to understand hazard functions, odds ratios, or covariance matrices. You'll drive yourself crazy.
But, if you can master how to interpret the different types of coefficients in the context of a straightforward model like linear regression, it's only one step further to apply them to more complicated models.
In linear regression, understanding how to interpret coefficients is generally straightforward. It didn't make all that much sense in your statistics classes because the focus there was on giving you background knowledge, not working with real data.
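As a small illustration of how the form of a predictor changes what a coefficient means, here is a minimal sketch in Python using NumPy. The data values and the helper name `fit_line` are made up for this example and are not from any real study. It fits the same simple linear regression twice, once with the raw predictor and once with the predictor centered at its mean. The slope should be the same in both fits, while the intercept changes meaning: with the raw predictor it is the predicted outcome at x = 0, and with the centered predictor it is the predicted outcome at the average x.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    # Design matrix: a column of ones for the intercept, then the predictor.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

# Hypothetical data, invented purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 7.9, 11.2, 13.8, 17.0])

# Fit with the raw predictor: the intercept is the predicted y when x = 0.
b0_raw, b1_raw = fit_line(x, y)

# Fit with the mean-centered predictor: the intercept is the predicted y
# at the mean of x, which for this model equals the mean of y.
b0_ctr, b1_ctr = fit_line(x - x.mean(), y)

print(f"raw:      intercept = {b0_raw:.3f}, slope = {b1_raw:.3f}")
print(f"centered: intercept = {b0_ctr:.3f}, slope = {b1_ctr:.3f}")
print(f"mean of y = {y.mean():.3f}")
```

Nothing about the fitted line changes between the two runs. Only the point at which the intercept is evaluated moves, which is why an intercept can look very different after centering even though the model is the same.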
This ebook will give you a solid understanding of many types of regression coefficients in the context of real output.