Online Workshop: Effectively Dealing with Missing Data without Biasing your Results

Learn the new approaches to missing data so you can confidently analyze your data with no loss of power, with accurate p-values, and with unbiased results.

If you have any experience with missing data, you know it really messes analyses up:
But there is something unique about the way it messes analyses up. It’s not a data issue like skewness or outliers that you can just ignore (whether you should or not). Ignoring missing data still means choosing a method of dealing with missing data--you’re just using the default.

The default method in most statistical software is listwise deletion--drop any case with any value missing. Depending on which statistical software you’re using and the patterns and percentage of missing data, the default may be a perfectly acceptable way of dealing with the missing data. Or it may be the worst possible option. And in data analysis, it’s always better if you understand the defaults, know what they’re doing in your data set, and decide for yourself if it’s the best approach.

Up until about 10 years ago, there weren’t many other options. There was listwise deletion and there was imputation (inserting estimates for the missing values). Listwise deletion can decimate your sample and your power, and it can bias your results. Imputation solved the power issue, but most of the imputation methods were pretty sketchy, and they biased both overall results and p-values even worse. It was a “damned if you do, damned if you don’t” kind of situation.

But it’s different now. In August 1999, just a month after I started at the Statistical Consulting office at Cornell, I saw a talk by Joe Schafer at the Joint Statistical Meetings about multiple imputation. I was blown away. It seemed too good to be true--it solved pretty much all of the problems with missing data. So I read all that I could, attended a week-long mini-class, and tried it all out. At that time, you had to use special stand-alone software to implement it, and all the ones I tried were a bit clunky to use.

Luckily, statistical software has caught up. And in that time, a few new studies have shown that some of the restrictive assumptions of multiple imputation aren’t as restrictive as they at first seemed.
So multiple imputation is easier and more accurate than ever. It’s also become clear that some of those old methods aren’t always as horrible as they seemed--there are some situations when listwise deletion works just fine. But it pays to know the difference, and how to implement not just multiple imputation, but maximum likelihood approaches, which also give great outcomes and are quite a bit easier to use.

That is what you'll learn in this workshop--the issues involved in missing data, an in-depth understanding of the approaches and how to implement them, and the steps to diagnose the best approach in your situation.

In this workshop, you will learn:

Module 1: Missing Data--The Problem and Basic Solutions

Part 1. What is Missing Data?

In this first module, you'll get the big picture. The real issues, causes, and the solutions. You'll learn step by step what the different mechanisms are--exactly how random the missingness is, and how that affects your results. You'll get an understanding of where missing data fits into an analysis strategy and its relationship to other types of problem data--censoring, truncation, and other partial information.

And finally, we'll explore two traditional, simple techniques for dealing with missing data--complete case analysis and imputation. They do work in some situations, but they're disasters in others. You will learn how to tell the difference, and how to use them well.

Module 2: Multiple Imputation

Part 1: What is Multiple Imputation: The Concept

Multiple Imputation is a godsend in some really hairy missing data situations. Even with up to 50% of data missing, it can give you unbiased parameter estimates, standard errors, and full power. But it has to be done well, and that's not always easy. It requires a solid imputation algorithm and model. This module will teach you, in detail, how to build an imputation model, how it differs from your analysis model, and what to do with the resulting imputed data.
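To give a feel for that last step, here is a minimal sketch of one standard way to combine results from several imputed data sets, known as Rubin's rules: average the estimates, then add the average within-imputation variance to an inflated between-imputation variance. It is not taken from the workshop materials. The function name `pool_estimates` and all the numbers are made up for illustration.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: the parameter estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)          # pooled point estimate
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance between imputations
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5       # pooled estimate and its standard error


# Hypothetical regression coefficient and squared standard errors
# from m = 5 imputed data sets (illustrative values only).
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled, se = pool_estimates(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing values, which is why a single imputed data set tends to give standard errors that are too small.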
Module 3: Multiple Imputation in Practice--Special Cases

Part 1: Multiple Imputation for Categorical Variables

Multiple Imputation is very simple if only one predictor variable has missing data, if it is highly correlated with other variables, and if it is continuous and normally distributed. But real data is never so clean. Luckily, multiple imputation can handle a lot of mess. So in this module, we'll explore how to do multiple imputation in many messy situations. So you will know how to make solid analysis decisions even with messy data.

Module 4: Maximum Likelihood and Non-Ignorable Missing Data

Part 1: Maximum Likelihood Approaches

Multiple Imputation isn't the only game in town. There are a number of Maximum Likelihood techniques for running models that have all the advantages of Multiple Imputation without the hassle of imputing anything. You may already be using some of them. And if you're running linear models, you can take advantage of these techniques right as you run your models. No extra steps required. It's actually quite easy to do. But it only works for linear models.

So in part 1 you'll learn what maximum likelihood estimation is, the types of analyses for which it works, and the exact steps to implement it. Then in part 2, we'll briefly discuss the approaches available for non-ignorable missing data. This is where you really have to make some crazy assumptions because the approaches require you to know something about the missing values.

Module 5: Missing Data Diagnosis

Part 1: Decision Factors in Choosing an Approach

Part of the reason it is so hard to learn how to deal with missing data is that the right approach depends on how much data are missing, the patterns of missing data, why the data are missing, and how you will use the data in analysis. These all vary in different types of research. Learning how to analyze the patterns and reason your way to an approach may be the most important part of the workshop. This is actually the first step in dealing with missing data, but we save it for last so you have a clear picture of what your options are once you do the diagnosis.

So in this module, you'll learn, in detail, how to analyze the data and the patterns of missingness to figure out the most likely mechanism, the effects of the missing data, and the best way to proceed in dealing with it.

This workshop is for you if you:
Prerequisites:
The Instructor:

Karen Grace-Martin is a statistical trainer and consultant and an expert on missing data, SPSS, and SAS. She has guided and trained researchers through their statistical analysis for over 15 years. Her focus is on helping statistics practitioners gain an intuitive understanding of how statistics is applied to real data in research studies.

Comments from past workshop participants:
We've designed the format to support your learning and convenience:

Web-based online workshop

The workshop consists of five modules. Each week, we'll release one module of training materials, then follow up with a live Question & Answer session at the end of the week. You can master that material and have all your questions answered before moving on.

For each module, you'll get:
Private workshop website

We've also created a workshop site that you become a member of. For a year. The site is our home-base for the workshop. It's where you'll find everything you need to support your learning of each module:
But the biggest benefit of the membership site is you have help for a year.

Travelling to a live workshop is expensive. Not just the workshop itself, but the flight, the hotels, the meals. Plus you have to take a few days off work.

But I always found the worst part about travelling to a live workshop is that you learn so much all at once, then go home to your busy life. You have to catch up on everything you just missed. You don't have a chance to implement what you've learned right away. Then you lose it. And you're on your own. When you do go to implement it, new questions come up. But the instructor is no longer available. And it doesn't feel right to send an email.

But on our workshop website, you get support for a year. Watch the videos when it's convenient. Watch them again when a new issue comes up. Ask questions as they come up for you. And as we add resources, and answer more questions, you have access to the updated material. So you really learn it. Inside out.

Cool, huh? It's all the advantages of a live workshop without the disadvantages.

The Details:

Registration for this workshop is closed. If you'd like a notice when we offer this workshop again, join our Advance Discount List by filling in your name and email address below.