PQHS 453 – Categorical Data Analysis

Table of Contents

I. Welcome Video

II. Course Overview at a Glance

III. Course Objectives

IV. Course Description

V. Tentative Schedule

VI. Sample Lecture

I. Welcome Video


Videos removed on print.

II. Course Overview at a Glance

Time & Place: Online, full Summer semester, 6/3/2024 - 7/29/2024
Instructor: Abdus Sattar, PhD
Office: Wood Building, W-G51
Teaching Assistant: Xueyi Zhang (Email: xxz586@case.edu)
Office Hours: By appointment. Phone: 1-216-368-1501, Email: sattar@case.edu
Course Web Page: canvas.case.edu
Textbook (Required): Analysis of Categorical Data with R by C. Bilder and T. Loughin
Prerequisites:

  • This course is designed for advanced undergraduate students and for graduate students in Biostatistics or other quantitative sciences who have adequate preparation in statistical methods (at least one statistics course, equivalent to PQHS 431). Undergraduate students require the instructor’s permission to enroll.
  • Knowledge of statistical computing or a statistical software package is required. We aim to use the statistical software R. Some programming experience will be helpful.


III. Course Objectives

  1. Gain proficiency in logistic regression specifically and in generalized linear models more broadly
  2. Acquire competency in standard and cutting-edge categorical data analysis methods
  3. Hone skills by applying categorical data analysis methods to real data


IV. Course Description

Categorical data are encountered in many disciplines, including the clinical and biological sciences. Analysis methods for categorical data differ from those for continuous data, and there is a rich collection of methods for categorical data analysis. The elegant “odds ratio” interpretation is distinctive to categorical data. This online course covers theories and methods for cross-sectional categorical data analysis. Students will learn standard categorical data analysis methods and their applications to biomedical and clinical studies. The course focuses mostly on statistical methods for categorical data arising from various fields of study, including clinical studies; those who take it will come from a wide variety of disciplines. The course will include video lectures, group discussion and brainstorming, homework, simulations, and collaborative projects on real and realistic problems in human health tied directly to the student’s own professional interests. Emphasis will be given to logistic regression methods. Topics include (but are not limited to) binary response, multicategory response, count response, model selection and evaluation, exact inference, Bayesian methods for categorical data, and supervised statistical learning methods. The course stresses how core statistical principles, computing tools, and visualization strategies are used to address complex scientific aims powerfully and efficiently, and to communicate the findings effectively to researchers who may have little or no experience with these methods.

Example: Analyzing Cancer Survivors’ Emotional Concerns



V. Tentative Schedule

| Week | Chapters & Sections | Topics | Video/Slides |
|---|---|---|---|
| 1 | | Introduction to categorical data analysis (CDA) | Introduction |
| | Chapter 1 | Introduction to a Binary Response Analysis | |
| | 1.1.2 | One binary variable | Point Estimation |
| | 1.1.2, 1.1.3 | One binary variable | Confidence Interval |
| | 1.1.2 | One binary variable | Hypothesis Test |
| | 1.2.1 | Two binary variables | Two Binary Variables |
| | 1.2.2, 1.2.3 | Two binary variables | Difference in Proportions |
| | 1.2.4 | Two binary variables | Relative Risk |
| | 1.2.5 | Two binary variables | Odds Ratio |
| 2 | Chapter 2 | Regression Models for a Binary Response | |
| | 2.1 | | Introduction |
| | 2.2 | Logistic regression models | Logistic Regression Model |
| | 2.2.1 | Logistic regression models | Parameter Estimation |
| | 2.2.2 | Logistic regression models | Hypothesis Testing |
| | 2.2.2 | Logistic regression models | Deviance |
| | 2.2.3 | Logistic regression models | Odds Ratio |
| | 2.2.4 | Logistic regression models | Probabilities |
| | 2.2.5 | Logistic regression models | Interactions |
| | 2.2.5 | Logistic regression models | Quadratic Terms |
| | 2.3 | Generalized linear models | GLM |
| 3 | Chapter 3 | Analyzing a Multicategory Response | |
| | 3.1 | | Multinomial Probability Distribution |
| | 3.2.1, 3.2.2 | I × J contingency tables and inference procedures | I × J Contingency Table |
| | 3.2.3 | I × J contingency tables and inference procedures | I × J Independence |
| | 3.3 | Nominal response regression models | Nominal Response Models |
| | 3.4.1, 3.4.2 | Ordinal response regression models | Ordinal Response Models |
| | 3.4.3 | Non-proportional odds model | Non-Proportional Odds Models |
| 4 | Chapter 4 | Analyzing a Count Response | |
| | 4.1 | Poisson distribution, Poisson likelihood and inference | Poisson Model for Count Data |
| | 4.2.1, 4.2.2 | Parameter estimation and inference | Poisson Regression Models I |
| | 4.2.3, 4.2.4 | Model interpretation, categorical explanatory variables | Poisson Regression Models II |
| | 4.3, 4.4 | Other topics on analyzing a count response | Poisson Rate Regression & Zero Inflation |
| 5 | Chapter 5 | Model Selection and Evaluation | |
| | 5.1 | | Variable Selection |
| | 5.2.1 | Tools to assess model fit | Residuals |
| | 5.2.2, 5.2.3 | Tools to assess model fit | Goodness of Fit & Influence |
| | 5.3 | | Over-Dispersion Detection & Solutions |
| | | Potential Project Topics | |
| 6-8 | 6.1 | Binary responses and testing error | Completed Project Submission Deadline: 7/29/2024 |
| | 6.2 | Exact inference | |
| | 6.3 | Complex survey data analysis | |
| | 6.4 | “Choose all that apply” data | |
| | 6.5 | Categorical longitudinal data | |
| | 6.6 | Bayesian methods for categorical data | |


VI. Sample Lecture


Videos removed on print.

Materials


PDF preview removed on print.