Introduction to Statistical Modeling for Online Behavior Data
Online platforms generate vast amounts of behavioral data, including subjective ratings (e.g., likes, stars, reactions) and objective engagement metrics (e.g., views, clicks, watch time). While machine learning excels at predicting user behavior, statistical modeling—a cornerstone of empirical research—is critical for interpretable analysis. It helps uncover relationships between user behavior and various factors (e.g., how demographics influence engagement patterns) and enables causal inference in experiments (e.g., measuring the impact of different feed-ranking algorithms on user experience). These methods are widely used in UX research, product analytics, and A/B testing.
This course provides a hands-on introduction to statistical modeling in R, focusing on the methods most useful for analyzing behavioral data. We'll start with key concepts and then progress to choosing a model based on outcome type. Topics include logistic regression for binary outcomes (e.g., like/dislike reactions), ordinal regression for rating scales, beta regression for bounded continuous feedback (e.g., sliders), and hierarchical models for nested data (e.g., multiple ratings from the same users). By the end, you'll know not only which model to use but also how to visualize your results and communicate your findings clearly.
No prior statistical modeling experience is required. If you're familiar with Python or similar languages, the transition to R should be easy. All scripts and exercises are provided.