September 22, 2023

In continuation with my project analysis, taking Professor’s explanation of the interaction term and the quadratic model that includes second-order degree terms of the predictor variables in the regression equation, as baseline, I applied these concepts to the cdc diabetes combined dataset. On adding the interaction term (% OBESE * % INACTIVE) to the Regression equation, it is seen that the accuracy of the model has improved slightly to 0.365. Now the regression equation for the Multi Linear model involving interaction term for this dataset is expressed as:

% DIABETIC = -10.0647 + 1.1534*% INACTIVE + 0.7430*% OBESE – 0.0496*% INACTIVE*% OBESE

Furthermore, I have built a quadratic model for this dataset and it is seen that the accuracy has improved a little to around 0.385. The regression equation for this model is expressed as:

% DIABETIC = -11.5906 + 0.4779*% INACTIVE + 0.4779 *% OBESE + 0.0197 *% – 0.0494*% OBESE^2 – 0.0197*% INACTIVE^2

In today’s class, Professor answered to questions on the dataset which included details on how collinearity analysis between the predictor variables depends on the scenario of the linear model – whether we are using the model for prediction or for explaining any positive correlation between the independent variables.

Additionally, Professor also explained the concepts of Paired and Unpaired T-Tests and how they are useful in interpreting the statistical significance of difference in means between 2 groups, while ANOVA (Analysis of Variance) is used for comparing the difference in means among multiple groups.

Paired T-Test:

A paired t-test, also known as a dependent t-test or matched-pairs t-test, is used to compare the means of two related groups or conditions. It’s called ‘paired’ because it involves paired data points. These paired data points represent measurements or observations taken on the same subjects or items under two different conditions or time points.

Unpaired T-Test:

An unpaired t-test, also known as an independent t-test, is used to compare the means of two independent groups or conditions. Unlike paired data, where each data point is related to another, unpaired data involves two distinct and unrelated groups or conditions.

Leave a Reply

Your email address will not be published. Required fields are marked *