Testing to Find Relationships Between Two Variables

 

Three families of statistical techniques are commonly used to examine the relationship or association between variables. The chi-square family tests for an association between categorical variables, whereas the correlation family measures the strength of the linear relationship between interval/ratio variables, although some tests in this family can also handle other measurement-level combinations. The regression family goes one step further by assessing the relative strength of one or more variables in predicting the change in another variable.

Before committing to any particular test, you must clarify the nature of the relationship you are interested in, determine how many variables are involved, and determine the measurement levels of each variable.


You are encouraged to review the chi-square, correlation, and regression materials from previous weeks. Then, review How to Choose a Statistical Test and the test-selection tutorials linked in the Resources to determine which test is most likely to be appropriate for your data type.


Using the Framingham study data set, perform and interpret statistical tests that answer the following research questions. Then, provide a written analysis of your results.

At baseline, was there a significant association between diabetes (variable: diabetes1) and smoking status (variable: cursmoke1)?
At baseline, how much variation in participant cholesterol levels (variable: totchol1) could be explained by the variation in an individual's BMI (variable: bmi1)?
Written Analysis Format and Length
Format your analysis using APA style.


Perform the appropriate statistical tests (selected on the basis of assumption testing).
Provide your rationale for test selection.
Interpret the results of your statistical tests (chi-square, correlation, and regression) for each research question.
Consider associated caveats and limitations.
Determine the practical, public health-related implications of your statistical tests (chi-square, correlation, and regression).
What evidence do you have that validates your conclusions?
Write clearly and concisely, using correct grammar, mechanics, and APA formatting.
Write for an academic audience, using appropriate statistical terminology, style, and form.
Express your main points and conclusions coherently.
Proofread your writing to minimize errors that could distract readers and make it more difficult for them to focus on the substance of your statistical analysis.

 

 

Sample Answer

This analysis uses the provided structure to answer two research questions based on the Framingham Heart Study dataset, focusing on the appropriate statistical tests, interpretation, and public health implications.

 

Statistical Analysis of Framingham Heart Study Baseline Data

 

 

Analysis of Research Question 1: Diabetes and Smoking Status

 

 

Research Question

 

At baseline, was there a significant association between diabetes (variable: diabetes1) and smoking status (variable: cursmoke1)?

 

Rationale for Test Selection

 

The variables diabetes1 (0=No, 1=Yes) and cursmoke1 (0=No, 1=Yes) are both categorical (specifically, nominal and dichotomous), so their cross-tabulation is a 2 × 2 table. To test whether two categorical variables are associated, the Chi-Square Test of Independence ($\chi^2$) is the most appropriate statistical technique. This test determines whether the observed frequencies in the cross-tabulation table differ significantly from the frequencies that would be expected if the two variables were independent. The test itself indicates only whether an association exists; its strength is described by a separate measure of association such as Cramér's V.
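The comparison of observed and expected frequencies described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the output of any particular statistics package. The function names and the cell counts are assumptions made up for the example; the counts are not Framingham results.

```python
def expected_counts(observed):
    """Expected cell counts under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Pearson chi-square statistic, without continuity correction."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts for illustration only (NOT Framingham results).
# Rows: diabetes1 = 0, 1; columns: cursmoke1 = 0, 1.
example_table = [[40, 60], [30, 20]]
example_expected = expected_counts(example_table)
example_chi2 = chi_square_statistic(example_table)
print(example_expected)
print(round(example_chi2, 3))
```

Inspecting the expected counts before interpreting the statistic is also a convenient way to check the usual rule of thumb that every expected cell count should be at least 5.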

Statistical Test and Interpretation (Chi-Square)

 

A Chi-Square Test of Independence was performed to assess the association between baseline diabetes status and current smoking status. The null hypothesis ($H_0$) is that there is no association between the two variables.

Test Statistic: $\chi^2$ = [Insert Calculated Chi-Square Value]

Degrees of Freedom (df): [Insert Calculated df]

p-value: [Insert Calculated p-value]

Contingency Coefficient or Cramér’s V: [Insert Calculated Measure of Association]
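As a rough sketch of how two of the quantities listed above are derived, the following Python computes the degrees of freedom and Cramér's V from a chi-square value, the sample size, and the table dimensions. The function names are illustrative, and the inputs (a chi-square of 5.357 with 150 observations) are hypothetical values chosen for the example, not results from the Framingham data.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df for a test of independence: (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    k = min(n_rows, n_cols) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical inputs for illustration only (NOT Framingham results).
example_df = degrees_of_freedom(2, 2)
example_v = cramers_v(chi2=5.357, n=150, n_rows=2, n_cols=2)
print(example_df, round(example_v, 3))
```

For a 2 × 2 table such as diabetes1 by cursmoke1, Cramér's V reduces to the absolute value of the phi coefficient, so either label may appear in software output.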

Interpretation:

If the analysis yields a statistically significant result ($p < 0.05$), the null hypothesis is rejected, indicating a statistically significant association between baseline diabetes status and current smoking status. If $p \geq 0.05$, the null hypothesis is not rejected, and the data do not provide sufficient evidence of an association.
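The decision rule can be illustrated with a short Python sketch. It relies on the fact that, with 1 degree of freedom (the case for a 2 × 2 table), the upper-tail chi-square probability can be written with the complementary error function. The function names and the example chi-square value are assumptions for illustration only; statistical software would normally report the p-value directly.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with df = 1.
    Valid only for 1 degree of freedom (e.g., a 2 x 2 table)."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def reject_null(p_value, alpha=0.05):
    """True when the result is significant at the chosen alpha level."""
    return p_value < alpha


# Hypothetical chi-square value for illustration only (NOT a Framingham result).
example_p = chi2_p_value_df1(5.357)
example_decision = reject_null(example_p)
print(example_p, example_decision)
```

A chi-square value of about 3.84 corresponds to $p = 0.05$ at 1 degree of freedom, which is why that figure is often quoted as the critical value for a 2 × 2 table.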