- QUESTION
Discussion: Multiple Regression
As with the previous week’s Discussion, this Discussion assists in solidifying your understanding of statistical testing by engaging in some data analysis. This week you will once again work with a real, secondary dataset to construct a research question, estimate a multiple regression model, and interpret the results.
Whether in a scholarly or practitioner setting, good research and data analysis should have the benefit of peer feedback. For this Discussion, you will post your response to the hypothesis test, along with the results. Be sure and remember that the goal is to obtain constructive feedback to improve the research and its interpretation, so please view this as an opportunity to learn from one another.
To prepare for this Discussion:
• Review this week’s Learning Resources and media program related to multiple regression.
• Create a research question using the General Social Survey that can be answered by multiple regression.
By Day 3
Use SPSS to answer the research question. Post your response to the following: Please follow this outline.
1. What is your research question?
2. What is the null hypothesis for your question?
3. What research design would align with this question?
4. What dependent variable was used and how is it measured?
5. What independent variable is used and how is it measured?
6. What other variables were added to the multiple regression models as controls?
7. What is the justification for adding the variables?
8. If you found significance, what is the strength of the effect?
9. Explain your results for a lay audience and explain what the answer to your research question is.
Be sure to support your Main Post and Response Post with reference to the week’s Learning Resources and other scholarly evidence in APA Style.
Required Readings
Frankfort-Nachmias, C., & Leon-Guerrero, A. (2018). Social statistics for a diverse society (8th ed.). Thousand Oaks, CA: Sage Publications.
• Chapter 12, “Regression and Correlation” (pp. 325-371) (previously read in Week 8)
Wagner, W. E. (2016). Using IBM® SPSS® statistics for research methods and social science statistics (6th ed.). Thousand Oaks, CA: Sage Publications.
• Chapter 11, “Editing Output” (previously read in Week 2, 3, 4, 5, 6, 7, and 8)
Walden University Library. (n.d.). Course Guide and Assignment Help for RSCH 8210. Retrieved from http://academicguides.waldenu.edu/rsch8210
For help with this week’s research, see this Course Guide and related weekly assignment resources.
Answer
Multiple Regression Model Formulation
Multiple linear regression extends simple (bivariate) linear regression by examining the influence of more than one independent variable on a single dependent variable. In other words, two or more independent (explanatory) variables are used to predict one dependent (response) variable (Frankfort-Nachmias & Leon-Guerrero, 2017). The general purpose of multiple regression, which also motivated its formulation in the early 20th century, is a fuller examination of the relationship between several independent variables and a dependent variable; the model is a linear expression of this relationship (Wagner, 2016). Multiple regression modeling rests on certain assumptions: linearity, homoscedasticity, absence of multicollinearity, and normality. This paper reports on the formation of a multiple regression model based on the General Social Survey dataset.
Model Formation
The research question for the study is: do an individual’s weekly hours of work and degree of education have a significant effect on their income? The research question seeks to establish a relationship between income and both the level of an individual’s education and the number of hours worked per week. The null hypothesis for the research question is:
- The degree of education and an individual’s hours of work have no significant effect on income.
Being an experimental study with greater internal validity, the research design that would align with the question is a randomized design. The experiment would be repeated under uniform conditions, with participants assigned to different treatment and control groups. The approach would establish evidence of causation among income, weekly work hours, and level of education by showing the associations between them. It can sort out the existence and magnitude of the causal effects of one or more independent variables on a dependent variable of interest at a given time. Randomization would assign participants to groups in a way that eliminates selection bias and confounding bias, ensuring that the groups are comparable on every factor other than those under investigation.
The dependent variable was the individual’s income (measured on the respondent’s income scale). It was the researcher’s major point of interest, and the analysis looks for possible changes in this dependent variable. The research examines change in income as influenced by working duration (recorded as working hours per week) and level of education (measured by the highest degree attained by the respondent). The relationship was assessed among four metric variables: one continuous dependent variable and three independent variables (the two predictors plus the nature-of-employment control added later). The assumption behind the multiple regression analysis was that the independent variables have a direct effect on income as the dependent variable.
Results
Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change |
|---|---|---|---|---|---|---|---|---|---|
| 1 | .759 (a) | .576 | .547 | 2.318 | .576 | 20.339 | 2 | 30 | .000 |

The columns from R Square Change to Sig. F Change are the SPSS Change Statistics.
a. Predictors: (Constant), NUMBER OF HOURS USUALLY WORK A WEEK, RS HIGHEST DEGREE
b. Dependent Variable: RESPONDENTS INCOME

Table 1: Model summary table.
The formulated model explains 57.6% of the variance in the response variable (R Square = .576). The ANOVA (F) test confirms the significance of the model, F(2, 30) = 20.339, with p-value < 0.05.
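The figures in Table 1 can be cross-checked by hand. The sketch below is not part of the SPSS output: it assumes the standard formulas for the overall F statistic and adjusted R-squared, and it infers a sample size of n = 33 from df1 + df2 + 1. Small differences from the table are expected because R Square is reported to only three decimals.

```python
# Cross-check of Table 1 using the rounded values reported by SPSS.
r_square = 0.576   # R Square from Table 1
df1 = 2            # number of predictors
df2 = 30           # residual degrees of freedom
n = df1 + df2 + 1  # assumed sample size implied by the degrees of freedom

# F statistic: explained variance per predictor divided by
# unexplained variance per residual degree of freedom.
f_stat = (r_square / df1) / ((1 - r_square) / df2)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n - 1) / df2

print(f"n = {n}")
print(f"F = {f_stat:.2f}")
print(f"Adjusted R Square = {adj_r_square:.3f}")
```

Both recomputed values should land close to the reported F Change (20.339) and Adjusted R Square (.547).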
Coefficients

| Model 1 | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 1.492 | 1.386 | | 1.077 | .290 |
| RS HIGHEST DEGREE | .687 | .291 | .281 | 2.357 | .025 |
| NUMBER OF HOURS USUALLY WORK A WEEK | .179 | .031 | .688 | 5.769 | .000 |

Table 2: Regression table of the formulated model.
The t-tests indicate that both IVs are significant explanatory variables in the model. Because the p-values are below 0.05, we reject the null hypothesis of no relationship; further, the standardized coefficients (Beta) indicate the strength of the linear relationship between each independent variable and the dependent variable. Therefore, there is a significant linear relationship between income and the two independent variables (level of education and the number of hours worked per week). Each of these two variables, therefore, affects the level of income independently of the other.
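Each t value in Table 2 is the unstandardized coefficient divided by its standard error. The following sketch recomputes the ratios from the rounded table entries as a sanity check; it is an illustration only, and the results differ slightly from the SPSS values because the inputs are rounded to three decimals.

```python
# (B, Std. Error) pairs copied from Table 2.
coefficients = {
    "(Constant)": (1.492, 1.386),
    "RS HIGHEST DEGREE": (0.687, 0.291),
    "NUMBER OF HOURS USUALLY WORK A WEEK": (0.179, 0.031),
}

# t statistic = coefficient / standard error
t_values = {name: b / se for name, (b, se) in coefficients.items()}

for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```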
The model, therefore, is
Y = 1.492 + 0.687X1 + 0.179X2 + e
where Y denotes the income variable,
X1 denotes the education level variable,
X2 denotes the number of hours worked in a week, and
e denotes the error term (the standard error of the estimate is 2.318).
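To illustrate how the fitted equation is used, the sketch below plugs values into the coefficients reported in Table 2. The function name and the example respondent (degree code 3, 40 hours per week) are hypothetical, and the result is a score on the scale of the income variable rather than a dollar amount.

```python
def predict_income(degree: float, hours_per_week: float) -> float:
    """Predicted income score from the Table 2 coefficients (no error term)."""
    return 1.492 + 0.687 * degree + 0.179 * hours_per_week


# Hypothetical respondent: degree code 3, working 40 hours per week.
example = predict_income(3, 40)
print(f"Predicted income score: {example:.3f}")

# One additional weekly hour, holding degree fixed, shifts the prediction
# by the hours coefficient.
print(f"Effect of one more hour: {predict_income(3, 41) - example:.3f}")
```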
Further, a control variable is introduced to determine the extent of the IVs’ effects on the dependent variable. The nature-of-employment variable (whether the respondent is self-employed or works for somebody else) is used as the control. It was chosen on the hypothesis that this variable is likely to affect income consistently. The results are as follows:
Coefficients

| Model 1 | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | -2.554 | 2.879 | | -.887 | .382 |
| RS HIGHEST DEGREE | .666 | .285 | .272 | 2.341 | .026 |
| NUMBER OF HOURS USUALLY WORK A WEEK | .189 | .031 | .727 | 6.117 | .000 |
| R SELF-EMP OR WORKS FOR SOMEBODY | 1.964 | 1.234 | .189 | 1.592 | .122 |

Table 3: Regression table with control variable.
While the control is not a statistically significant predictor (p = .122 > 0.05), its inclusion raises the R-squared value: with nature of employment in the model, 61% of the variation in the response variable is explained, compared with 57.6% without it. The control variable also alters the estimated effects of the two IVs, as shown in Table 3 above. However, in order of importance it ranks last among the predictors.
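Whether the rise in R-squared is itself meaningful can be judged with an F test on the R-squared change. The sketch below is an assumption-laden check, not SPSS output: it uses the rounded R-squared values quoted in the text, takes n = 33 from the degrees of freedom in Table 1, and assumes the same cases were used in both models, which the output does not state.

```python
# R-squared values reported in the text (rounded).
r2_reduced = 0.576   # degree + hours
r2_full = 0.61       # degree + hours + nature of employment
n = 33               # assumed: df1 + df2 + 1 from Table 1
k_full = 3           # predictors in the full model
added = 1            # predictors added in the control step

df_residual = n - k_full - 1
f_change = ((r2_full - r2_reduced) / added) / ((1 - r2_full) / df_residual)

# With a single added predictor, F change equals the square of its t statistic.
t_control = 1.592    # t for the employment variable in Table 3
print(f"F change = {f_change:.2f}, t squared = {t_control ** 2:.2f}")
```

The two values agree closely, which is consistent with the control’s non-significant p-value of .122.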
Therefore, number of hours of work and level of education significantly influence an individual’s income. From the model, it is clear that the higher the education level attained, the higher one’s income is likely to be, and that more working hours are associated with higher income. That is the linear relationship between each of the two variables and income. Combined, the two variables explain 57.6% of the variation in income. However, when the nature of an individual’s employment (whether self-employed or employed by someone else) is included, 61% of the variation is explained. The model above can be used to predict an individual’s income, bearing in mind the percentage of variation it explains.
References
Frankfort-Nachmias, C., & Leon-Guerrero, A. (2017). Social statistics for a diverse society. Sage Publications.
Wagner, W. E. (2016). Using IBM® SPSS® statistics for research methods and social science statistics (6th ed.). Thousand Oaks, CA: Sage Publications.