QUESTION
Discussion: Estimating Models Using Dummy Variables
You have had plenty of opportunity to interpret coefficients for metric variables in regression models. Using and interpreting categorical variables takes just a little bit of extra practice. In this Discussion, you will have the opportunity to practice how to recode categorical variables so they can be used in a regression model and how to properly interpret the coefficients. Additionally, you will gain some practice in running diagnostics and identifying any potential problems with the model.
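For readers who want to see what dummy recoding looks like outside of SPSS, the sketch below shows the same idea in Python with pandas. It is only an illustration: the data frame and the column names ("sex", "income") are hypothetical stand-ins, not actual GSS variable names.

```python
# Hypothetical sketch: dummy-coding a categorical predictor before regression.
# Column names ("sex", "income") are placeholders, not the actual GSS names.
import pandas as pd

df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "income": [12, 10, 11, 13],
})

# Create a 0/1 indicator; "male" becomes the reference category (coded 0).
df["sex_d1"] = (df["sex"] == "female").astype(int)

# Equivalent one-liner that drops the first (reference) level automatically.
dummies = pd.get_dummies(df["sex"], prefix="sex", drop_first=True)

print(df[["sex", "sex_d1"]])
print(dummies)
```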
To prepare for this Discussion:
• Review Warner’s Chapter 12 and Chapter 2 of the Wagner course text and the media program found in this week’s Learning Resources and consider the use of dummy variables.
• Create a research question using the General Social Survey dataset that can be answered by multiple regression. Using the SPSS software, choose a categorical variable to dummy code as one of your predictor variables.
By Day 3
Estimate a multiple regression model that answers your research question. Post your response to the following:
1. What is your research question?
2. Interpret the coefficients for the model, specifically commenting on the dummy variable.
3. Run diagnostics for the regression model. Does the model meet all of the assumptions? Be sure to comment on which assumptions were not met and the possible implications. Is there any possible remedy for one of the assumption violations?
Learning Resources
Required Readings
Wagner, W. E. (2016). Using IBM® SPSS® statistics for research methods and social science statistics (6th ed.). Thousand Oaks, CA: Sage Publications.
• Chapter 2, “Transforming Variables” (pp. 14–32)
• Chapter 11, “Editing Output” (previously read in Weeks 2 through 9)

Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Pine Forge Press/Sage Publications.
Multiple Regression: A Primer, by Allison, P. D. Copyright 1998 by Sage College. Reprinted by permission of Sage College via the Copyright Clearance Center.
• Chapter 6, “What Are the Assumptions of Multiple Regression?” (pp. 119–136)
• Chapter 7, “What Can Be Done About Multicollinearity?” (pp. 137–152)

Warner, R. M. (2012). Applied statistics from bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.
Applied Statistics From Bivariate Through Multivariate Techniques, 2nd Edition by Warner, R. M. Copyright 2012 by Sage College. Reprinted by permission of Sage College via the Copyright Clearance Center.
• Chapter 12, “Dummy Predictor Variables in Multiple Regression”
Fox, J. (Ed.). (1991). Regression diagnostics. Thousand Oaks, CA: SAGE Publications.
• Chapter 3, “Outlying and Influential Data” (pp. 22–41)
• Chapter 4, “Non-Normally Distributed Errors” (pp. 41–49)
• Chapter 5, “Nonconstant Error Variance” (pp. 49–54)
• Chapter 6, “Nonlinearity” (pp. 54–62)
• Chapter 7, “Discrete Data” (pp. 62–67)
Note: You will access these chapters through the Walden Library databases.
| Subject | Pages | Style |
|---|---|---|
| Statistics | 6 | APA |
Coefficients
| Model | | Sig. |
|---|---|---|
| 1 | (Constant) | .000 |
| | RS HIGHEST DEGREE | .000 |
| | WWW HOURS PER WEEK | .749 |
| | Sex_d1 | .015 |

Table 4: Significance (p values) of the variables in the model.
The predictors are all significant (p < .05) in the model explaining the income variable, except WWW hours per week, which posted a p-value of .749 (> .05).
Therefore, the dummy variable for gender (Sex_d1, p = .015) is a significant predictor of income.
Diagnostic checks indicated that the model meets the assumptions of regression (linearity, homoscedasticity, absence of multicollinearity, and normality of residuals) as outlined by Wagner (2016).
Overall, the model is significant in explaining the response variable, and more so when the dummy variable is included in the estimation.
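As a hedged illustration of how such diagnostics could be reproduced outside SPSS, the sketch below assumes the relevant GSS variables have been exported to a CSV with hypothetical column names (income, degree, wwwhr, sex_d1); it is not the procedure used to produce the tables in this post.

```python
# Sketch of regression diagnostics, assuming a hypothetical CSV export of the
# GSS variables with columns "income", "degree", "wwwhr", and "sex_d1".
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.diagnostic import het_breuschpagan

df = pd.read_csv("gss_subset.csv").dropna(subset=["income", "degree", "wwwhr", "sex_d1"])
X = sm.add_constant(df[["degree", "wwwhr", "sex_d1"]])
model = sm.OLS(df["income"], X).fit()

# Normality of residuals: the Jarque-Bera test is reported in the summary.
print(model.summary())

# Multicollinearity: variance inflation factors (VIF > 10 is a common flag).
for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))

# Homoscedasticity: Breusch-Pagan test on the residuals.
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(model.resid, X)
print("Breusch-Pagan p-value:", lm_p)
```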
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change |
|---|---|---|---|---|---|---|
| 1 | .216 | .047 | .045 | 2.798 | .047 | 22.362 |

Table 1: Regression summary of the model without controlling for gender.
Without gender in the equation, the model explains 4.7% of the variance in the dependent variable (R Square = .047; R = .216).
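A minimal sketch of this baseline model (education and web hours only), under the same hypothetical CSV and column names as the diagnostics sketch above; note that R Square, not R, is the proportion of variance explained.

```python
# Baseline model without the gender dummy (hypothetical column names).
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("gss_subset.csv").dropna(subset=["income", "degree", "wwwhr"])
X0 = sm.add_constant(df[["degree", "wwwhr"]])
baseline = sm.OLS(df["income"], X0).fit()
print("R =", round(baseline.rsquared ** 0.5, 3), "R Square =", round(baseline.rsquared, 3))
```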
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change |
|---|---|---|---|---|---|---|
| 1 | .230 | .053 | .050 | 2.790 | .053 | 16.955 |

Table 2: Regression summary of the gender-controlled model.
The results indicate that including the gender dummy variable improves the model: the predictors now explain 5.3% of the variance in the dependent variable “income” (R Square = .053, up from .047).
The F-change statistics (p < .05) indicate that both models, with and without the dummy variable, are significant in explaining the income variable.
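The R Square change and F change reported above correspond to a nested-model comparison; a sketch of that comparison, under the same hypothetical data set as the earlier snippets, is shown below.

```python
# F-change test: does adding the gender dummy significantly improve the model?
# (Hypothetical column names; a sketch, not the SPSS procedure used above.)
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("gss_subset.csv").dropna(subset=["income", "degree", "wwwhr", "sex_d1"])
y = df["income"]

reduced = sm.OLS(y, sm.add_constant(df[["degree", "wwwhr"]])).fit()
full = sm.OLS(y, sm.add_constant(df[["degree", "wwwhr", "sex_d1"]])).fit()

print(anova_lm(reduced, full))                       # F test for the added predictor
print("R Square change:", full.rsquared - reduced.rsquared)
```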
Coefficients
| Model | | B | Std. Error | Beta | t |
|---|---|---|---|---|---|
| 1 | (Constant) | 9.689 | .210 | | 46.053 |
| | RS HIGHEST DEGREE | .517 | .077 | .216 | 6.699 |
| | WWW HOURS PER WEEK | .002 | .006 | .010 | .320 |
| | Sex_d1 | -.448 | .185 | -.078 | -2.429 |

Table 3: Regression coefficients when the dummy variable gender is included (B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient).
The estimated model therefore becomes:
Predicted Income = 9.689 + 0.517X1 + 0.002X2 − 0.448X3
where X1 denotes “Highest Degree Attained”,
X2 denotes “WWW Hours per Week”,
and X3 denotes “Female” (1 = female, 0 = male); the standard error of the estimate is 2.790.
This model means that, with all predictors held at zero, the predicted value of the income measure is 9.689 (the intercept, in the coded units of the income variable rather than dollars). The other variables shift this prediction as indicated by their coefficients. The dummy variable indicates the following:
Because “Male” is the reference category, entering a separate male indicator alongside the female indicator would create perfect multicollinearity; the reference category instead serves as the baseline against which the category entered in the model (female) is compared. The results show that, controlling for education and weekly hours of web use, females score about 0.448 units lower on the income measure than males.
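As a quick arithmetic check of this interpretation, the snippet below plugs the reported coefficients into the fitted equation for two otherwise identical cases that differ only in the gender dummy; the degree and hours values are arbitrary examples.

```python
# Reported coefficients from Table 3 (income is in GSS coded units, not dollars).
b0, b_degree, b_www, b_female = 9.689, 0.517, 0.002, -0.448

def predicted_income(degree, www_hours, female):
    """Predicted income score from the fitted equation."""
    return b0 + b_degree * degree + b_www * www_hours + b_female * female

male = predicted_income(degree=3, www_hours=10, female=0)
female = predicted_income(degree=3, www_hours=10, female=1)
print(male, female, male - female)  # the difference equals 0.448, the dummy coefficient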
References
Cohen, P., West, S. G., & Aiken, L. S. (2014). Applied multiple regression/correlation analysis for the behavioral sciences. Psychology Press.

Wagner, W. E. (2016). Using IBM® SPSS® statistics for research methods and social science statistics (6th ed.). Thousand Oaks, CA: Sage Publications.

Warner, R. M. (2012). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.