##### Can you assist me with questions 1, 2 and 3? the questions relate-(Answered)

**Question**

Can you assist me with questions 1, 2 and 3? The questions relate to multivariate regression.

THE UNIVERSITY OF SYDNEY

FACULTY OF ARTS AND SOCIAL SCIENCES

SCHOOL OF ECONOMICS

ECMT1020: Introduction to Econometrics

Assignment 2

Due date: 27 May 2016, by 4pm

Deadline: 3 June 2016, by 4pm

Instructions:

- This assignment consists of four questions. The first three questions are worth 20 marks each and the last question is worth 40 marks, so the entire assignment is worth 100 marks. Partial credit may be given for each sub-question. Your mark for this assignment determines 5% of your final grade for this course.
- This paper consists of one front page and three pages with questions. There are four pages in total.
- Use Stata, and no other software, to perform the calculations for question 4. The data set that you need can be found on Blackboard. In addition to your answers to the questions, also include the relevant Stata commands and output, for example by copying and pasting.
- Assignments must be submitted in hard copy (printed, legibly handwritten, or a combination of both) via the drop boxes in the School of Economics foyer, which is located on the second floor of the Merewether Building (H04). All submissions must include a completed, signed and dated "Individual Assessment Cover Sheet", which can also be found on Blackboard.
- Assignments not submitted on or before the due date stated above are subject to penalty; refer to sydney.edu.au/arts/current_students/late_work.shtml. That is, two marks will be subtracted for each working day or part thereof that has passed after the due date. Concretely, submissions received after 4pm on 27 May but before 4pm on 30 May will be subject to a two-mark penalty, submissions received between 4pm on 30 May and 4pm on 31 May incur a four-mark penalty, et cetera. After the deadline, assessments cannot be accepted and a mark of 0 will be awarded.

Good luck!

Question 1 (20 marks). In this question, I'm after intuitive answers rather than mathematical ones. Indeed, some of the underlying mathematical arguments are far beyond the scope of this course.

(a) (4 marks) Without even thinking about the central limit theorem or any other asymptotic arguments, explain why a data set with $n = 30$ observations is nowhere near large enough to estimate a model with $k = 40$ parameters using ordinary least squares regression.

(b) (4 marks) To estimate the partial effect of $x_2$ on $y$ while $x_3$ is kept constant, we need to estimate a regression model like $y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + u$. Some students find this counterintuitive: if we're keeping $x_3$ constant, why should it be in our model at all? Explain why this is the right thing to do.

(c) (4 marks) Consider again the model with two explanatory variables from part (b). There are two situations in which the partial effect of $x_2$ on $y$ is equal to the total effect of $x_2$ on $y$; what are these situations?

(d) (4 marks) Why should none of $R^2$, $\bar{R}^2$, information criteria, and $F$ tests be used to compare models with $y_t$ on the left hand side to models with $\ln y_t$ instead?

(e) (4 marks) Recall that the coefficient estimators $b_2$ and $b_3$ are random variables. Why does it make sense that these two random variables are usually correlated with each other?

Question 2 (20 marks). We have often used the result that TSS = ExpSS + RSS, for example to justify the use of $R^2$ and of $F$ tests. The purpose of this question is to prove that result.

(a) (6 marks) Prove that $(y_i - \bar{y})^2 = (y_i - \hat{y}_i)^2 + (\hat{y}_i - \bar{y})^2 + 2\,(y_i - \hat{y}_i)(\hat{y}_i - \bar{y})$. (Hint: for this part, the definitions of $\hat{y}_i$ and $\bar{y}$ are completely irrelevant; you may find it easier to just call them $a$ and $b$ or something similar.)

(b) (6 marks) Use the result of part (a) to establish that $\text{TSS} = \text{ExpSS} + \text{RSS} + 2\,(n - 1)\,\text{Cov}[e, \hat{y}]$.

(c) (6 marks) Prove that $\text{Cov}[e, \hat{y}] = 0$. You may take it as given that the residual is uncorrelated with each of the regressors.

(d) (2 marks) Complete the proof that TSS = ExpSS + RSS.
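
As a numerical sanity check on the identity in Question 2 (not a substitute for the proof), the sketch below fits a simple regression with an intercept and compares TSS with ExpSS + RSS. The data and the helper names (`ols_simple`, `decomposition`) are hypothetical and chosen only for illustration; only the Python standard library is assumed.

```python
# Hypothetical toy data, for illustration only (not from the assignment).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.6]


def mean(values):
    return sum(values) / len(values)


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = ybar - b2 * xbar
    return b1, b2


def decomposition(x, y):
    """Return (TSS, ExpSS, RSS, cross term) for the fitted line."""
    b1, b2 = ols_simple(x, y)
    ybar = mean(y)
    fitted = [b1 + b2 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    tss = sum((yi - ybar) ** 2 for yi in y)
    expss = sum((fi - ybar) ** 2 for fi in fitted)
    rss = sum(e ** 2 for e in resid)
    # Cross term from part (a): 2 * sum of (y_i - yhat_i) * (yhat_i - ybar).
    cross = 2 * sum(e * (fi - ybar) for e, fi in zip(resid, fitted))
    return tss, expss, rss, cross


tss, expss, rss, cross = decomposition(x, y)
print("TSS          :", tss)
print("ExpSS + RSS  :", expss + rss)
print("cross term   :", cross)
```

Up to floating-point rounding, the cross term should come out as zero, which is the content of part (c), and the first two printed numbers should agree.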

Question 3 (20 marks). We have collected data on the annual number of cars of twenty different brands sold in Australia (sales, in number of cars), as well as each brand's average retail price (price, in dollars), their annual marketing expenditure (mark, also in dollars), and whether or not they assemble some of their cars domestically (domestic, dummy variable). We wish to investigate how all of these factors influence sales, and we settle on the following regression model:

    . regress lnsales lnprice lnmark domestic

      Source |       SS       df       MS              Number of obs =      20
    ---------+------------------------------           F(XXXX, XXXX) =   17.16
       Model | 15.2271807      3   5.0757269           Prob > F      =  0.0000
    Residual | 4.73289348     16  .295805843           R-squared     =    XXXX
    ---------+------------------------------           Adj R-squared =    XXXX
       Total | 19.9600742     19  1.05053022           Root MSE      =  .54388

    ---------------------------------------------------------------------
     lnsales |     Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
    ---------+-----------------------------------------------------------
     lnprice | -1.389525   .2392136   -5.81   0.000   -1.896635   -.8824151
      lnmark |  .1775161    .125631    1.41   0.177   -.0888098     .443842
    domestic |  .6156965    .389287    1.58   0.133    -.209555    1.440948
       _cons |  21.35649    3.31581    6.44   0.000    14.32728    28.38569
    ---------------------------------------------------------------------

(a) (8 marks) I have removed four numbers from this table, indicated by "XXXX". Compute them.
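
The header of a regression table like this one is tied to its analysis-of-variance block by standard identities. The Python sketch below writes those identities out, assuming `k` counts the estimated parameters including the constant; the function name `anova_summary` is illustrative and not part of Stata. The two header values that are still printed in the table (the F statistic and Root MSE) serve as a check on the arithmetic.

```python
import math


def anova_summary(model_ss, resid_ss, n, k):
    """Summary statistics implied by an OLS ANOVA table.

    n is the number of observations; k is the number of estimated
    parameters, including the intercept.
    """
    df_model = k - 1
    df_resid = n - k
    total_ss = model_ss + resid_ss
    ms_model = model_ss / df_model
    ms_resid = resid_ss / df_resid
    return {
        "df": (df_model, df_resid),
        "F": ms_model / ms_resid,
        "R2": model_ss / total_ss,
        # Adjusted R-squared compares the residual and total mean squares.
        "adjR2": 1 - ms_resid / (total_ss / (n - 1)),
        "root_mse": math.sqrt(ms_resid),
    }


# Sums of squares copied from the table above: 20 observations,
# three regressors plus a constant.
s = anova_summary(15.2271807, 4.73289348, n=20, k=4)
for name, value in s.items():
    print(name, value)
```

If the printed `F` and `root_mse` match the 17.16 and .54388 shown in the table, the remaining entries are computed from the same inputs and can be read off the output.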

(b) (4 marks) Describe what the coefficient estimate -1.389525 means, in economic terms.

(c) (3 marks) Suppose we wish to test the claim that the Australian car market is completely price-driven, so that marketing and whether production is done domestically are irrelevant. Regressing lnsales only on lnprice gave an RSS of 10.69, whereas regressing lnsales only on lnmark and domestic gave an RSS of 14.72. Which of these two numbers is useful for testing our claim, and why?

(d) (5 marks) Test the claim described in part (c).
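
A joint test of exclusion restrictions is usually based on comparing the residual sums of squares of a restricted and an unrestricted model. The sketch below computes that F statistic, under the assumption that the restricted model is the one that omits the regressors the claim says are irrelevant (here lnmark and domestic, so the restricted RSS is 10.69). The function name `restriction_f` is illustrative.

```python
def restriction_f(rss_restricted, rss_unrestricted, q, df_resid):
    """F statistic for q linear restrictions.

    df_resid is the residual degrees of freedom of the unrestricted model.
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / df_resid
    return numerator / denominator


# Unrestricted RSS and residual df are taken from the regression table;
# q = 2 because two coefficients are set to zero under the claim.
f_stat = restriction_f(10.69, 4.73289348, q=2, df_resid=16)
print("F statistic:", f_stat)
```

The statistic is then compared with a critical value from the F distribution with (q, df_resid) degrees of freedom; the Python standard library has no F distribution, so that lookup is left to statistical tables or to Stata.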

Question 4 (40 marks). The data set education.dta contains data on the years of education of a random sample of 718 Americans, as well as the same information for both of their parents. It is likely that parents' achievements have some predictive power for their children's outcomes, as a result of both a hereditary component of intelligence and the possibility that higher educated parents stimulate their children more to do well at school. Thus, we consider the model $\text{educ}_i = \beta_1 + \beta_2\,\text{meduc}_i + \beta_3\,\text{feduc}_i + u_i$.

(a) (5 marks) We will ignore any heteroskedasticity and autocorrelation problems in the remainder of this question. However, discuss whether it is likely that these problems are present in our model.

(b) (5 marks) Estimate this model, and provide interpretations for the three estimated coefficients.

(c) (5 marks) Give 95% confidence intervals for both the conditional mean and the actual value of education for people whose mother has 16 years of education, while the father has 12.

(d) (5 marks) Explain what the restriction $\beta_2 = \beta_3$ means, in economic terms.

(e) (5 marks) Test the restriction in part (d), and show that it cannot be rejected. (Note: I want to see that you have estimated both the restricted and the unrestricted model. Feel free to use Stata's test command to check your result, but using only that would be too easy.)

(f) (5 marks) Use the restricted model from part (e) to repeat the prediction exercise in part (c). Intuitively, why are the resulting confidence intervals narrower this time?

(g) (5 marks) Go back to the original model, where $\beta_2$ and $\beta_3$ are allowed to be different. Now, what would the restriction $\beta_2 + \beta_3 = 1$ mean, in economic terms?

(h) (5 marks) Test the restriction in part (g). (The same note as in part (e) applies.)

This is the last page of the assignment.

Paper#9256872 | Written in 27-Jul-2016
