- Your friend has recently run a regression and has obtained the following numbers:
Your friend has asked you to help them recover the missing information.
- a) Demonstrate how your friend can recover the missing values from the regression. (10 marks)
- b) Using the information above, perform a hypothesis test on a single variable from the model (you may wish to utilise the statistical tables at the end of this paper). (5 marks)
- c) Perform an F-test of the overall significance of the regression performed by your friend. What does this tell you? (7 marks)
- d) What do the results of the above regression tell you about the influence of the variables in the regression on the wage level of the people in the sample? (3 marks)
The above information was covered in term 1:
- a) We looked at a similar exercise in Seminar 6; the equations needed to cover this material can be found in lectures 2-5.
- b) and c) Make use of the hypothesis testing material we covered directly in lectures 5 and 6, which contained the methods for testing this in conjunction with the statistical tables.
- d) Is merely asking you to state in words what the test results mean for the model.
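As a rough illustration of the identities that parts a) to c) rely on, the sketch below recovers a missing t-statistic, standard error, R-squared and overall F-statistic from the other entries of a regression table. All numbers and function names are hypothetical and chosen only for illustration; they are not the figures from the question. SSR is assumed here to mean the residual sum of squares.

```python
def t_stat(coef, se):
    """t-statistic for H0: coefficient = 0."""
    return coef / se


def std_error(coef, t):
    """Recover a missing standard error from a coefficient and its t-statistic."""
    return coef / t


def r_squared(ssr, sst):
    """R^2 = 1 - SSR/SST, with SSR the residual sum of squares (assumed convention)."""
    return 1.0 - ssr / sst


def overall_f(r2, n, k):
    """F-statistic for H0: all k slope coefficients are zero.

    Assumes the model includes an intercept, so the residual
    degrees of freedom are n - k - 1.
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Hypothetical values (NOT taken from the exam question)
coef, se = 0.5, 0.2
n, k = 104, 3
ssr, sst = 60.0, 100.0

t = t_stat(coef, se)
r2 = r_squared(ssr, sst)
f = overall_f(r2, n, k)
print(t, std_error(coef, t), r2, f)
```

The computed t and F values would then be compared with critical values from the statistical tables, using n - k - 1 degrees of freedom for the t-test and (k, n - k - 1) for the F-test.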
- You wish to estimate the following model
$$y = X\beta + \varepsilon$$
where $y$ is an $N \times 1$ vector of observations, $X$ is an $N \times p$ matrix of variables, $\beta$ is a $p \times 1$ vector of parameters and $\varepsilon$ is an $N \times 1$ vector of error terms.
Answer each of the following questions noting explicitly any assumptions you make:
- a) Explain how you know an OLS estimator of $\beta$ to be BLUE. (10 marks)
- b) Identify the distribution of the OLS estimator you derived in part a), stating any new assumptions you make. (8 marks)
- c) Describe how a) and b) can be useful in identifying a t-test procedure for testing the significance of a single variable. (7 marks)
- a) Covers the argument found in lecture 3 which shows that any other unbiased estimator must have a larger variance.
- b) This is covered from the end of lecture 3 into lecture 4, where the variance of the estimator is derived and the normality assumption is then used to state the distribution of the estimator, alongside the material from a).
- c) Uses the material about the derivation of the t-test procedure from lecture 5. Remember that you only need to state the unproven theorems regarding the distributions of certain variables.
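The chain of results that parts a) to c) draw on can be summarised roughly as follows. This is a sketch in standard textbook notation, which is an assumption here and may differ from the lecture notes: $y$ is the $N \times 1$ vector of observations, $X$ the $N \times p$ matrix of variables (assumed to have full column rank), $\beta$ the $p \times 1$ parameter vector, $\varepsilon$ the $N \times 1$ error vector, $\hat{\beta}$ the OLS estimator, $\sigma^2$ the error variance, $s^2$ its estimator and $[\cdot]_{jj}$ the $j$-th diagonal element.

```latex
\begin{align*}
\hat{\beta} &= (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'\varepsilon \\
E[\hat{\beta} \mid X] &= \beta
  \quad \text{if } E[\varepsilon \mid X] = 0 \\
\operatorname{Var}(\hat{\beta} \mid X) &= \sigma^2 (X'X)^{-1}
  \quad \text{if } \operatorname{Var}(\varepsilon \mid X) = \sigma^2 I_N \\
\hat{\beta} \mid X &\sim N\!\left(\beta,\ \sigma^2 (X'X)^{-1}\right)
  \quad \text{if additionally } \varepsilon \mid X \sim N(0, \sigma^2 I_N) \\
s^2 &= \frac{\hat{\varepsilon}'\hat{\varepsilon}}{N - p}, \qquad
t_j = \frac{\hat{\beta}_j - \beta_j}{\sqrt{s^2 \, [(X'X)^{-1}]_{jj}}} \sim t_{N-p}
\end{align*}
```

The first three lines are what the BLUE argument in a) builds on, the fourth is the extra normality assumption needed for b), and the last line shows how replacing the unknown $\sigma^2$ with $s^2$ leads to the t-distribution used in c).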
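- You have just run the following model
$$\log(\text{salary}) = \beta_0 + \beta_1 \text{years} + \beta_2 \text{gamesyr} + \beta_3 \text{bavg} + \beta_4 \text{hrunsyr} + \beta_5 \text{rbisyr} + \varepsilon$$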
which models the salary of baseball players as a function of their years in the league (years), the average games they play a year (gamesyr), their batting average (bavg), their home runs per year (hrunsyr) and runs batted in per year (rbisyr).
- a) What is the interpretation of the parameters on the explanatory variables? (5 marks)
- b) Explain how you might test that the impact of an extra game per year is the same as that of an extra home run per year. (8 marks)
- c) Someone says that the only determinants of player salaries are home runs and years in the league. How would you test such a statement? (12 marks)
- a) Each parameter gives the approximate proportional change in salary for a one-unit increase in the corresponding variable, holding the other variables fixed; i.e. $100 \cdot \beta_1$ is the approximate % increase in salary for each extra year played.
- b) The null hypothesis here is that $\beta_2 = \beta_4$, or equivalently $\beta_2 - \beta_4 = 0$. We can then use the method seen in the lecture to estimate $\theta = \beta_2 - \beta_4$ and test whether this is equal to zero using a t-test. See lecture 6 for more detail on this.
- c) This is an F-test, as the null hypothesis is that $\beta_2 = \beta_3 = \beta_5 = 0$. Hence you will need to specify a restricted model, an appropriate test statistic and a testing rule. The extra information on the test paper will then help you to operationalise this test.
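The two test statistics in b) and c) could be computed along the following lines. This is a minimal sketch with hypothetical inputs, not estimates from the question. Note that the t-statistic here gets the standard error of theta = beta_2 - beta_4 from the coefficient variances and covariance, rather than by re-estimating a reparametrised model, which may be the route taken in the lecture; the two approaches give the same statistic.

```python
import math


def t_equal_coefs(b2, b4, var_b2, var_b4, cov_b2_b4):
    """t-statistic for H0: beta_2 - beta_4 = 0.

    Uses Var(b2 - b4) = Var(b2) + Var(b4) - 2 * Cov(b2, b4).
    """
    theta_hat = b2 - b4
    se_theta = math.sqrt(var_b2 + var_b4 - 2.0 * cov_b2_b4)
    return theta_hat / se_theta


def f_restrictions(ssr_r, ssr_ur, q, n, k):
    """F-statistic for q exclusion restrictions.

    ssr_r, ssr_ur: residual sums of squares of the restricted and
    unrestricted models; k: number of slope coefficients in the
    unrestricted model, which is assumed to include an intercept.
    """
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))


# Hypothetical inputs (NOT estimates from the exam question)
t = t_equal_coefs(b2=0.012, b4=0.020,
                  var_b2=0.000009, var_b4=0.000016, cov_b2_b4=0.0000045)
f = f_restrictions(ssr_r=198.0, ssr_ur=180.0, q=3, n=306, k=5)
print(round(t, 3), round(f, 3))
```

For c), the restricted model would keep only years and hrunsyr, so q = 3, and the F-statistic is compared with the critical value for (q, n - k - 1) degrees of freedom.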
- You have been asked by a University to consider some data they have collected on people who smoke. The survey collects information on earnings, education, marital and family status as well as other basic demographic information.
- a) You have been asked by the University to propose a model which considers the determinants of earnings for the panel of respondents, which includes people who work and some who do not. Propose a model, stating any assumptions you make and explaining how the model will be estimated. (15 marks)
- b) You are also asked to propose a model which will identify the factors which might increase the probability of a person having attended university. Outline such a model and note any factors of interest regarding its estimation. (10 marks)
Parts a) and b) cover the material from the cross section lecture notes.
- a) Uses censored data, since some respondents do not work. Heckman's two step model, properly explained in accordance with the lecture material, would represent a good answer here.
- b) The dependent variable is binary, so the model can be set up and discussed in the format of a limited dependent variable model; see the associated material in lecture 13, which can be used to answer this question.
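A sketch of how the two models might be written down, in standard textbook notation. All symbols are assumptions introduced for illustration and may differ from the lecture notes: $w_i$ is (log) earnings, observed only when the work indicator $s_i = 1$; $x_i$ and $z_i$ are vectors of explanatory variables; $\phi$ and $\Phi$ are the standard normal density and distribution functions; $d_i$ indicates university attendance.

```latex
\begin{align*}
\text{Selection (probit):}\quad
  & P(s_i = 1 \mid z_i) = \Phi(z_i'\gamma) \\
\text{Outcome:}\quad
  & w_i = x_i'\beta + u_i, \quad \text{observed only if } s_i = 1 \\
\text{Step 1:}\quad
  & \text{estimate } \hat{\gamma} \text{ by probit and form }
    \hat{\lambda}_i = \frac{\phi(z_i'\hat{\gamma})}{\Phi(z_i'\hat{\gamma})} \\
\text{Step 2:}\quad
  & \text{regress } w_i \text{ on } x_i \text{ and } \hat{\lambda}_i
    \text{ by OLS over the } s_i = 1 \text{ subsample} \\[6pt]
\text{Part b):}\quad
  & P(d_i = 1 \mid x_i) = \Phi(x_i'\delta) \ \text{(probit)}
    \quad \text{or} \quad
    \frac{e^{x_i'\delta}}{1 + e^{x_i'\delta}} \ \text{(logit)}
\end{align*}
```

The two step procedure relies on joint normality of the selection and outcome errors. For part b), the model is estimated by maximum likelihood, and the coefficients $\delta$ are not themselves marginal effects on the probability.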
Annex: Statistical Tables
Table of t-distribution critical values (significance level split between two tails)
Z-distribution cumulative probability values (cumulative area under the normal distribution)
End of Exam Paper