Statistics and Econometrics

This study requires each student to compile a sample of quarterly observations, in the UK, over a period of ten years. Thus, n=40. Each student will be given a specific period and area. You will collect data on these two variables.

(1)   Birth rates (BR) (2)   GDP per capita (GDP)

1. Present your data in a table showing the names of the variables. Make sure the full definitions and sources of each variable are given. Table 1 shows the birth rate and GDP per capita for the given period, together with the products and squares used in the regression.

Table 1

Year    Birth rate (Y)   GDP per capita £ (x)               Y*x                 x^2

1991 Q1         15.86   3941                            62504.26         15531481

1991 Q2         16        3923                            62766             15389929

1991 Q3         16.79   3905                            65564.95         15249025

1991 Q4         15.7     3905                            61308.5           15249025

1992 Q1         15.92   3907                            62199.44         15264649

1992 Q2         16.19   3898                            63108.62         15194404

1992 Q3         16.4     3915                            64206              15327225

1992 Q4         15.06   3934                            59246.04         15476356

1993 Q1         15.52   3956                            61397.12         15649936

1993 Q2         15.66   3972                            62201.52         15776784

1993 Q3         16.22   4004                            64944.88         16032016

1993 Q4         15.05   4037                            60756.85         16297369

1994 Q1         15.3     4078                            62393.4           16630084

1994 Q2         15.59   4132                            64417.88         17073424

1994 Q3         15.52   4187                            64982.24         17530969

1994 Q4         14.77   4215                            62255.55         17766225

1995 Q1         14.64   4227                            61883.28         17867529

1995 Q2         15.25   4245                            64736.25         18020025

1995 Q3         15.36   4290                            65894.4           18404100

1995 Q4         14.47   4309                            62351.23         18567481

1996 Q1         14.49   4348                            63002.52         18905104

1996 Q2         14.61   4363                            63743.43         19035769

1996 Q3         15.57   4389                            68336.73         19263321

1996 Q4         15.11   4420                            66786.2           19536400

1997 Q1         14.53   4460                            64803.8           19891600

1997 Q2         15.02   4499                            67574.98         20241001

1997 Q3         15.14   4533                            68629.62         20548089

1997 Q4         14.4     4579                            65937.6           20967241

1998 Q1         14.29   4619                            66005.51         21335161

1998 Q2         14.61   4651                            67951.11         21631801

1998 Q3         15.23   4695                            71504.85         22043025

1998 Q4         14.25   4746                            67630.5           22524516

1999 Q1         13.98   4769                            66670.62         22743361

1999 Q2         14.43   4794                            69177.42         22982436

1999 Q3         14.68   4855                            71271.4           23571025

1999 Q4         13.91   4910                            68298.1           24108100

2000 Q1         13.69   4975                            68107.75         24750625

2000 Q2         13.75   5026                            69107.5           25260676

2000 Q3         14.14   5043                            71308.02         25431849

2000 Q4         13.67   5073                            69347.91         25735329

total    600.77 174728                        2614314.30     768804465

The birth rates were obtained by interpolation, which converts yearly data into quarterly data. First, I collected the number of live births and the total female population aged 15-44 (the fertile period for women); I then interpolated these figures to obtain quarterly birth rates. Each quarterly rate is the number of babies born per 1000 women.
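The interpolation step can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the exact procedure behind Table 1: the annual figures are hypothetical placeholders, the function name `quarterly_rates` is invented for the example, and simple linear interpolation between consecutive annual rates is assumed.

```python
def quarterly_rates(annual_births, annual_women):
    """Linearly interpolate annual birth rates (per 1000 women aged 15-44).

    Returns four values for each gap between two consecutive years.
    """
    # Annual rate: live births per 1000 women of fertile age.
    annual = [1000.0 * b / w for b, w in zip(annual_births, annual_women)]
    quarters = []
    for start, end in zip(annual, annual[1:]):
        step = (end - start) / 4.0
        # Four equally spaced points, starting at the earlier year's rate.
        quarters.extend(start + k * step for k in range(4))
    return quarters


# Hypothetical annual figures (assumption: NOT the data behind Table 1).
births = [160_000, 150_000]
women = [10_000_000, 10_000_000]
print(quarterly_rates(births, women))
```

Other interpolation schemes (for example, spreading each annual total over the quarters with seasonal weights) would give different quarterly values, so the choice should be stated alongside the data sources.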

2. The equation to be estimated is:

BRi = b0 + b1 GDPi+ ui                      (i)

(i)                  In terms of the literature on demand for children, what would you expect to find for the coefficient on b1?

Given that the birth rate is the number of births per 1000 women, a positive sign on b1 means that the birth rate and GDP have a direct relationship: as GDP increases, so does the birth rate.

(ii)  Explain how you would modify the model, implied by this equation, if there is an ‘Engel curve’ relationship in the demand for children.

An Engel curve shows that the demand for children increases at a decreasing rate as income rises, because of the concave shape of the curve. To modify this equation, a GDP deflator should be used in calculating GDP, since the Engel curve is biased by the biased nature of the consumer price index. The Engel curve also has low explanatory power because it exhibits heteroscedasticity, which means that it is not a satisfactory model for explaining human behavior (Pasinetti 1981).

Heteroscedasticity is caused by the omission of variables, the averaging of data and errors of measurement. In addition, data collected from a cross section are likely to exhibit heteroscedasticity because the variance depends on the size of the group. Heteroscedasticity needs to be treated whenever it occurs because it violates one of the assumptions of ordinary least squares. It can be treated by using a logarithmic model, reducing the scale of the variables, and including all the relevant variables in the model.

(iii)               Why is there a constant term in the equation with no variable attached?

Without the ‘u’ term, the equation described above would, like many economic equations, be deterministic in nature. A deterministic equation is one in which each value of the independent variable corresponds to one and only one value of the dependent variable. However, deterministic equations are not realistic because human behavior cannot be determined in exact quantities. The constant term b0 gives the level of the birth rate that does not depend on GDP, and it absorbs the average effect of measurement errors, omitted variables and specification errors.

(iv)               Why do these types of equations have a ‘u’ term?

The ‘u’ term is present in the equation because there are other factors that influence the dependent variable but are individually too small to be incorporated in the equation. Human behavior varies in many ways, hence the need for a disturbance term to take care of these variations.

3. Estimate equation (i) by OLS and present the results in a suitable table

(n.b. marks will be lost for simply pasting over the computer output)

(i)                  Comment on the result for the coefficient on the GDP variable.

For calculation purposes in this question let the birth rate=Y and GDP= x

BRi = b0 + b1 GDPi+ ui.

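b0 = [∑Yi ∑xi^2 - ∑xi ∑Yixi] / [n∑xi^2 - (∑xi)^2]

b1 = [n∑Yixi - ∑Yi ∑xi] / [n∑xi^2 - (∑xi)^2]

With n = 40, ∑Yi = 600.77, ∑xi = 174728, ∑Yixi = 2614314.30 and ∑xi^2 = 768804465 from Table 1:

b0 = [(600.77)(768804465) - (174728)(2614314.30)] / [(40)(768804465) - (174728)^2]

b1 = [(40)(2614314.30) - (600.77)(174728)] / [(40)(768804465) - (174728)^2]

The estimates obtained are b0 = 14.962 and b1 = 0.00013.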

The model can now be specified as Y=14.962+0.0013X
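The OLS formulas for b0 and b1 can be evaluated directly from column totals. The following is a minimal sketch, assuming the totals in the last row of Table 1 are correct; the function name `ols_from_sums` is hypothetical and introduced only for this illustration.

```python
def ols_from_sums(n, sum_y, sum_x, sum_xy, sum_x2):
    """Return (b0, b1) for Y = b0 + b1*x, computed from column totals."""
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return b0, b1


# Column totals as printed in the last row of Table 1.
b0, b1 = ols_from_sums(40, 600.77, 174728, 2614314.30, 768804465)
print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
```

Because the result depends entirely on the totals supplied, the printed values offer a quick check of the estimates reported in the text against the data in Table 1.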

(ii)                Comment on the R squared statistic.

R squared is obtained as the explained sum of squares divided by the total sum of squares:

R^2 = [b0∑Yi + b1∑Yixi] / ∑Yi^2

With b0 = 14.962, b1 = 0.00013, ∑Yi = 600.77 and ∑Yixi = 2614314.30, the value obtained is 2.5%. This can be explained as follows: GDP explains 2.5% of the variation in the birth rate, while the remaining 97.5% is explained by other factors.
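The ratio of explained to total sum of squares can be computed from the raw observations. The sketch below is illustrative only: the function name `r_squared` is hypothetical, the sums of squares are measured around the mean of Y (the usual centred definition), and only the first four quarters of Table 1 are used to keep the example short.

```python
def r_squared(y, x):
    """Explained sum of squares / total sum of squares for a one-regressor OLS fit."""
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx                      # OLS slope
    b0 = mean_y - b1 * mean_x           # OLS intercept
    fitted = [b0 + b1 * xi for xi in x]
    ess = sum((f - mean_y) ** 2 for f in fitted)   # explained sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)      # total sum of squares
    return ess / tss


# First four quarters of Table 1 (birth rate, GDP per capita), for illustration.
y = [15.86, 16.0, 16.79, 15.7]
x = [3941, 3923, 3905, 3905]
print(round(r_squared(y, x), 3))
```

Applying the same function to all 40 observations would give the R squared of equation (i).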

(iii)               Derive estimates of the income elasticity of demand for children from your results.

Y = 14.962 + 0.0013X. Income elasticity is obtained by dividing the percentage change in quantity demanded by the percentage change in income. In this case we shall assume that the birth rate is the number of children demanded and estimate the elasticity between two points, X = 3115 and X = 4905.

The change in income is 4905 - 3115 = 1790, or 57.46% of 3115.

Y = 14.962 + 0.0013*(4905) = 21.3385

Y = 14.962 + 0.0013*(3115) = 19.0115

The change in birth rate is 2.327, or 12.24% of 19.0115.

The income elasticity is given as 12.24/57.46 = 0.21.
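The elasticity calculation for a linear model can be sketched as follows. This is a minimal illustration that takes the coefficients of the model as specified in the text (Y = 14.962 + 0.0013X) as given; the function names `arc_elasticity` and `point_elasticity` are hypothetical, and the point elasticity at mean GDP (174728/40 = 4368.2) is added as a common alternative summary.

```python
def arc_elasticity(b0, b1, x_low, x_high):
    """Percentage change in fitted Y divided by percentage change in X."""
    y_low = b0 + b1 * x_low
    y_high = b0 + b1 * x_high
    pct_y = (y_high - y_low) / y_low
    pct_x = (x_high - x_low) / x_low
    return pct_y / pct_x


def point_elasticity(b0, b1, x):
    """Elasticity of the fitted line at a single point: b1 * X / Y."""
    return b1 * x / (b0 + b1 * x)


# Coefficients of the model as specified in the text.
B0, B1 = 14.962, 0.0013
print(round(arc_elasticity(B0, B1, 3115, 4905), 3))
print(round(point_elasticity(B0, B1, 4368.2), 3))
```

In a linear model the elasticity is not constant, so the value depends on the income level at which it is evaluated.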

4.  Carry out the following hypothesis tests:

(i)         b0=0   against the two- sided alternative at the 1% level

Step 1: formulate the hypothesis

H0:B0=0 (B0 is not significant)

HA:B0 is not equal to zero. (B0 is significant)

H0:B1=0 (B1 is not significant)

HA:B1 is not equal to zero. (B1 is significant)

Step 2: Obtain the absolute value of the calculated t statistic.

t-calculated b0 = b0/standard error of b0

Standard error of model = 6.475637037/(40-2) = 0.1703

Standard error b0 = 0.1703√{768804465/(40*768804465 - 40*4368.2*4368.2)} = 0.0273

t-calculated b1 = b1/standard error of b1

Standard error b1 = 0.1703√{1/(768804465 - 40*4368.2*4368.2)} = 0.00722

Step3: obtain the critical t statistic from the t tables

T critical = t with n-k degrees of freedom at α/2

α/2 = 0.01/2 = 0.005; n = 40; k = 2

t-critical=2.704

Step 4: compare the t calculated and the t critical

t-calculated for b0 = 14.962/0.0273 ≈ 548, which is greater than 2.704; we reject the null hypothesis and conclude that b0 is statistically significant.

t-calculated for b1 = 0.0013/0.00722 ≈ 0.18, which is less than 2.704; we do not reject the null hypothesis and conclude that b1 is not statistically significant.

If t-calculated is greater than t-critical in absolute value, reject the null hypothesis; if t-calculated is less than t-critical, do not reject the null hypothesis.
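The decision rule can be written as a small function. The sketch below is illustrative: `t_test` is a hypothetical name, and the coefficient, standard error and critical value are simply the figures quoted in the text, taken as given rather than re-derived.

```python
def t_test(coef, std_err, t_critical):
    """Two-sided t test of H0: coefficient = 0.

    Returns the t statistic and whether H0 is rejected.
    """
    t_stat = coef / std_err
    return t_stat, abs(t_stat) > t_critical


# Values quoted in the text: b0 = 14.962 with standard error 0.0273,
# and the tabulated two-sided 1% critical value 2.704.
t_stat, reject = t_test(14.962, 0.0273, 2.704)
print(round(t_stat, 1), reject)
```

The same function can be reused for b1 or for a different significance level by changing the critical value.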

(ii)        B1=zero   against the two- sided alternative at the 5% level.

The t critical value for a two-sided alternative at the 5% level is 2.021. The calculated t statistic for b1 is less than 2.021, so b1 is not statistically significant at this level.

(iii)     b0<0   against the alternative at the 5% level

H0: b0 < 0

HA: b0 ≥ 0

The calculated t statistic for b0 (14.962/0.0273 ≈ 548) is positive and far above the one-sided 5% critical value of about 1.68, so H0 is rejected at this level.

(iv)     b1<0   against the alternative at the 5% level

b1 is not statistically significant at this level

5.   Differences in the pattern of births, over the calendar year, may cause serious problems with the accuracy of your results for this model. Outline the simple ‘seasonal dummy’ method of dealing with this and apply it to your data to produce a new set of results.

After estimation, the model can be written as Y = 14.962 + 0.0013X. The data collected indicate that the birth rate increases in the third quarter. To take care of this seasonal change we introduce a dummy variable D as follows: Y = 14.962 + 0.0013X(D), where D = 1 in the third quarter and zero in every other quarter.

In the third quarter, when D = 1, the birth rate is given as 14.9633, while in the other quarters the birth rate is the intercept of the model, 14.962. This intercept is lower than 26.605, which is the intercept of the redesigned model estimated in question 7.
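A common way to set up the seasonal dummy method is to add one 0/1 variable for each quarter except a base quarter, so that each dummy coefficient shifts the intercept for its quarter. The sketch below is a minimal pure-Python illustration of that setup, not the estimation reported here: the function names are hypothetical, Q1 is assumed as the base quarter, GDP is rescaled to thousands of pounds to keep the normal equations well scaled, and only the first eight quarters of Table 1 are used.

```python
def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(gdp, quarter):
    """Constant, GDP, and dummies for Q2, Q3, Q4 (Q1 is the omitted base)."""
    return [1.0, gdp] + [1.0 if quarter == q else 0.0 for q in (2, 3, 4)]


# First eight quarters of Table 1: (birth rate, GDP per capita, quarter).
data = [(15.86, 3941, 1), (16.0, 3923, 2), (16.79, 3905, 3), (15.7, 3905, 4),
        (15.92, 3907, 1), (16.19, 3898, 2), (16.4, 3915, 3), (15.06, 3934, 4)]
rows = [design_row(g / 1000.0, q) for _, g, q in data]
coefs = ols(rows, [y for y, _, _ in data])
print([round(c, 4) for c in coefs])  # order: const, GDP, Q2, Q3, Q4
```

Only three dummies are included alongside the constant; using all four would make the regressors perfectly collinear with the constant term.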

6.   Compare your new set of results (from Q.5) with your original results (from Q.3). You may consider the following relevant:

(i)Whether the new model offers a significant improvement in goodness of fit.

The new model does not improve the goodness of fit, because the goodness of fit only increases when new explanatory variables are added that make the model a better estimate. Introducing the dummy variable therefore has no effect on the goodness of fit here.

(ii)  Assessing whether there has been any major change in estimated income

elasticity.

Using the same example, where X = 4905 and X = 3115, the model is rewritten as:

Y = 14.962 + (0.0013*4905)D = 14.962 + 6.3765D; when D = 1 then Y = 21.3385

Y = 14.962 + (0.0013*3115)D = 14.962 + 4.0495D; when D = 1 then Y = 19.0115

It can therefore be concluded that the dummy variable does not cause any significant change in the income elasticity of demand for children.

(iii)              Assessing which quarters of the year tend to have, ceteris paribus, a higher or lower birth rate than others.

The quarters with the highest birth rate when all other factors are held constant are the third quarters. Throughout the data collected, the third quarters have a higher birth rate.

7. You should now write a short report of 450-600 words. This should briefly summarize your findings but most of your answer should consist of further exploration of your data (such as collecting further explanatory variables and estimating new regressions) and suggestions for improvement of the model you have estimated.

A model with dummy variables may at times exhibit perfect multicollinearity. The consequences of multicollinearity include:

• Increases in the standard errors, which lower the t statistics.
• Variables may convey the same information as other variables.
• The model will have to be respecified.
• One has to add more data to the sample, which increases the amount of data that one has to work with.

Treating multicollinearity

• The model could be left as it is. However, this is advisable only when the t-statistics are high enough for the coefficients to be significant.
• Estimate the model on fewer independent variables, though this may cause bias in the results.
• Redesign the model to include more variables.
• Increase the sample size.

In this example, I have chosen to redesign the model to include new variables as a treatment for multicollinearity (for the same period 1988-2001), as shown in Table 2 below. The new variables are Marriages, Employment Female, Unemployment Female and Deaths (Infants under one year).

Redesigned model to be estimated

Year          Births rates (Y)            GDP per capita £ (x1)  Marriages (x2)  Employment Female (x3)      Unemployment Female(x4)       Deaths (Infants under one year) (x5)

Table 2

Year                      Y                     x1                     x2                     x3                     x4         x5

1991 Q1               15.86               3941                46.8                 10647              488      1.57

1991 Q2                16                    3923                101.6               10639              540.9   1.49

1991 Q3                16.79               3905                138.7               10562              584      1.35

1991 Q4                15.7                 3905                62.7                 10548              598.3   1.41

1992 Q1                15.92               3907                45.4                 10495              618.4   1.36

1992 Q2                16.19               3898                101.9               10485              633.1       1.26

1992 Q3                16.4                 3915                146.2               10302              656.9   1.22

1992 Q4                15.06               3934                62.3                 10585              677.4   1.29

1993 Q1                15.52               3956                41.7                 10528              686.3   1.29

1993 Q2                15.66               3972                100.5               10626              681.4   1.31

1993 Q3                16.22               4004                138.5               10633              677.8   1.15

1993 Q4                15.05               4037                60.9                 10695              653.9   1.31

1994 Q1                15.3                 4078                41.7                 10603              637.9   1.22

1994 Q2                15.59               4132                95.5                 10645              622.9   1.18

1994 Q3                15.52               4187                134.4               10663              614.3   1.08

1994 Q4                14.77               4215                59.8                 10867              583.2   1.16

1995 Q1                14.64               4227                38.6                 10762              560.8   1.16

1995 Q2                15.25               4245                92.8                 10870              551.2   1.12

1995 Q3                15.36               4290                135.4               10821              544.6   1.06

1995 Q4                14.47               4309                55.2                 11053              535.6   1.18

1996 Q1                14.49               4348                41                    10992              523.8   1.06

1996 Q2                14.61               4363                91.4                 11160              520      1.07

1996 Q3                15.57               4389                129.4               11230              506.6   1.13

1996 Q4                15.11               4420                55.8                 11333              465.9   1.12

1997 Q1                14.53               4460                39.3                 11207              414.9   1.1

1997 Q2                15.02               4499                87.1                 11329              382.9   1.1

1997 Q3                15.14               4533                128.9               11361              346.6   1.01

1997 Q4                14.4                 4579                54.9                 11659              336.9   1.08

1998 Q1                14.29               4619                37.7                 11614              326.9   1.02

1998 Q2                14.61               4651                85.6                 11654              319.6   0.97

1998 Q3                15.23               4695                125.5               11728              315.7   0.98

1998 Q4                14.25               4746                56                    11811              311.3   1.1

1999 Q1                13.98               4769                36.9                 11688              307.8   1.06

1999 Q2                14.43               4794                83.2                 11774              299.3   1.02

1999 Q3                14.68               4855                126.6               11827              284.5   0.99

1999 Q4                13.91               4910                52.1                 11845              280.6   0.98

2000 Q1                13.69               4975                35.2                 12494              273.2   1

2000 Q2                13.75               5026                84.7                 12523              262.2   0.93

2000 Q3                14.14               5043                132.5               12603              247.6   0.96

2000 Q4                13.67               5073                52.9                 12674              244.4   0.93
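One simple way to check for multicollinearity among the regressors in Table 2 is to look at the pairwise correlation between two of them. The sketch below is illustrative only: the function names are hypothetical, only the eight quarters from 1997 Q1 to 1998 Q4 are used, and the variance inflation factor is computed with the formula 1/(1 - r^2), which applies to the case of two regressors.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def vif(r):
    """Variance inflation factor when there are two regressors: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)


# GDP per capita (x1) and Employment Female (x3), 1997 Q1 to 1998 Q4, from Table 2.
gdp = [4460, 4499, 4533, 4579, 4619, 4651, 4695, 4746]
employment = [11207, 11329, 11361, 11659, 11614, 11654, 11728, 11811]
r = correlation(gdp, employment)
print(round(r, 3), round(vif(r), 1))
```

A correlation close to 1 in absolute value, and hence a large variance inflation factor, suggests that the two variables carry much of the same information.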

Summary Output Report.

Graph showing residuals against the birth rates.

Regression Statistics

Multiple R                          0.964956395

R Square                           0.931140844

Standard Error                   0.221707584

Observations                     40

Variable                                Coefficients      Standard errors        t-statistic

Intercept                                       26.6055           3.1146                         8.5420

GDP per capita £ (x)                     -0.0028           0.00082                      -3.4092

Marriages                                      0.0091             0.0010                        8.4872

Employment Female                      0.00009472             0.00028                       0.3346

Unemployment Female                  -0.0025           0.00087                      -2.9581

Deaths (Infants under one year)     0.0700             0.6953                         0.1007

(All numbers have been presented to 4s.f)

The above table shows the regression results of the model after increasing the number of independent variables from one to five. The model therefore takes the form Y = b0 + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + µ. The statistical significance of each coefficient can be determined at the 1% significance level. The estimated model can be written as follows:

Y = 26.6055 - 0.0028X1 + 0.0091X2 + 0.00009472X3 - 0.0025X4 + 0.0700X5

S.E      (3.1146) (0.00082) (0.0010)     (0.00028)      (0.00087)      (0.6953)

For each coefficient, the absolute value of the t statistic is compared with the critical t value at the 1% level, which is about 2.73 for 40 - 6 = 34 degrees of freedom.

H0: b0 = 0 (b0 is not significant)

HA: b0 is not equal to zero (b0 is significant)

8.5420 > 2.73, thus we reject the null hypothesis; b0 is statistically significant.

H0: b1 = 0 (b1 is not significant)

HA: b1 is not equal to zero (b1 is significant)

|-3.4092| > 2.73, so the null hypothesis is rejected; b1 is statistically significant in the model.

H0: b2 = 0 (b2 is not significant)

HA: b2 is not equal to zero (b2 is significant)

8.4872 > 2.73, so the null hypothesis is rejected; b2 is statistically significant in the model.

H0: b3 = 0 (b3 is not significant)

HA: b3 is not equal to zero (b3 is significant)

0.3346 < 2.73, so the null hypothesis is not rejected; b3 is not statistically significant in the model.

H0: b4 = 0 (b4 is not significant)

HA: b4 is not equal to zero (b4 is significant)

|-2.9581| > 2.73, so the null hypothesis is rejected; b4 is statistically significant in the model.

H0: b5 = 0 (b5 is not significant)

HA: b5 is not equal to zero (b5 is significant)

0.1007 < 2.73, so the null hypothesis is not rejected; b5 is not statistically significant in the model.

Conclusion

From the above results it can be concluded that GDP per capita, marriages and female unemployment are all statistically significant in the model, while female employment and infant deaths are not.
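The significance check for several coefficients at once can be sketched as a short loop over the reported t statistics. This is a minimal illustration: the function name `significant` is hypothetical, and the critical value of 2.728 is an assumption taken from standard t tables for a two-sided 1% test with 40 - 6 = 34 degrees of freedom.

```python
# t statistics as reported in the regression output.
t_stats = {
    "Intercept": 8.5420,
    "GDP per capita": -3.4092,
    "Marriages": 8.4872,
    "Employment Female": 0.3346,
    "Unemployment Female": -2.9581,
    "Deaths (Infants under one year)": 0.1007,
}


def significant(stats, t_critical):
    """Names of the coefficients whose |t| exceeds the critical value."""
    return [name for name, t in stats.items() if abs(t) > t_critical]


# Assumed critical value: two-sided 1% test, 34 degrees of freedom.
T_CRITICAL = 2.728
print(significant(t_stats, T_CRITICAL))
```

Changing `T_CRITICAL` to the 5% value shows how sensitive the list of significant variables is to the chosen level.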

The above shows the results of a scatter diagram for the birth rates and all the statistically significant variables of the model.

Y = 26.6055 - 0.0028X1 + 0.0091X2 + 0.00009472X3 - 0.0025X4 + 0.0700X5. The newly estimated model shows that the birth rate decreases as GDP and female unemployment increase, because of the negative signs on their coefficients. Female employment, marriages and infant deaths have a direct relationship with the birth rate.
