A fashion magazine wishes to construct a price index for men's clothing. The following table shows the prices and quantities bought for the years 1995 to 1997 by a typical consumer:
 
Item       1995 Price (£)  1995 Quantity  1996 Price (£)  1996 Quantity  1997 Price (£)  1997 Quantity
Shoes      52.00           3.2            56.00           3.8            61.00           3.6
Trousers   33.00           4.4            37.70           4.7            39.10           5.1
Shirts     18.00           6.1            20.45           6.4            23.50           6.2
Underwear  4.20            8.7            4.85            8.5            4.70            7.9

Using 1995 as the base year:
Construct a simple aggregate index for the years 1995 to 1997 for the price of clothing.
(6 marks)
Construct a Laspeyres Index for the years 1995 to 1997 for the price of clothing.
(8 marks)
Discuss which of these index numbers is more suitable, giving your recommendation.
(5 marks)
Discuss critically the alternative methods that could be used to construct an appropriate index.
(6 marks)
(Total 25 marks)
A company produces cans of soft drinks. At its Edinburgh site there are four similar production lines A, B, C and D for the canning process. The company operates a shift system with employees working the early shift, the late shift or the night shift. Production is carefully monitored and statistics are kept on the number of substandard cans produced during a shift. The table below shows the relevant data for production on Monday 26 January.
 
Shift  Line A  Line B  Line C  Line D
Early  24      24      18      22
Late   30      20      27      27
Night  30      31      36      35

Carry out a two-way analysis of variance. Produce an appropriate summary table.
(10 marks)
Test whether there is a significant difference in the average number of substandard cans for the four production lines.
(5 marks)
Test whether there is a significant difference in the average number of substandard cans for the three shifts.
(5 marks)
Write a report on your findings. Indicate any other information you would require. Detail your recommendations for further research and analyses.
(5 marks)
(Total 25 marks)
A bank has carried out a study to determine the efficiency of its transaction handling. The bank uses three different methods for handling customers' transactions: human tellers, automated tellers with keyboard and voice activated automatic tellers. The bank has collected information on the value of the transaction (in £ sterling), the type of teller used and the time taken for the transaction.
The following notation has been used:
Y = Transaction time in minutes
X_{1} = Value of transaction (in £ sterling)
X_{2} = Method of handling transaction, where:
1 = human teller,
2 = automated teller with keyboard,
3 = voice activated automatic teller.
The information was run through a multiple regression model and the following printout was generated:
 
Source      Degrees of Freedom  Sums of Squares
Regression  2                   78.26
Residual    60                  64.57
Total       62                  142.83

Variable  Coefficient  Standard Error
Constant  1.4279
X_{1}     0.0028       0.0012
X_{2}     0.2141       0.1162

What is the regression equation generated by this model?
(2 marks)
How long would you expect a transaction for £1200 to take if it was handled by the voice activated teller? Explain your reasoning.
(2 marks)
What statistic could be used to measure the proportion of the total variability in the data that is explained by the regression relationship? Calculate the statistic and comment on your findings.
(3 marks)
Test whether the multiple regression model is statistically significant, and explain what this means.
(5 marks)
Which, if any, of the independent variables are statistically significant? Explain your reasoning. What implications do these findings have?
(4 marks)
What additional analyses would you recommend and why?
(4 marks)
Write a report on your findings. Discuss the usefulness of this model for the bank and make recommendations for its possible improvement.
(5 marks)
(Total 25 marks)
The Weight watchers Association encourages its members to come to weekly meetings to discuss diets and healthy eating. At each meeting members are weighed to determine the change from the previous meeting and their overall progress.
The table below shows the relevant information for the 10 members who attended the last group meeting.
 
Member           Weight (kg) 05-Jan-98  Weight (kg) 21-May-98  Weight (kg) 28-May-98
Edna Mitchell    92.3                   90.4                   90.2
Mary Booth       70.2                   65.3                   66.2
Rosemary Wilson  64.7                   67.2                   67.2
Ronnie Biggs     86.7                   85.1                   83.9
Vivienne Black   68.4                   65.6                   65.4
Judy Russell     74.8                   71.7                   72.3
John Hunter      86.4                   82.1                   82.0
May West         95.2                   88.4                   88.3
Charlie Dawson   84.1                   80.6                   81.3
Margaret Ross    72.3                   68.7                   67.3

Comparing the average weight as recorded at the beginning of January with that recorded on May 21, determine whether there is a significant decrease in the average weight of the members.
(9 marks)
Determine if there is a significant difference in the average weight as recorded at the two sessions in May 1998.
(9 marks)
Write a report on your findings. Describe any assumptions you have made. Indicate any further information you would require and any further analyses you would wish to perform.
(7 marks)
(Total 25 marks)
(Total of 100 marks)
END OF PAPER
Note from the Examiner: These answers are skeletal guidelines only. Though they contain the major points sought, the student's response should provide further insight into many of the points. Simply repeating the major bullet points does not necessarily demonstrate the required level of understanding of the subject.
Construct a simple aggregate index for the years 1995 to 1997 for the price of clothing.
(6 marks)
See below.
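A simple aggregate index divides the sum of the item prices in each year by the sum of the item prices in the base year and multiplies by 100. The Python sketch below is one way to check the arithmetic from the table in the question; the variable and function names are illustrative only and are not part of the original answer.

```python
# Prices (£) from the question table, in the order:
# shoes, trousers, shirts, underwear.
prices = {
    1995: [52.00, 33.00, 18.00, 4.20],
    1996: [56.00, 37.70, 20.45, 4.85],
    1997: [61.00, 39.10, 23.50, 4.70],
}
BASE_YEAR = 1995


def simple_aggregate_index(prices, base_year):
    """Return {year: index} where index = 100 * sum(p_n) / sum(p_0)."""
    base_total = sum(prices[base_year])
    return {year: 100 * sum(p) / base_total for year, p in prices.items()}


simple_index = simple_aggregate_index(prices, BASE_YEAR)
for year in sorted(simple_index):
    print(year, round(simple_index[year], 1))
```

The price totals are £107.20, £119.00 and £128.30, giving index values of 100.0, 111.0 and 119.7 for 1995, 1996 and 1997.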
Construct a Laspeyres Index for the years 1995 to 1997 for the price of clothing.
(8 marks)
A Laspeyres Index for the years 1995 to 1997 for the price of clothing:
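The Laspeyres price index values the base-year (1995) basket at each year's prices: index = 100 × Σ(p_n × q_0) / Σ(p_0 × q_0). The following Python sketch works this through for the data in the question; the names used are illustrative assumptions, not part of the original answer.

```python
# Prices (£) from the question table, in the order:
# shoes, trousers, shirts, underwear.
prices = {
    1995: [52.00, 33.00, 18.00, 4.20],
    1996: [56.00, 37.70, 20.45, 4.85],
    1997: [61.00, 39.10, 23.50, 4.70],
}
# Quantities bought in the base year, 1995, in the same item order.
base_quantities = [3.2, 4.4, 6.1, 8.7]
BASE_YEAR = 1995


def laspeyres_index(prices, base_quantities, base_year):
    """Return {year: index} using base-year quantities as fixed weights."""
    def basket_cost(year):
        return sum(p * q for p, q in zip(prices[year], base_quantities))

    base_cost = basket_cost(base_year)
    return {year: 100 * basket_cost(year) / base_cost for year in prices}


laspeyres = laspeyres_index(prices, base_quantities, BASE_YEAR)
for year in sorted(laspeyres):
    print(year, round(laspeyres[year], 1))
```

The 1995 basket costs £457.94, £512.02 and £551.48 at 1995, 1996 and 1997 prices respectively, giving index values of 100.0, 111.8 and 120.4.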
Discuss which of these index numbers is more suitable, giving your recommendation.
(5 marks)
The Laspeyres index is the more suitable of the two, since it takes account of the quantities of clothing purchased. The quantities used are those purchased in the base year, 1995. The simple aggregate index gives equal importance (weight) to all of the items, regardless of how many of each are bought.
Discuss critically the alternative methods that could be used to construct an appropriate index.
(6 marks)
A Paasche index could also be used. This would use current-year quantities instead of base-year (1995) quantities. However, it is often difficult to obtain accurate quantity figures for the current year.
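For comparison, the Paasche index can be computed from the same table because current-year quantities happen to be given: index = 100 × Σ(p_n × q_n) / Σ(p_0 × q_n). This is a minimal Python sketch with illustrative names, not part of the original answer.

```python
# Prices (£) and quantities from the question table, in the order:
# shoes, trousers, shirts, underwear.
prices = {
    1995: [52.00, 33.00, 18.00, 4.20],
    1996: [56.00, 37.70, 20.45, 4.85],
    1997: [61.00, 39.10, 23.50, 4.70],
}
quantities = {
    1995: [3.2, 4.4, 6.1, 8.7],
    1996: [3.8, 4.7, 6.4, 8.5],
    1997: [3.6, 5.1, 6.2, 7.9],
}
BASE_YEAR = 1995


def paasche_index(prices, quantities, base_year):
    """Return {year: index} using each year's own quantities as weights."""
    result = {}
    for year, q_now in quantities.items():
        current_cost = sum(p * q for p, q in zip(prices[year], q_now))
        base_cost = sum(p * q for p, q in zip(prices[base_year], q_now))
        result[year] = 100 * current_cost / base_cost
    return result


paasche = paasche_index(prices, quantities, BASE_YEAR)
for year in sorted(paasche):
    print(year, round(paasche[year], 1))
```

With these data the Paasche values are 100.0, 111.6 and 120.3. These are close to the Laspeyres values because the quantities bought changed little over the three years.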
Carry out a two-way analysis of variance. Produce an appropriate summary table.
(10 marks)
Initial calculations:
Shift  A   B   C   D   Total  Mean
Early  24  24  18  22  88     22
Late   30  20  27  27  104    26
Night  30  31  36  35  132    33
Total  84  75  81  84  324
Mean   28  25  27  28         27

Source                         Degrees of Freedom  Sums of Squares  Mean Square  F-ratio
Treatments (production lines)  3                   18.00            6.00         0.42
Blocks (shifts)                2                   248.00           124.00       8.65
Error                          6                   86.00            14.33
Total                          11                  352.00
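The summary table can be reproduced from the raw counts. The Python sketch below treats the data as a two-way layout with one observation per cell (production lines as treatments, shifts as blocks); the variable names are illustrative and not part of the original answer.

```python
# Substandard cans per shift (rows) and production line A-D (columns).
data = {
    "Early": [24, 24, 18, 22],
    "Late": [30, 20, 27, 27],
    "Night": [30, 31, 36, 35],
}
rows = list(data.values())
n_shifts, n_lines = len(rows), len(rows[0])

grand_mean = sum(sum(r) for r in rows) / (n_shifts * n_lines)
shift_means = [sum(r) / n_lines for r in rows]
line_means = [sum(col) / n_shifts for col in zip(*rows)]

# Sums of squares for a two-way layout without replication.
ss_lines = n_shifts * sum((m - grand_mean) ** 2 for m in line_means)
ss_shifts = n_lines * sum((m - grand_mean) ** 2 for m in shift_means)
ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_error = ss_total - ss_lines - ss_shifts

df_lines, df_shifts = n_lines - 1, n_shifts - 1
df_error = df_lines * df_shifts

ms_error = ss_error / df_error
f_lines = (ss_lines / df_lines) / ms_error
f_shifts = (ss_shifts / df_shifts) / ms_error

print("Lines: ", ss_lines, df_lines, round(f_lines, 2))
print("Shifts:", ss_shifts, df_shifts, round(f_shifts, 2))
print("Error: ", ss_error, df_error, round(ms_error, 2))
```

The resulting F-ratios are then compared with tabulated critical values of F on (3, 6) and (2, 6) degrees of freedom.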

Test whether there is a significant difference in the average number of substandard cans for the four production lines.
(5 marks)
Test for a significant difference in the average number of substandard cans for the four production lines:
H_{O}: μ_{A} = μ_{B} = μ_{C} = μ_{D}
H_{A}: the means μ_{A}, μ_{B}, μ_{C} and μ_{D} are not all equal
Test statistic: F_{calc} = 0.42 with (3,6) degrees of freedom
From tables:
Critical value of F at 5% = 4.76
Critical value of F at 1% = 9.78
Since F_{calc} is below the 5% critical value, do not reject H_{O}.
There is no significant difference in the average number of substandard cans for the four production lines.
Test whether there is a significant difference in the average number of substandard cans for the three shifts.
(5 marks)
Test for a significant difference in the average number of substandard cans for the three shifts:
H_{O}: μ_{EARLY} = μ_{LATE} = μ_{NIGHT}
H_{A}: the means μ_{EARLY}, μ_{LATE} and μ_{NIGHT} are not all equal
Test statistic: F_{calc} = 8.65 with (2,6) degrees of freedom
From tables:
Critical value of F at 5% = 5.14
Critical value of F at 1% = 10.92
Since F_{calc} lies between the two critical values, reject H_{O} at the 5% level of significance but not at the 1% level.
There is a significant difference in the average number of substandard cans for the three shifts, at the 5% level of significance.
Write a report on your findings. Indicate any other information you would require. Detail your recommendations for further research and analyses.
(5 marks)
Looking at the average number of defective cans, it seems that there are fewer on production line B. However, this difference is not found to be significant.
When the average number of substandard cans is compared for the three shifts, it can be seen that the night shift has the highest average (33), in contrast to that of the early shift (22).
Further information needs to be obtained on total production, on comparisons with other days and periods, and on how "substandard" is defined.
What is the regression equation generated by this model?
(2 marks)
Y = 1.4279 + 0.0028X_{1} + 0.2141X_{2}
How long would you expect a transaction for £1200 to take if it was handled by the voice activated teller? Explain your reasoning.
(2 marks)
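Using the fitted regression equation with X_{1} = 1200 and X_{2} = 3 (voice activated automatic teller):
Y = 1.4279 + 0.0028(1200) + 0.2141(3) = 1.4279 + 3.36 + 0.6423 = 5.43 minutes
This assumes that the regression equation is valid for a transaction of £1200, that is, that £1200 lies within the range of transaction values used to fit the model.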
What statistic could be used to measure the proportion of the total variability in the data that is explained by the regression relationship? Calculate the statistic and comment on your findings.
(3 marks)
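The statistic R^{2} (the coefficient of determination) could be used to measure the proportion of variability that is explained by the regression relationship.
R^{2} = regression sum of squares / total sum of squares = 78.26 / 142.83 = 0.548
About 55% of the variability in the data is explained by the regression relationship, so some 45% remains unexplained by the model.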
Test whether the multiple regression model is statistically significant, and explain what this means.
(5 marks)
 
Source      Degrees of Freedom  Sums of Squares  Mean Square  F-ratio
Regression  2                   78.26            39.13        36.23
Residual    60                  64.57            1.08
Total       62                  142.83

H_{O}: The regression model does not explain a significant proportion of the variance in Y.
H_{A}: The regression model explains a significant proportion of the variance in Y.
F_{calc} = 36.23 with (2,60) degrees of freedom.
From tables the critical values are:
F-ratio at 5% = 3.15
F-ratio at 1% = 4.98
Reject H_{O} at the 1% level of significance.
The regression model explains a significant proportion of the variation in Y, at the 1% level of significance.
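The mean squares and F-ratio follow directly from the sums of squares and degrees of freedom in the printout. The Python sketch below recomputes them with illustrative names. It divides by the unrounded residual mean square, so it gives roughly 36.4 rather than the 36.23 obtained by dividing by the rounded value 1.08. The conclusion is the same either way.

```python
# Values taken from the regression printout in the question.
ss_regression, df_regression = 78.26, 2
ss_residual, df_residual = 64.57, 60

# Critical value of F on (2, 60) degrees of freedom at the 5% level,
# as quoted from tables in the answer.
F_CRITICAL_5PCT = 3.15

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

# Coefficient of determination from the same sums of squares.
r_squared = ss_regression / (ss_regression + ss_residual)

model_significant = f_ratio > F_CRITICAL_5PCT
print(round(ms_regression, 2), round(ms_residual, 2), round(f_ratio, 2))
print(round(r_squared, 3), model_significant)
```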
Which, if any, of the independent variables are statistically significant? Explain your reasoning. What implications do these findings have?
(4 marks)
In order to test the significance of the independent variables, the relevant t-statistics need to be calculated (t = coefficient / standard error).
Variable  Coefficient  Standard Error  t-statistic
X_{1}     0.0028       0.0012          2.33
X_{2}     0.2141       0.1162          1.84
Model: Y = b_{0} + b_{1}X_{1} + b_{2}X_{2}
From tables, the critical value of t at 5% (two-tailed) with 60 degrees of freedom = 2.00
H_{O}: b_{1} = 0
H_{A}: b_{1} ≠ 0
t_{calc} = 2.33. Reject H_{O} at the 5% level of significance.
The regression coefficient b_{1} is significantly different from 0 at the 5% level of significance.
Retain X_{1}, transaction value (in £), in the regression equation.
H_{O}: b_{2} = 0
H_{A}: b_{2} ≠ 0
t_{calc} = 1.84. Do not reject H_{O} at the 5% level of significance.
The regression coefficient b_{2} is not significantly different from 0.
Remove X_{2}, method of handling transaction, from the regression model. Rerun the model and test again.
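Each t-statistic is the coefficient divided by its standard error, compared with the two-tailed critical value of t on the residual degrees of freedom (60). The sketch below assumes a 5% critical value of 2.000 taken from standard t tables; the names are illustrative and not part of the original answer.

```python
# (coefficient, standard error) pairs from the regression printout.
coefficients = {
    "X1": (0.0028, 0.0012),  # transaction value (£)
    "X2": (0.2141, 0.1162),  # method of handling transaction
}

# Two-tailed 5% critical value of t with 60 degrees of freedom,
# assumed here from standard t tables.
T_CRITICAL_5PCT = 2.000

t_stats = {name: b / se for name, (b, se) in coefficients.items()}
significant = {name: abs(t) > T_CRITICAL_5PCT for name, t in t_stats.items()}

for name in coefficients:
    print(name, round(t_stats[name], 2), significant[name])
```

Only X1 exceeds the critical value, which matches the decision to retain transaction value and drop the handling method.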
What additional analyses would you recommend and why?
(4 marks)
Check the residuals: plot them on a scatter diagram, examine them for serial correlation and heteroscedasticity, and check for randomness using a runs test.
Write a report on your findings. Discuss the usefulness of this model for the bank and make recommendations for its possible improvement.
(5 marks)
The analysis has shown that the proposed regression model explains 55% of the variation in the Y values.
When the model as a whole is tested, it is found to be significant at the 1% level of significance.
When the individual regression coefficients are tested it is found that variable X_{1} (transaction value) makes a significant contribution to the model and should be retained.
The variable X_{2}, method of handling transaction, does not make a significant contribution to the model and should be excluded.
A new model should be derived and tested for significance.
No details are given of the way the study was carried out. Was the sample representative?
The sample size of 63 seems relatively small for the type of study carried out.
See the summary of calculations below.
Comparing the average weight as recorded at the beginning of January with that recorded on May 21, determine whether there is a significant decrease in the average weight of the members.
(9 marks)
In this case study paired differences should be used in the hypothesis tests.
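A paired test works on the difference for each member, d = earlier weight minus later weight, and computes t = mean(d) / (s_d / √n) with n − 1 degrees of freedom. The Python sketch below applies this to both comparisons using the weights in the question; the function and variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (kg) from the question table, in the order the members are listed.
jan_05 = [92.3, 70.2, 64.7, 86.7, 68.4, 74.8, 86.4, 95.2, 84.1, 72.3]
may_21 = [90.4, 65.3, 67.2, 85.1, 65.6, 71.7, 82.1, 88.4, 80.6, 68.7]
may_28 = [90.2, 66.2, 67.2, 83.9, 65.4, 72.3, 82.0, 88.3, 81.3, 67.3]


def paired_t(earlier, later):
    """Return (mean difference, sd of differences, t) for earlier - later."""
    diffs = [e - l for e, l in zip(earlier, later)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 divisor)
    t_stat = d_bar / (s_d / sqrt(len(diffs)))
    return d_bar, s_d, t_stat


jan_vs_may21 = paired_t(jan_05, may_21)
may21_vs_may28 = paired_t(may_21, may_28)

print("Jan vs 21 May:   ", [round(v, 3) for v in jan_vs_may21])
print("21 May vs 28 May:", [round(v, 3) for v in may21_vs_may28])
```

With 10 members there are 9 degrees of freedom. The first comparison gives a mean loss of 3.0 kg and t of about 3.88; the second gives a mean change of 0.1 kg and t of about 0.42.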
Comparing January with May 21st:
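H_{O}: μ_{d} ≤ 0
H_{A}: μ_{d} > 0 (one-tailed test, since testing for a decrease)
where d = weight on 5 January minus weight on 21 May for each member.
Mean difference = 3.0 kg, standard deviation of the differences = 2.445 kg, n = 10
t_{calc} = 3.0 / (2.445 / √10) = 3.88 with 9 degrees of freedom
From tables, the critical value of t at 1% (one-tailed) = 2.82
Reject H_{O} at the 1% level of significance.
There is a significant decrease in the average weight between January and May 21st at the 1% level of significance.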
Determine if there is a significant difference in the average weight as recorded at the two sessions in May 1998.
(9 marks)
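Comparing May 21st with May 28th:
H_{O}: μ_{d} = 0
H_{A}: μ_{d} ≠ 0 (two-tailed test, since testing for a difference in either direction)
where d = weight on 21 May minus weight on 28 May for each member.
Mean difference = 0.1 kg, standard deviation of the differences = 0.750 kg, n = 10
t_{calc} = 0.1 / (0.750 / √10) = 0.42 with 9 degrees of freedom
From tables, the critical value of t at 5% (two-tailed) = 2.26
Do not reject H_{O}.
There is no significant difference in the average weights for May 21st and May 28th.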
Write a report on your findings. Describe any assumptions you have made. Indicate any further information you would require and any further analyses you would wish to perform.
(7 marks)
The data show that there is a significant decrease between the average weight recorded in January and that recorded on May 21st, at the 1% level of significance.
The greatest weight loss was recorded by May West. Only one member, Rosemary Wilson, showed a weight gain.
When the average weights are compared for the two meetings in May it is found that there is no significant difference.
The tests assume that the sample was taken at random and that it comes from a normally distributed population. However the members included are only those who turned up at the last meeting and who were at the previous two sessions. There may be another group who decided not to come because they had put on some weight.
The data should also be examined for differences in the recorded weights for males and females. There is also likely to be a relationship to the initial weight when the member joined and to the length of membership of the Association.
Information on the accuracy of the weighing process and on the type of clothing worn should also be obtained, for example whether members were weighed in winter clothes in January and in summer wear in May.