Statistics Solutions

 

  • Based on the sample data and a significance level equal to 0.05, does there appear to be a difference in the proportion of loan defaults between residential and commercial customers?
  • Prepare a short response to the Vintner board of directors. Include in your report a graph of the data that supports your statistical analysis.
  • Consider the outcome of the hypothesis test in part a. In the last five audits, 10 residential and 10 commercial customers were selected. In three of the audits, there were more residential than commercial loan defaults. Determine the probability of such an occurrence.
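The third item can be sketched in code under some loudly stated assumptions: the two customer groups share one default rate (as a non-significant test would suggest), that rate is taken to be the pooled sample proportion of 50 out of 305 implied by the Excel output for this question, "three of the audits" is read as exactly three of five, and audits are independent. The function names below are illustrative only.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_more_in_first(n, p):
    """P(X > Y) for independent X, Y ~ Binomial(n, p)."""
    p_tie = sum(binom_pmf(k, n, p) ** 2 for k in range(n + 1))
    # By symmetry P(X > Y) == P(X < Y), so each is half of the non-tie mass.
    return (1 - p_tie) / 2

def prob_exactly(k_audits, total_audits, n, p):
    """P(exactly k_audits of total_audits show more defaults in group 1)."""
    q = prob_more_in_first(n, p)
    return binom_pmf(k_audits, total_audits, q)

# ASSUMPTION: common default rate = pooled sample proportion (50 of 305).
pooled_rate = 50 / 305
result = prob_exactly(3, 5, 10, pooled_rate)
print(f"P(exactly 3 of 5 audits) = {result:.4f}")
```

If "three of the audits" is meant as at least three, the same pieces can be summed over 3, 4, and 5 audits instead.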
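t-Test: Two-Sample Assuming Equal Variances

                               Residential Loan Status   Commercial Loan Status
Mean                           1.155                     1.180952
Variance                       0.131633                  0.149634
Observations                   200                       105
Pooled Variance                0.137812
Hypothesized Mean Difference   0
df                             303
t Stat                         -0.58009
P(T<=t) one-tail               0.281143
t Critical one-tail            1.649898
P(T<=t) two-tail               0.562286
t Critical two-tail            1.967824

The two-tail p-value (0.562) is greater than 0.05, so the sample does not show a significant difference in the proportion of loan defaults between residential and commercial customers.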
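Because the question is about proportions, a two-proportion z-test is a natural cross-check on the t-test output. The sketch below assumes loan status is coded 1/2, in which case the reported means imply 31 of 200 residential and 19 of 105 commercial loans in the category coded 2. Those counts are inferred from the means, not read from the data file, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# ASSUMPTION: status coded 1/2, so mean 1.155 over 200 -> 31 coded "2",
# and mean 1.180952 over 105 -> 19 coded "2".
z, p_value = two_proportion_z_test(31, 200, 19, 105)
print(f"z = {z:.3f}, two-tail p = {p_value:.3f}")
```

The z statistic should land very close to the t Stat in the Excel output, since with samples this large the two tests are nearly equivalent.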

 

Q2 The California State Highway Patrol recently conducted a study on a stretch of interstate highway south of San Francisco to determine what differences, if any, existed in driving speeds of cars licensed in California and cars licensed in Nevada. One of the issues to be examined was whether there was a difference in the variability of driving speeds between cars licensed in the two states. The data file Speed-Test contains speeds of 140 randomly selected California cars and 75 randomly selected Nevada cars. Based on these sample results, can you conclude at the 0.05 level of significance there is a difference between the variations in driving speeds for cars licensed in the two states?

t-Test: Two-Sample Assuming Equal Variances

                               California Cars   Out-of-State Cars
Mean                           64.45             61.96
Variance                       64.29245          59.76865
Observations                   140               75
Pooled Variance                62.7208
Hypothesized Mean Difference   0
df                             213
t Stat                         2.197197
P(T<=t) one-tail               0.014542
t Critical one-tail            1.652039
P(T<=t) two-tail               0.029084
t Critical two-tail            1.971164

 

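The t-test above compares mean speeds: with a two-tail p-value of 0.029 (below 0.05), the mean speeds of California cars (64.45) and out-of-state cars (61.96) differ significantly. The question, however, asks about variability, which calls for an F-test on the two sample variances rather than a t-test on the means. The ratio of the sample variances is 64.29245 / 59.76865 ≈ 1.08, which is close to 1, so these results do not by themselves show a difference in the variability of driving speeds between the two states.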
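A test on variability is usually an F-test on the ratio of the sample variances. The following is a minimal sketch using the variances and sample sizes from the Excel output above. To stay within the standard library it integrates the F density numerically, and the helper names are illustrative rather than taken from any package.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by Simpson's rule; intended for d1 > 2, where the
    density is 0 at x = 0. `steps` must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

# Sample variances and sizes from the Excel output.
var_ca, n_ca = 64.29245, 140
var_out, n_out = 59.76865, 75

f_stat = var_ca / var_out          # larger variance in the numerator
d1, d2 = n_ca - 1, n_out - 1
upper_tail = 1 - f_cdf(f_stat, d1, d2)
p_two_tail = 2 * min(upper_tail, 1 - upper_tail)
print(f"F = {f_stat:.4f}, df = ({d1}, {d2}), two-tail p = {p_two_tail:.3f}")
```

A two-tail p-value above 0.05 here would mean the hypothesis of equal variances is not rejected.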

 

Q3 The Ecco Company makes electronics products for distribution throughout the world. As a member of the quality department, you are interested in the warranty claims that are made by customers who have experienced problems with Ecco products. The file called Ecco contains data for a random sample of warranty claims. Large warranty claims not only cost the company money but also provide adverse publicity. The quality manager has asked you to provide her with a range of values that would represent the percentage of warranty claims filed for more than $300. Provide this information for your quality manager.
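A "range of values" for a percentage is typically a confidence interval for a proportion. The sketch below uses the normal approximation. The 95% confidence level is an assumption (the question does not state one), and the counts in the example are hypothetical placeholders, not values from the Ecco file.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# HYPOTHETICAL counts for illustration only: 40 claims over $300 out of 200.
low, high = proportion_ci(successes=40, n=200)
print(f"95% CI for the share of claims over $300: {low:.3f} to {high:.3f}")
```

With the real file, `successes` would be the number of sampled claims above $300 and `n` the total number of sampled claims.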

Sum of %total   Column Labels
Row Labels      1          2          3          4          Grand Total
1               0.20965    0.300377   0.146518   0.029564   0.686109
  1             0.18429    0.195902   0.090594   0.01328    0.484067
  2             0.008442   0.073476   0.044513   0.016283   0.142714
  3             0.016917   0.030999   0.011412              0.059328
2               0.097534   0.106477   0.039274   0.023157   0.266442
  1             0.090126   0.08719    0.006607   0.012813   0.196737
  2             0.007408   0.019287   0.023091              0.049785
  3                                   0.009577   0.010344   0.019921
3               0.010544   0.011278   0.025626              0.047449
  1                                   0.017852              0.017852
  2             0.010544   0.011278   0.007775              0.029597
Grand Total     0.317728   0.418132   0.211418   0.052721   1

 

Q4 The state transportation department recently conducted a study of motorists in Idaho. Two main factors of interest were whether the vehicle was insured with liability insurance and whether the driver was wearing a seat belt. A random sample of 100 cars was stopped at various locations throughout the state. The data are in the file called Liabins. The investigators were interested in determining whether seat belt status is independent of insurance status. Conduct the appropriate hypothesis test using a 0.05 level of significance and discuss your results.

  Driving Citations Vehicle Year Driver Sex Driver Age Seat Belt Status Law Knowledge Employment Status Year In State Registered Vehicles Years Education Insurance Certificate Status Insurance Status
Driving Citations 1
Vehicle Year 0.030072 1
Driver Sex -0.25747 0.258334 1
Driver Age -0.29097 0.116277 0.041819 1
Seat Belt Status 0.009689 -0.22479 -0.11649 -0.1071 1
Law Knowledge -0.02023 -0.06389 -0.04404 0.164283 0.177074 1
Employment Status -0.1347 0.098752 0.192186 0.306856 -0.06546 0.186704 1
Year In State -0.17428 0.12232 -0.08811 0.610012 0.018003 0.022402 0.03915 1
Registered Vehicles -0.17653 -0.04777 -0.13674 0.330033 -0.09816 0.046213 0.011787 0.285945 1
Years Education -0.00536 0.247238 0.137123 0.048782 -0.28447 -0.22471 0.059279 0.055306 0.1084 1
Insurance Certificate Status 0.067598 -0.07084 -0.02452 -0.0343 -0.05786 0.080405 0.138235 -0.10396 -0.1895 0.102671 1
Insurance Status -0.00283 0.060221 -0.12737 0.046721 0.088611 -0.07135 0.042459 0.166711 0.149672 -0.01691 -0.14086 1

 

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.088611
R Square            0.007852
Adjusted R Square   -0.00227
Standard Error      0.525144
Observations        100

ANOVA
             df   SS         MS         F          Significance F
Regression   1    0.213886   0.213886   0.775578   0.380652
Residual     98   27.02611   0.275777
Total        99   27.24

                   Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept          1.571429       0.198486         7.917078   3.81E-12   1.17754     1.965317    1.17754       1.965317
Insurance Status   0.18126        0.20582          0.880669   0.380652   -0.22718    0.589703    -0.22718      0.589703

 

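The correlation between seat belt status and insurance status is only 0.0886, and the regression p-value (0.381) is well above 0.05. We therefore fail to reject the hypothesis that seat belt status is independent of insurance status.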
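The usual test for independence of two categorical variables is a chi-square test on their contingency table. The sketch below shows the mechanics with the standard library only. The counts in the example are hypothetical (the Liabins cross-tabulation is not shown in this document), and the function names are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square distribution,
    via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= half_x / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower

def chi_square_independence(table):
    """Pearson chi-square test of independence for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)

# HYPOTHETICAL counts (rows: insured / uninsured; columns: belted / unbelted).
example_table = [[40, 25],
                 [15, 20]]
stat, df, p_value = chi_square_independence(example_table)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Independence is rejected at the 0.05 level only when the p-value falls below 0.05.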

 

 

Q5 An economist for the state government of Mississippi recently collected the data contained in the file called Mississippi on the percentage of people unemployed in the state at randomly selected points in time over the past 25 years and the interest rate of Treasury bills offered by the federal government at that point in time.

  • Develop a plot showing the relationship between the two variables.
  • Describe the relationship as being either linear or curvilinear.
  • Develop a simple linear regression model with unemployment rate as the dependent variable.
  • Write a short report describing the model and indicating the important measures.

 

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.973125
R Square            0.946972
Adjusted R Square   0.943658
Standard Error      0.788972
Observations        18

ANOVA
             df   SS         MS         F          Significance F
Regression   1    177.8604   177.8604   285.7299   1.26E-11
Residual     16   9.959637   0.622477
Total        17   187.82

                     Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept            -19.3726       1.586889         -12.2079   1.6E-09    -22.7366    -16.0085    -22.7366      -16.0085
Interest Rates (x)   2.902579       0.171714         16.90355   1.26E-11   2.538561    3.266597    2.538561      3.266597
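The key measures in this output follow directly from the ANOVA sums of squares and the coefficient row. As a sketch, they can be recomputed from the reported figures. The 9.0 passed to the prediction function is a hypothetical interest rate chosen for illustration, and the function name is illustrative.

```python
import math

# Figures copied from the Excel regression output.
n = 18
ss_regression, ss_residual = 177.8604, 9.959637
intercept = -19.3726
slope, slope_se = 2.902579, 0.171714

ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
std_error = math.sqrt(ss_residual / (n - 2))       # root mean squared residual
f_stat = ss_regression / (ss_residual / (n - 2))   # MS regression / MS residual
t_stat = slope / slope_se                          # in simple regression, t**2 == F

def predict_unemployment(interest_rate):
    """Fitted value from the simple linear model."""
    return intercept + slope * interest_rate

print(f"R Square = {r_square:.6f}, Adjusted = {adj_r_square:.6f}")
print(f"Standard Error = {std_error:.6f}, F = {f_stat:.2f}, t = {t_stat:.3f}")
print(f"Predicted unemployment at a rate of 9.0: {predict_unemployment(9.0):.2f}")
```

The recomputed values should agree with the table up to rounding, which is a quick way to confirm the output has been transcribed correctly.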

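The fitted simple linear model is: Unemployment rate = -19.3726 + 2.9026 × Interest rate. It explains about 94.7% of the variation in the unemployment rate (R Square = 0.947) with a standard error of 0.789, and the slope is statistically significant (P-value = 1.26E-11).

Two forms were considered for the relationship: linear and quadratic. The relationship is best described as quadratic, that is, curvilinear.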
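One way to check a claim like this is to fit both a straight line and a quadratic and compare how well each fits. The sketch below does that with a small least-squares solver. The data points are hypothetical stand-ins (the Mississippi file is not reproduced here), and the helper names are illustrative.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (low degree only).
    Returns coefficients c where c[k] multiplies x**k."""
    m = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def r_squared(xs, ys, coeffs):
    """Share of the variation in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    return 1 - ss_resid / ss_total

# HYPOTHETICAL data: interest rate (x) and unemployment rate (y).
rates = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
unemployment = [2.1, 3.0, 4.6, 7.1, 10.4, 14.9]

linear = fit_polynomial(rates, unemployment, 1)
quadratic = fit_polynomial(rates, unemployment, 2)
print(f"linear R^2    = {r_squared(rates, unemployment, linear):.4f}")
print(f"quadratic R^2 = {r_squared(rates, unemployment, quadratic):.4f}")
```

A quadratic never fits worse than a line on the same data, so the comparison is only meaningful if the improvement is substantial; adjusted R Square or a test on the squared term is the more careful check.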