Data Quality:

It is important to make sure the data you use is valid. A single outlier can dramatically reduce the fit of a model, so it is critical that bad data points be removed. In the case of the Store24 data, we will assume that all managers have some experience, so remove any data points where the manager experience is zero.
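If you prefer to check the cleanup outside the spreadsheet, the filter can be sketched in a few lines of Python. This is only an illustration: it assumes manager experience is recorded in a column called MTenure, and the three rows shown are made-up placeholders, not values from the case.

```python
# Hypothetical rows for illustration only; the real Store24 worksheet
# has more columns, and its values will differ.
stores = [
    {"store": 1, "MTenure": 0.0, "Profit": 100.0},
    {"store": 2, "MTenure": 12.5, "Profit": 150.0},
    {"store": 3, "MTenure": 3.0, "Profit": 120.0},
]


def drop_zero_tenure(rows, key="MTenure"):
    """Return only the rows whose manager tenure is greater than zero."""
    return [row for row in rows if row[key] > 0]


cleaned = drop_zero_tenure(stores)
# Report how many rows were dropped so the removal can be noted in the memo.
print(len(stores) - len(cleaned), "row(s) removed")
```

Keeping a count of the removed rows makes it easy to state in your report how many data points were excluded and why.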

Regression Analysis

First, you should run a full model for profit that includes both tenure and site location related variables. Tenure related variables are MTenure and CTenure. Site location related variables are population, number of competitors, street level visibility, pedestrian access, type of neighborhood, and whether a store stays open 24 hours. These variables are also defined on page 4 of the case Store24 (A).

Once the full model has run, determine whether all variables contribute to our understanding of the model. Use the p-value for each coefficient to decide (a cutoff of 0.05 is typically used to decide whether a variable should be included). If any variables are not significant, copy the worksheet, remove the variable, and run the regression again. In your report you should explain how well the model fits (i.e., how well it describes the factors that impact profit).

It is not commonly understood how to evaluate the “impact” of the independent variables. A variable must have a significant p-value (otherwise we can’t say there is a relationship), but how small the p-value is does not tell us how important the variable is. A good way to understand the impact of a variable is to find the range of values it takes in the data (maximum minus minimum), and then multiply that range by the value of its coefficient. That tells you the maximum impact that the variable can have on predicted profit.
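The range-times-coefficient calculation can be written out as a short sketch. The function name max_impact, the sample values, and the coefficient below are all hypothetical placeholders; substitute the column values and the coefficient from your own regression output.

```python
def max_impact(values, coefficient):
    """Maximum impact of a variable: (max - min of its values) * coefficient."""
    return (max(values) - min(values)) * coefficient


# Placeholder numbers, not taken from the Store24 data.
example_values = [1000, 5000, 12000]
example_coefficient = 2.5

impact = max_impact(example_values, example_coefficient)
print("Maximum impact on predicted profit:", impact)
```

Computing this for every significant variable gives you a common scale (dollars of profit) on which to compare them, which the p-values alone do not provide.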

Next, you must address Tom Hart’s hypothesis that manager tenure does not have a linear impact on profitability; that is, that there are diminishing returns to manager tenure. To test this, copy the worksheet, then add the variable MTenure2. To do this, insert a column next to the MTenure column, and then enter the formula =D2^2 in cell E2 and copy this formula to the rest of the cells. Now run a regression on this new set of variables and see if the MTenure2 variable is statistically significant.
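For reference, the spreadsheet step of filling a column with =D2^2 corresponds to the following sketch in Python. The helper name add_squared_term and the sample rows are assumptions made for illustration; only the column names MTenure and MTenure2 come from the assignment.

```python
def add_squared_term(rows, source="MTenure", target="MTenure2"):
    """Return copies of the rows with an added column holding source squared."""
    return [{**row, target: row[source] ** 2} for row in rows]


# Placeholder rows, not values from the case.
sample = [{"MTenure": 3.0}, {"MTenure": 12.0}]
with_square = add_squared_term(sample)
print(with_square)
```

If the hypothesis of diminishing returns holds, you would expect the fitted coefficient on MTenure to be positive and the coefficient on MTenure2 to be negative, so that each additional month of tenure adds less profit than the one before.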

What to Submit: You are to write a memo from Sarah Jenkins to Paul Doucette summarizing your results. You should explain your regression results:

* How well the model predicts store performance (R-squared, the p-value of each variable, and each variable’s “impact”).
* How your MTenure2 regression assesses Tom Hart’s hypothesis. Does it support his hypothesis? You should include a graph that shows the contribution to profit of manager tenure over the range of values in the data set. The x-axis should be manager tenure, and the y-axis should be the predicted contribution to store profit.
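The points for the requested graph can be generated as sketched below. This assumes your final model contains both MTenure and MTenure2, so that the predicted contribution of tenure is b1 * MTenure + b2 * MTenure^2. The coefficients b1 and b2 and the tenure range used here are placeholders chosen for illustration, not results from the Store24 data.

```python
def tenure_contribution(tenure, b1, b2):
    """Predicted profit contribution of tenure: b1 * tenure + b2 * tenure^2."""
    return b1 * tenure + b2 * tenure ** 2


def contribution_curve(t_min, t_max, b1, b2, steps=10):
    """Evenly spaced (tenure, contribution) points from t_min to t_max."""
    width = (t_max - t_min) / steps
    points = []
    for i in range(steps + 1):
        t = t_min + i * width
        points.append((t, tenure_contribution(t, b1, b2)))
    return points


# Placeholder coefficients and range; use your own regression output and
# the minimum and maximum MTenure found in the cleaned data.
b1, b2 = 10.0, -0.5
curve = contribution_curve(2.0, 10.0, b1, b2, steps=4)
for tenure, contribution in curve:
    print(tenure, contribution)
```

Plotting these points with tenure on the x-axis and contribution on the y-axis gives the curve for the memo. With a negative b2 the curve flattens as tenure grows, which is the pattern Tom Hart’s hypothesis predicts.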

Finally, your memo should give Paul Doucette a concrete recommendation as to how much Store24 should invest in any new manager retention programs.