
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: Re: R-squared for xtreg, re
Date: Thu, 18 Sep 2003 00:24:29 +0100

Mehmet Beceren

> Does anybody know, if there is a way to make Stata give us the
> R-squared for the random effects panel regressions? Or, is it
> nonsense to report them for random effects estimations?

Questions of this form arise quite frequently. I'll essay a general and discursive answer, and let others chime in with technicalities, especially if the technicalities invalidate what I'm going to say, either in general or with particular models.

1. If Stata refuses to give you an R-squared, there may be a good explanation, other than that the developers never got around to implementing this. Perhaps the R-squared doesn't seem to be a good measure for this model, on some technical grounds. You have to consult the advanced literature or an expert to take this further, unless you yourself are an expert, in which case you probably disagree with the other experts.

2. There is usually something you can do for yourself: calculate the correlation between the observed response and the predicted response, and then square it. Here is the general idea:

    . regress weight length
    . predict weightp if e(sample)
    . corr weight weightp if e(sample)
    . di r(rho)^2

Naturally, in this example, you get an R-squared anyway, so you need not do this. But you can check here that you get the result you think you should get. Two crucial details to note:

a. The predicted response must be on the same scale as the response, up to a linear transformation.

b. Use -if e(sample)- to make sure everything is done for the estimation sample only. (In this example, the second -if e(sample)- is redundant given the first, but it does no harm.)

3. For many models, especially those with categorical responses, there are often several different supposed approximations or analogues to R-squared. Often they are labelled "pseudo". Beware that they typically do not agree, even roughly. You need to look at literature in your field and to realise that software and papers may often be unclear about precisely what was calculated. Thus if you apply the calculation in point 2 after -logit- you will find that the result is _not_ what -logit- reports as pseudo R-squared. (What that actually is does not appear to be documented, thus exemplifying my assertion.)

4. Even if you now have an R-squared, it is only a single figure of merit. Resist the temptation to use it as a weapon or as a comforter. Your R-squared may be high because your model codifies tautology or truism; or your R-squared may be low, but no indictment of your model, if the field is refractory and your dataset is problematic. There is likely to be a great deal of information about the limitations of the model, with implications for how it can be improved, in the detailed estimation results and residuals you can usually get from Stata. There is almost no such information in an R-squared.

5. Even if you now have an R-squared, it is at best a descriptive measure. It takes into consideration only the information on which it is based, and says nothing about the structure of the data in any sense (e.g. dependence or group structure).

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
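Editorial sketch: the recipe in point 2 can be carried over to the random-effects model in the original question roughly as follows. The dataset (nlswork, fetched with -webuse-) and the variable names are assumptions chosen only for illustration; they are not the original poster's data. The final line assumes that -xtreg- stores an overall R-squared in e(r2_o), as current Stata versions do to my understanding, so treat it as a cross-check rather than a guarantee.

```stata
* Illustrative data only: nlswork is a Stata example dataset, not the poster's
webuse nlswork, clear
xtset idcode year

* Random-effects panel regression
xtreg ln_wage age tenure, re

* Linear prediction (xb), on the same scale as the response (point 2a),
* restricted to the estimation sample (point 2b)
predict double ln_wage_hat if e(sample), xb

* Squared correlation between observed and predicted response
corr ln_wage ln_wage_hat if e(sample)
di r(rho)^2

* Cross-check against the overall R-squared stored by xtreg (assumed: e(r2_o))
di e(r2_o)
```

If the two displayed figures differ, the likeliest culprits are the two details flagged in point 2: a prediction on a different scale from the response, or a sample that does not match the estimation sample. The caveats in points 4 and 5 apply to this figure as much as to any other R-squared.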

