1""" 

2Linear mixed effects models are regression models for dependent data. 

3They can be used to estimate regression relationships involving both 

4means and variances. 

5 

6These models are also known as multilevel linear models, and 

7hierarchical linear models. 

8 

9The MixedLM class fits linear mixed effects models to data, and 

10provides support for some common post-estimation tasks. This is a 

11group-based implementation that is most efficient for models in which 

12the data can be partitioned into independent groups. Some models with 

13crossed effects can be handled by specifying a model with a single 

14group. 

15 

16The data are partitioned into disjoint groups. The probability model 

17for group i is: 

18 

19Y = X*beta + Z*gamma + epsilon 

20 

21where 

22 

23* n_i is the number of observations in group i 

24 

25* Y is a n_i dimensional response vector (called endog in MixedLM) 

26 

27* X is a n_i x k_fe dimensional design matrix for the fixed effects 

28 (called exog in MixedLM) 

29 

30* beta is a k_fe-dimensional vector of fixed effects parameters 

31 (called fe_params in MixedLM) 

32 

33* Z is a design matrix for the random effects with n_i rows (called 

34 exog_re in MixedLM). The number of columns in Z can vary by group 

35 as discussed below. 

36 

37* gamma is a random vector with mean 0. The covariance matrix for the 

38 first `k_re` elements of `gamma` (called cov_re in MixedLM) is 

39 common to all groups. The remaining elements of `gamma` are 

40 variance components as discussed in more detail below. Each group 

41 receives its own independent realization of gamma. 

42 

43* epsilon is a n_i dimensional vector of iid normal 

44 errors with mean 0 and variance sigma^2; the epsilon 

45 values are independent both within and between groups 

46 
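
As a concrete illustration, one group's data can be simulated directly
from this model.  The sketch below is illustrative only (the names
``rng``, ``n_i``, ``beta`` and so on are not part of the API):

    import numpy as np

    rng = np.random.default_rng(0)
    n_i, k_fe, k_re = 10, 3, 2
    X = rng.normal(size=(n_i, k_fe))
    Z = rng.normal(size=(n_i, k_re))
    beta = np.r_[1.0, -1.0, 0.5]             # fixed effects
    cov_re = np.array([[1.0, 0.2],
                       [0.2, 0.5]])          # Psi, shared by all groups
    sigma2 = 1.5                             # error variance (scale)

    gamma = rng.multivariate_normal(np.zeros(k_re), cov_re)
    epsilon = np.sqrt(sigma2) * rng.normal(size=n_i)
    Y = X @ beta + Z @ gamma + epsilon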

Y, X and Z must be entirely observed.  beta, Psi, and sigma^2 are
estimated using ML or REML estimation; gamma and epsilon are random,
so they define the probability model rather than being estimated.

The marginal mean structure is E[Y | X, Z] = X*beta.  If only the mean
structure is of interest, GEE is an alternative to using linear mixed
models.

Two types of random effects are supported.  Standard random effects
are correlated with each other in arbitrary ways.  Every group has the
same number (`k_re`) of standard random effects, with the same joint
distribution (but with independent realizations across the groups).

Variance components are uncorrelated with each other, and with the
standard random effects.  Each variance component has mean zero, and
all realizations of a given variance component have the same variance
parameter.  The number of realized variance components per variance
parameter can differ across the groups.

The primary reference for the implementation details is:

MJ Lindstrom, DM Bates (1988).  "Newton Raphson and EM algorithms for
linear mixed effects models for repeated measures data".  Journal of
the American Statistical Association.  Volume 83, Issue 404, pages
1014-1022.

See also this more recent document:

http://econ.ucsb.edu/~doug/245a/Papers/Mixed%20Effects%20Implement.pdf

All the likelihood, gradient, and Hessian calculations closely follow
Lindstrom and Bates 1988, adapted to support variance components.

The following two documents are written more from the perspective of
users:

http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf

http://lme4.r-forge.r-project.org/slides/2009-07-07-Rennes/3Longitudinal-4.pdf

Notation:

* `cov_re` is the random effects covariance matrix (referred to above
  as Psi) and `scale` is the (scalar) error variance.  For a single
  group, the marginal covariance matrix of endog given exog is
  scale*I + Z * cov_re * Z', where Z is the design matrix for the
  random effects in one group.

* `vcomp` is a vector of variance parameters.  The length of `vcomp`
  is determined by the number of keys in either the `exog_vc` argument
  to ``MixedLM``, or the `vc_formula` argument when using formulas to
  fit a model.
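
Continuing the simulation sketch above, the marginal covariance of one
group's response (with no variance components) can be formed
explicitly:

    # cov(Y | X, Z) for a single group: scale*I + Z * cov_re * Z'
    V = sigma2 * np.eye(n_i) + Z @ cov_re @ Z.T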

Notes:

1. Three different parameterizations are used in different places.
The regression slopes (usually called `fe_params`) are identical in
all three parameterizations, but the variance parameters differ.  The
parameterizations are:

* The "user parameterization" in which cov(endog) = scale*I + Z *
  cov_re * Z', as described above.  This is the main parameterization
  visible to the user.

* The "profile parameterization" in which cov(endog) = I +
  Z * cov_re1 * Z'.  This is the parameterization of the profile
  likelihood that is maximized to produce parameter estimates
  (see Lindstrom and Bates for details).  The "user" cov_re is
  equal to the "profile" cov_re1 times the scale.

* The "square root parameterization" in which we work with the
  Cholesky factor of cov_re1 instead of cov_re1 directly.  This is
  hidden from the user.

All three parameterizations can be packed into a vector by
(optionally) concatenating `fe_params` together with the lower
triangle or Cholesky square root of the dependence structure, followed
by the variance parameters for the variance components.  They are
stored as square roots if (and only if) the random effects covariance
matrix is stored as its Cholesky factor.  Note that when unpacking, it
is important to either square or reflect the dependence structure
depending on which parameterization is being used.
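
For example, a packing round trip (illustrative values; assumes numpy
is imported as np) might look like:

    from statsmodels.regression.mixed_linear_model import MixedLMParams

    pa = MixedLMParams.from_components(
        fe_params=np.r_[1.0, -0.5],
        cov_re=np.array([[1.0, 0.2], [0.2, 0.5]]),
        vcomp=np.r_[0.3])
    vec = pa.get_packed(use_sqrt=True, has_fe=True)
    pa2 = MixedLMParams.from_packed(
        vec, k_fe=2, k_re=2, use_sqrt=True, has_fe=True)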

Two score methods are implemented.  One takes the score with respect
to the elements of the random effects covariance matrix (used for
inference once the MLE is reached), and the other takes the score with
respect to the parameters of the Cholesky square root of the random
effects covariance matrix (used for optimization).

The numerical optimization uses GLS to avoid explicitly optimizing
over the fixed effects parameters.  The likelihood that is optimized
is profiled over both the scale parameter (a scalar) and the fixed
effects parameters (if any).  As a result of this profiling, it is
difficult and unnecessary to calculate the Hessian of the profiled log
likelihood function, so that calculation is not implemented here.
Therefore, optimization methods requiring the Hessian matrix such as
the Newton-Raphson algorithm cannot be used for model fitting.
"""

import numpy as np
import statsmodels.base.model as base
from statsmodels.tools.decorators import cache_readonly
from statsmodels.tools import data as data_tools
from scipy.stats.distributions import norm
from scipy import sparse
import pandas as pd
import patsy
from collections import OrderedDict
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning
from statsmodels.base._penalties import Penalty


def _dot(x, y):
    """
    Returns the dot product of the arrays, works for sparse and dense.
    """

    if isinstance(x, np.ndarray) and isinstance(y, np.ndarray):
        return np.dot(x, y)
    elif sparse.issparse(x):
        return x.dot(y)
    elif sparse.issparse(y):
        return y.T.dot(x.T).T


# From numpy, adapted to work with sparse and dense arrays.
def _multi_dot_three(A, B, C):
    """
    Find the best ordering for three arrays and do the multiplication.

    Doing it manually instead of using dynamic programming is
    approximately 15 times faster.
    """
    # cost1 = cost((AB)C)
    cost1 = (A.shape[0] * A.shape[1] * B.shape[1] +  # (AB)
             A.shape[0] * B.shape[1] * C.shape[1])   # (--)C
    # cost2 = cost(A(BC))
    cost2 = (B.shape[0] * B.shape[1] * C.shape[1] +  # (BC)
             A.shape[0] * A.shape[1] * C.shape[1])   # A(--)

    if cost1 < cost2:
        return _dot(_dot(A, B), C)
    else:
        return _dot(A, _dot(B, C))


def _dotsum(x, y):
    """
    Returns sum(x * y), where '*' is the pointwise product, computed
    efficiently for dense and sparse matrices.
    """

    if sparse.issparse(x):
        return x.multiply(y).sum()
    else:
        # This way usually avoids allocating a temporary.
        return np.dot(x.ravel(), y.ravel())


class VCSpec(object):
    """
    Define the variance component structure of a multilevel model.

    An instance of the class contains three attributes:

    - names : names[k] is the name of variance component k.

    - mats : mats[k][i] is the design matrix for group index
      i in variance component k.

    - colnames : colnames[k][i] is the list of column names for
      mats[k][i].

    The groups in colnames and mats must be in sorted order.
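
    For example (a sketch; names and shapes are illustrative), a
    single variance component named 'classroom', observed in two
    groups with 3 and 2 classrooms respectively, could be specified
    as:

        import numpy as np

        mats = [[np.ones((5, 3)), np.ones((4, 2))]]
        colnames = [[['c1', 'c2', 'c3'], ['c1', 'c2']]]
        spec = VCSpec(['classroom'], colnames, mats)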

222 """ 

223 

224 def __init__(self, names, colnames, mats): 

225 self.names = names 

226 self.colnames = colnames 

227 self.mats = mats 

228 

229 

def _get_exog_re_names(self, exog_re):
    """
    Passes through if given a list of names.  Otherwise, gets pandas
    names or creates some generic variable names as needed.
    """
    if self.k_re == 0:
        return []
    if isinstance(exog_re, pd.DataFrame):
        return exog_re.columns.tolist()
    elif isinstance(exog_re, pd.Series) and exog_re.name is not None:
        return [exog_re.name]
    elif isinstance(exog_re, list):
        return exog_re

    # Default names
    defnames = ["x_re{0:1d}".format(k + 1) for k in range(exog_re.shape[1])]
    return defnames


class MixedLMParams(object):
    """
    This class represents a parameter state for a mixed linear model.

    Parameters
    ----------
    k_fe : int
        The number of covariates with fixed effects.
    k_re : int
        The number of covariates with random coefficients (excluding
        variance components).
    k_vc : int
        The number of variance components parameters.

    Notes
    -----
    This object represents the parameter state for the model in which
    the scale parameter has been profiled out.
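
    Examples
    --------
    A minimal sketch with illustrative values:

    >>> import numpy as np
    >>> pa = MixedLMParams.from_components(
    ...     fe_params=np.r_[1.0, -0.5], cov_re=np.eye(2))
    >>> pa.k_fe, pa.k_re
    (2, 2)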

267 """ 

268 

269 def __init__(self, k_fe, k_re, k_vc): 

270 

271 self.k_fe = k_fe 

272 self.k_re = k_re 

273 self.k_re2 = k_re * (k_re + 1) // 2 

274 self.k_vc = k_vc 

275 self.k_tot = self.k_fe + self.k_re2 + self.k_vc 

276 self._ix = np.tril_indices(self.k_re) 

277 

    def from_packed(params, k_fe, k_re, use_sqrt, has_fe):
        """
        Create a MixedLMParams object from a packed parameter vector.

        Parameters
        ----------
        params : array_like
            The model parameters packed into a single vector.
        k_fe : int
            The number of covariates with fixed effects.
        k_re : int
            The number of covariates with random effects (excluding
            variance components).
        use_sqrt : bool
            If True, the random effects covariance matrix is provided
            as its Cholesky factor, otherwise the lower triangle of
            the covariance matrix is stored.
        has_fe : bool
            If True, `params` contains fixed effects parameters.
            Otherwise, the fixed effects parameters are set to zero.

        Returns
        -------
        A MixedLMParams object.
        """

        k_re2 = int(k_re * (k_re + 1) / 2)

        # The number of covariance parameters.
        if has_fe:
            k_vc = len(params) - k_fe - k_re2
        else:
            k_vc = len(params) - k_re2

        pa = MixedLMParams(k_fe, k_re, k_vc)

        cov_re = np.zeros((k_re, k_re))
        ix = pa._ix
        if has_fe:
            pa.fe_params = params[0:k_fe]
            cov_re[ix] = params[k_fe:k_fe+k_re2]
        else:
            pa.fe_params = np.zeros(k_fe)
            cov_re[ix] = params[0:k_re2]

        if use_sqrt:
            cov_re = np.dot(cov_re, cov_re.T)
        else:
            cov_re = (cov_re + cov_re.T) - np.diag(np.diag(cov_re))

        pa.cov_re = cov_re
        if k_vc > 0:
            if use_sqrt:
                pa.vcomp = params[-k_vc:]**2
            else:
                pa.vcomp = params[-k_vc:]
        else:
            pa.vcomp = np.array([])

        return pa

    from_packed = staticmethod(from_packed)

    def from_components(fe_params=None, cov_re=None, cov_re_sqrt=None,
                        vcomp=None):
        """
        Create a MixedLMParams object from each parameter component.

        Parameters
        ----------
        fe_params : array_like
            The fixed effects parameter (a 1-dimensional array).  If
            None, there are no fixed effects.
        cov_re : array_like
            The random effects covariance matrix (a square, symmetric
            2-dimensional array).
        cov_re_sqrt : array_like
            The Cholesky (lower triangular) square root of the random
            effects covariance matrix.
        vcomp : array_like
            The variance component parameters.  If None, there are no
            variance components.

        Returns
        -------
        A MixedLMParams object.
        """

        if vcomp is None:
            vcomp = np.empty(0)
        if fe_params is None:
            fe_params = np.empty(0)
        if cov_re is None and cov_re_sqrt is None:
            cov_re = np.empty((0, 0))

        k_fe = len(fe_params)
        k_vc = len(vcomp)
        k_re = cov_re.shape[0] if cov_re is not None else cov_re_sqrt.shape[0]

        pa = MixedLMParams(k_fe, k_re, k_vc)
        pa.fe_params = fe_params
        if cov_re_sqrt is not None:
            pa.cov_re = np.dot(cov_re_sqrt, cov_re_sqrt.T)
        elif cov_re is not None:
            pa.cov_re = cov_re

        pa.vcomp = vcomp

        return pa

    from_components = staticmethod(from_components)

    def copy(self):
        """
        Returns a copy of the object.
        """
        obj = MixedLMParams(self.k_fe, self.k_re, self.k_vc)
        obj.fe_params = self.fe_params.copy()
        obj.cov_re = self.cov_re.copy()
        obj.vcomp = self.vcomp.copy()
        return obj

    def get_packed(self, use_sqrt, has_fe=False):
        """
        Return the model parameters packed into a single vector.

        Parameters
        ----------
        use_sqrt : bool
            If True, the Cholesky square root of `cov_re` is
            included in the packed result.  Otherwise the
            lower triangle of `cov_re` is included.
        has_fe : bool
            If True, the fixed effects parameters are included
            in the packed result, otherwise they are omitted.
        """

        if self.k_re > 0:
            if use_sqrt:
                L = np.linalg.cholesky(self.cov_re)
                cpa = L[self._ix]
            else:
                cpa = self.cov_re[self._ix]
        else:
            cpa = np.zeros(0)

        if use_sqrt:
            vcomp = np.sqrt(self.vcomp)
        else:
            vcomp = self.vcomp

        if has_fe:
            pa = np.concatenate((self.fe_params, cpa, vcomp))
        else:
            pa = np.concatenate((cpa, vcomp))

        return pa


def _smw_solver(s, A, AtA, Qi, di):
    r"""
    Returns a solver for the linear system:

    .. math::

        (sI + ABA^\prime) y = x

    The returned function f satisfies f(x) = y as defined above.

    B and its inverse matrix are block diagonal.  The upper left block
    of :math:`B^{-1}` is Qi and its lower right block is diag(di).

    Parameters
    ----------
    s : scalar
        See above for usage
    A : ndarray
        p x q matrix, in general q << p, may be sparse.
    AtA : square ndarray
        :math:`A^\prime A`, a q x q matrix.
    Qi : square symmetric ndarray
        The matrix `B` is q x q, where q = r + d.  `B` consists of a r
        x r diagonal block whose inverse is `Qi`, and a d x d diagonal
        block, whose inverse is diag(di).
    di : 1d array_like
        See documentation for Qi.

    Returns
    -------
    A function for solving a linear system, as documented above.

    Notes
    -----
    Uses the Sherman-Morrison-Woodbury identity:
    https://en.wikipedia.org/wiki/Woodbury_matrix_identity
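
    Examples
    --------
    A small sketch with illustrative values, checking the solver
    against a dense solve.  Here B = diag(1, 1, 1/di[0]), so the
    upper-left block of B^{-1} is Qi = I and the lower-right block is
    diag(di):

    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> A = rng.normal(size=(10, 3))
    >>> Qi, di, s = np.eye(2), np.r_[2.0], 1.5
    >>> B = np.diag(np.r_[1.0, 1.0, 1 / di[0]])
    >>> f = _smw_solver(s, A, A.T @ A, Qi, di)
    >>> x = rng.normal(size=10)
    >>> V = s * np.eye(10) + A @ B @ A.T
    >>> np.allclose(f(x), np.linalg.solve(V, x))
    True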

472 """ 

473 

474 # Use SMW identity 

475 qmat = AtA / s 

476 if sparse.issparse(qmat): 

477 qmat = qmat.todense() 

478 m = Qi.shape[0] 

479 qmat[0:m, 0:m] += Qi 

480 d = qmat.shape[0] 

481 qmat.flat[m*(d+1)::d+1] += di 

482 if sparse.issparse(A): 

483 qmati = sparse.linalg.spsolve(sparse.csc_matrix(qmat), A.T) 

484 else: 

485 qmati = np.linalg.solve(qmat, A.T) 

486 

487 if sparse.issparse(A): 

488 def solver(rhs): 

489 ql = qmati.dot(rhs) 

490 ql = A.dot(ql) 

491 return rhs / s - ql / s**2 

492 else: 

493 def solver(rhs): 

494 ql = np.dot(qmati, rhs) 

495 ql = np.dot(A, ql) 

496 return rhs / s - ql / s**2 

497 

498 return solver 

499 

500 

def _smw_logdet(s, A, AtA, Qi, di, B_logdet):
    r"""
    Returns the log determinant of

    .. math::

        sI + ABA^\prime

    Uses the matrix determinant lemma to accelerate the calculation.
    B is assumed to be positive definite, and s > 0, therefore the
    determinant is positive.

    Parameters
    ----------
    s : positive scalar
        See above for usage
    A : ndarray
        p x q matrix, in general q << p.
    AtA : square ndarray
        :math:`A^\prime A`, a q x q matrix.
    Qi : square symmetric ndarray
        The matrix `B` is q x q, where q = r + d.  `B` consists of a r
        x r diagonal block whose inverse is `Qi`, and a d x d diagonal
        block, whose inverse is diag(di).
    di : 1d array_like
        See documentation for Qi.
    B_logdet : real
        The log determinant of B

    Returns
    -------
    The log determinant of s*I + A*B*A'.

    Notes
    -----
    Uses the matrix determinant lemma:
    https://en.wikipedia.org/wiki/Matrix_determinant_lemma
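
    Examples
    --------
    A sketch with illustrative values (the same setup as the
    ``_smw_solver`` example), checking against a dense computation:

    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> A = rng.normal(size=(10, 3))
    >>> Qi, di, s = np.eye(2), np.r_[2.0], 1.5
    >>> B = np.diag(np.r_[1.0, 1.0, 1 / di[0]])
    >>> V = s * np.eye(10) + A @ B @ A.T
    >>> ld = _smw_logdet(s, A, A.T @ A, Qi, di, np.log(1 / di[0]))
    >>> np.allclose(ld, np.linalg.slogdet(V)[1])
    True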

538 """ 

539 

540 p = A.shape[0] 

541 ld = p * np.log(s) 

542 qmat = AtA / s 

543 m = Qi.shape[0] 

544 qmat[0:m, 0:m] += Qi 

545 d = qmat.shape[0] 

546 qmat.flat[m*(d+1)::d+1] += di 

547 _, ld1 = np.linalg.slogdet(qmat) 

548 return B_logdet + ld + ld1 

549 

550 

def _convert_vc(exog_vc):
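    """
    Convert the deprecated dictionary-of-dictionaries variance
    component specification into a VCSpec instance, with the component
    names and group labels in sorted order.
    """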

    vc_names = []
    vc_colnames = []
    vc_mats = []

    # Get the groups in sorted order
    groups = set([])
    for k, v in exog_vc.items():
        groups |= set(v.keys())
    groups = list(groups)
    groups.sort()

    for k, v in exog_vc.items():
        vc_names.append(k)
        colnames, mats = [], []
        for g in groups:
            try:
                colnames.append(v[g].columns)
            except AttributeError:
                colnames.append([str(j) for j in range(v[g].shape[1])])
            mats.append(v[g])
        vc_colnames.append(colnames)
        vc_mats.append(mats)

    ii = np.argsort(vc_names)
    vc_names = [vc_names[i] for i in ii]
    vc_colnames = [vc_colnames[i] for i in ii]
    vc_mats = [vc_mats[i] for i in ii]

    return VCSpec(vc_names, vc_colnames, vc_mats)


class MixedLM(base.LikelihoodModel):
    """
    Linear Mixed Effects Model

    Parameters
    ----------
    endog : 1d array_like
        The dependent variable
    exog : 2d array_like
        A matrix of covariates used to determine the
        mean structure (the "fixed effects" covariates).
    groups : 1d array_like
        A vector of labels determining the groups -- data from
        different groups are independent
    exog_re : 2d array_like
        A matrix of covariates used to determine the variance and
        covariance structure (the "random effects" covariates).  If
        None, defaults to a random intercept for each group.
    exog_vc : VCSpec instance or dict-like (deprecated)
        A VCSpec instance defines the structure of the variance
        components in the model.  Alternatively, see notes below
        for a dictionary-based format.  The dictionary format is
        deprecated and may be removed at some point in the future.
    use_sqrt : bool
        If True, optimization is carried out using the lower
        triangle of the square root of the random effects
        covariance matrix, otherwise it is carried out using the
        lower triangle of the random effects covariance matrix.
    missing : str
        The approach to missing data handling

    Notes
    -----
    If `exog_vc` is not a `VCSpec` instance, then it must be a
    dictionary of dictionaries.  Specifically, `exog_vc[a][g]` is a
    matrix whose columns are linearly combined using independent
    random coefficients.  This random term then contributes to the
    variance structure of the data for group `g`.  The random
    coefficients all have mean zero, and have the same variance.  The
    matrix must be `m x k`, where `m` is the number of observations in
    group `g`.  The number of columns may differ among the top-level
    groups.

    The covariates in `exog`, `exog_re` and `exog_vc` may (but need
    not) partially or wholly overlap.

    `use_sqrt` should almost always be set to True.  The main use case
    for use_sqrt=False is when complicated patterns of fixed values in
    the covariance structure are set (using the `free` argument to
    `fit`) that cannot be expressed in terms of the Cholesky factor L.

    Examples
    --------
    A basic mixed model with fixed effects for the columns of
    ``exog`` and a random intercept for each distinct value of
    ``group``:

    >>> model = sm.MixedLM(endog, exog, groups)
    >>> result = model.fit()

    A mixed model with fixed effects for the columns of ``exog`` and
    correlated random coefficients for the columns of ``exog_re``:

    >>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
    >>> result = model.fit()

    A mixed model with fixed effects for the columns of ``exog`` and
    independent random coefficients for the columns of ``exog_re``:

    >>> free = MixedLMParams.from_components(
    ...     fe_params=np.ones(exog.shape[1]),
    ...     cov_re=np.eye(exog_re.shape[1]))
    >>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
    >>> result = model.fit(free=free)

    A different way to specify independent random coefficients for the
    columns of ``exog_re``.  In this example ``groups`` must be a
    Pandas Series with compatible indexing with ``exog_re``, and
    ``exog_re`` has two columns.

    >>> g = groups.groupby(groups).groups
    >>> vc = {}
    >>> vc['1'] = {k: exog_re.loc[g[k], 0] for k in g}
    >>> vc['2'] = {k: exog_re.loc[g[k], 1] for k in g}
    >>> model = sm.MixedLM(endog, exog, groups, exog_vc=vc)
    >>> result = model.fit()
    """

    def __init__(self, endog, exog, groups, exog_re=None,
                 exog_vc=None, use_sqrt=True, missing='none',
                 **kwargs):

        _allowed_kwargs = ["missing_idx", "design_info", "formula"]
        for x in kwargs.keys():
            if x not in _allowed_kwargs:
                raise ValueError(
                    "argument %s not permitted for MixedLM initialization"
                    % x)

        self.use_sqrt = use_sqrt

        # Some defaults
        self.reml = True
        self.fe_pen = None
        self.re_pen = None

        if isinstance(exog_vc, dict):
            warnings.warn("Using deprecated variance components format")
            # Convert from old to new representation
            exog_vc = _convert_vc(exog_vc)

        if exog_vc is not None:
            self.k_vc = len(exog_vc.names)
            self.exog_vc = exog_vc
        else:
            self.k_vc = 0
            self.exog_vc = VCSpec([], [], [])

        # If there is one covariate, it may be passed in as a column
        # vector, convert these to 2d arrays.
        # TODO: Can this be moved up in the class hierarchy?
        #       yes, it should be done up the hierarchy
        if (exog is not None and
                data_tools._is_using_ndarray_type(exog, None) and
                exog.ndim == 1):
            exog = exog[:, None]
        if (exog_re is not None and
                data_tools._is_using_ndarray_type(exog_re, None) and
                exog_re.ndim == 1):
            exog_re = exog_re[:, None]

        # Calling super creates self.endog, etc. as ndarrays and the
        # original exog, endog, etc. are self.data.endog, etc.
        super(MixedLM, self).__init__(endog, exog, groups=groups,
                                      exog_re=exog_re, missing=missing,
                                      **kwargs)

        self._init_keys.extend(["use_sqrt", "exog_vc"])

        # Number of fixed effects parameters
        self.k_fe = exog.shape[1]

        if exog_re is None and len(self.exog_vc.names) == 0:
            # Default random effects structure (random intercepts).
            self.k_re = 1
            self.k_re2 = 1
            self.exog_re = np.ones((len(endog), 1), dtype=np.float64)
            self.data.exog_re = self.exog_re
            names = ['Group Var']
            self.data.param_names = self.exog_names + names
            self.data.exog_re_names = names
            self.data.exog_re_names_full = names

        elif exog_re is not None:
            # Process exog_re the same way that exog is handled
            # upstream
            # TODO: this is wrong and should be handled upstream wholly
            self.data.exog_re = exog_re
            self.exog_re = np.asarray(exog_re)
            if self.exog_re.ndim == 1:
                self.exog_re = self.exog_re[:, None]
            # Model dimensions
            # Number of random effect covariates
            self.k_re = self.exog_re.shape[1]
            # Number of covariance parameters
            self.k_re2 = self.k_re * (self.k_re + 1) // 2

        else:
            # All random effects are variance components
            self.k_re = 0
            self.k_re2 = 0

        if not self.data._param_names:
            # HACK: could have been set in from_formula already
            # needs refactor
            (param_names, exog_re_names,
             exog_re_names_full) = self._make_param_names(exog_re)
            self.data.param_names = param_names
            self.data.exog_re_names = exog_re_names
            self.data.exog_re_names_full = exog_re_names_full

        self.k_params = self.k_fe + self.k_re2

        # Convert the data to the internal representation, which is a
        # list of arrays, corresponding to the groups.
        group_labels = list(set(groups))
        group_labels.sort()
        row_indices = dict((s, []) for s in group_labels)
        for i, g in enumerate(groups):
            row_indices[g].append(i)
        self.row_indices = row_indices
        self.group_labels = group_labels
        self.n_groups = len(self.group_labels)

        # Split the data by groups
        self.endog_li = self.group_list(self.endog)
        self.exog_li = self.group_list(self.exog)
        self.exog_re_li = self.group_list(self.exog_re)

        # Precompute this.
        if self.exog_re is None:
            self.exog_re2_li = None
        else:
            self.exog_re2_li = [np.dot(x.T, x) for x in self.exog_re_li]

        # The total number of observations, summed over all groups
        self.nobs = len(self.endog)
        self.n_totobs = self.nobs

        # Set the fixed effects parameter names
        if self.exog_names is None:
            self.exog_names = ["FE%d" % (k + 1) for k in
                               range(self.exog.shape[1])]

        # Precompute this
        self._aex_r = []
        self._aex_r2 = []
        for i in range(self.n_groups):
            a = self._augment_exog(i)
            self._aex_r.append(a)

            # This matrix is not very sparse so convert it to dense.
            ma = _dot(a.T, a)
            if sparse.issparse(ma):
                ma = ma.todense()
            self._aex_r2.append(ma)

        # Precompute this
        self._lin, self._quad = self._reparam()

    def _make_param_names(self, exog_re):
        """
        Returns the full parameter names list, just the exogenous random
        effects variables, and the exogenous random effects variables
        with the interaction terms.
        """
        exog_names = list(self.exog_names)
        exog_re_names = _get_exog_re_names(self, exog_re)
        param_names = []

        jj = self.k_fe
        for i in range(len(exog_re_names)):
            for j in range(i + 1):
                if i == j:
                    param_names.append(exog_re_names[i] + " Var")
                else:
                    param_names.append(exog_re_names[j] + " x " +
                                       exog_re_names[i] + " Cov")
                jj += 1

        vc_names = [x + " Var" for x in self.exog_vc.names]

        return exog_names + param_names + vc_names, exog_re_names, param_names

    @classmethod
    def from_formula(cls, formula, data, re_formula=None, vc_formula=None,
                     subset=None, use_sparse=False, missing='none', *args,
                     **kwargs):
        """
        Create a Model from a formula and dataframe.

        Parameters
        ----------
        formula : str or generic Formula object
            The formula specifying the model
        data : array_like
            The data for the model.  See Notes.
        re_formula : str
            A one-sided formula defining the variance structure of the
            model.  The default gives a random intercept for each
            group.
        vc_formula : dict-like
            Formulas describing variance components.  `vc_formula[vc]`
            is the formula for the component with variance parameter
            named `vc`.  The formula is processed into a matrix, and
            the columns of this matrix are linearly combined with
            independent random coefficients having mean zero and a
            common variance.
        subset : array_like
            An array-like object of booleans, integers, or index
            values that indicate the subset of df to use in the
            model.  Assumes df is a `pandas.DataFrame`.
        missing : str
            Either 'none' or 'drop'
        args : extra arguments
            These are passed to the model
        kwargs : extra keyword arguments
            These are passed to the model with one exception.  The
            ``eval_env`` keyword is passed to patsy.  It can be either
            a :class:`patsy:patsy.EvalEnvironment` object or an integer
            indicating the depth of the namespace to use.  For example,
            the default ``eval_env=0`` uses the calling namespace.  If
            you wish to use a "clean" environment set ``eval_env=-1``.

        Returns
        -------
        model : Model instance

        Notes
        -----
        `data` must define __getitem__ with the keys in the formula
        terms; args and kwargs are passed on to the model
        instantiation.  E.g., a numpy structured or rec array, a
        dictionary, or a pandas DataFrame.

        If the variance component is intended to produce random
        intercepts for disjoint subsets of a group, specified by
        string labels or a categorical data value, always use '0 +' in
        the formula so that no overall intercept is included.

        If the variance components specify random slopes and you do
        not also want a random group-level intercept in the model,
        then use '0 +' in the formula to exclude the intercept.

        The variance components formulas are processed separately for
        each group.  If a variable is categorical the results will not
        be affected by whether the group labels are distinct or
        re-used over the top-level groups.

        Examples
        --------
        Suppose we have data from an educational study with students
        nested in classrooms nested in schools.  The students take a
        test, and we want to relate the test scores to the students'
        ages, while accounting for the effects of classrooms and
        schools.  The school will be the top-level group, and the
        classroom is a nested group that is specified as a variance
        component.  Note that the schools may have different numbers
        of classrooms, and the classroom labels may (but need not be)
        different across the schools.

        >>> vc = {'classroom': '0 + C(classroom)'}
        >>> MixedLM.from_formula('test_score ~ age', vc_formula=vc, \
                                  re_formula='1', groups='school', data=data)

        Now suppose we also have a previous test score called
        'pretest'.  If we want the relationship between pretest
        scores and the current test to vary by classroom, we can
        specify a random slope for the pretest score

        >>> vc = {'classroom': '0 + C(classroom)', 'pretest': '0 + pretest'}
        >>> MixedLM.from_formula('test_score ~ age + pretest', vc_formula=vc, \
                                  re_formula='1', groups='school', data=data)

        The following model is almost equivalent to the previous one,
        but here the classroom random intercept and pretest slope may
        be correlated.

        >>> vc = {'classroom': '0 + C(classroom)'}
        >>> MixedLM.from_formula('test_score ~ age + pretest', vc_formula=vc, \
                                  re_formula='1 + pretest', groups='school', \
                                  data=data)
        """

        if "groups" not in kwargs.keys():
            raise AttributeError("'groups' is a required keyword argument " +
                                 "in MixedLM.from_formula")
        groups = kwargs["groups"]

        # If `groups` is a variable name, retrieve the data for the
        # groups variable.
        group_name = "Group"
        if isinstance(groups, str):
            group_name = groups
            groups = np.asarray(data[groups])
        else:
            groups = np.asarray(groups)
        del kwargs["groups"]

        # Bypass all upstream missing data handling to properly handle
        # variance components
        if missing == 'drop':
            data, groups = _handle_missing(data, groups, formula, re_formula,
                                           vc_formula)
            missing = 'none'

        if re_formula is not None:
            if re_formula.strip() == "1":
                # Work around Patsy bug, fixed by 0.3.
                exog_re = np.ones((data.shape[0], 1))
                exog_re_names = [group_name]
            else:
                eval_env = kwargs.get('eval_env', None)
                if eval_env is None:
                    eval_env = 1
                elif eval_env == -1:
                    from patsy import EvalEnvironment
                    eval_env = EvalEnvironment({})
                exog_re = patsy.dmatrix(re_formula, data, eval_env=eval_env)
                exog_re_names = exog_re.design_info.column_names
                exog_re_names = [x.replace("Intercept", group_name)
                                 for x in exog_re_names]
                exog_re = np.asarray(exog_re)
                if exog_re.ndim == 1:
                    exog_re = exog_re[:, None]
        else:
            exog_re = None
            if vc_formula is None:
                exog_re_names = [group_name]
            else:
                exog_re_names = []

        if vc_formula is not None:
            eval_env = kwargs.get('eval_env', None)
            if eval_env is None:
                eval_env = 1
            elif eval_env == -1:
                from patsy import EvalEnvironment
                eval_env = EvalEnvironment({})

            vc_mats = []
            vc_colnames = []
            vc_names = []
            gb = data.groupby(groups)
            kylist = sorted(gb.groups.keys())
            vcf = sorted(vc_formula.keys())
            for vc_name in vcf:
                md = patsy.ModelDesc.from_formula(vc_formula[vc_name])
                vc_names.append(vc_name)
                evc_mats, evc_colnames = [], []
                for group_ix, group in enumerate(kylist):
                    ii = gb.groups[group]
                    mat = patsy.dmatrix(
                        md,
                        data.loc[ii, :],
                        eval_env=eval_env,
                        return_type='dataframe')
                    evc_colnames.append(mat.columns.tolist())
                    if use_sparse:
                        evc_mats.append(sparse.csr_matrix(mat))
                    else:
                        evc_mats.append(np.asarray(mat))
                vc_mats.append(evc_mats)
                vc_colnames.append(evc_colnames)
            exog_vc = VCSpec(vc_names, vc_colnames, vc_mats)
        else:
            exog_vc = VCSpec([], [], [])

        kwargs["subset"] = None
        kwargs["exog_re"] = exog_re
        kwargs["exog_vc"] = exog_vc
        kwargs["groups"] = groups
        mod = super(MixedLM, cls).from_formula(
            formula, data, *args, **kwargs)

        # expand re names to account for pairs of RE
        (param_names,
         exog_re_names,
         exog_re_names_full) = mod._make_param_names(exog_re_names)

        mod.data.param_names = param_names
        mod.data.exog_re_names = exog_re_names
        mod.data.exog_re_names_full = exog_re_names_full

        if vc_formula is not None:
            mod.data.vcomp_names = mod.exog_vc.names

        return mod

    def predict(self, params, exog=None):
        """
        Return predicted values from a design matrix.

        Parameters
        ----------
        params : array_like
            Parameters of a mixed linear model.  Can be either a
            MixedLMParams instance, or a vector containing the packed
            model parameters in which the fixed effects parameters are
            at the beginning of the vector, or a vector containing
            only the fixed effects parameters.
        exog : array_like, optional
            Design / exogenous data for the fixed effects.  Model exog
            is used if None.

        Returns
        -------
        An array of fitted values.  Note that these predicted values
        only reflect the fixed effects mean structure of the model.
        """
        if exog is None:
            exog = self.exog

        if isinstance(params, MixedLMParams):
            params = params.fe_params
        else:
            params = params[0:self.k_fe]

        return np.dot(exog, params)

    def group_list(self, array):
        """
        Returns `array` split into subarrays corresponding to the
        grouping structure.
        """

        if array is None:
            return None

        if array.ndim == 1:
            return [np.array(array[self.row_indices[k]])
                    for k in self.group_labels]
        else:
            return [np.array(array[self.row_indices[k], :])
                    for k in self.group_labels]

    def fit_regularized(self, start_params=None, method='l1', alpha=0,
                        ceps=1e-4, ptol=1e-6, maxit=200, **fit_kwargs):
        """
        Fit a model in which the fixed effects parameters are
        penalized.  The dependence parameters are held fixed at their
        estimated values in the unpenalized model.

        Parameters
        ----------
        method : str or Penalty object
            Method for regularization.  If a string, must be 'l1'.
        alpha : array_like
            Scalar or vector of penalty weights.  If a scalar, the
            same weight is applied to all coefficients; if a vector,
            it contains a weight for each coefficient.  If method is a
            Penalty object, the weights are scaled by alpha.  For L1
            regularization, the weights are used directly.
        ceps : positive real scalar
            Fixed effects parameters smaller than this value
            in magnitude are treated as being zero.
        ptol : positive real scalar
            Convergence occurs when the sup norm difference
            between successive values of `fe_params` is less than
            `ptol`.
        maxit : int
            The maximum number of iterations.
        fit_kwargs : keywords
            Additional keyword arguments passed to fit.

        Returns
        -------
        A MixedLMResults instance containing the results.

        Notes
        -----
        The covariance structure is not updated as the fixed effects
        parameters are varied.

        The algorithm used here for L1 regularization is a "shooting"
        or cyclic coordinate descent algorithm.

        If method is 'l1', then `fe_pen` and `cov_pen` are used to
        obtain the covariance structure, but are ignored during the
        L1-penalized fitting.

        References
        ----------
        Friedman, J. H., Hastie, T. and Tibshirani, R. Regularized
        Paths for Generalized Linear Models via Coordinate
        Descent.  Journal of Statistical Software, 33(1) (2008)
        http://www.jstatsoft.org/v33/i01/paper

        http://statweb.stanford.edu/~tibs/stat315a/Supplements/fuse.pdf
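
        Examples
        --------
        A sketch of typical usage (``endog``, ``exog`` and ``groups``
        are assumed to be defined as in the ``MixedLM`` examples):

        >>> model = MixedLM(endog, exog, groups)
        >>> result = model.fit_regularized(alpha=1.0)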

1141 """ 

1142 

1143 if isinstance(method, str) and (method.lower() != 'l1'): 

1144 raise ValueError("Invalid regularization method") 

1145 

1146 # If method is a smooth penalty just optimize directly. 

1147 if isinstance(method, Penalty): 

1148 # Scale the penalty weights by alpha 

1149 method.alpha = alpha 

1150 fit_kwargs.update({"fe_pen": method}) 

1151 return self.fit(**fit_kwargs) 

1152 

1153 if np.isscalar(alpha): 

1154 alpha = alpha * np.ones(self.k_fe, dtype=np.float64) 

1155 

1156 # Fit the unpenalized model to get the dependence structure. 

1157 mdf = self.fit(**fit_kwargs) 

1158 fe_params = mdf.fe_params 

1159 cov_re = mdf.cov_re 

1160 vcomp = mdf.vcomp 

1161 scale = mdf.scale 

1162 try: 

1163 cov_re_inv = np.linalg.inv(cov_re) 

1164 except np.linalg.LinAlgError: 

1165 cov_re_inv = None 

1166 

1167 for itr in range(maxit): 

1168 

1169 fe_params_s = fe_params.copy() 

1170 for j in range(self.k_fe): 

1171 

1172 if abs(fe_params[j]) < ceps: 

1173 continue 

1174 

1175 # The residuals 

1176 fe_params[j] = 0. 

1177 expval = np.dot(self.exog, fe_params) 

1178 resid_all = self.endog - expval 

1179 

1180 # The loss function has the form 

1181 # a*x^2 + b*x + pwt*|x| 

1182 a, b = 0., 0. 

1183 for group_ix, group in enumerate(self.group_labels): 

1184 

1185 vc_var = self._expand_vcomp(vcomp, group_ix) 

1186 

1187 exog = self.exog_li[group_ix] 

1188 ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix] 

1189 

1190 resid = resid_all[self.row_indices[group]] 

1191 solver = _smw_solver(scale, ex_r, ex2_r, cov_re_inv, 

1192 1 / vc_var) 

1193 

1194 x = exog[:, j] 

1195 u = solver(x) 

1196 a += np.dot(u, x) 

1197 b -= 2 * np.dot(u, resid) 

1198 

1199 pwt1 = alpha[j] 

1200 if b > pwt1: 

1201 fe_params[j] = -(b - pwt1) / (2 * a) 

1202 elif b < -pwt1: 

1203 fe_params[j] = -(b + pwt1) / (2 * a) 

1204 

1205 if np.abs(fe_params_s - fe_params).max() < ptol: 

1206 break 

1207 

1208 # Replace the fixed effects estimates with their penalized 

1209 # values, leave the dependence parameters in their unpenalized 

1210 # state. 

1211 params_prof = mdf.params.copy() 

1212 params_prof[0:self.k_fe] = fe_params 

1213 

1214 scale = self.get_scale(fe_params, mdf.cov_re_unscaled, mdf.vcomp) 

1215 

1216 # Get the Hessian including only the nonzero fixed effects, 

1217 # then blow back up to the full size after inverting. 

1218 hess = self.hessian(params_prof) 

1219 pcov = np.nan * np.ones_like(hess) 

1220 ii = np.abs(params_prof) > ceps 

1221 ii[self.k_fe:] = True 

1222 ii = np.flatnonzero(ii) 

1223 hess1 = hess[ii, :][:, ii] 

1224 pcov[np.ix_(ii, ii)] = np.linalg.inv(-hess1) 

1225 

1226 params_object = MixedLMParams.from_components(fe_params, cov_re=cov_re) 

1227 

1228 results = MixedLMResults(self, params_prof, pcov / scale) 

1229 results.params_object = params_object 

1230 results.fe_params = fe_params 

1231 results.cov_re = cov_re 

1232 results.scale = scale 

1233 results.cov_re_unscaled = mdf.cov_re_unscaled 

1234 results.method = mdf.method 

1235 results.converged = True 

1236 results.cov_pen = self.cov_pen 

1237 results.k_fe = self.k_fe 

1238 results.k_re = self.k_re 

1239 results.k_re2 = self.k_re2 

1240 results.k_vc = self.k_vc 

1241 

1242 return MixedLMResultsWrapper(results) 

1243 

    def get_fe_params(self, cov_re, vcomp):
        """
        Use GLS to update the fixed effects parameter estimates.

        Parameters
        ----------
        cov_re : array_like
            The covariance matrix of the random effects.
        vcomp : array_like
            The variance components.

        Returns
        -------
        The GLS estimates of the fixed effects parameters.
        """

        if self.k_fe == 0:
            return np.array([])

        if self.k_re == 0:
            cov_re_inv = np.empty((0, 0))
        else:
            cov_re_inv = np.linalg.inv(cov_re)

        # Cache these quantities that do not change.
        if not hasattr(self, "_endex_li"):
            self._endex_li = []
            for group_ix, _ in enumerate(self.group_labels):
                mat = np.concatenate(
                    (self.exog_li[group_ix],
                     self.endog_li[group_ix][:, None]), axis=1)
                self._endex_li.append(mat)

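        # Accumulate the GLS normal equations: the first k_fe columns
        # of xtxy hold X' V^{-1} X and the last column holds
        # X' V^{-1} y, so the solve below gives the GLS estimate of
        # the fixed effects.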

        xtxy = 0.
        for group_ix, group in enumerate(self.group_labels):
            vc_var = self._expand_vcomp(vcomp, group_ix)
            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)
            u = solver(self._endex_li[group_ix])
            xtxy += np.dot(exog.T, u)

        fe_params = np.linalg.solve(xtxy[:, 0:-1], xtxy[:, -1])

        return fe_params

    def _reparam(self):
        """
        Returns parameters of the map converting parameters from the
        form used in optimization to the form returned to the user.

        Returns
        -------
        lin : list-like
            Linear terms of the map
        quad : list-like
            Quadratic terms of the map

        Notes
        -----
        If P are the standard form parameters and R are the
        transformed parameters (i.e. with the Cholesky square root
        covariance and square root transformed variance components),
        then P[i] = lin[i] * R + R' * quad[i] * R
        """

        k_fe, k_re, k_re2, k_vc = self.k_fe, self.k_re, self.k_re2, self.k_vc
        k_tot = k_fe + k_re2 + k_vc
        ix = np.tril_indices(self.k_re)

        lin = []
        for k in range(k_fe):
            e = np.zeros(k_tot)
            e[k] = 1
            lin.append(e)
        for k in range(k_re2):
            lin.append(np.zeros(k_tot))
        for k in range(k_vc):
            lin.append(np.zeros(k_tot))

        quad = []
        # Quadratic terms for fixed effects.
        for k in range(k_tot):
            quad.append(np.zeros((k_tot, k_tot)))

        # Quadratic terms for random effects covariance.
        ii = np.tril_indices(k_re)
        ix = [(a, b) for a, b in zip(ii[0], ii[1])]
        for i1 in range(k_re2):
            for i2 in range(k_re2):
                ix1 = ix[i1]
                ix2 = ix[i2]
                if (ix1[1] == ix2[1]) and (ix1[0] <= ix2[0]):
                    ii = (ix2[0], ix1[0])
                    k = ix.index(ii)
                    quad[k_fe+k][k_fe+i2, k_fe+i1] += 1
        for k in range(k_tot):
            quad[k] = 0.5*(quad[k] + quad[k].T)

        # Quadratic terms for variance components.
        km = k_fe + k_re2
        for k in range(km, km+k_vc):
            quad[k][k, k] = 1

        return lin, quad

    def _expand_vcomp(self, vcomp, group_ix):
        """
        Replicate variance parameters to match a group's design.

        Parameters
        ----------
        vcomp : array_like
            The variance parameters for the variance components.
        group_ix : int
            The group index

        Returns an expanded version of vcomp, in which each variance
        parameter is copied as many times as there are independent
        realizations of the variance component in the given group.
        """
        if len(vcomp) == 0:
            return np.empty(0)
        vc_var = []
        for j in range(len(self.exog_vc.names)):
            d = self.exog_vc.mats[j][group_ix].shape[1]
            vc_var.append(vcomp[j] * np.ones(d))
        if len(vc_var) > 0:
            return np.concatenate(vc_var)
        else:
            # Cannot reach here?
            return np.empty(0)

    def _augment_exog(self, group_ix):
        """
        Concatenate the columns for variance components to the columns
        for other random effects to obtain a single random effects
        exog matrix for a given group.
        """
        ex_r = self.exog_re_li[group_ix] if self.k_re > 0 else None
        if self.k_vc == 0:
            return ex_r

        ex = [ex_r] if self.k_re > 0 else []
        any_sparse = False
        for j, _ in enumerate(self.exog_vc.names):
            ex.append(self.exog_vc.mats[j][group_ix])
            any_sparse |= sparse.issparse(ex[-1])
        if any_sparse:
            for j, x in enumerate(ex):
                if not sparse.issparse(x):
                    ex[j] = sparse.csr_matrix(x)
            ex = sparse.hstack(ex)
            ex = sparse.csr_matrix(ex)
        else:
            ex = np.concatenate(ex, axis=1)

        return ex

    def loglike(self, params, profile_fe=True):
        """
        Evaluate the (profile) log-likelihood of the linear mixed
        effects model.

        Parameters
        ----------
        params : MixedLMParams or array_like
            The parameter value.  If array-like, must be a packed
            parameter vector containing only the covariance
            parameters.
        profile_fe : bool
            If True, replace the provided value of `fe_params` with
            the GLS estimates.

        Returns
        -------
        The log-likelihood value at `params`.

        Notes
        -----
        The scale parameter `scale` is always profiled out of the
        log-likelihood.  In addition, if `profile_fe` is true the
        fixed effects parameters are also profiled out.
        """

        if type(params) is not MixedLMParams:
            params = MixedLMParams.from_packed(params, self.k_fe,
                                               self.k_re, self.use_sqrt,
                                               has_fe=False)

        cov_re = params.cov_re
        vcomp = params.vcomp

        # Move to the profile set
        if profile_fe:
            fe_params = self.get_fe_params(cov_re, vcomp)
        else:
            fe_params = params.fe_params

        if self.k_re > 0:
            try:
                cov_re_inv = np.linalg.inv(cov_re)
            except np.linalg.LinAlgError:
                cov_re_inv = None
            _, cov_re_logdet = np.linalg.slogdet(cov_re)
        else:
            cov_re_inv = np.zeros((0, 0))
            cov_re_logdet = 0

        # The residuals
        expval = np.dot(self.exog, fe_params)
        resid_all = self.endog - expval

        likeval = 0.

        # Handle the covariance penalty
        if (self.cov_pen is not None) and (self.k_re > 0):
            likeval -= self.cov_pen.func(cov_re, cov_re_inv)

        # Handle the fixed effects penalty
        if (self.fe_pen is not None):
            likeval -= self.fe_pen.func(fe_params)

        xvx, qf = 0., 0.
        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)
            cov_aug_logdet = cov_re_logdet + np.sum(np.log(vc_var))

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            resid = resid_all[self.row_indices[group]]

            # Part 1 of the log likelihood (for both ML and REML)
            ld = _smw_logdet(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var,
                             cov_aug_logdet)
            likeval -= ld / 2.

            # Part 2 of the log likelihood (for both ML and REML)
            u = solver(resid)
            qf += np.dot(resid, u)

            # Adjustment for REML
            if self.reml:
                mat = solver(exog)
                xvx += np.dot(exog.T, mat)

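        # The scale parameter is profiled out: the constants below
        # come from substituting its estimate, qf / (n - p) for REML
        # or qf / n for ML, back into the Gaussian log-likelihood.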

        if self.reml:
            likeval -= (self.n_totobs - self.k_fe) * np.log(qf) / 2.
            _, ld = np.linalg.slogdet(xvx)
            likeval -= ld / 2.
            likeval -= (self.n_totobs - self.k_fe) * np.log(2 * np.pi) / 2.
            likeval += ((self.n_totobs - self.k_fe) *
                        np.log(self.n_totobs - self.k_fe) / 2.)
            likeval -= (self.n_totobs - self.k_fe) / 2.
        else:
            likeval -= self.n_totobs * np.log(qf) / 2.
            likeval -= self.n_totobs * np.log(2 * np.pi) / 2.
            likeval += self.n_totobs * np.log(self.n_totobs) / 2.
            likeval -= self.n_totobs / 2.

        return likeval

    def _gen_dV_dPar(self, ex_r, solver, group_ix, max_ix=None):
        """
        A generator that yields the element-wise derivative of the
        marginal covariance matrix with respect to the random effects
        variance and covariance parameters.

        Parameters
        ----------
        ex_r : array_like
            The random effects design matrix
        solver : function
            A function that given x returns V^{-1}x, where V
            is the group's marginal covariance matrix.
        group_ix : int
            The group index
        max_ix : {int, None}
            If not None, the generator ends when this index
            is reached.
        """

        axr = solver(ex_r)

        # Regular random effects
        jj = 0
        for j1 in range(self.k_re):
            for j2 in range(j1 + 1):
                if max_ix is not None and jj > max_ix:
                    return
                # Need 2d
                mat_l, mat_r = ex_r[:, j1:j1+1], ex_r[:, j2:j2+1]
                vsl, vsr = axr[:, j1:j1+1], axr[:, j2:j2+1]
                yield jj, mat_l, mat_r, vsl, vsr, j1 == j2
                jj += 1

        # Variance components
        for j, _ in enumerate(self.exog_vc.names):
            if max_ix is not None and jj > max_ix:
                return
            mat = self.exog_vc.mats[j][group_ix]
            axmat = solver(mat)
            yield jj, mat, mat, axmat, axmat, True
            jj += 1

    def score(self, params, profile_fe=True):
        """
        Returns the score vector of the profile log-likelihood.

        Notes
        -----
        The score vector that is returned is computed with respect to
        the parameterization defined by this model instance's
        `use_sqrt` attribute.
        """

        if type(params) is not MixedLMParams:
            params = MixedLMParams.from_packed(
                params, self.k_fe, self.k_re, self.use_sqrt,
                has_fe=False)

        if profile_fe:
            params.fe_params = self.get_fe_params(params.cov_re, params.vcomp)

        if self.use_sqrt:
            score_fe, score_re, score_vc = self.score_sqrt(
                params, calc_fe=not profile_fe)
        else:
            score_fe, score_re, score_vc = self.score_full(
                params, calc_fe=not profile_fe)

        if self._freepat is not None:
            score_fe *= self._freepat.fe_params
            score_re *= self._freepat.cov_re[self._freepat._ix]
            score_vc *= self._freepat.vcomp

        if profile_fe:
            return np.concatenate((score_re, score_vc))
        else:
            return np.concatenate((score_fe, score_re, score_vc))

1584 def score_full(self, params, calc_fe): 

1585 """ 

1586 Returns the score with respect to untransformed parameters. 

1587 

1588 Calculates the score vector for the profiled log-likelihood of 

1589 the mixed effects model with respect to the parameterization 

1590 in which the random effects covariance matrix is represented 

1591 in its full form (not using the Cholesky factor). 

1592 

1593 Parameters 

1594 ---------- 

1595 params : MixedLMParams or array_like 

1596 The parameter at which the score function is evaluated. 

1597 If array-like, must contain the packed random effects 

1598 parameters (cov_re and vcomp) without fe_params. 

1599 calc_fe : bool 

1600 If True, calculate the score vector for the fixed effects 

1601 parameters. If False, this vector is not calculated, and 

1602 a vector of zeros is returned in its place. 

1603 

1604 Returns 

1605 ------- 

1606 score_fe : array_like 

1607 The score vector with respect to the fixed effects 

1608 parameters. 

1609 score_re : array_like 

1610 The score vector with respect to the random effects 

1611 parameters (excluding variance components parameters). 

1612 score_vc : array_like 

1613 The score vector with respect to variance components 

1614 parameters. 

1615 

1616 Notes 

1617 ----- 

1618 `score_re` is taken with respect to the parameterization in 

1619 which `cov_re` is represented through its lower triangle 

1620 (without taking the Cholesky square root). 

1621 """ 

1622 

1623 fe_params = params.fe_params 

1624 cov_re = params.cov_re 

1625 vcomp = params.vcomp 

1626 

1627 try: 

1628 cov_re_inv = np.linalg.inv(cov_re) 

1629 except np.linalg.LinAlgError: 

1630 cov_re_inv = None 

1631 

1632 score_fe = np.zeros(self.k_fe) 

1633 score_re = np.zeros(self.k_re2) 

1634 score_vc = np.zeros(self.k_vc) 

1635 

1636 # Handle the covariance penalty. 

1637 if self.cov_pen is not None: 

1638 score_re -= self.cov_pen.deriv(cov_re, cov_re_inv) 

1639 

1640 # Handle the fixed effects penalty. 

1641 if calc_fe and (self.fe_pen is not None): 

1642 score_fe -= self.fe_pen.deriv(fe_params) 

1643 

1644 # resid' V^{-1} resid, summed over the groups (a scalar) 

1645 rvir = 0. 

1646 

1647 # exog' V^{-1} resid, summed over the groups (a k_fe 

1648 # dimensional vector) 

1649 xtvir = 0. 

1650 

1651 # exog' V^{_1} exog, summed over the groups (a k_fe x k_fe 

1652 # matrix) 

1653 xtvix = 0. 

1654 

1655 # V^{-1} exog' dV/dQ_jj exog V^{-1}, where Q_jj is the jj^th 

1656 # covariance parameter. 

1657 xtax = [0., ] * (self.k_re2 + self.k_vc) 

1658 

1659 # Temporary related to the gradient of log |V| 

1660 dlv = np.zeros(self.k_re2 + self.k_vc) 

1661 

1662 # resid' V^{-1} dV/dQ_jj V^{-1} resid (a scalar) 

1663 rvavr = np.zeros(self.k_re2 + self.k_vc) 

1664 

1665 for group_ix, group in enumerate(self.group_labels): 

1666 

1667 vc_var = self._expand_vcomp(vcomp, group_ix) 

1668 

1669 exog = self.exog_li[group_ix] 

1670 ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix] 
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            if self.reml:
                viexog = solver(exog)
                xtvix += np.dot(exog.T, viexog)

            # Contributions to the covariance parameter gradient
            vir = solver(resid)
            for (jj, matl, matr, vsl, vsr, sym) in\
                    self._gen_dV_dPar(ex_r, solver, group_ix):
                dlv[jj] = _dotsum(matr, vsl)
                if not sym:
                    dlv[jj] += _dotsum(matl, vsr)

                ul = _dot(vir, matl)
                ur = ul.T if sym else _dot(matr.T, vir)
                ulr = np.dot(ul, ur)
                rvavr[jj] += ulr
                if not sym:
                    rvavr[jj] += ulr.T

                if self.reml:
                    ul = _dot(viexog.T, matl)
                    ur = ul.T if sym else _dot(matr.T, viexog)
                    ulr = np.dot(ul, ur)
                    xtax[jj] += ulr
                    if not sym:
                        xtax[jj] += ulr.T

            # Contribution of log|V| to the covariance parameter
            # gradient.
            if self.k_re > 0:
                score_re -= 0.5 * dlv[0:self.k_re2]
            if self.k_vc > 0:
                score_vc -= 0.5 * dlv[self.k_re2:]

            rvir += np.dot(resid, vir)

            if calc_fe:
                xtvir += np.dot(exog.T, vir)

        fac = self.n_totobs
        if self.reml:
            fac -= self.k_fe

        if calc_fe and self.k_fe > 0:
            score_fe += fac * xtvir / rvir

        if self.k_re > 0:
            score_re += 0.5 * fac * rvavr[0:self.k_re2] / rvir
        if self.k_vc > 0:
            score_vc += 0.5 * fac * rvavr[self.k_re2:] / rvir

        if self.reml:
            xtvixi = np.linalg.inv(xtvix)
            for j in range(self.k_re2):
                score_re[j] += 0.5 * _dotsum(xtvixi.T, xtax[j])
            for j in range(self.k_vc):
                score_vc[j] += 0.5 * _dotsum(xtvixi.T, xtax[self.k_re2 + j])

        return score_fe, score_re, score_vc

    def score_sqrt(self, params, calc_fe=True):
        """
        Returns the score with respect to transformed parameters.

        Calculates the score vector with respect to the
        parameterization in which the random effects covariance matrix
        is represented through its Cholesky square root.

        Parameters
        ----------
        params : MixedLMParams or array_like
            The model parameters. If array-like, must contain packed
            parameters that are compatible with this model instance.
        calc_fe : bool
            If True, calculate the score vector for the fixed effects
            parameters. If False, this vector is not calculated, and
            a vector of zeros is returned in its place.

        Returns
        -------
        score_fe : array_like
            The score vector with respect to the fixed effects
            parameters.
        score_re : array_like
            The score vector with respect to the random effects
            parameters (excluding variance components parameters).
        score_vc : array_like
            The score vector with respect to variance components
            parameters.
        """

        score_fe, score_re, score_vc = self.score_full(params, calc_fe=calc_fe)
        params_vec = params.get_packed(use_sqrt=True, has_fe=True)

        score_full = np.concatenate((score_fe, score_re, score_vc))
        scr = 0.
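        # Apply the chain rule: each untransformed parameter is a
        # quadratic function of the packed square-root parameters, so
        # row i of the Jacobian of the transformation is
        # _lin[i] + 2 * _quad[i] * params_vec.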
        for i in range(len(params_vec)):
            v = self._lin[i] + 2 * np.dot(self._quad[i], params_vec)
            scr += score_full[i] * v
        score_fe = scr[0:self.k_fe]
        score_re = scr[self.k_fe:self.k_fe + self.k_re2]
        score_vc = scr[self.k_fe + self.k_re2:]

        return score_fe, score_re, score_vc

    def hessian(self, params):
        """
        Returns the model's Hessian matrix.

        Calculates the Hessian matrix for the linear mixed effects
        model with respect to the parameterization in which the
        covariance matrix is represented directly (without square-root
        transformation).

        Parameters
        ----------
        params : MixedLMParams or array_like
            The model parameters at which the Hessian is calculated.
            If array-like, must contain the packed parameters in a
            form that is compatible with this model instance.

        Returns
        -------
        hess : 2d ndarray
            The Hessian matrix, evaluated at `params`.
        """

        if type(params) is not MixedLMParams:
            params = MixedLMParams.from_packed(params, self.k_fe, self.k_re,
                                               use_sqrt=self.use_sqrt,
                                               has_fe=True)

        fe_params = params.fe_params
        vcomp = params.vcomp
        cov_re = params.cov_re
        if self.k_re > 0:
            cov_re_inv = np.linalg.inv(cov_re)
        else:
            cov_re_inv = np.empty((0, 0))

        # Blocks for the fixed and random effects parameters.
        hess_fe = 0.
        hess_re = np.zeros((self.k_re2 + self.k_vc, self.k_re2 + self.k_vc))
        hess_fere = np.zeros((self.k_re2 + self.k_vc, self.k_fe))

        fac = self.n_totobs
        if self.reml:
            fac -= self.exog.shape[1]

        rvir = 0.
        xtvix = 0.
        xtax = [0., ] * (self.k_re2 + self.k_vc)
        m = self.k_re2 + self.k_vc
        B = np.zeros(m)
        D = np.zeros((m, m))
        F = [[0.] * m for k in range(m)]
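        # B and D accumulate quadratic forms in the residuals involving
        # dV/dQ (first- and second-order terms, respectively); F holds
        # the analogous cross terms used in the REML adjustment below.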
        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            viexog = solver(exog)
            xtvix += np.dot(exog.T, viexog)
            vir = solver(resid)
            rvir += np.dot(resid, vir)

            for (jj1, matl1, matr1, vsl1, vsr1, sym1) in\
                    self._gen_dV_dPar(ex_r, solver, group_ix):

                ul = _dot(viexog.T, matl1)
                ur = _dot(matr1.T, vir)
                hess_fere[jj1, :] += np.dot(ul, ur)
                if not sym1:
                    ul = _dot(viexog.T, matr1)
                    ur = _dot(matl1.T, vir)
                    hess_fere[jj1, :] += np.dot(ul, ur)

                if self.reml:
                    ul = _dot(viexog.T, matl1)
                    ur = ul if sym1 else _dot(viexog.T, matr1)
                    ulr = _dot(ul, ur.T)
                    xtax[jj1] += ulr
                    if not sym1:
                        xtax[jj1] += ulr.T

                ul = _dot(vir, matl1)
                ur = ul if sym1 else _dot(vir, matr1)
                B[jj1] += np.dot(ul, ur) * (1 if sym1 else 2)

                # V^{-1} * dV/d_theta
                E = [(vsl1, matr1)]
                if not sym1:
                    E.append((vsr1, matl1))

                for (jj2, matl2, matr2, vsl2, vsr2, sym2) in\
                        self._gen_dV_dPar(ex_r, solver, group_ix, jj1):

                    re = sum([_multi_dot_three(matr2.T, x[0], x[1].T)
                              for x in E])
                    vt = 2 * _dot(_multi_dot_three(vir[None, :], matl2, re),
                                  vir[:, None])

                    if not sym2:
                        le = sum([_multi_dot_three(matl2.T, x[0], x[1].T)
                                  for x in E])
                        vt += 2 * _dot(_multi_dot_three(
                            vir[None, :], matr2, le), vir[:, None])

                    D[jj1, jj2] += vt
                    if jj1 != jj2:
                        D[jj2, jj1] += vt

                    rt = _dotsum(vsl2, re.T) / 2
                    if not sym2:
                        rt += _dotsum(vsr2, le.T) / 2

                    hess_re[jj1, jj2] += rt
                    if jj1 != jj2:
                        hess_re[jj2, jj1] += rt

                    if self.reml:
                        ev = sum([_dot(x[0], _dot(x[1].T, viexog)) for x in E])
                        u1 = _dot(viexog.T, matl2)
                        u2 = _dot(matr2.T, ev)
                        um = np.dot(u1, u2)
                        F[jj1][jj2] += um + um.T
                        if not sym2:
                            u1 = np.dot(viexog.T, matr2)
                            u2 = np.dot(matl2.T, ev)
                            um = np.dot(u1, u2)
                            F[jj1][jj2] += um + um.T

        hess_fe -= fac * xtvix / rvir
        hess_re = hess_re - 0.5 * fac * (D/rvir - np.outer(B, B) / rvir**2)
        hess_fere = -fac * hess_fere / rvir

        if self.reml:
            QL = [np.linalg.solve(xtvix, x) for x in xtax]
            for j1 in range(self.k_re2 + self.k_vc):
                for j2 in range(j1 + 1):
                    a = _dotsum(QL[j1].T, QL[j2])
                    a -= np.trace(np.linalg.solve(xtvix, F[j1][j2]))
                    a *= 0.5
                    hess_re[j1, j2] += a
                    if j1 > j2:
                        hess_re[j2, j1] += a

        # Put the blocks together to get the Hessian.
        m = self.k_fe + self.k_re2 + self.k_vc
        hess = np.zeros((m, m))
        hess[0:self.k_fe, 0:self.k_fe] = hess_fe
        hess[0:self.k_fe, self.k_fe:] = hess_fere.T
        hess[self.k_fe:, 0:self.k_fe] = hess_fere
        hess[self.k_fe:, self.k_fe:] = hess_re

        return hess

    def get_scale(self, fe_params, cov_re, vcomp):
        """
        Returns the estimated error variance based on given estimates
        of the slopes and random effects covariance matrix.

        Parameters
        ----------
        fe_params : array_like
            The regression slope estimates
        cov_re : 2d array_like
            Estimate of the random effects covariance matrix
        vcomp : array_like
            Estimate of the variance components

        Returns
        -------
        scale : float
            The estimated error variance.
        """

        try:
            cov_re_inv = np.linalg.inv(cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        qf = 0.
        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]

            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            mat = solver(resid)
            qf += np.dot(resid, mat)

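        # Profile out the error variance: its estimate is the average
        # of the accumulated quadratic form, with a degrees-of-freedom
        # correction for the fixed effects under REML.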
        if self.reml:
            qf /= (self.n_totobs - self.k_fe)
        else:
            qf /= self.n_totobs

        return qf

    def fit(self, start_params=None, reml=True, niter_sa=0,
            do_cg=True, fe_pen=None, cov_pen=None, free=None,
            full_output=False, method=None, **kwargs):
        """
        Fit a linear mixed model to the data.

        Parameters
        ----------
        start_params : array_like or MixedLMParams
            Starting values for the profile log-likelihood. If not a
            `MixedLMParams` instance, this should be an array
            containing the packed parameters for the profile
            log-likelihood, including the fixed effects
            parameters.
        reml : bool
            If true, fit according to the REML likelihood, else
            fit the standard likelihood using ML.
        niter_sa : int
            Currently this argument is ignored and has no effect
            on the results.
        cov_pen : CovariancePenalty object
            A penalty for the random effects covariance matrix
        do_cg : bool, defaults to True
            If False, the optimization is skipped and a results
            object at the given (or default) starting values is
            returned.
        fe_pen : Penalty object
            A penalty on the fixed effects
        free : MixedLMParams object
            If not `None`, this is a mask that allows parameters to be
            held fixed at specified values. A 1 indicates that the
            corresponding parameter is estimated, a 0 indicates that
            it is fixed at its starting value. Setting the `cov_re`
            component to the identity matrix fits a model with
            independent random effects. Note that some optimization
            methods do not respect this constraint (bfgs and lbfgs both
            work).
        full_output : bool
            If true, attach iteration history to results
        method : str or list[str]
            Optimization method. Can be a scipy.optimize method name,
            or a list of such names to be tried in sequence.

        Returns
        -------
        A MixedLMResults instance.
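
        Examples
        --------
        A minimal sketch, assuming a pandas DataFrame `data` with a
        response `y`, a covariate `x`, and a grouping column `g` (the
        column names here are hypothetical):

        >>> import statsmodels.formula.api as smf
        >>> model = smf.mixedlm("y ~ x", data, groups=data["g"])
        >>> result = model.fit(method=["lbfgs"])
        >>> print(result.summary())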
        """

        _allowed_kwargs = ['gtol', 'maxiter', 'eps', 'maxcor', 'ftol',
                           'tol', 'disp', 'maxls']
        for x in kwargs.keys():
            if x not in _allowed_kwargs:
                warnings.warn("Argument %s not used by MixedLM.fit" % x)

        if method is None:
            method = ['bfgs', 'lbfgs', 'cg']
        elif isinstance(method, str):
            method = [method]

        for meth in method:
            if meth.lower() in ["newton", "ncg"]:
                raise ValueError(
                    "method %s not available for MixedLM" % meth)

        self.reml = reml
        self.cov_pen = cov_pen
        self.fe_pen = fe_pen

        self._freepat = free

        if full_output:
            hist = []
        else:
            hist = None

        if start_params is None:
            params = MixedLMParams(self.k_fe, self.k_re, self.k_vc)
            params.fe_params = np.zeros(self.k_fe)
            params.cov_re = np.eye(self.k_re)
            params.vcomp = np.ones(self.k_vc)
        else:
            if isinstance(start_params, MixedLMParams):
                params = start_params
            else:
                # It's a packed array
                if len(start_params) == self.k_fe + self.k_re2 + self.k_vc:
                    params = MixedLMParams.from_packed(
                        start_params, self.k_fe, self.k_re, self.use_sqrt,
                        has_fe=True)
                elif len(start_params) == self.k_re2 + self.k_vc:
                    params = MixedLMParams.from_packed(
                        start_params, self.k_fe, self.k_re, self.use_sqrt,
                        has_fe=False)
                else:
                    raise ValueError("invalid start_params")

        if do_cg:
            kwargs["retall"] = hist is not None
            if "disp" not in kwargs:
                kwargs["disp"] = False
            packed = params.get_packed(use_sqrt=self.use_sqrt, has_fe=False)

            if niter_sa > 0:
                warnings.warn("niter_sa is currently ignored")

            # Try optimizing one or more times
            for j in range(len(method)):
                rslt = super(MixedLM, self).fit(start_params=packed,
                                                skip_hessian=True,
                                                method=method[j],
                                                **kwargs)
                if rslt.mle_retvals['converged']:
                    break
                packed = rslt.params
                if j + 1 < len(method):
                    next_method = method[j + 1]
                    warnings.warn(
                        "Retrying MixedLM optimization with %s" % next_method,
                        ConvergenceWarning)
                else:
                    msg = ("MixedLM optimization failed, " +
                           "trying a different optimizer may help.")
                    warnings.warn(msg, ConvergenceWarning)

            # Optimization finished (it may or may not have converged)
            params = np.atleast_1d(rslt.params)
            if hist is not None:
                hist.append(rslt.mle_retvals)

        converged = rslt.mle_retvals['converged']
        if not converged:
            gn = self.score(rslt.params)
            gn = np.sqrt(np.sum(gn**2))
            msg = "Gradient optimization failed, |grad| = %f" % gn
            warnings.warn(msg, ConvergenceWarning)

        # Convert to the final parameterization (i.e. undo the square
        # root transform of the covariance matrix, and the profiling
        # over the error variance).
        params = MixedLMParams.from_packed(
            params, self.k_fe, self.k_re, use_sqrt=self.use_sqrt,
            has_fe=False)
        cov_re_unscaled = params.cov_re
        vcomp_unscaled = params.vcomp
        fe_params = self.get_fe_params(cov_re_unscaled, vcomp_unscaled)
        params.fe_params = fe_params
        scale = self.get_scale(fe_params, cov_re_unscaled, vcomp_unscaled)
        cov_re = scale * cov_re_unscaled
        vcomp = scale * vcomp_unscaled

        f1 = (self.k_re > 0) and (np.min(np.abs(np.diag(cov_re))) < 0.01)
        f2 = (self.k_vc > 0) and (np.min(np.abs(vcomp)) < 0.01)
        if f1 or f2:
            msg = "The MLE may be on the boundary of the parameter space."
            warnings.warn(msg, ConvergenceWarning)

        # Compute the Hessian at the MLE. Note that this is the
        # Hessian with respect to the random effects covariance matrix
        # (not its square root). It is used for obtaining standard
        # errors, not for optimization.
        hess = self.hessian(params)
        hess_diag = np.diag(hess)
        if free is not None:
            pcov = np.zeros_like(hess)
            pat = self._freepat.get_packed(use_sqrt=False, has_fe=True)
            ii = np.flatnonzero(pat)
            hess_diag = hess_diag[ii]
            if len(ii) > 0:
                hess1 = hess[np.ix_(ii, ii)]
                pcov[np.ix_(ii, ii)] = np.linalg.inv(-hess1)
        else:
            pcov = np.linalg.inv(-hess)
        if np.any(hess_diag >= 0):
            msg = ("The Hessian matrix at the estimated parameter values " +
                   "is not positive definite.")
            warnings.warn(msg, ConvergenceWarning)

        # Prepare a results class instance
        params_packed = params.get_packed(use_sqrt=False, has_fe=True)
        results = MixedLMResults(self, params_packed, pcov / scale)
        results.params_object = params
        results.fe_params = fe_params
        results.cov_re = cov_re
        results.vcomp = vcomp
        results.scale = scale
        results.cov_re_unscaled = cov_re_unscaled
        results.method = "REML" if self.reml else "ML"
        results.converged = converged
        results.hist = hist
        results.reml = self.reml
        results.cov_pen = self.cov_pen
        results.k_fe = self.k_fe
        results.k_re = self.k_re
        results.k_re2 = self.k_re2
        results.k_vc = self.k_vc
        results.use_sqrt = self.use_sqrt
        results.freepat = self._freepat

        return MixedLMResultsWrapper(results)

    def get_distribution(self, params, scale, exog):
        return _mixedlm_distribution(self, params, scale, exog)


class _mixedlm_distribution(object):
    """
    A private class for simulating data from a given mixed linear model.

    Parameters
    ----------
    model : MixedLM instance
        A mixed linear model
    params : array_like
        A parameter vector defining a mixed linear model. See
        notes for more information.
    scale : scalar
        The unexplained variance
    exog : array_like
        An array of fixed effect covariates. If None, model.exog
        is used.

    Notes
    -----
    The params array is a vector containing fixed effects parameters,
    random effects parameters, and variance component parameters, in
    that order. The lower triangle of the random effects covariance
    matrix is stored. The random effects and variance components
    parameters are divided by the scale parameter.

    This class is used in Mediation, and possibly elsewhere.
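
    Examples
    --------
    A hedged sketch, assuming `model` is a MixedLM instance and
    `result` is a fitted MixedLMResults for it:

    >>> packed = result.params_object.get_packed(use_sqrt=False,
    ...                                          has_fe=True)
    >>> gen = model.get_distribution(packed, result.scale, None)
    >>> y_sim = gen.rvs(0)  # the argument to rvs is ignored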
    """

    def __init__(self, model, params, scale, exog):

        self.model = model
        self.exog = exog if exog is not None else model.exog

        po = MixedLMParams.from_packed(
            params, model.k_fe, model.k_re, False, True)

        self.fe_params = po.fe_params
        self.cov_re = scale * po.cov_re
        self.vcomp = scale * po.vcomp
        self.scale = scale

        group_idx = np.zeros(model.nobs, dtype=int)
        for k, g in enumerate(model.group_labels):
            group_idx[model.row_indices[g]] = k
        self.group_idx = group_idx

    def rvs(self, n):
        """
        Return a vector of simulated values from a mixed linear
        model.

        The parameter n is ignored, but required by the interface.
        """

        model = self.model

        # Fixed effects
        y = np.dot(self.exog, self.fe_params)

        # Random effects
        u = np.random.normal(size=(model.n_groups, model.k_re))
        u = np.dot(u, np.linalg.cholesky(self.cov_re).T)
        y += (u[self.group_idx, :] * model.exog_re).sum(1)

        # Variance components
        for j, _ in enumerate(model.exog_vc.names):
            ex = model.exog_vc.mats[j]
            v = self.vcomp[j]
            for i, g in enumerate(model.group_labels):
                exg = ex[i]
                ii = model.row_indices[g]
                u = np.random.normal(size=exg.shape[1])
                y[ii] += np.sqrt(v) * np.dot(exg, u)

        # Residual variance
        y += np.sqrt(self.scale) * np.random.normal(size=len(y))

        return y


class MixedLMResults(base.LikelihoodModelResults, base.ResultMixin):
    '''
    Class to contain results of fitting a linear mixed effects model.

    MixedLMResults inherits from statsmodels.LikelihoodModelResults

    Parameters
    ----------
    See statsmodels.LikelihoodModelResults

    Attributes
    ----------
    model : class instance
        Pointer to MixedLM model instance that called fit.
    normalized_cov_params : ndarray
        The sampling covariance matrix of the estimates
    params : ndarray
        A packed parameter vector for the profile parameterization.
        The first `k_fe` elements are the estimated fixed effects
        coefficients. The remaining elements are the estimated
        variance parameters. The variance parameters are all divided
        by `scale` and are not the variance parameters shown
        in the summary.
    fe_params : ndarray
        The fitted fixed-effects coefficients
    cov_re : ndarray
        The fitted random-effects covariance matrix
    bse_fe : ndarray
        The standard errors of the fitted fixed effects coefficients
    bse_re : ndarray
        The standard errors of the fitted random effects covariance
        matrix and variance components. The first
        `k_re * (k_re + 1) / 2` parameters are the standard errors for
        the lower triangle of `cov_re`, the remaining elements are the
        standard errors for the variance components.

    See Also
    --------
    statsmodels.LikelihoodModelResults
    '''

    def __init__(self, model, params, cov_params):

        super(MixedLMResults, self).__init__(model, params,
                                             normalized_cov_params=cov_params)
        self.nobs = self.model.nobs
        self.df_resid = self.nobs - np.linalg.matrix_rank(self.model.exog)

    @cache_readonly
    def fittedvalues(self):
        """
        Returns the fitted values for the model.

        The fitted values reflect the mean structure specified by the
        fixed effects and the predicted random effects.
        """
        fit = np.dot(self.model.exog, self.fe_params)
        re = self.random_effects
        for group_ix, group in enumerate(self.model.group_labels):
            ix = self.model.row_indices[group]

            mat = []
            if self.model.exog_re_li is not None:
                mat.append(self.model.exog_re_li[group_ix])
            for j in range(self.k_vc):
                mat.append(self.model.exog_vc.mats[j][group_ix])
            mat = np.concatenate(mat, axis=1)

            fit[ix] += np.dot(mat, re[group])

        return fit

    @cache_readonly
    def resid(self):
        """
        Returns the residuals for the model.

        The residuals reflect the mean structure specified by the
        fixed effects and the predicted random effects.
        """
        return self.model.endog - self.fittedvalues

    @cache_readonly
    def bse_fe(self):
        """
        Returns the standard errors of the fixed effect regression
        coefficients.
        """
        p = self.model.exog.shape[1]
        return np.sqrt(np.diag(self.cov_params())[0:p])

    @cache_readonly
    def bse_re(self):
        """
        Returns the standard errors of the variance parameters.

        The first `k_re * (k_re + 1) / 2` elements of the returned
        array are the standard errors of the lower triangle of
        `cov_re`. The remaining elements are the standard errors of
        the variance components.

        Note that the sampling distribution of variance parameters is
        strongly skewed unless the sample size is large, so these
        standard errors may not give meaningful confidence intervals
        or p-values if used in the usual way.
        """
        p = self.model.exog.shape[1]
        return np.sqrt(self.scale * np.diag(self.cov_params())[p:])

    def _expand_re_names(self, group_ix):
        names = list(self.model.data.exog_re_names)

        for j, v in enumerate(self.model.exog_vc.names):
            vg = self.model.exog_vc.colnames[j][group_ix]
            na = ["%s[%s]" % (v, s) for s in vg]
            names.extend(na)

        return names

    @cache_readonly
    def random_effects(self):
        """
        The conditional means of random effects given the data.

        Returns
        -------
        random_effects : dict
            A dictionary mapping the distinct `group` values to the
            conditional means of the random effects for the group
            given the data.
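
        Examples
        --------
        A sketch, assuming `result` is a fitted MixedLMResults:

        >>> re = result.random_effects
        >>> for label, vals in re.items():
        ...     print(label, vals)  # one pandas Series per group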
        """
        try:
            cov_re_inv = np.linalg.inv(self.cov_re)
        except np.linalg.LinAlgError:
            raise ValueError("Cannot predict random effects from " +
                             "singular covariance structure.")

        vcomp = self.vcomp
        k_re = self.k_re

        ranef_dict = {}
        for group_ix, group in enumerate(self.model.group_labels):

            endog = self.model.endog_li[group_ix]
            exog = self.model.exog_li[group_ix]
            ex_r = self.model._aex_r[group_ix]
            ex2_r = self.model._aex_r2[group_ix]
            vc_var = self.model._expand_vcomp(vcomp, group_ix)

            # Get the residuals relative to fixed effects
            resid = endog
            if self.k_fe > 0:
                expval = np.dot(exog, self.fe_params)
                resid = resid - expval

            solver = _smw_solver(self.scale, ex_r, ex2_r, cov_re_inv,
                                 1 / vc_var)
            vir = solver(resid)

            xtvir = _dot(ex_r.T, vir)

            xtvir[0:k_re] = np.dot(self.cov_re, xtvir[0:k_re])
            xtvir[k_re:] *= vc_var
            ranef_dict[group] = pd.Series(
                xtvir, index=self._expand_re_names(group_ix))

        return ranef_dict

    @cache_readonly
    def random_effects_cov(self):
        """
        Returns the conditional covariance matrix of the random
        effects for each group given the data.

        Returns
        -------
        random_effects_cov : dict
            A dictionary mapping the distinct values of the `group`
            variable to the conditional covariance matrix of the
            random effects given the data.
        """

        try:
            cov_re_inv = np.linalg.inv(self.cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        vcomp = self.vcomp

        ranef_dict = {}
        for group_ix in range(self.model.n_groups):

            ex_r = self.model._aex_r[group_ix]
            ex2_r = self.model._aex_r2[group_ix]
            label = self.model.group_labels[group_ix]
            vc_var = self.model._expand_vcomp(vcomp, group_ix)

            solver = _smw_solver(self.scale, ex_r, ex2_r, cov_re_inv,
                                 1 / vc_var)

            n = ex_r.shape[0]
            m = self.cov_re.shape[0]
            mat1 = np.empty((n, m + len(vc_var)))
            mat1[:, 0:m] = np.dot(ex_r[:, 0:m], self.cov_re)
            mat1[:, m:] = np.dot(ex_r[:, m:], np.diag(vc_var))
            mat2 = solver(mat1)
            mat2 = np.dot(mat1.T, mat2)

            v = -mat2
            v[0:m, 0:m] += self.cov_re
            ix = np.arange(m, v.shape[0])
            v[ix, ix] += vc_var
            na = self._expand_re_names(group_ix)
            v = pd.DataFrame(v, index=na, columns=na)
            ranef_dict[label] = v

        return ranef_dict

    # Need to override since t-tests are only used for fixed effects
    # parameters.
    def t_test(self, r_matrix, scale=None, use_t=None):
        """
        Compute a t-test for each linear hypothesis of the form Rb = q.

        Parameters
        ----------
        r_matrix : array_like
            If an array is given, a p x k 2d array or length k 1d
            array specifying the linear restrictions. It is assumed
            that the linear combination is equal to zero.
        scale : float, optional
            An optional `scale` to use. Default is the scale specified
            by the model fit.
        use_t : bool, optional
            If use_t is None, then the default of the model is used.
            If use_t is True, then the p-values are based on the t
            distribution.
            If use_t is False, then the p-values are based on the normal
            distribution.

        Returns
        -------
        res : ContrastResults instance
            The results for the test are attributes of this results instance.
            The available results have the same elements as the parameter table
            in `summary()`.
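
        Examples
        --------
        Test each fixed effects coefficient against zero (a sketch,
        assuming `result` is a fitted MixedLMResults):

        >>> import numpy as np
        >>> R = np.eye(result.k_fe)
        >>> print(result.t_test(R))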
        """
        if scale is not None:
            import warnings
            warnings.warn('scale has no effect and is deprecated. It will '
                          'be removed in the next version.',
                          DeprecationWarning)

        if r_matrix.shape[1] != self.k_fe:
            raise ValueError("r_matrix for t-test should have %d columns"
                             % self.k_fe)

        d = self.k_re2 + self.k_vc
        z0 = np.zeros((r_matrix.shape[0], d))
        r_matrix = np.concatenate((r_matrix, z0), axis=1)
        tst_rslt = super(MixedLMResults, self).t_test(r_matrix, use_t=use_t)
        return tst_rslt

    def summary(self, yname=None, xname_fe=None, xname_re=None,
                title=None, alpha=.05):
        """
        Summarize the mixed model regression results.

        Parameters
        ----------
        yname : str, optional
            Default is `y`
        xname_fe : list[str], optional
            Fixed effects covariate names
        xname_re : list[str], optional
            Random effects covariate names
        title : str, optional
            Title for the top table. If not None, then this replaces
            the default title
        alpha : float
            significance level for the confidence intervals

        Returns
        -------
        smry : Summary instance
            this holds the summary tables and text, which can be
            printed or converted to various output formats.

        See Also
        --------
        statsmodels.iolib.summary2.Summary : class to hold summary results
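
        Examples
        --------
        A sketch, assuming `result` is a fitted MixedLMResults with
        two fixed effects parameters (the names are hypothetical):

        >>> print(result.summary(xname_fe=["Intercept", "Treatment"]))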
        """

        from statsmodels.iolib import summary2
        smry = summary2.Summary()

        info = OrderedDict()
        info["Model:"] = "MixedLM"
        if yname is None:
            yname = self.model.endog_names

        param_names = self.model.data.param_names[:]
        k_fe_params = len(self.fe_params)
        k_re_params = len(param_names) - len(self.fe_params)

        if xname_fe is not None:
            if len(xname_fe) != k_fe_params:
                msg = "xname_fe should be a list of length %d" % k_fe_params
                raise ValueError(msg)
            param_names[:k_fe_params] = xname_fe

        if xname_re is not None:
            if len(xname_re) != k_re_params:
                msg = "xname_re should be a list of length %d" % k_re_params
                raise ValueError(msg)
            param_names[k_fe_params:] = xname_re

        info["No. Observations:"] = str(self.model.n_totobs)
        info["No. Groups:"] = str(self.model.n_groups)

        gs = np.array([len(x) for x in self.model.endog_li])
        info["Min. group size:"] = "%.0f" % min(gs)
        info["Max. group size:"] = "%.0f" % max(gs)
        info["Mean group size:"] = "%.1f" % np.mean(gs)

        info["Dependent Variable:"] = yname
        info["Method:"] = self.method
        info["Scale:"] = self.scale
        info["Log-Likelihood:"] = self.llf
        info["Converged:"] = "Yes" if self.converged else "No"
        smry.add_dict(info)
        smry.add_title("Mixed Linear Model Regression Results")

        float_fmt = "%.3f"

        sdf = np.nan * np.ones((self.k_fe + self.k_re2 + self.k_vc, 6))

        # Coefficient estimates
        sdf[0:self.k_fe, 0] = self.fe_params

        # Standard errors
        sdf[0:self.k_fe, 1] = np.sqrt(np.diag(self.cov_params())[0:self.k_fe])

        # Z-scores
        sdf[0:self.k_fe, 2] = sdf[0:self.k_fe, 0] / sdf[0:self.k_fe, 1]

        # p-values
        sdf[0:self.k_fe, 3] = 2 * norm.cdf(-np.abs(sdf[0:self.k_fe, 2]))

        # Confidence intervals
        qm = -norm.ppf(alpha / 2)
        sdf[0:self.k_fe, 4] = sdf[0:self.k_fe, 0] - qm * sdf[0:self.k_fe, 1]
        sdf[0:self.k_fe, 5] = sdf[0:self.k_fe, 0] + qm * sdf[0:self.k_fe, 1]

        # All random effects variances and covariances
        jj = self.k_fe
        for i in range(self.k_re):
            for j in range(i + 1):
                sdf[jj, 0] = self.cov_re[i, j]
                sdf[jj, 1] = np.sqrt(self.scale) * self.bse[jj]
                jj += 1

        # Variance components
        for i in range(self.k_vc):
            sdf[jj, 0] = self.vcomp[i]
            sdf[jj, 1] = np.sqrt(self.scale) * self.bse[jj]
            jj += 1

        sdf = pd.DataFrame(index=param_names, data=sdf)
        sdf.columns = ['Coef.', 'Std.Err.', 'z', 'P>|z|',
                       '[' + str(alpha/2), str(1-alpha/2) + ']']
        for col in sdf.columns:
            sdf[col] = [float_fmt % x if np.isfinite(x) else ""
                        for x in sdf[col]]

        smry.add_df(sdf, align='r')

        return smry

    @cache_readonly
    def llf(self):
        return self.model.loglike(self.params_object, profile_fe=False)

    @cache_readonly
    def aic(self):
        """Akaike information criterion"""
        if self.reml:
            return np.nan
        if self.freepat is not None:
            df = self.freepat.get_packed(use_sqrt=False, has_fe=True).sum() + 1
        else:
            df = self.params.size + 1
        return -2 * (self.llf - df)

    @cache_readonly
    def bic(self):
        """Bayesian information criterion"""
        if self.reml:
            return np.nan
        if self.freepat is not None:
            df = self.freepat.get_packed(use_sqrt=False, has_fe=True).sum() + 1
        else:
            df = self.params.size + 1
        return -2 * self.llf + np.log(self.nobs) * df

    def profile_re(self, re_ix, vtype, num_low=5, dist_low=1., num_high=5,
                   dist_high=1.):
        """
        Profile-likelihood inference for variance parameters.

        Parameters
        ----------
        re_ix : int
            If vtype is `re`, this value is the index of the variance
            parameter for which to construct a profile likelihood. If
            `vtype` is 'vc' then `re_ix` is the name of the variance
            parameter to be profiled.
        vtype : str
            Either 're' or 'vc', depending on whether the profile
            analysis is for a random effect or a variance component.
        num_low : int
            The number of points at which to calculate the likelihood
            below the MLE of the parameter of interest.
        dist_low : float
            The distance below the MLE of the parameter of interest to
            begin calculating points on the profile likelihood.
        num_high : int
            The number of points at which to calculate the likelihood
            above the MLE of the parameter of interest.
        dist_high : float
            The distance above the MLE of the parameter of interest to
            begin calculating points on the profile likelihood.

        Returns
        -------
        An array with two columns. The first column contains the
        values to which the parameter of interest is constrained. The
        second column contains the corresponding likelihood values.

        Notes
        -----
        Only variance parameters can be profiled.
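
        Examples
        --------
        A sketch, assuming `result` is a fitted MixedLMResults with at
        least one standard random effect:

        >>> prof = result.profile_re(0, 're', dist_low=0.1, dist_high=0.1)

        The first column of `prof` holds the constrained variance
        values, the second the corresponding profile log-likelihoods.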
        """

        pmodel = self.model
        k_fe = pmodel.k_fe
        k_re = pmodel.k_re
        k_vc = pmodel.k_vc
        endog, exog = pmodel.endog, pmodel.exog

        # Need to permute the columns of the random effects design
        # matrix so that the profiled variable is in the first column.
        if vtype == 're':
            ix = np.arange(k_re)
            ix[0] = re_ix
            ix[re_ix] = 0
            exog_re = pmodel.exog_re.copy()[:, ix]

            # Permute the covariance structure to match the permuted
            # design matrix.
            params = self.params_object.copy()
            cov_re_unscaled = params.cov_re
            cov_re_unscaled = cov_re_unscaled[np.ix_(ix, ix)]
            params.cov_re = cov_re_unscaled
            ru0 = cov_re_unscaled[0, 0]

            # Convert dist_low and dist_high to the profile
            # parameterization
            cov_re = self.scale * cov_re_unscaled
            low = (cov_re[0, 0] - dist_low) / self.scale
            high = (cov_re[0, 0] + dist_high) / self.scale

        elif vtype == 'vc':
            re_ix = self.model.exog_vc.names.index(re_ix)
            params = self.params_object.copy()
            vcomp = self.vcomp
            low = (vcomp[re_ix] - dist_low) / self.scale
            high = (vcomp[re_ix] + dist_high) / self.scale
            ru0 = vcomp[re_ix] / self.scale

        # Define the sequence of values to which the parameter of
        # interest will be constrained.
        if low <= 0:
            raise ValueError("dist_low is too large and would result in a "
                             "negative variance. Try a smaller value.")
        left = np.linspace(low, ru0, num_low + 1)
        right = np.linspace(ru0, high, num_high+1)[1:]
        rvalues = np.concatenate((left, right))

        # Indicators of which parameters are free and fixed.
        free = MixedLMParams(k_fe, k_re, k_vc)
        if self.freepat is None:
            free.fe_params = np.ones(k_fe)
            vcomp = np.ones(k_vc)
            mat = np.ones((k_re, k_re))
        else:
            # If a freepat already has been specified, we add the
            # constraint to it.
            free.fe_params = self.freepat.fe_params
            vcomp = self.freepat.vcomp
            mat = self.freepat.cov_re
            if vtype == 're':
                mat = mat[np.ix_(ix, ix)]
        if vtype == 're':
            mat[0, 0] = 0
        else:
            vcomp[re_ix] = 0
        free.cov_re = mat
        free.vcomp = vcomp

        klass = self.model.__class__
        init_kwargs = pmodel._get_init_kwds()
        if vtype == 're':
            init_kwargs['exog_re'] = exog_re

        likev = []
        for x in rvalues:

            model = klass(endog, exog, **init_kwargs)

            if vtype == 're':
                cov_re = params.cov_re.copy()
                cov_re[0, 0] = x
                params.cov_re = cov_re
            else:
                params.vcomp[re_ix] = x

            # TODO: should use fit_kwargs
            rslt = model.fit(start_params=params, free=free,
                             reml=self.reml, cov_pen=self.cov_pen)._results
            likev.append([x * rslt.scale, rslt.llf])

        likev = np.asarray(likev)

        return likev


class MixedLMResultsWrapper(base.LikelihoodResultsWrapper):
    _attrs = {'bse_re': ('generic_columns', 'exog_re_names_full'),
              'fe_params': ('generic_columns', 'xnames'),
              'bse_fe': ('generic_columns', 'xnames'),
              'cov_re': ('generic_columns_2d', 'exog_re_names'),
              'cov_re_unscaled': ('generic_columns_2d', 'exog_re_names'),
              }
    _upstream_attrs = base.LikelihoodResultsWrapper._wrap_attrs
    _wrap_attrs = base.wrap.union_dicts(_attrs, _upstream_attrs)

    _methods = {}
    _upstream_methods = base.LikelihoodResultsWrapper._wrap_methods
    _wrap_methods = base.wrap.union_dicts(_methods, _upstream_methods)


def _handle_missing(data, groups, formula, re_formula, vc_formula):

    tokens = set([])

    forms = [formula]
    if re_formula is not None:
        forms.append(re_formula)
    if vc_formula is not None:
        forms.extend(vc_formula.values())

    import tokenize
    from io import StringIO
    from statsmodels.compat.python import asunicode
    skiptoks = {"(", ")", "*", ":", "+", "-", "**", "/"}

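    # Tokenize each formula and collect the variable names that it
    # references; operator tokens are skipped so that only column names
    # survive the intersection with data.columns below.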
    for fml in forms:
        # Unicode conversion is for Py2 compatibility
        rl = StringIO(fml)

        def rlu():
            line = rl.readline()
            return asunicode(line, 'ascii')
        g = tokenize.generate_tokens(rlu)
        for tok in g:
            if tok.string not in skiptoks:
                tokens.add(tok.string)
    tokens = sorted(tokens & set(data.columns))

    data = data[tokens]
    ii = pd.notnull(data).all(1)
    if not isinstance(groups, str):
        ii &= pd.notnull(groups)

    return data.loc[ii, :], groups[np.asarray(ii)]