A common technique for dealing with time-varying covariates is to cut a spell into two pieces at the duration where a change in the covariate value occurs. The first piece is right censored and carries the old value of the covariate, while the second piece is left truncated and carries the new value. Thus, with many time-varying covariates, even moderately sized data sets may grow to be huge, with many left truncated and right censored spells, causing the same kind of problems as genuinely huge data sets. Typically, a data set may contain five to ten times as many spells after this splitting procedure has taken place.
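As an illustration, the splitting step might be sketched as follows; the spell representation (a dict with entry time, exit time, event indicator, and covariate value) and the function name are hypothetical, not taken from any particular package:

```python
def split_spell(spell, change_time, new_value):
    """Split one spell at the duration where a covariate changes value.

    Returns two spells: the first right censored at the change time with
    the old covariate value, the second left truncated at the change time
    with the new value. (Hypothetical representation for illustration.)
    """
    assert spell["enter"] < change_time < spell["exit"]
    first = {"enter": spell["enter"],
             "exit": change_time,
             "event": 0,               # right censored at the change time
             "x": spell["x"]}          # old covariate value
    second = {"enter": change_time,    # left truncated at the change time
              "exit": spell["exit"],
              "event": spell["event"], # original event/censoring status
              "x": new_value}          # new covariate value
    return first, second
```

Applied to a spell observed on (0, 5] with an event at exit and a covariate switching from 0 to 1 at duration 2, this yields one censored spell on (0, 2] with x = 0 and one left-truncated spell on (2, 5] with x = 1, which is how one spell becomes two rows in the split data set.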
Some techniques for dealing with huge data sets will be discussed and illustrated. Sampling (and resampling) is an obvious method to reduce (and increase again!) the computational burden, for instance sampling of individuals ([1], [3], [4]) and of risk sets. Simple random sampling of individual life histories may be less satisfactory if the terminal event is rare; in this case a matching approach may be fruitful ([2]).
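Risk-set sampling can be sketched as follows: for each observed event, the case is kept together with a small number of controls drawn at random from its risk set. The data representation (tuples of entry time, exit time, event indicator, identifier) and the function name are assumptions made for this sketch:

```python
import random

def sample_risk_sets(spells, m, seed=0):
    """For each observed event, keep the case plus at most m controls
    drawn at random from its risk set.

    `spells` is a list of (enter, exit, event, id) tuples (hypothetical
    representation).  The risk set at an event time t contains everyone
    with enter < t <= exit.
    """
    rng = random.Random(seed)
    sampled = []
    for enter, exit_, event, ident in spells:
        if not event:
            continue
        t = exit_  # the event occurs at the exit time of this spell
        risk_set = [s for s in spells
                    if s[0] < t <= s[1] and s[3] != ident]
        controls = rng.sample(risk_set, min(m, len(risk_set)))
        sampled.append((ident, [c[3] for c in controls]))
    return sampled
```

The reduction in size is what matters: instead of comparing each case with the full risk set, the estimation only touches the case and its m sampled controls, so the cost per event is bounded regardless of cohort size.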
The technique of viewing the development of the cohort as a marked point process (with two types of marks: indication of new entry, and terminal event or censoring, together with covariate information) is computationally very efficient when there are no external time-dependent covariates and censoring is not too heavy. However, with time-varying internal and external covariates, a static approach based on risk sets is faster, provided enough computer memory is available.
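The point-process view amounts to sweeping through the entry and exit marks in time order while maintaining the current risk set incrementally, rather than materializing a separate risk set per event. A minimal sketch, computing risk-set sizes at event times and assuming no exact ties between entries and exits (the triple representation is hypothetical):

```python
def risk_set_sizes(spells):
    """Sweep through entry and exit marks in time order, maintaining the
    current risk-set size, and record it at each terminal event.

    `spells` is a list of (enter, exit, event) triples (hypothetical
    representation); the risk set at time t is {enter < t <= exit}.
    Assumes no entry occurs at exactly the same time as an exit.
    """
    marks = []
    for enter, exit_, event in spells:
        marks.append((enter, 0, None))   # mark type 0: new entry
        marks.append((exit_, 1, event))  # mark type 1: event or censoring
    marks.sort()
    at_risk = 0
    sizes = {}
    for t, kind, event in marks:
        if kind == 0:
            at_risk += 1
        else:
            if event:
                sizes[t] = at_risk  # exiting subject still counts at t
            at_risk -= 1
    return sizes
```

One pass over the sorted marks suffices, which is why the dynamic view is so cheap in memory; the static alternative instead stores (or indexes) the full risk set at each event time, trading memory for speed when covariates vary over time.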
[2] Broström, G. (1987). The influence of mother's mortality on infant mortality: A case study in matched data survival analysis. Scandinavian Journal of Statistics 14, 113-123.
[3] Cox, D.R. and Oakes, D. (1984). Analysis of Survival Data. Chapman and Hall, London.
[4] Liddell, F.D.K., McDonald, J.C. and Thomas, D.C. (1977). Methods of cohort analysis: Appraisal by application to asbestos mining. Journal of the Royal Statistical Society A 140, 469-491.