The Statistical Analysis of Doubly Truncated Data. Prof Carla Moreira

ISBN 9781119500476




Leukemia            107   6.30 (4.15)
Lymphoma             57   8.66 (4.39)
N. System Tumour     94   6.38 (4.29)
Neuroblastoma        38   3.16 (3.47)
Other               105   6.87 (4.70)
Missing               5   3.92 (5.18)

      

      1.4.2 AIDS Blood Transfusion Data

Kalbfleisch and Lawless (1989) reported 494 cases of transfusion‐related AIDS, corresponding to individuals diagnosed prior to 1 July 1986. The variable of ultimate interest, $X$, is the induction or incubation time, which is the time elapsed from HIV infection to AIDS. Importantly, HIV was unknown before 1982; this implies that cases developing AIDS prior to this date were not reported. Let $V$ denote the time from HIV infection to 1 July 1986 (in months), and introduce $U = V - 54$; then, due to the interval sampling, only triplets $(U, X, V)$ satisfying $U \le X \le V$ were observed (Bilker and Wang, 1996). We restrict our analysis to the $n = 295$ cases with consistent data, for which the infection could be attributed to a single transfusion or a short series of transfusions. This dataset is fully reported in Kalbfleisch and Lawless (1989), p. 361.

The observed values of $X$ range from 0.5 to 89 (months), while $U$ ranges from […] to 45.5. This suggests that the lower limit of the support of $U$ is about […], while the upper limit of the support of $V$ is about 99.5. As discussed in Chapter 2, in such a case the distribution of the incubation time $X$ is identifiable on the interval […] (months). The AIDS Blood Transfusion Data also includes information on the age of the individual at infection; see Table 1.2.

Table 1.2 Number of cases ($n$) and mean (and standard deviation, SD) for the incubation time (months) by age at infection.

Age group       n     Mean (SD)
< 30 years      56    27.09 (18.28)
30–60 years     104   33.80 (18.95)
> 60 years      135   32.46 (16.74)
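The interval sampling behind the AIDS data can be illustrated with a small simulation. This is only a sketch under stated assumptions: the notation ($X$ for the incubation time, $V$ for the months from infection to 1 July 1986, $U = V - 54$) is assumed here, and the distributions used to generate $V$ and $X$ are hypothetical choices for illustration, not taken from the book or from the Kalbfleisch and Lawless data.

```python
import random

WINDOW_MONTHS = 54  # 1 January 1982 to 1 July 1986, the reporting window implied by the text


def interval_sample(n_population, seed=0):
    """Simulate interval sampling: keep a case only if U <= X <= V."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n_population):
        # Hypothetical inputs (assumptions, not from the book):
        # v = months from infection to the end of the window,
        # x = incubation time in months.
        v = rng.uniform(0.0, 100.0)
        x = rng.expovariate(1.0 / 60.0)
        # Left-truncation time; negative when infection falls inside the window.
        u = v - WINDOW_MONTHS
        if u <= x <= v:  # AIDS diagnosed inside the reporting window
            observed.append((u, x, v))
    return observed


sample = interval_sample(10_000)
print(len(sample), "of 10000 simulated cases fall inside the window")
```

Cases with long incubation times relative to $V$, or short ones relative to $U$, never enter the sample, which is why the observed incubation times are not a random sample from the incubation time distribution.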

      

      1.4.3 Equipment‐S Rounded Failure Time Data

and
for the units installed in the field. This field lifetime distribution is, however, doubly truncated because of the interval sampling. The Equipment‐S data (Ye and Tang, 2016) concern failures of a certain device (details are not given due to confidentiality issues) recorded between 1996 and 2011, a 15‐year observational window. Information on the date of installation and the date of failure, rounded to years, was obtained by digitizing Figure 2 in that paper. This dataset is therefore a discrete version of the original data in Ye and Tang (2016). In this example the right‐truncation time $V$ is the number of years between installation and 2011, while the left‐truncation time is just $U = V - 15$. Table 1.3 summarizes the Equipment‐S failure times.
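As a rough illustration of how the truncation times arise from the 1996 to 2011 window, the sketch below computes them from an installation year and checks whether a failure would be recorded. The symbols $U$, $V$ and $X$ (lifetime in years), the function names, and the example units are all assumptions made for illustration; none of the values come from the Equipment‐S data.

```python
WINDOW_START, WINDOW_END = 1996, 2011  # observational window stated in the text


def truncation_times(install_year):
    """Return (U, V) in whole years for a unit installed in `install_year`."""
    v = WINDOW_END - install_year        # right-truncation time
    u = v - (WINDOW_END - WINDOW_START)  # left-truncation time, U = V - 15
    return u, v


def is_observed(install_year, lifetime_years):
    """A failure is recorded only if U <= X <= V."""
    u, v = truncation_times(install_year)
    return u <= lifetime_years <= v


# Hypothetical units as (install year, lifetime in years); not real data.
units = [(1980, 10), (1980, 20), (1990, 25), (2000, 3), (2005, 9)]
for install, life in units:
    print(install, life, install + life, is_observed(install, life))
```

The condition $U \le X \le V$ is just another way of saying that the failure year, installation year plus lifetime, falls between 1996 and 2011.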

      The observable range for the Equipment‐S failure times goes from zero to 34 years, which is the maximum observed value for the right‐truncating variable $V$; hence, estimation of the reliability can only be done conditional on failure within the first