Tuesday, September 13, 2011

Y-axis on both sides in Kaplan-Meier Survival Curve

Here is an example of how to draw the Y-axis on both sides of a Kaplan-Meier survival curve, with the same labels, axis titles, and tick positions. This should work for general plots as well.


Wednesday, July 20, 2011

Nested case-control study

A nested case-control study can be described as follows: for a particular disease, all subjects in a given cohort who develop the disease are labeled "cases". For each case, a pre-specified number (say, 4) of "controls" are then matched from the subjects who are still healthy at the time the case's disease occurred, irrespective of whether these subjects become cases at a later time. The design is attractive because cost can be reduced substantially with only a small loss of statistical efficiency compared with analyzing the whole cohort. More can be found here.
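The sampling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_nested_case_control`, the tuple layout of `cohort`, and the toy data are all hypothetical, matching is on time at risk only (no other matching factors), and ties in follow-up time are excluded from the risk set.

```python
import random


def sample_nested_case_control(cohort, m=4, seed=0):
    """Draw up to m controls per case from the risk set at the case's event time.

    cohort: list of (subject_id, follow_up_time, is_case), where follow_up_time
    is the time of disease for cases and the end of follow-up for everyone else.
    """
    rng = random.Random(seed)
    matched_sets = []
    for case_id, case_time, is_case in cohort:
        if not is_case:
            continue
        # Risk set: every other subject still disease-free and under follow-up
        # when this case occurred. Subjects who become cases later are eligible.
        risk_set = [sid for sid, t, _ in cohort if sid != case_id and t > case_time]
        controls = rng.sample(risk_set, min(m, len(risk_set)))
        matched_sets.append((case_id, controls))
    return matched_sets


# Hypothetical toy cohort: two cases (s1, s3) and six non-cases.
cohort = [
    ("s1", 2.0, True),
    ("s2", 5.0, False),
    ("s3", 3.5, True),
    ("s4", 6.0, False),
    ("s5", 4.0, False),
    ("s6", 1.0, False),
    ("s7", 7.0, False),
    ("s8", 8.0, False),
]

for case_id, controls in sample_nested_case_control(cohort, m=4, seed=1):
    print(case_id, sorted(controls))
```

Note that s3, who becomes a case at time 3.5, may be drawn as a control for s1 (time 2.0), while s6, whose follow-up ended at time 1.0, can never be a control for either case.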

Wednesday, January 26, 2011

Conditional estimates for OR

When two independent samples are collected from separate populations, a product-binomial likelihood can be used. Conditioning on the margins, however, leads to a different formulation, as follows (ψ, "psi", being the population OR):
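For reference, the standard form of this conditional distribution is sketched here. It assumes x1 ~ Binomial(n1, π1) and x2 ~ Binomial(n2, π2) independently, with m1 = x1 + x2 the fixed margin; the summation index u is introduced only for illustration.

```latex
\psi = \frac{\pi_1 (1 - \pi_2)}{\pi_2 (1 - \pi_1)}, \qquad
P(X_1 = x_1 \mid X_1 + X_2 = m_1;\ \psi)
  = \frac{\binom{n_1}{x_1} \binom{n_2}{m_1 - x_1}\, \psi^{x_1}}
         {\sum_{u = \max(0,\, m_1 - n_2)}^{\min(n_1,\, m_1)}
            \binom{n_1}{u} \binom{n_2}{m_1 - u}\, \psi^{u}}
```

The nuisance parameters drop out, so the conditional likelihood depends on ψ alone; at ψ = 1 it reduces to the ordinary hypergeometric distribution.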

Why log(RR) instead of RR?

As an example of how the variances of the different measures are calculated, the derivation of the approximate variance (or SE) of the RR is shown below, using the multivariate delta method:
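A sketch of that derivation, assuming two independent binomial samples with estimated risks \hat{\pi}_i = x_i / n_i and \widehat{RR} = \hat{\pi}_1 / \hat{\pi}_2 (notation chosen here for illustration):

```latex
\begin{aligned}
\operatorname{Var}(\hat{\pi}_i) &= \frac{\pi_i (1 - \pi_i)}{n_i}, \qquad
\operatorname{Cov}(\hat{\pi}_1, \hat{\pi}_2) = 0, \\[4pt]
g(\pi_1, \pi_2) &= \log \pi_1 - \log \pi_2, \qquad
\nabla g = \left( \frac{1}{\pi_1},\ -\frac{1}{\pi_2} \right), \\[4pt]
\operatorname{Var}\!\left( \log \widehat{RR} \right)
  &\approx \frac{1}{\pi_1^2} \cdot \frac{\pi_1 (1 - \pi_1)}{n_1}
         + \frac{1}{\pi_2^2} \cdot \frac{\pi_2 (1 - \pi_2)}{n_2}
   = \frac{1 - \pi_1}{n_1 \pi_1} + \frac{1 - \pi_2}{n_2 \pi_2}, \\[4pt]
\widehat{\operatorname{Var}}\!\left( \log \widehat{RR} \right)
  &= \frac{1}{x_1} - \frac{1}{n_1} + \frac{1}{x_2} - \frac{1}{n_2}, \\[4pt]
\operatorname{Var}\!\left( \widehat{RR} \right)
  &\approx RR^2 \left( \frac{1 - \pi_1}{n_1 \pi_1} + \frac{1 - \pi_2}{n_2 \pi_2} \right).
\end{aligned}
```

The variance on the log scale does not involve RR itself, and the sampling distribution of log(RR) is typically closer to normal than that of RR, which is the usual argument for working with log(RR).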


Comparing Risks

To evaluate whether an exposure contributes at all to the risk of disease, the simplest form of comparison is set up as follows:
x1 ~ Binomial(n1, pi1)
x2 ~ Binomial(n2, pi2)
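Given the two binomial samples above, the risks can be compared on several scales. The following is a minimal sketch; the function name `compare_risks` and the example counts are hypothetical, and the three measures shown (risk difference, relative risk, odds ratio) are simply the ones most commonly reported.

```python
def compare_risks(x1, n1, x2, n2):
    """Point estimates for comparing two risks from binomial counts."""
    p1 = x1 / n1  # estimated risk in the exposed group
    p2 = x2 / n2  # estimated risk in the unexposed group
    return {
        "risk_difference": p1 - p2,
        "relative_risk": p1 / p2,
        "odds_ratio": (p1 / (1 - p1)) / (p2 / (1 - p2)),
    }


# Hypothetical counts: 30 of 100 exposed and 15 of 100 unexposed develop disease.
measures = compare_risks(30, 100, 15, 100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

No association corresponds to a risk difference of 0 and to a relative risk and odds ratio of 1.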

Transformation for CI construction

To facilitate the construction of confidence intervals, it is common to transform parameters. The delta method can be used to derive the approximate variance in that case. Here is the general theory for log-transformed parameters:
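A sketch of the standard result, assuming a positive parameter θ with an approximately normal estimator \hat{\theta} (symbols chosen here for illustration):

```latex
\begin{aligned}
\operatorname{Var}\!\left( \log \hat{\theta} \right)
  &\approx \left( \frac{d \log \theta}{d \theta} \right)^{2} \operatorname{Var}(\hat{\theta})
   = \frac{\operatorname{Var}(\hat{\theta})}{\theta^{2}}, \\[4pt]
\text{CI for } \log \theta &:\quad
  \log \hat{\theta} \pm z_{1 - \alpha/2}\, \frac{\widehat{SE}(\hat{\theta})}{\hat{\theta}}, \\[4pt]
\text{CI for } \theta &:\quad
  \exp\!\left( \log \hat{\theta} \pm z_{1 - \alpha/2}\, \frac{\widehat{SE}(\hat{\theta})}{\hat{\theta}} \right).
\end{aligned}
```

Because the interval is built on the log scale and then exponentiated, its limits are always positive, which suits ratio measures such as RR and OR.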

Inference for proportions in small samples

Large-sample approximations do not work well in small samples. Exact methods are needed for reliable inference in that case.

Since we know that X (the count of positive responses) follows a binomial distribution, we can carry out exact inference without a normal approximation, by calculating confidence intervals from the quantiles of the binomial distribution.
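One common way to do this is the Clopper-Pearson interval, which inverts the two binomial tail probabilities. The sketch below assumes that this is the kind of exact interval intended; the function names `binom_cdf` and `exact_ci` are hypothetical, and the limits are found by simple bisection rather than by a library quantile function.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) = target, for f monotonically decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion (x successes in n trials)."""
    # Lower limit: the p for which P(X >= x) = alpha / 2.
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2
    )
    # Upper limit: the p for which P(X <= x) = alpha / 2.
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2
    )
    return lower, upper


# Hypothetical small sample: 3 positive responses out of 10.
print(exact_ci(3, 10))
```

Unlike a normal-approximation interval, these limits always stay inside [0, 1], and the interval remains usable when x is 0 or n.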

Inference for proportions in a large sample

From the following 2x2 table, we want to draw inferences about the population risks:



|                | outcome + | outcome - | Marginal total |
| exposure +     | a         | b         | n1             |
| exposure -     | c         | d         | n2             |
| Marginal total | m1        | m2        | N              |
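With large samples, each risk in the table can be estimated with a normal-approximation (Wald) interval. The sketch below is an assumption about the intended method, not a reproduction of the original post; the function name `wald_ci` and the cell counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(x, n, level=0.95):
    """Estimated risk x / n with a normal-approximation (Wald) confidence interval."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    se = sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se


# Hypothetical cell counts in the table's notation.
a, b, c, d = 30, 70, 15, 85
n1, n2 = a + b, c + d

print("risk among exposed:  ", wald_ci(a, n1))
print("risk among unexposed:", wald_ci(c, n2))
```

The approximation relies on both the expected number of events and non-events being reasonably large in each row; otherwise an exact interval is the safer choice.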


Biostatistics and R

Biostatistics is mainly statistics for clinical and epidemiological studies, which examine the occurrence of illness (morbidity) and death (mortality) at a point in time or over a course of time, and develop models and estimates of risk (the probability that such an event occurs).