I came across the IBS calculation function in GenABEL. Its documentation says:

This function facilitates quality control of genomic data. E.g. people with extremely high (close to 1) IBS may indicate duplicated samples (or twins), while simply high values of IBS may indicate relatives. When weight "freq" is used, IBS for a pair of people i and j is computed as

f_{i,j} = \sum_k \frac{(x_{i,k} - p_k)(x_{j,k} - p_k)}{p_k (1 - p_k)}

where k runs from 1 to N, the number of genome-wide SNPs, x_{i,k} is the genotype of the ith person at the kth SNP, coded as 0, 1/2, 1, and p_k is the frequency of the "+" allele. This apparently provides an unbiased estimate of the kinship coefficient.
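To make the quoted formula concrete, here is a minimal sketch in plain Python that evaluates the sum exactly as written. It is not GenABEL's code. The function names are hypothetical, and it rests on two assumptions of mine: p_k is estimated as the mean of the coded genotypes (0, 1/2, 1) at SNP k, and SNPs where p_k is 0 or 1 are skipped because the denominator would be zero.

```python
from typing import List

Genotypes = List[List[float]]  # genotypes[i][k] is x_{i,k}, coded 0.0, 0.5 or 1.0


def allele_frequencies(genotypes: Genotypes) -> List[float]:
    """Estimate p_k as the mean coded genotype at SNP k (an assumption)."""
    n_people = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(genotypes[i][k] for i in range(n_people)) / n_people
        for k in range(n_snps)
    ]


def freq_weighted_stat(genotypes: Genotypes, i: int, j: int,
                       freqs: List[float]) -> float:
    """Sum over SNPs of (x_ik - p_k)(x_jk - p_k) / (p_k (1 - p_k))."""
    total = 0.0
    for k, p in enumerate(freqs):
        if p <= 0.0 or p >= 1.0:
            # Monomorphic SNP: denominator is zero, so skip it (an assumption).
            continue
        total += (genotypes[i][k] - p) * (genotypes[j][k] - p) / (p * (1.0 - p))
    return total


def pairwise_matrix(genotypes: Genotypes) -> List[List[float]]:
    """Evaluate the statistic for every pair of people, including i == j."""
    freqs = allele_frequencies(genotypes)
    n = len(genotypes)
    return [
        [freq_weighted_stat(genotypes, i, j, freqs) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three people, three SNPs; persons 0 and 1 carry identical genotypes.
    toy = [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
    for row in pairwise_matrix(toy):
        print(row)
```

Because the product in the numerator does not depend on the order of i and j, the resulting matrix is symmetric, so one triangle plus the diagonal carries all the information.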

Question 1: The documentation says "p_k is the frequency of the "+" allele". What is the "+" allele?

I think the "+" allele is whichever of "0", "1/2" or "1" is the most frequent at that SNP. For some SNPs the "+" allele would be "0"; for others it would be "1". So first I should calculate p_k for all SNPs.

Question 2: What does "\frac" mean?

Question 3: When I get the N*N IBS matrix, it must be a triangular matrix. The diagonal is the relationship of each person with themselves. What do the other numbers represent?

## How to calculate IBS?

