This is the IBS calculation function that I found in GenABEL.

This function facilitates quality control of genomic data. For example, a pair of people with extremely high IBS (close to 1) may be duplicated samples (or twins), while simply high values of IBS may indicate relatives. When weight "freq" is used, IBS for a pair of people i and j is computed as

f_{i,j} = \sum_{k=1}^{N} \frac{(x_{i,k} - p_k)(x_{j,k} - p_k)}{p_k (1 - p_k)}

where k runs from 1 to N, the number of SNPs genome-wide, x_{i,k} is the genotype of the i-th person at the k-th SNP, coded as 0, 1/2, 1, and p_k is the frequency of the "+" allele. This apparently provides an unbiased estimate of the kinship coefficient.
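To make the formula concrete, here is a minimal Python sketch of it exactly as quoted (a plain sum over SNPs). It is not GenABEL's code. The function name `ibs_freq_weighted`, the list-of-lists input, and the choice to skip SNPs where p_k is 0 or 1 (to avoid dividing by zero) are all assumptions made for illustration, and p_k is taken to be the mean genotype at SNP k.

```python
def ibs_freq_weighted(genotypes):
    """Pairwise values from the quoted formula (hypothetical helper).

    genotypes: one list per person, each entry 0, 0.5 or 1, read here
    as the fraction of "+" alleles at that SNP. No missing values.
    """
    n_people = len(genotypes)
    n_snps = len(genotypes[0])

    # p_k: mean genotype at SNP k, taken as the "+" allele frequency.
    p = [sum(person[k] for person in genotypes) / n_people
         for k in range(n_snps)]

    # Assumption: SNPs with p_k equal to 0 or 1 are skipped, because
    # p_k * (1 - p_k) would be zero in the denominator.
    usable = [k for k in range(n_snps) if 0.0 < p[k] < 1.0]

    f = [[0.0] * n_people for _ in range(n_people)]
    for i in range(n_people):
        for j in range(i, n_people):
            total = sum(
                (genotypes[i][k] - p[k]) * (genotypes[j][k] - p[k])
                / (p[k] * (1.0 - p[k]))
                for k in usable
            )
            # The formula is symmetric in i and j, so fill both halves.
            f[i][j] = f[j][i] = total
    return f


# Two people with opposite homozygous genotypes at two SNPs.
example = [[0.0, 1.0], [1.0, 0.0]]
matrix = ibs_freq_weighted(example)
```

In this sketch `matrix[i][j]` is the value for the pair (i, j) and `matrix[i][i]` is a person compared with themselves. Dividing each entry by the number of SNPs used would turn the sum into a per-SNP average, if that is the scale you need.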

Question 1: The text says "p_k is the frequency of the "+" allele". What is the "+" allele?

I think the "+" allele is whichever of "0", "1/2", or "1" is the most frequent of the three. For some SNPs the "+" allele would be "0"; for others it would be "1". So first I should calculate p_k for all SNPs.
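For the p_k step, one possible reading (an assumption, since the quoted text does not define "+") is that the 0, 1/2, 1 coding is the fraction of "+" alleles a person carries at that SNP. Under that reading p_k is just the mean genotype at SNP k over the people with a call. The sketch below uses a hypothetical helper `plus_allele_frequencies` and marks missing genotypes with `None`; neither comes from GenABEL.

```python
def plus_allele_frequencies(genotypes):
    """Per-SNP frequency of the counted ("+") allele.

    genotypes: one list per person, entries 0, 0.5, 1, or None for a
    missing call (the None convention is an assumption of this sketch).
    """
    n_snps = len(genotypes[0])
    freqs = []
    for k in range(n_snps):
        observed = [person[k] for person in genotypes
                    if person[k] is not None]
        # With 0, 1/2, 1 coding, the mean genotype equals the allele
        # frequency; a SNP with no calls at all gets None.
        freqs.append(sum(observed) / len(observed) if observed else None)
    return freqs


# Three people, three SNPs; person 0 has no call at the last SNP.
calls = [[0.0, 1.0, None],
         [0.5, 1.0, 0.0],
         [0.5, 0.0, 0.0]]
p = plus_allele_frequencies(calls)
```

Under this reading p_k can fall anywhere between 0 and 1, so the "+" allele does not have to be the more common one.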

Question 2: What does "\frac" mean?

Question 3: When I get the N*N IBS matrix, it must be a triangular matrix. The diagonal is the relationship of each person with themselves. What do the other numbers represent?

## How to calculate IBS?
