coding output



Post by Samantha » Fri Mar 27, 2015 12:48 pm

After converting my data to tped and tfam format in PLINK, I read it into GenABEL as follows:
convert.snp.tped(tpedfile="gwasGenabelIbsQced.tped",tfamfile="gwasGenabelIbsQced.tfam", outfile="gwasGenabelIbsQced.out",strand="u",bcast=10000)

After recoding the sex chromosomes, I calculate genotype counts and allele frequencies.

However, when I look at the output I get some curious results. My understanding is that Q.2 is the frequency of the B allele.
When I look at A1 and A2 in the output, the alleles for some SNPs seem to have been switched relative to the raw data, for example:
                    Chromosome Position Strand A1 A2 NoMeasured CallRate  Q.2
ARS-BFGL-NGS-100232          0        0      u  G  A         10        1 0.20
ARS-BFGL-NGS-100372          0        0      u  G  A         10        1 0.35
ARS-BFGL-NGS-100549          0        0      u  G  A         10        1 0.20
ARS-BFGL-NGS-100941          0        0      u  A  G         10        1 0.05
ARS-BFGL-NGS-101104          0        0      u  A  G         10        1 0.30

In the raw Illumina data file, the polymorphism for SNP ARS-BFGL-BAC-13205 is coded as AG, but in GenABEL it seems to come out as GA, so allele A and allele B appear to have been switched. Is this correct? For my dataset Q.2 ranges from 0.0 to 0.5. The only explanation I can think of is that GenABEL determines which allele is minor and which is major, and switches them once identified so that the minor allele frequency is reported as Q.2.
Another symptom: when I subset the data by phenotype and plot the allele frequencies, I get a truncated scatterplot in which the spread of allele frequencies halts at ~0.5 on both the x and y axes.
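
To show what I mean by the truncation, here is a small base R sketch of my guess. It does not use GenABEL, and I have not confirmed that this is what GenABEL actually does. The helper name fold_to_minor and all of the frequencies below are made up for illustration only. If the reported frequency were always that of the less common allele, any B-allele frequency above 0.5 would be reflected back below 0.5:

```r
# Hypothetical illustration in base R; this is NOT GenABEL code and only
# mimics the behaviour I suspect. fold_to_minor() is a made-up helper.
fold_to_minor <- function(freq_b) {
  # Report whichever allele is rarer: f if f <= 0.5, otherwise 1 - f.
  pmin(freq_b, 1 - freq_b)
}

# Made-up B-allele frequencies for five SNPs in two phenotype subsets.
freq_b_group1 <- c(0.80, 0.35, 0.20, 0.95, 0.70)
freq_b_group2 <- c(0.75, 0.40, 0.10, 0.90, 0.55)

maf_group1 <- fold_to_minor(freq_b_group1)
maf_group2 <- fold_to_minor(freq_b_group2)

# After folding, no value exceeds 0.5, so a scatterplot of one subset
# against the other is confined to the square [0, 0.5] x [0, 0.5].
print(maf_group1)
print(maf_group2)
```

Folding like this would explain both things I see: Q.2 never going above 0.5, and the scatterplot stopping at ~0.5 on both axes. It would not by itself tell me which allele ends up as A1 and which as A2.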

Am I reading the data in incorrectly, or missing a step? Is there a way to force GenABEL to keep the alleles/polymorphism as they appear in the raw data?


