Hi,
I'm not sure this is the right place for this post, but it seemed more appropriate than the GenABEL forum.
I'm trying to select samples for whole-genome sequencing so that I can create my own reference set instead of using HapMap or 1000G.
Since I have an isolated population, my idea is to choose the best set based on kinship, so that the imputation is more accurate and I lose as little power as possible for association.
I've found out that minimac hides each marker and then compares the imputed marker with the original one. The output contains three different measures of accuracy. The first is the canonical Rsq as in MACH, called looRSQ. The second is empR, which the wiki defines as: the empirical correlation between true and imputed genotypes for the SNP. If this is negative, the SNP is probably flipped.
The third is empRSQ, defined as: the actual R2 value, comparing imputed and true genotypes.
From what I understand, I should look at the third one in order to assess accuracy; however, I'm not sure, and I could not find any literature on it.
Am I interpreting this correctly, or should I try something different?
Best
Nicola
Best reference set
- GenABEL senior expert
- Posts: 151
- Joined: Wed Feb 09, 2011 3:24 pm
Re: Best reference set
From reading your post (note that I have no experience with minimac), I would agree that the last measure is probably the best, although it should also correlate almost perfectly with the first. Also, #3 should be the same as #2 squared. I wonder if this is correct.
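One way to check this on your own data is sketched below. It is only a minimal sketch: it assumes that empR is the Pearson correlation between the true genotypes (coded 0/1/2) and the imputed dosages, and that empRSQ is simply its square. The function names and the toy data are hypothetical, and minimac's exact computation may differ.

```python
import numpy as np


def empirical_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes (0/1/2) and imputed dosages."""
    t = np.asarray(true_genotypes, dtype=float)
    d = np.asarray(imputed_dosages, dtype=float)
    t_c = t - t.mean()
    d_c = d - d.mean()
    denom = np.sqrt((t_c ** 2).sum() * (d_c ** 2).sum())
    if denom == 0.0:
        # Monomorphic marker or constant dosage: correlation is undefined.
        return float("nan")
    return float((t_c * d_c).sum() / denom)


def empirical_rsq(true_genotypes, imputed_dosages):
    """Squared Pearson correlation (assumed here to correspond to empRSQ)."""
    r = empirical_r(true_genotypes, imputed_dosages)
    return r * r


# Hypothetical toy data for one SNP in eight samples.
true_g = np.array([0, 1, 2, 1, 0, 2, 1, 0])
dosage = np.array([0.1, 0.9, 1.8, 1.2, 0.0, 1.9, 0.7, 0.3])

# Same SNP with the allele coding flipped: dosage counts the other allele.
flipped = 2.0 - dosage

r = empirical_r(true_g, dosage)
r_flipped = empirical_r(true_g, flipped)
rsq = empirical_rsq(true_g, dosage)

print("empR:", r)
print("empR (flipped coding):", r_flipped)
print("empRSQ:", rsq)
```

Under these assumptions the flipped coding gives the same magnitude of correlation with a negative sign, which matches the wiki's remark about flipped SNPs, while the squared value is unchanged. If the empRSQ column in the minimac output differs noticeably from the square of the empR column, then the two are not computed this way.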

Note that (Gen)ABELs are dynamically developing; while this post is intended to provide full information at the time of posting, please read on further posts, if any, as the topic may be updated with novel solutions at a later stage.
best regards,
Yurii