Hello everyone!

I am trying to run my GWA studies with GenABEL, and I would like to compare different ways to control for subpopulations: on the one hand, an analysis based on prior information (from STRUCTURE, for example); on the other hand, leaving that work to GenABEL. While trying this, some questions came up. Could someone help me with them?

1) What are the differences among these analyses?

Imagine I have a variable called "Population" (within phdata) that classifies individuals into different groups (e.g. 1, 2, 1, 3, ...)

a1=qtscore(y,data=Data,strata=Population)

a2=qtscore(y~as.factor(Population),data=Data)

a3=mlreg(y~1+as.factor(Population),data=Data,trait='gaussian')

2) How could I use the results from the STRUCTURE software?

STRUCTURE outputs one variable per subpopulation, each giving "the probability of membership" of an individual in that subpopulation.

My plan is to include all of these variables as covariates. Is it OK to use the a2 or a3 analysis from 1) for this?

3) Which GenABEL function(s) would you recommend to control for subpopulations?

Thanks in advance,

Vale

## Population Structure


### Re: Population Structure

1) What are the differences among these analyses?

Imagine I have a variable called "Population" (within phdata) that classifies individuals into different groups (e.g. 1, 2, 1, 3, ...)

a1=qtscore(y,data=Data,strata=Population)

a2=qtscore(y~as.factor(Population),data=Data)

a3=mlreg(y~1+as.factor(Population),data=Data,trait='gaussian')

a1 = stratified analysis; here you allow for different effects and different variances across strata. a2, a3 = the same, but you keep the variances constant. In your example, the difference between a2 and a3 is that a2 uses a score test and a3 uses the Wald test. If you had more covariates, there would be an additional difference in how these are treated: in qtscore, the trait is first adjusted for the covariates and the residuals are then analyzed with the score test, while in mlreg the whole model is estimated jointly.
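To make the score-versus-Wald distinction concrete, here is a minimal base-R sketch on simulated data. It does not call GenABEL and is not meant to reproduce the internals of qtscore or mlreg; the variable names (`pop`, `snp`, `y`) and the `n * r^2` form of the score-type statistic are assumptions chosen for illustration.

```r
# Simulated data; all names are illustrative, not GenABEL objects.
set.seed(1)
n   <- 500
pop <- factor(sample(1:3, n, replace = TRUE))           # subpopulation label
snp <- rbinom(n, 2, c(0.2, 0.4, 0.6)[as.integer(pop)])  # allele frequency differs by population
y   <- 0.5 * as.integer(pop) + 0.2 * snp + rnorm(n)

# Two-step, score-type test: adjust the trait for the covariate first,
# then test the SNP against the residuals (1 d.f. statistic, n * r^2).
res        <- resid(lm(y ~ pop))
chi2_score <- n * cor(res, snp)^2

# Wald test: fit covariate and SNP jointly, then square beta / se.
cf        <- summary(lm(y ~ pop + snp))$coefficients
chi2_wald <- (cf["snp", "Estimate"] / cf["snp", "Std. Error"])^2

c(score = chi2_score, wald = chi2_wald)
```

The two statistics are usually close but not identical, because the two-step version never re-estimates the covariate effects together with the SNP.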

In general, if you have well-defined strata (which is not always the case!), a1 is the best. However, unless the higher-order moments of the trait's distribution (especially the variances) differ substantially between strata (which may occur when you combine populations from very different environments), the differences among a1, a2, and a3 will be minor.
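The role of the per-stratum variances can be sketched in base R with simulated data. This is only an illustration of the idea (per-stratum estimates combined with inverse-variance weights versus one pooled model with the stratum as a covariate), not GenABEL's own algorithm; the data frame `d` and the choice of a noisier third stratum are assumptions.

```r
# Simulated data with unequal trait variance across strata (illustrative only).
set.seed(2)
n   <- 600
pop <- factor(sample(1:3, n, replace = TRUE))
snp <- rbinom(n, 2, 0.3)
sdv <- c(1, 1, 3)[as.integer(pop)]                 # stratum 3 is much noisier
y   <- 0.5 * as.integer(pop) + 0.3 * snp + rnorm(n, sd = sdv)
d   <- data.frame(y = y, snp = snp, pop = pop)

# Stratified: estimate the SNP effect within each stratum (each with its
# own residual variance), then combine with inverse-variance weights.
per_stratum <- t(sapply(levels(d$pop), function(k) {
  cf <- summary(lm(y ~ snp, data = d[d$pop == k, ]))$coefficients
  c(beta = cf["snp", "Estimate"], se = cf["snp", "Std. Error"])
}))
w          <- 1 / per_stratum[, "se"]^2
beta_strat <- sum(w * per_stratum[, "beta"]) / sum(w)
se_strat   <- sqrt(1 / sum(w))

# Covariate adjustment: one pooled model with a single common residual variance.
cf_pool   <- summary(lm(y ~ pop + snp, data = d))$coefficients
beta_pool <- cf_pool["snp", "Estimate"]
se_pool   <- cf_pool["snp", "Std. Error"]

rbind(stratified = c(beta = beta_strat, se = se_strat),
      pooled     = c(beta = beta_pool,  se = se_pool))
```

When the strata have similar variances, the two rows tend to agree closely; the gap grows as the variances diverge, because the stratified estimate down-weights the noisy stratum.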

2) How could I implement the results from STRUCTURE software?

STRUCTURE outputs one variable per subpopulation, each giving "the probability of membership" of an individual in that subpopulation.

You can indeed plug these probabilities in as covariates in qtscore or mlreg (but see the comments above).
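One practical detail when using membership probabilities as covariates: they sum to 1 for each individual, so all K columns together with an intercept are linearly dependent, and one column has to be left out. A minimal base-R sketch with simulated values follows; the column names `q1`, `q2`, `q3` and the simulated matrix `Q` are hypothetical stand-ins for real STRUCTURE output.

```r
# Hypothetical STRUCTURE-like output for K = 3: each row sums to 1.
set.seed(3)
n   <- 300
raw <- matrix(rgamma(n * 3, shape = 1), ncol = 3)
Q   <- raw / rowSums(raw)
colnames(Q) <- c("q1", "q2", "q3")

d   <- data.frame(Q, snp = rbinom(n, 2, 0.3))
d$y <- 1.0 * d$q1 - 0.5 * d$q2 + 0.2 * d$snp + rnorm(n)

# All K columns plus the intercept are collinear: lm() cannot estimate
# every coefficient and reports NA for the redundant one.
full <- lm(y ~ q1 + q2 + q3 + snp, data = d)
coef(full)

# Dropping one column (here q3) gives an identifiable model.
fit <- lm(y ~ q1 + q2 + snp, data = d)
coef(fit)

# In GenABEL terms this would correspond to a formula such as
# y ~ q1 + q2 in qtscore or mlreg, assuming q1 and q2 were added to phdata.
```

Which column is dropped does not change the SNP estimate, since the remaining columns span the same space.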

3) Which GenABEL function(s) would you recommend to control for subpopulations?

Well, this really depends. See the "General extensive tutorial on GenABEL suite", chapters 6 and 7, available at http://www.genabel.org/tutorials

If your strata are not well defined, consider using mixed models.

Note that (Gen)ABELs are dynamically developing; while this post is intended to provide full information at the time of posting, please read on further posts, if any, as the topic may be updated with novel solutions at a later stage.

best regards,

Yurii

