ProbABEL input files

divs5
Posts: 4
Joined: Thu Jan 27, 2011 7:43 am

ProbABEL input files

Postby divs5 » Thu Jan 27, 2011 7:56 am

Greetings to all !
It is great to have a public forum for the ABEL tools :D

I have two separate questions, one regarding ProbABEL and the other regarding GenABEL

1) I am currently in the process of trying the ProbABEL tools.
I have been able to prepare the phenotype data in the correct format so far.
The preparedata.R script in the ProbABEL package was very efficient.
Is there any easy way of converting the genotype data (either plink ped/map or GenABEL@gtdata object) into the mldose, mlinfo and mlprob files required for ProbABEL?
I have not been able to find any information on this in the ProbABEL manual or files. I do not want to impute my data, but there seems to be no way to produce these files other than running a MACH command.
I would be very grateful if you could advise me how to proceed. Maybe you have a similar preparegenotypedata.R script?

2)
====
Since this question was unrelated to the first one I have opened a second topic for it here.

Lennart.
====


I look forward to hearing from you and thanks in advance.

Kind regards
Divya

yurii
GenABEL developer
Posts: 263
Joined: Fri Jan 21, 2011 5:20 pm

Re: ProbABEL input files

Postby yurii » Thu Jan 27, 2011 3:08 pm

OK, this is an interesting problem! Maybe it is good to understand the problem better before we start giving suggestions.

ProbABEL is specifically a tool to analyze imputed data -- so why would you wish to analyze directly typed data using it (and hence have to convert the data to MACH-format)?

Yurii
Note that (Gen)ABELs are dynamically developing; while this post is intended to provide full information at the time of posting, please read on further posts, if any, as the topic may be updated with novel solutions at a later stage.

best regards,
Yurii

divs5
Posts: 4
Joined: Thu Jan 27, 2011 7:43 am

Re: ProbABEL input files

Postby divs5 » Fri Jan 28, 2011 9:22 am

Dear Yurii

Last year I emailed you about a model in which I wanted to test for interactions at the genome-wide level. You advised me to use ProbABEL (genome-wide interaction, reporting betas and SEs for main effects and interactions) or MixABEL, so I am trying to get my data into the correct format for this. Or did I perhaps misunderstand you?

The model I want to analyze is
Expression ~ SNP*Phenotype + SNP + Phenotype + age + sex
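
For a single SNP, the model above can be sketched in plain R as follows. This is a minimal sketch on simulated data; the data frame, the effect sizes and the coding of sex as 0/1 are all assumptions made for illustration. Note that in R's formula syntax `SNP*Phenotype` already expands to both main effects plus the interaction, so the explicit `+ SNP + Phenotype` terms are redundant and both formulas below should fit the same model.

```r
# Simulated data for one SNP; all names and values are illustrative only.
set.seed(42)
n <- 200
dat <- data.frame(
  SNP       = sample(0:2, n, replace = TRUE),  # allele dose 0/1/2
  Phenotype = rnorm(n),
  age       = runif(n, 20, 70),
  sex       = rbinom(n, 1, 0.5)                # assumed 0/1 coding
)
dat$Expression <- 0.3 * dat$SNP + 0.5 * dat$Phenotype +
  0.2 * dat$SNP * dat$Phenotype + rnorm(n)

# Formula as written in the post
fit_long  <- lm(Expression ~ SNP*Phenotype + SNP + Phenotype + age + sex,
                data = dat)
# Shorter form: SNP*Phenotype already contains both main effects
fit_short <- lm(Expression ~ SNP*Phenotype + age + sex, data = dat)

# Betas and standard errors for the main effects and the interaction
print(summary(fit_long)$coefficients[c("SNP", "Phenotype", "SNP:Phenotype"),
                                     c("Estimate", "Std. Error")])
```

The `SNP:Phenotype` row is the interaction term; a genome-wide analysis repeats this fit for every SNP.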

Kind regards
Divya

yurii
GenABEL developer
Posts: 263
Joined: Fri Jan 21, 2011 5:20 pm

Re: ProbABEL input files

Postby yurii » Thu Feb 03, 2011 12:54 pm

Divya,

OK, now it is clearer!

You have two options here for GxE testing: 1) use MixABEL, or 2) use a conversion script to go to MACH-format data and then use ProbABEL.

I will outline solution 1) here, and 2) in the next post.

So, to do genome-wide GxE using GenABEL's data, you can use MixABEL, which is intended as a replacement for ProbABEL (at least for quantitative traits). It is currently at beta stage, but has been tested quite comprehensively. The GWFGLS procedure can take GenABEL, DatABEL, or standard R matrix data as input. I presume you are going to analyze quantitative traits and that there is no pedigree/relationship structure involved. In that case it is quite straightforward (I hope!) to use this procedure.

best regards,
Yurii

yurii
GenABEL developer
Posts: 263
Joined: Fri Jan 21, 2011 5:20 pm

Re: ProbABEL input files

Postby yurii » Thu Feb 03, 2011 12:59 pm

An alternative route to genome-wide GxE using GenABEL's data is to 1) convert GenABEL's data to MACH format and then 2) use these data with ProbABEL, either directly or after conversion to DatABEL format. The latter allows quicker analysis; use the mach2databel function of GenABEL for that (install DatABEL first).

I have quickly drafted an R script, which will probably run quite slowly but should do the job. Please mind that this is not something we use routinely; I just drafted it today. So you definitely need to make a few test runs and make sure that it does the job properly: e.g. run the same analyses in plain R and check that the betas and standard errors are the same, that the effect is reported for the right reference/coded allele, etc. If you take this route, please let us know how it worked, so that other people can use it afterwards as well.

Here is the script. Again, mind that it may have bugs; you need to test it!

Code:

library(GenABEL)

# set base output name here
ofname <- "myfile"

# load the data, prepare the chunk for export in 'srdta'
data(srdta)
srdta <- srdta[1:10,1:10]

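# split the two-character allele coding into first and second allele;
# 'a1' and 'a2' must exist before the mlinfo table is built
cod <- coding(srdta)
a1 <- unlist(strsplit(cod,""))[c(T,F)]
a2 <- unlist(strsplit(cod,""))[c(F,T)]

# export mlinfo
# Rsq=Quality=2 to indicate typed data
freq <- summary(gtdata(srdta))[,"Q.2"]
maf <- pmin(freq,1.-freq)
mli <- data.frame(SNP=snpnames(srdta),Al1=a2,Al2=a1,
                  Freq1=freq,MAF=maf,Quality=2.0,Rsq=2.0,
                  stringsAsFactors=FALSE)
write.table(mli,file=paste(ofname,".mlinfo",sep=""),
             col.names=T,row.names=F,quote=F)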


# export mlprob
cod <- coding(srdta)
a1 <- unlist(strsplit(cod,""))[c(T,F)]
a2 <- unlist(strsplit(cod,""))[c(F,T)]
g0 <- paste(a1,a1,sep="/")
g1 <- paste(a1,a2,sep="/")
g2 <- paste(a2,a2,sep="/")

for (i in 1:nids(srdta)) {
  chargt <- as.character(srdta[i,])
  gvec <- rep(NA,nsnps(srdta)*2)
  gvec[c(T,F)] <- 1*(chargt==g2)
  gvec[c(F,T)] <- 1*(chargt==g1)
  outline <- matrix(c(paste(i,"->",idnames(srdta)[i],sep=""),
                      "ML_PROB",gvec),nrow=1)
  if (i==1) {
    write.table(outline,file=paste(ofname,".mlprob",sep=""),
                 sep=" ",col.names=F,row.names=F,quote=F)
  } else {
    write.table(outline,file=paste(ofname,".mlprob",sep=""),
                 col.names=F,row.names=F,quote=F,append=T)
  }
}

# export mldose
cod <- coding(srdta)
a1 <- unlist(strsplit(cod,""))[c(T,F)]
a2 <- unlist(strsplit(cod,""))[c(F,T)]
g0 <- paste(a1,a1,sep="/")
g1 <- paste(a1,a2,sep="/")
g2 <- paste(a2,a2,sep="/")

for (i in 1:nids(srdta)) {
  chargt <- as.character(srdta[i,])
  gvec <- rep(NA,nsnps(srdta))
  gvec[chargt==g2] <- 2.0
  gvec[chargt==g1] <- 1.0
  gvec[chargt==g0] <- 0.0
  outline <- matrix(c(paste(i,"->",idnames(srdta)[i],sep=""),
                       "MLDOSE",gvec),nrow=1)
  if (i==1) {
    write.table(outline,file=paste(ofname,".mldose",sep=""),
                 sep=" ",col.names=F,row.names=F,quote=F)
  } else {
    write.table(outline,file=paste(ofname,".mldose",sep=""),
                 col.names=F,row.names=F,quote=F,append=T)
  }
}
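
The cross-check against plain R mentioned before the script can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the dose file is simulated here as a stand-in for the exported one, the phenotype columns (`expr`, `trait`, `age`, `sex`) are hypothetical names, and individuals are assumed to be in the same order in both tables.

```r
set.seed(1)
n <- 50   # individuals
m <- 3    # SNPs

# Stand-in for the exported file: "index->id", the MLDOSE tag, one dose per SNP.
dose <- matrix(sample(0:2, n * m, replace = TRUE), nrow = n)
ids  <- paste0(seq_len(n), "->id", seq_len(n))
dosefile <- tempfile(fileext = ".mldose")
write.table(data.frame(id = ids, tag = "MLDOSE", dose), file = dosefile,
            col.names = FALSE, row.names = FALSE, quote = FALSE)

# Hypothetical phenotype table, same individual order as the dose file.
pheno <- data.frame(expr  = rnorm(n),
                    trait = rnorm(n),
                    age   = runif(n, 20, 70),
                    sex   = rbinom(n, 1, 0.5))

# Read the doses back; the first two columns are the id and the MLDOSE tag.
mld <- read.table(dosefile, header = FALSE, stringsAsFactors = FALSE)
G   <- as.matrix(mld[, -(1:2)])

# One linear model per SNP; keep beta and SE for the SNP and the interaction.
res <- t(sapply(seq_len(ncol(G)), function(j) {
  d  <- cbind(pheno, snp = G[, j])
  cf <- summary(lm(expr ~ snp * trait + age + sex, data = d))$coefficients
  c(beta_snp = cf["snp", "Estimate"],
    se_snp   = cf["snp", "Std. Error"],
    beta_int = cf["snp:trait", "Estimate"],
    se_int   = cf["snp:trait", "Std. Error"])
}))
print(res)
```

Each row of `res` corresponds to one SNP, in file order. When comparing with the ProbABEL output, keep in mind that the dose counts copies of Al1 as listed in the mlinfo file, so a sign flip in beta would point to a swapped reference allele.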

best regards,
Yurii

yurii
GenABEL developer
Posts: 263
Joined: Fri Jan 21, 2011 5:20 pm

Re: ProbABEL input files

Postby yurii » Thu Feb 03, 2011 1:18 pm

Note once again: MixABEL is quite flexible, it can take GenABEL's data as input, or DatABEL's data. You can get to DatABEL's format from MACH files or IMPUTE files.

best regards,
Yurii

