Extract SNPs for association test

katiechan
Posts: 3
Joined: Fri Jun 05, 2015 8:14 pm

Extract SNPs for association test

Postby katiechan » Fri Jun 05, 2015 8:23 pm

I have imputed SNP data in filevector format. How can I extract the SNPs within a certain genomic region (i.e. chr:start-end) in ProbABEL for a "candidate gene" association test? Thank you!

lckarssen
Site Admin
Posts: 321
Joined: Tue Jan 04, 2011 3:04 pm
Location: Utrecht, The Netherlands

Re: Extract SNPs for association test

Postby lckarssen » Fri Jun 12, 2015 8:12 am

The easiest way to do this is to use the DatABEL R package, which lets you subset a filevector file. To get the names of the SNPs within your region of interest (SNP names are stored in the filevector file), you need a file that contains SNP names, chromosomes and positions. You can extract the SNP names in several ways. Two suggestions:
* Use R: load the file and subset on chromosome and region to obtain the SNP names you need.
* Extract the SNP names on the Linux command line using gawk or grep.
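
As a rough illustration of the first suggestion, here is a minimal sketch in R. The data frame `snpmap`, its column names (`name`, `chr`, `pos`) and the region boundaries are all made up for the example; your own annotation file will likely have a different layout, so adjust accordingly.

```r
# Hypothetical map data. In practice you would read this from your own
# annotation file, e.g. read.table("path/to/map_file.txt", header = TRUE).
snpmap <- data.frame(
  name = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  chr  = c(1, 1, 1, 2, 2),
  pos  = c(1000, 5000, 9000, 2000, 6000),
  stringsAsFactors = FALSE
)

# Region of interest (illustrative values): chromosome 1, 2000 to 10000
region_chr   <- 1
region_start <- 2000
region_end   <- 10000

# Keep the SNPs on the right chromosome that fall inside the region
in_region <- snpmap$chr == region_chr &
  snpmap$pos >= region_start &
  snpmap$pos <= region_end

# Plain character vector of SNP names
selectedSNPnames <- snpmap$name[in_region]
print(selectedSNPnames)
```

The result is a plain character vector of SNP names, which is the kind of object used to index the columns of the filevector data.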

Once you have your SNP names, typical R code would be something like this:

Code:

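library(DatABEL)
# Open the filevector file as a databel object
genodata <- databel("path/to/filevector_file.fvi")
# selectedSNPnames must be a character vector of SNP names (columns = SNPs)
extractedSNPs <- genodata[, selectedSNPnames]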



For more details see the DatABEL tutorial at http://www.genabel.org/tutorials.
-------
Lennart Karssen
PolyOmica
The Netherlands
-------

katiechan
Posts: 3
Joined: Fri Jun 05, 2015 8:14 pm

Re: Extract SNPs for association test

Postby katiechan » Thu Aug 06, 2015 11:14 pm

Thank you for the instructions!
I have made a list of the SNP names (as a data frame) that I would like to extract. The dimensions of this list are:

Code:

[1] 9563    1

However, I got the following error when I ran the third line of code from the previous post:

Code:

Error in convert_intlogcha_index_to_int(j, x, 2) :
  class of 'i' must be numeric or logical or character

After transposing the list of SNP names, the dimensions become:

Code:

[1]    1 9563

but I still get the same error.

It would be great to have some more guidance on how to make this work. Thank you!

lckarssen
Site Admin
Posts: 321
Joined: Tue Jan 04, 2011 3:04 pm
Location: Utrecht, The Netherlands

Re: Extract SNPs for association test

Postby lckarssen » Tue Sep 22, 2015 2:15 pm

The list of SNPs should be a vector, not a data frame. Can you try

Code:

extractedSNPs <- genodata[, selectedSNPnames[, 1]]

where 'selectedSNPnames' is your data frame? By adding the '[, 1]' to the name of the data frame you select the first column.
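
To illustrate the difference between a data frame and a vector, here is a small sketch with a made-up data frame `snpdf`. One extra caveat: depending on how the file was read, the column may be a factor rather than a character vector. I have not checked how DatABEL handles a factor as an index, so the `as.character()` call below is a precaution rather than a confirmed requirement.

```r
# Hypothetical data frame with a single column of SNP names; the names
# are stored as a factor here to mimic read.table() defaults in older R
snpdf <- data.frame(name = c("rs2", "rs3"), stringsAsFactors = TRUE)

print(class(snpdf))        # the whole object is a data frame
print(class(snpdf[, 1]))   # the first column on its own is a factor here

# Convert explicitly so the index is a plain character vector
snpvec <- as.character(snpdf[, 1])
print(is.character(snpvec))
print(dim(snpvec))         # NULL: a vector has no dimensions
```

With your own data, the same conversion would read `as.character(selectedSNPnames[, 1])`.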

