Monday, April 28, 2014

Big-Y Telomere


This tool is replaced by BAM Analysis Kit with more advanced features.


A telomere is a region of repetitive nucleotide sequences at each end of a chromatid, which protects the end of the chromosome from deterioration or from fusion with neighbouring chromosomes. Longer telomeres are generally associated with a longer lifespan.

I did a small experiment to see if I could extract telomere length information from a BigY BAM file, and indeed I was able to. So I built a small tool around telseq, running on Windows via Cygwin, so that anyone can use it.
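
As a rough illustration of the idea behind this kind of estimate (not telseq's actual code), the cited paper classifies a read as telomeric when it contains at least k copies of the TTAGGG repeat. The sketch below only does that classification step on plain read strings. The function names are hypothetical, and the default k = 7 is an assumption meant to mirror the paper's description.

```python
# Telomeric repeat and its reverse complement, so reads from either strand count.
TELOMERE_MOTIFS = ("TTAGGG", "CCCTAA")


def count_repeats(read: str) -> int:
    """Return the larger non-overlapping motif count across both motifs."""
    read = read.upper()
    return max(read.count(motif) for motif in TELOMERE_MOTIFS)


def is_telomeric(read: str, k: int = 7) -> bool:
    """Treat a read as telomeric if it carries at least k motif copies (k is an assumption)."""
    return count_repeats(read) >= k


def telomeric_fraction(reads, k: int = 7) -> float:
    """Fraction of reads classified as telomeric; 0.0 for an empty input."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(is_telomeric(read, k) for read in reads) / len(reads)
```

Turning this fraction into a length in kb needs further normalisation (read counts, GC content, genome size), which the sketch deliberately leaves out.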

The tool provides the following output:
  • telomere.txt - Information on telomere length. 
Supported BAM files:
  • Big-Y BAM
  • Any BAM file with UCSC convention (hg1x) ordering for human reference genome.
Please let me know if any other BAM files work with this tool, or if any of the above do not.
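
One way to check whether a BAM file follows the UCSC naming convention is to look at the reference sequence names in its header (for example, the text printed by `samtools view -H`). The sketch below assumes that "UCSC convention" means every `@SQ` name carries the `chr` prefix (chr1, chrY, ...); that reading, and the function names, are assumptions rather than what the tool itself checks.

```python
def sequence_names(header_text: str) -> list:
    """Collect the SN: values from @SQ header lines, in header order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def looks_like_ucsc(header_text: str) -> bool:
    """True if the header has reference names and all of them start with 'chr'."""
    names = sequence_names(header_text)
    return bool(names) and all(name.startswith("chr") for name in names)
```

The order of the returned names also shows how the reference sequences are sorted in the file.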

Prerequisites: 
Usage:

Extract the download and click 'BigY Telomere UI'. Select the .BAM file and click 'Start Analysis'.



After clicking 'Start Analysis', a command prompt will automatically open and start executing a few commands.



After a few minutes (depending on your computer's speed), the output will be available inside a subfolder called 'out', and the result file will automatically open in Notepad. The estimated telomere length is reported in kb.

Download:  BigY Telomere (64 bit).zip (20 MB)

License: The download bundles the following software for easy usage.
References:
  • Ding, Zhihao, Massimo Mangino, Abraham Aviv, Tim Spector, and Richard Durbin. "Estimating telomere length from whole genome sequence data." Nucleic acids research (2014): gku181.
Change Log: 1.1
  • Some modifications for compatibility.
Change Log: 1.0
  • Initial Release.

Saturday, April 26, 2014

Big-Y BAM STR Analysis Tool


This tool is replaced by BAM Analysis Kit with more advanced features.


This tool is the STR counterpart of the Big-Y BAM SNP Analysis Tool: where that tool analyses SNPs, this one analyses STRs.
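
An STR value is essentially the number of times a short motif repeats back to back at a given location. The sketch below shows only that counting idea on a plain sequence string. It is not how lobSTR works (lobSTR also has to align reads and handle flanking regions), and the function name and the GATA motif in the test are purely illustrative.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif found in sequence."""
    sequence, motif = sequence.upper(), motif.upper()
    if not motif:
        raise ValueError("motif must not be empty")
    step = len(motif)
    best = 0
    # Try every start position so that a run is found regardless of its offset.
    for start in range(len(sequence)):
        copies = 0
        while sequence.startswith(motif, start + copies * step):
            copies += 1
        best = max(best, copies)
    return best
```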

The tool provides the following output:
  • y_str.csv - contains all identified STRs in BigY BAM file. 
Supported BAM files:
  • Big-Y BAM
  • Any BAM file with UCSC convention (hg1x) ordering for human reference genome.
Please let me know if any other BAM files work with this tool, or if any of the above do not.

Prerequisites: 
Usage:

Extract the download and click 'Big-Y BAM STR Analysis UI.exe'. Select the .BAM file and click 'Start Analysis'.



After clicking 'Start Analysis', a command prompt will automatically open and start executing a series of commands.

After nearly an hour (depending on your computer speed), the output will be available inside a subfolder called 'out'.

Download:  Big-Y BAM STR Analysis (64 bit).zip (97.3 MB)

Source Code: (Inside src folder)

License: The download bundles the following software for easy usage. So, if you are using this tool for non-commercial and/or personal use, you should be all right.
References:
  • Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. [PMID: 19505943]
  • Gymrek M, Golan D, Rosset S, & Erlich Y. lobSTR: A short tandem repeat profiler for personal genomes. Genome Research. 2012 April 22.
Change Log: 1.0
  • Initial Release.

Wednesday, April 23, 2014

23andMe to FASTA

If you have a 23andMe raw data file that lists mtDNA data by refSNP/RSID and you need it in FASTA format, this tool will help you.

Prerequisites: Microsoft .Net Framework 4.0

Usage: Open the 23andMe raw data and save the mtDNA FASTA file. Once saved, you can use FASTA to RSRS (With Visualizer) to get RSRS markers and visualize the mutations from RSRS. You can also use James Lick's mtDNA Haplogroup analysis.

Screenshot:


Download : 23andMe to FASTA.exe (298 KB)

Source Code at GitHub.

Assumption: 23andMe mtDNA raw data uses rCRS as reference (and positions for v2 and v3) and covers all variations from rCRS. If my assumption is wrong, please let me know so I can fix the tool.
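
Under that assumption, the conversion can be pictured as taking the reference sequence and overwriting the tested positions with the called bases. The sketch below is a minimal illustration, not the tool's source code: it assumes the usual tab-separated 23andMe raw layout (rsid, chromosome, position, genotype, with `#` comment lines and `MT` as the chromosome label), and all function names are hypothetical. The reference sequence has to be supplied by the caller.

```python
def mt_calls(raw_text: str) -> dict:
    """Map 1-based mtDNA position -> called base from 23andMe raw data text."""
    calls = {}
    for line in raw_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 4:
            continue
        _rsid, chromosome, position, genotype = fields
        if chromosome != "MT" or genotype not in ("A", "C", "G", "T"):
            continue  # skip other chromosomes, no-calls and non-base codes
        calls[int(position)] = genotype
    return calls


def apply_calls(reference: str, calls: dict) -> str:
    """Overwrite reference bases at the called positions; untested positions stay as reference."""
    bases = list(reference)
    for position, base in calls.items():
        if 1 <= position <= len(bases):
            bases[position - 1] = base
    return "".join(bases)


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a single FASTA record with fixed-width lines."""
    lines = [">" + name]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"
```

Insertions and deletions relative to the reference cannot be represented by simple substitution, so the sketch ignores them.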

Change Log
Version 1.0
  • Initial Release.

Saturday, April 12, 2014

ISOGG Y-Tree AddOn for Google Chrome

ISOGG Y-Tree AddOn is a Chrome browser extension that plots your Y-SNP results on the ISOGG Y-Tree webpage (isogg.org/tree). Please note that this AddOn replaces the ISOGG tree functionality of Big-Y AddOn.

The extension adds a number of features to ISOGG Y-Tree based on your Y-DNA results.
  • Allows up to 10 kits.
  • Highlights Positive and Negative SNPs in ISOGG Y-Tree.
Note: To use this AddOn, you must have purchased a Y-DNA test from one of the DNA testing companies for genealogy purposes and have received the results as Y-SNPs. The ISOGG Tree is from the International Society of Genetic Genealogy (www.isogg.org). Once the AddOn is installed, go to the Options and enter your Y-SNPs. If you want the SNPs from the BigY CSV download in the format supported by this AddOn, you can use Merge-Y to add the file and export the SNPs. The exported SNPs can then be pasted into the AddOn.

Prerequisites: Google Chrome

Screenshot:


ISOGG Y-Tree after plotting.


Usage: Install the AddOn, go to the Options page and enter your Y-SNPs. Then go to isogg.org/tree to see the entered SNPs plotted.



Install: ISOGG AddOn Chrome AddOn

Source Code at GitHub.

Misc Info: Fast mode is an important feature that accelerates plotting for a better user experience. It works by giving the AddOn prior knowledge of which SNPs are in the tree. E.g., Big-Y may have 25000+ SNPs, but only about a quarter are actually found in the Y-Tree. Instead of searching for all 25000+ SNPs in the ISOGG Y-Tree, which is very inefficient, the AddOn ignores all SNPs from the Big-Y results that aren't in the Y-Tree. Only ~5000+ SNPs are then searched against the SNPs in the Y-Tree, which improves the overall user experience. If you are not sure what to do, just leave it ticked.
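
The fast mode idea can be sketched as a simple pre-filter: keep only the entered SNPs that are already known to be in the tree, and search for those alone. The AddOn itself is a Chrome extension, so the Python below is only an illustration of the technique. The 'NAME+' / 'NAME-' input format, the function names and the SNP names in the usage note are assumptions.

```python
def parse_snps(text: str) -> dict:
    """Parse 'X1+, X2-' style input into {name: True for positive, False for negative}."""
    results = {}
    for token in text.replace(",", " ").split():
        if len(token) > 1 and token[-1] in "+-":
            results[token[:-1]] = token.endswith("+")
    return results


def prefilter(results: dict, tree_snps: set) -> dict:
    """Drop every SNP that the tree is known not to contain."""
    return {name: state for name, state in results.items() if name in tree_snps}
```

For example, `prefilter(parse_snps("X1+, X2-, X3+"), {"X1", "X2"})` keeps only the hypothetical X1 and X2, so X3 is never searched for on the page.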

If fast mode is unchecked, the plot interval is used. This setting also lets you tune the experience to your needs. The plot interval is simply the time interval between one plot and the next. If fast mode is enabled, the plot interval is 0, which means the browser effectively hangs until the plot is complete. If fast mode is not enabled, you have two options: either give preference to plotting while still being able to watch the SNPs appear (by selecting 1 ms), or give preference to browsing the site without any inconvenience, regardless of whether the plotting has finished (by selecting 600 ms).


Change Log: 1.0.7
  • All Y-SNPs from ISOGG Y-Tree until June 12th 2015 added to the program for faster plot.
Change Log: 1.0.6
  • All Y-SNPs from ISOGG Y-Tree until Nov 1st 2014 added to the program for faster plot.
Change Log: 1.0.5
  • Fixed a bug where the AddOn did not load for haplogroups A, B and D.
Change Log: 1.0.4
  • Disables the annoying notice about this AddOn being separated from the BigY AddOn when this AddOn is already installed.
Change Log: 1.0.3
  • Fixed a bug where kit selection stayed disabled after scanning had completed with fast mode unchecked.
Change Log: 1.0.2
  • Kit selection bug fixed.
Change Log: 1.0.1
  • Icon changes.
Change Log: 1.0.0
  • Initial Release.

Wednesday, April 9, 2014

YSNP Novel Variants

If you have downloaded the Novel Variants using Big-Y AddOn for Google Chrome, the download matches the table exactly. However, it would be nice to see each variant mapped to its Y-SNP and to know whether it is positive or not. This tool does exactly that.

Prerequisites: Microsoft .Net Framework 4.0

Usage: Open the Novel Variants download and save the displayed table or Y-SNPs. After saving the Y-SNPs, you may want to look at them in ISOGG Y-Tree 2014.

Screenshot:

Download : YSNP Novel Variants.exe (782 KB)

Source Code at GitHub.

Change Log
Version 1.1
Version 1.0
  • Initial Release.
Note: Y-SNP data is taken from ISOGG and Dr Jim Wilson and ScotlandsDNA.

Tuesday, April 8, 2014

23andMe To YSNPs

If you have a 23andMe raw data file which contains Y-DNA data with refSNPs/RSID but not the names of Y-SNPs in ISOGG format, this tool will help you. Please note that only positions of build 37 are supported.
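
Conceptually, the conversion is a lookup from build 37 position to Y-SNP name. The sketch below assumes the usual tab-separated 23andMe raw layout (rsid, chromosome, position, genotype) and a caller-supplied index of `{position: (name, derived_allele)}`. The index structure and the function names are hypothetical and not taken from the tool's source.

```python
def y_calls(raw_text: str) -> dict:
    """Map build 37 Y position -> called base from 23andMe raw data text."""
    calls = {}
    for line in raw_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 4:
            continue
        _rsid, chromosome, position, genotype = fields
        if chromosome == "Y" and genotype in ("A", "C", "G", "T"):
            calls[int(position)] = genotype
    return calls


def name_snps(calls: dict, snp_index: dict) -> list:
    """Return 'NAME+' for a derived call and 'NAME-' otherwise, ordered by position."""
    named = []
    for position in sorted(calls):
        if position not in snp_index:
            continue  # position has no known Y-SNP name
        name, derived = snp_index[position]
        # Assumption: any call other than the derived allele is reported as negative.
        named.append(name + ("+" if calls[position] == derived else "-"))
    return named
```

Positions from other builds would simply miss the index, which is consistent with only build 37 being supported.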

Prerequisites: Microsoft .Net Framework 4.0

Usage: Open the 23andMe raw data and save the Y-SNPs. After saving the Y-SNPs, you may want to look at them in ISOGG Y-Tree 2014.

Screenshot:

Download : 23andMe To YSNPs.exe (782 KB)

Source Code at GitHub.

Change Log
Version 1.1
Version 1.0
  • Initial Release.
Note: Y-SNP data is taken from ISOGG and Dr Jim Wilson and ScotlandsDNA.

Tuesday, April 1, 2014

ISOGG Y-Tree Plotter


This tool is replaced by ISOGG Y-Tree AddOn for Google Chrome


Note: With the Y-Tree changing constantly, the best solution is to plot directly on the website itself. Hence, this tool is obsolete.

ISOGG Y-Tree 2014 is a desktop application for the ISOGG Y-Tree that lets you mark SNPs and identify your haplogroup, and is optimized for Big Y results. This application replaces the earlier My Y-SNP Tree.

Prerequisites: Microsoft .Net Framework 4.0

Usage: Just double-click on it, paste your Y-SNPs into the text box provided and click 'Mark on Tree'.

Screenshot:

Download from Google Drive.

Source Code at GitHub.

Citation: International Society of Genetic Genealogy (2014). Y-DNA Haplogroup Tree 2014, Version:  9.35, Date: 10 March 2014, http://www.isogg.org/tree/ [Date of access: 10, Mar, 2014].

Change Log
Version 1.0
  • Y-SNP tree based on ISOGG's latest 2014 Y-Tree and optimized for the use with Big-Y. Initial release.

Merge Y

If you have taken several Y-DNA tests, you can merge the results from different companies or different products. Merge Y currently supports Big-Y AddOn output, the Big-Y CSV download, Geno 2.0 and 23andMe.

Prerequisites: Microsoft .Net Framework 4.0

Usage: Add the files and save the merged output. You can also save the SNPs alone.

If you are using the Big-Y AddOn, you might have downloaded two files, one for Known SNPs and one for Novel Variants. This tool identifies the new SNPs not mentioned in the Novel Variants download and merges both files into one. Along with these two, you can also include other DNA files such as Geno 2.0 and 23andMe to merge them and remove duplicates.
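
The merge-and-deduplicate step can be sketched as combining several `{snp_name: '+' or '-'}` mappings into one. How Merge Y actually resolves a disagreement between two sources is not stated here, so the sketch below makes its own assumption: it keeps the first value seen and reports the conflicting names separately. The function name and data layout are hypothetical.

```python
def merge_results(*sources: dict) -> tuple:
    """Merge {snp_name: '+' or '-'} dicts; return (merged, conflicting_names)."""
    merged = {}
    conflicts = set()
    for source in sources:
        for name, state in source.items():
            if name in merged and merged[name] != state:
                conflicts.add(name)  # same SNP reported differently by two sources
            merged.setdefault(name, state)  # the first source to report a SNP wins
    return merged, conflicts
```

Reporting conflicts instead of silently overwriting makes it easier to spot a SNP that two tests disagree on.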

Screenshot:
Merge-Y screenshot

Download : Merge Y.exe (480 KB)

Source Code from GitHub.

Change Log
Version 1.3
  • Supports the BigY CSV download for the purpose of exporting SNPs.
Version 1.2
  • Fixes the alignment when position doesn't exist in merged output.
Version 1.1
  • Fixed a hang during view merged or export. Now includes a progress bar to show the progress.
Version 1.0
  • Initial Release.
Note: I don't have sufficient raw files in different formats for the same person to test it effectively. Testing is purely done using simulated data and my own test results. Please let me know if you find any bugs.