Wednesday, August 26, 2015

SRA / FASTQ to BAM Kit

Most ancient DNA is uploaded as SRA or FASTQ files. This kit was developed to let anyone download and convert SRA / FASTQ files to BAM files. Once converted, the BAM files can be processed further with the BAM Analysis Kit, and the results can then be used for genetic genealogy.

Usage: Make sure the file name ends with .sra or .fastq
sra2bam.bat <sra-file>.sra
(or)
fq2bam.bat <fastq-file>.fastq
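
When converting several files, the choice between the two scripts can be made from the file extension. The following is a minimal Python sketch, not part of the kit: `converter_command` is a hypothetical helper, and it assumes that sra2bam.bat and fq2bam.bat take the input file as their only argument, as in the usage above.

```python
from pathlib import Path

# Map each supported extension to the kit script named in the usage above.
SCRIPTS = {".sra": "sra2bam.bat", ".fastq": "fq2bam.bat"}


def converter_command(input_file):
    """Return the command line (as a list) for converting input_file to BAM.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    suffix = Path(input_file).suffix.lower()
    if suffix not in SCRIPTS:
        raise ValueError(f"expected a .sra or .fastq file, got: {input_file}")
    return [SCRIPTS[suffix], str(input_file)]
```

On Windows, the returned list could then be run from the kit folder with `subprocess.run(["cmd", "/c", *converter_command("sample.sra")], check=True)`, where `sample.sra` is a placeholder file name.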

Prerequisites: 64-bit Windows

Download : SRA_FASTQ to BAM Kit.zip (3.2 GB)

Change Log
    Version 1.0
    • Initial Release.

    Friday, July 31, 2015

    8300 year old Ancient DNA of Kennewick Man

    The authors sequenced DNA from an 8,358-year-old man from Kennewick, Washington State, USA. I converted the raw data of this sample into formats familiar to genetic genealogists. The complete SNPs are available for download. I also extracted the SNPs in common with those tested by DNA testing companies such as FTDNA, 23andMe and Ancestry, and uploaded them to GEDmatch as Kit# F999970. Haplogroups, site location, age of the sample, etc., as reported by the authors, can be found on the Ancient DNA page.
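
The filtering step mentioned above amounts to keeping only the sample's calls at positions that a testing company also tests. The post does not describe the file formats or the tool actually used, so the following is only a minimal Python sketch under stated assumptions: `filter_common_snps` is a hypothetical function, SNPs are keyed by (chromosome, position), and both inputs use the same reference build.

```python
def filter_common_snps(sample_calls, chip_positions):
    """Keep only the calls whose (chromosome, position) is on the chip list.

    sample_calls   : dict mapping (chromosome, position) -> genotype string
    chip_positions : iterable of (chromosome, position) tested by a company
    Both are assumed to use the same reference build.
    """
    chip = set(chip_positions)  # set membership keeps the filter fast
    return {site: gt for site, gt in sample_calls.items() if site in chip}


# Made-up illustration data, not taken from any real sample or chip.
example_calls = {("1", 1000): "AA", ("1", 2000): "CT", ("2", 500): "GG"}
example_chip = [("1", 2000), ("2", 500), ("3", 42)]
common = filter_common_snps(example_calls, example_chip)
```

A small overlap, as noted for some of the other samples on this page, would show up here as a short `common` dictionary.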

    Download: 
    Reference:
    Rasmussen, Morten, Martin Sikora, Anders Albrechtsen, Thorfinn Sand Korneliussen, J. Víctor Moreno-Mayar, G. David Poznik, Christoph PE Zollikofer et al. "The ancestry and affiliations of Kennewick Man." Nature (2015).

    Data Used

    Sunday, July 5, 2015

    Y-STR Kit

    Y-STR Kit analyses a .BAM raw data file or a VCF file and writes an HTML report with all Y-STR values. It supports build 37 (hg19). If you select a VCF file, it must contain SNPs, indels and all confident sites (not just the variants). It currently supports the FTDNA 111 Y-STR markers.
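
Before selecting a VCF file, it can help to check that it really contains non-variant sites and not just the variants. The following is a minimal Python sketch, not part of the kit: `count_sites` is a hypothetical helper, and it assumes that non-variant records carry "." in the ALT column, which is how the VCF format marks a site with no alternate allele.

```python
import gzip


def count_sites(vcf_path):
    """Count variant and non-variant records in a VCF file (plain or .gz).

    Assumption: a record with "." in the ALT column is a non-variant site.
    Returns a (variant, non_variant) tuple.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    variant = non_variant = 0
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header lines and blank lines
            alt = line.split("\t")[4]  # columns: CHROM POS ID REF ALT ...
            if alt == ".":
                non_variant += 1
            else:
                variant += 1
    return variant, non_variant
```

If the second number is zero, the file most likely holds only variants and would not meet the requirement described above.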

    The tool produces the following output:
    • Y-STR_Report.html - Output HTML Report
    • bam_chrY.vcf.gz - VCF output with Indels, SNPs and all confident sites.
    Prerequisites: 
    Usage:

    Extract the download and run 'Y-STR Kit UI.exe'. Select the .BAM or VCF file and click 'Analysis'. After clicking 'Execute', a command prompt will open automatically and start executing a series of commands.

    User Interface

    Y-STR Report

    After a few minutes to several hours, the output will be available inside a subfolder called 'out'.

    Download:  Y-STR Kit.zip (76 MB)

    Configuration Guide: Y-STR Kit Guide.pdf

    Source Code:
    Located in the 'src' folder and/or uploaded to GitHub

    License: The download bundles third-party software for pre-processing BAM and VCF files. If you are using this tool for non-commercial and/or personal use, you should be alright.
    My own binary and source code can be used under the MIT License.

    References:
    • Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. [PMID: 19505943]
    • McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303. [Pubmed]
    Change Log :1.1
    • Bug Fix - Unable to load BAM from folders with spaces fixed.
    Change Log :1.0
    • FTDNA 111 Y-STR Markers

    Friday, July 3, 2015

    Ancient DNA from Peştera cu Oase, Romania

    The authors analyzed DNA from a 37,000-42,000-year-old modern human from Peştera cu Oase, Romania. They found that on the order of six to nine percent of the genome of the Oase individual is derived from Neanderthals, more than any other modern human sequenced to date. Three chromosomal segments of Neanderthal ancestry are over 50 cM in size, indicating that this individual had a Neanderthal ancestor as recently as four to six generations back. The Oase individual does not share more alleles with later Europeans than with East Asians, suggesting that the Oase population did not contribute substantially to later humans in Europe. I converted the raw data of this sample into formats familiar to genetic genealogists. The complete SNPs are available for download. However, few SNPs remain in common after filtering with the SNPs tested by DNA testing companies such as FTDNA, 23andMe and Ancestry, so the data was not uploaded to GEDmatch. Haplogroups, site location, age of the sample, etc., as reported by the authors, can be found on the Ancient DNA page.

    Download: 
    Reference:
    Trinkaus, Erik, Oana Moldovan, Adrian Bîlgăr, Laurenţiu Sarcina, Sheela Athreya, Shara E. Bailey, Ricardo Rodrigo et al. "An early modern human from the Peştera cu Oase, Romania." Proceedings of the National Academy of Sciences 100, no. 20 (2003): 11231-11236.

    Data Used

    Monday, June 29, 2015

    Ancient DNA of Black Death Victim #8291

    Ancient DNA was sequenced from a tooth of sample #8291, a Black Death victim from c. 1348 AD, East Smithfield Cemetery, London, UK. I converted the raw data of this sample into formats familiar to genetic genealogists. The complete SNPs are available for download. However, few SNPs remain in common after filtering with the SNPs tested by DNA testing companies such as FTDNA, 23andMe and Ancestry, so the data was not uploaded to GEDmatch. Haplogroups, site location, age of the sample, etc., as reported by the authors, can be found on the Ancient DNA page.

    Download: 
    Reference:
    Schuenemann, Verena J., et al. "Targeted enrichment of ancient pathogens yielding the pPCP1 plasmid of Yersinia pestis from victims of the Black Death." Proceedings of the National Academy of Sciences 108.38 (2011): E746-E752.

    Data Used