Thursday, September 25, 2014

Autosomal DNA Converter (*nix)

In Unix-like environments, autosomal DNA raw-data files can be converted with a few standard commands. On Windows, you can use the Autosomal DNA Converter (Windows) tool instead. Unix-based systems need no special tools for the conversion: the commands below work out of the box.

Prerequisites: Any Unix-based system

Converting 23andMe to FamilyTreeDNA format

$ echo "RSID,CHROMOSOME,POSITION,RESULT" > output.csv
$ grep -v '^#' input.txt | awk -F'\t' '{ print "\""$1"\",\""$2"\",\""$3"\",\""$4"\""; }' >> output.csv

Note: The input.txt is the 23andMe autosomal file and the output.csv will be in FTDNA format.
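
To make the mapping concrete, here is a minimal sketch that runs the same conversion on a tiny sample. The file names sample_23andme.txt and sample_ftdna.csv and the two data rows are illustrative assumptions, not taken from a real download.

```shell
# Hypothetical sample in 23andMe layout: tab-separated, '#' comment header.
printf '# rsid\tchromosome\tposition\tgenotype\n' > sample_23andme.txt
printf 'rs4477212\t1\t82154\tAA\n' >> sample_23andme.txt
printf 'rs3094315\t1\t752566\tAG\n' >> sample_23andme.txt

# Write the FTDNA header, then quote each field and join the fields with commas.
echo "RSID,CHROMOSOME,POSITION,RESULT" > sample_ftdna.csv
grep -v '^#' sample_23andme.txt \
  | awk -F'\t' '{ print "\"" $1 "\",\"" $2 "\",\"" $3 "\",\"" $4 "\"" }' >> sample_ftdna.csv

cat sample_ftdna.csv
```

With this sample, the output should be the header line followed by "rs4477212","1","82154","AA" and "rs3094315","1","752566","AG".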

Converting Ancestry to FamilyTreeDNA format

$ echo "RSID,CHROMOSOME,POSITION,RESULT" > output.csv
$ grep -v '^#' input.txt | grep -v '^rsid' | awk -F'\t' '$2 != 24 && $2 != 25 { if ($2 == 23) $2 = "X"; print "\""$1"\",\""$2"\",\""$3"\",\""$4$5"\""; }' >> output.csv

Note: The input.txt is the Ancestry autosomal file and the output.csv will be in FTDNA format.
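
The sketch below shows the Ancestry conversion on a small sample, with the chromosome filter applied to the chromosome column only, so that a position which happens to contain "24" or "25" is not dropped by accident. The file names, rsIDs and positions are made up for illustration.

```shell
# Hypothetical sample in Ancestry layout: five tab-separated columns, numeric chromosomes.
{
  printf '#AncestryDNA raw data download\n'
  printf 'rsid\tchromosome\tposition\tallele1\tallele2\n'
  printf 'rs0000001\t1\t242500\tA\tG\n'
  printf 'rs0000002\t23\t2700157\tC\tC\n'
  printf 'rs0000003\t24\t2655180\tT\tT\n'
  printf 'rs0000004\t25\t1000000\tG\tA\n'
} > sample_ancestry.txt

echo "RSID,CHROMOSOME,POSITION,RESULT" > sample_ftdna.csv
grep -v '^#' sample_ancestry.txt | grep -v '^rsid' \
  | awk -F'\t' '$2 != 24 && $2 != 25 {
      chr = ($2 == 23) ? "X" : $2    # Ancestry chromosome 23 becomes X
      print "\"" $1 "\",\"" chr "\",\"" $3 "\",\"" $4 $5 "\""
    }' >> sample_ftdna.csv

cat sample_ftdna.csv
```

Only the first two data rows survive: "rs0000001","1","242500","AG" (kept although its position contains both 24 and 25) and "rs0000002","X","2700157","CC".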

Converting FamilyTreeDNA to 23andMe format

$ echo -e "# rsid\tchromosome\tposition\tgenotype" > output.txt
$ tail -n +2 input.csv | cut -d, -f1-4 | tr -d '"' | tr ',' '\t' >> output.txt

Note: The input.csv is the FTDNA autosomal file and the output.txt will be in 23andMe format.
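
As a worked example, the following sketch converts a tiny FTDNA-style sample. It uses tr for the quote and comma handling, since \t in a sed replacement is a GNU extension and may not work with other sed implementations. The file names and rows are illustrative assumptions.

```shell
# Hypothetical sample in FTDNA layout: quoted, comma-separated, one header line.
{
  printf 'RSID,CHROMOSOME,POSITION,RESULT\n'
  printf '"rs0000001","1","82154","AA"\n'
  printf '"rs0000002","X","2700157","CT"\n'
} > sample_ftdna.csv

# Write the 23andMe header, then drop the CSV header, strip quotes, and turn commas into tabs.
printf '# rsid\tchromosome\tposition\tgenotype\n' > sample_23andme.txt
tail -n +2 sample_ftdna.csv | cut -d, -f1-4 | tr -d '"' | tr ',' '\t' >> sample_23andme.txt

cat sample_23andme.txt
```

The result is a three-line, tab-separated file: the commented header and one line per SNP.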


Converting Ancestry to 23andMe format

$ echo -e "# rsid\tchromosome\tposition\tgenotype" > output.txt
$ grep -v '^#' input.txt | grep -v '^rsid' | awk -F'\t' '$2 != 25 { if ($2 == 23) $2 = "X"; else if ($2 == 24) $2 = "Y"; print $1"\t"$2"\t"$3"\t"$4$5; }' >> output.txt

Note: The input.txt is the Ancestry autosomal file and the output.txt will be in 23andMe format.


Converting 23andMe to Ancestry format

$ echo -e "rsid\tchromosome\tposition\tallele1\tallele2" > output.txt
$ grep -v '^#' input.txt | grep -v '^rsid' | awk -F'\t' '{ if ($2 == "X") $2 = 23; else if ($2 == "Y") $2 = 24; print $1"\t"$2"\t"$3"\t"substr($4,1,1)"\t"substr($4,2,1); }' >> output.txt

Note: The input.txt is the 23andMe autosomal file and the output.txt will be in Ancestry format.
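
The one-liner splits the genotype into two characters as-is. The sketch below is a possible variant that also handles two edge cases, under assumptions you should check against your own files: that 23andMe marks no-calls as "--" while Ancestry expects 0, and that a single-letter call (as may appear on X or Y) should be repeated in both allele columns. The file names and sample rows are hypothetical.

```shell
# Hypothetical 23andMe-style sample with a no-call and a single-letter call.
{
  printf '# rsid\tchromosome\tposition\tgenotype\n'
  printf 'rs0000001\t1\t82154\tAG\n'
  printf 'rs0000002\tX\t2700157\t--\n'
  printf 'rs0000003\tY\t2655180\tT\n'
} > sample_23andme.txt

printf 'rsid\tchromosome\tposition\tallele1\tallele2\n' > sample_ancestry.txt
grep -v '^#' sample_23andme.txt \
  | awk -F'\t' -v OFS='\t' '{
      if ($2 == "X") $2 = 23; else if ($2 == "Y") $2 = 24
      a1 = substr($4, 1, 1); a2 = substr($4, 2, 1)
      if (a1 == "-" || a1 == "") a1 = 0    # assumption: 0 is the no-call code
      if (a2 == "-") a2 = 0
      if (a2 == "") a2 = a1                # assumption: repeat a single-letter call
      print $1, $2, $3, a1, a2
    }' >> sample_ancestry.txt

cat sample_ancestry.txt
```

For the sample above, the X row becomes chromosome 23 with alleles 0 and 0, and the Y row becomes chromosome 24 with alleles T and T.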

Converting FamilyTreeDNA to Ancestry format

$ echo -e "rsid\tchromosome\tposition\tallele1\tallele2" > output.txt
$ tail -n +2 input.csv | tr -d '"' | awk -F, '{ if ($2 == "X") $2 = 23; else if ($2 == "Y") $2 = 24; print $1"\t"$2"\t"$3"\t"substr($4,1,1)"\t"substr($4,2,1); }' >> output.txt

Note: The input.csv is the FTDNA autosomal file and the output.txt will be in Ancestry format.