Combining Ancestry and 23andMe Data with AWK

I'm a big fan of learning about health and genetics, and for me, there is no better site than Genetic Lifehacks, which has a ton of well written and accessible detailed articles. With membership, it also allows you to connect your own genetic data to explain whether various genetic factors apply to you.

I originally just had 23andMe data, but in order to get more SNPs, I also took the AncestryDNA test. There is overlap in coverage between the two tests, but each also has unique SNPs (single nucleotide polymorphisms) for various traits.

The first thing I wanted to do on Genetic Lifehacks was to combine the files from both services for a more comprehensive coverage in the articles. However, I encountered a few issues:

  1. Sometimes, the site would have an entry for an SNP, but the data would be -- or 00, indicating missing data. In some cases, the data was only missing in one dataset, but present in the other.
  2. There were some SNPs that had different values in the two datasets. In fact, I found 27 differences out of a possible 165,406 SNPs! While that's a small percentage, I wanted to catch them, especially if an important SNP is involved.

Combine Your Files Here

If you just want to combine your files without running any commands, use this tool. It does everything described below, directly in your browser. Your data never leaves your computer — all processing happens locally in JavaScript.

23andMe File
Drop file or click to select
AncestryDNA File
Drop file or click to select

If this was useful, feel free to buy me a coffee :)