Monday, August 13, 2018

Color Clustering: Working with "4th Cousins"

An example of Color Clustering using Excel

I thought this method would be too messy to work with 4th cousins. But, I figured out yesterday how to make it work: I built my clusters based on the shared matches of 2nd and 3rd cousins and then I just sorted the "4th cousins" into these clusters!  

Here are the steps I used:

STEP 1: Create a Color Cluster chart (see first post)
Using all of AncestryDNA's predicted 2nd & 3rd cousins (who share less than 400 cM with the test taker), create a color cluster chart. (Note: If you are not comfortable with spreadsheets, you can use colored pencils and paper or whatever you have on hand!)

Example of a test taker whose matches sorted into 4 Color Clusters plus one Unclustered (purple) match, Drew
Depending on which relatives have tested, Color Clustering often results in 4 columns that correspond to the four sets of great grandparents. See the original post for examples and possible explanations of cases where 4 columns are not created.

Note: One match, Mona (red print), sorted into TWO columns. She most likely is related to the test taker through BOTH the yellow and orange families.

STEP 2: Identify these columns if possible.

In this case, we were able to determine the relationship of the test taker to the 4 clusters (C1 through C4). If you cannot identify some (or any) of these groups, you can skip this step.

STEP 3: Compare 4th Cousins' Shared Matches to your Color Cluster Chart

Color Clustering using 4th Cousin Matches (last 10 in grey).
Below the original Color Clustering, I wrote the names of the test taker's first ten "4th cousin" matches (in grey boxes). For each person, I opened the Shared Matches and looked to see which 2nd and 3rd cousin names they matched with and assigned them that color. Note: This is not proof that they are related to that branch of your family, but it is a strong clue! (I do not continue to add more columns; I am only determining which color cluster these matches match!)
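For readers who prefer code to a spreadsheet, here is a minimal Python sketch of the lookup described in Step 3. It is only an illustration, not part of the original workflow: every name except Mona is made up, the Shared Matches lists are assumed to be typed in by hand, and `colors_for` is a hypothetical helper.

```python
# Colors given to the 2nd/3rd cousins in Step 1.
# Mona's two colors come from the example; all other names are hypothetical.
cluster_colors = {
    "Mona": {"yellow", "orange"},
    "Alice": {"yellow"},
    "Ben": {"orange"},
    "Cara": {"green"},
}

# Shared Matches for each "4th cousin", copied by hand (hypothetical data).
fourth_cousin_shared = {
    "Frank": ["Alice"],
    "Gina": ["Ben", "Mona"],
    "Hal": [],
}

def colors_for(shared_names, cluster_colors):
    """Collect the colors of every 2nd/3rd cousin found in a Shared Matches list."""
    colors = set()
    for name in shared_names:
        # Names that are not in the chart contribute no color.
        colors |= cluster_colors.get(name, set())
    return colors

# Assign colors to each 4th cousin; no new columns are created.
assignments = {
    cousin: colors_for(shared, cluster_colors)
    for cousin, shared in fourth_cousin_shared.items()
}

for cousin, colors in assignments.items():
    print(cousin, sorted(colors) if colors else "no color yet")
```

A 4th cousin who ends up with no color (Hal in this made-up data) is a candidate for Step 4. As in the chart, a color here is a clue, not proof of the relationship.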

STEP 4: Sort 4th cousins who have no 2nd or 3rd cousins among their shared matches by looking at the shared matches of their shared matches.

4th cousin Teresa had a shared match that matched Mona, so she was assigned to both the orange & yellow clusters.
One 4th cousin match, Teresa (in red print), did not have any of the 2nd or 3rd cousins among her shared matches. But when I opened the shared matches of her closest shared match, that person matched Mona. Since Mona is in the Orange & Yellow clusters, Teresa was assigned to both clusters.
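The Teresa example can be sketched in Python as a one-hop fallback. This is a rough illustration under stated assumptions: "Pat" is a made-up name for Teresa's closest shared match, each list is assumed to be ordered from closest to most distant match, and both function names are hypothetical.

```python
# Colors from the Step 1 chart (only Mona is needed for this example).
cluster_colors = {"Mona": {"orange", "yellow"}}

# Shared Matches lists, assumed to be ordered closest match first.
# "Pat" is a hypothetical stand-in for Teresa's closest shared match.
shared_matches = {
    "Teresa": ["Pat"],
    "Pat": ["Mona", "Teresa"],
}

def colors_for(names, cluster_colors):
    """Collect the colors of every charted cousin in a list of names."""
    colors = set()
    for name in names:
        colors |= cluster_colors.get(name, set())
    return colors

def colors_via_closest_match(cousin, shared_matches, cluster_colors):
    """Try the cousin's own shared matches first, then those of the closest one."""
    listed = shared_matches.get(cousin, [])
    direct = colors_for(listed, cluster_colors)
    if direct or not listed:
        return direct
    closest = listed[0]  # assumption: first entry is the closest shared match
    return colors_for(shared_matches.get(closest, []), cluster_colors)

print(sorted(colors_via_closest_match("Teresa", shared_matches, cluster_colors)))
```

The sketch stops after one hop, which mirrors the example. If the closest shared match also has no 2nd or 3rd cousins, the result is an empty set and the match stays unsorted.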

Note: As with most techniques, this method works best when the branches of your family - especially your 4 sets of great grandparents - are completely unrelated. But, one of the neatest things about this method is that your matches do NOT have to have FAMILY TREES and this will STILL WORK!!

Note: While the above example uses real data, the names have been changed for privacy. Also, this test taker had a single random person in a 5th column without a tree or any 2nd/3rd cousin matches. We have not identified this "unclustered" Purple match.

  1. Can't tell you how much I appreciate your nitty-gritty step-by-step explanations. I've put off analyzing at this cousin level, but now I think I'll try. TY!

    1. Marian, You're so welcome! Let me know how it goes!

  2. Thank you for this method, I have finally got mine sorted out & it has helped me pinpoint where one particular 4th cousin fits. But whilst my father's side of the family is fairly normal (only 1 overlap that side) my mother's side looks like rows of crepe paper streamers with up to 3 colour overlaps on a lot of the matches. Sadly I am pretty sure I know why I have "pedigree collapse" on her side of the family & it looks like a nasty family secret has been confirmed. But the good thing is that I have a good chance of breaking down a major brick wall on my Dad's side now that I have identified the mystery 4th cousin, so I am excited that your method worked so well for me. Thank you! :)

    1. Nice!!! Please let me know if this helps you break down your brick wall! Would love to hear about that :)

    2. Hi Dana, I did finally break down my brick wall & your colour clustering helped me do it. The story is very long & involved (too long for posting here), could I email it to you?


  3. What if your 4th cousins' shared matches' shared matches also only match 4th cousins, nothing closer?

    1. Hi, Lisa. I just wrote a bit about this. But, basically, you can try opening the highest matches of these 4th cousins and see if any of them match the 2nd/3rd cousins. If not, so far, I am also stuck. But, this usually works!

  4. My 87 year old brother-in-law is adopted. His non-identifying info for NYS indicated that his parents were born in 1903 - English Protestant Canadian carpet weaver and Black Canadian barber (who passed on an Ashkenazi Y-DNA!). It looks like I'm starting to see lines based on the European, African, and Jewish grandparent contributors.

    1. Nice, Sonia! I'm finding this method to be really helpful when working with adoptees.

  5. Hi Dana. This is a fabulous tool! I have a question, whose answer may seem obvious, but I'm asking anyway. I'm searching for an unknown father with a known mother and all of the matches I am color clustering are for the paternal side. Would this mean that I am looking for 2 sets of great grandparents for cluster matches instead of 4 sets?

    1. Hi, Susan. I'm not sure I understand... are you clustering all of the matches and then just concentrating on the father's side? Or are you only clustering the father's side? If just the father's, yes, you'd expect to see 2 sets of great grandparents.

    2. Dana, Yes, I am just clustering the father's side. So your answer confirms my thinking. Thanks so much.


