Thursday, August 16, 2018

Color Clustering: Top 25 Fourth Cousins

For more on Color Clustering & DNA, please visit my new website at: www.danaleeds.com 

For another look at how Color Clustering works with 4th cousins, I created a Color Cluster chart then added the test taker's top twenty-five 4th cousin matches. I was able to easily sort all but one of these 4th cousins into Color Clusters!

Original Color Cluster Chart (click here for Color Cluster Method)

Color Cluster chart for actual test taker.
Names changed for privacy.

This test taker's AncestryDNA 2nd & 3rd cousins fell into 4 Color Clusters, labeled C1 through C4, with three "unclustered" cousins.

  • 2nd & 3rd cousins who are in more than one cluster are in red: Barbie, Ken, & Mark.
  • 2nd & 3rd cousins who did not have shared matches with other 2nd & 3rd cousins are in "unclustered" columns: Lena, Sue, and Mike.

Color Cluster chart LABELED

Labeled Color Cluster chart

The test taker's four sets of great grandparents were identified from her research as follows:

  • G1 - Bailey/Bowman (father's father's line)
  • G2 - Stark/Dunn (father's mother's line)
  • G3 - Hillard/Morris (mother's father's line)
  • G4 - Washington/Manning (mother's mother's line)
I looked at each cousin's tree and did one of the following:
  • Put a "NO" in the cell if there was not a tree and I couldn't easily identify to which cluster(s) the cousin belonged.
  • Typed G1, G2, G3, G4, or a combination of those in the colored cell if the person had a tree and I could determine which surnames they fit under, OR if the genealogist had already researched and discovered the relationship herself. (Note: the "unclustered" cousin, Lena, was identified as belonging to G4.)
I then labeled the columns according to the cousins found in them: G1, G1, G2, G3/G4, G4, unclustered, and unclustered.

Adding 4th Cousins

Twenty-five 4th Cousins added into Color Clusters

Directly below this Color Cluster chart, I added the names of the first twenty-five 4th cousins. For each cousin, I looked at AncestryDNA's "Shared Matches" and determined which 2nd/3rd cousins they were matching. I colored in the appropriate cell and labeled the cell with the number of shared centimorgans (cM).

2nd/3rd/4th cousin Color Cluster chart

Above is the final chart which includes all of the 2nd/3rd cousins (sharing <400 cM) and, below it, the first twenty-five 4th cousin matches. A few things to note about the 4th cousins:

  • Owen - at this point, Owen is still not in a cluster
  • Mary & Bill - they both matched previously "unclustered" cousin Sue, so the three created a new cluster. We do not know what part of the family this cluster belongs to at this point
  • Others - a few did not match any 2nd/3rd cousins, but when I opened their top match, they DID match a 2nd/3rd cousin, so I added them to that column
NOTE: Trees were not used to match the 4th cousins to the appropriate Color Clusters. The sorts were based only on shared matches. This Color Cluster method is a quick, visual way to see how your cousins are related.

If you give this method a try, please let me know what you think and how it works for you.

Happy Sorting!

Monday, August 13, 2018

Color Clustering: Working with "4th Cousins"

Please see an updated version of this post and more on the Leeds Method of DNA Color Clustering on my new website, www.danaleeds.com

If you haven't read my first two posts about the Color Clustering (aka Leeds) Method, read the original posts:


An example of Color Clustering using Excel

I thought this method would be too messy to work with 4th cousins. But, I figured out yesterday how to make it work: I built my clusters based on the shared matches of 2nd and 3rd cousins and then I just sorted the "4th cousins" into these clusters!  

Here are the steps I used:

STEP 1: Create a Color Cluster chart (see first post)
Using all of AncestryDNA's predicted 2nd & 3rd cousins (who share less than 400 cM with the test taker), create a color cluster chart. (Note: If you are not comfortable with spreadsheets, you can use colored pencils and paper or whatever you have on hand!)

Example of test taker whose DNA sorted
into 4 Color Clusters plus one 
Unclustered (purple) match, Drew
Depending on which relatives have tested, Color Clustering often results in 4 columns which are related to the four sets of great grandparents. See the original post for examples and possible explanations of cases where there are not 4 columns created. 

Note: One match, Mona (red print), sorted into TWO columns. She most likely is related to the test taker through BOTH the yellow and orange families.

STEP 2: Identify these columns if possible.


In this case, we were able to determine the relationship of the test taker to the 4 clusters (C1 through C4). If you cannot identify some (or any) of these groups, you can skip this step.

STEP 3: Compare 4th Cousins' Shared Matches to your Color Cluster Chart


Color Clustering using 4th Cousin Matches
(last 10 in grey).
Below the original Color Clustering, I wrote the names of the test taker's first ten "4th cousin" matches (in grey boxes). For each person, I opened the Shared Matches and looked to see which 2nd and 3rd cousin names they matched with and assigned them that color. Note: This is not proof that they are related to that branch of your family, but it is a strong clue! (I do not continue to add more columns; I am only determining which color cluster these matches match!)
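
If you keep your chart in a script instead of a spreadsheet, the step above can be sketched roughly like this in Python. This is only a minimal sketch, not a finished tool: the function name `sort_into_clusters`, the data layout, and every name except Mona are made up for illustration, and it assumes you have already copied each 4th cousin's Shared Matches by hand.

```python
# Hypothetical chart from Steps 1-2: each 2nd/3rd cousin and the color(s) they hold.
chart = {
    "Mona": {"orange", "yellow"},
    "Cousin A": {"blue"},
    "Cousin B": {"green"},
}

# Hypothetical Shared Matches copied down for each 4th cousin.
fourth_cousin_shared = {
    "Fourth 1": {"Cousin A"},
    "Fourth 2": {"Mona", "Someone not on the chart"},
    "Fourth 3": set(),  # no 2nd/3rd cousin shared matches
}

def sort_into_clusters(fourth_cousin_shared, chart):
    """Give each 4th cousin the colors of the 2nd/3rd cousins they share matches with."""
    result = {}
    for cousin, shared in fourth_cousin_shared.items():
        colors = set()
        for name in shared:
            # Names that are not on the chart contribute no color.
            colors |= chart.get(name, set())
        result[cousin] = colors
    return result

assignments = sort_into_clusters(fourth_cousin_shared, chart)
for cousin, colors in assignments.items():
    print(cousin, sorted(colors) or "unsorted")
```

No new columns are created here. The sketch only reports which existing colors each 4th cousin lands in, and an empty result means the cousin stays unsorted for now.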

STEP 4: Sort 4th cousins who do not have 2nd or 3rd cousin matches by looking at their shared matches.  

Showing 4th cousin, Teresa, had a shared
match that matched Mona so she was
assigned to both the orange & yellow clusters.
One 4th cousin match, Teresa (in red print), did not share any matches with the 2nd and 3rd cousins. But when I opened the shared matches of her closest match, that person matched Mona. Since Mona is in the Orange & Yellow clusters, Teresa was assigned to both clusters.
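
Here is one way the "look one step further" idea could be sketched in Python. It is an illustration under assumptions, not a definitive implementation: the function name `colors_for` is invented, "Pat" is a made-up stand-in for Teresa's closest shared match, and each shared-match list is assumed to be ordered with the closest match first.

```python
# Colors held by the 2nd/3rd cousins on the chart (Mona's colors are from the example).
chart = {"Mona": {"orange", "yellow"}}

# Hypothetical shared-match lists, closest match first. "Pat" is an invented name.
shared = {
    "Teresa": ["Pat"],
    "Pat": ["Mona", "Teresa"],
}

def chart_colors(names, chart):
    """Collect the colors of any names that appear on the chart."""
    colors = set()
    for name in names:
        colors |= chart.get(name, set())
    return colors

def colors_for(cousin, shared, chart):
    """Try the cousin's own shared matches first, then their closest match's."""
    own = shared.get(cousin, [])
    direct = chart_colors(own, chart)
    if direct or not own:
        return direct
    closest = own[0]  # assumption: lists are ordered by shared cM
    return chart_colors(shared.get(closest, []), chart)

teresa_colors = colors_for("Teresa", shared, chart)
print(sorted(teresa_colors))
```

A color found this second-hand way is a weaker clue than a direct shared match, so it may be worth marking those cells differently on your chart.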

Note: As with most techniques, this method works best when the branches of your family - especially your 4 sets of great grandparents - are completely unrelated. But, one of the neatest things about this method is that your matches do NOT have to have FAMILY TREES and this will STILL WORK!!

Note: While the above example uses real data, the names have been changed for privacy. Also, this test taker had a single random person in a 5th column without a tree or any 2nd/3rd cousin matches. We have not identified this "unclustered" Purple match.

Happy Sorting!

Wednesday, August 8, 2018

Color Clustering: Identifying "In Common" Surnames

Please see an updated version of this post and more on the Leeds Method of DNA Color Clustering on my new website, www.danaleeds.com

After creating Color Clusters using the new Color Cluster Method (aka Leeds Method), the next step is to identify the surnames associated with these groups. (For creating Color Clusters, please read my original Color Clustering post.)

Note: This method is especially useful for people working with adoptees or other unknown parentage cases where they do not already know what surnames to concentrate on!

COLOR CLUSTERS: Identifying Common Surnames

STEP 1: Create Color Clusters and determine which clusters you need to work with (or work with all of them).
Actual data from an adoptee I worked with,
but names changed for privacy.
In this case, the adoptee identified the Blue Cluster as her biological mother's. We were trying to identify her biological father, so we concentrated on the Orange and Yellow Clusters. (The Green column did not have a cluster.)

STEP 2:  Determine which matches have trees and which do not and label.


I look at each match and see if they have a tree - whether attached or not attached! I then label them to indicate "tree" or "no tree."

STEP 3: List the "4th Gen" (great grandparents) surnames for each match with a tree. If their tree doesn't reach the 4th Generation, use the surnames of grandparents or even parents.

To find the surnames, open the match's "pedigree and surnames" page and look at the surnames under the "4th Gen" column. If their tree is complete enough, you will see 8 surnames at this level - the match's great grandparents. In this example, both Gabby and Jamie have all 8 great grandparents listed on their tree along with their surnames.

STEP 4: Identify common surnames, if any, in each Color Cluster.


(I find this step truly amazing!) I have highlighted the shared surnames:
  • Orange Cluster: Griffin & Bartles
  • Yellow Cluster: Paulson, Austin, and Gray
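
If you have typed the surnames into a list, finding the ones in common can be sketched in a few lines of Python. This is a rough sketch under assumptions: the function name `common_surnames` and the match names are invented, and every surname other than Griffin and Bartles is filler added only so the example has something to ignore.

```python
from collections import Counter

# Hypothetical "4th Gen" surnames for the matches with trees in one cluster.
orange_cluster_trees = {
    "Match A": {"Griffin", "Bartles", "Filler1", "Filler2"},
    "Match B": {"Griffin", "Bartles", "Filler3"},
    "Match C": {"Griffin", "Filler4"},
}

def common_surnames(trees, min_trees=2):
    """Return surnames that appear in at least min_trees different trees."""
    counts = Counter()
    for surnames in trees.values():
        # Use a set so a surname counts only once per tree.
        counts.update(set(surnames))
    return sorted(name for name, n in counts.items() if n >= min_trees)

shared_names = common_surnames(orange_cluster_trees)
print(shared_names)
```

Spelling variants are not handled here, so similar surnames (for example, one with and one without a final "s") would still need to be compared by eye.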
STEP 5: Assign potential surnames to the Color Clusters, if identified, and use these clues to further your research!
At this point, you have clues as to what surnames you are looking for in each cluster. Continue your research using these clues!

You also might be able to look at first cousins or other "close family" matches to help label these clusters. (And, a big thank you to John Motzi for his help in refining this process!)

Happy Clustering!

Monday, August 6, 2018

Color Clustering: Creating Color Clusters

Please see an updated version of this post and more on the Leeds Method of DNA Color Clustering on my new website, www.danaleeds.com

Unsure of how other people were sorting their Shared Matches from AncestryDNA, I created my own method. This method is quick - it usually takes less than 10 minutes - and visually shows genetic connections while also "sorting" the matches into groups reflecting the test taker's great grandparents' lines.

Please test out this method and let me know what you think! Although I think it will be valuable for many genealogists, I think it will be especially useful for adoptees, Search Angels, and others who are trying to identify unknown, close relatives.

NOTE: For the examples below, all results are real, but the names are fictitious.

COLOR CLUSTERING: The Method

STEP 1:
Using AncestryDNA, list all of the matches it labels as "second" or "third" cousins, but skip any second cousin who shares more than 400 cM.
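
For anyone building the list in code, the selection rule above might look something like this in Python. This is just a sketch: the category strings, the shared cM numbers, and the function name `select_for_chart` are all assumptions for illustration and do not reflect any AncestryDNA export format.

```python
# Hypothetical match list: (name, predicted category, shared cM).
matches = [
    ("Alice", "2nd cousin", 450),
    ("Ralph", "2nd cousin", 380),
    ("Robert", "3rd cousin", 210),
    ("Herbert", "3rd cousin", 120),
    ("Stacy", "4th cousin", 60),
]

MAX_CM = 400  # the cutoff used in this method

def select_for_chart(matches, max_cm=MAX_CM):
    """Keep predicted 2nd/3rd cousins who share no more than max_cm."""
    return [
        name
        for name, category, cm in matches
        if category in ("2nd cousin", "3rd cousin") and cm <= max_cm
    ]

chart_names = select_for_chart(matches)
print(chart_names)
```

The order of the input list is preserved, so if you enter matches from highest to lowest shared cM, the chart list stays in that order too.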


STEP 2:
Assign a color to your first DNA match (for example, blue to Ralph.)


STEP 3:
Open the shared matches for that person (Ralph), and assign them each the same color in the same column (blue).


STEP 4: 
Find the first person who does not have a color assigned (Robert), and assign him a color in the next column (orange).


STEP 5:
Open the shared matches for that person (Robert), and assign them each the same color in the same column (orange).


STEP 6:
Continue steps 4 & 5 until all of the matches on your list have at least one color assigned to them.
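
The steps above can also be sketched in Python for those who would rather script the chart than color it by hand. This is a minimal sketch under assumptions, not an official implementation: the function name `color_cluster` is invented, columns are numbered instead of colored, and the shared-match data (including Drew as a match with no shared matches) is made up for illustration.

```python
def color_cluster(names, shared):
    """Assign column numbers to matches.

    names  : matches in list order (top match first)
    shared : dict of match name -> set of names in that match's Shared Matches
    Returns a dict of match name -> list of column numbers.
    """
    columns = {name: [] for name in names}
    next_column = 0
    for name in names:
        if columns[name]:
            continue  # already has at least one color, so skip
        column = next_column
        next_column += 1
        columns[name].append(column)
        # Give every shared match on the list the same column.
        for other in shared.get(name, set()):
            if other in columns and column not in columns[other]:
                columns[other].append(column)
    return columns

# Hypothetical data: column 0 plays the role of blue, column 1 of orange.
names = ["Ralph", "Herbert", "Robert", "Stacy", "Drew"]
shared = {
    "Ralph": {"Herbert"},
    "Robert": {"Stacy", "Herbert"},
    "Drew": set(),
}

chart = color_cluster(names, shared)
for name in names:
    print(name, chart[name])
```

A match that ends up with more than one column number, like Herbert here, is the scripted version of a row with two colors.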




COLOR CLUSTERING: Analyzing the Results

4 Columns/No Overlap:

If your results show 4 distinct clusters, like below, without any overlap, your sort is likely showing matches to your 4 sets of great grandparents.


Fewer than 4 Columns:

If your results show fewer than 4 clusters (for example, 3), it is likely these clusters represent 3 of your 4 sets of great grandparents and that no one from the 4th set of great grandparents has tested at the 2nd/3rd cousin level.



Some Overlap:

If your results show 4 clusters but some of your matches have been assigned more than one color (for example, Herbert & Stacy are both blue and orange), there are two likely explanations. Your sort may be showing your 4 sets of great grandparents, with the overlap telling you that two of the clusters (i.e. blue & orange) are on the same side of your family. Or, the overlapping clusters (blue & orange) might belong to one set of great grandparents, in which case you are missing matches for 1 of your 4 sets of great grandparents.
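
Overlap is easy to see on a colored chart, but if your chart lives in code, a small Python sketch like this can list which clusters overlap and who links them. The function name `overlapping_clusters` and the data layout are assumptions for illustration. Only Herbert & Stacy's two colors come from the example above.

```python
from itertools import combinations

# Hypothetical chart: each match and the color(s) assigned to them.
chart = {
    "Ralph": ["blue"],
    "Herbert": ["blue", "orange"],
    "Stacy": ["blue", "orange"],
    "Robert": ["orange"],
}

def overlapping_clusters(chart):
    """Map each pair of colors to the matches that carry both colors."""
    pairs = {}
    for name, colors in chart.items():
        for pair in combinations(sorted(colors), 2):
            pairs.setdefault(pair, []).append(name)
    return pairs

overlaps = overlapping_clusters(chart)
for (first, second), names in overlaps.items():
    print(first, "&", second, "share:", ", ".join(names))
```

The sketch only reports the overlap. Deciding which of the two explanations fits still takes the genealogy work.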



Lots of Overlap:

In this real example, there is a lot of overlap between all of the clusters except the yellow and brick red clusters. All of the overlapping clusters are on the maternal side of this test taker and visually show a lot of cousins marrying cousins, resulting in pedigree collapse. The father's mother's side is represented by both the yellow and brick red clusters. The father's father's side has no cousins matching at the 2nd/3rd cousin levels. So, even though there are a lot of clusters and matches, this sort represents only 3 of the 4 sets of great grandparents for this individual.





 A special thank you to everyone who allowed me to access their DNA and gave me feedback!

Please be aware: Your results may vary! This new method is still in its infancy and more test cases are needed to see how it works in various situations.

TIP: When I say "2nd and 3rd cousins," I am using the categories Ancestry has used to define them. The 3rd cousins appear to go down to 90 shared cM which works out well for this process.

TIP: If your chart is "too messy," look at the shared cM of your top matches and take off any that are above 400 shared cM. Then redo the chart. Hopefully, it'll be a lot "cleaner!"

Happy Clustering!
