Sunday, November 27, 2016

Using Longest Segment in Endogamous DNA Analysis

When you're from an endogamous population, you tend to share a lot of small segments with others from that population--which leads to relationship predictions that are significantly closer than the actual relationships.  For example, someone who's predicted to be a second cousin may actually be your eighth cousin in 12 ways.  In such populations, the total amount of shared DNA is not necessarily indicative of the actual relationship.  So what can be done?

I don't really have over 2100 fourth cousins--let alone that many who have tested at Ancestry

Many of my known relatives have tested.  So let's take a look at both the amounts of shared DNA I have with each of these individuals as well as the longest segment I share with each.

| Name | Actual Relationship | Shared cM | Longest Segment (cM) | Comments |
|---|---|---|---|---|
| Sonia | Grandmother | 1325 | 219 | |
| Sid | Uncle | 1959 | 193 | |
| Eddie | Uncle | 1849 | 143 | |
| Ruth | Great Aunt | 1004 | 96 | |
| Karen | 1C1R | 461 | 44 | |
| (Karen's 2C) | 2C1R | 164 | 22 | |
| Scott | 1C1R | 686 | 61 | |
| George | 1C2R | | | |
| (George's 1C) | | | | |
| (Mitzi's & George's 1/2 1C) | Half 1C2R | 209 | 73 | |
| Ken | 2C2R | 175 | 57 | |
| (2nd cousin to Ken; 1/2 second cousin to Sue) | 2C2R | 108 | 16 | |
| (1st cousin to Myron; 2nd cousin to Ken; 1/2 second cousin to Sue) | 2C2R | 104 | 21 | |
| (Myron's niece) | 3C1R | 89 | 16 | |
| (1/2 second cousin to Ken, Marilyn and Myron) | Half 2C2R | 104 | 21 | |
| Beth | 2C1R | 232 | 45 | 2 AJ grandparents |
| (Beth's nephew) | 3C | 281 | 57 | 1 AJ grandparent |
| (Beth's niece; Dave's 1C) | 3C | 110 | 19 | 1 AJ grandparent |
| Elise | 3C1R | 142 | 27 | |
| Judith | 3C1R | 193 | 29 | |
| Jonathan (Judith's son) | 4C | 182 | 29 | |
| (Judith's sister) | 3C1R | 166 | 29 | |
| Pat | 2C1R | 118 | 23 | |
| Ben | 3C, 4C | 101 | 13 | Has 2 Supkoff great grandparents |
| Howard | 4C | 41 | 9 | 50% AJ |
Table Showing Actual & Genetic Relationships Between Myself and Known Relatives
Most of these known relatives--even someone as distant as my fourth cousin Jonathan--share at least one reasonably large segment with me; some share more than one.  So looking at the longest segment one has in common with one's matches can be a good strategy to identify people who may actually be relatively closely related.  Twenty or more cM seems to be a decent benchmark to warrant further investigation.
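
For anyone who keeps a match list in a spreadsheet, the 20 cM benchmark can be applied mechanically. The following is only a rough sketch in Python: the `Match` record, the `worth_investigating` function and the `min_longest_cm` parameter are names made up for this illustration (they are not part of any Ancestry tool), and the rows are a handful copied from the table above.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """One DNA match (hypothetical record layout for this sketch)."""
    name: str
    relationship: str
    shared_cm: int
    longest_cm: int


# A few rows copied from the table of my known relatives.
MATCHES = [
    Match("Karen", "1C1R", 461, 44),
    Match("Ken", "2C2R", 175, 57),
    Match("Jonathan", "4C", 182, 29),
    Match("Pat", "2C1R", 118, 23),
    Match("Ben", "3C, 4C", 101, 13),
    Match("Howard", "4C", 41, 9),
]


def worth_investigating(matches, min_longest_cm=20):
    """Return matches whose longest segment meets the threshold, longest first."""
    keep = [m for m in matches if m.longest_cm >= min_longest_cm]
    return sorted(keep, key=lambda m: m.longest_cm, reverse=True)


for m in worth_investigating(MATCHES):
    print(f"{m.name}: longest {m.longest_cm} cM, total {m.shared_cm} cM")
```

With these rows, Ken, Karen, Jonathan and Pat pass the filter, while Ben and Howard fall below it. The threshold is left as a parameter because 20 cM is a rule of thumb from my own matches, not a fixed cutoff.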

However, there are a few people here who wouldn't have been found using this strategy--people like Ben, Howard, Myron and Sara.  All happen to be related on my mother's side--so I took a look at how much each shares with my mother's brother Eddie and their first cousin Karen--who not only are one generation closer to each person than I am, but who also would have inherited different DNA than my mother did:

| Name | Actual Relationship | Shared cM | Longest Segment (cM) | Comments |
|---|---|---|---|---|
| Myron | 2C1R | 231 | 49 | |
| Sara | 3C | 161 | 23 | |
| Ben | 2C1R, 3C1R | 146 | 17 | |
| Howard | 3C1R | 86 | 12 | 50% AJ |
Table Showing Actual & Genetic Relationships Between Uncle Eddie and Known Relatives
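Eddie shares a significant segment with Myron (49 cM) and a reasonably-sized one with Sara (23 cM)--enough to make me start investigating if I hadn't known them before.  Ben's and Howard's longest segments with Eddie (17 and 12 cM) are a bit longer than the ones they share with me (13 and 9 cM), but not by much.  How about if I look at Karen?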

| Name | Actual Relationship | Shared cM | Longest Segment (cM) | Comments |
|---|---|---|---|---|
| Myron | 2C1R | 161 | 29 | |
| Sara | 3C | 125 | 20 | |
| Ben | 2C1R, 3C1R | 149 | 30 | |
| Howard | 3C1R | 130 | 43 | 50% AJ |
Table Showing Actual & Genetic Relationships Between Karen and Known Relatives
My mother's first cousin Karen shares a segment of at least 20 cM with each of these four individuals--in particular Howard (43 cM), with whom Eddie and I didn't share much at all.

So while looking at the longest shared segment is a reasonable strategy for identifying relatives, some actual relatives may be missed; however, that risk can be reduced by also looking at the longest segments those matches share with your known relatives who have tested.
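
The cross-check against known relatives can be sketched the same way. This is an illustration only, with hypothetical names (`LONGEST_CM`, `best_segment_per_match`, `flagged`); the numbers are the longest-segment values for Ben, Howard, Myron and Sara from the tables above, keyed by which tester (me, Eddie or Karen) shares them.

```python
# Longest shared segment in cM, keyed by tester and then by match.
LONGEST_CM = {
    "me": {"Ben": 13, "Howard": 9},
    "Eddie": {"Myron": 49, "Sara": 23, "Ben": 17, "Howard": 12},
    "Karen": {"Myron": 29, "Sara": 20, "Ben": 30, "Howard": 43},
}


def best_segment_per_match(longest_cm):
    """For each match, find the tester sharing the longest segment with them."""
    best = {}
    for tester, segments in longest_cm.items():
        for match, cm in segments.items():
            if match not in best or cm > best[match][1]:
                best[match] = (tester, cm)
    return best


def flagged(longest_cm, min_longest_cm=20):
    """Keep matches whose best segment across all testers meets the threshold."""
    return {
        match: tester_cm
        for match, tester_cm in best_segment_per_match(longest_cm).items()
        if tester_cm[1] >= min_longest_cm
    }


for match, (tester, cm) in sorted(flagged(LONGEST_CM).items()):
    print(f"{match}: {cm} cM with {tester}")
```

Using only my own results, neither Ben nor Howard would be flagged; once Eddie's and Karen's results are included, all four clear 20 cM (Ben and Howard through Karen, Myron and Sara through Eddie).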

In an upcoming post I'll discuss how contacting matches with long segments in common worked.

Note:  I'm on Twitter.  Follow me (@larasgenealogy).



  1. I have also found this a more predictive method than overall cM. But is there a minimum amount of cM you would look for as well? That is, I have some matches that share one long segment over 20 cM, but not much more than that. Is that one segment itself significant without much more shared? In any event, in this case, as in most cases, we've not been able to find a connection anyway.

    1. There I haven't seen the same sort of pattern. I'm planning on writing up some successes (and failures) in finding connections by looking at the largest segment--and I can take a look at total shared cM to see if it makes a difference.

  2. I look forward to your analysis!

  3. Interesting thoughts to think about in terms of DNA analysis - thanks for sharing