Sib-pair analysis was for some time the only tool available for identifying chromosomal regions that potentially harbour genetic variants influencing a phenotype of interest. The approach detects excess allele sharing and was initially performed with microsatellites. It consists of an identity-by-descent analysis of highly informative markers, which allows the parental haplotypes to be reconstructed and their co-segregation in the offspring to be followed. We performed such an analysis on a unique collection of sib-pairs and their families, collected by the New England Centenarian Study (NECS), and identified a significant peak on 4q25. Follow-up analysis failed to identify genetic variants that could explain the initial linkage finding. Rare mutations that segregate in centenarian siblings can be captured by sib-pair analysis, but not by genetic association studies, which lose power as the allele frequency of the tested polymorphisms drops. Furthermore, linkage analysis may identify chromosomal regions in which several causative genetic variants reside, so that the sum of their effects determines the linkage result, whereas follow-up genetic association analyses test one common polymorphism at a time or, at most, haplotypes. Attempts to replicate the initial linkage did not succeed, with the exception of one early effort that replicated the linkage at D4S1564 [49, 50]. A failed replication can be due to an initial false-positive finding or to differences between the populations used for replication in terms of genetic background, the environment exerting the demographic pressure, the ages of the participants, the number of families and the genetic markers adopted. Recently, Kunkel’s laboratory published a well-performed re-analysis of a subset of the sib-pairs used in the initial study, plus new sib-pairs recruited by Elixir Pharmaceuticals.
Notably, some of the largest and most impressive families (those that were genotyped first and that immediately showed significant linkage on 4q25 in the original study) were either not analysed or only partly analysed in this second effort. The new analysis adopted a high-density panel of SNP markers to genotype the participants, allowing better coverage of the genome. It did not replicate the chromosome 4q25 finding, except when the same stringent criteria were adopted to select a subset of centenarian families. Interestingly, a new peak on chromosome 3p24-22 reached the significance threshold, and a second peak at 9q31-34 was highly suggestive of linkage. This latter peak had also appeared in the previous analysis with microsatellites, although less robustly. The attempt to identify the genetic variant or variants responsible for the 4q25 peak initially pointed to an encouraging variant in the promoter of the microsomal triglyceride transfer protein (MTP) gene. Unfortunately, the finding was not replicated by an independent effort or by our own analysis, which included more controls [32, 53].
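The excess allele sharing that underlies sib-pair linkage can be illustrated with a simple mean-sharing test. The sketch below is not the method used in the studies discussed above; it assumes fully informative markers, so that the number of alleles shared identical by descent (IBD) is known for every pair, and all counts are hypothetical. Under no linkage, sibs share 0, 1 or 2 alleles IBD with probabilities 1/4, 1/2 and 1/4, so the expected sharing proportion is 0.5 with a variance of 1/8 per pair.

```python
import math


def mean_sharing_z(ibd_counts):
    """Z statistic for excess IBD sharing in sib pairs (mean test).

    ibd_counts: one entry per sib pair, the number of alleles (0, 1 or 2)
    shared identical by descent at the marker. Assumes fully informative
    markers, which is an idealisation.
    """
    n = len(ibd_counts)
    mean_proportion = sum(count / 2 for count in ibd_counts) / n
    # Null hypothesis: mean proportion 0.5, variance 1/8 per pair.
    standard_error = math.sqrt(0.125 / n)
    return (mean_proportion - 0.5) / standard_error


# Hypothetical data: 100 sib pairs at a linked marker and at an unlinked one.
excess_pairs = [0] * 15 + [1] * 50 + [2] * 35
null_pairs = [0] * 25 + [1] * 50 + [2] * 25

z_excess = mean_sharing_z(excess_pairs)
z_null = mean_sharing_z(null_pairs)

print(f"linked marker:   z = {z_excess:.2f}")
print(f"unlinked marker: z = {z_null:.2f}")
```

With these invented counts the linked marker gives a z statistic of about 2.83 and the unlinked marker gives 0. In real data, markers are only partly informative and IBD status has to be estimated, which is why dense or highly polymorphic marker panels matter.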
It is plausible that different approaches are needed to follow up genetic linkage findings, in order to identify rare variants that co-segregate in families. To this end, exome sequencing data, intersected with linkage data, could yield interesting results.
Notably, the 4q25 locus harbours elongation of very long chain fatty acids protein 6 (ELOVL6), the elongase that converts C16:0 into C18:0 and C16:1 into C18:1. Polymorphisms in this gene have been associated with insulin sensitivity; a mouse deficient for this gene had high levels of C16:1 (palmitoleic acid) and did not acquire insulin resistance after a high-fat diet [54, 55]. C16:1 has been identified as an adipose tissue-derived lipid hormone that strongly stimulates muscle insulin action and suppresses hepatosteatosis. Genetically modified, long-living worms show a striking correlation between their increase in life-span and their palmitoleic acid levels. This is remarkable if we consider the increased level of palmitoleic acid that we observed in centenarians’ offspring and that the gene encoding the major modifier of palmitoleic acid levels (i.e. ELOVL6) is located in the 4q25 longevity locus [11, 48]. Re-sequencing this gene in centenarians could lead to the identification of rare variants able to influence its activity.
Thus, the old approach of linkage analysis, when combined with the new technologies of high-throughput re-sequencing, could produce novel and interpretable results. Re-sequencing alone, because of the enormous amount of information generated, would force the application of a very large statistical correction for multiple testing, which would cause the loss of most, if not all, of the potential findings, as happens with GWAS.
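The cost of multiple testing can be made concrete with a Bonferroni correction, used here only as a simple and conservative illustration. The variant counts below are hypothetical round numbers, not figures from any study, and the idea of restricting the tests to a linkage interval is one reading of the argument above rather than a prescribed protocol.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05
# Hypothetical numbers of variants tested (assumptions for illustration only).
N_VARIANTS_GENOME_WIDE = 1_000_000
N_VARIANTS_IN_LINKAGE_REGION = 2_000

genome_wide_threshold = bonferroni_threshold(ALPHA, N_VARIANTS_GENOME_WIDE)
region_threshold = bonferroni_threshold(ALPHA, N_VARIANTS_IN_LINKAGE_REGION)
relaxation_factor = region_threshold / genome_wide_threshold

print(f"genome-wide threshold:    {genome_wide_threshold:.1e}")
print(f"linkage-region threshold: {region_threshold:.1e}")
print(f"threshold relaxed by a factor of about {relaxation_factor:.0f}")
```

Under these assumed counts, limiting the tests to variants inside the linkage interval relaxes the per-test threshold several hundred-fold, which is what would allow a rare variant with a modest p-value to survive correction.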
Furthermore, multivariate models based on machine-learning algorithms, such as Bayesian networks, classification and regression trees (CART) and support vector machines (SVM), are able to overcome the limitations of the “one-SNP-at-a-time” testing strategies usually employed for identifying causative variants. In particular, these approaches allow a more in-depth comprehension of the molecular mechanisms underlying multifactorial traits, such as longevity, which result from the interaction of genetic variants (SNPs, mutations) with environmental and clinical determinants (e.g. diet, stress, comorbidities). In this context, bioinformatics plays a key role, allowing genetic information to be managed at a genome-wide level and to be integrated with the clinical information available.
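The limitation of single-marker testing can be shown with a deliberately extreme toy case. In the sketch below, which uses invented data and is not drawn from any of the cited studies, the phenotype depends only on the interaction between two SNPs: it is present when exactly one of the two variants is carried. Each SNP tested alone shows no association, whereas a rule that considers the joint genotype (the kind of structure a multivariate model can represent) classifies every individual correctly.

```python
from itertools import product

# Hypothetical cohort: 100 individuals for each combination of carrier status
# (0 or 1) at two SNPs. Phenotype is 1 only when exactly one variant is carried.
cohort = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(100)]


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table [[n00, n01], [n10, n11]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator if denominator else 0.0


def single_snp_table(snp_index):
    """Cross-tabulate carrier status at one SNP against the phenotype."""
    table = [[0, 0], [0, 0]]
    for individual in cohort:
        table[individual[snp_index]][individual[2]] += 1
    return table


def joint_genotype_accuracy():
    """Accuracy of predicting the majority phenotype for each joint genotype."""
    counts = {}
    for a, b, phenotype in cohort:
        counts.setdefault((a, b), [0, 0])[phenotype] += 1
    return sum(max(pair) for pair in counts.values()) / len(cohort)


chi_snp_a = chi_square_2x2(single_snp_table(0))
chi_snp_b = chi_square_2x2(single_snp_table(1))
joint_accuracy = joint_genotype_accuracy()

print(f"chi-square, SNP A alone: {chi_snp_a:.2f}")
print(f"chi-square, SNP B alone: {chi_snp_b:.2f}")
print(f"accuracy using both SNPs jointly: {joint_accuracy:.2f}")
```

Real traits are never this clean, and pure interactions without any marginal effect are a limiting case, but the example shows why a variant can be invisible to marginal tests and still informative in a joint model.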