Genetic Diversity in Indian
Populations
{Feature has been uploaded by CSIR (Unit for Science Dissemination), Ministry of Science & Technology, New Delhi}
Home to more than one billion people, India is a land of matchless diversity in
many ways. Scores of culturally distinct communities inhabit the nation, each
with its own language, religion, customs and cuisine. This ethnic diversity is
plain to see, but the human populations of this country are also distinct at the
level of genes, the hereditary material that is passed on from one generation to
the next. This is the finding of a joint team of Indian and American scientists,
with key players from the Centre for Cellular and Molecular Biology (CSIR),
Hyderabad, India, and from Harvard Medical School, the Harvard School of Public
Health and the Broad Institute of Harvard and MIT.
Interestingly, if we look at our genetic material, the DNA molecule, any two
unrelated individuals differ by just 0.1%; the remaining 99.9% of their DNA is
identical. All the amazing human diversity at the level of genes is thus housed
in this variable, and apparently tiny, portion of our DNA. This portion,
comprising some three million base pairs, is a storehouse of clues that has
helped scientists to reconstruct the historical origins of human populations in
India. It also holds the many genetic variations that place some individuals at
a higher risk of certain diseases than others.
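The arithmetic behind the three-million figure can be checked in a few lines.
This is only a back-of-the-envelope sketch: it assumes a genome of roughly three
billion base pairs (the figure quoted in Box 1), and the variable names are
purely illustrative.

```python
# Rough check of the "0.1% of our DNA" figure.
# Assumption: a human genome of about 3 billion base pairs (see Box 1).
GENOME_SIZE_BP = 3_000_000_000
FRACTION_DIFFERENT = 0.001  # 0.1% differs between two unrelated people

# Round to the nearest whole base pair to absorb floating-point noise.
variable_bp = round(GENOME_SIZE_BP * FRACTION_DIFFERENT)
print(f"Approximate number of differing base pairs: {variable_bp:,}")
```

Under these assumptions the variable portion comes to about three million base
pairs, in line with the number quoted above.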
For this study of genetic variability across human populations in India, about
5.6 lakh (560,000) genetic markers were analyzed across the genomes of 132
individuals. They were selected from 25 diverse groups representing 13 states,
all six language families, traditionally upper and lower castes, and tribal
groups. An important revelation of the study, led by Lalji Singh and David Reich
and published in the 24 September 2009 issue of Nature, is that different Indian
groups carry genomic material from two distinct ancestral populations. The
‘Ancestral North Indians’ (ANI), who are related to western Eurasians,
contributed 40-80% of the ancestry of Indian populations. The rest comes from
the ‘Ancestral South Indians’ (ASI), who are not related to any group outside
India. ANI ancestry was found to be significantly higher in Indo-European than
in Dravidian speakers, which suggests that populations descending from ASI may
have spoken a Dravidian language before mixing with populations descending from
ANI.
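To make the idea of an ancestry proportion concrete, the sketch below treats a
present-day group's allele frequencies as a simple blend of two ancestral
frequencies and recovers the blending fraction by least squares. This is an
illustrative toy model, not the statistical method used in the study. The
function name and all the frequencies are made up for the example, and in
reality no unmixed ANI or ASI samples are available to measure directly.

```python
def estimate_ani_fraction(p_group, p_ani, p_asi):
    """Least-squares estimate of a in: p_group ~ a * p_ani + (1 - a) * p_asi.

    Each argument is a list of allele frequencies, one entry per marker.
    Hypothetical helper for illustration only.
    """
    numerator = sum((g - s) * (n - s) for g, n, s in zip(p_group, p_ani, p_asi))
    denominator = sum((n - s) ** 2 for n, s in zip(p_ani, p_asi))
    if denominator == 0:
        raise ValueError("ancestral frequencies are identical at every marker")
    return numerator / denominator


# Toy allele frequencies at four markers (invented numbers).
p_ani = [0.9, 0.2, 0.6, 0.1]
p_asi = [0.3, 0.8, 0.2, 0.5]

# Simulate a group with 60% ANI ancestry, then recover that fraction.
true_fraction = 0.6
p_group = [true_fraction * n + (1 - true_fraction) * s
           for n, s in zip(p_ani, p_asi)]

estimate = estimate_ani_fraction(p_group, p_ani, p_asi)
print(f"Estimated ANI fraction: {estimate:.2f}")
```

With real data the fit would never be exact, and the ancestral frequencies
themselves would have to be inferred, which is why far more careful statistics
are needed in practice.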
The genetic markers analyzed are sites of variation known as single nucleotide
polymorphisms (SNPs). Blood samples were collected from selected individuals of
the 25 groups. DNA was extracted, and all DNA samples were genotyped on
Affymetrix 6.0 arrays (DNA chips) and analyzed for variation at 560,123 SNPs.
The scientists then employed novel statistical approaches to study the genetic
variation among these individuals. Allele frequency differentiation among the
groups, as well as inbreeding within each group, was assessed using specialized
software. The scientists also developed a new toolkit for understanding the
relationships among population groups and thus tracing their history.
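One common way to express allele frequency differentiation between two groups
is the fixation index, usually written Fst. The sketch below computes a basic
form of it from per-marker allele frequencies. It is a simplified illustration
that assumes two equally weighted groups and applies no sample-size correction.
It is not the software or estimator the scientists used, and the function name
and frequencies are invented for the example.

```python
def fst_two_groups(p1, p2):
    """Basic Fst for two groups from allele frequencies at two-allele markers.

    Uses Fst = sum(Ht - Hs) / sum(Ht), where Ht is the expected heterozygosity
    of the pooled groups and Hs is the average within-group heterozygosity.
    Hypothetical helper: equal group weights, no sample-size correction.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        p_bar = (a + b) / 2
        h_total = 2 * p_bar * (1 - p_bar)
        h_within = (2 * a * (1 - a) + 2 * b * (1 - b)) / 2
        numerator += h_total - h_within
        denominator += h_total
    if denominator == 0:
        raise ValueError("no marker varies in the pooled groups")
    return numerator / denominator


# Toy allele frequencies at three markers (invented numbers).
group_a = [0.10, 0.50, 0.80]
group_b = [0.30, 0.45, 0.60]
print(f"Fst between the two toy groups: {fst_two_groups(group_a, group_b):.3f}")
```

A value of 0 means the two groups have identical allele frequencies, while a
value of 1 means they carry entirely different alleles at every marker.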
This work has thus revealed, through modern genomic technology, that almost all
Indian groups, including the traditional ‘tribes’ as well as ‘castes’, descend
from mixtures of the ANI and ASI ancestral populations. Significantly higher
ANI ancestry was found in traditionally upper castes than in middle and lower
caste groups. According to CCMB scientists, it is impossible to distinguish
castes from tribes using the data, which supports the view that castes grew
directly out of tribal-like organizations during the formation of Indian
society.
The study has also revealed that the Andamanese, a small population of
indigenous people of the Andaman Islands, appear to be related exclusively to
the Ancestral South Indian lineage and to lack Ancestral North Indian ancestry
altogether. This opens a door to the history of the Ancestral South Indians,
who diverged from other Eurasians probably tens of thousands of years ago.
Studies of genetic variation in tribal populations, which have remained
isolated from the modern world, are key not only to unlocking the mystery of
our own origins but also to understanding the genetic basis of complex
diseases. Many environmental risk factors tied to modern lifestyles, such as an
unhealthy diet and lack of physical exercise, which trigger many complex
diseases, are uncommon among tribal people. Studies of isolated tribal
populations would therefore make it possible to separate genetic factors from
environmental risk factors for these diseases. To this end, CCMB has undertaken
a large project to study human genetic diversity in the tribal and caste
populations of India in collaboration with the Anthropological Survey of India.
It has also come to light that the ancestry of many groups in modern India can
be traced back to a small number of founding individuals, and that these groups
have remained genetically isolated from other groups for thousands of years,
with limited gene flow owing to endogamy, or marriage within the group. Such
‘founder events’, as they are called, are the root cause of the exceptionally
high incidence of some genetic diseases found only among Indians. According to
Lalji Singh, former director of CCMB and a Bhatnagar Fellow who has pioneered
work in this field, India is genetically not a single large population but
comprises many smaller isolated populations that have descended from several
founder events.
Founder events are known to increase the incidence of recessive genetic
diseases in other human populations, such as Finns and Ashkenazi Jews. The same
is most likely the case for many groups in India, where inter-caste marriage is
taboo. According to the researchers, founder effects are responsible for an
even higher burden of recessive diseases in India than consanguinity. This can
be confirmed, they say, by a systematic survey of Indian groups to identify the
communities that have descended from the strongest founder events. Such a
survey would help in pinning down the genes responsible for many devastating
genetic diseases, opening the door to effective therapies and appropriate
clinical care for affected individuals and those at risk.
The history of population structure in India, therefore, has its roots in two
ancestral populations, ANI and ASI, and it is the widespread mixture of these
populations that underlies the amazing genetic variation seen in many Indian
groups. The concepts of ancestral genomic content, its mixture throughout India
and the importance of founder events have assumed significance because they
have serious implications for the health of the Indian populations. Further
research in this field would aim to estimate when the mixture of these
populations might have occurred. For this, a detailed study of the length of
the genetic stretches of ANI ancestry in Indian samples assumes importance.
Another area of scientific interest is the history of the ANI and ASI
populations before they began to mix.
India, the world's second most populous nation, is remarkable for its
diversity. Be it geographic or climatic diversity, the diversity in the
languages, religions and cultures of its people, or the genetic diversity now
evident, it is our very diversity that imparts strength to our oneness.
Box 1
India Cracks the Human Genome (2009)
In a groundbreaking effort, CSIR scientists at the Institute of Genomics and
Integrative Biology (IGIB), New Delhi, completed the first human genome
sequencing in India in December 2009, setting the stage for India’s entry into
individual genomics and opening up new possibilities in disease diagnostics and
treatment. The sequenced genome was that of an anonymous healthy individual
from Jharkhand. While the first human genome sequence took over a decade and a
whopping 3 billion US dollars to complete, CSIR accomplished the task in only
45 days, spending Rs. 15 lakhs (US$ 30,000).
The IGIB scientists generated over 51 gigabases of data using sequencing
technology that reads millions of fragments of the genetic material in
parallel, each as short as 76 base pairs. These small DNA fragments, once
sequenced, are mapped back to the reference genome. This herculean task of
finding the sequence of the entire human genetic material, comprising three
billion base pairs, was made possible by the CSIR supercomputing facility at
IGIB. With this achievement, India became the sixth country, after the US,
China, Korea, Canada and the UK, to demonstrate the capability of sequencing
and assembling a complete human genome.
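The figures quoted above give a feel for the scale of the task. The short
calculation below is a rough sketch that takes them at face value: it assumes
exactly 51 gigabases of data, a uniform fragment length of 76 base pairs and a
genome of three billion base pairs, whereas the article gives the first two
only as approximate bounds. The variable names are illustrative.

```python
# Rough scale of the sequencing run, using the figures quoted in the text.
# Assumptions: exactly 51 gigabases, every fragment 76 bp, 3 billion bp genome.
TOTAL_BASES_SEQUENCED = 51_000_000_000
FRAGMENT_LENGTH_BP = 76
GENOME_SIZE_BP = 3_000_000_000

# Average number of times each position in the genome was read.
average_coverage = TOTAL_BASES_SEQUENCED / GENOME_SIZE_BP

# Approximate number of fragments that had to be mapped to the reference.
fragment_count = TOTAL_BASES_SEQUENCED // FRAGMENT_LENGTH_BP

print(f"Average coverage: about {average_coverage:.0f}x")
print(f"Fragments to map: about {fragment_count / 1e6:.0f} million")
```

Under these assumptions each base of the genome was read about 17 times on
average, and several hundred million short fragments had to be placed on the
reference, which helps explain the need for a supercomputing facility.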
Understandably, sequencing a human genome requires high computational
capability and technological know-how in handling sophisticated machines and
analyzing huge volumes of data. The first human genome sequencing initiative
was conceived as early as 1984. In addition to the United States, the
‘International Human Genome Project Consortium’ comprised geneticists from the
United Kingdom, France, Germany, Japan and China. The International Human
Genome Project formally started in 1990 and was completed in 2003; the
individual genomes of Craig Venter, James Watson and an anonymous Chinese
individual were sequenced in the years that followed. CSIR achieved its feat by
adopting new technologies and effectively integrating complex information
technology tools with analytical capabilities.
Sequencing the human genome helps us to understand the variations at the
genetic level that make two individuals different. More importantly, since
there is an association between genetic variants and predisposition to
diseases, human genome sequencing would be enormously important in the
diagnosis and management of various diseases, including cancer. Interestingly,
the sequencing of the Indian genome has revealed a large number of hitherto
unknown variations, including single nucleotide polymorphisms (SNPs) as well as
many insertions and deletions in our genetic material. Understanding the
functional role of these variations would help identify markers linked to
specific diseases, which could then be screened for to predict diseases before
they strike.
Earlier, CSIR scientists also completed the genome sequencing of the zebrafish,
an organism widely used to model human diseases, whose genome is about half the
size of the human genome. With this feat, India became the first country to
sequence the wild type strain of zebrafish.
Box 2
Another Door Opened – Genetic Diversity Mapped in Asia
Housing 60% of the human
inhabitants of planet Earth, Asia – the world's largest continent – is a huge
melting pot of genetic diversity. The contributors of this exceedingly rich
human resource are the scores of unknown ancestors who migrated from different
parts of the world and settled down in this region over thousands of years.
Ancestral human populations are believed to have originally spread out from
Africa and then slowly adapted to different parts of the globe under the
pressures of climate, food and health conditions. The present genetic diversity
of the Asian populations stems from those best-adapted individuals who proved
most fit to survive in a given place. By tracking the ancestry of human
populations through tell-tale signs written in every person’s genes, scientists
can establish a link between two geographically separated groups of people. To
understand the genetic history of the people living in Asia, over 90 scientists
from the Human Genome Organization’s (HUGO’s) Pan-Asian SNP Consortium
undertook the genetic mapping of Southeast Asian (SEA) and East Asian (EA)
populations; the findings were published in the December 2009 issue of Science.
The approach rests on tracing certain ‘marker’ genes: for example, a gene that
may bestow on the individual an advantage of better survival in a particular
environment, or a disease-gene marker that can be tracked back in time to
discover the human population in which the altered (mutated) gene may have
originated.
In this unique attempt, 1,928 unrelated individuals were studied, representing
73 populations and 10 linguistic lineages from 10 countries: mainland China,
India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea,
Taiwan and Thailand. To establish the genetic differences between two unrelated
individuals, scientists look at more than three million differences in their
genetic material. Variations at the level of single nucleotides are commonly
referred to as single nucleotide polymorphisms (SNPs). Tracking such genetic
variations through human migrations provides clues to the evolution of diseases
and of genetic diversity. Genotyping of more than 50,000 SNPs was done at eight
different centres, while the filtering of the collected data was centralized to
maximize the standardization of results. This genetic mapping of people
inhabiting different parts of Asia has opened the door to understanding the
migratory patterns in human history as well as the genetic basis of many
diseases afflicting the human populations of this region.
This study has revealed that populations from the same linguistic group tend to
cluster together, which means that there is considerable relatedness within
ethnic and linguistic groups. It has also revealed that there was a
south-to-north migration of East Asians, which means that the majority of the
East Asian gene pool has been derived from Southeast Asia. According to the
study, the most recent common ancestors of Asians arrived first in India.
Later, some of them migrated to Thailand, and also south to Malaysia,
Indonesia and the Philippines. The first group of settlers must have gone very
far south before they settled successfully. These included the Malay Negritos,
the Philippine Negritos, the East Indonesians and the early settlers of the
Pacific Islands. Later, one or several groups migrated north and mixed with
previous settlers there, giving rise to the populations now known as
Austronesian, Austro-Asiatic, Tai-Kadai, Hmong-Mien, Altaic and others.
Interestingly, most of the Indian populations showed evidence of shared
ancestry with European populations.
The significance of the study is perhaps best captured in the words of
Professor Samir Brahmachari, former Director General, CSIR: “We have breached
political and ideological boundaries to show that the people of Asia are linked
by a unifying genetic thread.”