Mapping of positive selection sites in the HIV-1 genome in the context of RNA and protein structural constraints
1 Institute of Microbiology, University Hospital Center and University of Lausanne, Lausanne, Switzerland
2 Rega Institute for Medical Research, KU Leuven, Leuven, Belgium
3 Global Health Institute, School of Life Sciences, EPFL, Lausanne, Switzerland
4 Eötvös Loránd University, Institute of Biology, Budapest, Hungary
5 Human Immunology Section, Vaccine Research Center, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, Maryland, USA
Retrovirology 2011, 8:87 doi:10.1186/1742-4690-8-87
Published: 1 November 2011
The HIV-1 genome is subject to pressures that target the virus, resulting in escape and adaptation. At the same time, functional and structural constraints require sequence conservation. Mapping the sites of selective pressure and conservation on the viral genome generates a reference for understanding the limits to viral escape, and can serve as a template for the discovery of sites of genetic conflict with known or unknown host proteins.
To build a thorough evolutionary, functional and structural map of the HIV-1 genome, complete subtype B sequences were obtained from the Los Alamos database. We mapped sites under positive selective pressure, amino acid conservation, protein and RNA structure, overlapping coding frames, CD8 T cell, CD4 T cell and antibody epitopes, and sites enriched in AG and AA dinucleotide motifs. Globally, 33% of amino acid positions were found to be variable and 12% of the genome was under positive selection. Because interrelated constraining and diversifying forces shape the viral genome, we included the variables from both classes of pressure in a multivariate model to predict conservation or positive selection: structured RNA and α-helix domains independently predicted conservation, while CD4 T cell and antibody epitopes were associated with positive selection.
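As an illustration of how per-position amino acid variability might be summarized from an alignment, the following is a minimal sketch and not the method used in this study. The function names, the toy alignment and the 0.99 threshold are hypothetical; here a position is called variable when its most common residue falls below that frequency, which is an assumption made only for this example.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, per alignment column, the frequency of the most common residue.

    Gap characters ('-') are ignored; a column containing only gaps yields None.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")
    scores = []
    for i in range(length):
        residues = [s[i] for s in aligned_seqs if s[i] != "-"]
        if not residues:
            scores.append(None)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


def fraction_variable(aligned_seqs, threshold=0.99):
    """Fraction of non-gap columns whose top residue frequency is below threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    scores = [s for s in column_conservation(aligned_seqs) if s is not None]
    if not scores:
        return 0.0
    variable = sum(1 for s in scores if s < threshold)
    return variable / len(scores)


# Hypothetical toy alignment (not real HIV-1 data).
toy_alignment = ["MGARA", "MGARS", "MGSRA", "MG-RA"]
print(column_conservation(toy_alignment))
print(fraction_variable(toy_alignment))
```

A per-position score of this kind could then be combined with structural and epitope annotations as predictors in a multivariate model; the choice of conservation measure and cutoff would need to follow the definitions given in the full text.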
The global map of the viral genome contains positively selected sites that are not in canonical CD8 T cell, CD4 T cell or antibody epitopes; thus, it identifies a class of residues that may be targeted by other host selective pressures. Overall, RNA structure represents the strongest determinant of HIV-1 conservation. These data can inform the combined analysis of host and viral genetic information.