{"id":2520,"date":"2010-08-02T08:57:41","date_gmt":"2010-08-02T08:57:41","guid":{"rendered":"http:\/\/i-base.info\/qa\/?page_id=2520"},"modified":"2025-08-07T10:25:09","modified_gmt":"2025-08-07T10:25:09","slug":"hiv-genome-explained","status":"publish","type":"page","link":"https:\/\/i-base.info\/qa\/factsheets\/hiv-genome-explained","title":{"rendered":"HIV genome: genetic structure and function of HIV explained"},"content":{"rendered":"<p><strong>This information on the\u00a0HIV genome is reproduced with permission from the\u00a0Molecules of HIV website by Dan Stowells.\u00a0This is an excellent non-technical website explaining scientific aspects of HIV and immunology. We reproduce it here to ensure it remains an online resource, but encourage people to visit the original website. All hyperlinks are to the original site.<\/strong><\/p>\n<hr \/>\n<h2>HIV genome<\/h2>\n<p>The full HIV genome is encoded on one long strand of\u00a0<strong>RNA<\/strong>. (In a free virus particle, there are actually two separate strands of\u00a0RNA, but they&#8217;re exactly the same!)<\/p>\n<p>This is the form it has when it is a free virus particle. 
When the virus is integrated into the host&#8217;s\u00a0<strong>DNA<\/strong> genome (as a\u00a0<strong>provirus<\/strong>) then its information too is encoded in\u00a0DNA.<\/p>\n<p>The following image shows roughly how the genes are laid out in HIV (remember that\u00a0<strong>HIV-1 and HIV-2<\/strong> are quite different).<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15853\" src=\"https:\/\/i-base.info\/qa\/wp-content\/uploads\/2020\/07\/hiv-genomes.gif\" alt=\"HIV genome\" width=\"598\" height=\"282\" \/><\/p>\n<p>The genes in HIV&#8217;s genome are as follows:<\/p>\n<ul>\n<li><strong>gag <\/strong>(coding for the <strong>viral capsid proteins<\/strong>)<\/li>\n<li><strong>pol<\/strong> (notably, coding for<strong>\u00a0reverse transcriptase<\/strong>)<\/li>\n<\/ul>\n<p>(NB.\u00a0<strong>gag<\/strong> and\u00a0<strong>pol<\/strong> together can be expressed in one long strand called <strong>gag-pol<\/strong>)<\/p>\n<ul>\n<li><strong>env<\/strong> (coding for HIV&#8217;s envelope-associated proteins)<\/li>\n<\/ul>\n<p>And the regulatory genes:<\/p>\n<ul>\n<li><strong>tat<\/strong><\/li>\n<li><strong>rev<\/strong><\/li>\n<li><strong>nef<\/strong><\/li>\n<li><strong>vif<\/strong><\/li>\n<li><strong>vpr<\/strong><\/li>\n<li><strong>vpu<\/strong> (N.B. not present in HIV-2)<\/li>\n<li><strong>vpx<\/strong> (N.B. not present in HIV-1)<\/li>\n<\/ul>\n<p>The HIV genome also has a <strong>&#8220;Long Terminal Repeat&#8221; <\/strong>(<strong>LTR<\/strong>) at each end of its genome &#8211; not quite a gene, but a sequence of\u00a0RNA\/DNA which is the same at either end and which serves some structural and regulatory purposes.<\/p>\n<h2>gag<\/h2>\n<p><strong>gag<\/strong>\u00a0is one of the three &#8220;main&#8221; genes found in all\u00a0retroviruses (along with\u00a0<strong>env<\/strong> and\u00a0<strong>pol<\/strong>). 
It contains around 1500 nucleotides, and encodes four separate proteins which form the building blocks for the\u00a0<strong>viral core<\/strong>:<\/p>\n<ul>\n<li><strong>Capsid protein<\/strong>, <strong>CA<\/strong>,<strong>\u00a0p24<\/strong><\/li>\n<li><strong>Matrix protein<\/strong>, <strong>MA<\/strong>,\u00a0<strong>p17<\/strong> (this protein isn&#8217;t actually part of the\u00a0viral core but the &#8220;matrix&#8221; which anchors the core to the viral envelope)<\/li>\n<li><strong>Nucleocapsid protein<\/strong>, <strong>NC<\/strong>,\u00a0<strong>p9<\/strong><\/li>\n<li><strong>p6<\/strong><\/li>\n<\/ul>\n<p>The most significant role of the gag gene is therefore to encode important proteins which will make up the\u00a0viral core.<\/p>\n<h2>pol<\/h2>\n<p><strong>pol<\/strong>\u00a0is one of the main retroviral genes. It encodes four proteins, of which the most important is\u00a0<strong>Reverse Transcriptase<\/strong>.\u00a0Reverse Transcriptase performs a job which is unique to\u00a0retroviruses, in that it copies the virus&#8217;\u00a0RNA genome into\u00a0DNA. (Since most organisms and viruses keep their genes in\u00a0DNA form in the first place, they have no need to perform this task.) The copying of the\u00a0HIV genome into\u00a0DNA form is one of the key stages of the\u00a0HIV life-cycle. The other three products of pol are these:<\/p>\n<ul>\n<li><strong>Protease <\/strong>&#8211; which processes proteins made from HIV&#8217;s genome so that they can become part of new fully-functioning HIV particles<\/li>\n<li><strong>RNAse H<\/strong> &#8211; which breaks down the retroviral genome following infection of a cell<\/li>\n<li><strong>Integrase <\/strong>&#8211; which integrates the\u00a0DNA copy of HIV&#8217;s genome into the host\u00a0DNA<\/li>\n<\/ul>\n<h2>env<\/h2>\n<p>The <strong>env<\/strong> gene in HIV encodes a single protein, <strong>gp160<\/strong>. 
(When gp160 is synthesised in the cell, cellular enzymes add complex carbohydrates and turn it from a protein into a glycoprotein &#8211; hence the name &#8220;gp160&#8221; rather than &#8220;p160&#8221;.)<\/p>\n<p><strong>gp160<\/strong> travels to the cell surface, where cellular enzymes again attack it, this time chopping it into two pieces &#8211;\u00a0<strong>gp120<\/strong>, and\u00a0<strong>gp41<\/strong>. If and when new virus particles bud off from the host cell, these two pieces lie on opposite sides of the virus membrane.\u00a0<strong>gp120<\/strong> sits on the outside of the virus particle, forming the virus&#8217;s spikes, while\u00a0<strong>gp41<\/strong> sits just on the inside of the membrane &#8211; each\u00a0<strong>gp41<\/strong> being anchored to a<strong>\u00a0gp120<\/strong> through the membrane.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15854\" src=\"https:\/\/i-base.info\/qa\/wp-content\/uploads\/2020\/07\/gp120-gp41.gif\" alt=\"gp120 and gp41\" width=\"377\" height=\"307\" \/><\/p>\n<p>How many spikes does an HIV particle have?\u00a0It&#8217;s a tricky question, but the answer seems likely to be\u00a0about 9 or 10. This is a lot fewer spikes than you&#8217;ll see on most diagrams of HIV! There&#8217;s a bit of confusion since some studies have concluded that HIV particles normally have 72 spikes, whilst some other studies have concluded that they normally have no more than ten. It&#8217;s hard to say for certain who&#8217;s right&#8230;.<\/p>\n<h2>tat<\/h2>\n<p><strong>tat<\/strong> is short for <strong>transactivator<\/strong> &#8211; it&#8217;s a regulatory gene which accelerates the production of more HIV virus. In fact, it&#8217;s crucial to HIV, because HIV completely fails to replicate itself without it. 
Tat protein is also toxic, so the large amounts of tat protein released into the blood by HIV-infected cells are no help for the body.<\/p>\n<p><strong>tat<\/strong> works because the protein encoded by tat binds to the start of a new HIV\u00a0RNA strand &#8211; a part which has been called the <strong>trans-activation response element<\/strong>\u00a0or <strong>TAR.<\/strong> The TAR runs from +1 to +59, that is to say, the first 59 nucleotides of the HIV genome. Once the cellular machinery has transcribed this much\u00a0provirus into\u00a0RNA, <strong>tat<\/strong> can bind to it and encourage the transcription of the remainder of the HIV genetic code.<\/p>\n<p>You might have also read about the negative regulators which HIV has &#8211; <strong>NRE<\/strong>,\u00a0<strong>nef<\/strong>,\u00a0<strong>vif<\/strong>. Surely it&#8217;s barmy to have genes for boosting virus reproduction as well as genes for suppressing it! Well actually this is the normal way of things down at the level of genes and proteins. The tug-of-war between the suppressors and the activators can result in an incredibly precise control of how much a gene is expressed. Without this tug-of-war control, gene expression would simply depend on how active the cell&#8217;s transcription machinery was &#8211; either all the genes would be expressed a lot, or all the genes wouldn&#8217;t be expressed much at all. Organisms such as cells need more precision than that!<\/p>\n<p><strong>tat<\/strong> protein size: 101 amino acids in naturally-occurring HIV-1 (86\u00a0amino acids in some laboratory-bred types of HIV-1)<\/p>\n<h2>rev<\/h2>\n<p><strong>rev<\/strong> is another of HIV&#8217;s regulator genes. It stimulates the production of HIV proteins, but suppresses the expression of HIV&#8217;s regulatory genes.<\/p>\n<p>How does it achieve this? The messenger RNAs of HIV can either be sent to the protein-producing part of the cell intact, or they can have bits cut out of them first (splicing). 
The intact <strong>mRNA<\/strong> tends to encode HIV proteins (such as envelope and capsid proteins), while the spliced mRNA encodes regulatory proteins such as\u00a0<strong>tat<\/strong> and\u00a0<strong>nef<\/strong>.<\/p>\n<p>So what <strong>rev<\/strong> does is to help intact mRNA to be exported from the cell nucleus. It binds to the mRNA at a specific point (the <strong>RRE<\/strong> or <strong>rev-responsive element<\/strong>), and this complex of\u00a0RNA and rev is sent out of the nucleus. A molecule of rev can &#8220;shuttle&#8221; in and out of the nucleus, potentially taking a new set of\u00a0RNA out each time it leaves the nucleus.<\/p>\n<p>The RRE is not present in completely-spliced HIV mRNA &#8211; it will have been chopped out. Completely-spliced mRNA is sent out of the nucleus by the ordinary cell machinery (without needing help from rev) &#8211; so you could say that rev&#8217;s trick is to cause the mRNA to be exported &#8220;before it&#8217;s ready&#8221;, in a sense.<\/p>\n<h2>nef<\/h2>\n<p>The <strong>negative regulatory factor<\/strong> (<strong>nef<\/strong>) gene encodes a protein which hangs around in the cytoplasm of the cell, and retards HIV replication. Possibly it does this by modifying cellular proteins that regulate the initiation of transcription &#8211; that is, it affects the proteins which tell the cell whether or not to make RNA copies of the\u00a0DNA code.<\/p>\n<p><strong>nef<\/strong> protein size: 27\u00a0kD<\/p>\n<h2>vif<\/h2>\n<p>The <strong>vif<\/strong> gene codes for <strong>virion infectivity factor,<\/strong> a protein that increases the infectivity of the HIV particle.<\/p>\n<p>The protein is found inside HIV-infected cells, and it works by interfering with one of the immune system&#8217;s defences &#8211; a cellular protein called\u00a0<strong>APOBEC3G<\/strong>. 
Basically what happens is that <strong>vif<\/strong> sticks to\u00a0APOBEC3G and encourages the cell to degrade it, preventing it from doing its job of sneaking into newly-formed virus particles and making them non-productive.<\/p>\n<p>This has been verified in experiments. If you can create an HIV virus with the <strong>vif<\/strong> protein missing (we would call this a <strong>delta-vif<\/strong> strain of HIV), then it can still infect a cell &#8211; but the new virus particles produced from that cell contain APOBEC3G and therefore aren&#8217;t very effective at infecting other cells.<\/p>\n<p><strong>vif<\/strong> protein size: 23 kD<\/p>\n<p>Journal articles about <strong>vif<\/strong>:<\/p>\n<ul>\n<li>Navarro F, Landau NR (2004) Recent insights into HIV-1 Vif. Current Opinion in Immunology\u00a016 (4): 477-482<\/li>\n<li>Rose KM, Marin M, Kozak SL, et al. (2004) The viral infectivity factor (Vif) of HIV-1 unveiled. Trends in Molecular Medicine\u00a010 (6): 291-297<\/li>\n<li>Argyris EG, Pomerantz RJ (2004) HIV-1 Vif versus\u00a0APOBEC3G: newly appreciated warriors in the ancient battle between virus and host. Trends in Microbiology\u00a012 (4): 145-148<\/li>\n<\/ul>\n<h2>vpr<\/h2>\n<p><strong>Viral protein R<\/strong> accelerates the production of HIV proteins.<\/p>\n<p>It also facilitates the nuclear localisation of the\u00a0<strong>preintegration complex<\/strong> &#8211; the agglomeration of viral\u00a0RNA and\u00a0reverse transcriptase and\u00a0integrase proteins which must form in order for the\u00a0HIV genome to be integrated into the host cell&#8217;s genome. 
<strong>vpr<\/strong> carries &#8220;nuclear localisation signals&#8221; (sequences of protein which are recognised by cellular machinery as indicating that it should be transported into the nucleus), and in a sense it mimics the behaviour of a protein called importin-beta.<\/p>\n<p>There also seems to be a role for <strong>vpr<\/strong> in stopping the host cell going through the ordinary &#8220;cell cycle&#8221; &#8211; many cells normally go through a regular cycle of splitting to create new cells, but <strong>vpr<\/strong> can stop host cells doing this. It seems that a cell which has been stopped during the so-called &#8220;G2&#8221; phase of the cell cycle is a nicer environment for HIV replication.<\/p>\n<p>More information:<\/p>\n<ul>\n<li><strong>vpr<\/strong> protein size: 15\u00a0kD<\/li>\n<li>There are 100 copies of this protein in every\u00a0HIV virion.<\/li>\n<li>The cellular protein\u00a0<strong>cyclophilin A<\/strong> is important for the production of <strong>vpr<\/strong>.<\/li>\n<\/ul>\n<h2>vpu<\/h2>\n<p><strong>Viral protein U<\/strong> helps with the assembly of new virus particles, and helps them to bud from the host cell. It&#8217;s possible for HIV to replicate and bud without this particular protein, but only 10% or 20% as many new virus particles are produced.<\/p>\n<p><strong>vpu<\/strong> also works within the infected cell to enhance the degradation of <strong>CD4<\/strong> proteins. This has the effect of reducing the amount of\u00a0CD4 sticking out of the infected cell, therefore reducing the likelihood of\u00a0superinfection.<\/p>\n<p>Without the <strong>vpu<\/strong> gene, HIV virus actually kills its host cell more quickly! 
A secondary effect of <strong>vpu<\/strong> is to delay the cytopathic (cell-killing) effects of virus infection, keeping the cell alive slightly longer so that it can produce more virus particles.<\/p>\n<p>More information can be found in these journal articles:<\/p>\n<ul>\n<li>The HIV-1 Vpu protein: a multifunctional enhancer of viral particle release\u00a0Bour S, Strebel K,\u00a0Microbes and infection\u00a05 (11): 1029-1039<\/li>\n<li>Functional Role of Human Immunodeficiency Virus Type 1 vpu, Ernest F. Terwilliger; Eric A. Cohen; Yichen Lu; Joseph G. Sodroski; William A. Haseltine;\u00a0Proceedings of the National Academy of Sciences of the United States of America, Vol. 86, No. 13. (Jul. 1, 1989), pp. 5163-5167.<\/li>\n<\/ul>\n<h2>vpx<\/h2>\n<p><strong>vpx<\/strong> is found in HIV-2 (and SIV),\u00a0but not in HIV-1. It is closely related to\u00a0<strong>vpr<\/strong> (if we compare their genetic sequences), which indicates that its existence might have come about as a duplication of the\u00a0<strong>vpr<\/strong> gene.<\/p>\n<p>Its role in the life of HIV is not entirely clear! It certainly seems to be &#8220;dispensable&#8221;, since types of HIV-2 without a functioning vpx gene still seem to be able to replicate and to infect cells&#8230;. However, it seems that <strong>vpx<\/strong> does have some effect of making viral reproduction\u00a0more efficient, especially in non-dividing cells such as macrophages. 
The molecular mechanisms behind this are not yet fully understood.<\/p>\n<p>More detailed information can be found in these journal articles:<\/p>\n<ul>\n<li>Dispensable role of the Human-Immunodeficiency-Virus Type-2 Vpx protein in viral replication, Marcon L, Michaels F, Hattori N, Fargnoli K, Gallo RC, Franchini G.\u00a0Journal of Virology\u00a065 (7): 3938-3942 JUL 1991<\/li>\n<li>Vpx and\u00a0Vpr proteins of HIV-2 up-regulate the viral infectivity by a distinct mechanism in lymphocytic cells, Ueno F, Shiota H, Miyaura M, Yoshida A, Sakurai A, Tatsuki J, Koyama AH, Akari H, Adachi A, Fujita M.\u00a0Microbes and Infection\u00a05 (5): 387-395 APR 2003<\/li>\n<\/ul>\n<h2>Long Terminal Repeat<\/h2>\n<p>The <strong>Long Terminal Repeat<\/strong> is something which is often found in strands of\u00a0RNA or\u00a0DNA is the <strong>Long Terminal Repeat<\/strong>. At each end of the string is the same sequence of code at each end of the string. Almost like the repeat at the start and finish of these sentences, almost like!<\/p>\n<p>There are two important functions for the <strong>LTR<\/strong>:<\/p>\n<ul>\n<li>Firstly they are &#8220;sticky ends&#8221; (that&#8217;s a biochemistry term) which the\u00a0integrase protein uses to insert the\u00a0HIV genome into host\u00a0DNA.<\/li>\n<li>Secondly, they act as promoter\/enhancers &#8211; when integrated into the host genome, they influence the cell machinery which transcribes\u00a0DNA, to alter the amount of transcription which occurs.\u00a0<a href=\"http:\/\/www.mcld.co.uk\/hiv\/?q=Protein%20binding%20sites%20in%20the%20LTR\">Protein <\/a>binding sites in the <strong>LTR<\/strong>\u00a0are involved with\u00a0RNA initiation.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This information on the\u00a0HIV genome is reproduced with permission from the\u00a0Molecules of HIV website by Dan Stowells.\u00a0This is an excellent non-technical website explaining scientific aspects of HIV and immunology. 
We reproduce it here to ensure it remains an online resource, but encourage &hellip;<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":7201,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2520","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/pages\/2520","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/comments?post=2520"}],"version-history":[{"count":2,"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/pages\/2520\/revisions"}],"predecessor-version":[{"id":26211,"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/pages\/2520\/revisions\/26211"}],"up":[{"embeddable":true,"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/pages\/7201"}],"wp:attachment":[{"href":"https:\/\/i-base.info\/qa\/wp-json\/wp\/v2\/media?parent=2520"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}