Sixteen influenza hemagglutinin (HA) subtypes are found in birds. Interestingly, viruses of only two subtypes, H5 and H7, have so far evolved into highly pathogenic avian influenza viruses (HPAIVs), following insertions or substitutions introduced at the HA cleavage site by the viral polymerase. The mechanisms underlying this striking subtype specificity remain unknown. Here, we compiled a comprehensive dataset of 20,488 avian influenza virus HA sequences to investigate how nucleotide and amino acid usage at the HA cleavage site differs between subtypes and how these differences might affect the genesis of HPAIVs by polymerase stuttering and realignment. We found that H5 and H7 sequences stand out for their high purine content at the HA cleavage site. In addition, H5 and H7 HAs required fewer substitutions than HAs of other subtypes to acquire an insertion-prone HA cleavage site sequence, as defined from in vitro and in vivo data in the literature. Codon usage was more favorable for HPAIV genesis in sequences of viruses isolated from species or geographical regions in which HPAIV genesis is more frequently observed in nature. These results suggest that the restriction of HPAIV genesis to the H5 and H7 subtypes might be due to the particular codon usage at the HA cleavage site in these subtypes.
In Silico Analyses of the Role of Codon Usage at the Hemagglutinin Cleavage Site in Highly Pathogenic Avian Influenza Genesis
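
The purine-content comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the subtype labels "HX" and "HY", and the toy nucleotide windows are hypothetical, and the sketch assumes the cleavage-site region has already been extracted from each HA sequence as a short nucleotide string.

```python
from collections import defaultdict

PURINES = {"A", "G"}


def purine_fraction(seq: str) -> float:
    """Return the fraction of purines (A or G) among unambiguous bases.

    Ambiguity codes and gap characters are ignored (an assumption of this
    sketch, not a rule taken from the paper).
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("no unambiguous nucleotides in sequence")
    return sum(b in PURINES for b in bases) / len(bases)


def mean_purine_by_subtype(records):
    """Average the purine fraction per subtype.

    `records` is an iterable of (subtype, cleavage_site_nucleotides) pairs.
    """
    grouped = defaultdict(list)
    for subtype, window in records:
        grouped[subtype].append(purine_fraction(window))
    return {s: sum(v) / len(v) for s, v in grouped.items()}


# Made-up toy windows for illustration only, not real HA sequences.
toy_records = [
    ("HX", "AGAGAAAGA"),
    ("HX", "AGGAAAGGA"),
    ("HY", "CCTCAGACT"),
]
subtype_means = mean_purine_by_subtype(toy_records)
print(subtype_means)
```

With real data, the same grouping would be applied to cleavage-site windows taken from an alignment, and the per-subtype distributions (rather than only the means) would be compared.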