Mycobacterium tuberculosis, the etiological agent of human tuberculosis (TB), causes more deaths than any other known pathogen. While it is considered a respiratory pathogen that primarily infects the lung, certain M. tuberculosis strains can cause disseminated disease and preferentially invade the central nervous system, leading to tuberculous meningitis (TBM). Although clinical strains of M. tuberculosis are known to differ by 400-4,000 single-nucleotide polymorphisms (SNPs), it remains to be determined how these genetic differences lead to striking differences in pathogenicity. This inadequate understanding of the genetic mechanisms of virulence is one of the major roadblocks to the development of anti-TB products (e.g., vaccines, diagnostic reagents and antibacterial agents) and to control of the global TB epidemic. The current work aims to identify the genetic determinants of enhanced virulence in M. tuberculosis isolated from TBM through comparative genomic and transcriptomic analyses. To facilitate large-scale genomic comparison with all published M. tuberculosis genomes (n = 4,331), we developed two bioinformatics tools in this study: Search-in-MTB-genomes, a tool to search for SNPs of interest in all available M. tuberculosis genomes, and MIRU-profiler, a rapid tool to determine the 24-locus MIRU-VNTR profile from M. tuberculosis genomes.
M. tuberculosis strain H112 was originally isolated from the cerebrospinal fluid of a tuberculous meningitis patient. In a previous study, the H112 strain demonstrated enhanced intracellular growth and reduced induction of tumor necrosis factor alpha, relative to 123 other clinical strains and the reference virulent strain M. tuberculosis H37Rv. It was therefore defined as a hypervirulent strain. In the current study, the complete genome sequences were determined for M. tuberculosis H112 and for H54, a less virulent strain from the same phylogenetic lineage.
Comparative genomic analysis of H112 and H54 revealed 140 mutations specific to the H112 strain. The presence of these H112-specific mutations in previously sequenced strains was analyzed using the in-house bioinformatics tool Search-in-MTB-genomes, which identified 33 previously sequenced strains related to H112. The genetic relationship of these 33 strains to H112 was confirmed using MIRU-profiler. A whole-genome phylogenetic analysis revealed that H112 forms a novel clade with several other highly virulent strains reported elsewhere. These highly virulent strains shared a set of H112-specific mutations (12 SNPs and five insertions/deletions) that could not be found in the low- or medium-virulence strains from the same phylogenetic lineage. Thus, the study uncovered a novel genetic clade of highly virulent strains and the mutations specific to them.
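The comparison described above can be pictured as two set operations: subtracting the control strain's mutations from the target strain's, then counting how many of the remaining mutations each previously sequenced strain carries. The Python sketch below illustrates only that logic. It does not reproduce Search-in-MTB-genomes, and it assumes that every mutation has already been called against a common reference genome and encoded as a (position, reference allele, alternate allele) tuple. The function names, strain labels and coordinates are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A mutation is (position, reference allele, alternate allele).
# Assumption: all strains were called against the same reference genome.
Mutation = Tuple[int, str, str]


def strain_specific(target: Set[Mutation], control: Set[Mutation]) -> Set[Mutation]:
    """Return the mutations present in the target strain but absent from the control."""
    return target - control


def rank_related(
    specific: Set[Mutation],
    panel: Dict[str, Set[Mutation]],
    min_shared: int = 1,
) -> List[Tuple[str, int]]:
    """List panel strains sharing at least min_shared target-specific mutations.

    Results are sorted by shared count (descending), then by strain name.
    """
    hits = []
    for name, mutations in panel.items():
        shared = len(specific & mutations)
        if shared >= min_shared:
            hits.append((name, shared))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Toy data (hypothetical coordinates, not taken from the study).
target_strain = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (900, "T", "C")}
control_strain = {(100, "A", "G"), (900, "T", "C")}
panel = {
    "strainA": {(250, "C", "T"), (400, "G", "A"), (777, "A", "C")},
    "strainB": {(250, "C", "T")},
    "strainC": {(100, "A", "G")},
}

specific = strain_specific(target_strain, control_strain)
related = rank_related(specific, panel)
```

In this toy panel, strainA shares both target-specific mutations, strainB shares one and strainC shares none, so only the first two are reported. Raising min_shared tightens the definition of a related strain.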
In a subsequent study, the functional effects of the mutations specific to the highly virulent strains were investigated by linking these mutations to gene-expression changes between H112 and H54. Total RNA was extracted from the hypervirulent strain H112 and the control strain H54 under two conditions (exponential growth and stress) and subjected to differential gene-expression analysis by RNA sequencing. The analysis revealed that 4 of the 17 mutations specific to the highly virulent strains were associated with differential expression of the neighboring gene under both exponential growth and stress conditions. These four genes encoded virulence-associated transcriptional regulators (phoP and higB), proline iminopeptidase (pip) and fatty-acid dehydrogenase enzyme …