ABSTRACT
The MHC class I region contains crucial genes for the innate and adaptive immune response, playing a key role in susceptibility to many autoimmune and infectious diseases. Genome-wide association studies have identified numerous disease-associated SNPs within this region. However, these associations do not fully capture the immune-biological relevance of specific HLA alleles. HLA imputation techniques can leverage available SNP arrays by predicting allele genotypes based on the linkage disequilibrium between SNPs and specific HLA alleles. Successful imputation requires large and diverse reference panels, especially for admixed populations. This study employed a bioinformatics approach to call SNPs and HLA alleles in multi-ethnic samples from the 1000 Genomes (1KG) dataset and admixed individuals from Brazil (SABE), utilising 30X whole-genome sequencing data. Using HIBAG, we created three reference panels: 1KG (n = 2504), SABE (n = 1171), and the full model (n = 3675) encompassing all samples. In extensive cross-validation of these reference panels, the multi-ethnic 1KG reference performed better overall than the reference with only Brazilian samples. However, the best results were achieved with the full model. Additionally, we expanded the scope of imputation by developing reference panels for non-classical HLA genes, MICA, MICB and HLA-H, previously unavailable for multi-ethnic populations. Validation in an independent Brazilian dataset showed that our reference panels outperformed the Michigan Imputation Server, particularly in predicting HLA-B alleles among Brazilians. Our investigations underscored the need to enhance or adapt reference panels to encompass the target population's genetic diversity, emphasising the significance of multi-ethnic references for accurate imputation across different populations.
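As an illustration of how imputed HLA calls can be scored in a cross-validation like the one described above, the following is a minimal sketch of allele-level concordance. It is not the study's evaluation code and does not use HIBAG; the function name `allele_concordance` and the example genotypes are hypothetical, and the metric (matched alleles divided by total alleles, ignoring phase) is an assumption about one common way to score such calls.

```python
from collections import Counter


def allele_concordance(truth, predicted):
    """Return the fraction of alleles called correctly, ignoring phase.

    Each element of `truth` and `predicted` is a pair of allele names
    for one individual at one locus (hypothetical input layout).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one individual is required")
    matched = 0
    for true_pair, pred_pair in zip(truth, predicted):
        # Multiset intersection counts shared alleles regardless of order,
        # so ("A", "B") versus ("B", "A") scores 2 and a homozygote
        # called as a heterozygote scores 1.
        matched += sum((Counter(true_pair) & Counter(pred_pair)).values())
    return matched / (2 * len(truth))


# Made-up genotypes at a single locus, for illustration only.
truth = [("B*07:02", "B*15:01"), ("B*35:01", "B*35:01"), ("B*44:03", "B*51:01")]
predicted = [("B*15:01", "B*07:02"), ("B*35:01", "B*35:05"), ("B*44:02", "B*51:01")]
accuracy = allele_concordance(truth, predicted)
```

In this toy input, four of the six alleles match, so `accuracy` is 4/6. Real evaluations often also apply a call-confidence threshold before scoring, which this sketch omits.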
Subject(s)
Alleles, Ethnicity, Gene Frequency, Single Nucleotide Polymorphism, Humans, Brazil, Ethnicity/genetics, HLA Antigens/genetics, Linkage Disequilibrium, Genome-Wide Association Study/methods, Genotype, Population Genetics/methods, Histocompatibility Antigens Class I/genetics, Computational Biology/methods
ABSTRACT
Spontaneous clearance of acute hepatitis C virus (HCV) infection is associated with single nucleotide polymorphisms (SNPs) in the MHC class II region. We fine-mapped the MHC region in European (n = 1,600; 594 HCV clearance/1,006 HCV persistence) and African (n = 1,869; 340 HCV clearance/1,529 HCV persistence) ancestry individuals and evaluated the HCV peptide binding affinity of classical alleles. In both populations, HLA-DQβ1 Leu26 (meta-analysis p value = 1.24 × 10⁻¹⁴), located in pocket 4, was negatively associated with HCV spontaneous clearance, and HLA-DQβ1 Pro55 (meta-analysis p value = 8.23 × 10⁻¹¹), located in the peptide binding region, was positively associated, independently of HLA-DQβ1 Leu26. These two amino acids are not in linkage disequilibrium (r² < 0.1) and explain the SNP and classical allele associations represented by rs2647011, rs9274711, HLA-DQB1*03:01, and HLA-DRB1*01:01. Additionally, the classical alleles associated with HCV persistence and tagged by HLA-DQβ1 Leu26 had fewer HCV binding epitopes and lower predicted binding affinities than clearance alleles (geometric mean of combined IC50 for persistence versus clearance: 2,321 nM versus 761.7 nM, p value = 1.35 × 10⁻³⁸). In summary, MHC class II fine-mapping revealed key amino acids in HLA-DQβ1 that explain allelic and SNP associations with HCV outcomes. This mechanistic advance in understanding the natural recovery and immunogenetics of HCV might set the stage for the much-needed enhancement and design of a vaccine to promote spontaneous clearance of HCV infection.
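Two summary statistics named in this abstract, the linkage disequilibrium measure r² and the geometric mean of IC50 values, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, r² is computed from phased haplotypes coded 0/1 for absence or presence of a given residue, and all input values are made up.

```python
import math


def ld_r2(site_a, site_b):
    """r^2 between two biallelic sites from phased haplotypes coded 0/1."""
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                                   # frequency of 1 at site A
    p_b = sum(site_b) / n                                   # frequency of 1 at site B
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n   # both 1 on one haplotype
    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


def geometric_mean(values):
    """Geometric mean of positive values, computed on the log scale."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up haplotypes: 1 means the residue of interest is present.
leu26 = [1, 1, 0, 0, 1, 0, 1, 0]
pro55 = [1, 0, 1, 0, 1, 0, 0, 1]
r2 = ld_r2(leu26, pro55)

# Made-up predicted IC50 values in nM (lower means stronger binding).
ic50_nm = [100.0, 1000.0, 10000.0]
gm = geometric_mean(ic50_nm)
```

In the toy haplotypes the two sites are independent, so `r2` is 0, which is consistent with the "r² < 0.1" criterion for no linkage disequilibrium. The geometric mean suits IC50 values because they span orders of magnitude; averaging on the log scale keeps a single very weak binder from dominating the summary.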