Genotype determination is based on CPIC gene definition tables, with modifications for the following genes:
CPIC provides recommendations based on the SLCO1B1 star allele genotype. The CPIC guideline for statins and SLCO1B1, ABCG2, and CYP2C9 (PMID:35152405) includes the following excerpt:
The most common and well-studied variant in SLCO1B1 is c.521T>C (rs4149056), and can be genotyped alone (e.g., PCR-based single SNV assay) or multiplexed on a variety of array-based platforms. All SLCO1B1 genetic tests should interrogate c.521T>C; however, while other less common variants in this gene may have limited evidence to guide action, they may also be important.
PharmCAT attempts to determine the star allele genotype for SLCO1B1, but in cases where no call can be determined it provides the CPIC recommendation based on the rs4149056 variant genotype.
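The fallback described above can be sketched as a small lookup helper. This is only an illustration under stated assumptions: the function name, the argument names, and the returned tuple shape are hypothetical and are not taken from PharmCAT's code.

```python
from typing import Optional, Tuple


def slco1b1_lookup_key(
    star_diplotype: Optional[str],
    rs4149056_genotype: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Pick what to use when looking up an SLCO1B1 recommendation.

    Hypothetical helper: prefers the star allele diplotype and falls back
    to the rs4149056 genotype only when no diplotype could be called.
    """
    if star_diplotype is not None:
        return ("star_allele", star_diplotype)
    if rs4149056_genotype is not None:
        return ("rs4149056", rs4149056_genotype)
    # Neither a diplotype nor the single-variant genotype is available.
    return None
```

For example, `slco1b1_lookup_key(None, "T/C")` falls back to the rs4149056 genotype, while a called diplotype always takes precedence.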
The CPIC DPYD allele definition file includes variants that have been assigned normal function (activity value 1), decreased function (activity value 0.5), or no function (activity value 0). According to the CPIC guideline for fluoropyrimidines and DPYD (PMID:29152729), the DPYD phenotype is assigned using a gene activity score, which is calculated as the sum of the activity scores of the two DPYD variants with the lowest variant activity score.
If two different decreased/no function variants are present, they are presumed to be on different gene copies. Irrespective of the presence of decreased/no function variants, patients may carry multiple normal function variants. Common normal function variants may be located on the same gene copy as other normal function variants or decreased/no function variants.
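As a minimal sketch of the scoring rule above, the gene activity score can be computed by summing the two lowest variant activity values. The function name is hypothetical, and the sketch assumes that a gene copy with no detected variant counts as normal function (activity value 1).

```python
from typing import Iterable

NORMAL_ACTIVITY = 1.0  # assumption: a copy without detected variants is normal function


def dpyd_activity_score(variant_activity_values: Iterable[float]) -> float:
    """Sum the two lowest activity values among the detected DPYD variants.

    Two normal-function placeholders are appended so that samples with
    fewer than two detected variants still yield two values to sum.
    """
    values = sorted(list(variant_activity_values) + [NORMAL_ACTIVITY, NORMAL_ACTIVITY])
    return values[0] + values[1]
```

With one no function variant (0), one decreased function variant (0.5), and one normal function variant (1), the two lowest values are 0 and 0.5, which matches the presumption that the two reduced-function variants sit on different gene copies.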
More details are available on PharmGKB's CPIC DPYD reference page. The DPYD Allele Functionality Table has function assignment information, and the DPYD Diplotype-Phenotype Table includes example translations considering one or two variants.
Note: the combination research flag is ignored when calling DPYD.
Effectively phased data is unphased data that is homozygous at all positions or is heterozygous at a single position. Since we can effectively predict the alleles on each chromosome in this situation, we can treat the data as we would phased data.
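The definition above amounts to counting heterozygous positions. The following sketch assumes each position's genotype is given as a pair of allele strings; the function name and input shape are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def is_effectively_phased(genotypes: Iterable[Tuple[str, str]]) -> bool:
    """Return True if unphased genotypes can be treated as phased.

    That is the case when every position is homozygous, or when exactly
    one position is heterozygous, because the alleles on each chromosome
    are then unambiguous.
    """
    heterozygous_positions = sum(1 for a, b in genotypes if a != b)
    return heterozygous_positions <= 1
```

Two heterozygous positions already make the data ambiguous, since either allele of the second position could share a chromosome with either allele of the first.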
If phased or effectively phased data is provided in the VCF file, the Named Allele Matcher produces an output that lists all detected variants per allele. For example: [c.498G>A + c.2582A>G]/[c.2846A>T + c.2933A>G]. If no variants are found on an allele, the Named Allele Matcher returns Reference for that allele.
If unphased data (that cannot be considered effectively phased) is provided in the VCF file, and the data are not homozygous at all positions, the Named Allele Matcher will not attempt to call a diplotype. Instead, it produces a list of all detected DPYD variants in the sample. It will, however, check if variants can be called on both strands. If so, it will call the variant twice. For example: c.1905+1G>A (*2A). If the sample doesn't contain variants at the positions from the allele definition file, and/or if those positions are omitted from the VCF file, the
Named Allele Matcher returns
The report lists the respective allele functionality for each variant and for Reference. If a diplotype was called from phased/all-homozygous data, the lowest function variant on each strand is used to determine the gene activity score and DPYD phenotype. Otherwise, the two lowest function variants found are used to determine the gene activity score and DPYD phenotype. The phenotype and gene activity score are then used to retrieve the corresponding drug recommendations.
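For phased data, the per-strand rule above can be sketched as follows. The function names are hypothetical, and the sketch assumes that a strand reported as Reference carries the normal function activity value of 1.

```python
from typing import Sequence

REFERENCE_ACTIVITY = 1.0  # assumption: Reference is normal function


def strand_activity(variant_activity_values: Sequence[float]) -> float:
    """Lowest activity value on one strand; Reference if the strand has no variants."""
    return min(variant_activity_values, default=REFERENCE_ACTIVITY)


def phased_activity_score(strand1: Sequence[float], strand2: Sequence[float]) -> float:
    """Gene activity score for phased data: lowest value per strand, summed."""
    return strand_activity(strand1) + strand_activity(strand2)
```

Phasing matters here: two no function variants on the same strand opposite a normal function variant give `phased_activity_score([0.0, 0.0], [1.0])`, which evaluates to 1.0, whereas taking the two lowest values without phase information would give 0.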
As of October 2022, DPWG recommendations are available for 4 DPYD variants:
c.1129-5923C>G, c.1236G>A (HapB3)
When inferring gene activity score and phenotype from the two variants with the lowest activity value (unphased data) or the lowest per strand (phased data), PharmCAT uses the variants that are included in both CPIC and DPWG if more than one variant with the same activity value is found.
For example, if a sample has been called with a diplotype of [c.1905+1G>A (*2A) + c.2933A>G]/c.498G>A, which is composed of c.1905+1G>A (*2A) (no function), c.2933A>G (no function, unknown to DPWG), and c.498G>A (normal function), the inferred diplotype used to look up the DPYD phenotype and recommendation will be
c.1905+1G>A (*2A)/c.498G>A rather than
Furthermore, to increase the likelihood of a match with DPWG, PharmCAT treats any variant that is unknown to DPWG but has a normal function in CPIC as Reference (i.e. normal function).
For example, in the above sample, the inferred genotype c.1905+1G>A (*2A)/c.498G>A will be further translated to c.1905+1G>A (*2A)/Reference and used to query DPWG data. Since c.1905+1G>A (*2A) is a no function variant included in the DPWG data, DPWG guidance for c.1905+1G>A (*2A)/Reference will be included in the report.
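The tie-break and the translation to Reference can be sketched together for the phased example above. Everything here is an illustrative assumption rather than PharmCAT's implementation: the `Variant` type, the `known_to_dpwg` flag, and the function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    name: str
    activity: float       # CPIC activity value: 0, 0.5, or 1
    known_to_dpwg: bool   # hypothetical flag: variant is included in DPWG data


REFERENCE = Variant("Reference", 1.0, True)


def pick_for_strand(variants: List[Variant]) -> Variant:
    """Lowest activity wins; on ties, prefer a variant that DPWG also knows."""
    if not variants:
        return REFERENCE
    # False sorts before True, so DPWG-known variants win ties on activity.
    return min(variants, key=lambda v: (v.activity, not v.known_to_dpwg))


def dpwg_query_name(variant: Variant) -> str:
    """Normal function variants unknown to DPWG are queried as Reference."""
    if not variant.known_to_dpwg and variant.activity == 1.0:
        return "Reference"
    return variant.name


def inferred_diplotypes(strand1: List[Variant], strand2: List[Variant]) -> Tuple[str, str]:
    """Return (diplotype used for lookup, diplotype used to query DPWG)."""
    a, b = pick_for_strand(strand1), pick_for_strand(strand2)
    return f"{a.name}/{b.name}", f"{dpwg_query_name(a)}/{dpwg_query_name(b)}"


strand1 = [
    Variant("c.2933A>G", 0.0, False),
    Variant("c.1905+1G>A (*2A)", 0.0, True),
]
strand2 = [Variant("c.498G>A", 1.0, False)]
lookup, dpwg = inferred_diplotypes(strand1, strand2)
```

Here `lookup` is `c.1905+1G>A (*2A)/c.498G>A` and `dpwg` is `c.1905+1G>A (*2A)/Reference`, regardless of the order in which the variants on the first strand are listed.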
PharmGKB annotates PGx-based drug dosing guidelines published by the Royal Dutch Association for the Advancement of Pharmacy - Pharmacogenetics Working Group (DPWG). PharmGKB curates allele function assignments and phenotype mappings from the DPWG to provide genotype-specific DPWG guideline recommendations. Where possible, PharmGKB maps DPWG terms to CPIC terms, as outlined on PharmGKB.
CYP3A4 is currently not part of a CPIC guideline. Since the DPWG CYP3A4 documentation includes limited variant notations for the included alleles (only *22 have variant positions specified, document from March 2022), PharmCAT relies on PharmVar CYP3A4 allele definitions. The CYP3A4 *22 definitions are the same in the DPWG CYP3A4 gene document and PharmVar, while the PharmVar *16 allele definition includes, besides rs12721627, an additional SNP, rs2242480. Besides *16, c.1026+12G>A (rs2242480) is part of several star alleles, including CYP3A4 *1G. See PharmVar's CYP3A4 documentation for further details.