diff --git a/01-intro.Rmd b/01-intro.Rmd
index 9fa0f965..6ee6fbe2 100644
--- a/01-intro.Rmd
+++ b/01-intro.Rmd
@@ -5,7 +5,7 @@ ottrpal::set_knitr_image_path()
# Introduction
-This course was developed in Summer 2023 and updated in Fall 2025. We welcome any feedback at help@pvactools.org or by submission of [GitHub issues](https://github.com/griffithlab/pVACtools_Intro_Course/issues).
+This course was developed in Summer 2023 and last updated in Summer 2026. We welcome any feedback at help@pvactools.org or by submission of [GitHub issues](https://github.com/griffithlab/pVACtools_Intro_Course/issues).
## Motivation
@@ -24,7 +24,7 @@ prioritization, and selection using a graphical Web-based interface (pVACview),
vaccines. pVACtools is available at [http://www.pvactools.org](http://www.pvactools.org).
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "pVACtools is a cancer immunotherapy tools suite"}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a37485c18b_1_0")
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit?slide=id.g3e342b543ab_0_0#slide=id.g3e342b543ab_0_0")
```
## Background
@@ -43,7 +43,9 @@ stability and recognition by cytotoxic T cells [@Richters2019].
pVACtools can be used as the final step in a well-established variant calling pipeline. It leverages existing tools with functionality related to variant annotation
(Ensembl VEP [@McLaren2016]), identifying neoantigens from specific sources (e.g. fusions via star-fusion [@Haas2019], AGFusion [@Murphy2016], and Arriba [@Uhrig2021]),
HLA typing (OptiType [@Szolek2014], PHLAT [@Bai2018]), peptide-MHC binding prediction (IEDB [@Vita2018], NetMHCpan [@Jurtz2017], MHCflurry [@ODonnell2018],
-MHCnuggets [@Shao2020]), peptide-MHC stability (NetMHCstabpan [@Rasmussen2016]], peptide processing (NetChop [@Nielsen2005]), manufacturability
+MHCnuggets [@Shao2020], MixMHCpred [@Gfeller2023]), presentation (IEDB [@Vita2018], BigMHC [@Albert2023], MHCflurry [@ODonnell2018], MixMHC2pred [@Racle2023]),
+immunogenicity (BigMHC [@Albert2023], DeepImmuno [@Li2021], ImmuoScope [@Shen2025], PRIME [@Gfeller2023]), peptide-MHC stability (NetMHCstabpan [@Rasmussen2016]),
+peptide processing (NetChop [@Nielsen2005]), manufacturability
metrics (vaxrank [@Rubinsteyn2017]), and reference proteome similarity (BLAST [@Altschul1990]). Each of these tools tackles specific tasks within the broader goal of
antigen analysis and is utilized by pVACtools to provide an end-to-end integration of novel algorithms and established tools needed to discover, characterize, prioritize,
and utilize tumor-specific neoantigens in basic research and clinical applications. Combining pVACtools with existing variant calling pipelines provides an end-to-end
diff --git a/02-prerequisites.Rmd b/02-prerequisites.Rmd
index fe071c97..3ea66653 100644
--- a/02-prerequisites.Rmd
+++ b/02-prerequisites.Rmd
@@ -70,10 +70,10 @@ install.packages("colourpicker", dependencies=TRUE)
## Data
-For this course, we have put together a set of input data generated from the breast
+For this course, we have put together a set of input data generated from the breast
cancer cell line HCC1395 and a matched normal lymphoblastoid cell line HCC1395BL.
-Data from this cell line is commonly used as test data in bioinformatics applications.
-For more information on these lines and the generation of test data, please refer to
+Data from this cell line is commonly used as test data in bioinformatics applications.
+For more information on these lines and the generation of test data, please refer to
the [data section of our precision medicine bioinformatics course](https://pmbio.org/module-02-inputs/0002/05/01/Data/).
The input data consists of the following files:
@@ -81,8 +81,8 @@ The input data consists of the following files:
For pVACseq:
- `annotated.expression.vcf.gz`: A somatic (tumor-normal) VCF and its tbi index file. The VCF has been
- annotated with VEP and has coverage and expression information added. It has also been annotated with
- custom VEP plugins that provide wild type and mutant versions of the full length protein sequences
+ annotated with VEP and has coverage and expression information added. It has also been annotated with
+ custom VEP plugins that provide wild type and mutant versions of the full length protein sequences
predicted to arise from each transcript annotated with each variant.
- `phased.vcf.gz`: A phased tumor-germline VCF and its tbi index file to provide information about
in-phase proximal variants that might alter the predicted peptide sequence around a somatic
@@ -90,7 +90,7 @@ For pVACseq:
- `optitype_normal_result.tsv`: A OptiType file with HLA allele typing predictions.
For more detailed information on how the variant input file is created, please refer to the
-[input file preparation](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep.html)
+[input file preparation](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep.html)
section of the pVACtools docs.
For pVACfuse:
diff --git a/03-running_pvactools.Rmd b/03-running_pvactools.Rmd
index b4c148fc..bc55ad25 100644
--- a/03-running_pvactools.Rmd
+++ b/03-running_pvactools.Rmd
@@ -23,11 +23,11 @@ mkdir pVACtools_outputs
docker run \
-v ${PWD}/HCC1395_inputs:/HCC1395_inputs \
-v ${PWD}/pVACtools_outputs:/pVACtools_outputs \
--it griffithlab/pvactools:6.0.3 \
+-it griffithlab/pvactools:7.0.0 \
/bin/bash
```
-This will pull the 6.0.3 version of the griffithlab/pvactools Docker image and
+This will pull the 7.0.0 version of the griffithlab/pvactools Docker image and
start an interactive session (`-it`) of that Docker image using the bash shell (`/bin/bash`).
The `-v ${PWD}/HCC1395_inputs:/HCC1395_inputs`
part of the command will mount the
@@ -35,7 +35,7 @@ part of the command will mount the
so that you will have access to the input data from inside the Docker
container. The `-v ${PWD}/pVACtools_outputs:/pVACtools_outputs` part of the command
will mount the `pVACtools_outputs` folder you just created. We will write the
-outputs from pVACseq and pVACfuse to that folder so that you will have access
+outputs from pVACseq, pVACfuse, and pVACsplice to that folder so that you will have access
to it once you exit the Docker image.
## Running pVACseq
@@ -120,17 +120,14 @@ your run. Here are a list of parameters we generally recommend:
are considered by pVACseq. This flag will lead pVACseq to skip variants that
have a FILTER applied in the VCF to, e.g., exclude variants that were marked
as low quality by the variant caller.
-- `--percentile-threshold`: When considering the peptide-MHC binding affinity
- for filtering and prioritizing neoantigen candidates, by default only the
- IC50 value is being used. Setting this parameter will additionally also filter
- on the predicted percentile. We recommend a value of 2 (2%) for this
- threshold.
-- `--percentile-threshold-strategy`: When running pVACseq with a
- `--percentile-threshold` set, this parameter will influence how both the
- IC50 cutoff and the percentile cutoff are applied. The default,
- `conservative`, will require a candidate to pass both the binding and the
- percentile threshold, while the `exploratory` option will require a candidate
- to only pass either the binding or the percentile threshold.
+- `--use-normalized-percentiles`: Not all prediction algorithms supported by
+ pVACseq output a percentile rank. This option calculates normalized percentiles
+ for class I epitopes of length 8-11, across all class I algorithms and the 1,000
+ most common human class I MHC alleles, based on the same set of 100,000 reference
+ peptides. These percentiles are used in place of the percentiles natively
+ calculated by some algorithms. As a result, every class I algorithm
+ returns a percentile score, and the percentiles are calculated consistently
+ across all algorithms.
Additionally there are a number of parameters that might be useful depending
on your specific analysis needs:
@@ -147,6 +144,12 @@ on your specific analysis needs:
unstable. This parameter allows users to set their own rules as to which
peptides are considered problematic and peptides meeting those rules will be marked in the
pVACseq results and deprioritized.
+- `--percentile-threshold-strategy`: By default, pVACseq will
+ filter and prioritize neoantigen candidates on the binding, presentation,
+ and immunogenicity percentiles in addition to the raw IC50 binding affinity.
+ A candidate will need to pass all thresholds. However, setting this parameter
+ to `exploratory` will relax this behavior and only require a candidate to
+ pass one of the thresholds.
- `--transcript-prioritization-strategy` and
`--maximum-transcript-support-level`: Generally, multiple transcripts of a
gene may code for a neoantigen candidate. When picking the best transcript
@@ -177,8 +180,8 @@ Given the considerations outlined above, let's run pVACseq on our sample data.
From the `optitype_normal_result.tsv` we know that the patient's class I alleles are
HLA-A\*29:02, HLA-B\*45:01, HLA-B\*82:02, and HLA-C\*06:02 (indicated that two of three class I
-alleles are homozygous in this sample). We also have clinical typing information that confirms
-these class I alleles as well as identifying DQA1\*03:03, DQB1\*03:02, and DRB1\*04:05 as the
+alleles are homozygous in this sample). We also have clinical typing information that confirms
+these class I alleles as well as identifying DQA1\*03:03, DQB1\*03:02, and DRB1\*04:05 as the
patient's class II alleles.
Note that where needed pVACseq will automatically create HLA class II dimer combinations using
@@ -274,17 +277,14 @@ usually apply. Here are a list of parameters we generally recommend:
neoantigen candidate in the reference proteome and report any hits found.
By default this is done using BLASTp but we recommend using a proteome FASTA
file via the `--peptide-fasta` parameter to speed up this step.
-- `--percentile-threshold`: When considering the peptide-MHC binding affinity
- for filtering and prioritizing neoantigen candidates, by default only the
- IC50 value is being used. Setting this parameter will additionally also filter
- on the predicted percentile. We recommend a value of 2 (2%) for this
- threshold.
-- `--percentile-threshold-strategy`: When running pVACfuse with a
- `--percentile-threshold` set, this parameter will influence how both the
- IC50 cutoff and the percentile cutoff are applied. The default,
- `conservative`, will require a candidate to pass both the binding and the
- percentile threshold, while the `exploratory` option will require a candidate
- to only pass either the binding or the percentile threshold.
+- `--use-normalized-percentiles`: Not all prediction algorithms supported by
+ pVACfuse output a percentile rank. This option calculates normalized percentiles
+ for class I epitopes of length 8-11, across all class I algorithms and the 1,000
+ most common human class I MHC alleles, based on the same set of 100,000 reference
+ peptides. These percentiles are used in place of the percentiles natively
+ calculated by some algorithms. As a result, every class I algorithm
+ returns a percentile score, and the percentiles are calculated consistently
+ across all algorithms.
Additionally there are a number of parameters that might be useful depending
on your specific analysis needs:
@@ -298,6 +298,12 @@ on your specific analysis needs:
unstable. This parameter allows users to set their own rules as to which
peptides are considered problematic and peptides meeting those rules will be marked in the
pVACfuse results and deprioritized.
+- `--percentile-threshold-strategy`: By default, pVACfuse will
+ filter and prioritize neoantigen candidates on the binding, presentation,
+ and immunogenicity percentiles in addition to the raw IC50 binding affinity.
+ A candidate will need to pass all thresholds. However, setting this parameter
+ to `exploratory` will relax this behavior and only require a candidate to
+ pass one of the thresholds.
- `--threads`: This argument will allow pVACfuse to run in multi-processing
mode.
- `--keep-tmp-files`: Setting this flag will save intermediate files created by pVACfuse.
@@ -312,7 +318,7 @@ Given the considerations outlined above, let's run pVACfuse on our sample data.
As with pVACseq, we can use the `optitype_normal_result.tsv` file to identify the patient's
class I HLA alleles. These are HLA-A\*29:02, HLA-B\*45:01, HLA-B\*82:02, and HLA-C\*06:02.
-We also have clinical typing information that confirms these class I alleles as well as
+We also have clinical typing information that confirms these class I alleles as well as
identified DQA1\*03:03, DQB1\*03:02, and DRB1\*04:05 as the patient's class II alleles.
For pVACfuse the sample name is not used for any parsing so it doesn't need to
@@ -398,17 +404,14 @@ usually apply. Here is a list of parameters we generally recommend:
neoantigen candidate in the reference proteome and report any hits found.
By default this is done using BLASTp, but we recommend using a proteome FASTA
file via the `--peptide-fasta` parameter to speed up this step.
-- `--percentile-threshold`: When considering the peptide-MHC binding affinity
- for filtering and prioritizing neoantigen candidates, by default only the
- IC50 value is being used. Setting this parameter will additionally filter
- on the predicted percentile. We recommend a value of 2 (2%) for this
- threshold.
-- `--percentile-threshold-strategy`: When running pVACsplice with a
- `--percentile-threshold` set, this parameter will influence how both the
- IC50 cutoff and the percentile cutoff are applied. The default,
- `conservative`, will require a candidate to pass both the binding and the
- percentile threshold, while the `exploratory` option will require a candidate
- to only pass either the binding or the percentile threshold.
+- `--use-normalized-percentiles`: Not all prediction algorithms supported by
+ pVACsplice output a percentile rank. This option calculates normalized percentiles
+ for class I epitopes of length 8-11, across all class I algorithms and the 1,000
+ most common human class I MHC alleles, based on the same set of 100,000 reference
+ peptides. These percentiles are used in place of the percentiles natively
+ calculated by some algorithms. As a result, every class I algorithm
+ returns a percentile score, and the percentiles are calculated consistently
+ across all algorithms.
Additionally there are a number of parameters that might be useful depending
on your specific analysis needs:
@@ -422,6 +425,12 @@ on your specific analysis needs:
unstable. This parameter allows users to set their own rules as to which
peptides are considered problematic and peptides meeting those rules will be marked in the
pVACsplice results and deprioritized.
+- `--percentile-threshold-strategy`: By default, pVACsplice will
+ filter and prioritize neoantigen candidates on the binding, presentation,
+ and immunogenicity percentiles in addition to the raw IC50 binding affinity.
+ A candidate will need to pass all thresholds. However, setting this parameter
+ to `exploratory` will relax this behavior and only require a candidate to
+ pass one of the thresholds.
- `--transcript-prioritization-strategy` and
`--maximum-transcript-support-level`: Generally, multiple transcripts of a
gene may code for a neoantigen candidate. When picking the best transcript
diff --git a/04-outputs.Rmd b/04-outputs.Rmd
index 10a2431a..3f5e9b43 100644
--- a/04-outputs.Rmd
+++ b/04-outputs.Rmd
@@ -66,7 +66,17 @@ thresholds depending on the HLA allele of the prediction, as recommended by
Custom thresholds are available for the most common 76 class I HLA alleles.
For all others, the `--binding-threshold` value is used.
-In addition to the binding affinity, other optional parameters can be set to
+Additionally, candidates are filtered on the binding percentile, presentation
+percentile, and immunogenicity percentile. Similarly to the binding affinity,
+a `Median` and a `Best` value are calculated over all of the percentiles output
+by the individual binding, presentation, and immunogenicity algorithms,
+respectively. The `--top-score-metric` parameter also determines which
+of the two is used for filtering (default: Median). By default the percentile
+thresholds are set to 2.0, which can be adjusted using the
+`--binding-percentile-threshold`, `--presentation-percentile-threshold`,
+and `--immunogenicity-percentile-threshold` parameters, respectively.
+
+In addition to these default parameters, other optional parameters can be set to
enabled additional filtering on related metrics:
- `--minimum-fold-change`: The fold change is the ratio of the mutant binding affinity to
@@ -76,21 +86,13 @@ enabled additional filtering on related metrics:
Which one is filtered on for this metric depends again on the
`--top-score-metric` set. When a minimum fold change parameter is set, the binding filter
discards any prediction with a agretopicity below the set cutoff. This
- parameter is not available in pVACfuse because there is no matched wildtype
+ parameter is not available in pVACfuse and pVACsplice because there is no matched wildtype
peptide for each neoantigen candidate.
-- `--percentile-threshold`: The prediction algorithms supported by pVACtools
- also report a percentile score that represents where each neoantigen's predicted
- affinity falls in the range of other values for an HLA allele. Similar to
- the binding affinity itself, pVACtools report the median and the lowest
- percentile scores for the range of scores reported by the prediction
- algorithms chosen by the user and which on is used for filtering is again
- controlled by the `--top-score-metric` parameter.
-- `--percentile-threshold-strategy`: When running pVACtools with a
- `--percentile-threshold` set, this parameter will influence how both the
- binding cutoff and the percentile cutoff are applied. The default,
- `conservative`, will require a candidate to pass both the binding and the
- percentile threshold while the `exploratory` option will require a candidate
- to only pass either the binding or the percentile threshold.
+- `--percentile-threshold-strategy`: This parameter controls how the
+ binding cutoff and the percentile cutoffs are combined. The default,
+ `conservative`, requires a candidate to pass all thresholds, while the
+ `exploratory` option requires a candidate to pass at least one of
+ the thresholds.
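
The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration, not pVACtools code: the function name `passes_thresholds` and the metric keys are hypothetical, and the 500 nM binding cutoff in the usage example is an assumed value. The percentile cutoffs of 2.0 follow the defaults described above.

```python
def passes_thresholds(scores, thresholds, strategy="conservative"):
    """Decide whether a candidate passes the score filter.

    `scores` and `thresholds` are dicts keyed by (hypothetical) metric names.
    Every metric here is "lower is better", so a metric passes when the
    score is strictly below its threshold.
    """
    checks = [scores[metric] < cutoff for metric, cutoff in thresholds.items()]
    if strategy == "conservative":
        # Candidate must pass every threshold.
        return all(checks)
    if strategy == "exploratory":
        # Candidate needs to pass at least one threshold.
        return any(checks)
    raise ValueError(f"unknown strategy: {strategy}")


# Assumed cutoffs for illustration only.
thresholds = {
    "ic50": 500.0,
    "binding_percentile": 2.0,
    "presentation_percentile": 2.0,
    "immunogenicity_percentile": 2.0,
}

# A made-up candidate that misses only the presentation cutoff.
candidate = {
    "ic50": 300.0,
    "binding_percentile": 1.5,
    "presentation_percentile": 3.0,
    "immunogenicity_percentile": 0.8,
}

conservative_result = passes_thresholds(candidate, thresholds, "conservative")
exploratory_result = passes_thresholds(candidate, thresholds, "exploratory")
```

In this example the candidate is rejected under `conservative` because one percentile misses its cutoff, but it is kept under `exploratory`.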
### Coverage Filter
@@ -108,7 +110,7 @@ Additionally, filtering on the normal DNA depth and variant allele frequency
(VAF) requires your VCF to be a tumor-normal sample VCF and the normal sample
to be identified in your pVACseq/pVACsplice run using the `--normal-sample-name`
parameter. If a coverage metric doesn't apply because the underlying data is
-not available, `NA` is reported by pVACtools. By default, the filter will skip
+not available, `NA` is reported by pVACtools. The filter will skip
evaluating a coverage criteria when a neoantigen's value for it is `NA`.
The following thresholds are applied in pVACseq and pVACsplice by this filter:
@@ -141,7 +143,7 @@ The following thresholds are applied in pVACfuse by this filter:
The Transcript Filter removes neoantigens resulting from transcripts that are
considered poor candidates. To determine whether a transcript is poor, the
`--transcript-prioritization-strategy` parameter is used. This parameter
-defines a list of criteria to consider. These are:
+defines a list of criteria to consider. The options are:
- `mane_select`: MANE Select status of the transcript
- `canonical`: Canonical status of the transcripts
@@ -165,44 +167,58 @@ variant we first group the transcripts into sets where all transcripts in a set
code for the same set of neoantigen candidates. For each transcript set we then
determine the best neoantigen candidate as follows:
-- Pick all neoantigens with a variant transcript that doesn't have a Transcript CDS Flag
-- Of the remaining candidates, pick the ones with a variant transcript that have a protein_coding Biotype
-- Of the remaining candidates, pick the ones with a passing transcript according
- to the selected `--transcript-prioritization-strategy` and `--maximum-transcript-support-level`.
-- Of the remaining candidates, pick the entries with no Problematic Positions.
-- Of the remaining candidates, pick the ones passing the Anchor Criteria (explained in
- more detail further below).
-- Sort the remaining candidates on the lowest MT IC50 Score or Percentile (Median or Best
- depending on the `--top-score-metric`, Score or Percentile depending on the
- `--top-score-metric2`), the transcript's MANE Select Status, the
- transcript's Canonical status, the transcript's TSL, the transcript length,
- and the transcript expression. Select the highest sorted entry.
+- If the `--allow-incomplete-transcripts` flag is set, pick the entries without a
+ Transcript CDS Flag set.
+- Of the remaining entries, pick the entries where the Biotype is protein_coding.
+- Of the remaining entries, pick the entries that pass at least one of the
+ transcript criteria selected in the `--transcript-prioritization-strategy`
+ taking into consideration the `--maximum-transcript-support-level` if `tsl`
+ is one of the selected criteria.
+- Of the remaining entries, pick the entries with no Problematic Positions.
+- Of the remaining entries, pick the ones passing the Anchor Criteria
+ (explained in more detail further below).
+- For the remaining entries, calculate a rank for all the metrics specified via
+ the `--top-score-metric2` parameter and sum them. Whether the lowest or median
+ value is considered for each metric is controlled by the `--top-score-metric`
+ parameter. Sort the remaining entries on this sum rank followed by the rank of
+ the first `--top-score-metric2` specified (to break any ties in the sum rank),
+ MANE Select status, Canonical status, Transcript Support Level, Transcript
+ Length, and Transcript Expression. Select the highest sorted entry.
This filter then reports the best neoantigen candidate for each transcript set.
For pVACfuse, the neoantigen candidate for each fusion are similarly grouped
into sets where all transcript1-transcript2 combinations in a set code for the
same set of neoantigen candidates. From there, the best neoantigen candidate
-for each transcript set is determined by picking the candidate with the lowest
-MT IC50 Score (Median or Best depending on the `--top-score-metric`) and the
-highest fusion transcript expression.
+for each transcript set is determined as follows:
+
+- Pick the entries with no Problematic Positions.
+- For the remaining entries, calculate a rank for all the metrics specified via
+ the `--top-score-metric2` parameter and sum them. Whether the lowest or median
+ value is considered for each metric is controlled by the `--top-score-metric`
+ parameter. Sort the remaining entries on this sum rank followed by the rank of
+ the first `--top-score-metric2` specified (to break any ties in the sum rank), and
+ Expression. Select the highest sorted entry.
For pVACsplice, the neoantigen candidates are grouped into sets with the same splice
site Junction. From there, the best neoantigen candidate for each set is
determined very similarly to pVACseq:
-- Pick all neoantigens with a variant transcript that doesn't have a Transcript CDS Flag
-- Of the remaining candidates, pick the ones with a variant transcript that have a protein_coding Biotype
-- Of the remaining candidates, pick the ones with a passing transcript according
- to the selected `--transcript-prioritization-strategy` and `--maximum-transcript-support-level`.
-- Of the remaining candidates, pick the entries with no Problematic Positions.
-- Of the remaining candidates, pick the ones passing the Anchor Criteria (explained in
- more detail further below).
-- Sort the remaining candidates on the lowest MT IC50 Score or Percentile (Median or Best
- depending on the `--top-score-metric`, Score or Percentile depending on the
- `--top-score-metric2`), the transcript's MANE Select Status, the
- transcript's Canonical status, the transcript's TSL, the transcript WT
- Protein Length, and the transcript expression. Select the highest sorted entry.
+- If the `--allow-incomplete-transcripts` flag is set, pick the entries without a
+ Transcript CDS Flag set.
+- Of the remaining entries, pick the entries where the Biotype is protein_coding.
+- Of the remaining entries, pick the entries that pass at least one of the
+ transcript criteria selected in the `--transcript-prioritization-strategy`
+ taking into consideration the `--maximum-transcript-support-level` if `tsl` is one
+ of the selected criteria.
+- Of the remaining entries, pick the entries with no Problematic Positions.
+- For the remaining entries, calculate a rank for all the metrics specified via
+ the `--top-score-metric2` parameter and sum them. Whether the lowest or median
+ value is considered for each metric is controlled by the `--top-score-metric`
+ parameter. Sort the remaining entries on this sum rank followed by the rank of
+ the first `--top-score-metric2` specified (to break any ties in the sum rank),
+ MANE Select status, Canonical status, Transcript Support Level, WT Protein
+ Length, Transcript Expression, and Tumor DNA VAF. Select the highest sorted entry.
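
The rank-sum step shared by these procedures can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the pVACtools implementation: the names `rank`, `pick_best`, and the metric keys are hypothetical, every metric is treated as "lower is better", tied values share the lowest rank, and the later tie-breakers (MANE Select status, Canonical status, and so on) are omitted.

```python
def rank(values):
    """Rank values in ascending order (1 = lowest); ties share the lowest rank."""
    ordered = sorted(values)
    return [ordered.index(value) + 1 for value in values]


def pick_best(candidates, metrics):
    """Return the candidate with the lowest summed rank over `metrics`.

    Ties in the summed rank are broken by the rank of the first metric,
    mirroring the role of the first `--top-score-metric2` value.
    """
    ranks = {m: rank([c[m] for c in candidates]) for m in metrics}

    def sort_key(i):
        sum_rank = sum(ranks[m][i] for m in metrics)
        return (sum_rank, ranks[metrics[0]][i])

    best_index = min(range(len(candidates)), key=sort_key)
    return candidates[best_index]


# Made-up candidates: all three end up with the same summed rank (4).
candidates = [
    {"id": "A", "ic50": 100.0, "percentile": 1.0},
    {"id": "B", "ic50": 50.0, "percentile": 1.5},
    {"id": "C", "ic50": 200.0, "percentile": 0.5},
]

best_by_ic50_first = pick_best(candidates, ["ic50", "percentile"])
best_by_percentile_first = pick_best(candidates, ["percentile", "ic50"])
```

Because the summed ranks tie, the order of the metrics decides the outcome: candidate B wins when `ic50` is listed first, and candidate C wins when `percentile` is listed first.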
## Interpreting the aggregated.tsv File
@@ -219,40 +235,10 @@ included neoantigens are pared to the best `--aggregate-inclusion-count-limit` c
### Determining the Best Transcript and Best Peptide of a Variant
-In pVACseq, for each variant, all neoantigen candidates meeting the `--aggregate-inclusion-threshold` are evaluated as follows:
-
-- Pick all neoantigens with a variant transcript that doesn't have a Transcript CDS Flag
-- Of the remaining candidates, pick the ones with a variant transcript that have a protein_coding Biotype
-- Of the remaining candidates, pick the ones with a passing transcript according
- to the selected `--transcript-prioritization-strategy` and `--maximum-transcript-support-level`.
-- Of the remaining candidates, pick the entries with no Problematic Positions.
-- Of the remaining candidates, pick the ones passing the Anchor Criteria (explained in
- more detail further below).
-- Sort the remaining candidates on the lowest MT IC50 Score or Percentile (Median or Best
- depending on the `--top-score-metric`, Score or Percentile depending on the
- `--top-score-metric2`), the transcript's MANE Select Status, the
- transcript's Canonical status, the transcript's TSL, the transcript length,
- and the transcript expression. Select the highest sorted entry.
-
-In pVACfuse, the neoantigen candidate with the lowest IC50 binding affinity for each variant
-and highest transcript expression is selected.
-The value used for the `--top-score-metric` determines whether the lowest or
-median binding affinity is used for this comparison.
-
-For pVACsplice, the best neoantigen candidate for each variant is determined very similarly to pVACseq:
-
-- Pick all neoantigens with a variant transcript that doesn't have a Transcript CDS Flag
-- Of the remaining candidates, pick the ones with a variant transcript that have a protein_coding Biotype
-- Of the remaining candidates, pick the ones with a passing transcript according
- to the selected `--transcript-prioritization-strategy` and `--maximum-transcript-support-level`.
-- Of the remaining candidates, pick the entries with no Problematic Positions.
-- Of the remaining candidates, pick the ones passing the Anchor Criteria (explained in
- more detail further below).
-- Sort the remaining candidates on the lowest MT IC50 Score or Percentile (Median or Best
- depending on the `--top-score-metric`, Score or Percentile depending on the
- `--top-score-metric2`), the transcript's MANE Select Status, the
- transcript's Canonical status, the transcript's TSL, the transcript WT
- Protein Length, and the transcript expression. Select the highest sorted entry.
+The Best Peptide for a variant, fusion, or junction is generally determined the same
+way in the aggregated report as it is in the top score filter, using the criteria
+described above. However, only the neoantigen candidates meeting the
+`--aggregate-inclusion-threshold` and `--aggregate-inclusion-count-limit` are evaluated.
The chosen entry determines the best neoantigen candidate and the best
transcript coding for it.
@@ -270,14 +256,16 @@ The Tiers available in pVACseq are:
tabl <- "
| Tier | Criteria |
|------|----------|
-| Pass | Best Peptide passes the binding, reference match, expression, transcript, clonal, problematic position, and anchor criteria |
-| PoorBinder | Best Peptide fails the binding criteria but passes the reference match, expression, transcript, clonal, problematic position, and anchor criteria |
-| RefMatch | Best Peptide fails the reference match criteria but passes the binding, expression, transcript, clonal, problematic position, and anchor criteria |
-| PoorTranscript | Best Peptide fails the transcript criteria but passes the binding, reference match, expression, clonal, problematic position, and anchor criteria |
-| LowExpr | Best Peptide meets the low expression criteria and passes the binding, reference match, transcript, clonal, problematic position, and anchor criteria |
-| Anchor | Best Peptide fails the anchor criteria but passes the binding, reference match, expression, transcript, clonal, and problematic position criteria |
-| Subclonal | Best Peptide fails the clonal criteria but passes the binding, reference match, expression, transcript, problematic position, and anchor criteria |
-| ProbPos | Best Peptide fails the problematic position criteria but passes the binding, reference match, expression, transcript, clonal, and anchor criteria |
+| Pass | Best Peptide passes the scores, reference match, expression, transcript, clonal, problematic position, and anchor criteria |
+| PoorBinder | Best Peptide fails the binding criteria but passes the presentation, immunogenicity, reference match, expression, transcript, clonal, problematic position, and anchor criteria |
+| PoorPresentation | Best Peptide fails the presentation criteria but passes the binding, immunogenicity, reference match, expression, transcript, clonal, problematic position, and anchor criteria |
+| PoorImmunogenicity | Best Peptide fails the immunogenicity criteria but passes the binding, presentation, reference match, expression, transcript, clonal, problematic position, and anchor criteria |
+| RefMatch | Best Peptide fails the reference match criteria but passes the scores, expression, transcript, clonal, problematic position, and anchor criteria |
+| PoorTranscript | Best Peptide fails the transcript criteria but passes the scores, reference match, expression, clonal, problematic position, and anchor criteria |
+| LowExpr | Best Peptide meets the low expression criteria and passes the scores, reference match, transcript, clonal, problematic position, and anchor criteria |
+| Anchor | Best Peptide fails the anchor criteria but passes the scores, reference match, expression, transcript, clonal, and problematic position criteria |
+| Subclonal | Best Peptide fails the clonal criteria but passes the scores, reference match, expression, transcript, problematic position, and anchor criteria |
+| ProbPos | Best Peptide fails the problematic position criteria but passes the scores, reference match, expression, transcript, clonal, and anchor criteria |
| Poor | Best Peptide doesn’t fit in any of the above tiers, usually if it fails two or more criteria |
| NoExpr | Best Peptide is not expressed (RNA Expr == 0 or RNA VAF == 0) |
"
@@ -290,14 +278,17 @@ cat(tabl)
tabl <- "
| Criteria | Description | Evaluation |
|----------|-------------|------------|
-| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold` (`--allele-specific-binding-thresholds` flag is respected)
percentile score criteria (if `--percentile-threshold` parameters is set): `%ile MT < --percentile-threshold` (if parameter is set)
`conservative` `--percentile-threshold-strategy`: needs to pass BOTh the binding score criteria AND the percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the percentile score criteria|
-| Expression Criteria | Pass if Best Transcript is expressed | Allele Expr > `--trna-vaf` * `--expn-val` |
-| Reference Match Criteria | Pass if there are no reference protome matches | `Ref Match == False` |
-| Transcript Criteria | Pass if Best Transcript matches any of the user-specified `--transcript-prioritization-strategy` criteria | `TSL <= --maximum-transcript-support level` (if strategy includes `tsl`)
`MANE Select == True` (if strategy includes `mane_select`)
`Canonical == True` (if strategy includes `canonical`) |
-| Low Expression Criteria | Peptide has low expression or no expression but RNA VAF and coverage | (0 < Allele Expr < `--trna-vaf` * `--expn-val`) OR (RNA Expr == 0 AND RNA Depth > `--trna-cov` AND RNA VAF > `--trna-vaf`) |
-| Anchor Criteria | Fail if all mutated amino acids of the Best Peptide (Pos) are at an anchor position and the WT peptide has good binding (IC50 WT < `--binding-threshold`). `--allele-specific-binding-thresholds` flag is respected. |
-| Clonal Criteria | Best Peptide is likely in the founding clone of the tumor | DNA VAF > `--tumor-purity` / 4 |
-| Problematic Position Criteria | Best Peptide does not contain a problematic amino acid as defined by the `--problematic-amino-acids` parameter | `Prob Pos == None`
+| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold`
binding percentile score criteria: `IC50 %ile MT < --binding-percentile-threshold`
`conservative` `--percentile-threshold-strategy`: needs to pass BOTH the binding score criteria AND the binding percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the binding percentile score criteria |
+| Presentation Criteria | Pass if the Best Peptide is presented by the MHC | `Pres %ile MT < --presentation-percentile-threshold` |
+| Immunogenicity Criteria | Pass if the Best Peptide is immunogenic | `IM %ile MT < --immunogenicity-percentile-threshold` |
+| Scores Criteria | Pass if the Best Peptide is a strong binder, presented by the MHC, and/or immunogenic | `conservative` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, AND the immunogenicity criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, OR the immunogenicity criteria |
+| Expression Criteria | Pass if Best Transcript is expressed | `Allele Expr > trna_vaf * expn_val` |
+| Reference Match Criteria | Pass if there are no reference proteome matches | `Ref Match == False` |
+| Transcript Criteria | Pass if Best Transcript matches any of the user-specified `--transcript-prioritization-strategy` criteria | `TSL <= maximum_transcript_support_level` (if `--transcript-prioritization-strategy` includes `tsl`)
`MANE Select == True` (if `--transcript-prioritization-strategy` includes `mane_select`)
`Canonical == True` (if `--transcript-prioritization-strategy` includes `canonical`) |
+| Low Expression Criteria | Peptide has low expression or no expression but RNA VAF and coverage | `(0 < Allele Expr < trna_vaf * expn_val) OR (RNA Expr == 0 AND RNA Depth > trna_cov AND RNA VAF > trna_vaf)` |
+| Anchor Criteria | Fail if there are <= 2 mutated amino acids and all mutated amino acids of the Best Peptide (Pos) are at an anchor position and the WT peptide has good binding (`IC50 WT < binding_threshold`) | |
+| Clonal Criteria | Best Peptide is likely in the founding clone of the tumor | `DNA VAF > --tumor-purity / 4` |
+| Problematic Position Criteria | Best Peptide does not contain a problematic amino acid as defined by the `--problematic-amino-acids` parameter | `Prob Pos == None` |
"
cat(tabl)
```
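
The way the Scores Criteria combine the binding, presentation, and immunogenicity checks under each `--percentile-threshold-strategy` can be illustrated with a minimal Python sketch. This is not the pVACtools implementation: the `Thresholds` class and the function names are hypothetical, the numeric cutoffs are illustrative rather than defaults, and the sketch assumes all four mutant scores are available for the Best Peptide.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Hypothetical container for the cutoffs named in the criteria table."""
    binding_threshold: float                    # --binding-threshold (nM)
    binding_percentile_threshold: float         # --binding-percentile-threshold
    presentation_percentile_threshold: float    # --presentation-percentile-threshold
    immunogenicity_percentile_threshold: float  # --immunogenicity-percentile-threshold
    strategy: str                               # --percentile-threshold-strategy


def combine(checks, strategy):
    """conservative: every check must pass; exploratory: any one check suffices."""
    if strategy == "conservative":
        return all(checks)
    if strategy == "exploratory":
        return any(checks)
    raise ValueError(f"unknown strategy: {strategy!r}")


def passes_binding(ic50_mt, ic50_pct_mt, t):
    """Binding criteria: IC50 score check and binding percentile check."""
    return combine(
        [ic50_mt < t.binding_threshold, ic50_pct_mt < t.binding_percentile_threshold],
        t.strategy,
    )


def passes_scores(ic50_mt, ic50_pct_mt, pres_pct_mt, im_pct_mt, t):
    """Scores criteria: binding, presentation, and immunogenicity combined."""
    return combine(
        [
            passes_binding(ic50_mt, ic50_pct_mt, t),
            pres_pct_mt < t.presentation_percentile_threshold,
            im_pct_mt < t.immunogenicity_percentile_threshold,
        ],
        t.strategy,
    )


# Illustrative values only, not pVACtools defaults.
peptide = dict(ic50_mt=120.0, ic50_pct_mt=0.8, pres_pct_mt=1.5, im_pct_mt=3.0)
conservative = Thresholds(500.0, 2.0, 2.0, 2.0, "conservative")
exploratory = Thresholds(500.0, 2.0, 2.0, 2.0, "exploratory")

print(passes_scores(**peptide, t=conservative))  # False: immunogenicity percentile 3.0 is not < 2.0
print(passes_scores(**peptide, t=exploratory))   # True: binding and presentation pass
```

In this example the same peptide fails the Scores Criteria under the `conservative` strategy and passes it under the `exploratory` strategy, because only its immunogenicity percentile misses the cutoff.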
@@ -311,12 +302,14 @@ The Tiers available in pVACfuse are:
tabl <- "
| Tier | Criteria |
|------|---------|
-| Pass | Best Peptide passes the binding, reference match, read support, expression, and problematic position criteria |
-| PoorBinder | Best Peptide fails the binding criteria but passes the reference match, read support, expression, and problematic position criteria |
-| RefMatch | Best Peptide fails the reference match criteria but passes the binding, read support, expression, and problematic position criteria |
-| LowReadSupport | Best Peptide fails the read support criteria but passes the binding, reference match, expression, and problematic position criteria |
-| LowExpr | Best Peptide fails the expression criteria but passes the binding, reference match, read support, and problematic position criteria |
-| ProbPos | Best Peptide fails the problematic position criteria but passes the binding, reference match, read support, and expression |
+| Pass | Best Peptide passes the scores, reference match, read support, expression, and problematic position criteria |
+| PoorBinder | Best Peptide fails the binding criteria but passes the presentation, immunogenicity, reference match, read support, expression, and problematic position criteria |
+| PoorImmunogenicity | Best Peptide fails the immunogenicity criteria but passes the binding, presentation, reference match, read support, expression, and problematic position criteria |
+| PoorPresentation | Best Peptide fails the presentation criteria but passes the binding, immunogenicity, reference match, read support, expression, and problematic position criteria |
+| RefMatch | Best Peptide fails the reference match criteria but passes the scores, read support, expression, and problematic position criteria |
+| LowReadSupport | Best Peptide fails the read support criteria but passes the scores, reference match, expression, and problematic position criteria |
+| LowExpr | Best Peptide fails the expression criteria but passes the scores, reference match, read support, and problematic position criteria |
+| ProbPos | Best Peptide fails the problematic position criteria but passes the scores, reference match, read support, and expression |
| Poor | Best Peptide doesn’t fit any of the above tiers, usually if it fails two or more criteria |
"
cat(tabl)
@@ -329,9 +322,12 @@ cat(tabl)
tabl <- "
| Criteria | Description | Evaluation |
|----------|-------------|------------|
-| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold` (`--allele-specific-binding-thresholds` flag is respected)
percentile score criteria (if `--percentile-threshold` parameters is set): `%ile MT < --percentile-threshold` (if parameter is set)
`conservative` `--percentile-threshold-strategy`: needs to pass BOTh the binding score criteria AND the percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the percentile score criteria|
-| Read Support Criteria | Pass if variant has read support | Read Support < `--read-support` |
-| Expression Criteria | Pass if Best Transcript is expressed | Expr > `--expn-val` |
+| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold`
binding percentile score criteria: `IC50 %ile MT < --binding-percentile-threshold`
`conservative` `--percentile-threshold-strategy`: needs to pass BOTH the binding score criteria AND the binding percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the binding percentile score criteria |
+| Presentation Criteria | Pass if the Best Peptide is presented by the MHC | `Pres %ile MT < --presentation-percentile-threshold` |
+| Immunogenicity Criteria | Pass if the Best Peptide is immunogenic | `IM %ile MT < --immunogenicity-percentile-threshold` |
+| Scores Criteria | Pass if the Best Peptide is a strong binder, presented by the MHC, and/or immunogenic | `conservative` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, AND the immunogenicity criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, OR the immunogenicity criteria |
+| Read Support Criteria | Pass if variant has read support | `Read Support < --read-support` |
+| Expression Criteria | Pass if Best Transcript is expressed | `Expr > --expn-val` |
| Reference Match Criteria | Pass if there are no reference protome matches | `Ref Match == False` |
| Problematic Position Criteria | Best Peptide does not contain a problematic amino acid as defined by the `--problematic-amino-acids` parameter | `Prob Pos == None`
"
@@ -348,13 +344,15 @@ tabl <- "
| Tier | Criteria |
|------|----------|
-| Pass | Best Peptide passes the binding, reference match, expression, transcript, clonal, and problematic position criteria |
-| PoorBinder | Best Peptide fails the binding criteria but passes the reference match, expression, transcript, clonal, and problematic position criteria |
-| RefMatch | Best Peptide fails the reference match criteria but passes the binding, expression, transcript, clonal, and problematic position criteria
-| PoorTranscript | Best Peptide fails the transcript criteria but passes the binding, reference match, expression, clonal, and problematic position criteria |
-| LowExpr | Best Peptide meets the low expression criteria and passes the binding, reference match, transcript, clonal, and problematic position criteria |
-| Subclonal | Best Peptide fails the clonal criteria but passes the binding, reference match, expression, transcript, and problematic position criteria |
-| ProbPos | Best Peptide fails the problematic position criteria but passes the binding, reference match, expression, transcript, and clonal criteria |
+| Pass | Best Peptide passes the scores, reference match, expression, transcript, clonal, and problematic position criteria |
+| PoorBinder | Best Peptide fails the binding criteria but passes the presentation, immunogenicity, reference match, expression, transcript, clonal, and problematic position criteria |
+| PoorImmunogenicity | Best Peptide fails the immunogenicity criteria but passes the binding, presentation, reference match, expression, transcript, clonal, and problematic position criteria |
+| PoorPresentation | Best Peptide fails the presentation criteria but passes the binding, immunogenicity, reference match, expression, transcript, clonal, and problematic position criteria |
+| RefMatch | Best Peptide fails the reference match criteria but passes the scores, expression, transcript, clonal, and problematic position criteria |
+| PoorTranscript | Best Peptide fails the transcript criteria but passes the scores, reference match, expression, clonal, and problematic position criteria |
+| LowExpr | Best Peptide meets the low expression criteria and passes the scores, reference match, transcript, clonal, and problematic position criteria |
+| Subclonal | Best Peptide fails the clonal criteria but passes the scores, reference match, expression, transcript, and problematic position criteria |
+| ProbPos | Best Peptide fails the problematic position criteria but passes the scores, reference match, expression, transcript, and clonal criteria |
| Poor | Best Peptide doesn’t fit in any of the above tiers, usually if it fails two or more criteria |
| NoExpr | Best Peptide is not expressed (RNA Expr == 0 or RNA VAF == 0) |
"
@@ -368,12 +366,15 @@ cat(tabl)
tabl <- "
| Criteria | Description | Evaluation |
|----------|-------------|------------|
-| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold` (`--allele-specific-binding-thresholds` flag is respected)
percentile score criteria (if `--percentile-threshold` parameters is set): `%ile MT < --percentile-threshold` (if parameter is set)
`conservative` `--percentile-threshold-strategy`: needs to pass BOTh the binding score criteria AND the percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the percentile score criteria|
-| Expression Criteria | Pass if Best Transcript is expressed | Allele Expr > `--trna-vaf` * `--expn-val` |
+| Binding Criteria | Pass if Best Peptide is a strong binder | binding score criteria: `IC50 MT < --binding-threshold`
binding percentile score criteria: `IC50 %ile MT < --binding-percentile-threshold`
`conservative` `--percentile-threshold-strategy`: needs to pass BOTH the binding score criteria AND the binding percentile score criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass EITHER the binding score criteria OR the binding percentile score criteria |
+| Presentation Criteria | Pass if the Best Peptide is presented by the MHC | `Pres %ile MT < --presentation-percentile-threshold` |
+| Immunogenicity Criteria | Pass if the Best Peptide is immunogenic | `IM %ile MT < --immunogenicity-percentile-threshold` |
+| Scores Criteria | Pass if the Best Peptide is a strong binder, presented by the MHC, and/or immunogenic | `conservative` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, AND the immunogenicity criteria
`exploratory` `--percentile-threshold-strategy`: needs to pass the binding criteria, the presentation criteria, OR the immunogenicity criteria |
+| Expression Criteria | Pass if Best Transcript is expressed | `Allele Expr > --trna-vaf * --expn-val` |
| Reference Match Criteria | Pass if there are no reference protome matches | `Ref Match == False` |
| Transcript Criteria | Pass if Best Transcript matches any of the user-specified `--transcript-prioritization-strategy` criteria | `TSL <= --maximum-transcript-support level` (if strategy includes `tsl`)
`MANE Select == True` (if strategy includes `mane_select`)
`Canonical == True` (if strategy includes `canonical`) |
-| Low Expression Criteria | Peptide has low expression or no expression but RNA VAF and coverage | (0 < Allele Expr < `--trna-vaf` * `--expn-val`) OR (RNA Expr == 0 AND RNA Depth > `--trna-cov` AND RNA VAF > `--trna-vaf`) |
-| Clonal Criteria | Best Peptide is likely in the founding clone of the tumor | DNA VAF > `--tumor-purity` / 4 |
+| Low Expression Criteria | Peptide has low expression or no expression but RNA VAF and coverage | `(0 < Allele Expr < --trna-vaf * --expn-val) OR (RNA Expr == 0 AND RNA Depth > --trna-cov AND RNA VAF > --trna-vaf)` |
+| Clonal Criteria | Best Peptide is likely in the founding clone of the tumor | `DNA VAF > --tumor-purity / 4` |
| Problematic Position Criteria | Best Peptide does not contain a problematic amino acid as defined by the `--problematic-amino-acids` parameter | `Prob Pos == None`
"
cat(tabl)
diff --git a/05-pvacview_tour.Rmd b/05-pvacview_tour.Rmd
index e2bddbcb..e431ce36 100644
--- a/05-pvacview_tour.Rmd
+++ b/05-pvacview_tour.Rmd
@@ -80,38 +80,42 @@ The main table in the Aggregate Report of Best Candidates by Variant panel
shows the best neoantigen candidate for each variant.
It lists the gene and amino acid change of the variant as well as additional
information about the best peptide and the best transcript coding for it. These
-include, from left to right, the transcript's MANE Select status, the transcript's
-canonical status, the transcript support level, the best-binding HLA
-allele, the mutated positions of the best peptide, any positions in the peptide
-where the amino acid might be problematic for manufacturing, the total number of
-neoantigen candidates resulting from this variant that are included with
-additional detailed metadata, and the total number
-of neoantigen candidates passing the binding affinity threshold set by the user.
-If a gene of interest list was uploaded, variants on those genes have their gene
-highlighted with a green border.
-
-Next, this table lists the IC50 peptide MHC binding affinity for both the Best Peptide
-(MT) and the matched wild type (WT). These values are either a median of all of the
-binding predictions made by the predictors selected in your pVACseq run, or
-the lowest binding prediction made. This depends on the value set for the
-`--top-score-metric` in your run. The table also shows the median/lowest percentile
-scores of all predictors that provide this value, again depending on the
-`--top-score-metric`. For the mutant values, a heatmap coloring is applied to make it
-easier to visually identify well-binding peptides.
-
-The next few columns show the coverage and expression of the best transcript with
+include, from left to right, whether or not the transcript is considered a good
+transcript under the `--transcript-prioritization-strategy` and
+`--maximum-transcript-support-level` settings, the best-binding HLA
+allele, the mutated position(s) of the best peptide, any positions in the peptide
+where the amino acid might be problematic for manufacturing according to the
+`--problematic-amino-acids` parameter, the total number of neoantigen candidates resulting
+from this variant that are included with additional detailed metadata, and the
+total number of neoantigen candidates passing the binding affinity threshold set
+by the user. If a gene of interest list was uploaded, variants on those genes have
+their gene highlighted with a green border.
+
+Next, this table lists the IC50 peptide MHC binding affinity for both the Best
+Peptide (MT) and the matched wild type (WT). It then lists the combined mutant
+percentile score, the mutant binding percentile, the mutant presentation percentile,
+and the mutant immunogenicity percentile. These values are either a median of all of
+the prediction scores made by the predictors selected in your pVACseq run, or the
+lowest score. This depends on the value set for the `--top-score-metric` in your
+run. A heatmap coloring is applied to make it easier to visually identify
+well-binding peptides.
+
+The next few columns show the VAF, coverage, and expression of the best transcript with
a bar plot background to represent where specific values fall across the entire
patient sample.
-The Tier column represents the tier assigned to the best peptide. The neoantigen
-candidates in this view were all sorted into the Pass tier but tiers such as Low
-Expression or Subclonal are applied to easily identify why a neoantigen candidate
-might be unsuitable for vaccine selection.
-
The Ref Match column reflects whether or not the best peptide was found in the
reference proteome which is undesired since such peptides are not novel and
including them in a vaccine might lead to an auto immune response.
+The Tier column represents the tier assigned to the best peptide. The neoantigen
+candidates in this view were mostly sorted into the Pass tier but tiers such as Low
+Expression or Subclonal are applied to easily identify why a neoantigen candidate
+might be unsuitable for vaccine selection. The bottom two variants in the view are
+assigned the PoorBinder tier because their binding affinity exceeds the desired
+binding threshold. This is also visually indicated by the yellow cell backgrounds
+and the red border around the IC50 MT value cells for these variants.
+
Users are able to set a status for each candidate by clicking the appropriate
buttons: thumbs-up (accept), thumbs-down (reject), or flag (requires further review).
@@ -119,14 +123,7 @@ Clicking on the row for a variant will select it. This
will update the lower panels with details for the selected variant.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Upon successfully uploading the relevant data files, you can explore the different aspects of your neoantigen candidates."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_0")
-```
-
-For candidates not sorted into the Pass tier, red borders visually highlight the
-attributes failed by the candidate.
-
-```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Neoantigen candidates are binned into tiers depending on their suitability for vaccine creation."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_6")
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e3d230b076_0_0")
```
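
The effect of `--top-score-metric` on the values shown in the main table can be illustrated with a minimal Python sketch. The `top_score` function and the predictor values below are hypothetical and only show the difference between the `median` and `lowest` settings; they are not the pVACtools implementation.

```python
from statistics import median


def top_score(predictions, metric):
    """Summarize per-algorithm scores as either their median or their lowest value.

    predictions: mapping of algorithm name -> score (e.g. IC50 in nM or a percentile)
    metric: "median" or "lowest", mirroring the --top-score-metric setting
    """
    values = list(predictions.values())
    if not values:
        raise ValueError("no predictions available")
    if metric == "median":
        return median(values)
    if metric == "lowest":
        return min(values)
    raise ValueError(f"unknown metric: {metric!r}")


# Made-up IC50 predictions (nM) for one peptide-HLA pair.
ic50_mt = {"NetMHCpan": 85.0, "MHCflurry": 140.0, "MHCnuggets": 410.0}

print(top_score(ic50_mt, "median"))  # 140.0
print(top_score(ic50_mt, "lowest"))  # 85.0
```

Because lower IC50 and percentile values indicate better predictions, `lowest` reports the most favorable single prediction, while `median` is less sensitive to one outlying algorithm.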
### Variant Information
@@ -199,7 +196,7 @@ one peptide-MHC binding prediction falls within the `--aggregate-inclusion-thre
will be shown in this table. However, for variants resulting in a large number
of well-binding candidates (e.g. frameshift variant with a long downstream
sequence), the number of included candidates will be limited to the best
-`--aggregate-inclusion-count-limit` candidates.For HLA alleles where the peptide is not
+`--aggregate-inclusion-count-limit` candidates. For HLA alleles where the peptide is not
well-binding the prediction details will show `X`. This table also shows the
mutant position, whether or not the neoantigen candidate has any problematic
positions, and whether or not it failed the anchor criteria. This helps in
@@ -242,38 +239,55 @@ In the Additional Peptide Information panel, users can see more information
for the neoantigen candidate selected in the Transcript Set Detailed Data
panel.
-The first tab, "IC50 Plot", shows violin plots of the predicted IC50 binding
-affinity for each prediction algorithm for both the neoantigen candidate and
-its matched wildtype peptide. This can be used to check concordance
-of predictions between the different algorithms. It also allows for a
-detailed comparison between the mutant and wildtype predictions in addition to
-the median or lowest IC50 binding affinity used elsewhere. A solid line is used
-to represent the median score.
+The first tab, %ile Plot, shows a violin plot for the predicted percentile
+scores for each prediction algorithm for both the neoantigen candidate and
+its matched wildtype peptide. This can be used to check concordance of
+predictions between the different algorithms. Each data point is colored
+by algorithm type: blue for binding algorithms, yellow for
+presentation algorithms, and purple for immunogenicity algorithms. Users may
+filter the plot to show only a subset of algorithms by selecting the desired
+algorithm type in the "Specify what data to show" dropdown.
+
+A solid black line is used to represent the median score. A solid green line
+marks the 2% cutoff and a dashed line marks the 0.5% cutoff.
-```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Additional Peptide Information panel shows more information for the peptide selected in the Transcript Set Detailed Data panel."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_42")
+```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The %ile Plot tab shows violin plots of the percentile score predicted by each algorithm."}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_0")
```
-The %ile Plot tab shows a similar violin plot but for the predicted percentile
-scores as opposed to the IC50 binding affinity. A solid line is also used here
-to represent the median score.
+The "IC50 Plot" tab shows a similar violin plot but for the IC50 binding affinity
+scores as opposed to the percentile. A solid line is also used here to represent
+the median score. A solid green line marks the 1000nM cutoff and a dashed
+line marks the 500nM cutoff.
-```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The %ile Plot tab shows violin plots of the percentile score predicted by each algorithm."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_48")
+```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The IC50 Plot tab shows violin plots of the binding affinity scores predicted by each binding affinity algorithm."}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_5")
```
The next tab, "Binding Data", shows the IC50 binding affinity and
-percentile score but in table format.
+binding percentile scores but in table format. Heatmap table cell backgrounds
+indicate where each value falls in relation to the binding threshold and
+the binding percentile threshold, respectively. Users may switch between the two
+by changing the selection in the "Cell heatmap background coloring" dropdown.
+
+```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Binding Data tab shows a table of the IC50 binding affinity and percentile values predicted by each algorithm."}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_10")
+```
+
+The Presentation Data tab shows the predicted presentation scores and presentation
+percentiles. Heatmap table cell backgrounds indicate where each percentile value
+falls in relation to the presentation percentile threshold.
-```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Binding Data tab shows a table of the IC50 binding affinity and percentile predicted by each algorithm."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_54")
+```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Presentation Data tab shows presentation prediction scores and percentiles for the selected peptide."}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_15")
```
-The Elution Table tab shows the predicted elution scores and percentiles, if
-the appropriate prediction algorithm(s) were chosen.
+The Immunogenicity Data tab shows the predicted immunogenicity scores and immunogenicity
+percentiles. Heatmap table cell backgrounds indicate where each percentile value
+falls in relation to the immunogenicity percentile threshold.
-```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Elution Table tab shows elution prediction scores and precentiles for the selected peptide."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_60")
+```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Immunogenicity Data tab shows immunogenicity prediction scores and percentiles for the selected peptide."}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_20")
```
## Regenerate Tiers with Custom Parameters
@@ -291,7 +305,7 @@ in the "Original Parameters for Tiering" panel and the tiers can be reset to
those parameters by pressing the "Reset to original parameters" button.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Users can re-tier the neoantigen candidates by adjusting the tiering thresholds."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3a65970eb36_0_66")
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_31")
```
## Adding Comments to Variants
@@ -314,5 +328,5 @@ export the Aggregated Report with the updated Evaluation column and comments
added. The report can be exported in either TSV or Excel format.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Users can export the neoantigen candidate table after review has been completed."}
-ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_99")
+ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g3e422d3441b_0_38")
```
diff --git a/book.bib b/book.bib
index 07c485c5..7d5bbce0 100644
--- a/book.bib
+++ b/book.bib
@@ -328,6 +328,82 @@ @article{Altschul1990
journal = {Journal of Molecular Biology}
}
+@article{Gfeller2023,
+ author = {D. Gfeller and J. Schmidt and G. Croce and P. Guillaume and S. Bobisse and R. Genolet and L. Queiroz and J. Cesbron and J. Racle and A. Harari},
+ title = {Improved Predictions of Antigen Presentation and {TCR} Recognition with {MixMHCpred2.2} and {PRIME2.0} Reveal Potent {SARS-CoV-2} {CD8+} {T}-Cell Epitopes},
+ journal = {Cell Syst},
+ year = 2023,
+ volume = 14,
+ pages = {72--83.e5},
+ pmid = 36603583,
+ pmcid = {PMC9811684},
+ reprinturl = {https://doi.org/10.1016/j.cels.2022.12.002},
+}
+
+@article{Albert2023,
+ author = {B. A. Albert and Y. Yang and X. M. Shao and D. Singh and K. N. Smith and V. Anagnostou and R. Karchin},
+ title = {Deep Neural Networks Predict Class {I} Major Histocompatibility Complex Epitope Presentation and Transfer Learn Neoepitope Immunogenicity},
+ journal = {Nat Mach Intell},
+ year = 2023,
+ volume = 5,
+ pages = {861--872},
+ pmid = 37829001,
+ pmcid = {PMC10569228},
+ reprinturl = {https://doi.org/10.1038/s42256-023-00694-6},
+}
+
+@article{ODonnell2018,
+ author = {T. J. O'Donnell and A. Rubinsteyn and M. Bonsack and A. B. Riemer and U. Laserson and J. Hammerbacher},
+ title = {{MHCflurry:} {Open-Source} {C}lass {I} {MHC} {B}inding {A}ffinity {P}rediction},
+ journal = {Cell Syst},
+ year = 2018,
+ volume = 7,
+ pages = {129--132.e4},
+ pmid = 29960884,
+ reprinturl = {https://doi.org/10.1016/j.cels.2018.05.014},
+}
+
+@article{Racle2023,
+ author = {J. Racle and P. Guillaume and J. Schmidt and J. Michaux and A. Larabi and K. Lau and M. A. S. Perez and G. Croce and R. Genolet and G. Coukos and V. Zoete and F. Pojer and M. Bassani-Sternberg and A. Harari and D. Gfeller},
+ title = {Machine Learning Predictions of {MHC-II} Specificities Reveal Alternative Binding Mode of Class {II} Epitopes},
+ journal = {Immunity},
+ year = 2023,
+ volume = 56,
+ pages = {1359--1375.e13},
+ pmid = 37023751,
+ reprinturl = {https://doi.org/10.1016/j.immuni.2023.03.009},
+}
+
+@article{Li2021,
+ author = {G. Li and B. Iyer and V. B. S. Prasath and Y. Ni and N. Salomonis},
+ title = {{DeepImmuno:} Deep Learning-empowered Prediction and Generation of Immunogenic Peptides for {T}-cell Immunity},
+ journal = {Brief Bioinform},
+ year = 2021,
+ volume = 22,
+ pages = {bbab160},
+ pmid = 34009266,
+ pmcid = {PMC8135853},
+ reprinturl = {https://doi.org/10.1093/bib/bbab160},
+}
+
+@article{Shen2025,
+ author = {Shen, Long-Chen and Zhang, Yumeng and Wang, Zhikang and Littler, Dene R. and Liu, Yan and Tang, Jinhui and Rossjohn, Jamie and Yu, Dong-Jun and Song, Jiangning},
+ date = {2025/08/01},
+ date-added = {2026-05-19 09:31:43 -0500},
+ date-modified = {2026-05-19 09:31:43 -0500},
+ doi = {10.1038/s42256-025-01073-z},
+ id = {Shen2025},
+ isbn = {2522-5839},
+ journal = {Nature Machine Intelligence},
+ number = {8},
+ pages = {1250--1265},
+ title = {Self-iterative multiple-instance learning enables the prediction of CD4+ T cell immunogenic epitopes},
+ url = {https://doi.org/10.1038/s42256-025-01073-z},
+ volume = {7},
+ year = {2025},
+ bdsk-url-1 = {https://doi.org/10.1038/s42256-025-01073-z}
+}
+
@Manual{rmarkdown2021,
title = {rmarkdown: Dynamic Documents for R},
author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
diff --git a/resources/dictionary.txt b/resources/dictionary.txt
index 5f7b3adf..2cf95b00 100644
--- a/resources/dictionary.txt
+++ b/resources/dictionary.txt
@@ -3,6 +3,7 @@ Arriba
AnVIL
agretopicity
BIPOC
+BigMHC
Biotype
BLASTp
Bloomberg
@@ -23,6 +24,7 @@ DQB
DRB
Datatrail
DataTrail
+DeepImmuno
Dockerfile
Dockerhub
deprioritized
@@ -58,7 +60,9 @@ IC
IEDB
ITCR
ITN
+ImmuoScope
ile
+immunogenicity
immunotherapies
immunotherapy
isoform
@@ -75,6 +79,8 @@ Leanpub
lymphoblastoid
MHCnuggets
Markua
+MixMHC
+MixMHCpred
mRNA
manufacturability
mentorship
@@ -95,6 +101,7 @@ Neoantigen
neoantigens
neoantigen's
nmol
+nM
OptiType
ottrpal
PHLAT
@@ -102,6 +109,7 @@ pVACbind
proteome
Pandoc
pre
+pred
proteomics
pVAC
pVACfuse
@@ -112,6 +120,7 @@ pVACtools
pVACvector
pVACview
pVACviz
+PoorBinder
RefSeq
RegTools
reproducibility
@@ -123,6 +132,7 @@ SpanningFragCount
tbi
tiering
TSL
+tsl
tsv
UE
UE5