## Summary Table
Evidence | Status
-----------------------|--------------
[PVS1](#wiki-toc-pvs1) | Implemented
[PS1](#wiki-toc-ps1) | Implemented
[PS2](#wiki-toc-ps2) | Not Checked
[PS3](#wiki-toc-ps3) | Not Checked
[PS4](#wiki-toc-ps4) | Implemented
[PM1](#wiki-toc-pm1) | Planned
[PM2](#wiki-toc-pm2) | Implemented
[PM3](#wiki-toc-pm3) | Planned for Trio
[PM4](#wiki-toc-pm4) | Implemented
[PM5](#wiki-toc-pm5) | Broken (╯°□°)╯︵ ┻━┻)
[PM6](#wiki-toc-pm6) | Planned for Trio
[PP1](#wiki-toc-pp1) | Not Checked
[PP2](#wiki-toc-pp2) | Implemented
[PP3](#wiki-toc-pp3) | Implemented
[PP4](#wiki-toc-pp4) | Not Checked
[PP5](#wiki-toc-pp5) | Implemented
[BA1](#wiki-toc-ba1) | Implemented
[BS1](#wiki-toc-bs1) | Planned
[BS2](#wiki-toc-bs2) | Planned
[BS3](#wiki-toc-bs3) | Not Checked
[BS4](#wiki-toc-bs4) | Not Checked
[BP1](#wiki-toc-bp1) | Implemented
[BP2](#wiki-toc-bp2) | Planned for Trio
[BP3](#wiki-toc-bp3) | Planned
[BP4](#wiki-toc-bp4) | Implemented
[BP5](#wiki-toc-bp5) | Not Checked
[BP6](#wiki-toc-bp6) | Implemented
[BP7](#wiki-toc-bp7) | Planned
## Evidence Collection Process
### PVS1
* PVS1 null variant (nonsense, frameshift, canonical ±1 or 2 splice sites, initiation codon, single or multiexon deletion) in a gene where LOF is a known mechanism of disease.
#### Status
* Implemented
#### Resources
* LoF genes list from intervar. https://raw.githubusercontent.com/barslmn/InterVar/master/intervardb/PVS1.LOF.genes.hg19
* Null variants defined as HIGH IMPACT by https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html
#### Conditions
* "gene_symbol" is in LoF gene list.
* "transcript_consequence_terms" is high impact.
#### Shortcomings
* LoF gene list is only predictive and may be missing some actual LoF genes.
* No checks for multiexon deletion.
### PS1
* Same amino acid change as a previously established pathogenic variant regardless of nucleotide change.
#### Status
* Implemented
#### Resources
* Clinvar xml (ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/)
#### Annotation Steps
1. Clinvar data is parsed using https://github.com/barslmn/clinvar.
2. Sample data and clinvar data is merged based on columns "CHR" and "POS".
3. Clinvar feature columns "ALT", "hgvsp", and "clinical_significance" added to original annotation.
#### Conditions
1. "clinical_significance" is pathogenic.
2. Sample "hgvsp" and later added clinvar "hgvsp" changes are the same.
3. Sample "ALT" and clinvar "ALT" are different.
#### Shortcomings
### PS2
* De novo (both maternity and paternity confirmed) in a patient with the disease and no family history.
#### Status
* Not Checked
#### Resources
#### Conditions
#### Shortcomings
### PS3
* Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product
#### Status
* Not Checked
#### Resources
#### Conditions
#### Shortcomings
### PS4
* The prevalence of the variant in affected individuals is significantly increased compared with the prevalence in controls
#### Status
* Implemented
#### Resources
* Intervar
#### Conditions
1. "id" is in id list.
#### Shortcomings
1. No idea how the source is made.
### PM1
* Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation
#### Status
* Planned.
#### Resources
#### Conditions
#### Shortcomings
### PM2
* Absent from controls (or at extremely low frequency if recessive) (Table 6) in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium
#### Status
* Implemented
#### Resources
* VEP
#### Conditions
* "gnomad" less than 0.001.
#### Shortcomings
### PM3
* For recessive disorders, detected in trans with a pathogenic variant
#### Status
* Planned for trio
#### Resources
#### Conditions
#### Shortcomings
### PM4
* Protein length changes as a result of in-frame deletions/insertions in a nonrepeat region or stop-loss variants
#### Status
* Implemented
#### Resources
* VEP
#### Conditions
* "transcript_consequence_terms" is "inframe_insertion", "inframe_deletion", or "stop_lost".
#### Shortcomings
* No checks for repeat regions.
### PM5
* Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before
#### Status
* Broken. (╯°□°)╯︵ ┻━┻)
#### Resources
* Clinvar xml (ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/)
#### Annotation Steps
1. Clinvar data is parsed using https://github.com/barslmn/clinvar.
2. Sample data and clinvar data hgvsp columns parsed till position.
3. Synonym changes removed from clinvar data.
4. Clinvar feature columns "hgvsc", and "clinical_significance" added to original annotation based on protein change position.
#### Conditions
1. "gnomad" less then 0.001.
2. "clinical_significance" is pathogenic.
3. "transcript_consequence_terms" is missense variant.
4. "hgvsc" of the variant and clinvar entry dont match.
#### Shortcomings
### PM6
* Assumed de novo, but without confirmation of paternity and maternity
#### Status
* Planned for trio.
#### Resources
#### Conditions
#### Shortcomings
### PP1
* Cosegregation with disease in multiple affected family members in a gene definitively known to cause the disease
#### Status
* Not Checked.
#### Resources
#### Conditions
#### Shortcomings
### PP2
* Missense variant in a gene that has a low rate of benign missense variation and in which missense variants are a common mechanism of disease
#### Status
* Implemented
#### Resources
* Intervar
#### Conditions
* "transcript_consequence_terms" is a missense variant.
* "gene_symbol" is in PP2 gene list.
#### Shortcomings
### PP3
* Multiple lines of computational evidence support a deleterious effect on the gene or gene product (conservation, evolutionary, splicing impact, etc.)
#### Status
* Implemented
#### Resources
* Vep
#### Conditions
* "sift_score" less than 0.05
* "polyphen_score" greater than 0.908
#### Shortcomings
### PP4
* Patient’s phenotype or family history is highly specific for a disease with a single genetic etiology
#### Status
* Not Checked.
#### Resources
#### Conditions
#### Shortcomings
### PP5
* Reputable source recently reports variant as pathogenic, but the evidence is not available to the laboratory to perform an independent evaluation
#### Status
* Implemented.
#### Resources
* Clinvar
#### Conditions
* "clinical_significance" is Pathogenic.
#### Shortcomings
### Benign
### BA1
* Allele frequency is >5% in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium
#### Status
* Implemented.
#### Resources
* Vep
#### Conditions
* "minor_allele_freq" is greater than 0.05
OR
* "gnomad" is greater than 0.05.
#### Shortcomings
### BS1
* Allele frequency is greater than expected for disorder
#### Status
* Planned for later.
#### Resources
#### Conditions
#### Shortcomings
### BS2
* Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age
#### Status
* Planned
#### Resources
* Intervar
#### Conditions
#### Shortcomings
### BS3
* Well-established in vitro or in vivo functional studies show no damaging effect on protein function or splicing
#### Status
* Not Checked.
#### Resources
#### Conditions
#### Shortcomings
### BS4
* Lack of segregation in affected members of a family
#### Status
* Not Checked.
#### Resources
#### Conditions
#### Shortcomings
### BP1
* Missense variant in a gene for which primarily truncating variants are known to cause disease
#### Status
* Implemented.
#### Resources
* Intervar
#### Conditions
* "transcript_consequence_terms" is a missense variant.
* "gene_symbol" is in BP1 gene list.
#### Shortcomings
### BP2
* Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder or observed in cis with a pathogenic variant in any inheritance pattern
#### Status
* Planned for trio.
#### Resources
#### Conditions
#### Shortcomings
### BP3
* In-frame deletions/insertions in a repetitive region without a known function
#### Status
* Planned.
#### Resources
#### Conditions
#### Shortcomings
### BP4
* Multiple lines of computational evidence suggest no impact on gene or gene product (conservation, evolutionary, splicing impact, etc.)
#### Status
* Implemented
#### Resources
* VEP
#### Conditions
* "sift_score" greater than or equals to 0.05
* "polyphen_score" less than or equals to 0.446
#### Shortcomings
### BP5
* Variant found in a case with an alternate molecular basis for disease
#### Status
* Not Checked.
#### Resources
#### Conditions
#### Shortcomings
### BP6
* Reputable source recently reports variant as benign, but the evidence is not available to the laboratory to perform an independent evaluation
#### Status
* Implemented
#### Resources
* Clinvar
#### Conditions
* "clinical_significance" is benign
#### Shortcomings
### BP7
* A synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved
#### Status
* Planned
#### Resources
#### Conditions
#### Shortcomings