DNA Flashcards
a substitution of the C nucleotide at g.33038255 by an A
NC_000023.10:g.33038255C>A
a substitution of the G nucleotide at c.93+1 (coding DNA reference sequence) by a T
NG_012232.1(NM_004006.2):c.93+1G>T
nucleotides c.79 and c.80 are replaced by TT
LRG_199t1:c.79_80delinsTT
NOTE: changes involving two or more consecutive nucleotides are described as deletion-insertion (delins) so the description c.[79G>T;80C>T] is not correct
NOTE: based on the definition of a substitution, i.e. one nucleotide replaced by one other nucleotide, this change can not be described as a substitution like c.79_80GC>TT or c.79GC>TT
two substitutions replacing codon CGC (position c.145 to c.147) by TGG
NM_004006.2:c.145_147delinsTGG
NOTE: two variants separated by one nucleotide, together affecting one amino acid, should be described as a “delins” so the description c.[145C>T;147C>G] is not correct
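The substitution versus delins rule in the cards above can be sketched in a few lines of Python. The helper name `describe_replacement` and its arguments are hypothetical and not part of any HGVS tool. The sketch assumes plain integer positions (no intronic offsets such as c.93+1) and, like the cards above, reports the full replaced span.

```python
def describe_replacement(ref_id: str, prefix: str, start: int, ref: str, alt: str) -> str:
    """Describe reference nucleotides `ref`, beginning at `start`, replaced by `alt`.

    Hypothetical helper for illustration; integer positions only.
    """
    if not ref or not alt:
        raise ValueError("ref and alt must both contain at least one nucleotide")
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to describe")
    if len(ref) == 1 and len(alt) == 1:
        # One nucleotide replaced by one other nucleotide: a substitution.
        return f"{ref_id}:{prefix}.{start}{ref}>{alt}"
    # Two or more nucleotides involved: a deletion-insertion (delins).
    end = start + len(ref) - 1
    span = str(start) if end == start else f"{start}_{end}"
    return f"{ref_id}:{prefix}.{span}delins{alt}"


print(describe_replacement("NC_000023.10", "g", 33038255, "C", "A"))
print(describe_replacement("LRG_199t1", "c", 79, "GC", "TT"))
print(describe_replacement("NM_004006.2", "c", 145, "CGC", "TGG"))
```

The three calls reproduce the descriptions on the cards: NC_000023.10:g.33038255C>A, LRG_199t1:c.79_80delinsTT and NM_004006.2:c.145_147delinsTGG.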
a substitution of the G nucleotide at c.54 (coding DNA reference sequence) by A, C or T (IUPAC code “H”)
LRG_199t1:c.54G>H
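To see what an IUPAC ambiguity code stands for in a description like c.54G>H, the code can be expanded into the individual substitutions it covers. The table below is the standard IUPAC nucleotide code set. The function name `expand_substitution` is a hypothetical helper for illustration only.

```python
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_substitution(position: int, ref: str, code: str) -> List[str]:
    """List the single-nucleotide substitutions covered by an IUPAC code."""
    bases = IUPAC_CODES[code.upper()]
    if ref.upper() in bases:
        # A substitution must change the nucleotide, so the code may not
        # include the reference nucleotide itself.
        raise ValueError(f"code {code} includes the reference nucleotide {ref}")
    return [f"{position}{ref.upper()}>{base}" for base in bases]


print(expand_substitution(54, "G", "H"))  # ['54G>A', '54G>C', '54G>T']
```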
a screen was performed showing that nucleotide c.123 was a “C” as in the coding DNA reference sequence (the nucleotide was not changed).
NM_004006.2:c.123=
NOTE: the description NM_004006.2:c.= can not be used, c.= indicates the entire NM_004006.2 coding DNA reference sequence was analysed and no change was identified.
NOTE: the description LRG_199t1:c.94-23_188+33= indicates no variants were found in the region indicated (exon 3 of the DMD gene).
a mosaic case where at position c.85, besides the normal sequence (a T, described as “=”), chromosomes are also found containing a C (c.85T>C)
LRG_199t1:c.85=/T>C
NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first
a chimeric case, i.e. the sample is a mix of cells containing c.85= and c.85T>C.
NM_004006.2:c.85=//T>C
NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first
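The only difference between the mosaic and chimeric formats is the separator: “/” for mosaicism and “//” for chimerism, with the reference (“=”) always first. A minimal sketch, assuming a hypothetical helper `describe_mixed` that takes the position and the change as ready-made strings:

```python
def describe_mixed(ref_id: str, prefix: str, position: str, change: str,
                   chimeric: bool = False) -> str:
    """Format a mosaic ("/") or chimeric ("//") description.

    The reference state ("=") is always written first, regardless of how
    often each state was observed.
    """
    separator = "//" if chimeric else "/"
    return f"{ref_id}:{prefix}.{position}={separator}{change}"


print(describe_mixed("LRG_199t1", "c", "85", "T>C"))                   # mosaic
print(describe_mixed("NM_004006.2", "c", "85", "T>C", chimeric=True))  # chimeric
```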
When I only sequenced RNA (cDNA) and not genomic DNA should I then give the description of a variant at DNA level in parenthesis?
Yes. While the variant at RNA level can be described as r.76a>g, at DNA level, based on e.g. a coding DNA reference sequence, it should be described as c.(76A>G).
How should I describe a variant in the promoter region of a gene?
It is recommended to describe variants in the promoter region of a gene based on a genomic reference sequence, e.g. NC_000023.10:g.33357783G>A (chrX, hg19). Describing the variant in relation to a coding DNA reference sequence (for this variant NM_004006.1:c.-128354C>T or NM_000109.3:c.-401C>T) is possible but not very informative, because the description does not tell you how long the 5’UTR is. The variant can also be described using a genomic reference sequence containing the promoter region (for this variant e.g. L01538.1:g.1407C>T), but again this is not very informative. Although NC_000023.10:g.33357783G>A seems complex, it can be used in genome browsers, helping you to quickly zoom in on the region of interest.
Are polymorphisms described like NM_004006.1:c.76A/G?
No, all substitutions are described as NM_004006.1:c.76A>G. In the past, the format c.76A/G has been used to describe “polymorphic” sequence variants. Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.
Can I describe a GC to TG variant as a dinucleotide substitution (NG_012232.1:g.12GC>TG)?
No, this is not allowed. By definition a substitution changes one nucleotide into one other nucleotide. The change GAAGCCAG to GAATGCAG should be described as NG_012232.1:g.12_13delinsTG, i.e. a deletion/insertion (indel) (see Deletion-Insertion and Description - Note). When phase information is not available, the variant should be described as NG_012232.1:g.12G>T(;)13C>G (see Alleles).
The BRCA1 coding DNA reference sequence NM_007294.3 from position c.2074 to c.2080 is ..CATGACA.. A variant frequently found in the population is ..CATAACA.. (NM_007294.3:c.2077G>A). In a patient I found the sequence ..CATATAACA.. Can I describe this variant as NM_007294.3:c.[2077G>A;2077_2078insTA]?
The correct description of this variant is NM_007294.3:c.2077delinsATA.
NOTE: the answer was modified, i.e. the addition “However, since the variant is likely a combination of two other variants it is acceptable to describe it as NM_007294.3:c.[2077G>A;2077_2078insTA]” was removed.
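The delins answers in the two cards above can be reproduced by trimming the nucleotides that the reference and the observed sequence share at both ends and describing what is left. This is a minimal sketch, not an official HGVS algorithm. The names `trim_common` and `describe_change` are hypothetical, positions are plain integers, and the start positions of the sequence fragments (c.2074 and g.9) are taken or inferred from the cards.

```python
from typing import Tuple


def trim_common(ref: str, obs: str, first_pos: int) -> Tuple[int, str, str]:
    """Return (start position, deleted sequence, inserted sequence)."""
    # Strip the shared 5' prefix first.
    prefix = 0
    while prefix < min(len(ref), len(obs)) and ref[prefix] == obs[prefix]:
        prefix += 1
    # Then strip the shared 3' suffix, never re-using prefix nucleotides.
    suffix = 0
    max_suffix = min(len(ref), len(obs)) - prefix
    while suffix < max_suffix and ref[-1 - suffix] == obs[-1 - suffix]:
        suffix += 1
    deleted = ref[prefix:len(ref) - suffix]
    inserted = obs[prefix:len(obs) - suffix]
    return first_pos + prefix, deleted, inserted


def describe_change(ref_id: str, prefix: str, ref: str, obs: str, first_pos: int) -> str:
    """Describe a substitution or delins between two aligned fragments."""
    start, deleted, inserted = trim_common(ref, obs, first_pos)
    if not deleted or not inserted:
        raise ValueError("pure deletions and insertions are outside this sketch")
    if len(deleted) == 1 and len(inserted) == 1:
        return f"{ref_id}:{prefix}.{start}{deleted}>{inserted}"
    end = start + len(deleted) - 1
    span = str(start) if end == start else f"{start}_{end}"
    return f"{ref_id}:{prefix}.{span}delins{inserted}"


# BRCA1 card: ..CATGACA.. (c.2074 to c.2080) observed as ..CATATAACA..
print(describe_change("NM_007294.3", "c", "CATGACA", "CATATAACA", 2074))
# GC to TG card: GAAGCCAG observed as GAATGCAG, fragment assumed to start at g.9
print(describe_change("NG_012232.1", "g", "GAAGCCAG", "GAATGCAG", 9))
```

The calls print NM_007294.3:c.2077delinsATA and NG_012232.1:g.12_13delinsTG, matching the answers on the cards.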
a deletion of the T at position g.19 in the sequence AGAATCACA to AGAA_CACA
NOTE: the recommendation is not to describe the variant as NG_012232.1:g.19delT, i.e. not to include the deleted nucleotide sequence. That description is longer, contains redundant information and increases the chance of making an error (e.g. NG_012232.1:g.19delG).
one nucleotide - NG_012232.1:g.19del
a deletion of nucleotides g.19 to g.21 in the sequence AGAATCACA to AGAA___CA
NG_012232.1:g.19_21del
NOTE: the recommendation is not to describe the variant as NG_012232.1:g.19_21delTCA, i.e. not to include the deleted nucleotide sequence. That description is longer, contains redundant information and increases the chance of making an error (e.g. NG_012232.1:g.19_21delTTA).
a deletion of nucleotides c.183 to c.186+48 (coding DNA reference sequence), crossing an exon/intron border
NG_012232.1(NM_004006.1):c.183_186+48del
the deletion of the T nucleotide at the exon/exon border in the sequence ..GAT gta..//..cag TCA.. changing to ..GA_ gta..//..cag TCA..
LRG_199t1:c.3921del
NOTE: according to an exception to the 3’ rule, the variant (NC_000023.10:g.32459297del) is not described as c.3922del since this would shift the position of the variant to the next exon (c.3922 linking to g.32456507) (see exception in Numbering and see Q&A)
the deletion of the G nucleotide at the exon/intron border in the sequence GAACAGgt…/..agTGCCTT changing to GAACAG_t…/..agTGCCTT (not c.1704del)
LRG_199t1:c.1704+1del
NOTE: this description does not depend on the effect observed on RNA level, giving either altered splicing or r.1704del
the deletion of the G nucleotide at the intron/exon border in the sequence CTGGCCgt…/..agGTTTTA changing to CTGGCCgt…/..ag_TTTTA (not c.1813-1del)
LRG_199t1:c.1813del
a deletion of nucleotides c.4072-1234 to c.5155-246 removing exon 30 (starting at position c.4072) to exon 36 (ending at position c.5154) of the DMD-gene.
NG_012232.1(NM_004006.1):c.4072-1234_5155-246del
NOTE: c.4072-1234_5155-246delXXXXX, the size of the deletion (XXXXX) should not be described
a deletion of exon 30 (starting at position c.4072) to exon 36 (ending at position c.5154) of the DMD-gene. The deletion break point has not been sequenced. Exons 29 (ending at c.4071) and 37 (starting at nucleotide c.5155) have been tested and shown not to be deleted. The deletion therefore starts in intron 29 (position c.4071+1 to c.4072-1) and ends in intron 36 (position c.5154+1 to c.5155-1).
NG_012232.1(NM_004006.1):c.(4071+1_4072-1)_(5154+1_5155-1)del
probe-based description of a deletion, identified by MLPA, of exon 30 (deleted position tested c.4196) to exon 36 (deleted position tested c.5090) of the DMD-gene. The deletion break point has not been sequenced. Exons 29 (position tested c.3996) and 37 (position tested c.5284) are not deleted.
NG_012232.1(NM_004006.1):c.(3996_4196)_(5090_5284)del
a deletion of nucleotides c.720 to c.991 starting in exon 8 (position c.720) and ending in exon 10 (position c.991) of the DMD-gene.
LRG_199t1:c.720_991del
a deletion of the entire DMD gene based on a SNP-array analysis where the maximum size of the deletion lies between SNPs rs396303 and rs7887548 (nucleotides 31060227 and 33417151) and the minimum size between SNPs rs808178 and rs7887103 (nucleotides 31100351 and 33274278).
NC_000023.11:g.(31060227_31100351)_(33274278_33417151)del
a deletion of the entire DMD gene based on a MLPA assay where the nucleotide positions g.31120496 and g.33339477 are defined by the 3’ nucleotide of the genomically most 5’ located probes (usually the ligation site) for the resp. last and first (brain promoter) exons.
NC_000023.11:g.(?_31120496)_(33339477_?)del
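When the break points have not been sequenced, each end of the deletion is written as a range in parentheses, with “?” for a boundary that is unknown. The sketch below builds such descriptions from four boundary values. The function names and the parameter names (`start_from`, `start_to`, `end_from`, `end_to`) are hypothetical. Positions are passed as strings so that intronic positions such as 4071+1 can be used unchanged.

```python
from typing import Optional


def uncertain_range(first: Optional[str], last: Optional[str]) -> str:
    """Format one uncertain break point; None becomes '?'."""
    left = "?" if first is None else first
    right = "?" if last is None else last
    return f"({left}_{right})"


def describe_uncertain_deletion(ref_id: str, prefix: str,
                                start_from: Optional[str], start_to: Optional[str],
                                end_from: Optional[str], end_to: Optional[str]) -> str:
    """Describe a deletion whose 5' and 3' break points each lie in a range."""
    five_prime = uncertain_range(start_from, start_to)
    three_prime = uncertain_range(end_from, end_to)
    return f"{ref_id}:{prefix}.{five_prime}_{three_prime}del"


# Exon 30 to 36 deletion, break points somewhere in introns 29 and 36.
print(describe_uncertain_deletion("NG_012232.1(NM_004006.1)", "c",
                                  "4071+1", "4072-1", "5154+1", "5155-1"))
# Whole-gene deletion from MLPA, outer boundaries unknown.
print(describe_uncertain_deletion("NC_000023.11", "g",
                                  None, "31120496", "33339477", None))
```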
a mosaic case where from position g.19 to g.21 besides the normal sequence also chromosomes are found containing a deletion of this sequence
NG_012232.1:g.19_21=/del
a chimeric case, i.e. the sample is a mix of cells containing g.19_21= and g.19_21del
NG_012232.1:g.19_21=//del
What is a duplication (at the nucleotide level)?
a sequence change where, compared to a reference sequence, a copy of one or more nucleotides is inserted directly 3’ of the original copy of that sequence.
the duplication of a T at position c.20 in the sequence AGAAGTAGAGG to AGAAGTTAGAGG
NM_004006.2:c.20dup (NC_000023.10:g.33229410dup)
NOTE: it is not allowed to describe the variant as c.19_20insT (see prioritisation)
NOTE: the recommendation is not to describe the variant as NM_004006.2:c.20dupT, i.e. not to include the duplicated nucleotide sequence. That description is longer, contains redundant information and increases the chance of making an error (e.g. NM_004006.2:c.20dupG).
a duplication from position c.20 to c.23 in the sequence AGAAGTAGAGG to AGAAGTAGATAGAGG
NM_004006.2:c.20_23dup (NC_000023.10:g.33229407_33229410dup)
NOTE: the recommendation is not to describe the variant as c.20_23dupTAGA, i.e. not to include the duplicated nucleotide sequence. That description is longer, contains redundant information and increases the chance of making an error (e.g. c.20_23dupTGGA).
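Whether inserted nucleotides count as a duplication or an insertion can be checked mechanically: place the inserted sequence as far 3’ as possible and test whether it is a copy of the reference nucleotides directly 5’ of it. The sketch below assumes plain integer positions, a pure insertion, and a fragment with enough flanking sequence on both sides. The function name `describe_insertion` is hypothetical, and the fragment start (c.15) is inferred from the cards above.

```python
def describe_insertion(ref_id: str, prefix: str, ref: str, obs: str, first_pos: int) -> str:
    """Describe a pure insertion as 'dup' when possible, otherwise as 'ins'."""
    if len(obs) <= len(ref):
        raise ValueError("observed sequence must be longer than the reference")
    # Strip the shared 5' prefix first; this places the inserted
    # nucleotides as far 3' as the fragment allows.
    p = 0
    while p < len(ref) and ref[p] == obs[p]:
        p += 1
    # Strip the shared 3' suffix from what is left of the reference.
    s = 0
    while s < len(ref) - p and ref[-1 - s] == obs[-1 - s]:
        s += 1
    if p == 0 or p + s != len(ref):
        raise ValueError("not a pure insertion inside the fragment")
    inserted = obs[p:len(obs) - s]
    n = len(inserted)
    if p >= n and ref[p - n:p] == inserted:
        # The inserted sequence copies the n nucleotides directly 5' of it.
        start = first_pos + p - n
        end = first_pos + p - 1
        span = str(start) if n == 1 else f"{start}_{end}"
        return f"{ref_id}:{prefix}.{span}dup"
    left = first_pos + p - 1
    return f"{ref_id}:{prefix}.{left}_{left + 1}ins{inserted}"


print(describe_insertion("NM_004006.2", "c", "AGAAGTAGAGG", "AGAAGTTAGAGG", 15))
print(describe_insertion("NM_004006.2", "c", "AGAAGTAGAGG", "AGAAGTAGATAGAGG", 15))
print(describe_insertion("NM_004006.2", "c", "AGAAGTAGAGG", "AGAAGTCAGAGG", 15))
```

The first two calls give NM_004006.2:c.20dup and NM_004006.2:c.20_23dup, as on the cards. The third call uses a made-up observed sequence in which a C, not a copy of the preceding T, is inserted, and gives NM_004006.2:c.20_21insC.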
a duplication of nucleotides c.260 to c.264+48 (coding DNA reference sequence), crossing an exon/intron border
LRG_199t1:c.260_264+48dup (NC_000023.10:g.32862852_32862904dup)
the duplication of the T nucleotide at the exon/exon border in the sequence ..GAT gta..//..cag TCA.. changing to ..GATT gta..//..cag TCA..
NOTE: according to an exception to the 3’ rule, the variant (NC_000023.10:g.32459297dup) is not described as c.3922dup since this would shift the position of the variant to the next exon (c.3922 linking to g.32456507)
LRG_199t1:c.3921dup
the duplication of the G nucleotide at the exon/intron border in the sequence GAACAGgt…/..agTGCCTT changing to GAACAGggt…/..agTGCCTT (not c.1704dup)
NOTE: this description does not depend on the effect observed on RNA level, giving either altered splicing or r.1704dup
LRG_199t1:c.1704+1dup
the duplication of the G nucleotide at the intron/exon border in the sequence CTGGCCgt…/..agGTTTTA changing to CTGGCCgt…/..agGGTTTTA (not c.1813-1dup)
NOTE: this description does not depend on the effect observed on RNA level, giving either altered splicing or r.1813dup
LRG_199t1:c.1813dup