Identification of expressed regions in cloned DNA
– identification of genes within DNA sequence
– identification of introns & exons within genes
Northern blots – label fragments of cloned DNA
– use to probe Northern blot
– only fragments containing exons hybridize
Southern blots – digest DNA clone, gel & blot
– probe with cDNA for gene of interest
– only DNA fragments containing exons hybridize
Problems:
– do not get exact sites of transcription start & end
– or exact intron/exon boundaries
– cDNAs may not contain all of expressed sequence
– 5' ends often missing
– may get variable exon splicing
– different mRNAs from same gene
5' mRNA ends – identified by:
5' RACE
S1 nuclease mapping
– mix denatured 5' labelled DNA with mRNA
– labelled 5' end of template strand must lie within region covered by mRNA
– so label is protected from S1 nuclease
– S1 nuclease digests single-stranded 3' end of DNA (upstream of transcript)
– does not digest double-stranded DNA/RNA hybrid
– size remaining DNA on gel alongside Maxam-Gilbert sequencing ladder
– length of protected DNA tells where RNA transcript begins
primer extension
– mix denatured 5' labelled DNA (primer) with mRNA
– 3' end of primer must be base-paired within mRNA
– extend DNA with reverse transcriptase
– extension stops at 5' end of mRNA, so 3' end of DNA product marks it
– size fragment like S1 mapping
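
The sizing step shared by S1 mapping and primer extension comes down to simple arithmetic, sketched below. The function name transcript_start and the coordinate convention (1-based positions numbered along the sense strand) are assumptions made for illustration, not part of either protocol.

```python
def transcript_start(label_pos: int, fragment_len: int) -> int:
    """Estimate the sense-strand coordinate of the mRNA 5' end.

    label_pos    -- 1-based coordinate of the labelled 5' end of the DNA
                    (probe or primer), which must lie within the mRNA
    fragment_len -- length in nt of the protected (S1) or extended
                    (primer extension) DNA, read from the gel
    """
    if fragment_len < 1 or fragment_len > label_pos:
        raise ValueError("fragment length is inconsistent with label position")
    # The sized DNA spans from the transcript start up to the label, inclusive.
    return label_pos - fragment_len + 1


# Hypothetical numbers: label at position 500, 120 nt fragment on the gel.
print(transcript_start(500, 120))
```

The accuracy of the result depends on how well the fragment can be sized, which is why it is run next to a sequencing ladder.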
exons identified by:
3' RACE
RT-PCR – using primer pairs specific for predicted exons
– PCR product only produced if exon in mRNA
heteroduplex mapping – mix mRNA with denatured DNA
– observe under electron microscope
– introns seen as single-stranded DNA loops
– need lots of mRNA to work
– not practical for rare mRNAs
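
The RT-PCR logic above (a product appears only if the predicted exon is present in the mRNA) can be mimicked in silico. This is a minimal sketch that assumes exact primer matches on a cDNA given as a sense-strand string; the function names and the toy exon sequences are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def rt_pcr_product_length(cdna: str, fwd_primer: str, rev_primer: str):
    """Return the expected product length, or None if either primer site is absent."""
    cdna = cdna.upper()
    start = cdna.find(fwd_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the sense strand at its reverse complement.
    rev_site = reverse_complement(rev_primer)
    site = cdna.find(rev_site, start + len(fwd_primer))
    if site == -1:
        return None
    return site + len(rev_site) - start


# Toy exons (made up for illustration).
exon1, exon2, exon3 = "ATGGCCATTG", "TACCGGATAA", "GGCTTAACTGA"
fwd = "ATGGCC"                       # lies in exon 1
rev = reverse_complement("CGGATAA")  # targets predicted exon 2

print(rt_pcr_product_length(exon1 + exon2 + exon3, fwd, rev))  # exon 2 included
print(rt_pcr_product_length(exon1 + exon3, fwd, rev))          # exon 2 skipped
```

With the exon included the function reports a product length; with the exon skipped it reports None, matching the "no exon, no product" rule.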
other techniques for identifying genes:
probing for conserved sequences
– use labelled fragments of cloned DNA
– probe genomic library from different species
– coding regions likely to be conserved
(do not accumulate mutations as fast as non-coding regions)
problem: may identify false positives
– e.g. pseudogenes
– homologous to genes, but not transcribed
gene expression in vivo
– check for expression of gene product from cloned DNA
– by complementation of mutant
– or screening for protein with specific antisera
– only works if cloned DNA is from a close relative of host cell
– promoter & exon splice sites must be recognised by host
gene expression in vitro
– try in vitro transcription & translation
– check with antisera for desired protein product
– low efficiency (unless strong promoters present)
computer analysis of DNA sequence
– DNA analysis programs identify:
open reading frames (ORFs)
– start codon & stop codon in same frame
– with significant distance between them
homology with known genes (BLAST search)
– may be nucleic acid or amino acid homology
conserved functional domains
– e.g. DNA binding or transmembrane domains
Note: not all ORFs will be expressed as genes
– may be pseudogenes
– must be confirmed using experimental techniques
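
The ORF definition above (start codon and stop codon in the same frame, with significant distance between them) can be sketched as a simple scan. This is an illustrative sketch only: it assumes ATG as the sole start codon, the standard three stop codons, and a single strand; the name find_orfs and the min_codons threshold are hypothetical choices, and real DNA analysis programs do considerably more.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 100):
    """Find ATG...stop ORFs in the three forward reading frames.

    Returns a list of (start, end, frame) tuples: start is the 0-based
    index of the A in ATG, end is exclusive and includes the stop codon.
    Only ORFs with at least min_codons codons before the stop are kept.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence: one short ORF in frame 2 (threshold lowered for the demo).
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_codons=5))
```

To cover the other strand, the same function can be run on the reverse complement of the sequence. As noted above, any ORF found this way is only a candidate until confirmed experimentally.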