Identification of expressed regions in cloned DNA

– identification of genes within DNA sequence

– identification of introns & exons within genes

Northern blots – label fragments of cloned DNA

– use to probe Northern blot

– only fragments containing exons hybridize to mRNA on the blot

Southern blots – digest DNA clone, run on gel & blot

– probe with cDNA for gene of interest

– only DNA fragments containing exons hybridize
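The Southern blot logic above can be sketched in code. This is a minimal illustration only, assuming a linear clone whose restriction cut positions and exon coordinates are already known. The function names and all numbers are hypothetical.

```python
def digest(clone_length, cut_sites):
    """Return fragments as (start, end) pairs, 0-based and end-exclusive.

    Assumes a linear clone; cut_sites are positions where the enzyme cuts.
    """
    bounds = [0] + sorted(set(cut_sites)) + [clone_length]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def hybridizing_fragments(fragments, exons):
    """Return the fragments that overlap at least one exon.

    These are the fragments expected to bind a cDNA probe on the blot.
    """
    return [
        (f_start, f_end)
        for f_start, f_end in fragments
        if any(f_start < e_end and e_start < f_end for e_start, e_end in exons)
    ]


# Hypothetical 10 kb clone with three cut sites and two exons.
fragments = digest(10_000, [1_500, 4_000, 7_200])
exons = [(2_000, 2_300), (5_000, 5_450)]
print(hybridizing_fragments(fragments, exons))
# [(1500, 4000), (4000, 7200)]
```

In practice the exon coordinates are the unknown: the blot shows which fragments hybridize, and the exons are inferred from that.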

Problems:

– do not get exact sites of transcription start & end

– or exact intron/exon boundaries

– cDNAs may not contain all of expressed sequence

– 5' ends often missing

– may get alternative (variable) exon splicing

– different mRNAs from same gene

5' mRNA ends – identified by:

5' RACE

S1 nuclease mapping

– mix denatured 5' labelled DNA probe with mRNA & allow to hybridize

– labelled 5' end of template strand must lie within mRNA

– so labelled end is protected, not degraded by S1 nuclease

– S1 nuclease digests single-stranded 3' end of DNA (upstream of transcript start)

– does not digest double-stranded DNA/RNA hybrid

– size remaining DNA on gel (alongside Maxam-Gilbert sequencing ladder)

– length tells where RNA transcript begins

primer extension

– mix denatured 5' labelled DNA (primer) with mRNA

– 3' end of labelled template strand must lie within mRNA (so it can be extended)

– extend DNA with reverse transcriptase

– extension stops at 5' end of mRNA, so 3' end of DNA marks transcript start

– size fragment like S1 mapping
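For both S1 mapping and primer extension, the sized fragment is turned into a start site by the same arithmetic. The sketch below assumes positions are numbered along the coding strand in the direction of transcription, and that the position of the labelled 5' end is known. The function name and numbers are hypothetical.

```python
def transcription_start(label_pos, fragment_len):
    """Return the coordinate of the mRNA 5' end (transcription start).

    label_pos    -- position of the labelled 5' end of the DNA (probe or primer),
                    numbered along the coding strand in the direction of transcription
    fragment_len -- length in nt of the protected (S1) or extended
                    (primer extension) DNA fragment
    """
    if fragment_len < 1:
        raise ValueError("fragment length must be at least 1 nt")
    # The fragment runs from the labelled end back to the start site, inclusive.
    return label_pos - fragment_len + 1


# Hypothetical example: label at position 350, fragment sized at 120 nt.
print(transcription_start(350, 120))  # 231
```

The result is only as accurate as the gel sizing, which is why the fragment is run next to a sequencing ladder.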

exons identified by:

3' RACE

RT-PCR – using primer pairs specific for predicted exons

– PCR product only produced if both exons present in same mRNA

heteroduplex mapping – mix mRNA with denatured DNA

– observe under electron microscope

– introns seen as single-stranded DNA loops

– need lots of mRNA to work

– not practical for rare mRNAs
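The RT-PCR test for predicted exons can be mimicked in silico. This is a rough sketch, assuming exact primer matches and an mRNA written as its DNA (sense strand) sequence. The function names and the short exon sequences are invented for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def rt_pcr_product_size(cdna, forward_primer, reverse_primer):
    """Return the expected product size in bp, or None if no product.

    cdna is the mRNA written as DNA (sense strand, 5'->3').
    The forward primer matches the sense strand; the reverse primer is
    complementary to it, so its reverse complement is searched for.
    Exact matching only; real primers tolerate some mismatches.
    """
    start = cdna.find(forward_primer)
    if start == -1:
        return None
    rev_site = reverse_complement(reverse_primer)
    end = cdna.find(rev_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical exons; primers sit in exon 1 (forward) and exon 2 (reverse).
exon1, exon2, exon3 = "ATGGCCATTG", "TACGGATCCA", "GGCTTAAGTC"
forward = "ATGGCC"
reverse = reverse_complement("GGATCCA")

print(rt_pcr_product_size(exon1 + exon2 + exon3, forward, reverse))  # 20
print(rt_pcr_product_size(exon1 + exon3, forward, reverse))          # None
```

The second call models an mRNA in which exon 2 is skipped: the reverse primer has no site, so no product is expected.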

other techniques for identifying genes:

probing for conserved sequences

– use labelled fragments of cloned DNA

– probe genomic library from different species

– coding regions likely to be conserved

(do not accumulate mutations as fast as non-coding DNA)

problem: may identify false positives

– e.g. pseudogenes

– homologous to genes, but not expressed as functional product

gene expression in vivo

– check for expression of gene product from cloned DNA

– by complementation of mutant

– or screening for protein with specific antisera

– only works if cloned DNA from close relative of host cell

– promoter & splice sites must be recognised by host

gene expression in vitro

– try in vitro transcription & translation

– check with antisera for desired protein product

– low efficiency (unless strong promoters present)

computer analysis of DNA sequence

– DNA analysis programs identify:

open reading frames (ORFs)

– start codon & stop codon in same frame

– with significant distance between them

homology with known genes (BLAST search)

– may be nucleic acid or amino acid homology

conserved functional domains

– e.g. DNA binding or transmembrane domains

Note: not all ORFs will be expressed as genes

– may be pseudogenes

– must be confirmed using experimental techniques
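The ORF criterion above (start codon and stop codon in the same frame, with a significant distance between them) can be sketched as a simple scan. This is a minimal illustration, not how any particular analysis program works. It assumes ATG as the only start codon, scans one strand only, and ignores introns. The `min_codons` threshold and the test sequence are arbitrary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Yield (start, end, frame) for ORFs on the given strand.

    An ORF here runs from an ATG to the first in-frame stop codon and
    must contain at least min_codons codons (not counting the stop).
    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this open stretch
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3, frame)
                start = None  # look for the next ORF in this frame


# Tiny hypothetical sequence: ATG AAA TTT GGG TAA in frame 2.
seq = "CCATGAAATTTGGGTAACC"
print(list(find_orfs(seq, min_codons=3)))  # [(2, 17, 2)]
print(list(find_orfs(seq, min_codons=5)))  # []
```

To cover the other strand, the same scan would be run on the reverse complement. Any ORF found this way is only a candidate and still needs the experimental confirmation noted above.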