Sequence Analysis

 

Log in to the computer using the same login and password as last week.  Open BioEdit, click on the File menu, and Open.  Open your edited forward sequence file from last week.

 

BLAST search

 

BLAST (Basic Local Alignment Search Tool) is a way of comparing a nucleic acid or protein sequence to a database of known sequences.  It works by dividing a sequence up into short “words” and comparing these with known sequences.  If a match is found, the alignment is extended in both directions for as long as the sequence matches.  The best matches are then output as a result.
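
The word-matching and extension steps described above can be illustrated with a short Python sketch.  This is a simplified teaching example, not the actual BLAST algorithm: it assumes exact word matches and extends only while bases match exactly, whereas the real program scores mismatches and allows gaps.  The function names and the word size of 4 are illustrative choices only.

```python
def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where a short word matches exactly."""
    # Index every word of the subject sequence by its position(s).
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    # Look up every word of the query in that index.
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, word_size=4):
    """Extend a word match in both directions for as long as the bases match."""
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + word_size, j + word_size
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    # (start in query, start in subject, length of the matching stretch)
    return start_q, start_s, end_q - start_q


def best_ungapped_match(query, subject, word_size=4):
    """Return the longest exact match found by seeding and extending, or None."""
    best = None
    for i, j in find_seeds(query, subject, word_size):
        hit = extend_seed(query, subject, i, j, word_size)
        if best is None or hit[2] > best[2]:
            best = hit
    return best


print(best_ungapped_match("TTGACCGTAAGG", "AAAGACCGTAAGCC"))
```

Here the shared stretch GACCGTAAG is found from any of its 4-base words and then extended until the flanking bases no longer match.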

 

BioEdit does BLAST searches using a program that allows internet-based searches of the NCBI databases.  For further information, see the references in the library or the BioEdit help files.

 

Select your sequence by clicking on the name.  Click on the accessory application menu, BLAST, NCBI BLAST over the internet.  Make sure that the following are selected:

 

Program: blastn (nucleic acid sequences)

Database: nr (non-redundant)

Output: HTML

Select “Filter query for low complexity regions” (this screens out repetitive sequences found in many unrelated genes)

Select “Perform gapped alignment” (this allows gaps when matching sequences)

 

Click on Search to send the file to NCBI.  Results may take some time to arrive, depending on how busy the NCBI servers are.  You can work on other analyses while you are waiting.

 

Results will be returned as an HTML document listing the best matches for your sequence.  (If the best matches that show up are vector sequences, get an edited sequence from another group.)  There will be a list of links to the various sequences, and below it a series of alignments with the best matching sequences.  The results can be saved to a disk or to your directory as an HTML file.  Click on the link to the best matching sequence, and save this file as well.

 

For your report, indicate the region of your edited sequence that aligns with the best matching sequence, the percent sequence identity (the percentage of aligned positions that match), the species and gene that produced the best match, and any features of the matching sequence that are contained in your edited sequence.
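
The percent identity reported with each alignment can be checked by hand.  The following Python sketch assumes that identity is counted as matching positions divided by the total alignment length, with gap columns (written as "-") counted in the length; the function name and example sequences are made up for illustration.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment columns in which the two sequences carry the same base."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both bases agree and neither is a gap.
    matches = sum(
        1 for a, b in zip(aligned_query, aligned_subject) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Ten alignment columns: one gap and one mismatch, so 8 of 10 positions match.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```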

 

Search for open reading frames

 

BioEdit will search a sequence for open reading frames (ORFs) in 6 reading frames (3 in each direction).  (ORFs are sequences that begin with a start codon and end with a stop codon in the same reading frame, with a minimum distance between them.  They may potentially be genes, depending on whether they are transcribed or not.)

 

Make sure that your sequence is selected.  (The name should be selected.)  Click on the Sequence menu, Nucleic acid, Sorted six-frame translation.  (This will sort ORFs in the order they occur in your sequence.)  Leave Minimum ORF size at 50, Maximum ORF size blank (for unlimited length), and select ATG as the start codon.  Click on the Translate button.
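
To see what a six-frame ORF search involves, here is a minimal Python sketch.  It is not BioEdit's implementation: the function names are illustrative, the minimum size is assumed to be counted in codons (excluding the stop codon), and positions on the reverse strand are given relative to the reverse complement rather than to the original sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=50, start_codon="ATG"):
    """List ORFs as (strand, frame, start, end, codons) in all six reading frames.

    start and end are 0-based positions on the strand being read (end is
    exclusive and includes the stop codon); codons excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == start_codon:
                    start = pos  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame + 1, start, pos + 3, n_codons))
                    start = None  # look for the next ORF in this frame
    return orfs


# A tiny example with a low minimum size so that the short ORF is reported.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))
```

The example sequence contains ATG AAA TTT GGG TAA in the third forward frame, which is reported as a 4-codon ORF; with the default minimum of 50 nothing would be reported.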

 

The results obtained can be saved to disk as a text file under a new name.  For your report, note the locations of the beginning and end of the ORF(s) found, the direction and reading frame of the ORF(s) on your sequence, and the size of the predicted protein.

 

You may also look at the hydrophobicity profile of your ORF(s).  Open the saved protein sequence file, and click on Sequence menu, Protein, Kyte & Doolittle mean hydrophobicity profile.  Unfortunately, you will not be able to print this, as the printer in the PC lab is not working.  This will therefore not be required in your report.
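
For reference, a mean hydrophobicity profile of this kind is a sliding-window average of per-residue values.  The Python sketch below uses the published Kyte & Doolittle hydropathy values; the window size of 9 and the function name are assumptions for illustration and may differ from the settings BioEdit uses.

```python
# Kyte & Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(protein, window=9):
    """Return the mean hydropathy of each full window along the protein."""
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    profile = []
    for i in range(len(values) - window + 1):
        profile.append(sum(values[i:i + window]) / window)
    return profile


# A hydrophobic stretch followed by a charged stretch: the profile falls steadily.
for position, mean in enumerate(hydrophobicity_profile("LLLLLLLLLKKKKKKKKK"), start=1):
    print(position, round(mean, 2))
```

Plotting the returned values against window position gives the same kind of curve as the profile displayed by the program.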

 

PCR primer design

 

Open your edited forward sequence file.  Click on the sequence name to select it, and click on Copy in the Edit menu.  Click on the World Wide Web menu, Primer3 Web-based PCR primer detection.  Paste the sequence into the form provided.  Check the global parameters; you may use the defaults or change them.  (If you change the defaults, note the changes in your report.)  An explanation of the parameters is located below the form.  Click on “Pick Primers”.

 

Select a pair of primers.  Record the Tm and % GC for each primer, and note whether there is any complementarity between the two primers.  Complementarity could cause primer dimers, especially if it is present at the 3' ends of the primers.  Note the position of the primers on your sequence, and any features of your sequence that could be amplified with the primer pair chosen.
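
The values recorded above can be sanity-checked with a few lines of Python.  This sketch is only a rough check under stated assumptions: the Tm uses the simple 2 + 4 rule (2 degrees for each A or T, 4 degrees for each G or C), which is a crude estimate for short primers and will not agree exactly with the Tm that Primer3 reports; the function names, the 4-base 3' window, and the example primers are made up for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_percent(primer):
    """Percentage of bases in the primer that are G or C."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def rough_tm(primer):
    """Approximate Tm in degrees C by the 2 + 4 rule (an estimate only)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def three_prime_complementary(primer_a, primer_b, n=4):
    """True if the last n bases of primer_a can base-pair somewhere on primer_b."""
    tail = primer_a.upper()[-n:]
    # The tail pairs with primer_b wherever primer_b contains its reverse complement.
    return reverse_complement(tail) in primer_b.upper()


forward = "AAAAGGCC"  # hypothetical primers, far shorter than real ones
reverse = "TTTTGGCC"
print(gc_percent(forward), rough_tm(forward))
print(three_prime_complementary(forward, reverse))
```

In this example the 3' ends of both primers are GGCC, which is its own reverse complement, so the pair would be flagged as likely to form primer dimers.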

 

Lab report:

 

See the notes above for the content of the report.  Format guidelines are on the website.  The report is due in the final lab period for your group (next week).