== Locate Sequences and Generate FASTA File ==
* The easiest way to find sequences is to start with a seed sequence, then run BLAST searches restricted to RefSeq and the species of interest.
* To find a seed sequence, start with NCBI Gene and find the first RefSeq mRNA (its accession should start with NM). Click on it and find the corresponding protein (its accession should start with NP).
* Paste the protein sequence into your FASTA file (see next section) and name it accordingly.
* Paste that sequence or its NP accession into [https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome NCBI Protein BLAST].
* Set the parameters to:
** Database: Reference Proteins (refseq_protein)
** Organism: Start with mouse (''Mus musculus'') or human (''Homo sapiens''). Depending on your goal, consider adding zebrafish (''Danio rerio''), fruit fly (''Drosophila melanogaster''), chicken (''Gallus gallus'') and nematode (''Caenorhabditis elegans'').
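
The same search settings can also be written down as a script, which helps when the search has to be repeated for several species. The following is a minimal sketch, assuming the parameter names of the NCBI BLAST URL API (<code>CMD</code>, <code>PROGRAM</code>, <code>DATABASE</code>, <code>ENTREZ_QUERY</code>, <code>QUERY</code>). The function names and the accession <code>NP_000000.0</code> are hypothetical placeholders, and the code only builds the request; it does not submit it.

```python
from urllib.parse import urlencode

# Base address of the NCBI BLAST service (same host as the web form above).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_request(query, organism="Mus musculus"):
    """Return request parameters for a blastp search limited to RefSeq
    proteins from one organism. `query` is a protein sequence or an NP
    accession. Parameter names are assumed from the BLAST URL API."""
    return {
        "CMD": "Put",                             # submit a new search
        "PROGRAM": "blastp",                      # protein vs. protein
        "DATABASE": "refseq_protein",             # Reference Proteins
        "ENTREZ_QUERY": f"{organism}[Organism]",  # restrict to one species
        "QUERY": query,
    }


def build_blastp_url(query, organism="Mus musculus"):
    """Encode the parameters into a URL without sending anything."""
    return BLAST_URL + "?" + urlencode(build_blastp_request(query, organism))


if __name__ == "__main__":
    # NP_000000.0 is a placeholder, not a real accession.
    for species in ("Mus musculus", "Homo sapiens", "Danio rerio"):
        print(build_blastp_url("NP_000000.0", species))
```

Keeping one request per species mirrors the advice above to start with mouse or human and add further organisms only as needed.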
=== Generating a FASTA File ===
* Save sequences in a plain-text editor such as Notepad, [https://notepad-plus-plus.org/ Notepad++] or [https://www.sublimetext.com/ Sublime Text] (not Word) as a <FILENAME>.fasta file.
* Sequence names cannot contain spaces. Generally it's better to use a name such as '''mm_Gdf15-NM_004864.4''', where mm indicates mouse, Gdf15 is the gene name and NM indicates a [https://www.ncbi.nlm.nih.gov/refseq/ RefSeq mRNA]. If there are multiple mRNAs for the gene, name them
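
The naming convention and file layout described in this section can be sketched in a few lines of Python. This is an illustrative helper under stated assumptions, not a required tool: the function names <code>make_sequence_name</code> and <code>format_fasta</code> are hypothetical, the 60-character line width is an arbitrary choice, and the sequence in the usage example is a made-up placeholder.

```python
def make_sequence_name(species, gene, accession):
    """Build a name such as mm_Gdf15-NM_004864.4 and reject any whitespace,
    since sequence names cannot contain spaces."""
    name = f"{species}_{gene}-{accession}"
    if any(ch.isspace() for ch in name):
        raise ValueError(f"sequence name contains whitespace: {name!r}")
    return name


def format_fasta(records, width=60):
    """Return FASTA text for (name, sequence) pairs: one '>' header line per
    record, followed by the sequence wrapped to `width` characters."""
    lines = []
    for name, sequence in records:
        lines.append(">" + name)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The sequence below is a placeholder, not a real protein.
    name = make_sequence_name("mm", "Gdf15", "NM_004864.4")
    with open("example.fasta", "w") as handle:
        handle.write(format_fasta([(name, "MPLACEHOLDERSEQ")]))
```

Writing the file from a script avoids the hidden formatting that a word processor can introduce, which is the same reason a plain-text editor is recommended above.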
== Create Multiple Sequence Alignment using CLUSTAL Omega ==