A sequence in FASTA format begins with a single-line description, followed
by lines of sequence data. The description line is distinguished from the
sequence data by a greater-than (">") symbol in the first column. It is
recommended, although not required, that all lines of text be shorter than 80
characters.
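The layout described above can be read with a few lines of code. The Python
sketch below is illustrative only: the function name parse_fasta and the toy
records are assumptions made for this example, not part of any particular
tool. It treats a line with ">" in the first column as a description line and
joins the lines that follow into a single sequence.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted lines."""
    description: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.rstrip()
        if not line.strip():
            continue  # assumption: blank lines are ignored
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            chunks.append(line.strip())
        # assumption: sequence data before the first ">" line is skipped
    if description is not None:
        yield description, "".join(chunks)


# Hypothetical toy input, not taken from a real database.
example = """>seq1 first toy record
ACGT
TTGA
>seq2 second toy record
GGCC
"""
records = list(parse_fasta(example.splitlines()))
```

Because sequence lines are simply concatenated, the line length chosen when
the file was written does not affect the parsed result.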
An example of sequences in FASTA format is:
>gb|AE004091|AE004091:483-2027, PA0001
GTGTCCGTGGAACTTTGGCAGCAGTGCGTGGATCTTCTCCGCGATGAGCTGCCGTCCCAACAATTCAACA
CCTGGATCCGTCCCTTGCAGGTCGAAGCCGAAGGCGACGAATTGCGTGTGTATGCACCCAACCGTTTCGT
CCTCGATTGGGTGAACGAGAAATACCTCGGTCGGCTTCTGGAACTGCTCGGTGAACGCGGCGAGGGTCAG
TTGCCCGCGCTTTCCTTATTAATAGGCAGCAAGCGTAGCCGTACGCCGCGCGCCGCCATCGTCCCATCGC
AGACCCACGTGGCTCCCCCGCCTCCGGTTGCTCCGCCGCCGGCGCCAGTGCAGCCGGTATCGGCCGCGCC
CGTGGTAGTGCCACGTGAAGAGCTGCCGCCAGTGACGACGGCTCCCAGCGTGTCGAGCGATCCCTACGAG
CCGGAAGAACCCAGCATCGATCCGCTGGCCGCCGCCATGCCGGCTGGAGCAGCGCCTGCGGTGCGCACCG
AGCGCAACGTCCAGGTCGAAGGTGCGCTGAAGCACACCAGCTATCTCAACCGTACCTTCACCTTCGAGAA
CTTCGTCGAGGGCAAGTCCAACCAGTTGGCCCGCGCCGCCGCCTGGCAGGTGGCGGACAACCTCAAGCAC
GGCTACAACCCGCTGTTCCTCTACGGTGGCGTCGGTCTGGGCAAGACCCACCTGATGCATGCGGTGGGCA
ACCACCTGCTGAAGAAGAACCCGAACGCCAAGGTGGTCTACCTGCATTCGGAACGTTTCGTCGCGGACAT
GGTGAAGGCCTTGCAGCTCAACGCCATCAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCACTGTTG
ATCGACGACATCCAGTTCTTCGCCCGTAAGGAGCGCTCCCAGGAGGAGTTCTTCCACACCTTCAATGCCC
TTCTCGAAGGCGGCCAGCAGGTGATCCTCACCAGCGACCGCTATCCGAAGGAAATCGAAGGCCTGGAAGA
GCGGCTGAAATCCCGCTTCGGCTGGGGCCTGACGGTGGCCGTCGAGCCGCCGGAACTGGAAACCCGGGTG
GCGATCCTGATGAAGAAGGCCGAGCAGGCGAAGATCGAGCTGCCGCACGATGCGGCCTTCTTCATCGCCC
AGCGCATCCGTTCCAACGTGCGTGAACTGGAAGGTGCGCTGAAGCGGGTGATCGCCCACTCGCACTTCAT
GGGCCGGCCGATCACCATCGAGCTGATTCGCGAGTCGCTGAAGGACCTGTTGGCCCTTCAGGACAAGCTG
GTCAGCATCGACAACATCCAGCGCACCGTCGCCGAGTACTACAAGATCAAGATATCCGATCTGTTGTCCA
AGCGGCGTTCGCGCTCGGTGGCGCGCCCGCGCCAGGTGGCCATGGCGCTCTCCAAGGAGCTGACCAACCA
CAGCCTGCCGGAGATCGGCGTGGCCTTCGGCGGTCGGGATCACACCACGGTGTTGCACGCCTGTCGTAAG
ATCGCTCAACTTAGGGAATCCGACGCGGATATCCGCGAGGACTACAAGAACCTGCTGCGTACCCTGACAA
CCTGA
>gb|AE004091|AE004091:2056-3159, PA0002
ATGCATTTCACCATTCAACGCGAAGCCCTGTTGAAACCGCTGCAACTGGTCGCCGGCGTCGTGGAACGCC
GCCAGACATTGCCGGTTCTCTCCAACGTCCTGCTGGTGGTCGAAGGCCAGCAACTGTCGCTGACCGGCAC
CGACCTCGAAGTCGAGCTGGTTGGTCGCGTGGTACTGGAAGATGCCGCCGAACCCGGCGAGATCACCGTA
CCGGCGCGCAAGCTGATGGACATCTGCAAGAGCCTGCCGAACGACGTGCTGATCGACATCCGTGTCGAAG
AGCAGAAACTTCTGGTGAAGGCCGGGCGTAGCCGCTTCACCCTGTCCACCCTGCCGGCCAACGATTTCCC
CACCGTAGAGGAAGGTCCCGGCTCGCTGAACTTCAGCATTGCCCAGAGCAAGCTGCGTCGCCTGATCGAC
CGCACCAGCTTCGCCATGGCCCAGCAGGACGTGCGTTACTACCTCAACGGCATGCTGCTGGAAGTGAACG
GCGGCACCCTGCGCTCCGTCGCCACCGACGGCCACCGACTGGCCATGTGCTCGCTGGATGCGCAGATCCC
GTCGCAGGACCGCCACCAGGTGATCGTGCCGCGCAAAGGCATCCTCGAACTGGCTCGTCTGCTCACCGAG
CAGGACGGCGAAGTCGGCATCGTCCTGGGCCAGCACCATATCCGTGCCACCACTGGCGAATTCACCTTCA
CTTCGAAGCTGGTGGACGGCAAGTTCCCGGACTACGAGCGTGTACTGCCGCGCGGTGGCGACAAGCTGGT
GGTCGGTGACCGCCAGCAACTGCGCGAAGCCTTCAGCCGTACCGCGATCCTCTCCAACGAGAAGTACCGC
GGCATTCGCCTGCAGCTTTCCAACGGTTTGCTGAAAATCCAGGCGAACAACCCGGAGCAGGAAGAGGCCG
AGGAAGAAGTGCAGGTCGAGTACAACGGCGGCAACCTGGAGATAGGCTTCAACGTCAGTTACCTGCTCGA
CGTGCTGGGTGTGATCGGTACCGAGCAGGTCCGCTTCATCCTTTCCGATTCCAACAGCAGCGCCCTGGTC
CACGAGGCCGACAATGACGATTCTGCCTATGTCGTCATGCCGATGCGCCTCTAA
>gb|AE004091|AE004091:3169-4278, PA0003
ATGTCCCTGACCCGCGTTTCGGTCACCGCGGTGCGCAACCTGCACCCGGTGACCCTCTCCCCCTCCCCCC
GCATCAACATCCTCTACGGCGACAACGGCAGCGGCAAGACCAGCGTGCTCGAAGCCATCCACCTGCTGGG
CCTGGCGCGTTCATTCCGCAGTGCGCGCTTGCAGCCGGTGATCCAGTATGAGGAAGCGGCCTGCACCGTA
TTCGGCCAGGTGATGTTGGCCAACGGCATCGCCAGCAACCTGGGGATTTCCCGTGAGCGCCAGGGCGAGT
TCACCATCCGCATCGATGGGCAGAACGCCCGGAGTGCGGCTCAATTGGCGGAAACTCTCCCACTGCAACT
GATCAACCCGGACAGCTTTCGGTTGCTCGAGGGAGCGCCGAAGATCCGGCGACAGTTCCTCGATTGGGGA
GTGTTCCACGTGGAACCTCGGTTTCTGCCCGTCTGGCAGCGCCTGCAGAAGGCGCTGCGCCAGCGGAACT
CCTGGCTCCGGCATGGTAAACTGGACCCCGCGTCGCAAGCGGCCTGGGACCGGGAATTGAGCCTGGCCAG
CGATGAGATCGATGCCTACCGCAGAAGCTATATCCAGGCGTTGAAACCGGTATTCGAGGAAACACTCGCC
GAATTGGTTTCACTGGATGACCTGACCCTTAGCTACTACCGAGGCTGGGACAAGGACCGGGACCTCCTGG
AGGTTCTGGCTTCCAGCCTGTTGCGCGACCAGCAGATGGGCCACACCCAGGCGGGACCGCAGCGTGCGGA
TCTTCGCATACGGTTGGCAGGTCATAACGCCGCGGAGATTCTCTCGCGCGGTCAGCAGAAGCTGGTGGTA
TGCGCCCTGCGCATCGCCCAAGGCCATCTGATCAATCGCGCCAAGCGCGGACAGTGCGTCTACCTGGTGG
ACGACCTGCCCTCGGAACTGGATGAGCAGCATCGAATGGCTCTTTGCCGCTTGCTTGAAGATTTGGGTTG
CCAGGTATTCATCACCTGCGTGGACCCGCAACTATTGAAAGACGGCTGGCGCACGGATACGCCGGTATCC
ATGTTCCACGTGGAACATGGAAAAGTCTCTCAGACCACGACCATCGGGAGTGAAGCATGA

Sequences are expected to be represented in the standard IUB/IUPAC amino acid
and nucleic acid codes. Lower-case letters are accepted and are mapped into
upper-case.
A --> adenosine
C --> cytidine
G --> guanosine
T --> thymidine
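
The case mapping and code check described above can be sketched in Python as
follows. This is a minimal illustration, not the behaviour of any specific
program: the names NUCLEOTIDE_CODES and normalize_sequence are hypothetical,
and the allowed set is assumed to contain only the four codes listed above, so
it would need extending for any further IUB/IUPAC codes.

```python
# Assumption: only the four nucleotide codes listed above are allowed.
NUCLEOTIDE_CODES = frozenset("ACGT")


def normalize_sequence(seq: str, allowed: frozenset = NUCLEOTIDE_CODES) -> str:
    """Map lower-case letters to upper-case and reject unexpected codes."""
    upper = seq.upper()
    unexpected = sorted(set(upper) - allowed)
    if unexpected:
        raise ValueError("unexpected sequence codes: " + "".join(unexpected))
    return upper


normalized = normalize_sequence("atgcATGC")
```

Raising an error on an unknown code is one possible design choice; a program
could equally skip or replace such characters.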