Biology WorkBench 3.2 – CLUSTALW

Human DNA mismatch repair (hmlh1) mRNA, complete cds_





Fasta label (*)    Workbench label
GENPEPT:3880333    Caenorhabditis elegans cosmid T28A8, complete sequence_
GENPEPT:825572     S.cerevisiae chromosome XIII cosmid 8520_
GENPEPT:3192877    Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_
GENPEPT:1724118    Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
GENPEPT:7595954    Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
GENPEPT:466462     Human DNA mismatch repair (hmlh1) mRNA, complete cds_

(*) Clustal W truncates Fasta labels at the first space (e.g. “>abc def” becomes “>abc”).
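
The truncation rule in the note above can be sketched in a few lines of Python. The first function follows the note directly. The second also replaces the colon with an underscore, which mirrors how the labels appear in the alignment on this page (GENPEPT:466462 is shown as GENPEPT_466462); how the WorkBench actually performs that step is an assumption, and both function names are hypothetical.

```python
def truncate_fasta_label(header: str) -> str:
    """Keep only the text before the first space, as the note describes."""
    return header.split(" ", 1)[0]


def alignment_label(header: str) -> str:
    """Assumed post-processing: drop the leading '>' and turn ':' into '_'.

    This reproduces the label form seen in the alignment on this page;
    it is not taken from the Clustal W or WorkBench source code.
    """
    return truncate_fasta_label(header).lstrip(">").replace(":", "_")


if __name__ == "__main__":
    print(truncate_fasta_label(">abc def"))
    print(alignment_label(">GENPEPT:466462 Human DNA mismatch repair (hmlh1) mRNA"))
```

Because everything after the first space is discarded, two records that share the same first word would end up with identical labels, so it is worth checking for duplicates before submitting a file.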


Sequence alignment

Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
  - no consensus
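
As an illustration of the key above, the following Python sketch assigns a consensus symbol to one alignment column. The strong and weak group lists are reproduced from memory of the Clustal W documentation (they are not given on this page) and should be checked against the documentation for the version in use. Treating any column that contains a gap as "no consensus" is also an assumption, although it agrees with the columns of the alignment on this page that were spot-checked.

```python
# Residue groups as recalled from the Clustal W documentation (assumption:
# verify against the documentation for your version before relying on them).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one column of a protein alignment."""
    residues = set(column.upper())
    if "-" in residues:
        return " "                      # gapped column: no consensus
    if len(residues) == 1:
        return "*"                      # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"                      # all residues fall in one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."                      # all residues fall in one weak group
    return " "


if __name__ == "__main__":
    # Columns read top to bottom from the final block of the alignment.
    for column in ["PPPPPP", "RRRKKK", "PPPPAP", "SSSKRR"]:
        print(column, repr(consensus_symbol(column)))
```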


CLUSTAL W (1.81) multiple sequence alignment


GENPEPT_7595954      -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_1724118      -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
GENPEPT_466462       -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_3192877      ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
GENPEPT_825572       --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
GENPEPT_3880333      MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
                                             *: *   ***::****::  * **:**: **.:** 

GENPEPT_7595954      STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
GENPEPT_1724118      STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
GENPEPT_466462       STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
GENPEPT_3192877      STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
GENPEPT_825572       ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
GENPEPT_3880333      ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
                     :* * : :: **:*::*: ***.**.: *: ::****:****  ****  : *:******

GENPEPT_7595954      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
GENPEPT_1724118      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
GENPEPT_466462       LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
GENPEPT_3192877      LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
GENPEPT_825572       LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
GENPEPT_3880333      LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
                     ***:****::.: :*  . :*.::..: :**:   .** **::** *   ***:*:  * 

GENPEPT_7595954      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_1724118      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_466462       KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSVFGNA
GENPEPT_3192877      QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
GENPEPT_825572       RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
GENPEPT_3880333      NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
                     . : .  :*   : :.: *:::*   :.*: ::  :    . .    .  : :  : .  

GENPEPT_7595954      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_1724118      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_466462       VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESTS
GENPEPT_3192877      ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQM----------LLFINQRLVESTA
GENPEPT_825572       VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSISP---------IFFINNRLVTCDL
GENPEPT_3880333      VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
                     ::  :: .   .     :      : :     . .             .*** * * .  

GENPEPT_7595954      LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
GENPEPT_1724118      LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_466462       LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_3192877      LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
GENPEPT_825572       LRRALNSVYSNYLPKGNRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
GENPEPT_3880333      LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
                     *: .:: : .    :    *  : : :    :********..* ** ::.*:: :   ..

GENPEPT_7595954      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
GENPEPT_1724118      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
GENPEPT_466462       SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
GENPEPT_3192877      ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
GENPEPT_825572       AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
GENPEPT_3880333      KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
                       :      .    .      .                    . .              :

GENPEPT_7595954      TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
GENPEPT_1724118      TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
GENPEPT_466462       TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
GENPEPT_3192877      TDSTEQKLDKFLAPLVK-------------------------------------------
GENPEPT_825572       AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
GENPEPT_3880333      SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
                     :.    *                                                     

GENPEPT_7595954      SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
GENPEPT_1724118      SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
GENPEPT_466462       NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
GENPEPT_3192877      ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
GENPEPT_825572       NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
GENPEPT_3880333      STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
                                                   :  .                         .

GENPEPT_7595954      NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_1724118      NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_466462       NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_3192877      RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_825572       NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
GENPEPT_3880333      HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
                     .: *:  :::.:       * . : .  :** :: .   .  *.   *:  :  ..  *:

GENPEPT_7595954      FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
GENPEPT_1724118      FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_466462       FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_3192877      FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-----DGAADI
GENPEPT_825572       FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
GENPEPT_3880333      FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
                     ***  : .* * .   :        : :  * .       :                 . 

GENPEPT_7595954      LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_1724118      LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_466462       LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
GENPEPT_3192877      LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
GENPEPT_825572       IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
GENPEPT_3880333      LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
                     : . : :: :*:.:.: :           :  :* *:. . * :  **. :  *  :*::

GENPEPT_7595954      DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
GENPEPT_1724118      DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
GENPEPT_466462       DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
GENPEPT_3192877      EQETRCFETFCRETARFY--------------AQLDWREGATAVFSRWT--MEHVLFPAF
GENPEPT_825572       EDEQECLDGILREIALLYIPDMVPKVDTSDASLSEDEKAQFINRKEHISSLLEHVLFPCI
GENPEPT_3880333      DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
                     ::* . :  : :  . ::                              .  ::.::   .

GENPEPT_7595954      RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_1724118      RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
GENPEPT_466462       RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
GENPEPT_3192877      KKYLLPPR---IKDQIYELTNLPTLYKVFERC--
GENPEPT_825572       KRRFLAPRHILKD--VVEIANLPDLYKVFERC--
GENPEPT_3880333      KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
                     :  ::.*.       : ::::   * ******  
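
The alignment above is written in blocks, so each sequence has to be stitched back together before it can be analysed. The sketch below is a minimal parser for this layout plus a simple identity count; it assumes that sequence lines start with the label in the first column and that consensus lines start with whitespace, which holds for the output on this page. The function names are hypothetical, and the sample is only the first three rows of the final block.

```python
SAMPLE = """CLUSTAL W (1.81) multiple sequence alignment

GENPEPT_7595954      RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_1724118      RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
GENPEPT_466462       RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
"""


def parse_clustal(text: str) -> dict:
    """Concatenate the per-block segments of each sequence."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header line and consensus lines.
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        name, segment = line.split()[:2]
        sequences[name] = sequences.get(name, "") + segment
    return sequences


def identity(a: str, b: str) -> tuple:
    """Return (identical columns, compared columns), ignoring gapped columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x == y for x, y in pairs), len(pairs)


if __name__ == "__main__":
    seqs = parse_clustal(SAMPLE)
    same, total = identity(seqs["GENPEPT_7595954"], seqs["GENPEPT_1724118"])
    print(f"{same}/{total} identical columns")
```

For the two rodent rows in this excerpt the count is 31 identical columns out of 32 compared; the single difference is the Y/C column near the C-terminus.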




Clustal W dendrogram


[Figure: unrooted tree (generated by Phylip’s Drawtree)]


Phylip-format dendrogram

(
GENPEPT_466462:0.05763,
(
GENPEPT_7595954:0.03521,
GENPEPT_1724118:0.04802)
:0.02510,
(
GENPEPT_3192877:0.24567,
(
GENPEPT_825572:0.33043,
GENPEPT_3880333:0.37568)
:0.04833)
:0.19089);
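
The dendrogram above is in Newick notation: nested parentheses give the branching order and the number after each colon is a branch length. As a sketch of how to read it, the Python below parses the string and sums branch lengths along the path between two leaves. The parser is deliberately minimal (no quoted labels, comments or bootstrap values) and the function names are hypothetical.

```python
NEWICK = """(GENPEPT_466462:0.05763,(GENPEPT_7595954:0.03521,
GENPEPT_1724118:0.04802):0.02510,(GENPEPT_3192877:0.24567,
(GENPEPT_825572:0.33043,GENPEPT_3880333:0.37568):0.04833):0.19089);"""


def parse_newick(text: str) -> dict:
    """Parse a simple Newick string into nested dicts."""
    s = "".join(text.split()).rstrip(";")
    pos = 0

    def node() -> dict:
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1                      # consume "("
            children.append(node())
            while s[pos] == ",":
                pos += 1                  # consume ","
                children.append(node())
            pos += 1                      # consume ")"
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return {"name": name, "length": length, "children": children}

    return node()


def root_paths(tree: dict, prefix: tuple = ()) -> dict:
    """Map each leaf name to its (node id, branch length) path from the root."""
    path = prefix + ((id(tree), tree["length"]),)
    if not tree["children"]:
        return {tree["name"]: path}
    paths = {}
    for child in tree["children"]:
        paths.update(root_paths(child, path))
    return paths


def tree_distance(tree: dict, a: str, b: str) -> float:
    """Sum the branch lengths on the path between leaves a and b."""
    paths = root_paths(tree)
    pa, pb = paths[a], paths[b]
    shared = 0
    while shared < min(len(pa), len(pb)) and pa[shared][0] == pb[shared][0]:
        shared += 1
    return sum(l for _, l in pa[shared:]) + sum(l for _, l in pb[shared:])


if __name__ == "__main__":
    tree = parse_newick(NEWICK)
    print(tree_distance(tree, "GENPEPT_7595954", "GENPEPT_1724118"))
```

For the mouse and rat sequences the path is just the two terminal branches, 0.03521 + 0.04802 = 0.08323 (up to floating-point rounding when printed).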



Clustal W options and diagnostic messages

Alignment type: Protein                 Alignment order: aligned                

                    Pairwise alignment parameters

Method: accurate                        
Matrix: Gonnet                          
Gap open penalty: 10.00                 Gap extension penalty: 0.10             

                    Multiple alignment parameters

Matrix: Gonnet                          Negative matrix?: no                    
Gap open penalty: 10.00                 Gap extension penalty: 0.20             
% identity for delay: 30                Residue-specific gap penalties: on      
Penalize end gaps: on                   Hydrophilic gap penalties: on           
Gap separation distance: 0              Hydrophilic residues: GPSNDQEKR         
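
For readers who want to repeat the run outside the WorkBench, the sketch below assembles a command line that carries the same parameter values as the listing above. It only builds the argument list and does not execute anything. The flag names are the ones I recall from the Clustal W 1.8x command-line help and should be confirmed against the help output of the installed version; the executable name `clustalw` and the input file name are assumptions. Options shown as "on" in the listing (end-gap, residue-specific and hydrophilic penalties) are left at whatever the program defaults to rather than guessed.

```python
def clustalw_command(infile: str, executable: str = "clustalw") -> list:
    """Build an argument list mirroring the parameters reported on this page.

    Flag spellings are assumed from the Clustal W 1.8x help text;
    check them against your own installation.
    """
    return [
        executable,
        f"-INFILE={infile}",
        "-ALIGN",
        "-TYPE=PROTEIN",          # Alignment type: Protein
        "-OUTORDER=ALIGNED",      # Alignment order: aligned
        "-PWMATRIX=GONNET",       # pairwise matrix
        "-PWGAPOPEN=10.00",       # pairwise gap open penalty
        "-PWGAPEXT=0.10",         # pairwise gap extension penalty
        "-MATRIX=GONNET",         # multiple alignment matrix
        "-GAPOPEN=10.00",         # multiple alignment gap open penalty
        "-GAPEXT=0.20",           # multiple alignment gap extension penalty
        "-MAXDIV=30",             # % identity for delay
        "-GAPDIST=0",             # gap separation distance
    ]


if __name__ == "__main__":
    # "mlh1.fasta" is a hypothetical input file name.
    print(" ".join(clustalw_command("mlh1.fasta")))
```

The "accurate" pairwise method in the listing corresponds to leaving out the fast approximate option, which is why no extra flag appears for it here.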




 CLUSTAL W (1.81) Multiple Sequence Alignments



Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: GENPEPT_466462       756 aa
Sequence 2: GENPEPT_7595954      760 aa
Sequence 3: GENPEPT_1724118      757 aa
Sequence 4: GENPEPT_3192877      663 aa
Sequence 5: GENPEPT_825572       769 aa
Sequence 6: GENPEPT_3880333      779 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score:  88
Sequences (1:3) Aligned. Score:  86
Sequences (1:4) Aligned. Score:  51
Sequences (1:5) Aligned. Score:  36
Sequences (1:6) Aligned. Score:  32
Sequences (2:3) Aligned. Score:  91
Sequences (2:4) Aligned. Score:  50
Sequences (2:5) Aligned. Score:  36
Sequences (2:6) Aligned. Score:  32
Sequences (3:4) Aligned. Score:  48
Sequences (3:5) Aligned. Score:  36
Sequences (3:6) Aligned. Score:  32
Sequences (4:5) Aligned. Score:  38
Sequences (4:6) Aligned. Score:  32
Sequences (5:6) Aligned. Score:  29
Time for pairwise alignment: 2.274260
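
The pairwise scores in the log above are percent identities between each pair of sequences, numbered as in the sequence list. A short Python sketch for pulling them out of the log and converting them to the distances used for the guide tree follows. The conversion (1 minus score divided by 100) is how I understand Clustal W to derive its distances; treat it as an assumption and not a statement about this particular build. The function names are hypothetical.

```python
import re

LOG = """Sequences (1:2) Aligned. Score:  88
Sequences (1:3) Aligned. Score:  86
Sequences (1:4) Aligned. Score:  51
Sequences (1:5) Aligned. Score:  36
Sequences (1:6) Aligned. Score:  32
Sequences (2:3) Aligned. Score:  91
Sequences (2:4) Aligned. Score:  50
Sequences (2:5) Aligned. Score:  36
Sequences (2:6) Aligned. Score:  32
Sequences (3:4) Aligned. Score:  48
Sequences (3:5) Aligned. Score:  36
Sequences (3:6) Aligned. Score:  32
Sequences (4:5) Aligned. Score:  38
Sequences (4:6) Aligned. Score:  32
Sequences (5:6) Aligned. Score:  29
"""

SCORE_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def parse_scores(log: str) -> dict:
    """Return {(i, j): score} for every pairwise line in the log."""
    return {(int(i), int(j)): int(score)
            for i, j, score in SCORE_LINE.findall(log)}


def to_distance(score: int) -> float:
    """Assumed conversion from percent identity to a 0..1 distance."""
    return 1.0 - score / 100.0


if __name__ == "__main__":
    scores = parse_scores(LOG)
    closest = max(scores, key=scores.get)
    print("closest pair:", closest, "score:", scores[closest])
    print("distance:", round(to_distance(scores[closest]), 2))
```

The highest score, 91, belongs to pair (2:3), the mouse and rat sequences, which matches their position as nearest neighbours in the dendrogram.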

Guide tree        file created:   [../tmp-dir/20758.CLUSTALW.dnd]
Start of Multiple Alignment
There are 5 groups
Aligning...
Group 1: Sequences:   2      Score:15725
Group 2: Sequences:   3      Score:15397
Group 3: Sequences:   4      Score:10921
Group 4: Sequences:   5      Score:10155
Group 5: Sequences:   6      Score:7685
Time for multiple alignment: 5.505672

Alignment Score 30082
CLUSTAL-Alignment file created  [../tmp-dir/20758.CLUSTALW.aln]










Citation


    Algorithm Citation:



    Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved
    software for multiple sequence alignment. Computer Applications in the
    Biosciences (CABIOS), 8(2):189-191.


    Thompson J.D., Higgins D.G., Gibson T.J. “CLUSTAL W: improving the
    sensitivity of progressive multiple sequence alignment through sequence
    weighting, position-specific gap penalties and weight matrix choice.”
    Nucleic Acids Res. 22:4673-4680(1994).


    Felsenstein, J. 1989. PHYLIP — Phylogeny Inference Package
    (Version 3.2). Cladistics 5: 164-166.


    Program Citation:



    CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson,
    modified; any errors are due to the modifications.


    PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package)
    version 3.5c. Distributed by the author. Department of Genetics,
    University of Washington, Seattle.




Copyright (C) 1999, Board of Trustees of the University of Illinois.
