Fasta label (*) | Workbench label |
---|---|
GENPEPT:3880333 | Caenorhabditis elegans cosmid T28A8, complete sequence_ |
GENPEPT:825572 | S.cerevisiae chromosome XIII cosmid 8520_ |
GENPEPT:3192877 | Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_ |
GENPEPT:1724118 | Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete |
GENPEPT:7595954 | Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds. |
GENPEPT:466462 | Human DNA mismatch repair (hmlh1) mRNA, complete cds_ |
(*) Clustal W cuts off FASTA labels after the first space (e.g. “>abc def” becomes “>abc”).
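The truncation described in the footnote can be sketched in a few lines of Python. This is only an illustration, not Clustal W's own code: the function names are hypothetical, and the colon-to-underscore step is an assumption inferred from the labels on this page (GENPEPT:3880333 in the table appears as GENPEPT_3880333 in the alignment).

```python
def truncate_header(header: str) -> str:
    """Keep only the part of a FASTA header before the first whitespace."""
    parts = header.split(maxsplit=1)
    return parts[0] if parts else header


def underscore_label(label: str) -> str:
    """Assumed normalisation: replace ':' with '_' as seen in the alignment labels."""
    return label.replace(":", "_")


print(truncate_header(">abc def"))          # header cut at the first space
print(underscore_label("GENPEPT:3880333"))  # label as it appears in the alignment
```

Labels that differ only after the first space would collapse to the same name under this rule, so it is worth checking that the truncated labels are still unique before submitting sequences.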
Sequence alignment
Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
  - no consensus

CLUSTAL W (1.81) multiple sequence alignment

GENPEPT_7595954 -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_1724118 -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
GENPEPT_466462  -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_3192877 ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
GENPEPT_825572  --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
GENPEPT_3880333 MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
                *: * ***::****:: * **:**: **.:**

GENPEPT_7595954 STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
GENPEPT_1724118 STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
GENPEPT_466462  STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
GENPEPT_3192877 STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
GENPEPT_825572  ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
GENPEPT_3880333 ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
                :* * : :: **:*::*: ***.**.: *: ::****:**** **** : *:******

GENPEPT_7595954 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
GENPEPT_1724118 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
GENPEPT_466462  LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
GENPEPT_3192877 LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
GENPEPT_825572  LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
GENPEPT_3880333 LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
                ***:****::.: :* . :*.::..: :**: .** **::** * ***:*: *

GENPEPT_7595954 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_1724118 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_466462  KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSVFGNA
GENPEPT_3192877 QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
GENPEPT_825572  RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
GENPEPT_3880333 NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
                . : . :* : :.: *:::* :.*: :: : . . . : : : .

GENPEPT_7595954 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_1724118 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_466462  VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESTS
GENPEPT_3192877 ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQM----------LLFINQRLVESTA
GENPEPT_825572  VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSISP---------IFFINNRLVTCDL
GENPEPT_3880333 VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
                :: :: . . : : : . . .*** * * .

GENPEPT_7595954 LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
GENPEPT_1724118 LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_466462  LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_3192877 LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
GENPEPT_825572  LRRALNSVYSNYLPKGNRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
GENPEPT_3880333 LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
                *: .:: : . : * : : : :********..* ** ::.*:: : ..

GENPEPT_7595954 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
GENPEPT_1724118 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
GENPEPT_466462  SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
GENPEPT_3192877 ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
GENPEPT_825572  AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
GENPEPT_3880333 KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
                : . . . . . :

GENPEPT_7595954 TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
GENPEPT_1724118 TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
GENPEPT_466462  TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
GENPEPT_3192877 TDSTEQKLDKFLAPLVK-------------------------------------------
GENPEPT_825572  AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
GENPEPT_3880333 SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
                :. *

GENPEPT_7595954 SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
GENPEPT_1724118 SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
GENPEPT_466462  NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
GENPEPT_3192877 ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
GENPEPT_825572  NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
GENPEPT_3880333 STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
                : . .

GENPEPT_7595954 NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_1724118 NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_466462  NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_3192877 RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_825572  NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
GENPEPT_3880333 HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
                .: *: :::.: * . : . :** :: . . *. *: : .. *:

GENPEPT_7595954 FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
GENPEPT_1724118 FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_466462  FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_3192877 FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-----DGAADI
GENPEPT_825572  FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
GENPEPT_3880333 FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
                *** : .* * . : : : * . : .

GENPEPT_7595954 LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_1724118 LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_466462  LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
GENPEPT_3192877 LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
GENPEPT_825572  IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
GENPEPT_3880333 LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
                : . : :: :*:.:.: : : :* *:. . * : **. : * :*::

GENPEPT_7595954 DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
GENPEPT_1724118 DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
GENPEPT_466462  DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
GENPEPT_3192877 EQETRCFETFCRETARFY--------------AQLDWREGATAVFSRWT--MEHVLFPAF
GENPEPT_825572  EDEQECLDGILREIALLYIPDMVPKVDTSDASLSEDEKAQFINRKEHISSLLEHVLFPCI
GENPEPT_3880333 DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
                ::* . : : : . :: . ::.:: .

GENPEPT_7595954 RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_1724118 RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
GENPEPT_466462  RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
GENPEPT_3192877 KKYLLPPR---IKDQIYELTNLPTLYKVFERC--
GENPEPT_825572  KRRFLAPRHILKD--VVEIANLPDLYKVFERC--
GENPEPT_3880333 KRRFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
                : ::.*. : :::: * ******
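As a rough illustration of how two rows of the alignment can be compared, the sketch below counts identical residues over the columns where neither row has a gap. The function name and this particular definition of identity are assumptions made for the example; Clustal W's own pairwise scores may be computed differently. The two rows are the first block of GENPEPT_7595954 (mouse) and GENPEPT_1724118 (rat), written with their 17 leading gap characters.

```python
GAP = "-"


def identity(row_a: str, row_b: str) -> tuple[int, int]:
    """Return (matching columns, compared columns), skipping columns with a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == GAP or b == GAP:
            continue  # a gap in either row is not counted
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# First alignment block for two of the sequences shown on this page.
mouse = "-" * 17 + "MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK"
rat = "-" * 17 + "MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK"

matches, compared = identity(mouse, rat)
print(f"{matches}/{compared} identical columns ({100 * matches / compared:.1f}%)")
```

Applying the same function to full-length rows (all blocks concatenated per sequence) would give a whole-alignment figure instead of a single-block one.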
Clustal W dendrogram
Unrooted tree (generated by Phylip’s Drawtree)
Phylip-format dendrogram
( GENPEPT_466462:0.05763, ( GENPEPT_7595954:0.03521, GENPEPT_1724118:0.04802) :0.02510, ( GENPEPT_3192877:0.24567, ( GENPEPT_825572:0.33043, GENPEPT_3880333:0.37568) :0.04833) :0.19089);
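The dendrogram string above is in the parenthesised (Newick-style) notation that PHYLIP reads. A minimal sketch of reading it without external libraries follows; the parser handles only the subset of the notation used here (names, branch lengths, nesting), and the function names are hypothetical. It sums branch lengths from the central node of the unrooted tree to each leaf.

```python
TREE = (
    "( GENPEPT_466462:0.05763, ( GENPEPT_7595954:0.03521, GENPEPT_1724118:0.04802) "
    ":0.02510, ( GENPEPT_3192877:0.24567, ( GENPEPT_825572:0.33043, "
    "GENPEPT_3880333:0.37568) :0.04833) :0.19089);"
)


def parse_tree(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    s = "".join(text.split()).rstrip(";")  # drop whitespace and the final ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1
            children.append(node())
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip the closing ')'
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return (name, length, children)

    return node()


def leaf_depths(tree, depth=0.0):
    """Map each leaf name to its summed branch length from the top-level node."""
    name, length, children = tree
    total = depth + length
    if not children:
        return {name: total}
    depths = {}
    for child in children:
        depths.update(leaf_depths(child, total))
    return depths


for label, dist in sorted(leaf_depths(parse_tree(TREE)).items(), key=lambda kv: kv[1]):
    print(f"{label:16s}{dist:.5f}")
```

The three mammalian sequences sit closest to the central node and the yeast and worm sequences farthest, which matches the grouping in the tree string itself.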
Clustal W options and diagnostic messages
Alignment type: Protein
Alignment order: aligned

Pairwise alignment parameters
Method: accurate
Matrix: Gonnet
Gap open penalty: 10.00
Gap extension penalty: 0.10

Multiple alignment parameters
Matrix: Gonnet
Negative matrix?: no
Gap open penalty: 10.00
Gap extension penalty: 0.20
% identity for delay: 30
Residue-specific gap penalties: on
Penalize end gaps: on
Hydrophilic gap penalties: on
Gap separation distance: 0
Hydrophilic residues: GPSNDQEKR

CLUSTAL W (1.81) Multiple Sequence Alignments

Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: GENPEPT_466462 756 aa
Sequence 2: GENPEPT_7595954 760 aa
Sequence 3: GENPEPT_1724118 757 aa
Sequence 4: GENPEPT_3192877 663 aa
Sequence 5: GENPEPT_825572 769 aa
Sequence 6: GENPEPT_3880333 779 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 88
Sequences (1:3) Aligned. Score: 86
Sequences (1:4) Aligned. Score: 51
Sequences (1:5) Aligned. Score: 36
Sequences (1:6) Aligned. Score: 32
Sequences (2:3) Aligned. Score: 91
Sequences (2:4) Aligned. Score: 50
Sequences (2:5) Aligned. Score: 36
Sequences (2:6) Aligned. Score: 32
Sequences (3:4) Aligned. Score: 48
Sequences (3:5) Aligned. Score: 36
Sequences (3:6) Aligned. Score: 32
Sequences (4:5) Aligned. Score: 38
Sequences (4:6) Aligned. Score: 32
Sequences (5:6) Aligned. Score: 29
Time for pairwise alignment: 2.274260
Guide tree file created: [../tmp-dir/20758.CLUSTALW.dnd]
Start of Multiple Alignment
There are 5 groups
Aligning...
Group 1: Sequences: 2 Score:15725
Group 2: Sequences: 3 Score:15397
Group 3: Sequences: 4 Score:10921
Group 4: Sequences: 5 Score:10155
Group 5: Sequences: 6 Score:7685
Time for multiple alignment: 5.505672
Alignment Score 30082
CLUSTAL-Alignment file created [../tmp-dir/20758.CLUSTALW.aln]
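The pairwise scores in the diagnostic messages can be pulled out with a regular expression. The sketch below is one possible way to do it, with hypothetical names; the excerpt is copied from the log on this page, and the pattern assumes the "Sequences (i:j) Aligned. Score: n" wording shown there.

```python
import re

LOG_EXCERPT = """\
Sequences (1:2) Aligned. Score: 88
Sequences (1:3) Aligned. Score: 86
Sequences (1:4) Aligned. Score: 51
Sequences (1:5) Aligned. Score: 36
Sequences (1:6) Aligned. Score: 32
Sequences (2:3) Aligned. Score: 91
Sequences (2:4) Aligned. Score: 50
Sequences (2:5) Aligned. Score: 36
Sequences (2:6) Aligned. Score: 32
Sequences (3:4) Aligned. Score: 48
Sequences (3:5) Aligned. Score: 36
Sequences (3:6) Aligned. Score: 32
Sequences (4:5) Aligned. Score: 38
Sequences (4:6) Aligned. Score: 32
Sequences (5:6) Aligned. Score: 29
"""

# Matches one "Sequences (i:j) Aligned. Score: n" message.
PAIR_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def pairwise_scores(log_text: str) -> dict[tuple[int, int], int]:
    """Map each (i, j) sequence-number pair to its reported pairwise score."""
    return {
        (int(i), int(j)): int(score)
        for i, j, score in PAIR_LINE.findall(log_text)
    }


scores = pairwise_scores(LOG_EXCERPT)
best_pair = max(scores, key=scores.get)
print("pairs parsed:", len(scores))
print("highest-scoring pair:", best_pair, "score", scores[best_pair])
```

The sequence numbers refer to the "Sequence 1" to "Sequence 6" lines of the same log, so pair (2:3) is GENPEPT_7595954 against GENPEPT_1724118.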
Citation
Algorithm Citation:
Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved
software for multiple sequence alignment. Computer Applications in the
Biosciences (CABIOS), 8(2):189-191.
Thompson J.D., Higgins D.G., Gibson T.J. “CLUSTAL W: improving the
sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice.”
Nucleic Acids Res. 22:4673-4680(1994).
Felsenstein, J. 1989. PHYLIP — Phylogeny Inference Package
(Version 3.2). Cladistics 5: 164-166.
Program Citation:
CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson,
modified; any errors are due to the modifications.
PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package)
version 3.5c. Distributed by the author. Department of Genetics,
University of Washington, Seattle.
Copyright (C) 1999, Board of Trustees of the University of Illinois.