Example of an input submission and the output it produced:
The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
b.rost
EMBL, 69012 Heidelberg, Europe
rost@embl-heidelberg.de
# CYTOCHROME C OXIDASE POLYPEPTIDE I (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP
DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI
FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT
MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF
GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW
GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF
AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW
NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET
LPKPEDWDRAQAHR
________________________________________________________________________________
The sequence has been interpreted as:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
>P1; t2
(#) cytochrome c oxidase polypeptide i (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP
DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI
FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT
MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF
GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW
GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF
AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW
NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET
LPKPEDWDRAQAHR
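
The step from the submitted text to the interpreted record above can be sketched as follows. This is only an illustration that reproduces what the example shows (a ">P1; <identifier>" line, the description lowercased and prefixed with "(#)", and the sequence wrapped at 60 residues). The function name to_pir and the way the identifier is supplied are assumptions; the server's actual conversion code is not shown in this document.

```python
def to_pir(lines, identifier):
    """Turn a '# description' line plus sequence lines into a PIR-like record.

    Hypothetical helper: it mirrors the layout of the example output only.
    """
    description = ""
    residues = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The example output lowercases the description and prefixes "(#)".
            description = "(#) " + line[1:].strip().lower()
        else:
            # Everything else is treated as sequence; embedded blanks are dropped.
            residues.append(line.replace(" ", "").upper())
    sequence = "".join(residues)
    # Re-wrap at 60 residues per line, as in the example.
    wrapped = [sequence[i:i + 60] for i in range(0, len(sequence), 60)]
    return "\n".join([f">P1; {identifier}", description] + wrapped)


if __name__ == "__main__":
    submission = [
        "# CYTOCHROME C OXIDASE POLYPEPTIDE I (cox1_parde)",
        "MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ",
    ]
    print(to_pir(submission, "t2"))
```

The identifier ("t2" in the example) is passed in explicitly because the document does not say how the server derives it.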
The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID : identifier of aligned (homologous) protein
--- STRID : PDB identifier (only for known structures)
--- PIDE : percentage of pairwise sequence identity
--- WSIM : percentage of weighted similarity
--- LALI : number of residues aligned
--- NGAP : number of insertions and deletions (indels)
--- LGAP : number of residues in all indels
--- LSEQ2 : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- NAME : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID STRID PIDE WSIM LALI NGAP LGAP LSEQ2 ACCNUM NAME
cox1_parde 100 100 554 0 0 554 P08305 SUBUNIT 1).
cox1_rhosh 78 84 550 3 13 565 P33517 SUBUNIT 1).
cox1_scapl 67 76 153 2 3 157 P29654 CYTOCHROME C OXIDASE POLY
cox1_gomva 67 76 152 2 3 155 P29646 CYTOCHROME C OXIDASE POLY
cox1_polsp 67 76 155 2 3 158 P29650 CYTOCHROME C OXIDASE POLY
cox1_lepsp 67 76 154 2 3 157 P29644 CYTOCHROME C OXIDASE POLY
cox1_megat 67 76 157 2 3 160 P29648 CYTOCHROME C OXIDASE POLY
cox1_lepoc 67 76 157 2 3 163 P29647 CYTOCHROME C OXIDASE POLY
cox1_pomni 66 76 158 2 3 161 P29652 CYTOCHROME C OXIDASE POLY
cox1_saltr 64 76 107 1 1 109 P29653 CYTOCHROME C OXIDASE POLY
cox1_geosd 62 72 149 2 3 152 P29645 CYTOCHROME C OXIDASE POLY
cox1_braja 61 70 513 4 20 541 P31833 CYTOCHROME C OXIDASE POLY
cox1_panbu 61 72 181 2 3 184 P29649 CYTOCHROME C OXIDASE POLY
cox1_prowi 61 69 506 6 27 514 Q05143 CYTOCHROME C OXIDASE POLY
cox1_polsx 60 71 181 2 3 184 P29651 CYTOCHROME C OXIDASE POLY
cox1_amica 60 72 185 2 3 188 P29643 CYTOCHROME C OXIDASE POLY
cox1_marpo 60 69 512 5 25 522 P26856 CYTOCHROME C OXIDASE POLY
cox1_maize 60 69 511 5 27 528 P08742 CYTOCHROME C OXIDASE POLY
cox1_orysa 60 69 511 5 27 524 P14578 CYTOCHROME C OXIDASE POLY
cox1_wheat 60 69 511 5 27 524 P08741 CYTOCHROME C OXIDASE POLY
cox1_sorbi 59 69 519 5 27 530 P05502 CYTOCHROME C OXIDASE POLY
cox1_betvu 59 66 505 6 27 516 P24794 CYTOCHROME C OXIDASE POLY
cox1_soybn 57 66 509 5 27 527 P07506 CYTOCHROME C OXIDASE POLY
cox1_oenbe 56 65 509 5 27 527 P08743 CYTOCHROME C OXIDASE POLY
cox1_pea 56 66 509 5 27 527 P12786 CYTOCHROME C OXIDASE POLY
cox1_parli 56 67 502 7 31 517 P12700 CYTOCHROME C OXIDASE POLY
cox1_crola 56 67 499 6 30 516 P34188 CYTOCHROME C OXIDASE POLY
cox1_cypca 56 66 499 6 30 516 P24985 CYTOCHROME C OXIDASE POLY
cox1_strpu 56 66 502 7 31 517 P15544 CYTOCHROME C OXIDASE POLY
cox1_pisoc 56 66 504 7 31 517 P25001 CYTOCHROME C OXIDASE POLY
cox1_chick 56 65 500 7 31 515 P18943 CYTOCHROME C OXIDASE POLY
cox1_triru 55 66 508 5 38 528 Q01555 CYTOCHROME C OXIDASE POLY
cox1_yeast 55 66 503 5 27 512 P00401 CYTOCHROME C OXIDASE POLY
cox1_podan 55 66 515 5 38 541 P20681 CYTOCHROME C OXIDASE POLY
cox1_mouse 55 66 501 6 30 514 P00397 CYTOCHROME C OXIDASE POLY
cox1_rat 55 66 501 6 30 514 P05503 CYTOCHROME C OXIDASE POLY
cox1_human 55 66 501 6 30 513 P00395 CYTOCHROME C OXIDASE POLY
cox1_neucr 55 66 525 5 38 557 P03945 CYTOCHROME C OXIDASE POLY
cox1_emeni 55 66 524 5 38 567 P00402 CYTOCHROME C OXIDASE POLY
cox1_xenla 55 66 499 6 30 519 P00398 CYTOCHROME C OXIDASE POLY
cox1_bovin 55 66 501 6 30 514 P00396 CYTOCHROME C OXIDASE POLY
cox1_balph 55 66 501 6 30 516 P24983 CYTOCHROME C OXIDASE POLY
cox1_balmu 54 66 501 6 30 516 P41293 CYTOCHROME C OXIDASE POLY
cox1_chlre 54 63 494 6 33 505 P08681 CYTOCHROME C OXIDASE POLY
cox1_didma 54 66 500 6 30 513 P41310 CYTOCHROME C OXIDASE POLY
cox1_drome 54 65 507 6 30 512 P00399 CYTOCHROME C OXIDASE POLY
cox1_droya 54 65 507 6 30 512 P00400 CYTOCHROME C OXIDASE POLY
cox1_halgr 54 66 501 6 30 514 P38595 CYTOCHROME C OXIDASE POLY
cox1_phovi 54 66 501 6 30 514 Q00527 CYTOCHROME C OXIDASE POLY
cox1_anoqu 54 65 507 6 30 514 P33504 CYTOCHROME C OXIDASE POLY
cox1_anoga 53 65 507 6 30 514 P34838 CYTOCHROME C OXIDASE POLY
cox1_cotja 52 63 100 1 17 102 P24984 CYTOCHROME C OXIDASE POLY
cox1_schpo 52 63 510 8 41 537 P07657 CYTOCHROME C OXIDASE POLY
cox1_apime 50 62 501 6 31 521 P20374 CYTOCHROME C OXIDASE POLY
cox1_ascsu 48 59 503 6 29 525 P24881 CYTOCHROME C OXIDASE POLY
cox1_caeel 48 59 509 6 29 525 P24893 CYTOCHROME C OXIDASE POLY
cox1_thep3 46 52 504 8 32 615 P16262 SUBUNIT 1).
cox1_bacfi 45 52 514 7 30 624 Q04440 SUBUNIT 1).
cox1_syny3 44 51 515 5 29 533 Q06473 SUBUNIT 1).
cox1_bacsu 41 49 526 8 32 621 P24010 SUBUNIT 1) (CAA-3605 SUBU
qox1_bacsu 39 46 530 9 43 649 P34956 SUBUNIT QOXB).
cox1_leita 38 49 501 4 24 549 P14544 CYTOCHROME C OXIDASE POLY
cyob_ecoli 38 43 520 6 25 663 P18401 CYTOCHROME O UBIQUINOL OX
qoxm_sulac 37 42 477 9 37 788 P39481 QUINOL OXIDASE POLYPEPTID
cox1_halha 37 45 530 6 27 593 P33518 SUBUNIT 1).
cox1_trybb 37 48 501 4 24 549 P04371 CYTOCHROME C OXIDASE POLY
cox1_parte 32 39 518 8 148 645 P05489 CYTOCHROME C OXIDASE POLY
cox1_tetpy 32 41 543 6 150 698 P11947 CYTOCHROME C OXIDASE POLY
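
The summary rows above can be read into named fields with a few lines of Python. The sketch below assumes whitespace-separated columns and an empty STRID column, which holds for every row in this example; a row with a PDB identifier would need fixed-width parsing instead. The names Hit and parse_summary_row are hypothetical and are not part of MAXHOM.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    ident: str    # ID: identifier of the aligned protein
    pide: int     # percentage of pairwise sequence identity
    wsim: int     # percentage of weighted similarity
    lali: int     # number of residues aligned
    ngap: int     # number of indels
    lgap: int     # number of residues in all indels
    lseq2: int    # length of the aligned sequence
    accnum: str   # SwissProt accession number
    name: str     # one-line description (may be truncated in the output)


def parse_summary_row(row: str) -> Hit:
    """Parse one summary row, assuming the STRID column is blank."""
    fields = row.strip().split(None, 8)
    numbers = [int(value) for value in fields[1:7]]
    name = fields[8] if len(fields) > 8 else ""
    return Hit(fields[0], *numbers, fields[7], name)


if __name__ == "__main__":
    hit = parse_summary_row("cox1_rhosh 78 84 550 3 13 565 P33517 SUBUNIT 1).")
    print(hit.ident, hit.pide, hit.lali)
```

With the rows parsed this way, the short fragments (LALI between 100 and 185) can be separated from the near full-length alignments by filtering on hit.lali.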
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/tmp/t2_12833.hssp from: 1 to: 554
/home/phd/tmp/t2_12833.ret_msf MSF: 554 Type: P 15-Nov-95 04:01:3 Check: 3510 ..
Name: t2_12833 Len: 554 Check: 3342 Weight: 1.00
Name: cox1_parde Len: 554 Check: 3342 Weight: 1.00
Name: cox1_rhosh Len: 554 Check: 2597 Weight: 1.00
Name: cox1_scapl Len: 554 Check: 6174 Weight: 1.00
Name: cox1_gomva Len: 554 Check: 4345 Weight: 1.00
Name: cox1_polsp Len: 554 Check: 8332 Weight: 1.00
Name: cox1_lepsp Len: 554 Check: 7195 Weight: 1.00
Name: cox1_megat Len: 554 Check: 1022 Weight: 1.00
Name: cox1_lepoc Len: 554 Check: 689 Weight: 1.00
Name: cox1_pomni Len: 554 Check: 1867 Weight: 1.00
Name: cox1_saltr Len: 554 Check: 6114 Weight: 1.00
Name: cox1_geosd Len: 554 Check: 6855 Weight: 1.00
Name: cox1_braja Len: 554 Check: 665 Weight: 1.00
Name: cox1_panbu Len: 554 Check: 7209 Weight: 1.00
Name: cox1_prowi Len: 554 Check: 482 Weight: 1.00
Name: cox1_polsx Len: 554 Check: 6803 Weight: 1.00
Name: cox1_amica Len: 554 Check: 2170 Weight: 1.00
Name: cox1_marpo Len: 554 Check: 5772 Weight: 1.00
Name: cox1_maize Len: 554 Check: 8474 Weight: 1.00
Name: cox1_orysa Len: 554 Check: 7904 Weight: 1.00
Name: cox1_wheat Len: 554 Check: 8039 Weight: 1.00
Name: cox1_sorbi Len: 554 Check: 8243 Weight: 1.00
Name: cox1_betvu Len: 554 Check: 4687 Weight: 1.00
Name: cox1_soybn Len: 554 Check: 1559 Weight: 1.00
Name: cox1_oenbe Len: 554 Check: 9577 Weight: 1.00
Name: cox1_pea Len: 554 Check: 1152 Weight: 1.00
Name: cox1_parli Len: 554 Check: 7876 Weight: 1.00
Name: cox1_crola Len: 554 Check: 9853 Weight: 1.00
Name: cox1_cypca Len: 554 Check: 1625 Weight: 1.00
Name: cox1_strpu Len: 554 Check: 9042 Weight: 1.00
Name: cox1_pisoc Len: 554 Check: 2768 Weight: 1.00
Name: cox1_chick Len: 554 Check: 8102 Weight: 1.00
Name: cox1_triru Len: 554 Check: 1564 Weight: 1.00
Name: cox1_yeast Len: 554 Check: 9105 Weight: 1.00
Name: cox1_podan Len: 554 Check: 8167 Weight: 1.00
Name: cox1_mouse Len: 554 Check: 8729 Weight: 1.00
Name: cox1_rat Len: 554 Check: 1347 Weight: 1.00
Name: cox1_human Len: 554 Check: 1065 Weight: 1.00
Name: cox1_neucr Len: 554 Check: 6844 Weight: 1.00
Name: cox1_emeni Len: 554 Check: 5797 Weight: 1.00
Name: cox1_xenla Len: 554 Check: 9868 Weight: 1.00
Name: cox1_bovin Len: 554 Check: 587 Weight: 1.00
Name: cox1_balph Len: 554 Check: 3044 Weight: 1.00
Name: cox1_balmu Len: 554 Check: 2779 Weight: 1.00
Name: cox1_chlre Len: 554 Check: 2203 Weight: 1.00
Name: cox1_didma Len: 554 Check: 7872 Weight: 1.00
Name: cox1_drome Len: 554 Check: 8497 Weight: 1.00
Name: cox1_droya Len: 554 Check: 1096 Weight: 1.00
Name: cox1_halgr Len: 554 Check: 9892 Weight: 1.00
Name: cox1_phovi Len: 554 Check: 319 Weight: 1.00
Name: cox1_anoqu Len: 554 Check: 9265 Weight: 1.00
Name: cox1_anoga Len: 554 Check: 1521 Weight: 1.00
Name: cox1_cotja Len: 554 Check: 7055 Weight: 1.00
Name: cox1_schpo Len: 554 Check: 6761 Weight: 1.00
Name: cox1_apime Len: 554 Check: 3689 Weight: 1.00
Name: cox1_ascsu Len: 554 Check: 252 Weight: 1.00
Name: cox1_caeel Len: 554 Check: 393 Weight: 1.00
Name: cox1_thep3 Len: 554 Check: 9619 Weight: 1.00
Name: cox1_bacfi Len: 554 Check: 9313 Weight: 1.00
Name: cox1_syny3 Len: 554 Check: 5842 Weight: 1.00
Name: cox1_bacsu Len: 554 Check: 7123 Weight: 1.00
Name: qox1_bacsu Len: 554 Check: 8281 Weight: 1.00
Name: cox1_leita Len: 554 Check: 5925 Weight: 1.00
Name: cyob_ecoli Len: 554 Check: 2221 Weight: 1.00
Name: qoxm_sulac Len: 554 Check: 7747 Weight: 1.00
Name: cox1_halha Len: 554 Check: 6043 Weight: 1.00
Name: cox1_trybb Len: 554 Check: 6181 Weight: 1.00
Name: cox1_parte Len: 554 Check: 5633 Weight: 1.00
Name: cox1_tetpy Len: 554 Check: 7995 Weight: 1.00
//
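
The "Name:" lines of the MSF header above can be collected with a short regular expression, for instance to check that every entry reports the same length (554 here). This is a minimal sketch written against the layout shown in this example; parse_msf_names is a hypothetical helper, not a function of MAXHOM or of any MSF library.

```python
import re

# Matches lines such as: "Name: cox1_parde Len: 554 Check: 3342 Weight: 1.00"
NAME_LINE = re.compile(
    r"^\s*Name:\s*(\S+)\s+Len:\s*(\d+)\s+Check:\s*(\d+)\s+Weight:\s*([\d.]+)"
)


def parse_msf_names(lines):
    """Return (name, length, checksum, weight) tuples from an MSF header.

    Reading stops at the '//' line that ends the header.
    """
    entries = []
    for line in lines:
        if line.strip() == "//":
            break
        match = NAME_LINE.match(line)
        if match:
            name, length, check, weight = match.groups()
            entries.append((name, int(length), int(check), float(weight)))
    return entries


if __name__ == "__main__":
    header = [
        "Name: t2_12833 Len: 554 Check: 3342 Weight: 1.00",
        "Name: cox1_parde Len: 554 Check: 3342 Weight: 1.00",
        "//",
    ]
    entries = parse_msf_names(header)
    # All entries in one MSF file are expected to share the same length.
    print(len({length for _, length, _, _ in entries}) == 1)
```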
1 50
t2_12833 MSAQISDSIE EKRGFFTRWF MSTNHKDIGV LYLFTAGLAG LISVTLTVYM
cox1_parde MSAQISDSIE EKRGFFTRWF MSTNHKDIGV LYLFTAGLAG LISVTLTVYM
cox1_rhosh ..AAIHGHEH DRRGFFTRWF MSTNHKDIGV LYLFTGGLVG LISVAFTVYM
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja .......... .....WRRYV YSTNHKDIGT MYLIFAVIAG VIGAAMSIAI
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi .......... ....MVTRWL YSTNHKDIGT MYLIFGAFSG VLGTVFSLLI
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo .......... ....FAQRWL FSTNHKDIGT LYLIFGAIAG VMGTCFSVLI
cox1_maize .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_orysa .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_wheat .......... .....MVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_sorbi .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_betvu .......... .......... VSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_soybn .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_oenbe .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_pea .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_parli .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MVGTAMSVII
cox1_crola .......... ......TRWF FSTNHKDIGT LYLVFGAWAG MVGTALSLLI
cox1_cypca .......... ......TRWF FSTNHKDIGT LYLVFGAWAG MVGTALSLLI
cox1_strpu .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MVGTAMSVII
cox1_pisoc .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MIGTAMSVII
cox1_chick .......... ....FINRWL FSTNHKDIGT LYLIFGTWAG MAGTALSLLI
cox1_triru .......... ......ERWF LSTNAKDIGT LYLMFRYFSG LVGTAFSVLI
cox1_yeast .......... ....MVQRWL YSTNAKDIAV LYFMLAIFSG MAGTAMSLII
cox1_podan .......... ....WIERWM LSTNAKDIGN LYLIFALFSG LLGTAFSVLI
cox1_mouse .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSILI
cox1_rat .......... ....FVNRWL FSTNHKDIGT LYLLFGAWAG MVGTALSILI
cox1_human .......... ....FADRWL FSTNHKDIGT LYLLFGAWAG VLGTALSLLI
cox1_neucr ........MS SISIWTERWF LSTNAKDIGV LYLIFALFSG LLGTAFSVLI
cox1_emeni IESSSFLTFK QPTEWQERWY LSSNAKDIGT LYLMFALFSG LLGTAFSVLI
cox1_xenla .......... ......TRWL FSTNHKDIGT LYLVFGAWAG LVGTALSLLI
cox1_bovin .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_balph .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTGLSLLI
cox1_balmu .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTGLSLLI
cox1_chlre .......... .......RWL YSTSHKDIGL LYLVFAFFGG LLGTSLSMLI
cox1_didma .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_drome .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_droya .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_halgr .......... ....FMDRWL FSTNHKDIGT LYLLFGAWAG MAGTALSLLI
cox1_phovi .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_anoqu .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_anoga .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_cotja .......... ....FINRWL FSTNHKDIGT LYLIFGTWAG MAGTALSLLI
cox1_schpo .......... ....YVNRWI FSTNAKDIAI LYLLFGLVSG IIGSVFSFII
cox1_apime .......... .....MMKWF MSTNHKNIGI LYIILALWSG MLGSSMSLII
cox1_ascsu .......... ..QGGLSVWL ESSNHKDIGT LYFLFGLWSG MVGTSLSLVI
cox1_caeel ......NLYK KYQGGLAVWL ESSNHKDIGT LYFIFGLWSG MVGTSFSLLI
cox1_thep3 .......... ........YL TTVDHKKIAH LYLISGGFFF LLGGLEALFI
cox1_bacfi .........K QEKSVIWDWL TTVDHKKIAI MYLIAGTLFF VKAGVMALFM
cox1_syny3 IAAENLTANH PRRKWTDYFT FCVDHKVIGI QYLVTSFLFF FIGGSFAEAM
cox1_bacsu LNALTEK..R TRGSMLWDYL TTVDHKKIAI LYLVAGGFFF LVGGIEAMFI
qox1_bacsu LGAQVstYFK KWKWLWSEWI TTVDHKKLGI MYIISAVIML FRGGVDGLMM
cox1_leita .......... .......... LSVSHKMIGL CYLLVAILSG FVGYVYSLFI
cyob_ecoli .......... ....LWKEWL TSVDHKRLGI MYIIVAIVML LRGFADAIMM
qoxm_sulac .......... .........L YTTNASDVGQ MYIVLGIVAL IIGSVNAALI
cox1_halha LGERTGYTHE EKPGGIIRWF TTVDHKDIGI LYGVYGTIAF AWGGVSVLLM
cox1_trybb .......... .......... LSVSHKMIGI CYLLVAILCG FIGYIYSLFI
cox1_parte .......... .......... ...NHKRIAL NYFYFSMWTG LSGAALATMI
cox1_tetpy IKKLFTYLND LRKHILKKYV YTINHKRIAI NYLYFSMVTG LSGAALATMI
51 100
t2_12833 RMELQHPGVQ YMCLEGMRLV ADAAAECTPN AHLWNVVVTY HGILMMFFVV
cox1_parde RMELQHPGVQ YMCLEGMRLV ADAAAECTPN AHLWNVVVTY HGILMMFFVV
cox1_rhosh RMELMAPGVQ FMCAEHlsLW PSAVENCTPN GHLWNVMITG HGILMMFFVV
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja RAELMYPGVQ IFH....... .........E THTYNVFVTS HGLIMIFFMV
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi RMELAQPGNQ IL........ .......NGN HQLYNVIITA HAFLMIFFML
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo RMELAQPGNQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_maize RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_orysa RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_wheat RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_sorbi RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_betvu RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_soybn RMELARPGDQ IL........ .......GGN HQLYNVLITG HAFLMIFFMV
cox1_oenbe RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_pea RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFFMIFFMV
cox1_parli RAELAQPGSL LN........ .........D DQIYNVVVTA HALVMIFFMV
cox1_crola RAELNQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_cypca RAELSQPGSL LS........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_strpu RAELAQPGSL LN........ .........D DQIYNVVVTA HALVMIFFMV
cox1_pisoc RTELAQPGSL LQ........ .........D DQIYNVIVTA HALVMIFFMV
cox1_chick RAELGQPGTL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_triru RLELSAPGVQ YI........ ........AD NQLYNSIITA HAILMIFFMV
cox1_yeast RLELAAPGSQ YL........ .......HGN SQLFNVLVVG HAVLMIFFLV
cox1_podan RMELSGPSVQ YI........ ........AD NQLYNSIITA HALLMIFFMV
cox1_mouse RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_rat RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_human RAELGQPGNL LG........ .........N DHIYNVIVTA HAFVMIFFMV
cox1_neucr RMELSGPGVQ YI........ ........AD NQLYNAIITA HAILMIFFMV
cox1_emeni RLELSGPGVQ YI........ ........AD NQLYNSIITA HAIMMIFFMV
cox1_xenla RAELSQPGTL LG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_bovin RAELGQPGTL LG........ .........D DQIYNVVVTA HAFVMIFFMV
cox1_balph RAELGQPGTL IG........ .........D DQVYNVLVTA HAFVMIFFMV
cox1_balmu RAELGQPGTL IG........ .........D DQVYNVLVTA HAFVMIFFMV
cox1_chlre RYELALPGRG LL........ .......DGN GQLYNVIITG HGIIMLLFMV
cox1_didma RAELGQPGTL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_drome RAELGHPGAL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_droya RAELGHPGAL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_halgr RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_phovi RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_anoqu RAELGHPGAF IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_anoga RAELGHPGAF IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_cotja RAELGQPGTL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_schpo RMELSAPGSQ FL........ .......SGN GQLYNVAISA HGILMIFFFI
cox1_apime RMELSSPGSW IS........ .........N DQIYNTIVTS HAFLMIFFMV
cox1_ascsu RLELAKPGLL LG........ .........S GQLYNSVITA HAILMIFFMV
cox1_caeel RLELAKPGFF LS........ .........N GQLYNSVITA HAILMIFFMV
cox1_thep3 RIQLAKPNND FLV....... .......... GGLYNEVLTM HGTTMIFLAA
cox1_bacfi RIQLMYPEMN FL........ .........S GQTFNEFITM HGTIMLFLAA
cox1_syny3 RTELATPSPD FV........ .........Q PEMYNQLMTL HGTIMIFLWI
cox1_bacsu RIQLAKPENA FL........ .........S AQAYNEVMTM HGTTMIFLAA
qox1_bacsu RAQLALPNNS FL........ .........D SNHYNEIFTT HGTIMIIFMA
cox1_leita RLELSLIGCG IL........ .......FGD YQFYNVLITS HGLIMVFAFI
cyob_ecoli RSQQALASAG EAGFLP.... .......... PHHYDQIFTA HGVIMIFFVA
qoxm_sulac RDQLSFNNL. .......... .........N AVDYYDAVTL HGIFMIFFVV
cox1_halha RTELATSSET LI........ .........S PSLYNGLLTS HGITMLFLFG
cox1_trybb RLELSLIGCG VL........ .......FGD YQFYNVLITS HGLIMVFAFI
cox1_parte RLEMAYPGSP FF........ .......KGD SIKYLQVATA HGLIMVFFVV
cox1_tetpy RMELAHPESP FFKGDSLR.. .......... ...YLQVVTA HGLIMVFFVV
101 150
t2_12833 IPALFGGFGN YFMPLHIGAP DMAFPRLNNL SYWLYVCGVS LAIASLLSPG
cox1_parde IPALFGGFGN YFMPLHIGAP DMAFPRLNNL SYWLYVCGVS LAIASLLSPG
cox1_rhosh IPALFGGFGN YFMPLHIGAP DMAFPRMNNL SYWLYVAGTS LAVASLFAPG
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja MPAMIGGFGN WFVPLMIGAP DMAFPRMNNI SFWLLPASFG LLLMSTFVEG
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi MPALMGGFGN WFLPILIGAP DMAFPRLNNI SFWLLPPSLL LLVSSALVEV
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo MPAMIGGFGN WFVPILIGSP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_maize MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_orysa MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_wheat MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_sorbi MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_betvu MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_soybn MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_oenbe MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_pea MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_parli MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFI LLLASAGVES
cox1_crola MPILIGGFGN WLVPLMIGAP HMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_cypca MPILIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_strpu MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFI LLLASAGVEN
cox1_pisoc MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFL LLLASAGVES
cox1_chick MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSTVEA
cox1_triru MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLIPSLL LFVFASIIEN
cox1_yeast MPALIGGFGN YLLPLMIGAT DTAFPRINNI AFWVLPMGLV CLVTSTLVES
cox1_podan MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLPPSLI LLVFSACIEG
cox1_mouse MPMMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_rat MPMMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_human MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSLL LLLASAMVEA
cox1_neucr MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLPPSLL LLVFSACIEG
cox1_emeni MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLVPSLL LFVFSATIEN
cox1_xenla MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_bovin MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_balph MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLMASSMIEA
cox1_balmu MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLMASSMIEA
cox1_chlre MPALFGGFGN WLLPIMIGAP DMAFPRLNNI SFWLNPPALA LLLLSTLVEQ
cox1_didma MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSTIEA
cox1_drome MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWLLPPALS LLLVSSMVEN
cox1_droya MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWLLPPALS LLLVSSMVEN
cox1_halgr MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_phovi MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_anoqu MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWMLPPSLT LLISSSMVEN
cox1_anoga MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWMLPPSLT LLISSSMVEN
cox1_cotja MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM S......... ..........
cox1_schpo IPALFGAFGN YLVPLMIGAP DVAYPRVNNF TFWLLPPALM LLLISALTEE
cox1_apime MPFLIGGFGN WLIPLMLGSP DMAFPRMNNI SFWLLPPSLF MLLLSNLFYP
cox1_ascsu MPTMIGGFGN WMLPLMLGAP DMSFPRLNNL SFWLLPTAMF LILDACFVDM
cox1_caeel MPTMIGGFGN WLLPLMLGAP DMSFPRLNNL SFWLLPTSML LILDACFVDM
cox1_thep3 MPLVFA.FMN AVVPLQIGAR DVAFPFLNAL GFWMFFFGGL FLNCSWFLGG
cox1_bacfi TPLLFA.FMN YVIPLQIGAR DVAFPFVNAL GFWIFFFGGL LLSLSWFFGG
cox1_syny3 VPA.GAAFAN YLIPLMVGTE DMAFPRLNAV AFWLTPPGGI LLISSFFVGA
cox1_bacsu MPLLFA.LMN AVVPLQIGAR DVSFPFLNAL GFWLFFFGGI FLNLSWFLGG
qox1_bacsu MP.FLIGLIN VVVPLQIGAR DVAFPYLNNL SFWTFFVGAM LFNISFVIGG
cox1_leita MPVMMGGLVN YFIPVMAGFP DMVFPRLNNM SFWMYLAGFG CVVNGFLTEE
cyob_ecoli MP.FVIGLMN LVVPLQIGAR DVAFPFLNNL SFWFTVVGVI LVNVSLGVGE
qoxm_sulac MP.LSTGFAN YLVPRMIGAH DLYWPKINAL SFWMLVPAVI LAAISPLLGA
cox1_halha TP.MIAAFGN YFIPLLIDAD DMAFPRINAI AFWLLPPGAI LIWSGFLIPG
cox1_trybb MPITMGGFTN YFAPVMVGFP DMVFPRINNM SFWMFIGGFG CLVSGFLTEE
cox1_parte VPIFFGGFAN FLIPYHVGSK DVAFPRLNSI GFWIQPLGFL LVAKIAFLRT
cox1_tetpy VPILFGGFAN FLIPYHVGSK DVAYPRLNSI GFWIQPCGYI LLAKIGFLRP
151 200
t2_12833 GSDQPGAGVG WVLYPPLSTT EAGYAMDLAI FAVHVSGATS ILGAINIITT
cox1_parde GSDQPGAGVG WVLYPPLSTT EAGYAMDLAI FAVHVSGATS ILGAINIITT
cox1_rhosh GNGQLGSGIG WVLYPPLSTS ESGYSTDLAI FAVHLSGASS ILGAINMITT
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja EPGANGVGAG WTMYVPLSSS gpGPAVDFAI LSLHLAGASS ILGAINFITT
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi GA.....GTG WTVYPPLASI asGGSVDLAI FSLHLAGVSS ILGAINFICT
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo GC.....GSG WTVYPPLSGI tsGGSVDLAI FSLHLSGVSS ILGSINFITT
cox1_maize GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_orysa GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_wheat GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGISS ILGSINFITT
cox1_sorbi GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_betvu GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_soybn GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFITT
cox1_oenbe GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFITT
cox1_pea GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFLTT
cox1_parli GA.....GTG WTIYPPLSSn hAGGSVDLAI FSLHLAGASS ILASINFITT
cox1_crola GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_cypca GA.....GTG WSVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_strpu GA.....GTG WTIYPPLSSn hAGSSVDLAI FSLHLAGASS ILGLINFITT
cox1_pisoc GT.....GTG WTIYPPLSSg hAGGSVDLAI FSLHLAGASS ILASINFITT
cox1_chick GA.....GTG WTVYPPLAGn hAGASVDLAI FH.YLAGVSS ILGAINFITT
cox1_triru GA.....GTG WTLYPPLASI qsGPSVDLAI FGLHLSGISS LLGAMNFITT
cox1_yeast GA.....GTG WTVYPPLSSI qsGPSVDLAI FALHLTSISS LLGAINFIVT
cox1_podan GA.....GTG WTIYPPLSGV qsGPSVDLAI FALHLSGVSS LLGAMNFITT
cox1_mouse GA.....GTG WTVYPPLAGN paGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_rat GA.....GTG WTVYPPLAGn hAGVSVDLTI FSLHLAGVSS ILGAINFITT
cox1_human GA.....GTG WTVYPPLAGN ypGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_neucr GA.....GTG WTIYPPLSGV qsGPSVDLAI FALHLSGVSS LLGSINFITT
cox1_emeni GA.....GTG WTLYPPLSGI qsGPSVDLAI FGLHLSGISS MLGAMNFITT
cox1_xenla GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGISS ILGAINFITT
cox1_bovin GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_balph GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_balmu GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_chlre GP.....GTG WTAYPPLSVQ HSGTSVDLAI LSLHLNGLSS ILGAVNMLVT
cox1_didma GA.....GTG WTVYPPLAGn hAGASVDLAI FSLHLAGISS ILGAINFITT
cox1_drome GA.....GTG WTVYPPLSAg hGGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_droya GA.....GTG WTVYPPLSSg hGGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_halgr GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_phovi GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_anoqu GA.....GTG WTVYPPLSSg hAGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_anoga GA.....GTG WTVYPPLSSg hAGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo GP.....GGG WTVYPPLSSI tsGPAIDLAI LSLQLTGISS TLGSVNLIAT
cox1_apime SP.....GTG WTVYPPLSAy hSSPSVDFAI FSLHMSGISS IMGSLNLMVT
cox1_ascsu GC.....GTS WTVYPPLSTM gpGGSVDLAI FSLHCAGVSS ILGAINFMTT
cox1_caeel GC.....GTS WTVYPPLSTM gpGSSVDLAI FSLHAAGLSS ILGGINFMCT
cox1_thep3 APD.....AG WTSYASLSLD saHHGIDFYT LGLQISGFGT IMGAINFLVT
cox1_bacfi GPD.....AG WTAYVPLSSR dgGLGIDFYV LGLQVSGIGT LISAINFLVT
cox1_syny3 PQA......G WTSYPPLSLL SGKWGEELWI LSLLLVGTSS ILGAINFVTT
cox1_bacsu APD.....AG WTSYASLSLH SKGHGIDFSI LGLQISGLGT LIAGINFLAT
qox1_bacsu SPN.....AG WTSYMPLASN dpGPGENYYL LGLQIAGIGT LMTGINFMVT
cox1_leita GM.....GVG WTLYPTLICI dsSLACDFVM FAVHLLGISS ILNSINLLGT
cyob_ecoli FAQ.....TG WLAYPPLSGI epGVGVDYWI WSLQLSGIGT TLTGINFFVT
qoxm_sulac VD......LG WYMYAPLSVE tyGLGTNLIQ IALILSGLSS TLTGVNFVMT
cox1_halha IAT...AQTS WTMYTPLSLQ MSSPAVDMMM LGLHLTGVSA TMGAINFIAT
cox1_trybb GM.....GVG WTLYPTLICI dsSLACDFII FSVHFLGISS ILNSINVVGT
cox1_parte TSWkaAVTAG WTFITPFSSn sGFGAQDVLS VAVVLAGIST TISLLTLITR
cox1_tetpy QFWrtLTTAG WTFITPFSSn tGVGSQDILI LSVVFAGIST TISFTNLLIT
201 250
t2_12833 FLNMRAPGMT LFKVPLFAWA VFITAWMILL SLPVLAGGIT MLLMDRNFGT
cox1_parde FLNMRAPGMT LFKVPLFAWA VFITAWMILL SLPVLAGGIT MLLMDRNFGT
cox1_rhosh FLNMRAPGMT MHKVPLFAWS IFVTAWLILL ALPVLAGAIT MLLTDRNFGT
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja IFNMRAPGMT LHKMPLFVWS ILVTVFLLLL SLPVLAGAIT MLLTDRNFGT
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VFNMRAPGMS ML.DLLFVWA VFITAWLLLL CLPVLAGGIT MLLTDRNFNT
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo IFNMRAPGLT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_maize IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_orysa IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_wheat IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_sorbi IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_betvu IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNR
cox1_soybn ISNMRGPGMT MHRSPLFVWS VPVTAFPLLL SLPVLAGAIT MLLTDRNFNT
cox1_oenbe ISNMRGLGMT MHRSPLFVWS VLATAFPILL SLPVLAGAIT MLLTDRNFNT
cox1_pea ISNMRGPGMT MHRSPLFVWS VPVTAFPLLL SLPVLAGAIT MLLTDRNFNT
cox1_parli IINMRTPGMS FDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_crola TINMKPPALS QYQTPLFVWA VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_cypca TINMKPPAIS QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_strpu IINMRTPGMS LDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_pisoc IINMRTPGMS FDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_chick IINMKPPALS QYQTPLFVWS VLITAILLLL SLPVLAAGIT MLLTDRNLNT
cox1_triru IINMRSPGIR LHKLALFGWA VLITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_yeast TLNMRTNGMT MHKLPLFVWS IFITAFLLLL SLPVLSAGIT MLLLDRNFNT
cox1_podan IMNMRTPSIR LHKLALFGWA VIITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_mouse IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_rat IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_human IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_neucr IVNMRTPGIR LHKLALFGWA VVITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_emeni ILNMRSPGIR LHKLALFGWA VIITAVLLLL SLPVLAGGIT MVLTDRNFNT
cox1_xenla TINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_bovin IINMKPPAMS QYQTPLFVWS VMITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_balph IINMKPPAMT QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_balmu IINMKPPAMT QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_chlre VAGLRAPGMK LLHMPLFVWA IALTAVLVIL AVPVLAAALV MLLTDRNINT
cox1_didma IINMKPPAMS QYQTPLFVWS VMITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_drome VINMRSTGIS LDRMPLFVWS VVITALLLLL SLPVLAGAIT MLLTDRNLNT
cox1_droya VINMRSTGIT LDRMPLFVWS VVITALLLLL SLPVLAGAIT MLLTDRNLNT
cox1_halgr IINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_phovi IINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_anoqu VINMRAPGIT LDRMPLFVWS VVITAVLLLL SLPVLAGAIT MLLTDRNLNT
cox1_anoga VINMRSPGIT LDRMPLFVWS VVITAVLLLL SLPVLAGAIT MLLTDRNLNT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo MINMRAPGLS LYQMPLFAWA IMITSILLLL TLPVLAGGLF MLFSDRNLNT
cox1_apime IMMMKNFSMN YDQISLFPWS VFITAILLIM SLPVLAGAIT MLLFDRNFNT
cox1_ascsu TKNLRSSSIS LEHMSLFVWT VFVTVFLLVL SLPVLAGAIT MLLTDRNLNT
cox1_caeel TKNLRSSSIS LEHMTLFVWT VFVTVFLLVL SLPVLAGAIT MLLTDRNLNT
cox1_thep3 IINMRAPGMT FMRMPMFTWA TFVTSALILF AFPPLTVGLI FMMMDRLFGG
cox1_bacfi IVNMRAPGMT MMRLPLFVWT SFISSTLILF AFTPLAAGLA LLMLDRLFEA
cox1_syny3 ILKMRIKDMD LHVCPCFAGA MLATSSLILL STPLLASALI LLSFDLIAGT
cox1_bacsu IINMRAPGMT YMRLPLFTWT TFVASALILF AFPPLTVGLA LMMLDRLFGT
qox1_bacsu ILKMRTKGMT LMRMPMFTWT TLITMVIIVF AFPVLTVALA LLSFDRLFGA
cox1_leita LFCCRRKFFS FLSWSLFIWA ALITAILLII TLPVLAGGVT LILCDRNFNT
cyob_ecoli ILKMRAPGMT MFKMPVFTWA SLCANVLIIA SFPILTVTVA LLTLDRYLGT
qoxm_sulac ITKMK..KVP YLKMPLFVWG FFTTAILMII AMPSLTAGLV FAYLERLWGT
cox1_halha IFTERGEDVG WPDLDIFSWT MLTQSGLILF AFPLFGSALI MLLLDRNFGT
cox1_trybb IFCCRRKYFS FLIWTLFIWG ALLTSILLII TLPVLAGGVT LLLCDRNFNT
cox1_parte .RTLVAPGLR NRRvpFITIS LLLTLRLLAI VTPILGAAVL MSLMDRHWQT
cox1_tetpy RRTLAMPGMR HRRvpFVTIS IFLTLRMLAT ITPVLGAAVI MMAFDRHWQT
251 300
t2_12833 QFFDPAGGGD PVLYQHILWF FGHPEVYMLI LPGFGIISHV ISTFARKPIF
cox1_parde QFFDPAGGGD PVLYQHILWF FGHPEVYMLI LPGFGIISHV ISTFARKPIF
cox1_rhosh TFFQPSGGGD PVLYQHILWF FGHPEVYIIV LPAFGIVSHV IATFAKKPIF
cox1_scapl .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_gomva .......... .........F FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_polsp .......... .....HLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_lepsp .......... .....HLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_megat .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_lepoc .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_pomni .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_saltr .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_geosd .......... ...YEHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_braja TFFAPDGGGD PVLFQHLFWF FGHPEVYILI LPGFGMISQI VSTFSRKPVF
cox1_panbu .......... ........WF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_prowi SFFDPAGGGD PILYQHLFWF FGHPEVYILI IPGFGIISHV IATFSKKPIF
cox1_polsx .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYSGKnpF
cox1_amica .......... ....QHLFWF FGHPEVYILI LPGFGMVSHI VAYYakKEPF
cox1_marpo TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_maize TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_orysa TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_wheat TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_sorbi TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_betvu PFLIR.WGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSGKPVF
cox1_soybn TFSDPAGGGD PILYQHLFRF FGHPEVYIPI LPGSGIISHI VSTFSGKPVF
cox1_oenbe TFSDPAGGGD PILYQHLFRF FGHPEVYILI LPGSGIISHI VSTFSGKPVF
cox1_pea TFSDPAGGGD PILYQHLFRF FGHPEVYIPI LPGSGIISHI VSTFSGKPVF
cox1_parli TFFDPAGGGD PILFQHLFWF FGHPEVYILI LPGFGMISHV IAHYSGKrpF
cox1_crola TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VAYYakKEPF
cox1_cypca TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VAYYskKEPF
cox1_strpu TFFDPAGGGD PILFQHLFWL FGHPEVYILI LPGFGMISHV IAHYSGKrpF
cox1_pisoc TFFDPAGGGD PILFQHLFWF FGHPEVYILI LPGFGMISHV IAHYAGKnpF
cox1_chick TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHV VAYYakKEPF
cox1_triru SFFELAGGGD PIFIQHLFWF FGHPEVYILI VPGFGIISTV ISANSSKNVF
cox1_yeast SFFEVAGGGD PILYEHLFWF FGHPEVYILI IPGFGIISHV VSTYSKKPVF
cox1_podan SFFETAGGGD PILFQHLFWF FGHPEVYILI IPAFGIISTT ISAYSNKSVF
cox1_mouse TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VTYYskKEPF
cox1_rat TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VTYYskKEPF
cox1_human TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_neucr SFFETAGGGD PILFQHLFWF FGHPEVYILI IPGFGIISTT ISAYSNKSVF
cox1_emeni SFFEVAGGGD PILFQHLFWF FGHPEVYILI IPGFGIISTV IAAGSGKNVF
cox1_xenla TFFDPAGGGD PVLYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_bovin TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_balph TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_balmu TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_chlre AYFCE..SGD LILYQHLFWF FGHPEVYILI LPAFGIVSQV VSFFSQKPVF
cox1_didma TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_drome SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IseSGKKETF
cox1_droya SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IseSGKKETF
cox1_halgr TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_phovi TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_anoqu SFFDPAGGGE PNLYQHLFWF FGHPEVYILI LPGFGMISHI IteSGKKETF
cox1_anoga SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IteSGKKETF
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo SFYAPEGGGD PVLYQHLFWF FGHPEVYILI MPAFGVVSHI IPSLAHKPIF
cox1_apime SFFDPMGGGD PILYQHLFWF FGHPEVYILI LPGFGLISHI VmeSGKKEIF
cox1_ascsu SFFDPSTGGN PLIYQHLFWF FGHPEVYILI LPAFGIISQS SLYLtkKEVF
cox1_caeel SFFDPSTGGN PLIYQHLFWF FGHPEVYILI LPAFGIVSQS TLYLtkKEVF
cox1_thep3 NFFNPAAGGN TIIWEHLFWV FGHPEVYILV LPAFGIFSEI FATFSRKRLF
cox1_bacfi QYFIPSMGGN VVLWQHIFWI FGHPEVYILV LPAFGIISEV IPAFSRKRLF
cox1_syny3 SFFNRVGGGD PVVYQHLFWF YSHPAVYIMI LPFFGVISEV IPVHARKPIF
cox1_bacsu NFFNPELGGN TVIWEHLFWI FGHPEVYILI LPAFGIFSEV IPVFARKRLF
qox1_bacsu HFFTLEAGGM PMLWANLFWI WGHPEVYIVI LPAFGIFSEI ISSFARKQLF
cox1_leita SFYDVVGGGD LILFQHIFWF FGHPEVYIIL LPVFGLISTI VEVIGFRCVF
cyob_ecoli HFFTNDMGGN MMMYINLIWA WGHPEVYILI LPVFGVFSEI AATFSRKRLF
qoxm_sulac PFFDSALGGS PVLWQQLFWF FGHPEVYILI LPAMGLVSEL LPKMARREIF
cox1_halha TFFTVA.GGD PIFWQHLFWF FGHPEVYVLV LPPMGIVSLI LPKFSGRKLF
cox1_trybb SFYDVVGGGD LVLFQHLFWF FGHPEVYIII LPVFGLVSTI IEVTSFRCVF
cox1_parte SFFDFAYGGD PILFQHLFWF FGHPEVYILI IPSFGVANIV LPFYTMRRMS
cox1_tetpy TFFEYAYGGD PILSQHLFWF FGHPEVYVLI IPTFGFINMI VPHNNTRRVA
301 350
t2_12833 GYLPMVLAMA AIAFLGFIVW AHHMYTAGMS LTQQTYFQMA TMTIAVPTGI
cox1_parde GYLPMVLAMA AIAFLGFIVW AHHMYTAGMS LTQQTYFQMA TMTIAVPTGI
cox1_rhosh GYLPMVYAMV AIGVLGFVVW AHHMYTAGLS LTQQSYFMMA TMVIAVPTGI
cox1_scapl GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_gomva GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_polsp GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_lepsp GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_megat GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_lepoc GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_pomni GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_saltr GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_geosd GCMGMIWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMVIAIPTGV
cox1_braja GYLGMAYAMV AIGGIGFVVW AHHMYTVGMS SATQAYFVAA TMVIAVPTGV
cox1_panbu GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_prowi GYLGMVYAMC SIGILGFIVW AHHMYVVGLD IDTRAYFTAA TMIIAVPTGI
cox1_polsx GYMGMVWAMM AIGLLGFIVW AHHMYTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_amica GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMVIAIPTGV
cox1_marpo GYLGMVYAMI SIGVLGFIVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_maize GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_orysa GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_wheat GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_sorbi GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_betvu GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_soybn GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_oenbe GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGV
cox1_pea GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_parli GYLGMVYAMI AIGVLGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGI
cox1_crola GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_cypca GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_strpu GYLGLVYAMI AIGVLGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGL
cox1_pisoc GYLGMVYAII SIGILGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGI
cox1_chick GYMGMVWAML SIGFLGFIVW AHHMFTVRMD VDTRAYFTSA TMIIAIPTGI
cox1_triru GYLGMVYAMM SIGILGFVFW SHHMYTVGLD VDTRAYFIAA TLIIAVPTGI
cox1_yeast GEISMVYAMA SIGLLGFLVW SHHMYIVGLD ADTRAYFTSA TMIIAIPTGI
cox1_podan GYIGMVYAMM SIGILGFIVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_mouse GYMGMVWAMM SIGFLGFIVW AHHMFTVGLD VDTRACFTSA TMIIAIPTGV
cox1_rat GYMGMVWTMM SIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAIPTGV
cox1_human GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_neucr GYIGMVYAMM SIGILGFIVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_emeni GYLGMVYAMM SIGVLGFLVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_xenla GYMGMVWAMM SIGLLGFIVW AHHMFTVDLN VDTRAYFTSA TMIIAIPTGV
cox1_bovin GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_balph GYMGMVWAMV SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_balmu GYMGMIWAMV SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_chlre GLTGMICAMG AISLLGFIVW AHHMFTVGLD LDTVAYFTSA TMIIAVPTGM
cox1_didma GYMGMVWAMM SIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAIPTGV
cox1_drome GSLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_droya GSLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_halgr GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_phovi GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_anoqu GNLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_anoga GNLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo GKEGMLWAML SIALLGLMVW SHHLFTVGLD VDTRAYFSAA TMVIAIPTGI
cox1_apime GNLSMIYAML GIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAVPTGI
cox1_ascsu GSLGMVYAIL SIGLIGCVVW AHHMYTVGMD LDSRAYFTAA TMVIAVPTGV
cox1_caeel GALGMVYAIL SIGLIGCVVW AHHMYTVGMD LDSRAYFSAA TMVIAVPTGV
cox1_thep3 GYSSMVFATV LIAFLGFMVW AHHMFTVGMG PIANAIFAVA TMTIAVPTGV
cox1_bacfi GYTAMVFATM IIAFLGFMVW AHHMFTVGMG PVANSIFAVA TMTIAVPTGI
cox1_syny3 GYRAIAYSSL AISFLGLIVW AHHMFTHGTP GWLRMFFMAT TMLIAVPTGI
cox1_bacsu GYSSMVFAI. VLGFLGFMVW VHHMFTTGLG PIANAIFAVA TMAIAIPTGI
qox1_bacsu GYKAMVGSII AISVLSFLVW THHFFTMGNS ASVNSFFSIT TMAISIPTGV
cox1_leita STVAMIYSMI LIAILGMFVW AHHMFVVGMD VDSRAYFGGV SILIGLPTCV
cyob_ecoli GYTSLVWATV CITVLSFIVW LHHFFTMGAG ANVNAFFGIT TMIIAIPTGV
qoxm_sulac GYTAIALSSI AIAFLSAlvW MHHMFTAIDN TLVQIVSSAT TMAIAIPSGV
cox1_halha GFKFVVYSTL AIGVLSFGVW AHHMFTTGID PRIRSSFMAV SLAISIPSAV
cox1_trybb SSVAMIYSML LISVLGMFVW AHHMFVVGMD VDSRAYFGSI TVLIGLPTCI
cox1_parte SKHHMIWAVY VMAYMGFVVW GHHMYLVGLD HRSRNIYSTI TIMICLPATI
cox1_tetpy SKHHMIWAIY VMAYMGYLVW GHHMYLVGLD HRSRTMYSTI TIMISMPATI
351 400
t2_12833 KVFSWIATMW GGSIEFKTPM LWALAFLFTV GGVTGVVIAQ GSLDRVYHDT
cox1_parde KVFSWIATMW GGSIEFKTPM LWALAFLFTV GGVTGVVIAQ GSLDRVYHDT
cox1_rhosh KIFSWIATMW GGSIELKTPM LWALgfLFTV GGVTGIVLSQ ASVDRYYHDT
cox1_scapl KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_gomva KVFSWLATLH GGSIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_polsp KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_lepsp KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_megat KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSIDIVLHDT
cox1_lepoc KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_pomni KVFSWLATLH GASIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_saltr KVFSWLATLH GGSIKWETPL LWAL...... .......... ..........
cox1_geosd KVFSWLATLH GGSLKWETXX XXALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_braja KIFSWIATMW GGSIEFRAPM IWAVgfLFTV GGVTGVVLAN AGVDRVLQET
cox1_panbu KVFSWLATLH GGSIKWDTPM LWALgfLFTV GGLTGIILAN SSLDIVLHDT
cox1_prowi KIFSWVATMW GGSIELRTPM LFAVgfLFTV GGLTGVVLAN SGLDVAFHDT
cox1_polsx KVFSWLATLH GGAIKWETPM LWALgfLFTV GGLTGIILAN SSLDIMLHDT
cox1_amica KVFSWLATLH GGAIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_marpo KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLTGIVLAN SGVDIALHDT
cox1_maize KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_orysa KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_wheat KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_sorbi KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_betvu KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLTGIVLAN SGLDIALHDT
cox1_soybn KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_oenbe KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLAGIVPAN SGLDIALHDT
cox1_pea KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVPAN SGLDIALHDT
cox1_parli KVFSWMATLQ GSNLQWETPL LWALgfLFTL GGLTGIVLAN SSIDVVLHDT
cox1_crola KVFSWLATLH GGTIKWDTPM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_cypca KVFSWLATLH GGSIKWETPM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_strpu KVFSWMAKLQ GSNLQWSLPL LWTLgfLFTL GGLTGIVLAN SSIDFVLHDT
cox1_pisoc KVFSWMATLQ GSNLRWDTPL LWALgfLFTI GGLTGVVLAN SSIDIILHDT
cox1_chick KVFSWLATLH GGTIKWDPPM LWALgfLFTI GGLTGIVLAN SSLDIALHDT
cox1_triru KIFSWLATCY GGSLNLTPAM LFALGfmFTI GGLSGVVLAN ASLDIAFHDT
cox1_yeast KIFSWLATIY GGSIRLATPM LYAIafLFTM GGLTGVALAN ASLDVAFHDT
cox1_podan KIFSWLATCY GGSIRLTPSM LFALgfMFTI GGLSGVVLAN ASLDIAFHDT
cox1_mouse KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_rat KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_human KVFSWLATLH GSNMKWSAAV LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_neucr KIFSWLATCY GGSIRLTPSM LFALgfMFTI GGLSGVVLAN ASLDIAFHDT
cox1_emeni KIFSWLATCY GGSLHLTPPM LFALGflFTI GGLSGVVLAN ASLDVAFHDT
cox1_xenla KVFSWLATMH GGTIKWDAPM LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_bovin KVFSWLATLH GGNIKWSPAM MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_balph KVFSWLATLH GGNIKWSPAL MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_balmu KVFSWLATLH GGNIKWSPAL MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_chlre KIFSWMATIY SGRVWFTTPM WFAVGflFTL GGVTGVVLAN AGVDMLVHDT
cox1_didma KVFSWLATLH GGNIKWSPAM LWALgfLFTI GGLTGIVLAN SSLDIVLHDT
cox1_drome KIFSWLATLH GTQLSYSPAI LWALgfLFTV GGLTGVVLAN SSVDIILHDT
cox1_droya KIFSWLATLH GTQLSYSPAI LWALgfLFTV GGLTGVVLAN SSVDIILHDT
cox1_halgr KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_phovi KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_anoqu KIFSWLATMH GTQLTYSPAM LWAFgfLFTV GGLTGVVLAN SSIDIVLHDT
cox1_anoga KIFSWLATLH GTQLTYSPAM LWAFgfLFTV GGLTGVVLAN SSIDIVLHDT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo KIFSWLATLT GGAIQWsvPM LYAIGflFTI GGLTGVILSN SVLDIAFHDT
cox1_apime KVFSWLATYH GSKLKLNISI LWSLGflFTI GGLTGIMLSN SSIDIILHDT
cox1_ascsu KVFSWLATLF GMKMVFQPLL LWVMgfLFTI GGLTGVMLSN SSLDIILHDT
cox1_caeel KVFSWLATLF GMKMVFNPLL LWVLgfLFTL GGLTGVVLSN SSLDIILHDT
cox1_thep3 KIFNWLFTMW GGSIKFTTPM HYAVAfsFVM GGVTGVMLAS AAADYQYHDS
cox1_bacfi KIFNWLFTMW GGKITFNTAM LFASSftFVL GGVTGVMLAM APVDYLYHDT
cox1_syny3 QNFQLVRYLW GGKIQLNSAM LFAFGFlfMI GGLTGVMVAS VPFDIHVHDT
cox1_bacsu KIFNWLLTIW GGNVKYTTAM LYAVSfsFVL GGVTGVMLAA AAADYQFHDT
qox1_bacsu KIFNWLFTMY KGRISFTTPM LWALAFifVI GGVTGVMLAM AAADYQYHNT
cox1_leita KLFNWIYSFl dMIITFEVYF VIMFIFMFLI GAVTGLFLSN VGIDIMLHDT
cyob_ecoli KIFNWLFTMY QGRIVFHSAM LWTIGftFSV GGMTGVLLAV PGADFVLHNS
qoxm_sulac KVLNWTATLY GGEIRYKTpl LISFIVMFLL GGITGVFFPL VPIDYALNGT
cox1_halha KVFNWITTMW NGKLRLTAPM LFCIGFvfII GGVTGVFLAV IPIDLILHDT
cox1_trybb KLFNWIYSFl dMCICFEIYF IYMFILMFLA GGLTGLFLSN VGIDILMHDT
cox1_parte KLVNWTLTLA NAAIHVDLVF LFFCsfFFLT GGFTGMWLSH VGLNISVHDT
cox1_tetpy KVVNWTLSLV NGALKVDLPF LFSMSflFLV AGFTGMWLSH VSLNVSMHDT
401 450
t2_12833 YYIVAHFHYV MSLGALFAIF AGTYYWIGKM SGRQYPEWAG QLHFWMMFIG
cox1_parde YYIVAHFHYV MSLGALFAIF AGTYYWIGKM SGRQYPEWAG QLHFWMMFIG
cox1_rhosh YYVVAHFHYV MSLGAVFGIF AGSTSGIGKM SGRQYPEWAG KLHFWMMFVG
cox1_scapl YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_gomva YYVVAHFHYV LSMGAVFAIV A......... .......... ..........
cox1_polsp YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_lepsp YYVVAHFHYV LSMGAVFAI. .......... .......... ..........
cox1_megat YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_lepoc YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_pomni YYVVAHFHYV LSMGAVFAIV A......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd YYVVAHFHYV LS........ .......... .......... ..........
cox1_braja YYVVAHFHYV LSLGAVFAIF AGWYYWFPKM TGYMYNETLA KAHFWVTFIG
cox1_panbu YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHNTWT KIHFGVMFM.
cox1_prowi YYVVAHFHYV LSMGAVFALF SGFYYWIGKI TGLQYPETLG QIHFWLMFLG
cox1_polsx YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHSTWT KIHFGVMF..
cox1_amica YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHPTWS KIHFGVMFV.
cox1_marpo YYVVAHFHYV LSMGAVFALF AGFYYWIGKI TGLQYPETLG QIHFWITFFG
cox1_maize YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_orysa YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_wheat YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_sorbi YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_betvu YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_soybn YYVVAHFHYV LSMGAVFALF AGFHYWVGKI FGRTYPETLG QIHFWITFFG
cox1_oenbe YYAGAHFHYV LSMGAVFALF AGFRYWVGKI FGRTYPETLG QIHFWITFFG
cox1_pea YYVVAHFHYV LSMGAVFALF AGFHYWVGKI FGRTYPETLG KIHFWITFFG
cox1_parli YYVVAHFHYV LSMGAVFAIF AGFTHWFPLF CGYNLHPLWG KAHFFMMFVG
cox1_crola YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF TGFSLHDTWT KIHFGVMFIG
cox1_cypca YYVVAHFHYV LSMGAVFAIM AAFVHWFPLL TGYTLHSTWT KIHFGVMFIG
cox1_strpu YYVVAHFHYV LSMGAVFAIF AGFTHWFPLF SGYSLHPLWG KVHFFIMFVG
cox1_pisoc HYVVAHFHYV LSMGAVFAIF AGFTHWFPLF SGVSLHPLWS KVHFAVMFIG
cox1_chick YYVVAHFHYV LSMGAVFAIL AGFTHWFPLF TGFTLHPSWT KAHFGVMFTG
cox1_triru YYVVAHFHYV LSMGAVFALF SGWYFWIPKL LGLSYDLFAG KVHFWILFVG
cox1_yeast YYVVGHFHYV LSMGAIFSLF AGYYYWSPQI LGLNYNEKLA QIQFWLIFIG
cox1_podan YYVVAHFHYV LSMGAVFAMF SGWYFWIPKM LGLNYNMTLS KVQFWILFIG
cox1_mouse YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF SGFTLDDTWA KAHFAIMFVG
cox1_rat YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF SGYTLNDTWA KAHFAIMFVG
cox1_human YYVVAHFHYV LSMGAVFAIM GGFIHWFPLF SGYTLDQTYA KIHFTIMFIG
cox1_neucr YYVVAHFHYV LSMGAVFAMF SGWYHWVPKI LGLNYNMVLS KAQFWLLFIG
cox1_emeni YYVVAHFHYV LSMGAVFALF SGWYLWIPKL LGLSYDQFAA KVHFWILFIG
cox1_xenla YYVVAHFHYV LSMGAVFAIM GGFIHWFPLF TGYTLHETWA KIHFGVMFAG
cox1_bovin YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNDTWA KIHFAIMFVG
cox1_balph YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNTTWA KIHFMIMFVG
cox1_balmu YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNTTWA KIHFLIMFVG
cox1_chlre YYVVAHFHYV LSMGAVFGIF AGVYFWGNLI TGLGYHEGRA MVHFWLLFIG
cox1_didma YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF TGYMLNDMWA KIHFFIMFVG
cox1_drome YYVVAHFHYV LSMGAVFAIM AGFIHWYPLF TGLTLNNKWL KSHFIIMFIG
cox1_droya YYVVAHFHYV LSMGAVFAIM AGFIHWYPLF TGLTLNNKWL KSQFIIMFIG
cox1_halgr YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLDNTWA KIHFTIMFVG
cox1_phovi YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYMLDDTWA KIHFTIMFVG
cox1_anoqu YYVVAHFHYV LSMGAVFAIM AGFIHWYPLL TGLTMNPNWL KLQFAMMFVG
cox1_anoga YYVVAHFHYV LSMGAVFAIM AGFVHWYPLL TGLTMNPTWL KIQFSIMFVG
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo YFVVAHFHYV LSMGALFGL. CGAYYWSPKM FGLMYNETLA SIQFWILFIG
cox1_apime YYVVGHFHYV LSMGAVFAII SSFIHWYPLI TGLLLNIKWL KIQFIMMFIG
cox1_ascsu YYVVSHFHYV LSLGAVFGIF TGVTLWWSFI TGFVYDKMMM SSVFVLMFVG
cox1_caeel YYVVSHFHYV LSLGAVFGIF TGVTLWWSFI TGYVLDKLMM SAVFILLFIG
cox1_thep3 YFVVAHFHYV IVGGVVFALL AGTHYWWPKM FGRMLNETLG KITFWLFFIG
cox1_bacfi YFVVAHFHYI IVGGIVLSLF AGLFYWYPKM FGHMLNETLG KLFFWVFYIG
cox1_syny3 YFVVGHFHYV LFGGSAFALF SGVYHWFPKM TGRMVNEPLG RLHFILTFIG
cox1_bacsu YFVVAHFHYV IIGGVVFGLL AGVHFWWPKM FGKILHETMG KISFVLFFIG
qox1_bacsu YFLVSHFHYV LIAGTVFACF AGFIFWYPKM FGHKLNERIG KWFFWIFMIG
cox1_leita YFVVGHFHYV LSLGAVVGFF TGFIHFLAKW LPIELYLFWM FYFISTLFIG
cyob_ecoli LFLIAHFHNV IIGGVVFGCF AGMTYWWPKA FGFKLNETWG KRAFWFWIIG
qoxm_sulac YFVVGHFHY. MVYAILYALL GALFYYFPFW SGKWYNDDLG KTGAILLVAG
cox1_halha YYVVGHFHFI VYGAIGFALF AASYYWFPMV TGRMYQKRLA HAHFWTALVG
cox1_trybb YFVVAHFHYV LSLGAVVGVF GGFFHFLMKW IPIELHTFWL FFFISTLWFG
cox1_parte FYVVAHFHLM LAGAAMMGAF TGLYYYYNTF FDVQYSKIFG FLHLVYYSAG
cox1_tetpy FYVVAHFHIM LSGAAITGIF SGFYYYFNAL FGIKFSRMFG YMHLIYYSGG
451 500
t2_12833 SNLIFFPQHF LGRQGMPRRY IDYPVEFSYW NNISSIGAYI SFASFLFFIG
cox1_parde SNLIFFPQHF LGRQGMPRRY IDYPVEFSYW NNISSIGAYI SFASFLFFIG
cox1_rhosh ANLTFFPQHF LGRQGMPRRY IDYPEAFATW NFVSSLGAFL SFASFLFFLG
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja VNLVFFPQHF LGLSGMPRRY VDYPDAFAGW NLVSSVGSYI SGFGVLIFLY
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VNITFFPMHF LGLAGMPRRI PDYPDCYAGW NAVASYGSYL SITAVLFFFY
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo VNLTFFPMHF LGLAGMPRRI PDYPDAYAGW NAFSSFGSYV SVVGIFCFFV
cox1_maize VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_orysa VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_wheat VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_sorbi VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_betvu VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGICCFFV
cox1_soybn VNLTLFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_oenbe VNPTFFPMHF LGLSGMPRPI PDYPESYAGW NALSSFGSYI SVVGIRCFFV
cox1_pea VNLTLFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_parli VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSTI SLVAMLFFIF
cox1_crola VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSLI SLVAVIIFLF
cox1_cypca VNLTFFPQHF LGLSAMPRRY SDYPDAYALW NTVSSIGSLI SLVAVIMFLF
cox1_strpu VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTISSIGSTI SVVAMLFFLF
cox1_pisoc VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSTI SLIRTLIFLF
cox1_chick VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTLSSIGSLI SMTAVIMLMF
cox1_triru VNLTLFPQHF LGLQGMPRRI GDYPDAFAGW NLISSFGSIV SVVATWYFLN
cox1_yeast ANVIFFPMHF LGINGMPRRI PDYPDAFAGW NYVASIGSFI ATLSLFLFIY
cox1_podan VNVTFFPQHF LGLQGMPRRI SDYPDAFAGW NLISSFGSII SVVAAWLFLY
cox1_mouse VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVLIMIF
cox1_rat VNMTFFPQHF LGLAGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVLVMIF
cox1_human VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NILSSVGSFI SLTAVMLMIF
cox1_neucr VNLTFFPQHF LGLQGMPRRI SDYPDAFSGW NLISSFGSIV SVVASWLFLY
cox1_emeni VNLTFFPQHF LGLQLMPRRI SDYPDAFYGW NLLSSIGSII SVVATWYFLT
cox1_xenla VNLTFFPQHF LGLSAMPRRY SDYPDAYTLW NTVSSIGSLI SLVAVIMMMF
cox1_bovin VNMTFFPQHF LGLSGMPRRY SDYPDAYTMW NTISSMGSFI SLTAVMLMVF
cox1_balph VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NTISSMGSFI SLTAVMLMIF
cox1_balmu VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NTISSMGSFI SLTAVMLMIF
cox1_chlre VNLTFFPQHF LGLAGMPRRM FDYADCFAGW NAVSSFGASI SFISV.....
cox1_didma VNLTFFPQHF LGLSGMPRRY SDYPDAYTMW NVVSSIGSFI SLTAVILMVF
cox1_drome VNLTFFPQHF LGLAGMPRRY SDYPDAYTTW NIVSTIGSTI SLLGILFFFF
cox1_droya VNLTFFPQHF LGLAGMPRRY SDYPDAYTTW NVVSTIGSTI SLLGILFFFY
cox1_halgr VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVMLMVF
cox1_phovi VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVMLMVF
cox1_anoqu VNLTFFPQHF LGLAGMPRRY SDFPDSYLAW NIVSSLGSTI SLFAILYFLF
cox1_anoga VNLTFFPQHF LGLAGMPRRY SDFPDSYLTW NVVSSLGSTI SLFAILYFLF
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo VNIVFGPQHF LGLNGMPRRI PDYPEAFVGW NFVSSIGSVI SILSLFLFMY
cox1_apime VNLTFFPQHF LGLMSMPRRY SDYPDSYYCW NSISSMGSMI SLNSMIFLIF
cox1_ascsu VNLTFFPLHF AGIHGYPRKY LDYPDVYSVW NIMASYGSMI SVFALFLFIY
cox1_caeel VNLTFFPLHF AGLHGFPRKY LDYPDVYSVW NIIASYGSII STAGLFLFIY
cox1_thep3 FHLTFFIQHF LGLTGMPRRV FTYLpgWETG NLISTIGAfi AAATVILLIN
cox1_bacfi FHLTFFVQHL LGLMGMPRRV YTYLGdlDAF NFISTIGTFF MSAGVILLVI
cox1_syny3 MDLTFMPMHE LGLMGMNRRI ALYDVEFQPL NVLSTIGAYV LAASTIPFVI
cox1_bacsu FHLTFFIQHF VGLMGMPRRV YTFLpgLETG NLISTIGAFF MAARVILLLV
qox1_bacsu FNICFFPQYF LGLQGMPRRI YTYGpgWTTL NFISTVGAFM MGVGFLILCY
cox1_leita SNMLFFPMHS LGMYAFPRRI SDYPVSFLFW SSFMLYGMLL LASLILFLCA
cyob_ecoli FFVAFMPLYA LGFMGMTRRL sqIDPQFHTM LMIAASGAVL IALGILCLVI
qoxm_sulac TFLTATGMSI AGILGMPRRY AVIPSPIypF QFMASVGAVL TGIGLFILAG
cox1_halha SNATFLAMLW LGYGGMPRRY ATYIPQFATA HRLATVGAFL IGVSTLIWLF
cox1_trybb SNMVFFPLHS LGMFAFPRRI SDYPISFLFW SAFTLYGMLL LTFLVIFCCC
cox1_parte IWTTFFPMFF LGFSGLPRRI HDFPAFFLGW HGLASCGHFL TLAGVCFFFF
cox1_tetpy QWVAFVPQFY LGFSGMPRRI HDYPVVFMGW HSMSTAGHFI TLIGIMFFFL
501 550
t2_12833 IVFYTLFAGK PVNVPNYWNE HADTLEWTLP SPPPEHTFET LPKPEDWDRA
cox1_parde IVFYTLFAGK PVNVPNYWNE HADTLEWTLP SPPPEHTFET LPKPEDWDRA
cox1_rhosh VIFYSL.SGA RVTANNYWNE HADTLEWTLT SPPPEHTFEQ LPKREDWERA
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja CVI.DAFAKK VPAGDNPWGA GATTLEWTLP SPPPFHQFEV LPRVQ.....
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VVYKTLTSNe pRNPWETTPG VSPTLEWMLP SPPAFHTFEE I.........
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo VVFLTLTSEN KCAPSPwvEQ NSTTLEWMVP SPPAFHTFEE LPAIKE....
cox1_maize VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPTIKE....
cox1_orysa VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_wheat VVAITSSSGK NQKCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPAVKE....
cox1_sorbi VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPTIKETQGE
cox1_betvu VVTITLSSGK NKRCApwAVE ENSTTLNDVQ SPPAFHTFGE LPAIKE....
cox1_soybn VVTITSSSGN NITRANivEQ NSTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_oenbe VVTITSSSGN NKrsPWAVEK NSTTLEWMVQ SPPAFHTFGE LPATKE....
cox1_pea VVTITSSSGN NITRANivEQ NSTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_parli LIWEAFASQR EGVTPEFAN. ..ASLEWQys FPPSHHTFDE TP........
cox1_crola ILWEAFASKR QVMSVEL... TMTNVEWLHG CPPPYHTFEE ..........
cox1_cypca ILWEAFAAKR EVLSVEL... TATNVEWLHG CPPPYHTYEE ..........
cox1_strpu LIWEAFASQR EGITPEFSH. ..ASLEWQYT spPSHHTFDE TP........
cox1_pisoc LIWEAFSTKR TPIHPEFSS. ..SSLEWQYP spPSHHTFDE TPSA......
cox1_chick IVWEAFSAKR KVLQPEL... TATNIEWIHG CPPPYHTFEE ..........
cox1_triru ILYLQLTQGS PVsfQHLFTR NNSSLEWCLN SPPKPHAFDC LPVQS.....
cox1_yeast ILYDQLVNNK SVI...YAKA PSSSIEFLLT SPPAVHSFNT ..........
cox1_podan IVYLQLVEGE YAglQALLNR SYPSLEWALS SPPKPHAFVS LPLQSNILRS
cox1_mouse MIWEAFASKR EVMSVSY... ASTNLEWLHG CPPPYHTFEE ..........
cox1_rat MIWEAFASKR EVLSISYSS. ..TNLEWLHG CPPPYHTFEE ..........
cox1_human MIWEAFASKR KVLM...VEE PSMNLEWLYG CPPPYHTFEE ..........
cox1_neucr IVYIQLVQGE YAglRALLNR SYPSLEWSIS SPPKPHSFAS LPLQSSSFFL
cox1_emeni IIYKQLTEGK AVsfQVLFTR NNSSLEWCLT SPPKPHAFAS LPLQS.....
cox1_xenla IIWEAFAAKR EVTT...YEL TSTMLEWLQG CPTPYHTLKT ..........
cox1_bovin IIWEAFASKR EVLTVDL... TTTNLEWLNG CPPPYHTFEE ..........
cox1_balph IIWEAFTSKR EVLAVDL... TSTNLEWLNG CPPPYHTFEE ..........
cox1_balmu IIWEAFTSKR EVLAVDL... TYTNLEWLNG CPPPYHTFEE ..........
cox1_chlre IVFATTFQEA VRTVPR.... TATTLEWVLL ATPAHHALSQ VP........
cox1_didma IIWEAFASKR EVLDVEL... TTTNIEWLYG CPPPYHTFE. ..........
cox1_drome IIWESLVSQR QVIYPIQLN. ..SSIEWYQN TPPAEHSYSE LPLLTN....
cox1_droya IIWESLVSQR QVIYPIQLN. ..SSIEWYQN TPPAEHSYSE LPLLTN....
cox1_halgr MIWEAFASKR EVAAVEL... TTTNIEWLHG CPPPYHTFEE ..........
cox1_phovi MIWEAFASKR EVAAVEL... TTTNIEWLHG CPPPYHTFEE ..........
cox1_anoqu IIWESMITQR TPAFPM...Q LSSSIEWYHT LPPAEHTYAE LPLLTN....
cox1_anoga IIWESMITQR TPAFPM...Q LSSSIEWYHT LPPAEHTYAE LPLLTN....
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo VMYDQFTSNR VVkiPSYFDD naQSIEWLLH SPVHEHAFNT LPTKS.....
cox1_apime IILESLISKR ML....LFKF NQSSLEWLNF LPPLDHSHLE IP........
cox1_ascsu VLLESFVGHR IFLFDYYVN. ..SGPEYSLS GYVFGHSYQS ..........
cox1_caeel VLLESFFSYR LVISDYYSN. ..SSPEYCMS NYVFGHSYQS ..........
cox1_thep3 IVVTT...AK GEKVPGDAWG DGRTLEWAIA SPPPVYNFAQ TPLVRGLD..
cox1_bacfi NVIYSAFKGE RVTVADPWD. .ARTLEWATP TPVPEYNFAQ TPQVRSLD..
cox1_syny3 NVFWSLFKGE KAARNPW... RALTLEWQTA SPPIIENFEE EP........
cox1_bacsu NVIWTSVKGE YVGADPWHDG R..TLEWTVS SPPPEYNFKQ LPFVRGLDPL
qox1_bacsu NIYYSFRYST REISGDSWG. VGRTLDWAts AIPPHYNFAV LPEVKSQDAF
cox1_leita LFCVFLFWDY CLFFVSLFVF SLYCFFYFST WLPCVMVLYL L.........
cyob_ecoli QMYVSIRDRD QNRDLTGDPW GGRTLEWATS SPPPFYNFAV VPHVHERDAF
qoxm_sulac VLVHGVFRGR AVNGVDPWDN ISVKLQ.... .......... ..........
cox1_halha NMATSWREGP RVDSTDPWdt DQFTNDWAWF RAKEETTVLP DGGDEAQSEA
cox1_trybb LFNVILFWDY CLFFINLFTY SLSIFFYFYT WVPVCMAIYL L.........
cox1_parte GIFDSTSENK SSILANfyNN YTNEIASELP KVEveNTFGE YE........
cox1_tetpy MIFDSHIERR AATSSTLgnG IPGSTVRLML IDRHFAEFEV FKK.......
554
t2_12833 QAHR
cox1_parde QAHR
cox1_rhosh PAH.
cox1_scapl ....
cox1_gomva ....
cox1_polsp ....
cox1_lepsp ....
cox1_megat ....
cox1_lepoc ....
cox1_pomni ....
cox1_saltr ....
cox1_geosd ....
cox1_braja ....
cox1_panbu ....
cox1_prowi ....
cox1_polsx ....
cox1_amica ....
cox1_marpo ....
cox1_maize ....
cox1_orysa ....
cox1_wheat ....
cox1_sorbi LQTR
cox1_betvu ....
cox1_soybn ....
cox1_oenbe ....
cox1_pea ....
cox1_parli ....
cox1_crola ....
cox1_cypca ....
cox1_strpu ....
cox1_pisoc ....
cox1_chick ....
cox1_triru ....
cox1_yeast ....
cox1_podan ....
cox1_mouse ....
cox1_rat ....
cox1_human ....
cox1_neucr SFFR
cox1_emeni ....
cox1_xenla ....
cox1_bovin ....
cox1_balph ....
cox1_balmu ....
cox1_chlre ....
cox1_didma ....
cox1_drome ....
cox1_droya ....
cox1_halgr ....
cox1_phovi ....
cox1_anoqu ....
cox1_anoga ....
cox1_cotja ....
cox1_schpo ....
cox1_apime ....
cox1_ascsu ....
cox1_caeel ....
cox1_thep3 ....
cox1_bacfi ....
cox1_syny3 ....
cox1_bacsu WIEK
qox1_bacsu LHMK
cox1_leita ....
cyob_ecoli WEMK
qoxm_sulac ....
cox1_halha DA..
cox1_trybb ....
cox1_parte ....
cox1_tetpy ....
****************************************************************************
* *
* *
* PredictProtein@EMBL-Heidelberg.DE *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Prediction of: *
* *
* - secondary structure, by PHDsec *
* - solvent accessibility, by PHDacc *
* - and helical transmembrane regions, by PHDhtm *
* *
* PHD: Profile fed neural network systems from HeiDelberg *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Author: Burkhard Rost *
* EMBL, Heidelberg, FRG *
* Meyerhofstrasse 1, 69 117 Heidelberg *
* Internet: Predict-Help@EMBL-Heidelberg.DE *
* *
* All rights reserved. *
* *
* *
****************************************************************************
* *
* *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* Secondary structure prediction by PHDsec: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Author: Burkhard Rost *
* EMBL, Heidelberg, FRG *
* Meyerhofstrasse 1, 69 117 Heidelberg *
* Internet: Rost@EMBL-Heidelberg.DE *
* *
* All rights reserved. *
* *
* *
****************************************************************************
* *
* About the network method *
* ~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The network procedure is described in detail in: *
* 1) Rost, Burkhard; Sander, Chris: *
* Prediction of protein structure at better than 70% accuracy. *
* J. Mol. Biol., 1993, 232, 584-599. *
* *
* A brief description is given in: *
* Rost, Burkhard; Sander, Chris: *
* Improved prediction of protein secondary structure by use of *
* sequence profiles and neural networks. *
* Proc. Natl. Acad. Sci. U.S.A., 1993, 90, 7558-7562. *
* *
* The PHD mail server is described in: *
* 2) Rost, Burkhard; Sander, Chris; Schneider, Reinhard: *
* PHD - an automatic mail server for protein secondary structure *
* prediction. *
* CABIOS, 1994, 10, 53-60. *
* *
* The latest improvement steps (up to 72%) are explained in: *
* 3) Rost, Burkhard; Sander, Chris: *
* Combining evolutionary information and neural networks to predict *
* protein secondary structure. *
* Proteins, 1994, 19, 55-72. *
* *
* To be quoted for publications of PHD output: *
* Papers 1-3 for the prediction of secondary structure and the *
* prediction server. *
* *
****************************************************************************
* *
* About the input to the network *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The prediction is performed by a system of neural networks. *
* The input is a multiple sequence alignment. It is taken from an HSSP *
* file (produced by the program MaxHom; see *
* Sander, Chris & Schneider, Reinhard: Database of Homology-Derived *
* Structures and the Structural Meaning of Sequence Alignment. *
* Proteins, 1991, 9, 56-68). *
* *
* For optimal results the alignment should contain sequences with varying *
* degrees of sequence similarity relative to the input protein. *
* The following is an ideal situation: *
* *
* +-----------------+----------------------+ *
* | sequence: | sequence identity | *
* +-----------------+----------------------+ *
* | target sequence | 100 % | *
* | aligned seq. 1 | 90 % | *
* | aligned seq. 2 | 80 % | *
* | ... | ... | *
* | aligned seq. 7 | 30 % | *
* +-----------------+----------------------+ *
* *
****************************************************************************
* *
* Estimated Accuracy of Prediction *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* A careful cross validation test on some 250 protein chains (in total *
* about 55,000 residues) with less than 25% pairwise sequence identity *
* gave the following results: *
* *
* ++================++-----------------------------------------+ *
* || Qtotal = 72.1% || ("overall three state accuracy") | *
* ++================++-----------------------------------------+ *
* *
* +----------------------------+-----------------------------+ *
* | Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% | *
* | Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% | *
* | Qloop (% of observed)=79% | Qloop (% of predicted)=72% | *
* +----------------------------+-----------------------------+ *
*..........................................................................*
* *
* These percentages are defined by: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* | number of correctly predicted residues *
* |Qtotal = --------------------------------------- (*100)*
* | number of all residues *
* | *
* | no of res correctly predicted to be in helix *
* |Qhelix (% of obs) = -------------------------------------------- (*100)*
* | no of all res observed to be in helix *
* | *
* | *
* | no of res correctly predicted to be in helix *
* |Qhelix (% of pred)= -------------------------------------------- (*100)*
* | no of all residues predicted to be in helix *
* *
*..........................................................................*
* *
* Averaging over single chains *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The most reasonable way to compute the overall accuracies is the above *
* quoted percentage of correctly predicted residues. However, since the *
* user is mainly interested in the expected performance of the prediction *
* for a particular protein, the mean value when averaging over protein *
* chains might be of help as well. Computing first the three state *
* accuracy for each protein chain, and then averaging over 250 chains *
* yields the following average: *
* *
* +-------------------------------====--+ *
* | Qtotal/averaged over chains = 72.2% | *
* +-------------------------------====--+ *
* | standard deviation = 9.3% | *
* +-------------------------------------+ *
* *
*..........................................................................*
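The difference between the two ways of averaging described above (pooling all residues versus averaging the per-chain accuracies) can be sketched as follows. This is a minimal illustration, not the server's code; the function names and the three (correct, length) pairs are hypothetical and are not taken from the PHD test set.

```python
from statistics import mean, stdev


def q_per_residue(chains):
    """Accuracy over all residues pooled, in percent."""
    correct = sum(c for c, _ in chains)
    total = sum(n for _, n in chains)
    return 100.0 * correct / total


def q_per_chain(chains):
    """Mean and standard deviation of the per-chain accuracies, in percent."""
    per_chain = [100.0 * c / n for c, n in chains]
    return mean(per_chain), stdev(per_chain)


# Hypothetical data: (correctly predicted residues, chain length).
chains = [(90, 100), (10, 20), (45, 60)]

pooled = q_per_residue(chains)
avg, sd = q_per_chain(chains)
print(f"pooled over residues: {pooled:.1f}%")
print(f"averaged over chains: {avg:.1f}% +/- {sd:.1f}%")
```

Short chains weigh as much as long ones in the per-chain average, which is why the two numbers can differ and why a standard deviation is quoted only for the per-chain value.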
* *
* Further measures of performance *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Matthews correlation coefficient: *
* *
* +---------------------------------------------+ *
* | Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 | *
* +---------------------------------------------+ *
*..........................................................................*
* *
* Average length of predicted secondary structure segments: *
* *
* . +------------+----------+ *
* . | predicted | observed | *
* +-----------+------------+----------+ *
* | Lhelix = | 10.3 | 9.3 | *
* | Lstrand = | 5.0 | 5.3 | *
* | Lloop = | 7.2 | 5.9 | *
* +-----------+------------+----------+ *
*..........................................................................*
* *
* The accuracy matrix in detail: *
* *
* +---------------------------------------+ *
* | number of residues with H, E, L | *
* +---------+------+------+------+--------+ *
* | |net H |net E |net L |sum obs | *
* +---------+------+------+------+--------+ *
* | obs H |12447 | 1255 | 3990 | 17692 | *
* | obs E | 949 | 7493 | 3750 | 12192 | *
* | obs L | 2604 | 2875 |19962 | 25441 | *
* +---------+------+------+------+--------+ *
* | sum Net |16000 |11623 |27702 | 55325 | *
* +---------+------+------+------+--------+ *
* *
* Note: This table is to be read in the following manner: *
* 12447 of all residues predicted to be in helix, were observed to *
* be in helix, 949 however belong to observed strands, 2604 to *
* observed loop regions. The term "observed" refers to the DSSP *
* assignment of secondary structure calculated from 3D coordinates *
* of experimentally determined structures (Dictionary of Secondary *
* Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22, *
* 2577-2637). *
* *
****************************************************************************
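The quantities defined in this section can be recomputed from the accuracy matrix above. The following is a minimal sketch; the function names are illustrative and the Matthews coefficient is computed per state in the usual one-versus-rest way, which is an assumption about how the quoted C values were obtained.

```python
from math import sqrt

LABELS = ("H", "E", "L")

# Rows: observed state (DSSP); columns: state predicted by the network.
# Numbers copied from the accuracy matrix in the text.
MATRIX = [
    [12447, 1255, 3990],   # obs H
    [949, 7493, 3750],     # obs E
    [2604, 2875, 19962],   # obs L
]


def q_total(m):
    """Overall accuracy: correctly predicted residues / all residues."""
    return 100.0 * sum(m[k][k] for k in range(len(m))) / sum(map(sum, m))


def q_observed(m, k):
    """Percentage of residues observed in state k that are predicted as k."""
    return 100.0 * m[k][k] / sum(m[k])


def q_predicted(m, k):
    """Percentage of residues predicted as state k that are observed as k."""
    return 100.0 * m[k][k] / sum(row[k] for row in m)


def matthews(m, k):
    """Matthews correlation coefficient for state k against the rest."""
    total = sum(map(sum, m))
    tp = m[k][k]
    fn = sum(m[k]) - tp
    fp = sum(row[k] for row in m) - tp
    tn = total - tp - fn - fp
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


print(f"Qtotal = {q_total(MATRIX):.1f}%")
for k, label in enumerate(LABELS):
    print(f"{label}: %obs = {q_observed(MATRIX, k):.1f}, "
          f"%pred = {q_predicted(MATRIX, k):.1f}, "
          f"C = {matthews(MATRIX, k):.2f}")
```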
* *
* Position-specific reliability index *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The network predicts the three secondary structure types using real *
* numbers from the output units. The prediction is assigned by choosing *
* the maximal unit ("winner takes all"). However, the real numbers *
* contain additional information. *
* E.g. the difference between the maximal and the second largest output *
* unit can be used to derive a "reliability index". This index is given *
* for each residue along with the prediction. The index is scaled to *
* have values between 0 (lowest reliability), and 9 (highest). *
* The accuracies (Qtot) to be expected for residues with values above a *
* particular value of the index are given below as well as the fraction *
* of such residues (%res): *
* *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | index| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | *
* | %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | | | | | | | | | | | | *
* | Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2| *
* | | | | | | | | | | | | *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4| *
* | E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1| *
* | | | | | | | | | | | | *
* | H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4| *
* | E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* *
* The above table gives the cumulative results, e.g. 62.5% of all *
* residues have a reliability of at least 5. The overall three-state *
* accuracy for this subset of almost two thirds of all residues is 82.9%. *
* For this subset, e.g., 83.1% of the observed helices are correctly *
* predicted, and 86.9% of all residues predicted to be in helix are *
* correct. *
* *
*..........................................................................*
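How a reliability index can be derived from the output units may be sketched as follows. The text only states that the index is based on the difference between the two largest outputs and is scaled to 0-9; the concrete scaling used here (outputs between 0 and 1, index = integer part of 10 times the difference, capped at 9) is an assumption made for illustration, as is the function name.

```python
def predict_with_reliability(outputs, labels=("H", "E", "L")):
    """Return (predicted state, reliability index) for one residue.

    outputs: real-valued output units, assumed to lie between 0 and 1.
    """
    ranked = sorted(outputs, reverse=True)
    winner = labels[outputs.index(ranked[0])]   # "winner takes all"
    gap = ranked[0] - ranked[1]                 # largest minus second largest
    index = min(9, int(10 * gap))               # assumed scaling to 0..9
    return winner, index


print(predict_with_reliability([0.75, 0.25, 0.0]))
print(predict_with_reliability([0.25, 0.25, 0.5]))
```

A residue whose two strongest outputs are nearly equal gets an index close to 0, whatever state wins.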
* *
* The following table gives the non-cumulative quantities, i.e. the *
* values per reliability index range. These numbers answer the question: *
* how reliable is the prediction for all residues labeled with the *
* particular index i. *
* *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | index| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | *
* | %res | 8.8| 9.5| 9.3| 9.1| 9.7| 10.5| 12.5| 15.7| 14.1| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | | | | | | | | | | | *
* | Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2| *
* | | | | | | | | | | | *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4| *
* | E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1| *
* | | | | | | | | | | | *
* | H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4| *
* | E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* *
* For example, for residues with Relindex = 5, 64% of all predicted *
* beta-strand residues are correctly identified. *
* *
* *
****************************************************************************
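The cumulative and the non-cumulative reliability tables for PHDsec are related by a weighted average: the cumulative Qtot for "index at least t" is the mean of the per-index Qtot values, weighted by %res. A minimal sketch, using the per-index numbers from the second table (the function name is illustrative; index 0 is not listed in that table, so only thresholds from 1 upwards can be reproduced):

```python
# Values copied from the non-cumulative table (indices 1..9).
INDEX = list(range(1, 10))
PCT_RES = [8.8, 9.5, 9.3, 9.1, 9.7, 10.5, 12.5, 15.7, 14.1]
QTOT = [46.6, 50.6, 57.7, 62.6, 67.9, 74.2, 82.2, 88.3, 94.2]


def cumulative(threshold):
    """Return (%res, Qtot) for all residues with index >= threshold."""
    pairs = [(p, q) for i, p, q in zip(INDEX, PCT_RES, QTOT) if i >= threshold]
    covered = sum(p for p, _ in pairs)
    accuracy = sum(p * q for p, q in pairs) / covered
    return covered, accuracy


covered, accuracy = cumulative(5)
print(f"index >= 5: {covered:.1f}% of residues, Qtot = {accuracy:.1f}%")
```

Up to rounding of the tabulated values, this gives the 62.5% and 82.9% quoted for index 5 in the cumulative table.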
* *
* *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* Solvent accessibility prediction by PHDacc: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Author: Burkhard Rost *
* EMBL, Heidelberg, FRG *
* Meyerhofstrasse 1, 69 117 Heidelberg *
* Internet: Rost@EMBL-Heidelberg.DE *
* *
* All rights reserved. *
* *
* *
****************************************************************************
* *
* About the network method *
* ~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The network for prediction of secondary structure is described in *
* detail in: *
* Rost, Burkhard; Sander, Chris: *
* Prediction of protein structure at better than 70% accuracy. *
* J. Mol. Biol., 1993, 232, 584-599. *
* *
* The analysis of the prediction of solvent exposure is given in: *
* Rost, Burkhard; Sander, Chris: *
* Conservation and prediction of solvent accessibility in protein *
* families. Proteins, 1994, 20, 216-226. *
* *
* To be quoted for publications of PHD exposure prediction: *
* Both papers quoted above. *
* *
****************************************************************************
* *
* Definition of accessibility *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* For training the residue solvent accessibility the DSSP (Dictionary of *
* Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers, 22,*
* 2577-2637) values of accessible surface area have been used. The *
* prediction provides values for the relative solvent accessibility. The *
* normalisation is the following: *
* *
* | ACCESSIBILITY (from DSSP in Angstrom^2) *
* |RELATIVE_ACCESSIBILITY = ------------------------------------- * 100 *
* | MAXIMAL_ACC (amino acid type i) *
* *
* where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i.*
* The maximal values are: *
* *
* +----+----+----+----+----+----+----+----+----+----+----+----+ *
* | A | B | C | D | E | F | G | H | I | K | L | M | *
* | 106| 160| 135| 163| 194| 197| 84| 184| 169| 205| 164| 188| *
* +----+----+----+----+----+----+----+----+----+----+----+----+ *
* | N | P | Q | R | S | T | V | W | X | Y | Z | *
* | 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196| *
* +----+----+----+----+----+----+----+----+----+----+----+ *
* *
* Notation: one letter code for amino acid, B stands for D or N; Z stands *
* for E or Q; and X stands for undetermined. *
* *
* The solvent accessibility (absolute value, as given by DSSP) can be *
* used to estimate the number of water molecules (W) in contact with *
* the residue: *
* *
* W = ACCESSIBILITY /10 *
* *
* The prediction is given in 10 states for relative accessibility, with *
* *
* RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC) *
* *
* where PREDICTED_ACC = 0 - 9. *
* *
****************************************************************************
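The normalisation and the 10-state encoding defined above can be written out as follows. This is a minimal sketch using the maximal values from the table; the function names are illustrative and not part of PHD.

```python
# Maximal accessibility per amino acid type, copied from the table above.
MAX_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}


def relative_accessibility(aa, acc):
    """Relative accessibility in percent from a DSSP accessibility value."""
    return 100.0 * acc / MAX_ACC[aa]


def state_to_relative(state):
    """Relative accessibility (percent) encoded by a predicted state 0..9."""
    if not 0 <= state <= 9:
        raise ValueError("PREDICTED_ACC must be between 0 and 9")
    return state * state


def water_molecules(acc):
    """Estimated number of water molecules in contact: W = ACCESSIBILITY / 10."""
    return acc / 10.0


print(relative_accessibility("A", 53))   # alanine with a DSSP value of 53
print(state_to_relative(6))              # predicted state 6
print(water_molecules(120))
```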
* *
* Estimated Accuracy of Prediction *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* A careful cross validation test on some 238 protein chains (in total *
* about 62,000 residues) with less than 25% pairwise sequence identity *
* gave the following results: *
* *
* *
* Correlation *
* ........... *
* *
* The correlation between observed and predicted solvent accessibility *
* is: *
* *
* ----------- *
* corr = 0.53 *
* ----------- *
* *
* This value ought to be compared to the worst and best case prediction *
* scenario: random prediction (corr = 0.0) and homology modelling *
* (corr = 0.66). (Note: homology modelling yields a relatively accurate *
* prediction in 3D if, and only if, a significantly identical sequence *
* has a known 3D structure.) *
* *
* *
* 3-state accuracy *
* ................ *
* *
* Often the relative accessibility is projected onto, e.g., 3 states: *
* b = buried (here defined as < 9% relative accessibility), *
* i = intermediate ( 9% <= rel. acc. < 36% ), *
* e = exposed ( rel. acc. >= 36% ). *
* *
* A projection onto 3 states or 2 states (buried/exposed) enables the *
* compilation of a 3- and 2-state prediction accuracy. PHD reaches an *
* overall 3-state accuracy of: *
* Q3 = 57.5% *
* (compared to 35% for random prediction and 70% for homology modelling). *
* *
* In detail: *
* *
* +-----------------------------------+-------------------------+ *
* | Qburied (% of observed)=77% | Qb (% of predicted)=60% | *
* | Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% | *
* | Qexposed (% of observed)=78% | Qe (% of predicted)=56% | *
* +-----------------------------------+-------------------------+ *
* *
* *
* 10-state accuracy *
* ................. *
* *
* The network predicts relative solvent accessibility in 10 states, with *
* state i (i = 0-9) corresponding to a relative solvent accessibility of *
* i*i %. The 10-state accuracy of the network is: *
* *
* Q10 = 24.5% *
* *
*..........................................................................*
* *
* These percentages are defined by: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* | number of correctly predicted residues *
* |Q3 = --------------------------------------- (*100)*
* | number of all residues *
* | *
* | no of res. correctly predicted to be buried *
* |Qburied (% of obs) = ------------------------------------------- (*100)*
* | no of all res. observed to be buried *
* | *
* | *
* | no of res. correctly predicted to be buried *
* |Qburied (% of pred)= ------------------------------------------- (*100)*
* | no of all residues predicted to be buried *
* *
*..........................................................................*
* *
* Averaging over single chains *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The most reasonable way to compute the overall accuracies is the above *
* quoted percentage of correctly predicted residues. However, since the *
* user is mainly interested in the expected performance of the prediction *
* for a particular protein, the mean value when averaging over protein *
* chains might be of help as well. Computing first the correlation *
* between observed and predicted accessibility for each protein chain, and *
* then averaging over all 238 chains yields the following average: *
* *
* +-------------------------------====--+ *
* | corr/averaged over chains = 0.53 | *
* +-------------------------------====--+ *
* | standard deviation = 0.11 | *
* +-------------------------------------+ *
* *
*..........................................................................*
* *
* Further details of performance accuracy *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The accuracy matrix in detail: *
* .............................. *
* *
* -------+----------------------------------------------------+----------- *
* \ PHD | 0 1 2 3 4 5 6 7 8 9 | SUM %obs *
* -------+----------------------------------------------------+----------- *
* OBS 0 | 8611 140 8 44 82 169 772 334 27 0 | 10187 16.6 *
* OBS 1 | 4367 164 0 50 106 231 738 346 44 3 | 6049 9.8 *
* OBS 2 | 3194 168 1 68 125 303 951 513 42 7 | 5372 8.7 *
* OBS 3 | 2760 159 8 80 136 327 1246 746 58 19 | 5539 9.0 *
* OBS 4 | 2312 144 2 72 166 396 1615 1245 124 19 | 6095 9.9 *
* OBS 5 | 1873 96 3 84 138 425 1979 1834 187 27 | 6646 10.8 *
* OBS 6 | 1387 67 1 60 80 278 2237 2627 231 51 | 7019 11.4 *
* OBS 7 | 1082 35 0 32 56 225 1871 3107 302 60 | 6770 11.0 *
* OBS 8 | 660 25 0 27 43 136 1206 2374 325 87 | 4883 7.9 *
* OBS 9 | 325 20 2 27 29 74 648 1159 366 214 | 2864 4.7 *
* -------+----------------------------------------------------+----------- *
* SUM |26571 1018 25 544 961 2564 13263 14285 1706 487 | *
* %pred | 43.3 1.7 0.0 0.9 1.6 4.2 21.6 23.3 2.8 0.8 | *
* -------+----------------------------------------------------+----------- *
* *
* Note: This table is to be read in the following manner: *
* 8611 of all residues predicted to have 0% relative accessibility *
* were observed with 0% relative accessibility. However, 325 of all *
* residues predicted to have 0% are observed as completely exposed *
* (obs = 9 -> rel. acc. >= 81%). The term "observed" refers to the *
* DSSP compilation of area of solvent accessibility calculated from *
* 3D coordinates of experimentally determined structures (Diction- *
* ary of Secondary Structure of Proteins: Kabsch & Sander (1983) *
* Biopolymers, 22, 2577-2637). *
* *
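The 10-state matrix above can be collapsed onto the three states defined earlier (b: states 0-2, i: states 3-5, e: states 6-9, following the thresholds of 9% and 36% and the encoding state i = i*i %). The following minimal sketch does this and computes Q3 and Q10 as the diagonal fractions; the function names are illustrative. Values recomputed from the printed matrix may differ slightly from the quoted ones.

```python
# Rows: observed state 0..9; columns: state 0..9 predicted by PHD.
MATRIX = [
    [8611, 140, 8, 44, 82, 169, 772, 334, 27, 0],
    [4367, 164, 0, 50, 106, 231, 738, 346, 44, 3],
    [3194, 168, 1, 68, 125, 303, 951, 513, 42, 7],
    [2760, 159, 8, 80, 136, 327, 1246, 746, 58, 19],
    [2312, 144, 2, 72, 166, 396, 1615, 1245, 124, 19],
    [1873, 96, 3, 84, 138, 425, 1979, 1834, 187, 27],
    [1387, 67, 1, 60, 80, 278, 2237, 2627, 231, 51],
    [1082, 35, 0, 32, 56, 225, 1871, 3107, 302, 60],
    [660, 25, 0, 27, 43, 136, 1206, 2374, 325, 87],
    [325, 20, 2, 27, 29, 74, 648, 1159, 366, 214],
]

# buried, intermediate, exposed
GROUPS = [range(0, 3), range(3, 6), range(6, 10)]


def q_diagonal(m):
    """Percentage of residues on the diagonal (correctly predicted)."""
    return 100.0 * sum(m[k][k] for k in range(len(m))) / sum(map(sum, m))


def collapse(m, groups):
    """Sum the cells of m into one cell per (observed, predicted) group."""
    return [[sum(m[r][c] for r in rows for c in cols) for cols in groups]
            for rows in groups]


q10 = q_diagonal(MATRIX)
q3 = q_diagonal(collapse(MATRIX, GROUPS))
print(f"Q10 = {q10:.1f}%, Q3 = {q3:.1f}%")
```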
* *
* Accuracy for each amino acid: *
* ............................. *
* *
* +---+------------------------------+-----+-------+------+ *
* |AA | Q3 b%o b%p i%o i%p e%o e%p | Q10 | corr | N | *
* +---+------------------------------+-----+-------+------+ *
* | A | 59.0 87 60 2 38 66 57 | 31 | 0.530 | 5054 | *
* | C | 62.0 91 67 5 39 25 21 | 34 | 0.244 | 893 | *
* | D | 56.5 21 45 6 49 94 57 | 20 | 0.321 | 3536 | *
* | E | 60.8 9 40 3 41 98 61 | 21 | 0.347 | 3743 | *
* | F | 63.3 94 67 9 46 29 37 | 27 | 0.366 | 2436 | *
* | G | 52.1 75 51 1 31 67 53 | 22 | 0.405 | 4787 | *
* | H | 50.9 63 53 23 45 71 50 | 18 | 0.442 | 1366 | *
* | I | 64.9 95 68 6 41 30 38 | 34 | 0.360 | 3437 | *
* | K | 66.6 2 11 2 37 98 67 | 23 | 0.267 | 3652 | *
* | L | 61.6 93 65 8 44 31 40 | 31 | 0.368 | 5016 | *
* | M | 60.1 92 64 5 39 45 44 | 29 | 0.452 | 1371 | *
* | N | 55.5 45 45 8 38 87 59 | 17 | 0.410 | 2923 | *
* | P | 53.0 48 48 9 39 83 56 | 18 | 0.364 | 2920 | *
* | Q | 54.3 27 44 7 44 92 56 | 20 | 0.344 | 2225 | *
* | R | 49.9 15 47 36 47 76 51 | 18 | 0.372 | 2765 | *
* | S | 55.6 69 53 3 51 81 56 | 22 | 0.464 | 3981 | *
* | T | 51.8 61 51 8 38 78 53 | 21 | 0.432 | 3740 | *
* | V | 61.1 93 65 5 40 39 42 | 34 | 0.418 | 4156 | *
* | W | 56.2 85 62 20 49 29 27 | 21 | 0.318 | 891 | *
* | Y | 49.7 73 52 33 49 36 38 | 19 | 0.359 | 2301 | *
* +---+------------------------------+-----+-------+------+ *
* *
* Abbreviations: *
* *
* AA: amino acid in one-letter code *
* b%o, i%o, e%o: = Qburied, Qintermediate, Qexposed (% of observed), *
* i.e. percentage of correct prediction in each state, see above *
* b%p, i%p, e%p: = Qburied, Qintermediate, Qexposed (% of predicted), *
* i.e. probability of correct prediction in each state, see above *
* Q10: percentage of correctly predicted residues in each of the 10 *
* states of predicted relative accessibility. *
* corr: correlation between predicted and observed rel. acc. *
* N: number of residues in data set *
* *
* *
* Accuracy for different secondary structure: *
* ........................................... *
* *
* +--------+------------------------------+----+-------+-------+ *
* | type | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | N | *
* +--------+------------------------------+----+-------+-------+ *
* | helix | 59.5 79 64 8 44 80 56 | 27 | 0.574 | 20100 | *
* | strand | 61.3 84 73 9 46 69 37 | 35 | 0.524 | 13356 | *
* | loop | 54.4 64 43 11 44 78 61 | 18 | 0.442 | 27968 | *
* +--------+------------------------------+----+-------+-------+ *
* *
* Abbreviations as before. *
* *
****************************************************************************
* *
* Position-specific reliability index *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The network predicts the 10 states for relative accessibility using real*
* numbers from the output units. The prediction is assigned by choosing *
* the maximal unit ("winner takes all"). However, the real numbers *
* contain additional information. *
* E.g. the difference between the maximal and the second largest output *
* unit (with the constraint that the second largest output is compiled *
* among all units at least 2 positions off the maximal unit) can be used *
* to derive a "reliability index". This index is given for each residue *
* along with the prediction. The index is scaled to have values between *
* 0 (lowest reliability), and 9 (highest). *
* The accuracies (Q3, corr, etc.) to be expected for residues with values *
* above a particular value of the index are given below as well as the *
* fraction of such residues (%res): *
* *
* +---+------------------------------+----+-------+-------+ *
* |RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res | *
* +---+------------------------------+----+-------+-------+ *
* | 0 | 57.5 77 60 9 44 78 56 | 24 | 0.535 | 100.0 | *
* | 1 | 59.1 76 63 9 45 82 57 | 25 | 0.560 | 91.2 | *
* | 2 | 61.7 79 66 4 47 87 58 | 27 | 0.594 | 77.1 | *
* | 3 | 66.6 87 70 1 51 89 63 | 30 | 0.650 | 57.1 | *
* | 4 | 70.0 89 72 0 83 91 67 | 32 | 0.686 | 45.8 | *
* | 5 | 72.9 92 75 0 0 93 70 | 34 | 0.722 | 35.6 | *
* | 6 | 76.3 95 77 0 0 93 75 | 36 | 0.769 | 24.7 | *
* | 7 | 79.0 97 79 0 0 93 78 | 39 | 0.803 | 16.0 | *
* | 8 | 80.9 98 80 0 0 91 81 | 43 | 0.824 | 9.6 | *
* | 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 | *
* +---+------------------------------+----+-------+-------+ *
* *
* Abbreviations as before. *
* *
* The above table gives the cumulative results, e.g. 45.8% of all *
* residues have a reliability of at least 4. The correlation for this *
* most reliably predicted half of the residues is 0.686, i.e. a value *
* comparable to what could be expected if homology modelling were *
* possible. For this subset of 45.8% of all residues, 89% of the buried *
* residues are correctly predicted, and 72% of all residues predicted to *
* be buried are correct. *
* *
*..........................................................................*
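For the accessibility network, the reliability index described above ignores the immediate neighbours of the winning unit when looking for the second largest output. A minimal sketch of that constraint follows; the scaling (outputs between 0 and 1, index = integer part of 10 times the difference, capped at 9) and the function name are assumptions made for illustration.

```python
def accessibility_reliability(outputs, min_offset=2):
    """Return (predicted state, reliability index) from 10 output units."""
    if len(outputs) != 10:
        raise ValueError("expected 10 output units")
    best = max(range(10), key=outputs.__getitem__)    # "winner takes all"
    # Second largest output among units at least min_offset positions away.
    rivals = [v for k, v in enumerate(outputs) if abs(k - best) >= min_offset]
    gap = outputs[best] - max(rivals)
    return best, min(9, int(10 * gap))                # assumed scaling


outputs = [0.125, 0.25, 0.875, 0.75, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
print(accessibility_reliability(outputs))
```

In this example unit 3 is almost as strong as the winning unit 2, but it is an adjacent state and therefore does not lower the index; the comparison is made with unit 4 instead.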
* *
* The following table gives the non-cumulative quantities, i.e. the *
* values per reliability index range. These numbers answer the question: *
* how reliable is the prediction for all residues labeled with the *
* particular index i. *
* *
* +---+------------------------------+----+-------+-------+ *
* |RI | Q3 b%o b%p i%o i%p e%o e%p |Q10 | corr | %res | *
* +---+------------------------------+----+-------+-------+ *
* | 0 | 40.9 79 40 16 41 21 40 | 14 | 0.175 | 8.8 | *
* | 1 | 45.4 61 46 28 44 48 44 | 17 | 0.278 | 14.1 | *
* | 2 | 47.4 53 52 10 46 80 44 | 19 | 0.343 | 19.9 | *
* | 3 | 52.9 75 59 4 50 77 47 | 23 | 0.439 | 11.4 | *
* | 4 | 60.0 81 63 0 83 84 56 | 25 | 0.547 | 10.1 | *
* | 5 | 65.2 82 70 0 0 93 62 | 28 | 0.607 | 10.9 | *
* | 6 | 71.3 90 72 0 0 94 70 | 31 | 0.692 | 8.8 | *
* | 7 | 76.0 94 76 0 0 95 75 | 34 | 0.762 | 6.3 | *
* | 8 | 80.5 97 81 0 0 94 79 | 39 | 0.808 | 3.8 | *
* | 9 | 81.2 99 80 0 0 88 83 | 45 | 0.828 | 5.9 | *
* +---+------------------------------+----+-------+-------+ *
* *
* For example, for residues with RI = 4, 83% of all predicted intermediate *
* residues are correctly predicted as such. *
* *
* *
****************************************************************************
* *
* *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* Prediction of helical transmembrane segments by PHDhtm: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Author: Burkhard Rost *
* EMBL, Heidelberg, FRG *
* Meyerhofstrasse 1, 69 117 Heidelberg *
* Internet: Rost@EMBL-Heidelberg.DE *
* *
* All rights reserved. *
* *
* *
****************************************************************************
* *
* About the network method *
* ~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The PHD mail server is described in: *
* Rost, Burkhard; Sander, Chris; Schneider, Reinhard: *
* PHD - an automatic mail server for protein secondary structure *
* prediction. *
* CABIOS, 1994, 10, 53-60. *
* *
* To be quoted for publications of PHDhtm output: *
* Rost, Burkhard; Casadio, Rita; Fariselli, Piero; Sander, Chris: *
* Prediction of helical transmembrane segments at 95% accuracy. *
* Protein Science, 1995, 4, 521-533. *
* *
****************************************************************************
* *
* Estimated Accuracy of Prediction *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* A cross validation test on 69 helical trans-membrane proteins (in total*
* about 30,000 residues) with less than 25% pairwise sequence identity *
* gave the following results: *
* *
* ++================++-----------------------------------------+ *
* || Qtotal = 94.7% || ("overall two state accuracy") | *
* ++================++-----------------------------------------+ *
* *
* +----------------------------+-----------------------------+ *
* | Qhelix (% of observed)=92% | Qhelix (% of predicted)=83% | *
* | Qloop (% of observed)=96% | Qloop (% of predicted)=97% | *
* +----------------------------+-----------------------------+ *
* *
*..........................................................................*
* *
* These percentages are defined by: *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* | number of correctly predicted residues *
* |Qtotal = --------------------------------------- (*100)*
* | number of all residues *
* | *
* | no of res correctly predicted to be in helix *
* |Qhelix (% of obs) = -------------------------------------------- (*100)*
* | no of all res observed to be in helix *
* | *
* | *
* | no of res correctly predicted to be in helix *
* |Qhelix (% of pred)= -------------------------------------------- (*100)*
* | no of all residues predicted to be in helix *
* *
*..........................................................................*
* *
* Further measures of performance *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Matthews correlation coefficient: *
* *
* +---------------------------------------------+ *
* | Chelix = 0.84, Cloop = 0.84 | *
* +---------------------------------------------+ *
*..........................................................................*
* *
* Average length of predicted secondary structure segments: *
* *
* | +------------+----------+ *
* | | predicted | observed | *
* +-----------+------------+----------+ *
* | Lhelix = | 24.6 | 22.2 | *
* +-----------+------------+----------+ *
*..........................................................................*
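The average segment length quoted above can be obtained from a per-residue prediction string by grouping consecutive identical symbols. A minimal sketch; the function names and the example string are hypothetical and are not PHD output.

```python
from itertools import groupby


def segment_lengths(prediction, state="H"):
    """Lengths of all uninterrupted runs of `state` in the prediction."""
    return [len(list(run)) for symbol, run in groupby(prediction)
            if symbol == state]


def mean_segment_length(prediction, state="H"):
    """Average run length of `state`; 0.0 if the state never occurs."""
    lengths = segment_lengths(prediction, state)
    return sum(lengths) / len(lengths) if lengths else 0.0


example = "LLHHHHLLHHLL"   # hypothetical two-state string (H = helix, L = rest)
print(segment_lengths(example), mean_segment_length(example))
```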
* *
* The accuracy matrix in detail: *
* *
* +---------------------------------+ *
* | number of residues with H, L | *
* +---------+------+-------+--------+ *
* | |net H | net L |sum obs | *
* +---------+------+-------+--------+ *
* | obs H | 5214 | 492 | 5706 | *
* | obs L | 1050 | 22423 | 23473 | *
* +---------+------+-------+--------+ *
* | sum Net | 6264 | 22915 | 29179 | *
* +---------+------+-------+--------+ *
* *
* Note: This table is to be read in the following manner: *
* 5214 of all residues predicted to be in a helical trans-membrane *
* region, were observed to be in the lipid bilayer, 1050 however *
* were observed on either side of the membrane, i.e. in *
* loop (or non-membrane) regions. The term "observed" refers to DSSP*
* assignment of secondary structure calculated from 3D coordinates *
* of experimentally determined structures (Dictionary of Secondary *
* Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22, *
* 2577-2637) where these were available. For all other proteins, *
* the assignment of trans-membrane segments has been taken from the *
* Swissprot data bank (Bairoch, A.; Boeckmann, B.: The SWISS-PROT *
* protein sequence data bank. Nucl. Acids Res. 20: 2019-2022, 1992).*
* *
*..........................................................................*
* *
* Overlap between predicted and observed segments: *
* *
* +-----------------+---------------+----------------+ *
* | segment overlap | % of observed | % of predicted | *
* | Sov helix | 95.6% | 95.5% | *
* | Sov loop | 83.6% | 97.2% | *
* +-----------------+---------------+----------------+ *
* | Sov total | 86.0% | 96.8% | *
* +-----------------+---------------+----------------+ *
* *
* Definition of Sov in: Rost et al., JMB, 1994, 235, 13-26. *
* *
* As helical trans-membrane segments are longer than globular heli- *
* ces, correctly predicted segments can easily be made out. PHDhtm *
* misses 5 out of 258 observed segments, predicts 6 where none is *
* observed and 3 times the predicted helical segment overlaps two *
* observed regions. Thus, in total more than 95% of all segments *
* are correctly predicted. *
* *
*..........................................................................*
* *
* Entropy of prediction (information measure): *
* *
* +-----------------+ *
* | I = 0.64 | *
* +-----------------+ *
* *
* (For comparison: homology modelling of globular proteins in three *
* states: I=0.62.) *
* *
****************************************************************************
* *
* Position-specific reliability index *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* The network predicts two states: helical trans-membrane region and rest *
* using two output units. The prediction is assigned by choosing the *
* maximal unit ("winner takes all"). However, the real numbers of the *
* output units contain additional information. *
* E.g. the difference between the two output units can be used to derive *
* a "reliability index". This index is given for each residue along with *
* the prediction. The index is scaled to have values between 0 (lowest *
* reliability), and 9 (highest). *
* The accuracies (Qtot) to be expected for residues with values above a *
* particular value of the index are given below as well as the fraction *
* of such residues (%res): *
* *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | index| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | *
* | %res |100.0| 98.8| 97.3| 95.9| 94.1| 92.3| 89.9| 86.2| 75.0| 66.8| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | | | | | | | | | | | | *
* | Qtot | 94.7| 95.2| 95.6| 96.2| 96.7| 97.2| 97.7| 98.4| 99.4| 99.8| *
* | | | | | | | | | | | | *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* | H%obs| 91.8| 92.9| 93.8| 94.4| 95.0| 95.7| 96.2| 96.8| 95.5| 78.7| *
* | L%obs| 95.3| 95.7| 96.1| 96.6| 97.0| 97.5| 98.1| 98.8| 99.7|100.0| *
* | | | | | | | | | | | | *
* | H%prd| 82.7| 83.8| 85.0| 86.7| 88.1| 89.7| 91.4| 93.8| 96.3| 97.1| *
* | L%prd| 97.9| 98.3| 98.5| 98.7| 98.8| 99.0| 99.2| 99.4| 99.7| 99.9| *
* +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ *
* *
* The above table gives the cumulative results, e.g. 92.3% of all *
* residues have a reliability of at least 5. The overall two-state *
* accuracy for this subset is 97.2%. For this subset, e.g., 95.7% of *
* the observed helical trans-membrane residues are correctly predicted, *
* and 89.7% of all residues predicted to be in helical trans-membrane *
* segment are correct. *
* *
* *
* *
****************************************************************************
The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
****************************************************************************
* *
* PredictProtein@EMBL-Heidelberg.DE *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* PHD: Profile fed neural network systems from HeiDelberg *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* Prediction of: *
* - secondary structure, by PHDsec *
* - solvent accessibility, by PHDacc *
* - and helical transmembrane regions, by PHDhtm *
* *
* Author: Burkhard Rost *
* EMBL, Heidelberg, FRG *
* Meyerhofstrasse 1, 69 117 Heidelberg *
* Internet: Predict-Help@EMBL-Heidelberg.DE *
* All rights reserved. *
* *
****************************************************************************
* *
* The network systems are described in: *
* *
* PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599. *
* B Rost & C Sander: Proteins, 1994, 19, 55-72. *
* PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226. *
* PHDhtm: B Rost, R Casadio, P Fariselli & C Sander, *
* Prot. Science, 1995, 4, 521-533. *
* *
****************************************************************************
* *
* Some statistics *
* ~~~~~~~~~~~~~~~ *
* *
* Percentage of amino acids: *
* +--------------+--------+--------+--------+--------+--------+ *
* | AA: | L | G | A | F | I | *
* | % of AA: | 9.6 | 9.2 | 9.0 | 8.3 | 7.2 | *
* +--------------+--------+--------+--------+--------+--------+ *
* | AA: | V | T | P | M | S | *
* | % of AA: | 6.7 | 6.1 | 6.0 | 5.4 | 5.2 | *
* +--------------+--------+--------+--------+--------+--------+ *
* | AA: | Y | H | W | R | Q | *
* | % of AA: | 4.9 | 3.4 | 3.2 | 2.7 | 2.7 | *
* +--------------+--------+--------+--------+--------+--------+ *
* | AA: | N | E | D | K | C | *
* | % of AA: | 2.7 | 2.7 | 2.7 | 1.6 | 0.5 | *
* +--------------+--------+--------+--------+--------+--------+ *
* *
* Percentage of secondary structure predicted: *
* +--------------+--------+--------+--------+ *
* | SecStr: | H | E | L | *
* | % Predicted: | 49.6 | 12.6 | 37.7 | *
* +--------------+--------+--------+--------+ *
* *
* According to the following classes: *
* all-alpha: %H>45 and %E<5; all-beta: %H<5 and %E>45 *
* alpha-beta: %H>30 and %E>20; mixed: rest, *
* this means that the predicted class is: mixed class *
* *
****************************************************************************
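The class rule quoted in the statistics box above can be written out as a short sketch. The function name is hypothetical and the order in which the rules are tested is an assumption; only the thresholds come from the text.

```python
def predicted_class(pct_h: float, pct_e: float) -> str:
    """Map predicted %H and %E to a structural class.

    Thresholds are the ones listed in the report; the order of the
    checks is an assumption (the listed classes do not overlap).
    """
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


# Values reported for this protein: H = 49.6 %, E = 12.6 %
print(predicted_class(49.6, 12.6))  # mixed
```

With %H = 49.6 the helix threshold for all-alpha is met, but %E = 12.6 is neither below 5 nor above 20, so the protein falls into the mixed class, as reported.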
* *
* PHD output for your protein *
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~ *
* *
* Wed Nov 15 04:29:17 1995 *
* Jury on: 10 different architectures (version 5.94_317 ). *
* Note: differently trained architectures, i.e., different versions can *
* result in different predictions. *
* *
****************************************************************************
* *
* About the protein *
* ~~~~~~~~~~~~~~~~~ *
* *
* HEADER /home/phd/tmp/t2_12969.seq *
* COMPND *
* SOURCE *
* AUTHOR *
* SEQLENGTH 554 *
* NCHAIN 1 chain(s) in t2_12969 data set *
* NALIGN 68 *
* (=number of aligned sequences in HSSP file) *
* *
****************************************************************************
* *
* Abbreviations: PHDsec *
* ~~~~~~~~~~~~~~~~~~~~~ *
* *
* sequence: *
* AA : amino acid sequence *
* secondary structure: *
* HEL: H=helix, E=extended (sheet), blank=other (loop) *
* PHD: Profile network prediction HeiDelberg *
* Rel: Reliability index of prediction (0-9) *
* detail: *
* prH: 'probability' for assigning helix *
* prE: 'probability' for assigning strand *
* prL: 'probability' for assigning loop *
* note: the 'probabilities' are scaled to the interval 0-9, e.g., *
* prH=5 means that the first output node is 0.5-0.6 *
* subset: *
* SUB: a subset of the prediction, for all residues with an expected *
* average accuracy > 82% (tables in header) *
* note: for this subset the following symbols are used: *
* L: is loop (for which above " " is used) *
* ".": means that no prediction is made for this residue, as the *
* reliability is: Rel < 5 *
* *
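As a rough illustration of how a SUB line relates to the PHD and Rel lines described above, here is a minimal sketch. The function name and the `rename` parameter are hypothetical; it assumes the two lines are aligned column by column and that nothing beyond the stated Rel cut-off is applied.

```python
def subset_line(pred, rel, threshold=5, rename=None):
    """Build a SUB-style line from a prediction line and a Rel line.

    pred      : one symbol per residue (blank = loop)
    rel       : one reliability digit (0-9) per residue
    threshold : residues with Rel below this value become "."
    rename    : symbol substitutions applied to kept residues
                (default: blank is written as "L")
    """
    if rename is None:
        rename = {" ": "L"}
    if len(pred) != len(rel):
        raise ValueError("prediction and reliability lines differ in length")
    out = []
    for symbol, digit in zip(pred, rel):
        if int(digit) >= threshold:
            out.append(rename.get(symbol, symbol))
        else:
            out.append(".")  # no prediction reported for this residue
    return "".join(out)


print(subset_line("  HHHE ", "9745862"))  # LL.HHE.
```

In the SUB htm lines of this report, transmembrane residues appear as H although the PHD htm lines use T; under the same assumptions that case could be handled by passing `rename={" ": "L", "T": "H"}`.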
* Abbreviations: PHDacc *
* ~~~~~~~~~~~~~~~~~~~~~ *
* *
* solvent accessibility: *
* 3st: relative solvent accessibility (acc) in 3 states: *
* b = 0-9%, i = 9-36%, e = 36-100%. *
* PHD: Profile network prediction HeiDelberg *
* Rel: Reliability index of prediction (0-9) *
* P_3: predicted relative accessibility in 3 states *
* note: for convenience a blank is used for intermediate (i). *
* 10st: relative accessibility in 10 states: *
* a value of n corresponds to a relative acc. of n*n % *
* subset: *
* SUB: a subset of the prediction, for all residues with an expected *
* average correlation > 0.69 (tables in header) *
* note: for this subset the following symbols are used: *
* "I": is intermediate (for which above " " is used) *
* ".": means that no prediction is made for this residue, as the *
* reliability is: Rel < 4 *
* *
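The relation between the 10-state and 3-state accessibility lines can be sketched as follows. All function names are hypothetical. Assigning the boundary values 9 % and 36 % to the higher class is an assumption; it agrees with the example lines in this report, where 3 is printed as a blank and 6 as "e".

```python
def relative_acc_percent(n):
    """10-state value n (0-9) corresponds to a relative accessibility of n*n %."""
    if not 0 <= n <= 9:
        raise ValueError("10-state value must be between 0 and 9")
    return n * n


def three_state(acc_percent):
    """Classify as b (0-9 %), i (9-36 %) or e (36-100 %).

    Boundary handling (9 -> i, 36 -> e) is an assumption.
    """
    if acc_percent < 9:
        return "b"
    if acc_percent < 36:
        return "i"
    return "e"


def p3_line(ten_state):
    """Turn a '10st' digit line into a '3st' line; intermediate is shown as a blank."""
    states = (three_state(relative_acc_percent(int(c))) for c in ten_state)
    return "".join(" " if s == "i" else s for s in states)


# First 22 residues of the PHD acc line of this report
print(p3_line("9987077777767740660000"))  # eeeebeeeeeeeee beebbbb
```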
* *
* Abbreviations: PHDhtm *
* ~~~~~~~~~~~~~~~~~~~~~ *
* *
* secondary structure: *
* HL: T=helical transmembrane region, blank=other (loop) *
* PHD: Profile network prediction HeiDelberg *
* PHDF: filtered prediction, i.e., too long transmembrane segments *
* are split, too short ones are deleted *
* Rel: Reliability index of prediction (0-9) *
* detail: *
* prH: 'probability' for assigning helical transmembrane region *
* prL: 'probability' for assigning loop *
* note: the 'probabilities' are scaled to the interval 0-9, e.g., *
* prH=5 means that the first output node is 0.5-0.6 *
* subset: *
* SUB: a subset of the prediction, for all residues with an expected *
* average accuracy > 82% (tables in header) *
* note: for this subset the following symbols are used: *
* L: is loop (for which above " " is used) *
* ".": means that no prediction is made for this residue, as the *
* reliability is: Rel < 5 *
* *
****************************************************************************
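The scaling of the 'probabilities' mentioned in the abbreviations above (prH=5 means an output of 0.5-0.6) can be sketched as below. The function name is hypothetical, and mapping an output of exactly 1.0 to 9 is an assumption; how the reliability index itself is derived is not stated in this report and is not modelled here.

```python
import math


def scale_output(value):
    """Scale a network output in [0, 1] to a digit 0-9.

    An output in [0.5, 0.6) gives 5, as in the report's example.
    Clamping 1.0 to 9 is an assumption.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("network output expected in the interval [0, 1]")
    return min(9, math.floor(value * 10))


print(scale_output(0.55))  # 5
```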
* *
* protein: t2_1296 length 554 *
* *
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ|
PHD sec | HHHHHHHEEEE HHHHHHHHHHHHHHHHHHHHHHHHHHHH |
Rel sec |997555777634454432144445331213345779998765599999999999467964|
detail:
prH sec |000001111136676655422211334545566878988776699999999998621011|
prE sec |001221000000000113466521000112322110000000000000000000000002|
prL sec |987666777763222221111257554331010000001112200000000001377976|
subset: SUB sec |LLLLLLLLLL...H.........L........HHHHHHHHHHHHHHHHHHHHHH.LLLL.|
ACCESSIBILITY
3st: P_3 acc |eeeebeeeeeeeee beebbbb bbeebbbbbbbbbbbbbbbbbbbbbbbbbebeeebee|
10st: PHD acc |998707777776774066000050076000000000000000000000000060777076|
Rel acc |464404533750550112451210041664958665336332425665691415343031|
subset: SUB acc |eeee.ee..ee.ee....bb.....e.bbbbbbbbb..b...b.bbbbbb.b.b.e....|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP|
PHD sec | HHHHHHHH EEEEEHHHHHHHHHHHHHHHHHHHH HHHHHHHH |
Rel sec |137899731364457358886176433688577676333247665312467664013225|
detail:
prH sec |000000135676678621101111236788777777665567777545677766321011|
prE sec |431000000000000000000477653100001112333321000100000111333332|
prL sec |468888754322321368887301000101211000000011112343321112345556|
subset: SUB sec |..LLLLL...H..HH.LLLLL.EE...HHHHHHHHH.....HHHH....HHHH......L|
ACCESSIBILITY
3st: P_3 acc |bbeeeebebbeebbeebeeeeebbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb|
10st: PHD acc |006777060078008808977600000000000000000000000000000000000000|
Rel acc |201234111135214615543164367967387899998977466213312778544230|
subset: SUB acc |.....e.....e..ee.eee..bb.bbbbb.bbbbbbbbbbbbbb......bbbbbb...|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI|
PHD sec | HHHHH HHHHHHHHHHHH EEE HHHHHH|
Rel sec |466345433131112236866544564325788778887415735876699997121368|
detail:
prH sec |222332333454433367877666676531111100011000000010000001444578|
prE sec |000000000011131000011222211211000010001347762001100000122210|
prL sec |677566665434334532111001112246788888887642137877799997433110|
subset: SUB sec |.LL..L...........HHHHH..HH...LLLLLLLLLL..EE.LLLLLLLLLL....HH|
ACCESSIBILITY
3st: P_3 acc |ebbbb bbbbbbbbbbebbbbbbbbbbbeebeeeeeebbbbbbbbbbbeeeeeebbbbbb|
10st: PHD acc |600005000000000060000000000078077779700000000000877977000000|
Rel acc |126332400458562112446653213343036553400332613021345633141677|
subset: SUB acc |..b...b..bbbbb....bbbbb.....e...eee.e.....b......eee...b.bbb|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT|
PHD sec |HHHHHHHHHHHHHHHHHHHHEE E EHHHHHHHHHHHHHHHHHHHHHHH|
Rel sec |999973267997555553112211676652123331113689999999986579998878|
detail:
prH sec |889875577887666665443343111121100123245779999999987788988888|
prE sec |000000000000111113445532100013443222333110000000000000000000|
prL sec |000013421001211111000124677764355553311000000000012210001111|
subset: SUB sec |HHHHH..HHHHHHHHHH.......LLLLL..........HHHHHHHHHHHHHHHHHHHHH|
ACCESSIBILITY
3st: P_3 acc |bbbbbbbbbbbbbbbbbbbbbbebeeebbeb ebbbbbbbbbbbbbbbbbbbbbbbbbbb|
10st: PHD acc |000000000000000000000060677007056000000000000000000000000000|
Rel acc |866051354554565277675315133023001202456675769577765625664695|
subset: SUB acc |bbb.b..bbbbbbbb.bbbbb..b............bbbbbbbbbbbbbbbb.bbbbbbb|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF|
PHD sec |HHHEE HHHHHHHHHHHH EEEEEE HHHHHHHHHH |
Rel sec |762112455551258799707999999998457536899856842332323331436522|
detail:
prH sec |775332122211000100148888899998621221000000124555556554210012|
prE sec |013432211124320000000000000000000026888871001133333333121233|
prL sec |101234666664568889841000000001378741000027863311000012557644|
subset: SUB sec |HH.....LLLL..LLLLLL.HHHHHHHHHH.LLL.EEEEEELL.............LL..|
ACCESSIBILITY
3st: P_3 acc |bbbbb ebbb bbe eeee bbbbbbbbbbbb bebbbbbbbbbbbbbbbbbe eeeebb|
10st: PHD acc |000005600050065999950000000000003060000000000000000074787600|
Rel acc |855101110114320363211575038745211219697951231576187040366202|
subset: SUB acc |bbb........b....e....bbb..bbbb.....bbbbbb....bbb.bb.e..ee...|
....,....31...,....32...,....33...,....34...,....35...,....36
AA |GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW|
PHD sec |EHHHHHHHHHHHHHHHHHHHHHHHHHEEE EHHHHHHEEEEEEE HHHHHHHHH |
Rel sec |145799999999987786886644113313222244432166787379514432565521|
detail:
prH sec |356799999999887787887755442110112465554311100000145555776654|
prE sec |311000000000000001101122334543334322123477888610111233212110|
prL sec |322100000000011101000111222245453211222210000379642111000134|
subset: SUB sec |..HHHHHHHHHHHHHHHHHHHH..................EEEEE.LLL.....HHHH..|
ACCESSIBILITY
3st: P_3 acc |b bbbbbbbbbbbbbbbbbbbbbbbbbbbebebebbbbbbbbbbbbbebbebbbbbbbbe|
10st: PHD acc |030000000000000000000000000006070600000000000007006000000007|
Rel acc |303577598668457576889446536232031140503748789513342676458213|
subset: SUB acc |...bbbbbbbbbbbbbbbbbbbbbb.b.......b.b..bbbbbbb...b.bbbbbb...|
....,....37...,....38...,....39...,....40...,....41...,....42
AA |GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF|
PHD sec | EEE HHHHHHHHHHEE EEEEE EEE EEEEEHHHHHHHHHHHHHHH|
Rel sec |894222213589755663341422133652576266525347852312245788999999|
detail:
prH sec |100000145688866775311233332111110000000011123545566788999998|
prE sec |003555310000000013563112355764101477632367865233322110000000|
prL sec |896333443210122111114654311123677522256621001111110000000000|
subset: SUB sec |LL.......HHHHHHHH..........EE.LLL.EEE.L..EEE......HHHHHHHHHH|
ACCESSIBILITY
3st: P_3 acc |eeebebebebbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb|
10st: PHD acc |866070707000000000000000000000000000000000000000000000000000|
Rel acc |301161303353642257342231655650103065311343999484588533788974|
subset: SUB acc |....e.....b.bb..bb.b....bbbbb.....bb....b.bbbbbbbbbb..bbbbbb|
....,....43...,....44...,....45...,....46...,....47...,....48
AA |AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW|
PHD sec |H HHHEEEE HHHHHHH EEEEEEE EE HHHHH HHHHH|
Rel sec |412402452125332247643203578764132121113136756876346894212233|
detail:
prH sec |645532112210013567766432110100000114445421122111221003455555|
prE sec |001134564332332000000224678776435431110121100011111000000002|
prL sec |343232223456554421133232100112463444443347767876567886543332|
subset: SUB sec |.......E...L.....HH.....EEEEE............LLLLLLL..LLL.......|
ACCESSIBILITY
3st: P_3 acc |bbbb bbbebebeebeeebbebbbbbbbbbbbbbbbbbbbbbbebbb beebbebbebb|
10st: PHD acc |000030006060662777007000000000000000000000070005407720700600|
Rel acc |233004301201110454015308382635437254301645231101114410420101|
subset: SUB acc |.....b.........eee..e..b.b.b.bb.b.bb...bbb........ee..e.....|
....,....49...,....50...,....51...,....52...,....53...,....54
AA |NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET|
PHD sec |HHHHHH HHHHHHHHHHHHHHHHHHHHH EEE |
Rel sec |434433168999999999999999995237554556678897521653789995346787|
detail:
prH sec |556555477899999999999999987531101111110101111000000002322101|
prE sec |232210000000000000000000000001222211100000234773100000010001|
prL sec |111123421000000000000000002357666666778887654225889996567787|
subset: SUB sec |.......HHHHHHHHHHHHHHHHHHHH..LLL.LLLLLLLLLL..EE.LLLLLL..LLLL|
ACCESSIBILITY
3st: P_3 acc |bbbbbbbbbbbbbbbbbbbbbbbebbeeeeeeeeeeeeeeeeeebebbee eeee bbee|
10st: PHD acc |000000000000000000000007007777776777677877760600785677740077|
Rel acc |016332443755466348746833003465521335027643412131260034401353|
subset: SUB acc |..b...bb.bbbbbb.bbbbbb.....eeee....e..eee.e......e...ee...e.|
....,....55...,....56...,....57...,....58...,....59...,....60
AA |LPKPEDWDRAQAHR|
PHD sec | |
Rel sec |88754523212289|
detail:
prH sec |01122233345410|
prE sec |00000000000000|
prL sec |88876655554589|
subset: SUB sec |LLLL.L......LL|
ACCESSIBILITY
3st: P_3 acc |beeeeeeeeeeeee|
10st: PHD acc |07767777767799|
Rel acc |03516535504549|
subset: SUB acc |..e.ee.ee.eeee|
************************************************************
* PHDhtm Helical transmembrane prediction
* note: PHDacc and PHDsec are reliable only for
* water-soluble globular proteins. Thus,
* please take the predictions above with
* particular caution wherever transmembrane
* helices are predicted by PHDhtm!
************************************************************
PHDhtm
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ|
PHD htm | TTTTTTTTTTTTTTTTT |
Rel htm |999999999999999999999999998754204677888888877652146778999999|
detail:
prH htm |000000000000000000000000000122357888999999988876421110000000|
prL htm |999999999999999999999999999877642111000000011123578889999999|
subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLL....HHHHHHHHHHHHHH...LLLLLLLLLL|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP|
PHD htm | TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT |
Rel htm |999999999999999999999886314677888888888888887776532333331047|
detail:
prH htm |000000000000000000000001357888999999999999998888766666665421|
prL htm |999999999999999999999998642111000000000000001111233333334578|
subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLL...HHHHHHHHHHHHHHHHHHHHHH..........L|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI|
PHD htm | TTTTTTTTTTTTT T|
Rel htm |767667666513454345666641266788999999999999999999999998876411|
detail:
prH htm |111111111246777677888875311100000000000000000000000000011245|
prL htm |888888888753222322111124688899999999999999999999999999988754|
subset: SUB htm |LLLLLLLLLL...H...HHHHH...LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL...|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT|
PHD htm |TTTTTTTTTTTTTTTTTTTTT TTTTTTTTTTTTTTTTTTTTTTTTT|
Rel htm |456676777777777765431046777777777513567888888888888888887652|
detail:
prH htm |778888888888888887765421111111111246788999999999999999998876|
prL htm |221111111111111112234578888888888753211000000000000000001123|
subset: SUB htm |.HHHHHHHHHHHHHHHHH.....LLLLLLLLLLL..HHHHHHHHHHHHHHHHHHHHHHH.|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF|
PHD htm |T TTTTTTTTTTTTTTTTTTTTTT |
Rel htm |013678999999999999999899986520110025677777777777542024677776|
detail:
prH htm |543110000000000000000000001235555567888888888888776432111111|
prL htm |456889999999999999999999998764444432111111111111223567888888|
subset: SUB htm |...LLLLLLLLLLLLLLLLLLLLLLLLL.......HHHHHHHHHHHHHH.....LLLLLL|
....,....31...,....32...,....33...,....34...,....35...,....36
AA |GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW|
PHD htm | TTTTTTTTTTTTTTTTTTTTT TTTTTTTTTTTTTTTTTTT |
Rel htm |531256788888888888776520123567888775202346788887776655410234|
detail:
prH htm |234678899999999999888765432211000012356678899998888877754332|
prL htm |765321100000000000111234567788999987643321100001111122245667|
subset: SUB htm |L...HHHHHHHHHHHHHHHHHH.....LLLLLLLLL.....HHHHHHHHHHHHH......|
....,....37...,....38...,....39...,....40...,....41...,....42
AA |GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF|
PHD htm | TTTTTTTTTTTTTTTTTTTT TTTTTTTTTTTTTTTTTTT|
Rel htm |444455411456777788888877765301234542234302334456777777777788|
detail:
prH htm |222222245778888899999988887644332223332346667778888888888899|
prL htm |777777754221111100000011112355667776667653332221111111111100|
subset: SUB htm |....LL....HHHHHHHHHHHHHHHHH......L............HHHHHHHHHHHHHH|
....,....43...,....44...,....45...,....46...,....47...,....48
AA |AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW|
PHD htm |TTTTTTT TTTTTTTTTTTTTT |
Rel htm |777764203455677777655402567777777754102457899999999999999998|
detail:
prH htm |888887643222111111122246788888888877543221000000000000000000|
prL htm |111112356777888888877753211111111122456778999999999999999999|
subset: SUB htm |HHHHH.....LLLLLLLLLLL...HHHHHHHHHHH.....LLLLLLLLLLLLLLLLLLLL|
....,....49...,....50...,....51...,....52...,....53...,....54
AA |NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET|
PHD htm | TTTTTTTTTTTTTTTTTTTTTT |
Rel htm |762036778888888888888765304678999999999999999999999999999999|
detail:
prH htm |113568889999999999999887642110000000000000000000000000000000|
prL htm |886431110000000000000112357889999999999999999999999999999999|
subset: SUB htm |LL...HHHHHHHHHHHHHHHHHHH...LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....55...,....56...,....57...,....58...,....59...,....60
AA |LPKPEDWDRAQAHR|
PHD htm | |
Rel htm |99999999999999|
detail:
prH htm |00000000000000|
prL htm |99999999999999|
subset: SUB htm |LLLLLLLLLLLLLL|
________________________________________________________________________________
*** ***
********************************************************************************
*** ***
*** Prediction of transmembrane regions (PHDhtm) ***
*** ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ***
*** ***
*** ***
*** Note: The accuracy of predicting helical trans-membrane regions is ***
*** some 95%. In a test on 69 proteins only one was not predicted to be ***
*** a trans-membrane protein (2mlt). PHDsec, for the prediction of ***
*** globular proteins, predicted this protein more accurately than ***
*** PHDhtm for trans-membrane proteins. Vice versa, about 5% out of 300 ***
*** globular proteins were misclassified as trans-membrane molecules. ***
*** These results have two practical consequences: ***
*** (i) if you know that your sequence is partly in a membrane and ***
*** PHDhtm does not predict a clear membrane region: ***
*** -> try PHDsec, it may be more accurate although in general not ***
*** suited for membrane proteins. ***
*** (ii) if you assume your sequence is not at all in a membrane and ***
*** PHDhtm does predict a membrane segment: ***
*** -> ignore the trans-membrane prediction. ***
*** ***
*** For residues predicted to be outside of the lipid bilayer (predicted ***
*** as loop), PHDsec should give reasonably accurate results, provided ***
*** the regions sticking out of the membrane are long enough. ***
*** ***