The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

rost
# prion_human
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP
HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA
VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV
NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV
ILLISFLIFLIVG
________________________________________________________________________________

The sequence had been interpreted as being:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

>P1; t1
(#) prion_human
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP
HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA
VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV
NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV
ILLISFLIFLIVG
________________________________________________________________________________

The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID           : identifier of aligned (homologous) protein
--- STRID        : PDB identifier (only for known structures)
--- PIDE         : percentage of pairwise sequence identity
--- WSIM         : percentage of weighted similarity
--- LALI         : number of residues aligned
--- NGAP         : number of insertions and deletions (indels)
--- LGAP         : number of residues in all indels
--- LSEQ2        : length of aligned sequence
--- ACCNUM       : SwissProt accession number
--- NAME         : one-line description of aligned protein
---
---
MAXHOM ALIGNMENT HEADER: SUMMARY
ID         STRID  IDE WSIM LALI NGAP LGAP LEN2 ACCNUM NAME
prio_human        100  100  253    0    0  253 P04156 MAJOR PRION PROTEIN PRECU
prio_gorgo        100  100  253    0    0  253 P40252 MAJOR PRION PROTEIN PRECU
prio_pantr         99   99  253    0    0  253 P40253 MAJOR PRION PROTEIN PRECU
prio_ponpy         98   99  253    0    0  253 P40256 MAJOR PRION PROTEIN PRECU
prio_colgu         97   98  253    0    0  253 P40251 MAJOR PRION PROTEIN PRECU
prio_prefr         97   98  253    0    0  253 P40257 MAJOR PRION PROTEIN PRECU
prio_atege         97   96  232    1    9  232 P40246 MAJOR PRION PROTEIN PRECU
prio_macfa         96   98  253    0    0  253 P40254 MAJOR PRION PROTEIN PRECU
prio_saisc         96   96  253    1    7  260 P40258 MAJOR PRION PROTEIN PRECU
prio_calja         96   97  252    1    1  252 P40247 MAJOR PRION PROTEIN PRECU
prio_calmo         96   98  241    0    0  241 P40248 MAJOR PRION PROTEIN PRECU
prio_cebap         96   96  252    1    1  252 P40249 MAJOR PRION PROTEIN PRECU
prio_cerae         96   96  245    1    8  245 P40250 MAJOR PRION PROTEIN PRECU
prio_mansp         96   98  241    0    0  241 P40255 MAJOR PRION PROTEIN PRECU
prio_aottr         96   96  239    1    1  239 P40245 MAJOR PRION PROTEIN PRECU
prio_bovin         92   92  252    2    9  264 P10279 PROTEIN 1).
prio_rat           92   94  225    1    1  226 P13852 MAJOR PRION PROTEIN (PRP)
prio_trast         91   91  252    2    9  264 P40242 PROTEIN 1).
prip_bovin         90   92  252    1    1  256 Q01880 PROTEIN 2).
prio_mouse         90   91  252    2    3  254 P04925 MAJOR PRION PROTEIN PRECU
prio_sheep         90   92  252    1    1  256 P23907 MAJOR PRION PROTEIN PRECU
prio_mesau         90   92  253    1    1  254 P04273 MAJOR PRION PROTEIN PRECU
prio_odohe         90   91  252    1    1  256 P47852 MAJOR PRION PROTEIN PRECU
prp2_trast         89   90  252    1    1  256 P40243 PROTEIN 2).
prio_musvi         88   90  252    2    2  257 P40244 MAJOR PRION PROTEIN PRECU
grp_horvu          48   42   88    3   21  200 P17816 GLYCINE-RICH CELL WALL ST
grw1_lyces         55   53   38    1    7   43 Q01157 GLYCINE-RICH CELL WALL ST
grp2_sinal         40   38   88    2   18  169 P49311 GLYCINE-RICH RNA-BINDING
grp8_arath         39   39   87    2   19  169 Q03251 GLYCINE-RICH RNA-BINDING
nucl_xenla         41   42   69    3    9  650 P20397 NUCLEOLIN (PROTEIN C23).
prio_chick         38   34  249    5   23  273 P27177 RECEPTOR-INDUCING ACTIVIT
gar1_schpo         39   35   71    2   22  194 Q06975 GAR1 PROTEIN.
grpa_medfa         38   34   93    2   11  159 Q09134 ABSCISIC ACID AND ENVIRON
egg1_schja         36   30  135    6   34  212 P19470 EGGSHELL PROTEIN 1 PRECUR
grp1_phavu         36   36  127    2   10  252 P10495 GLYCINE-RICH CELL WALL ST
grp2_nicsy         35   34  113    2    4  214 P27484 GLYCINE-RICH CELL WALL ST
roab_xenla         35   28  122    5   21  351 P17131 PROTEIN) (SINGLE-STRAND B
grp1_orysa         35   34  125    2   24  165 P25074 GLYCINE-RICH CELL WALL ST
k1ci_human         35   35  120    3    9  622 P35527 KERATIN, TYPE I CYTOSKELE
grp2_sorvu         35   30  115    4   21  168 Q99070 GLYCINE-RICH RNA-BINDING
grp1_sinal         35   34   98    2   21  166 P49310 GLYCINE-RICH RNA-BINDING
gr10_brana         34   32  117    2   24  169 Q05966 GLYCINE-RICH RNA-BINDING
ebn1_ebv           34   37  112    1    1  641 P03211 EBNA-1 NUCLEAR PROTEIN.
roa2_human         34   26  124    4   15  353 P22626 B1).
ch15_drogr         36   33   70    2   14  102 P13425 CHORION PROTEIN S15.
asf1_helan         34   28   83    2   24  161 P22357 ANTHER-SPECIFIC PROTEIN S
pcp_yeren          34   22   89    2   61  155 P31484 OUTER MEMBRANE LIPOPROTEI
grp_dauca          34   28  122    3   48  157 Q03878 GLYCINE-RICH RNA-BINDING
ykr3_caeel         35   36   71    0    0  113 P34309 HYPOTHETICAL 11.3 KD PROT
vnua_prvka         33   35  105    1   19 1733 P33485 PROBABLE NUCLEAR ANTIGEN.
chb3_bommo         34   30   77    2    6   91 P08915 CHORION CLASS B PROTEIN M
grpa_maize         33   28  104    2   21  157 P10979 GLYCINE-RICH RNA-BINDING,
grp_arath          33   37  132    0    0  338 P27483 GLYCINE-RICH CELL WALL ST
els_human          33   22  129    4   19  730 P15502 ELASTIN PRECURSOR.
roaa_xenla         33   28  126    3   18  365 P17130 PROTEIN) (SINGLE-STRAND B
grp2_orysa         32   34  154    2   23  183 P29834 GLYCINE-RICH CELL WALL ST
rnha_human         32   28   99    2   32 1279 Q08211 ATP-DEPENDENT RNA HELICAS
grp7_arath         32   37   99    1    4  176 Q03250 GLYCINE-RICH RNA-BINDING
grp2_phavu         32   36  127    1    6  465 P10496 GLYCINE-RICH CELL WALL ST
chb8_bommo         32   28   77    2    6  119 P08914 CHORION CLASS B PROTEIN M
grp1_cheru         32   27  113    3   21  144 P11898 GLYCINE-RICH PROTEIN HC1.
els_bovin          32   23  135    4   19  747 P04985 ELASTINS A/B/C PRECURSOR.
nucl_rat           32   32   85    1    3  712 P13383 NUCLEOLIN (PROTEIN C23).
nucl_human         32   32   85    1    3  706 P19338 NUCLEOLIN (PROTEIN C23).
nucl_mouse         32   32   85    1    3  706 P09405 NUCLEOLIN (PROTEIN C23).
vg38_bpm1          32   26  114    2   17  262 P08234 RECEPTOR RECOGNIZING PROT
grp1_pethy         32   36  130    0    0  384 P09789 GLYCINE-RICH CELL WALL ST
mcba_ecoli         42   40   43    1    2   69 P05834 BACTERIOCIN MICROCIN B17
roa1_drome         31   25  131    3   11  365 P07909 (PEN REPEAT CLONE P9).
grp3_artsa         31   22   90    3   18  308 P13230 GLYCINE-RICH PROTEIN GRP3
chb7_bommo         32   28   75    2    6  126 P08916 CHORION CLASS B PROTEIN M
roa1_scham         31   30  126    2    5  342 P21522 HETEROGENEOUS NUCLEAR RIB
sala_droor         31   25  120    3   18  142 P21748 PROTEIN SPALT-ACCESSORY.
els_chick          31   28  120    2    8  750 P07916 ELASTIN PRECURSOR (FRAGME
spd1_nepcl         31   28  140    1    6  747 P19837 SPIDROIN 1 (DRAGLINE SILK
sala_drosi         31   21  134    5   43  139 P21749 PROTEIN SPALT-ACCESSORY.
chb4_bommo         31   22  108    5   21  147 P05685 CHORION CLASS B PROTEIN B
ews_human          30   24  128    2   27  656 Q01844 RNA-BINDING PROTEIN EWS.
sqd_drome          30   28  115    2   12  345 Q08473 (HNRNP 40).
ydh3_hsvsc         34   35   61    0    0  103 P22577 HYPOTHETICAL 9.5 KD PROTE
sala_drome         30   22  123    4   25  142 P21750 PROTEIN SPALT-ACCESSORY.
egg2_schja         30   26  150    4   50  207 P19469 EGGSHELL PROTEIN 2A PRECU
ssb_ecoli          32   28   68    1    2  177 P02339 SINGLE-STRAND BINDING PRO
prpc_human         30   26   97    1    1  166 P02810 PIF-F, PIF-S, PROTEINS A
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/tmp/t1_11691.hssp from: 1 to: 253
/home/phd/tmp/t1_11691.ret_msf  MSF: 253  Type: P  9-Aug-96 02:00:0  Check: 7304  ..

Name: t1_11691    Len: 253  Check: 8781  Weight: 1.00
Name: prio_human  Len: 253  Check: 8781  Weight: 1.00
Name: prio_gorgo  Len: 253  Check: 9429  Weight: 1.00
Name: prio_pantr  Len: 253  Check: 9714  Weight: 1.00
Name: prio_ponpy  Len: 253  Check:  629  Weight: 1.00
Name: prio_colgu  Len: 253  Check:  433  Weight: 1.00
Name: prio_prefr  Len: 253  Check:   92  Weight: 1.00
Name: prio_atege  Len: 253  Check: 4448  Weight: 1.00
Name: prio_macfa  Len: 253  Check: 9975  Weight: 1.00
Name: prio_saisc  Len: 253  Check:  346  Weight: 1.00
Name: prio_calja  Len: 253  Check: 8833  Weight: 1.00
Name: prio_calmo  Len: 253  Check: 6952  Weight: 1.00
Name: prio_cebap  Len: 253  Check: 9659  Weight: 1.00
Name: prio_cerae  Len: 253  Check: 2862  Weight: 1.00
Name: prio_mansp  Len: 253  Check: 5552  Weight: 1.00
Name: prio_aottr  Len: 253  Check: 4849  Weight: 1.00
Name: prio_bovin  Len: 253  Check:  542  Weight: 1.00
Name: prio_rat    Len: 253  Check:  161  Weight: 1.00
Name: prio_trast  Len: 253  Check:  828  Weight: 1.00
Name: prip_bovin  Len: 253  Check:  515  Weight: 1.00
Name: prio_mouse  Len: 253  Check: 1680  Weight: 1.00
Name: prio_sheep  Len: 253  Check: 1310  Weight: 1.00
Name: prio_mesau  Len: 253  Check: 9948  Weight: 1.00
Name: prio_odohe  Len: 253  Check: 1063  Weight: 1.00
Name: prp2_trast  Len: 253  Check:  272  Weight: 1.00
Name: prio_musvi  Len: 253  Check:   54  Weight: 1.00
Name: grp_horvu   Len: 253  Check:  759  Weight: 1.00
Name: grw1_lyces  Len: 253  Check: 2041  Weight: 1.00
Name: grp2_sinal  Len: 253  Check: 4903  Weight: 1.00
Name: grp8_arath  Len: 253  Check: 6106  Weight: 1.00
Name: nucl_xenla  Len: 253  Check: 6307  Weight: 1.00
Name: prio_chick  Len: 253  Check: 9139  Weight: 1.00
Name: gar1_schpo  Len: 253  Check: 3186  Weight: 1.00
Name: grpa_medfa  Len: 253  Check: 2708  Weight: 1.00
Name: egg1_schja  Len: 253  Check: 6293  Weight: 1.00
Name: grp1_phavu  Len: 253  Check: 3033  Weight: 1.00
Name: grp2_nicsy  Len: 253  Check:  830  Weight: 1.00
Name: roab_xenla  Len: 253  Check: 3477  Weight: 1.00
Name: grp1_orysa  Len: 253  Check: 6544  Weight: 1.00
Name: k1ci_human  Len: 253  Check: 3405  Weight: 1.00
Name: grp2_sorvu  Len: 253  Check: 9465  Weight: 1.00
Name: grp1_sinal  Len: 253  Check: 3668  Weight: 1.00
Name: gr10_brana  Len: 253  Check: 4828  Weight: 1.00
Name: ebn1_ebv    Len: 253  Check: 4393  Weight: 1.00
Name: roa2_human  Len: 253  Check: 6823  Weight: 1.00
Name: ch15_drogr  Len: 253  Check: 2332  Weight: 1.00
Name: asf1_helan  Len: 253  Check: 4095  Weight: 1.00
Name: pcp_yeren   Len: 253  Check: 3128  Weight: 1.00
Name: grp_dauca   Len: 253  Check: 2776  Weight: 1.00
Name: ykr3_caeel  Len: 253  Check: 4721  Weight: 1.00
Name: vnua_prvka  Len: 253  Check: 5805  Weight: 1.00
Name: chb3_bommo  Len: 253  Check: 2045  Weight: 1.00
Name: grpa_maize  Len: 253  Check: 2510  Weight: 1.00
Name: grp_arath   Len: 253  Check: 1075  Weight: 1.00
Name: els_human   Len: 253  Check: 5247  Weight: 1.00
Name: roaa_xenla  Len: 253  Check:  479  Weight: 1.00
Name: grp2_orysa  Len: 253  Check: 7212  Weight: 1.00
Name: rnha_human  Len: 253  Check: 9503  Weight: 1.00
Name: grp7_arath  Len: 253  Check: 6751  Weight: 1.00
Name: grp2_phavu  Len: 253  Check: 1994  Weight: 1.00
Name: chb8_bommo  Len: 253  Check: 1451  Weight: 1.00
Name: grp1_cheru  Len: 253  Check: 3797  Weight: 1.00
Name: els_bovin   Len: 253  Check:  727  Weight: 1.00
Name: nucl_rat    Len: 253  Check: 3798  Weight: 1.00
Name: nucl_human  Len: 253  Check: 3956  Weight: 1.00
Name: nucl_mouse  Len: 253  Check: 3798  Weight: 1.00
Name: vg38_bpm1   Len: 253  Check: 3833  Weight: 1.00
Name: grp1_pethy  Len: 253  Check: 5946  Weight: 1.00
Name: mcba_ecoli  Len: 253  Check: 8534  Weight: 1.00
Name: roa1_drome  Len: 253  Check:  150  Weight: 1.00
Name: grp3_artsa  Len: 253  Check: 8770  Weight: 1.00
Name: chb7_bommo  Len: 253  Check: 9468  Weight: 1.00
Name: roa1_scham  Len: 253  Check: 1032  Weight: 1.00
Name: sala_droor  Len: 253  Check: 6894  Weight: 1.00
Name: els_chick   Len: 253  Check:  553  Weight: 1.00
Name: spd1_nepcl  Len: 253  Check: 1595  Weight: 1.00
Name: sala_drosi  Len: 253  Check: 2965  Weight: 1.00
Name: chb4_bommo  Len: 253  Check: 7618  Weight: 1.00
Name: ews_human   Len: 253  Check: 1326  Weight: 1.00
Name: sqd_drome Len: 253 Check: 5891 Weight: 1.00 Name: ydh3_hsvsc Len: 253 Check: 1140 Weight: 1.00 Name: sala_drome Len: 253 Check: 6682 Weight: 1.00 Name: egg2_schja Len: 253 Check: 1991 Weight: 1.00 Name: ssb_ecoli Len: 253 Check: 2795 Weight: 1.00 Name: prpc_human Len: 253 Check: 7305 Weight: 1.00 // 1 50 t1_11691 MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_human MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_gorgo MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_pantr MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_ponpy MANLGCWMLV LFVATWSNLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_colgu MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_prefr MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_atege .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_macfa MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_saisc MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_calja MANLGCWMLF LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_calmo .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_cebap MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNLYP prio_cerae MANLGCWMLV VFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_mansp .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_aottr .......MLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QSSPGGNRYP prio_bovin .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prio_rat .......... .......... 
........GG WNTGGSRYPG QGSPGGNRYP prio_trast .SHIGSWILV LFVAMWSDVA LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prip_bovin .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prio_mouse MANLGYWLLA LFVTMWTDVG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_sheep .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prio_mesau MANLSYWLLA LFVAMWTDVG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP prio_odohe .SHIGSWILV LFVAMWSDVG LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prp2_trast .SHIGSWILV LFVAMWSDVA LCKKRPKpgG WNTGGSRYPG QGSPGGNRYP prio_musvi .SHIGSWLLV LFVATWSDIG FCKKRPKpgG WNTGGSRYPG QGSPGGNRYP grp_horvu .......... .........G GGGYPGGGGG YGGGGGGYPG HGGEGGGGY. grw1_lyces .......... .......... .......... .......... ...GGGGRYP grp2_sinal .......... .......... ..NEAQSRGS GAGGGGRGGG GGYRGGGGY. grp8_arath .......... .......... ..QSRGSGGG GGGRGGSGGG YRSGGGGGYS nucl_xenla .......... .......... ..SQRGGRGG FGRGGGFRGG RGGRGGG... prio_chick MARLlcCLLA LLLAACTDVA LSKkkPSGGG WGAGSHRQPS YPRQPGYPHN gar1_schpo .......... .......... .......... .........G PKKPKGARNG grpa_medfa .......... .......... .......GGG YNHGGGGYNG GGYNHGG... egg1_schja LAAIG.YTIA YPPPSDYDSG YGGGGGGGGG GGYGGWCGGS DCYGGGNGGG grp1_phavu .......... .........G YGGGAGKGGG EGYGGGGANG GGYGGGGGSG grp2_nicsy .......... .........G GGGGGGRGGG GYGGGSGGYG GGGRGGSRGY roab_xenla .......... ......SRGG FGNDNFGGRG GNFGGNR.GG GGGFGNRGYG grp1_orysa .......FLL LLTISLSKSN AARVIKYNGG GSGGGGGGGG GGGGGGNGSg k1ci_human .......... .......... ........GG GGSGGGYGGG SGSRGGSGGs grp2_sorvu ......FGFV TFSSEQSMLD AIEngKELDG RNITVNQAQS RGGGGGGGGY grp1_sinal .......... ..IEGMNGQD LDGRSITVNE AQSRGSGGGG GGRGGGGGYR gr10_brana .......... ....TFSQFG EVIDSKIIND RETGRSRGFG FVTFKDEKsq ebn1_ebv .......... .........G GTGAGAGAGG AGAGGAGAGG GAGAGGGAGG roa2_human .......... ......SRGG GGNFGPGPGS NFRGGSDGYG SGRGFGDGYN ch15_drogr .......... .......... .......... .......... .......... asf1_helan .......... .......... ..NPGPPPGA PGTPGTPPAP PGKGEGDAPH pcp_yeren .......... .......... .......... .......... 
.......... grp_dauca MAEVEYRCFV GGLAwfSQFG DITDSKIIND RETGRSrlDG RNITVNEAQS ykr3_caeel .......... .......... .......... .......... .GSIAGNLIR vnua_prvka .......... .........G AALPARGPGG LRGRGRGGRG GGGGGGGRGP chb3_bommo .......... ..VGVSGNLP FLGTADVAGE FPTAGIGEIL YGCGNGAviT grpa_maize .......... .FVTFSSENS MLDAIENMNG KELDGR.... ..NITVNQAQ grp_arath .......... ...GSGGGLG GGIGGGAGGG AGGGGGLGGG HGGGIGGGAG els_human .........V LPGARFPGVG VLPgkPKAPG VGGAFAGIPG VGPFGG.... roaa_xenla .......... .........G NRGGGGGFGN RGYGGDGYNG DGQLWWQPSL grp2_orysa LAILVLLSIG MTTSARTLLG YGPGGGGGGG GEGGGGGYGG SGYGSGSGYG rnha_human .......... .......... .MARYDNGSG YRRGGSSYSG GGYGGGYSSG grp7_arath .......... .......... ........DG RSITVNEAQS RGSGGGGGHR grp2_phavu .......... ......YGTG GGAGGGGGGG GDHGGGYGGG QGAGGGAGGG chb8_bommo .......... ..VGVCGNLP FLGTADVAGE FPTAGIGEID YGCGNGAviT grp1_cheru LLGLSIAFAI LISSEVAARE LAETAAKTEG YNNGGGYHNG GGGYNNGGGY els_bovin .......... ........LL LCILQPSQPG GVPGAVP... GGVPGGVFFP nucl_rat .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF nucl_human .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF nucl_mouse .......... .......... ..AKEAMEDG EIDGNKVTLD WAKPKGEGGF vg38_bpm1 .......... .......... .......... LNIHGVTMYG RGGNGGSNSP grp1_pethy .......... .........G AGGGFGGGAG GGAGGGLGGG GGLGGGGGGG mcba_ecoli .......... .......... .......... .......... .......... roa1_drome .......... ........VD VKKALPKQND QQGGGGGRGG PGGRAGGNRG grp3_artsa .......... .......... .......... .......... .......... chb7_bommo .......... ..VGVSGNLP FLGTADVAGE FPTAGIGEID YGCGNGAviT roa1_scham .......... ..VGGGAGGG WGGGRGDWGG SAGGGG...G GGWGGADPWE sala_droor .......... .......... .......... .......... NGYGQGGQGP els_chick AAPLLPGVLL LFSILPASQQ GGVPGAIPGG GVPGGGFFPG AGVGGL.... spd1_nepcl ........LG SQGAGRGGQG AGAAAAAAGG AGQGGYGGLG SQGAGRGGLG sala_drosi .MKLLIALLA LVTAAIAQNG F......... .........G QGGYGGQ... chb4_bommo .......... .......... .........G RGCGGRGYGG LGY....... 
ews_human .AAVEWFDGK DFQGSKLKVS LARKKPPMNS MRGGLPPREG RGMPPPLRGG sqd_drome .......... ......KEVD VKRATPKPEN QMMGGMRGGP RGGMRGGRGG ydh3_hsvsc .......... .......... .......... .SPGGPGGPG GPGGPGGPGG sala_drome .......... .......... .......... IAQNGFGQVG QGGYGGQ... egg2_schja LAAIG.YTIA YPPSSDYDSG YGGGGGGGGG GGYGGWCGGS DCYGGGNGGG ssb_ecoli .......... LRTRKWTDQS GQDRYTTEVV VNVGGTMQML GGRQGGG..A prpc_human .......... .........G NQDDGPQQGP PQQGGQQQQG PPPPQGKPqp 51 100 t1_11691 PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_human PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_gorgo PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_pantr PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_ponpy PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_colgu PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_prefr PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_atege .........P QGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN prio_macfa PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWH prio_saisc PQGGggWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN prio_calja PQGGG.WGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN prio_calmo PQGGGSWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN prio_cebap PQGGG.WGQP HGGGWGQPHG GGWGQPHGGS WGQPHGGGWG QGGGTHNQWN prio_cerae PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQ....... 
.GGGTHNQWH prio_mansp PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWH prio_aottr PQSGG.WGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN prio_bovin PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG qqGGTHGQWN prio_rat PQSGGTWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWS QGGGTHNQWN prio_trast SQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG qqGGTHGQWN prip_bovin PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGSHSQWN prio_mouse PQGGT.WGQP HGGGWGQPHG GSWGQPHGGS WGQPHGGGWG QGGGTHNQWN prio_sheep PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGSHSQWN prio_mesau PQGGGTWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHNQWN prio_odohe PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPHGGGGW GQGGTHSQWN prp2_trast PQEGGDWGQP HGGGWGQPHV GGWGQPHGGG WGQPHGGGGW GQGGTHGQWN prio_musvi PQGGGGWGQP HGGGWGQPHG GGWGQPHGGG WGQPhgGGWG QGGGSHGQWG grp_horvu ..GGGGGYPG HGGEGGGGYG GGGGYHGHGG EG...GGGYG GGGGYHGH.. grw1_lyces GGGGGGRG.. .....GGRYS GGGGRGGGGG RGGRGGGG.. .......... grp2_sinal GGGGGGYGGG RREGGGYSGG GGGYSSRGGG GGGYGGGGRR DGGGY..... grp8_arath GGGGGGYSGG GGGGYER.RS GGYGSGGGGG GRGYGGGGRR EGGGY..... nucl_xenla ..GGRGFGG. RGGGRGR... GGFGGRGGGG FRGGQGGGFR GGQGKKMRFD prio_chick PGYPHNPGYP HNPGygYPHN PGYPqpHNPG YPGWGQGYNP SSGGSYHNQK gar1_schpo PAGRGGRGGF RGGRGGS..R GGFGGNSRGG FGGGSRGGFG GGSRGGSR.. grpa_medfa ..GGYNNGGG YNHGGGGYNN GGGGYNHGGG GYNNGGGGYN HGGGGYNNGG egg1_schja GGGGGGNGGE YGGGYGDVYG GSYGGggGGG YGDVYGGGCG ggGGN..... grp1_phavu GGGGGGAGGa yGGGEGSGAG GGYGGANGGG GGGNGGGGGG GSGGAHGGGA grp2_nicsy GGGDGGYGGG GGYGGGSRYG GGGGG.YGGG GGY...GGGG SGGGSGCFKC roab_xenla GDGYNGDGQl wNRGYGAGQG GGYGAGQGGG YGGgqGGGYG GNGGYDGYNG grp1_orysa gGGGGGGGGG NGSGSGSGYG YGYGQGNGGA QGQGSGGGGG GGGGGGGGGs k1ci_human sGSGGGSGGG YGGGSGGGHS GGSGGGHSGG SGGNYGGGSG SGGGSGGGYg grp2_sorvu GGGGGGYGGR EGGGYGG.GG GGYGGRREGG GGY.GGGGYG GGGGGY.... grp1_sinal SGGGGGYGGG GGGYGGGGRE GGYSG.GGGG YSSRGGGGGG YGGGGRRD.. 
gr10_brana SRGGGGGGGR GGGGYGGRGG GGYGGGGGGY GDRRGGGGYG SGGGGRGGGG ebn1_ebv AGGAGGAGAG GGAGAGGGAG GAGGAGAGGG AGAggAGGAG AGGGAGGAGG roa2_human GYGGGPGGGN FGGSPG..YG GGRGGYGGGG PGYgqGGGYG ggSGNYNDFG ch15_drogr .......... SAGGYGNIGL GGYGL.GNVG YLQNHGGGYG RRPILISKSS asf1_helan pdGGSGPAPP AGGGSPPPAG GDGGGGAPPP AGGDGGGGAP PPAGGDG... pcp_yeren .......... QGGDDNNVMG AIGGAVLGGF LGNTVGGGTG RSLAT..... grp_dauca RGSGGGGGRR EGGGGGYGGG GGYGGRREGG GGGGYGGRRE GGGGGYGG.. ykr3_caeel DKVGGAGGDI LGGLASNFFG GGGGGGGGGG GGGFGGGNGG FGGGIFIFKI vnua_prvka RGRGGRRRRr lGGGRGRGGR GGRGGRGRGG GRAPRGGGGG PGGGGRAGRG chb3_bommo REGGLGYGAG YGGGYGLGYG G.....YGGG YGLGYGGYGG CGCG...... grpa_maize SRGGGGGGGG YGGGRGGGGY GGGRRDGGYG GGGGYGGRRE GGGGGYG... grp_arath GGAGGGLGGG HGGGIGGGAG GGSGGGLGGG IGGGAGGGAG GGGGAGGGGG els_human PQPGVPLGYp lPGGYGLPYT TgyGYGPGGV AGAAGKAGYP TGTGVGPQAA roaa_xenla LGWNRGYGAG QGGGYGAGQG GGYGGgqGGG YGGnsGGNFG SSGGYNDFGN grp2_orysa ..EGGGSGGA AGGGYGRGGG GGGGGGEGGG SGSggGGGGG GQGGGAGGYG rnha_human GYGSGGYGGs vGGGYRGVSR GGFRGNSGGD YRGPSGGYRG SGGFQR.... grp7_arath GGGGGGYRSG GGGGYS.... GGGGSYGGGG GRREGGGGYS GGGGGYSSRG grp2_phavu YGGGGEHGGG GGGGQGGGAG GGYGAgaGGG QGGGAGGGYG AGGEHGGGAG chb8_bommo REGGLGYGAG YGDGYGLGYG G.....YGGG YGLGYGGYGG CGCG...... grp1_cheru HNGGGGYNNg hNGGGGYNNG GGygGHHNGG GGYNNGGGYH GGGGSCYHYC els_bovin GAGLGGLGVG GLGPGVKPAK PGVGGLVGPG LGAepGGFFG AGGGA..... nucl_rat GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGDF... nucl_human GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGD...H nucl_mouse GGRGGGRGGF GGRGGGRGGR GGFGGRGRGG FGGRGGFRGG RGGGGDF... vg38_bpm1 GSAGGHCIQN NIGGRLRINN GGAIAGGGGG GGGggGGGRP FGAAGGYSGG grp1_pethy AGGGGGVGGG AGSGGGFGAG GGVGGGAGAG GGVGGGGGFG GGGGGGVGGG mcba_ecoli ...GVGIGGG GGGGGGGSCG GQGGG..CGG CSNGCSGGNG GSGGSGSH.. roa1_drome NMGGGNYGNQ NGGGNWNNGG NNWGNNRGGn fGGGGGGGGG YGGGNNSWGN grp3_artsa .MGGPGPMGP QGRGRGRGRG GFSGPdmDPG YGF.DESYCG MGGGYEMPYN chb7_bommo REGGFGYGAG YGDGYGLGFG G.....YGGG YGLGYGGYGG CG........ 
roa1_scham NGRGGGGDRW GGGGGGMGGG DRWGGGGGMG GGDRYGGGGG RSGGWSNDGY sala_droor YGGQGGFGGY GGLGGQAGFG GQIGFNGQGG VGGQLGVG.. QGGVSPGQ.. els_chick ...GAGLGAG LGAGGKPLKP GVSGLGGLGP LGLQPGAGVG GLGAGLGAFP spd1_nepcl GQGAGAAAAA AAGGAGQGGY GGLGNQGAGR GGQggGAGQG GYGGLGSQGA sala_drosi ....GGFGGF GGLGGQAGFG GQIGFNGQGG VGG..QVGIG QGGVHPGQ.. chb4_bommo ..GGLGYGGL GYGGLGGGCG RGFS...GGG LPVATASAAP TGLGIASeyE ews_human PGGPGGPGGP MggGRGGDRG GFPPRGPRGS RGNPSGGGNV QHRAGDWQCP sqd_drome YGGRGGYNNq dGQGSYGGYG GGYGGYGAGG YGDYYAGGyg YGGGFEGNGY ydh3_hsvsc PGGPGGPGGP CGPGGPCGPG GPCGPGGPGG PGGPRSPVSS IG........ sala_drome ....GGFGGF GGIGGQAGFG GQIG..FTGQ GGVSGQVGIG QGGVHPGQ.. egg2_schja GGGGGGNGGE YGGGYGDVYG GSYGGgyGGG NGGGNGGGGG CNGGGcnDYY ssb_ecoli PAGGNIGGGQ PQGGWGQPQQ PQGGNQFSGG .......... .......... prpc_human PQQGGHPPPP QGRPQGPPQQ GGHPRPPRGR PQGPPQQGGH QQGPPPPPPG 101 150 t1_11691 KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY prio_human KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY prio_gorgo KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY prio_pantr KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGSDYEDRYY prio_ponpy KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPIIH FGNDYEDRYY prio_colgu KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_prefr KPSKPKSNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_atege KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_macfa KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_saisc KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_calja KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_calmo KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_cebap KPSKPKTSMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_cerae KPSKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_mansp KPNKPKTSMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_aottr KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_bovin KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY prio_rat 
KPSKPKTNLK HVAGAAAAGA VVGGLGGYML GSAMSRPMLH FGNDWEDRYY prio_trast KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY prip_bovin KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_mouse KPSKPKTNLK HVAGAAAAGA VVGGLGGYML GSAMSRPMIH FGNDWEDRYY prio_sheep KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY prio_mesau KPSKPKTNMK HMAGAAAAGA VVGGLGGYML GSAMSRPMMH FGNDWEDRYY prio_odohe KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMNRPLIH FGNDYEDRYY prp2_trast KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGSDYEDRYY prio_musvi KPSKPKTNMK HVAGAAAAGA VVGGLGGYML GSAMSRPLIH FGNDYEDRYY grp_horvu .......... ...GGEGGGG YGGGGGGY.. .......... .......... grw1_lyces .......... .......... .......... .......... .......... grp2_sinal .......... ..GGGEGGGY GGGGGGGW.. .......... .......... grp8_arath .......... ...GGGDGGS YGGGGGGW.. .......... .......... nucl_xenla .......... .......... .......... .......... .......... prio_chick PWKPPKTNFK HVAGAAAAGA VVGGLGGYAM GRVMSGMNYH FDSPDEYRWW gar1_schpo .......... ........GG FRGGSRGGFR GR........ .......... grpa_medfa GGYN...... HGGGGYNGGG YNHGGGGYNH G......... .......... egg1_schja .......... ........GG GNGGGGGCNG GGCGGGPD.F YGKGYEDSYG grp1_phavu AGGGEGAGQG aaAGGGGRGS GGGGGGGYGG GGARGSGYGG GGGSGE.... grp2_nicsy GESGHFARDC SQSGGGGGGG RFGGGGGGGG GGGCYK.... .......... roab_xenla GGS....... ...GFSGSGG NFGSSGGYnf GNYNSQSSSN FGPMKGGNY. grp1_orysa qGSGSGYGYG YGKGGGGGGG GGGGGGGGGG GS........ .......... k1ci_human sGSRGGSGGS HGGGSGFGGE SGGSYGG... GEEASGSGGG YGGGSGKSSH grp2_sorvu .......... ...GGREGGG GYGGGGGYGG NRGDSGGNWR .......... grp1_sinal .......... ........GG EGGGYGGSGG G......... .......... gr10_brana YGSG.GGGYG GGGGRRDGGG YGGGDGGYGG GS........ .......... ebn1_ebv AGAGGGAGAG GGAGGAGAGG GAGGAGGAGA G......... .......... roa2_human NYNQQPSNYG PMKSGNFGGS rgGPYGGGNY GPGGSGGSGG YG........ ch15_drogr NPSAAAanQR GVIGYELDGG ILGGHGGYGG G......... .......... asf1_helan .......... ..GGAPPPGA .......... .......... .......... pcp_yeren .......... 
......AAGA VAGGMAGQGV QGAMNR.... .......... grp_dauca .......... ..GGGGYGGR REGGDGGYGG GGGGSR.... .......... ykr3_caeel VRQKFPKNSS SF........ .......... .......... .......... vnua_prvka EVRVAAAAAG AAEAAAAAEG ALSG...... .......... .......... chb3_bommo .......... .......... .......... .......... .......... grpa_maize .......... ..GGGGYGGR REGGGGGYGG GGGGWR.... .......... grp_arath LGGGHGGGFG GGAGGGLGGG AGGGTGGGFG GGAGGGAGGG AGGGF..... els_human AAAAAKAAAK FGAGAAGVLP GVGGAGVPGV PGAIPGIGGI AG........ roaa_xenla YNSQSSSNFG PMKGGNYGGG RNSGpgGYGG GSASSSSGYG GGRRF..... grp2_orysa QGSGYGSGYG SGAGGAHGGG YGSGGGGGGG GGQGGGSGYG SGSGYGSGYG rnha_human .......... ........GG GRGAYGTGYL DIEEEVAAIK LG........ grp7_arath GGGGSYGGGR REGGGGYGGG EGGGYGGSGG G......... .......... grp2_phavu GGQGGGAGGG YGAGGEHGGG AGGGQGGGAG GGYGAGGEHG GGA....... chb8_bommo .......... .......... .......... .......... .......... grp1_cheru HGR....... .CCSAAEAKA L......... .......... .......... els_bovin ..AGAAAAYK AAAKAGAAGL GVGGIGGVgl GVSTGAVVPQ LGAGVGAGVK nucl_rat KPQGKKTKFE .......... .......... .......... .......... nucl_human KPQGKKTKFE .......... .......... .......... .......... nucl_mouse KPQGKKTKFE .......... .......... .......... .......... vg38_bpm1 SASTAGTLTG AGIGSKPGNA IYGGNGGN.V GSAGGAFGGI SGSRY..... grp1_pethy SGHGGGFGAG GGVGGGAGGG LGGGVGGGGG GGSGGGGGIG GGSGHGGGF. mcba_ecoli .......... .......... .......... .......... .......... roa1_drome NNPWDNGNGG GNFGGGGNNW NNGgfGGYqy GGGPQRGGGN FNNNRMQPY. grp3_artsa GNAGWTASPG RGAGAGARGA .RGGLDQSRG GGKFPSARGG RGR....... chb7_bommo .......... .......... .......... .......... .......... roa1_scham NSGPQSDGFG GGYKQSYGGG AVRGSSGY.. GGSRSAPYSD RGS....... sala_droor .......... ..GGFAAQGP PNQYQPGY.. GSPVGSGHFH GGNPVDAGYI els_chick GAAFpaASAA ALKAAAKAGA GLGGVGG... .......... .......... spd1_nepcl GRGGLGGQGA GAAAAAAGGA GQGGYGGLGG QGAGQGGYGG LGSQGAGR.. sala_drosi .......... ..GGFAGQGS PNQYQPGY.. 
GNPVGSGHFH GGNPVESGHF chb4_bommo GTVGVCGNLP FLGTAAVAGE ftVGIGEILY GCGNGAVGIt yGAGYGGGY. ews_human NPGCGNQNFA wdRGRGGPGG MRGGRGGLM. .......... .......... sqd_drome GGGGGGGNMG GGRGGPRGGG GPKGGGGFNG G......... .......... ydh3_hsvsc .......... .......... .......... .......... .......... sala_drome .......... ..GGFAGQGS PNQYQPGY.. GSPVGSGHFH GANPVESGHF egg2_schja GGSNGRRNGH GKGGKGGNGG GGGKGGGKGG GNGEGNGKGg kGGSYAPSYY ssb_ecoli .......... .......... .......... .......... .......... prpc_human KPQGPPPQGG RPQGPP.... .......... .......... .......... 151 200 t1_11691 RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_human RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_gorgo RENMHRYPNQ VYYRPMDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_pantr RENMHRYPNQ VYYRPMDQYS SQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_ponpy RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_colgu RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_prefr RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_atege RENMYRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_macfa RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_saisc RENMYRYPSQ VYYRPVDQYS NQNNFVHDCV NVTIKQHTVT TTTKGENFTE prio_calja RENMYRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_calmo RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_cebap RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_cerae RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_mansp RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_aottr RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_bovin RENMHRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKEHTVT TTTKGENFTE prio_rat RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_trast RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE prip_bovin RENMHRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKEHTVT TTTKGENFTE prio_mouse RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_sheep RENMYRYPNQ VYYRPVDRYS NQNNFVHDCV NITVKQHTVT 
TTTKGENFTE prio_mesau RENMNRYPNQ VYYRPVDQYN NQNNFVHDCV NITIKQHTVT TTTKGENFTE prio_odohe RENMYRYPNQ VYYRPVDQYN NQNTFVHDCV NITVKQHTVT TTTKGENFTE prp2_trast RENMYRYPNQ VYYRPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE prio_musvi RENMYRYPNQ VYYKPVDQYS NQNNFVHDCV NITVKQHTVT TTTKGENFTE grp_horvu .......... .......... .......... .......... .......... grw1_lyces .......... .......... .......... .......... .......... grp2_sinal .......... .......... .......... .......... .......... grp8_arath .......... .......... .......... .......... .......... nucl_xenla .......... .......... .......... .......... .......... prio_chick SENSARYPNR VYYRDYSSPV PQDVFVADCF NITVTEYSIG PAAKKNTSee gar1_schpo .......... .......... .......... .......... .......... grpa_medfa .......... .......... .......... .......... .......... egg1_schja GDS...YGND YY........ .......... .......... .......... grp1_phavu .......... .......... .......... .......... .......... grp2_nicsy .......... .......... .......... .......... .......... roab_xenla .......... .......... .......... .......... .......... grp1_orysa .......... .......... .......... .......... .......... k1ci_human S......... .......... .......... .......... .......... grp2_sorvu .......... .......... .......... .......... .......... grp1_sinal .......... .......... .......... .......... .......... gr10_brana .......... .......... .......... .......... .......... ebn1_ebv .......... .......... .......... .......... .......... roa2_human .......... .......... .......... .......... .......... ch15_drogr .......... .......... .......... .......... .......... asf1_helan .......... .......... .......... .......... .......... pcp_yeren .......... .......... ......TDGV QLEVRKDDGT TILVVQKQGP grp_dauca .......... .......... .......... .......... .......... ykr3_caeel .......... .......... .......... .......... .......... vnua_prvka .......... .......... .......... .......... .......... chb3_bommo .......... .......... .......... .......... .......... 
grpa_maize .......... .......... .......... .......... .......... grp_arath .......... .......... .......... .......... .......... els_human .......... .......... .......... .......... .......... roaa_xenla .......... .......... .......... .......... .......... grp2_orysa GGNGHH.... .......... .......... .......... .......... rnha_human .......... .......... .......... .......... .......... grp7_arath .......... .......... .......... .......... .......... grp2_phavu .......... .......... .......... .......... .......... chb8_bommo .......... .......... .......... .......... .......... grp1_cheru .......... .......... .......... .......... .......... els_bovin PGKVPGVGLP GVY....... .......... .......... .......... nucl_rat .......... .......... .......... .......... .......... nucl_human .......... .......... .......... .......... .......... nucl_mouse .......... .......... .......... .......... .......... vg38_bpm1 .......... .......... .......... .......... .......... grp1_pethy .......... .......... .......... .......... .......... mcba_ecoli .......... .......... .......... .......... .......... roa1_drome .......... .......... .......... .......... .......... grp3_artsa .......... .......... .......... .......... .......... chb7_bommo .......... .......... .......... .......... .......... roa1_scham .......... .......... .......... .......... .......... sala_droor HGNHHEYPEH HGDHHREHHE HHGHHEHH.. .......... .......... els_chick .......... .......... .......... .......... .......... spd1_nepcl .......... .......... .......... .......... .......... sala_drosi HGNPHEYPEH HGEHHREHHE HHGHHEHH.. .......... .......... chb4_bommo .......... .......... .......... .......... .......... ews_human .......... .......... .......... .......... .......... sqd_drome .......... .......... .......... .......... .......... ydh3_hsvsc .......... .......... .......... .......... .......... sala_drome HENPHEYPEH HGDHHREHHE HHGHHEHH.. .......... .......... egg2_schja .......... 
.......... .......... .......... .......... ssb_ecoli .......... .......... .......... .......... .......... prpc_human .......... .......... .......... .......... .......... 201 250 t1_11691 TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_human TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_gorgo TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_pantr TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_ponpy TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_colgu TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_prefr TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVFFSSPPV ILLISFLIFL prio_atege TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLI.. prio_macfa TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_saisc TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_calja TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_calmo TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLI.. prio_cebap TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_cerae TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLIFL prio_mansp TDVKMMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFLI.. prio_aottr TDVKIMERVV EQMCITQYEK ESQAYYQRGS SMVLFSSPPV ILLISFL... prio_bovin TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL prio_rat TDVKMMERVV EQMCVTQYQK ESQAYYdrRS SAVLFSSPPV ILLISFLIFL prio_trast TDIKMMERVV EQMCITQYQR ESEAYYQRGA SVILFSSPPV ILLISFLIFL prip_bovin TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL prio_mouse TDVKMMERVV EQMCVTQYQK ESQAYyrRSS STVLFSSPPV ILLISFLIFL prio_sheep TDIKIMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL prio_mesau TDIKIMERVV EQMCTTQYQK ESQAYYdrRS SAVLFSSPPV ILLISFLIFL prio_odohe TDIKMMERVV EQMCITQYQR ESQAYYQRGA SVILFSSPPV ILLISFLIFL prp2_trast TDIKMMERVV EQMCITQYQR ESEAYYQRGA SVILFSSPPV ILLISFLIFL prio_musvi TDMKIMERVV EQMCVTQYQR ESEAYYQRGA SAILFSPPPV ILLISLLILL grp_horvu .......... .......... .......... .......... .......... grw1_lyces .......... .......... 
.......... .......... .......... grp2_sinal .......... .......... .......... .......... .......... grp8_arath .......... .......... .......... .......... .......... nucl_xenla .......... .......... .......... .......... .......... prio_chick MENKVVTKVI REMCVQQYRE YRLASGIQLH PADTWLAVLL LLLTTLFAM. gar1_schpo .......... .......... .......... .......... .......... grpa_medfa .......... .......... .......... .......... .......... egg1_schja .......... .......... .......... .......... .......... grp1_phavu .......... .......... .......... .......... .......... grp2_nicsy .......... .......... .......... .......... .......... roab_xenla .......... .......... .......... .......... .......... grp1_orysa .......... .......... .......... .......... .......... k1ci_human .......... .......... .......... .......... .......... grp2_sorvu .......... .......... .......... .......... .......... grp1_sinal .......... .......... .......... .......... .......... gr10_brana .......... .......... .......... .......... .......... ebn1_ebv .......... .......... .......... .......... .......... roa2_human .......... .......... .......... .......... .......... ch15_drogr .......... .......... .......... .......... .......... asf1_helan .......... .......... .......... .......... .......... pcp_yeren TRFSVGQRVM .......... .......... .......... .......... grp_dauca .......... .......... .......... .......... .......... ykr3_caeel .......... .......... .......... .......... .......... vnua_prvka .......... .......... .......... .......... .......... chb3_bommo .......... .......... .......... .......... .......... grpa_maize .......... .......... .......... .......... .......... grp_arath .......... .......... .......... .......... .......... els_human .......... .......... .......... .......... .......... roaa_xenla .......... .......... .......... .......... .......... grp2_orysa .......... .......... .......... .......... .......... rnha_human .......... .......... .......... .......... 
.......... grp7_arath .......... .......... .......... .......... .......... grp2_phavu .......... .......... .......... .......... .......... chb8_bommo .......... .......... .......... .......... .......... grp1_cheru .......... .......... .......... .......... .......... els_bovin .......... .......... .......... .......... .......... nucl_rat .......... .......... .......... .......... .......... nucl_human .......... .......... .......... .......... .......... nucl_mouse .......... .......... .......... .......... .......... vg38_bpm1 .......... .......... .......... .......... .......... grp1_pethy .......... .......... .......... .......... .......... mcba_ecoli .......... .......... .......... .......... .......... roa1_drome .......... .......... .......... .......... .......... grp3_artsa .......... .......... .......... .......... .......... chb7_bommo .......... .......... .......... .......... .......... roa1_scham .......... .......... .......... .......... .......... sala_droor .......... .......... .......... .......... .......... els_chick .......... .......... .......... .......... .......... spd1_nepcl .......... .......... .......... .......... .......... sala_drosi .......... .......... .......... .......... .......... chb4_bommo .......... .......... .......... .......... .......... ews_human .......... .......... .......... .......... .......... sqd_drome .......... .......... .......... .......... .......... ydh3_hsvsc .......... .......... .......... .......... .......... sala_drome .......... .......... .......... .......... .......... egg2_schja .......... .......... .......... .......... .......... ssb_ecoli .......... .......... .......... .......... .......... prpc_human .......... .......... .......... .......... .......... 253 t1_11691 IVG prio_human IVG prio_gorgo IVG prio_pantr IVG prio_ponpy IVG prio_colgu IVG prio_prefr IVG prio_atege ... prio_macfa IVG prio_saisc IVG prio_calja IVG prio_calmo ... 
prio_cebap IVG prio_cerae IVG prio_mansp ... prio_aottr ... prio_bovin IVG prio_rat IVG prio_trast IVG prip_bovin IVG prio_mouse IVG prio_sheep IVG prio_mesau MVG prio_odohe IVG prp2_trast IVG prio_musvi IVG grp_horvu ... grw1_lyces ... grp2_sinal ... grp8_arath ... nucl_xenla ... prio_chick ... gar1_schpo ... grpa_medfa ... egg1_schja ... grp1_phavu ... grp2_nicsy ... roab_xenla ... grp1_orysa ... k1ci_human ... grp2_sorvu ... grp1_sinal ... gr10_brana ... ebn1_ebv ... roa2_human ... ch15_drogr ... asf1_helan ... pcp_yeren ... grp_dauca ... ykr3_caeel ... vnua_prvka ... chb3_bommo ... grpa_maize ... grp_arath ... els_human ... roaa_xenla ... grp2_orysa ... rnha_human ... grp7_arath ... grp2_phavu ... chb8_bommo ... grp1_cheru ... els_bovin ... nucl_rat ... nucl_human ... nucl_mouse ... vg38_bpm1 ... grp1_pethy ... mcba_ecoli ... roa1_drome ... grp3_artsa ... chb7_bommo ... roa1_scham ... sala_droor ... els_chick ... spd1_nepcl ... sala_drosi ... chb4_bommo ... ews_human ... sqd_drome ... ydh3_hsvsc ... sala_drome ... egg2_schja ... ssb_ecoli ... prpc_human ...
________________________________________________________________________________

PredictProtein@EMBL-Heidelberg.DE

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prediction of:
   - secondary structure, by PHDsec
   - solvent accessibility, by PHDacc
   - and helical transmembrane regions, by PHDhtm

Author:
   Burkhard Rost
   EMBL, Heidelberg, FRG
   Meyerhofstrasse 1, 69 117 Heidelberg
   Internet: Predict-Help@EMBL-Heidelberg.DE

All rights reserved.

Please quote
~~~~~~~~~~~~

The PredictProtein mail server is described in:
   B Rost: PHD: predicting one-dimensional protein structure by
   profile-based neural networks. Meth. in Enzym., 1996, 266, 525-539.

Additionally to be quoted for publications of PHDsec output:
   B Rost & C Sander: Prediction of protein structure at better
   than 70% accuracy. J. Mol. Biol., 1993, 232, 584-599.
The latest improvement steps (up to 72%) are explained in:
   B Rost & C Sander: Combining evolutionary information and neural
   networks to predict protein secondary structure.
   Proteins, 1994, 19, 55-72.

Additionally to be quoted for publications of PHDacc output:
   B Rost & C Sander: Conservation and prediction of solvent
   accessibility in protein families. Proteins, 1994, 20, 216-226.

Additionally to be quoted for publications of PHDhtm output:
   B Rost, R Casadio, P Fariselli & C Sander: Prediction of helical
   transmembrane segments at 95% accuracy. Prot. Sci., 1995, 4, 521-533.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of secondary structure by PHDsec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

About the input to the network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The prediction is performed by a system of neural networks.
The input is a multiple sequence alignment. It is taken from an HSSP
file (produced by the program MaxHom: Sander, Chris & Schneider,
Reinhard: Database of Homology-Derived Structures and the Structural
Meaning of Sequence Alignment. Proteins, 1991, 9, 56-68).

For optimal results the alignment should contain sequences with
varying degrees of sequence similarity relative to the input protein.
The following is an ideal situation:

+-----------------+----------------------+
| sequence:       | sequence identity    |
+-----------------+----------------------+
| target sequence |        100 %         |
| aligned seq. 1  |         90 %         |
| aligned seq. 2  |         80 %         |
|       ...       |         ...          |
| aligned seq. 7  |         30 %         |
+-----------------+----------------------+

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A careful cross validation test on some 250 protein chains (in total
about 55,000 residues) with less than 25% pairwise sequence identity
gave the following results:

++================++-----------------------------------------+
|| Qtotal = 72.1% ||    ("overall three state accuracy")     |
++================++-----------------------------------------+

+----------------------------+-----------------------------+
| Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% |
| Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% |
| Qloop  (% of observed)=79% | Qloop  (% of predicted)=72% |
+----------------------------+-----------------------------+

..........................................................................

These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                      number of correctly predicted residues
Qtotal             = ---------------------------------------- (*100)
                      number of all residues

                      no of res correctly predicted to be in helix
Qhelix (% of obs)  = ---------------------------------------------- (*100)
                      no of all res observed to be in helix

                      no of res correctly predicted to be in helix
Qhelix (% of pred) = ---------------------------------------------- (*100)
                      no of all residues predicted to be in helix

..........................................................................

Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well.
Computing first the three state accuracy for each protein chain, and
then averaging over 250 chains yields the following average:

+-------------------------------====--+
| Qtotal/averaged over chains = 72.2% |
+-------------------------------====--+
| standard deviation          =  9.3% |
+-------------------------------------+

..........................................................................

Further measures of performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Matthews correlation coefficient:

+---------------------------------------------+
| Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 |
+---------------------------------------------+

..........................................................................

Average length of predicted secondary structure segments:

            +------------+----------+
            | predicted  | observed |
+-----------+------------+----------+
| Lhelix  = |    10.3    |    9.3   |
| Lstrand = |     5.0    |    5.3   |
| Lloop   = |     7.2    |    5.9   |
+-----------+------------+----------+

..........................................................................

The accuracy matrix in detail (rows: observed; columns "net": predicted
by the network):

+---------------------------------------+
|     number of residues with H, E, L   |
+---------+------+------+------+--------+
|         |net H |net E |net L |sum obs |
+---------+------+------+------+--------+
| obs H   |12447 | 1255 | 3990 |  17692 |
| obs E   |  949 | 7493 | 3750 |  12192 |
| obs L   | 2604 | 2875 |19962 |  25441 |
+---------+------+------+------+--------+
| sum Net |16000 |11623 |27702 |  55325 |
+---------+------+------+------+--------+

Note: This table is to be read in the following manner:
      of all residues predicted to be in helix, 12447 were observed
      to be in helix; 949, however, belong to observed strands and
      2604 to observed loop regions.

      The term "observed" refers to the DSSP assignment of secondary
      structure calculated from 3D coordinates of experimentally
      determined structures (Dictionary of Secondary Structure of
      Proteins: Kabsch & Sander (1983) Biopolymers, 22, 2577-2637).
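
The quantities defined in this section can be recomputed from the accuracy matrix. The following is a minimal sketch, not part of the PHD output: the function names are illustrative, and the Matthews correlation is computed with the standard one-versus-rest formula, which is assumed here because the report lists only the resulting values.

```python
from math import sqrt

# Accuracy matrix copied from the report.
# Rows: observed H, E, L. Columns: predicted ("net") H, E, L.
STATES = ("H", "E", "L")
MATRIX = [
    [12447, 1255, 3990],
    [949, 7493, 3750],
    [2604, 2875, 19962],
]


def q_total(m):
    """Percentage of correctly predicted residues (diagonal / all)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return 100.0 * correct / total


def q_observed(m, k):
    """Q (% of observed) for state k: correct / all observed in k."""
    return 100.0 * m[k][k] / sum(m[k])


def q_predicted(m, k):
    """Q (% of predicted) for state k: correct / all predicted as k."""
    return 100.0 * m[k][k] / sum(row[k] for row in m)


def matthews(m, k):
    """One-versus-rest Matthews correlation for state k (assumed formula)."""
    total = sum(sum(row) for row in m)
    tp = m[k][k]
    fn = sum(m[k]) - tp
    fp = sum(row[k] for row in m) - tp
    tn = total - tp - fn - fp
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


print(f"Qtotal = {q_total(MATRIX):.1f}%")
for k, state in enumerate(STATES):
    print(
        f"{state}: %obs = {q_observed(MATRIX, k):.1f}, "
        f"%pred = {q_predicted(MATRIX, k):.1f}, "
        f"C = {matthews(MATRIX, k):.2f}"
    )
```

Small differences between recomputed per-state percentages and the summary table may arise from rounding in the report.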
Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The network predicts the three secondary structure types using real
numbers from the output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers
contain additional information. E.g. the difference between the
maximal and the second largest output unit can be used to derive a
"reliability index". This index is given for each residue along with
the prediction. The index is scaled to have values between 0 (lowest
reliability), and 9 (highest).

The accuracies (Qtot) to be expected for residues with values at or
above a particular value of the index are given below, as well as the
fraction of such residues (%res):

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |     |
| Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2|
|      |     |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4|
| E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1|
|      |     |     |     |     |     |     |     |     |     |     |
| H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4|
| E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

The above table gives the cumulative results, e.g. 62.5% of all
residues have a reliability of at least 5. The overall three-state
accuracy for this subset of almost two thirds of all residues is 82.9%.
For this subset, e.g., 83.1% of the observed helices are correctly
predicted, and 86.9% of all residues predicted to be in helix are
correct.

..........................................................................
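
The "winner takes all" assignment and the reliability index described above can be sketched as follows. The report does not state how the difference between output units is scaled to 0-9; the scaling used here (ten times the difference, truncated to an integer, for outputs assumed to lie in [0, 1]) is an assumption for illustration only, as is the function name.

```python
def predict_with_reliability(outputs):
    """Return (predicted state, reliability index 0-9).

    outputs: dict mapping a state label (e.g. "H", "E", "L") to the
    value of its output unit, assumed to lie in [0, 1].
    """
    # Rank output units from largest to smallest value.
    ranked = sorted(outputs.items(), key=lambda item: item[1], reverse=True)
    best_state, best_value = ranked[0]
    second_value = ranked[1][1]

    # Assumed scaling: 10 * (largest - second largest), truncated,
    # then clamped to the range 0..9.
    index = int(10 * (best_value - second_value))
    index = max(0, min(9, index))
    return best_state, index


print(predict_with_reliability({"H": 0.90, "E": 0.05, "L": 0.05}))
print(predict_with_reliability({"H": 0.34, "E": 0.33, "L": 0.33}))
```

A confident output vector yields a high index, while nearly equal output units yield an index close to 0.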
The following table gives the non-cumulative quantities, i.e. the
values per reliability index range. These numbers answer the question:
how reliable is the prediction for all residues labeled with the
particular index i.

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |  8.8|  9.5|  9.3|  9.1|  9.7| 10.5| 12.5| 15.7| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |
| Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2|
|      |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4|
| E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1|
|      |     |     |     |     |     |     |     |     |     |
| H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4|
| E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+

For example, for residues with Relindex = 5, 64% of all predicted
beta-strand residues are correctly identified.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of solvent accessibility by PHDacc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Definition of accessibility
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For training the residue solvent accessibility the DSSP (Dictionary of
Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers,
22, 2577-2637) values of accessible surface area have been used. The
prediction provides values for the relative solvent accessibility. The
normalisation is the following:

                          ACCESSIBILITY (from DSSP, in square Angstrom)
RELATIVE_ACCESSIBILITY = ----------------------------------------------- * 100
                          MAXIMAL_ACC (amino acid type i)

where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i.
The maximal values are:

+----+----+----+----+----+----+----+----+----+----+----+----+
|  A |  B |  C |  D |  E |  F |  G |  H |  I |  K |  L |  M |
| 106| 160| 135| 163| 194| 197|  84| 184| 169| 205| 164| 188|
+----+----+----+----+----+----+----+----+----+----+----+----+
|  N |  P |  Q |  R |  S |  T |  V |  W |  X |  Y |  Z |
| 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196|
+----+----+----+----+----+----+----+----+----+----+----+

Notation: one letter code for amino acid, B stands for D or N; Z stands
for E or Q; and X stands for undetermined.

The relative solvent accessibility can be used to estimate the number
of water molecules (W) in contact with the residue:

   W = ACCESSIBILITY / 10

The prediction is given in 10 states for relative accessibility, with

   RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC)

where PREDICTED_ACC = 0 - 9.

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A careful cross validation test on some 238 protein chains (in total
about 62,000 residues) with less than 25% pairwise sequence identity
gave the following results:

Correlation
...........

The correlation between observed and predicted solvent accessibility
is:

   -----------
   corr = 0.53
   -----------

This value ought to be compared to the worst and best case prediction
scenario: random prediction (corr = 0.0) and homology modelling
(corr = 0.66). (Note: homology modelling yields a relatively accurate
prediction in 3D if, and only if, a sequence with significant sequence
identity has a known 3D structure.)

3-state accuracy
................

Often the relative accessibility is projected onto, e.g., 3 states:

   b = buried       (here defined as < 9% relative accessibility),
   i = intermediate ( 9% <= rel. acc. < 36% ),
   e = exposed      ( rel. acc. >= 36% ).

A projection onto 3 states or 2 states (buried/exposed) enables the
compilation of a 3- and 2-state prediction accuracy. PHD reaches an
overall 3-state accuracy of:

   Q3 = 57.5%

(compared to 35% for random prediction and 70% for homology modelling).
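
The normalisation, the 10-state encoding and the 3-state projection defined above can be sketched together as follows. The function names are illustrative and not part of PHDacc. The mapping from a relative accessibility back to one of the 10 states (integer part of the square root, capped at 9) is an assumption that merely inverts the stated relation RELATIVE_ACCESSIBILITY = PREDICTED_ACC * PREDICTED_ACC.

```python
import math

# Maximal accessibility per amino acid type, copied from the table above.
MAXIMAL_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}


def relative_accessibility(amino_acid, accessibility):
    """Relative accessibility in percent from a DSSP accessibility value."""
    return 100.0 * accessibility / MAXIMAL_ACC[amino_acid]


def state_to_relative(state):
    """Relative accessibility (percent) encoded by a 10-state value 0-9."""
    return state * state


def relative_to_state(relative):
    """Assumed inverse of state_to_relative: floor(sqrt), capped at 9."""
    return min(9, int(math.sqrt(relative)))


def three_state(relative):
    """Project a relative accessibility (percent) onto b / i / e."""
    if relative < 9:
        return "b"  # buried
    if relative < 36:
        return "i"  # intermediate
    return "e"      # exposed


for state in range(10):
    rel = state_to_relative(state)
    print(state, rel, three_state(rel))
```

Under these thresholds, predicted states 0-2 project onto buried, 3-5 onto intermediate, and 6-9 onto exposed.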
The 3-state accuracy in detail:

+-----------------------------------+-------------------------+
| Qburied       (% of observed)=77% | Qb (% of predicted)=60% |
| Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% |
| Qexposed      (% of observed)=78% | Qe (% of predicted)=56% |
+-----------------------------------+-------------------------+

10-state accuracy
.................

The network predicts relative solvent accessibility in 10 states, with
state i (i = 0-9) corresponding to a relative solvent accessibility of
i*i %. The 10-state accuracy of the network is:

   Q10 = 24.5%

..........................................................................

These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                       number of correctly predicted residues
Q3                  = ---------------------------------------- (*100)
                       number of all residues

                       no of res. correctly predicted to be buried
Qburied (% of obs)  = --------------------------------------------- (*100)
                       no of all res. observed to be buried

                       no of res. correctly predicted to be buried
Qburied (% of pred) = --------------------------------------------- (*100)
                       no of all residues predicted to be buried

..........................................................................

Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well.
Computing first the correlation between observed and predicted
accessibility for each protein chain, and then averaging over all 238
chains yields the following average:

+-------------------------------====--+
| corr/averaged over chains   = 0.53  |
+-------------------------------====--+
| standard deviation          = 0.11  |
+-------------------------------------+

..........................................................................

Further details of performance accuracy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The accuracy matrix in detail:
..............................

-------+------------------------------------------------------------+------------
\ PHD  |     0     1     2     3     4     5     6     7     8     9 |   SUM %obs
-------+------------------------------------------------------------+------------
OBS 0  |  8611   140     8    44    82   169   772   334    27     0 | 10187 16.6
OBS 1  |  4367   164     0    50   106   231   738   346    44     3 |  6049  9.8
OBS 2  |  3194   168     1    68   125   303   951   513    42     7 |  5372  8.7
OBS 3  |  2760   159     8    80   136   327  1246   746    58    19 |  5539  9.0
OBS 4  |  2312   144     2    72   166   396  1615  1245   124    19 |  6095  9.9
OBS 5  |  1873    96     3    84   138   425  1979  1834   187    27 |  6646 10.8
OBS 6  |  1387    67     1    60    80   278  2237  2627   231    51 |  7019 11.4
OBS 7  |  1082    35     0    32    56   225  1871  3107   302    60 |  6770 11.0
OBS 8  |   660    25     0    27    43   136  1206  2374   325    87 |  4883  7.9
OBS 9  |   325    20     2    27    29    74   648  1159   366   214 |  2864  4.7
-------+------------------------------------------------------------+------------
SUM    | 26571  1018    25   544   961  2564 13263 14285  1706   487 |
%pred  |  43.3   1.7   0.0   0.9   1.6   4.2  21.6  23.3   2.8   0.8 |
-------+------------------------------------------------------------+------------

Note: This table is to be read in the following manner:
      of all residues predicted in state 0 (0% relative
      accessibility), 8611 were observed with 0% relative
      accessibility. However, 325 of all residues predicted to have
      0% are observed as completely exposed (obs = 9 -> rel. acc.
      >= 81%).
      The term "observed" refers to the DSSP compilation of area of
      solvent accessibility calculated from 3D coordinates of
      experimentally determined structures (Dictionary of Secondary
      Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22,
      2577-2637).

Accuracy for each amino acid:
.............................

+---+------------------------------+-----+-------+------+
|AA |  Q3  b%o b%p i%o i%p e%o e%p | Q10 |  corr |   N  |
+---+------------------------------+-----+-------+------+
| A | 59.0  87  60   2  38  66  57 |  31 | 0.530 | 5054 |
| C | 62.0  91  67   5  39  25  21 |  34 | 0.244 |  893 |
| D | 56.5  21  45   6  49  94  57 |  20 | 0.321 | 3536 |
| E | 60.8   9  40   3  41  98  61 |  21 | 0.347 | 3743 |
| F | 63.3  94  67   9  46  29  37 |  27 | 0.366 | 2436 |
| G | 52.1  75  51   1  31  67  53 |  22 | 0.405 | 4787 |
| H | 50.9  63  53  23  45  71  50 |  18 | 0.442 | 1366 |
| I | 64.9  95  68   6  41  30  38 |  34 | 0.360 | 3437 |
| K | 66.6   2  11   2  37  98  67 |  23 | 0.267 | 3652 |
| L | 61.6  93  65   8  44  31  40 |  31 | 0.368 | 5016 |
| M | 60.1  92  64   5  39  45  44 |  29 | 0.452 | 1371 |
| N | 55.5  45  45   8  38  87  59 |  17 | 0.410 | 2923 |
| P | 53.0  48  48   9  39  83  56 |  18 | 0.364 | 2920 |
| Q | 54.3  27  44   7  44  92  56 |  20 | 0.344 | 2225 |
| R | 49.9  15  47  36  47  76  51 |  18 | 0.372 | 2765 |
| S | 55.6  69  53   3  51  81  56 |  22 | 0.464 | 3981 |
| T | 51.8  61  51   8  38  78  53 |  21 | 0.432 | 3740 |
| V | 61.1  93  65   5  40  39  42 |  34 | 0.418 | 4156 |
| W | 56.2  85  62  20  49  29  27 |  21 | 0.318 |  891 |
| Y | 49.7  73  52  33  49  36  38 |  19 | 0.359 | 2301 |
+---+------------------------------+-----+-------+------+

Abbreviations:

   AA:             amino acid in one-letter code
   b%o, i%o, e%o:  = Qburied, Qintermediate, Qexposed (% of observed),
                   i.e. percentage of correct prediction in each
                   state, see above
   b%p, i%p, e%p:  = Qburied, Qintermediate, Qexposed (% of predicted),
                   i.e. probability of correct prediction in each
                   state, see above
   Q10:            percentage of correctly predicted residues in each
                   of the 10 states of predicted relative accessibility
   corr:           correlation between predicted and observed rel. acc.
   N:              number of residues in data set

Accuracy for different secondary structure:
...........................................

+--------+------------------------------+----+-------+-------+
| type   |  Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |    N  |
+--------+------------------------------+----+-------+-------+
| helix  | 59.5  79  64   8  44  80  56 | 27 | 0.574 | 20100 |
| strand | 61.3  84  73   9  46  69  37 | 35 | 0.524 | 13356 |
| loop   | 54.4  64  43  11  44  78  61 | 18 | 0.442 | 27968 |
+--------+------------------------------+----+-------+-------+

Abbreviations as before.

Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The network predicts the 10 states for relative accessibility using
real numbers from the output units. The prediction is assigned by
choosing the maximal unit ("winner takes all"). However, the real
numbers contain additional information. E.g. the difference between
the maximal and the second largest output unit (with the constraint
that the second largest output is compiled among all units at least 2
positions off the maximal unit) can be used to derive a "reliability
index". This index is given for each residue along with the
prediction. The index is scaled to have values between 0 (lowest
reliability), and 9 (highest).

The accuracies (Q3, corr, etc.) to be expected for residues with values
at or above a particular value of the index are given below, as well
as the fraction of such residues (%res):

+---+------------------------------+----+-------+-------+
|RI |  Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |
+---+------------------------------+----+-------+-------+
| 0 | 57.5  77  60   9  44  78  56 | 24 | 0.535 | 100.0 |
| 1 | 59.1  76  63   9  45  82  57 | 25 | 0.560 |  91.2 |
| 2 | 61.7  79  66   4  47  87  58 | 27 | 0.594 |  77.1 |
| 3 | 66.6  87  70   1  51  89  63 | 30 | 0.650 |  57.1 |
| 4 | 70.0  89  72   0  83  91  67 | 32 | 0.686 |  45.8 |
| 5 | 72.9  92  75   0   0  93  70 | 34 | 0.722 |  35.6 |
| 6 | 76.3  95  77   0   0  93  75 | 36 | 0.769 |  24.7 |
| 7 | 79.0  97  79   0   0  93  78 | 39 | 0.803 |  16.0 |
| 8 | 80.9  98  80   0   0  91  81 | 43 | 0.824 |   9.6 |
| 9 | 81.2  99  80   0   0  88  83 | 45 | 0.828 |   5.9 |
+---+------------------------------+----+-------+-------+

Abbreviations as before.

The above table gives the cumulative results, e.g. 45.8% of all
residues have a reliability of at least 4. The correlation for this
most reliably predicted half of the residues is 0.686, i.e. a value
comparable to what could be expected if homology modelling were
possible. For this subset of 45.8% of all residues, 89% of the buried
residues are correctly predicted, and 72% of all residues predicted to
be buried are correct.

..........................................................................

The following table gives the non-cumulative quantities, i.e. the
values per reliability index range. These numbers answer the question:
how reliable is the prediction for all residues labeled with the
particular index i.
+---+------------------------------+----+-------+-------+
|RI |  Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |
+---+------------------------------+----+-------+-------+
| 0 | 40.9  79  40  16  41  21  40 | 14 | 0.175 |   8.8 |
| 1 | 45.4  61  46  28  44  48  44 | 17 | 0.278 |  14.1 |
| 2 | 47.4  53  52  10  46  80  44 | 19 | 0.343 |  19.9 |
| 3 | 52.9  75  59   4  50  77  47 | 23 | 0.439 |  11.4 |
| 4 | 60.0  81  63   0  83  84  56 | 25 | 0.547 |  10.1 |
| 5 | 65.2  82  70   0   0  93  62 | 28 | 0.607 |  10.9 |
| 6 | 71.3  90  72   0   0  94  70 | 31 | 0.692 |   8.8 |
| 7 | 76.0  94  76   0   0  95  75 | 34 | 0.762 |   6.3 |
| 8 | 80.5  97  81   0   0  94  79 | 39 | 0.808 |   3.8 |
| 9 | 81.2  99  80   0   0  88  83 | 45 | 0.828 |   5.9 |
+---+------------------------------+----+-------+-------+

For example, for residues with RI = 4, 83% of all predicted
intermediate residues are correctly predicted as such.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of helical transmembrane segments by PHDhtm:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A cross validation test on 69 helical trans-membrane proteins (in
total about 30,000 residues) with less than 25% pairwise sequence
identity gave the following results:

++================++-----------------------------------------+
|| Qtotal = 94.7% ||     ("overall two state accuracy")      |
++================++-----------------------------------------+

+----------------------------+-----------------------------+
| Qhelix (% of observed)=92% | Qhelix (% of predicted)=83% |
| Qloop  (% of observed)=96% | Qloop  (% of predicted)=97% |
+----------------------------+-----------------------------+

..........................................................................
These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                      number of correctly predicted residues
Qtotal             = ---------------------------------------- (*100)
                      number of all residues

                      no of res correctly predicted to be in helix
Qhelix (% of obs)  = ---------------------------------------------- (*100)
                      no of all res observed to be in helix

                      no of res correctly predicted to be in helix
Qhelix (% of pred) = ---------------------------------------------- (*100)
                      no of all residues predicted to be in helix

..........................................................................

Further measures of performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Matthews correlation coefficient:

+---------------------------------------------+
| Chelix = 0.84, Cloop = 0.84                 |
+---------------------------------------------+

..........................................................................

Average length of transmembrane helices:

            +------------+----------+
            | predicted  | observed |
+-----------+------------+----------+
| Lhelix  = |    24.6    |   22.2   |
+-----------+------------+----------+

..........................................................................

The accuracy matrix in detail:

+---------------------------------+
|  number of residues with H, L   |
+---------+------+-------+--------+
|         |net H | net L |sum obs |
+---------+------+-------+--------+
| obs H   | 5214 |   492 |   5706 |
| obs L   | 1050 | 22423 |  23473 |
+---------+------+-------+--------+
| sum Net | 6264 | 22915 |  29179 |
+---------+------+-------+--------+

Note: This table is to be read in the following manner:
      5214 of all residues predicted to be in a helical trans-membrane
      region, were observed to be in the lipid bilayer, 1050 however
      were observed either inside or outside of the protein, i.e. in
      loop (or non-membrane) regions.
      The term "observed" refers to the DSSP assignment of secondary
      structure calculated from 3D coordinates of experimentally
      determined structures (Dictionary of Secondary Structure of
      Proteins: Kabsch & Sander (1983) Biopolymers, 22, 2577-2637),
      where these were available. For all other proteins, the
      assignment of trans-membrane segments has been taken from the
      SWISS-PROT data bank (Bairoch, A.; Boeckmann, B.: The SWISS-PROT
      protein sequence data bank. Nucl. Acids Res. 20: 2019-2022, 1992).

..........................................................................

Overlap between predicted and observed segments:

+-----------------+---------------+----------------+
| segment overlap | % of observed | % of predicted |
| Sov helix       |     95.6%     |     95.5%      |
| Sov loop        |     83.6%     |     97.2%      |
+-----------------+---------------+----------------+
| Sov total       |     86.0%     |     96.8%      |
+-----------------+---------------+----------------+

Definition of Sov in: Rost et al., JMB, 1994, 235, 13-26.

As helical trans-membrane segments are longer than globular helices,
correctly predicted segments can easily be made out. PHDhtm misses 5
out of 258 observed segments, predicts 6 where none is observed, and in
3 cases the predicted helical segment overlaps two observed regions.
Thus, in total more than 95% of all segments are correctly predicted.

..........................................................................

Entropy of prediction (information measure):

+-----------------+
| I = 0.64        |
+-----------------+

(For comparison: homology modelling of globular proteins in three
states: I=0.62.)

Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The network predicts two states, helical trans-membrane region and
rest, using two output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers of the
output units contain additional information. E.g. the difference
between the two output units can be used to derive a "reliability
index". This index is given for each residue along with the prediction.
The index is scaled to have values between 0 (lowest reliability) and
9 (highest). The accuracies (Qtot) to be expected for residues whose
index is at least a given value are listed below, together with the
fraction of such residues (%res):

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |100.0| 98.8| 97.3| 95.9| 94.1| 92.3| 89.9| 86.2| 75.0| 66.8|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |     |
| Qtot | 94.7| 95.2| 95.6| 96.2| 96.7| 97.2| 97.7| 98.4| 99.4| 99.8|
|      |     |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 91.8| 92.9| 93.8| 94.4| 95.0| 95.7| 96.2| 96.8| 95.5| 78.7|
| L%obs| 95.3| 95.7| 96.1| 96.6| 97.0| 97.5| 98.1| 98.8| 99.7|100.0|
|      |     |     |     |     |     |     |     |     |     |     |
| H%prd| 82.7| 83.8| 85.0| 86.7| 88.1| 89.7| 91.4| 93.8| 96.3| 97.1|
| L%prd| 97.9| 98.3| 98.5| 98.7| 98.8| 99.0| 99.2| 99.4| 99.7| 99.9|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

The above table gives the cumulative results, e.g. 92.3% of all
residues have a reliability of at least 5. The overall two-state
accuracy for this subset is 97.2%. For this subset, e.g., 95.7% of the
observed helical trans-membrane residues are correctly predicted, and
89.7% of all residues predicted to be in a helical trans-membrane
segment are correct.
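The way the reliability index is derived and used can be sketched in Python. Two caveats: the report only says that the index comes from the difference between the two output units and is scaled to 0-9, so the scaling in `predict_residue` (ten times the absolute difference, truncated and capped at 9) is an assumption, not the PHD formula. All function names are hypothetical. The lookup table is copied from the index, %res and Qtot rows above.

```python
# index -> (%res, Qtot), copied from the cumulative table above.
RELIABILITY_TABLE = {
    0: (100.0, 94.7), 1: (98.8, 95.2), 2: (97.3, 95.6), 3: (95.9, 96.2),
    4: (94.1, 96.7), 5: (92.3, 97.2), 6: (89.9, 97.7), 7: (86.2, 98.4),
    8: (75.0, 99.4), 9: (66.8, 99.8),
}


def predict_residue(out_helix, out_loop):
    """Winner-takes-all over two output units, plus a 0-9 reliability index.

    The scaling below is an illustrative assumption.
    """
    state = "H" if out_helix > out_loop else "L"
    reliability = min(9, int(10 * abs(out_helix - out_loop)))
    return state, reliability


def min_index_for_accuracy(target_qtot):
    """Smallest index cut-off whose expected Qtot reaches target_qtot.

    Returns (index, %res) or None if no cut-off in the table is sufficient.
    """
    for index in sorted(RELIABILITY_TABLE):
        percent_res, qtot = RELIABILITY_TABLE[index]
        if qtot >= target_qtot:
            return index, percent_res
    return None


print(predict_residue(0.75, 0.25))
print(min_index_for_accuracy(97.0))
```

Asking for an expected accuracy of at least 97% selects the cut-off 5, which keeps 92.3% of the residues, as in the worked example in the text.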
The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________________________________________________________

                PredictProtein@EMBL-Heidelberg.DE

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prediction of:
    - secondary structure,                   by PHDsec
    - solvent accessibility,                 by PHDacc
    - and helical transmembrane regions,     by PHDhtm

Author:
    Burkhard Rost
    EMBL, Heidelberg, FRG
    Meyerhofstrasse 1, 69 117 Heidelberg
    Internet: Predict-Help@EMBL-Heidelberg.DE

All rights reserved.

The network systems are described in:

    PHDsec:  B Rost & C Sander, JMB, 1993, 232, 584-599.
             B Rost & C Sander, Proteins, 1994, 19, 55-72.
    PHDacc:  B Rost & C Sander, Proteins, 1994, 20, 216-226.
    PHDhtm:  B Rost et al., Prot. Science, 1995, 4, 521-533.

Some statistics
~~~~~~~~~~~~~~~

Percentage of amino acids:

+--------------+--------+--------+--------+--------+--------+
| AA:          |    G   |    P   |    S   |    Q   |    V   |
| % of AA:     |  17.8  |   6.7  |   5.9  |   5.9  |   5.5  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    Y   |    T   |    N   |    M   |    L   |
| % of AA:     |   5.1  |   5.1  |   4.7  |   4.7  |   4.7  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    R   |    K   |    H   |    A   |    W   |
| % of AA:     |   4.3  |   4.0  |   4.0  |   4.0  |   3.6  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    I   |    E   |    F   |    D   |    C   |
| % of AA:     |   3.6  |   3.6  |   2.8  |   2.4  |   1.6  |
+--------------+--------+--------+--------+--------+--------+

Percentage of secondary structure predicted:

+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Predicted: |  13.8  |  19.0  |  67.2  |
+--------------+--------+--------+--------+

According to the following classes:
    all-alpha  : %H>45 and %E< 5
    all-beta   : %H< 5 and %E>45
    alpha-beta : %H>30 and %E>20
    mixed      : rest
this means that the predicted class is: mixed class

PHD output for your protein
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fri Aug 9 02:00:26 1996
Jury on: 10 different architectures (version 5.94_317).
Note: differently trained architectures, i.e., different versions, can
result in different predictions.

About the protein
~~~~~~~~~~~~~~~~~

HEADER     /home/phd/tmp/t1_11691.seq
COMPND
SOURCE
AUTHOR
SEQLENGTH  253
NCHAIN     1 chain(s) in t1_11691 data set
NALIGN     84 (=number of aligned sequences in HSSP file)

Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~

sequence:
    AA :  amino acid sequence
secondary structure:
    HEL:  H=helix, E=extended (sheet), blank=other (loop)
    PHD:  Profile network prediction HeiDelberg
    Rel:  Reliability index of prediction (0-9)
detail:
    prH:  'probability' for assigning helix
    prE:  'probability' for assigning strand
    prL:  'probability' for assigning loop
    note: the 'probabilities' are scaled to the interval 0-9, e.g.,
          prH=5 means that the first output node is 0.5-0.6
subset:
    SUB:  a subset of the prediction, for all residues with an expected
          average accuracy > 82% (tables in header)
    note: for this subset the following symbols are used:
          L:   is loop (for which above " " is used)
          ".": means that no prediction is made for this residue, as the
               reliability is: Rel < 5

Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~

solvent accessibility:
    3st:  relative solvent accessibility (acc) in 3 states:
          b = 0-9%, i = 9-36%, e = 36-100%.
    PHD:  Profile network prediction HeiDelberg
    Rel:  Reliability index of prediction (0-9)
    P_3:  predicted relative accessibility in 3 states
    note: for convenience a blank is used for intermediate (i).
    10st: relative accessibility in 10 states: a value of n corresponds
          to a relative acc. of n*n %
subset:
    SUB:  a subset of the prediction, for all residues with an expected
          average correlation > 0.69 (tables in header)
    note: for this subset the following symbols are used:
          "I": is intermediate (for which above " " is used)
          ".": means that no prediction is made for this residue, as the
               reliability is: Rel < 4

Abbreviations: PHDhtm
~~~~~~~~~~~~~~~~~~~~~

secondary structure:
    HL:   T=helical transmembrane region, blank=other (loop)
    PHD:  Profile network prediction HeiDelberg
    PHDF: filtered prediction, i.e., too long transmembrane segments
          are split, too short ones are deleted
    Rel:  Reliability index of prediction (0-9)
detail:
    prH:  'probability' for assigning helical transmembrane region
    prL:  'probability' for assigning loop
    note: the 'probabilities' are scaled to the interval 0-9, e.g.,
          prH=5 means that the first output node is 0.5-0.6
subset:
    SUB:  a subset of the prediction, for all residues with an expected
          average accuracy > 82% (tables in header)
    note: for this subset the following symbols are used:
          L:   is loop (for which above " " is used)
          ".": means that no prediction is made for this residue, as the
               reliability is: Rel < 5

protein: t1_1169 length 253

                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP|
         PHD sec | EEEEEEEEEEE |
         Rel sec |986232698999988536777898889998767988789888889788888888775445|
 detail: prH sec |001322000000000011110000000000121000100000110110000111112332|
         prE sec |001124788999988631001000000000000000000000000000000000000000|
         prL sec |987443100000001257778898889998878888888888888888888888877666|
 subset: SUB sec |LLL...EEEEEEEEEE.LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL..L|
 ACCESSIBILITY
 3st:    P_3 acc |bbebbbbbbbbbbbbbbebbbbeeeeeeebeeebbee eeeb eeeee eeeebeb b e|
 10st:   PHD acc |007000000000000007000078777790666007656970577997567990605057|
         Rel acc |113535278764440103000045435430110103101340133234013420000213|
 subset: SUB acc |...b.b.bbbbbbb........eee.ee............e......e...e........|

                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA|
         PHD sec |                                                            |
         Rel sec |678554545776655457866522687542246654469899998678646553333775|
 detail: prH sec |211222222112222321112233211224432222310000000111122212333111|
         prE sec |000000000000000000000000000000000000000000000000000011000001|
         prL sec |788766767887777677877655788765567776678899998788767676655886|
 subset: SUB sec |LLLLL.L.LLLLLLL.LLLLLL..LLLL....LLL..LLLLLLLLLLLL.LLL....LLL|
 ACCESSIBILITY
 3st:    P_3 acc |ebebeb eebbe b eeebbeb eebbe beebbe eeeeeeeeeeeebeebbbbbbbbb|
 10st:   PHD acc |606060576007505769006057700750670065677778977776076000000000|
         Rel acc |110011130003010402001113310211020011033354454530141223311232|
 subset: SUB acc |...............e........................eeeeee...e..........|

                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV|
         PHD sec | HHH EE HHHHH EEEE EEE EE|
         Rel sec |446532135221147631112446303432223323699524433643458996451549|
 detail: prH sec |111234456434321100000001345655433542000000000122220000000000|
         prE sec |220000010111200134443221000010100112100256653111110002664268|
         prL sec |667665432344467754445667643223355335789642335765568997324730|
 subset: SUB sec |..LL....H.....LL.......L............LLLL.....L...LLLLL.E.L.E|
 ACCESSIBILITY
 3st:    P_3 acc |bbbbbbb bbbbbbbebbbebbeebeeeb eebbeebbb bbb bbeebeeee bbbbbb|
 10st:   PHD acc |000000040000000600060066067604670076000500040076066674000000|
         Rel acc |024203611240002111111311013101231132220031110031011021443298|
 subset: SUB acc |..b...b...b...........................................bb..bb|

                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV|
         PHD sec |EEEEEEEEEEE HHHHHHHHHHHHHHHHHHHHHHHHHHH EEEEE E|
         Rel sec |999988697543887686547599999999997556875567997419955999729965|
 detail: prH sec |000000000000000000268788999999998667876778887530000000000011|
         prE sec |999988788763111111100000000000001222100000001110027999840016|
         prL sec |000011211226887787621200000000000100012221001248972000159971|
 subset: SUB sec |EEEEEEEEEE..LLLLLLL.HHHHHHHHHHHHHHHHHHHHHHHHH..LLLEEEEE.LLLE|
 ACCESSIBILITY
 3st:    P_3 acc |bbbbeebbbbbbbebeebbebbbebbe bbebbbbbebeeebeb beeebbbbbbee bb|
 10st:   PHD acc |000066000000060660060006006400700000606770605077900000096500|
         Rel acc |385921235131020213022342671199448721201741251044502244530106|
 subset: SUB acc |.bbb....b.............b.bb..bbebbb.....ee..b..eee...bbb....b|

                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |ILLISFLIFLIVG|
         PHD sec |EEEE EEEEE |
         Rel sec |7876211246538|
 detail: prH sec |1111243321110|
         prE sec |8887201467660|
         prL sec |0001444110128|
 subset: SUB sec |EEEE.....EE.L|
 ACCESSIBILITY
 3st:    P_3 acc |bbbbbbbbbbbbe|
 10st:   PHD acc |0000000000009|
         Rel acc |8899679997425|
 subset: SUB acc |bbbbbbbbbbb.e|

PHDhtm Helical transmembrane prediction

note: PHDacc and PHDsec are reliable for water-soluble globular
      proteins, only. Thus, please take the predictions above with
      particular caution wherever transmembrane helices are predicted
      by PHDhtm!

PHDhtm

                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP|
         PHD htm |                                                            |
         Rel htm |998888888876677899999999999999999999999999999999999999999999|
 detail:         |                                                            |
         prH htm |000000000011111000000000000000000000000000000000000000000000|
         prL htm |999999999988888999999999999999999999999999999999999999999999|
 other:          |                                                            |
         PHDFhtm |                                                            |
 subset:         |                                                            |
         SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|

                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA|
         PHD htm |                                                            |
         Rel htm |999999999999999999999999999999999999999999999999999999999999|
 detail:         |                                                            |
         prH htm |000000000000000000000000000000000000000000000000000000000000|
         prL htm |999999999999999999999999999999999999999999999999999999999999|
 other:          |                                                            |
         PHDFhtm |                                                            |
 subset:         |                                                            |
         SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|

                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV|
         PHD htm |                                                            |
         Rel htm |999999999999999999999999999999999999999999999999999999999999|
 detail:         |                                                            |
         prH htm |000000000000000000000000000000000000000000000000000000000000|
         prL htm |999999999999999999999999999999999999999999999999999999999999|
 other:          |                                                            |
         PHDFhtm |                                                            |
 subset:         |                                                            |
         SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|

                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |NITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPV|
         PHD htm |                                                     TTTTTTT|
         Rel htm |999999999999999999999999999999999999999999999999998612346788|
 detail:         |                                                            |
         prH htm |000000000000000000000000000000000000000000000000000146678899|
         prL htm |999999999999999999999999999999999999999999999999999853321100|
 other:          |                                                            |
         PHDFhtm |                                                     TTTTTTT|
 subset:         |                                                            |
         SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL....HHHH|

                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |ILLISFLIFLIVG|
         PHD htm |TTTTTTTTTTTTT|
         Rel htm |8888888887764|
 detail:         |             |
         prH htm |9999999998887|
         prL htm |0000000001112|
 other:          |             |
         PHDFhtm |TTTTTTTTTTTTT|
 subset:         |             |
         SUB htm |HHHHHHHHHHHH.|

________________________________________________________________________________
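To turn the per-residue PHDhtm rows into segment boundaries, the five 60-column blocks can be concatenated into one string and scanned for runs of "T". The Python sketch below assumes that concatenation has already been done; `helix_segments` is a hypothetical helper, not part of PHD. The example input is synthetic: it mirrors the PHDFhtm rows above, where the last 7 columns of the fourth block (residues 234-240) and all 13 columns of the fifth block (residues 241-253) are marked "T".

```python
def helix_segments(htm_row, symbol="T"):
    """Return 1-based inclusive (start, end) pairs for each run of `symbol`."""
    segments = []
    start = None
    for position, state in enumerate(htm_row, start=1):
        if state == symbol and start is None:
            start = position                      # a new run begins here
        elif state != symbol and start is not None:
            segments.append((start, position - 1))  # run ended one residue ago
            start = None
    if start is not None:                         # run extends to the last residue
        segments.append((start, len(htm_row)))
    return segments


# Synthetic row for the 253-residue protein: 233 blanks followed by 20 "T".
example_row = " " * 233 + "T" * 20

for start, end in helix_segments(example_row):
    print(f"transmembrane helix: residues {start}-{end}, length {end - start + 1}")
```

For this report the scan yields a single predicted segment covering residues 234 to 253.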