Publication to reference in reporting results:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rost, Burkhard; Sander, Chris:
   Prediction of protein secondary structure at better than 70% accuracy.
   J. Mol. Biol., 1993, 232, 584-599.

Rost, Burkhard; Sander, Chris:
   Combining evolutionary information and neural networks to predict
   protein secondary structure. Proteins, 1994, 19 (No. 1), in press.

Some statistics:
~~~~~~~~~~~~~~~~

Percentage of amino acids:
+--------------+--------+--------+--------+--------+--------+
| AA:          |    I   |    G   |    L   |    T   |    V   |
| % of AA:     |  13.1  |  13.1  |  12.1  |   8.1  |   6.1  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    Q   |    P   |    K   |    R   |    E   |
| % of AA:     |   6.1  |   6.1  |   6.1  |   4.0  |   4.0  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    D   |    N   |    A   |    W   |    M   |
| % of AA:     |   4.0  |   3.0  |   3.0  |   2.0  |   2.0  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    F   |    C   |    Y   |    S   |    H   |
| % of AA:     |   2.0  |   2.0  |   1.0  |   1.0  |   1.0  |
+--------------+--------+--------+--------+--------+--------+

Percentage of secondary structure predicted/observed:
+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Predicted: |   6.1  |  53.5  |  40.4  |
+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Observed:  |   7.1  |  48.5  |  44.4  |
+--------------+--------+--------+--------+

According to the following classes:
   all-alpha:   %H>45 and %E<5
   all-beta:    %H<5  and %E>45
   alpha-beta:  %H>30 and %E>20
   mixed:       rest
this means that the predicted class is: mixed class

The class of the observed structure is: mixed class

PHD output for your protein:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fri Jun 10 15:30:51 1994

For secondary structure prediction:
   Jury on: 10 different architectures (version 5.94).
For solvent accessibility prediction:
   Jury on: 5 different architectures (version 5.94).
Note: differently trained architectures, i.e., different versions, can
result in different predictions.
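The class rule stated in the statistics section can be sketched in a few lines of Python. This is only an illustration of the thresholds quoted in this report, not the PHD implementation; the function name structural_class is hypothetical, and checking the rules in the listed order is an assumption.

```python
def structural_class(pct_helix: float, pct_strand: float) -> str:
    """Assign a structural class from the %H and %E content.

    Thresholds are the ones quoted in the report; the order in which the
    rules are tested is an assumption of this sketch.
    """
    if pct_helix > 45 and pct_strand < 5:
        return "all-alpha"
    if pct_helix < 5 and pct_strand > 45:
        return "all-beta"
    if pct_helix > 30 and pct_strand > 20:
        return "alpha-beta"
    return "mixed"


# Values taken from the predicted/observed table of this report.
print(structural_class(6.1, 53.5))  # predicted: mixed
print(structural_class(7.1, 48.5))  # observed:  mixed
```

Both rows fall into the mixed class because %H is above 5 in each case, so the all-beta rule does not apply even though %E exceeds 45.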
About the protein:
------------------

HEADER     HYDROLASE(ACID PROTEINASE)
COMPND     HIV-1 PROTEASE (BRU ISOLATE)
SOURCE     GENE OF THE HUMAN IMMUNODEFICIENCY VIRUS /HIV-1$
AUTHOR     S.SPINELLI,P.M.ALZARI
SEQLENGTH  99
NCHAIN     1 chain(s) in 1hhp data set
NALIGN     58 (=number of aligned sequences in HSSP file)

Abbreviations:
--------------

secondary structure:
   H=helix, E=extended (sheet), blank=rest (loop)
accessibility:
   3st:     relative solvent accessibility (acc) in 3 states:
            b = 0-9%, i = 9-36%, e = 36-100%.
   10st:    relative solvent acc. in 10 states: state n corresponds
            to a relative acc. of n*n %

AA:         amino acid sequence
OBS:        values for experimentally observed 3D structures
   OBS sec: DSSP classification of secondary structure
   OBS acc: DSSP estimate of relative solvent accessibility area
   O_3 acc: observed relative accessibility in 3 states: B, I, E
            Note: a blank is used for intermediate (i)
PHD:        Profile network prediction HeiDelberg
   PHD sec: prediction of secondary structure in three states
   PHD acc: prediction of solvent accessibility in 10 states
   P_3 acc: predicted relative accessibility in 3 states
Rel:        Reliability index of prediction (0-9)
   Rel sec: reliability of secondary structure prediction
   Rel acc: reliability of accessibility prediction
detail:
   prH sec: 'probability' for assigning helix
   prE sec: 'probability' for assigning strand
   prL sec: 'probability' for assigning loop
            Note: the 'probabilities' are scaled to the interval 0-9,
            i.e. prH=5 means that the signal at the first output node
            is 0.5-0.6.
subset:
   SUB:     a subset of the prediction, for all residues with a
            reliability index Rel >= n, with the following choices:
   SUB sec: for secondary structure prediction n = 5, which corresponds
            to an expected accuracy > 82% (see tables in header)
            Note: for this subset the following symbols are used:
               L:   is loop (for which above " " is used)
               ".": means that no prediction is made for this residue,
                    as the reliability is Rel < 5
   SUB acc: for accessibility prediction n = 4, which corresponds to an
            expected correlation coefficient > 0.69 (see tables in header)
            Note: for this subset the following symbols are used:
               "i": is intermediate (for which above " " is used)
               ".": means that no prediction is made for this residue,
                    as the reliability is Rel < 4

protein:  1hhp    length  99

                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD|
         OBS sec | EEEEE EEEEEE EEE EEEEEEE EEEEEEEEE|
         PHD sec | EEEE EEEEEE EEEEEEE EEEEEE EEEEE EEEEEE |
         Rel sec |945644266799995261277564258753113431168997643688736515566412|
 detail:
         prH sec |000000000000000000000000000112332110000000000000000000000000
         prE sec |037766422899997424587676521012345654421001236788831246677644
         prL sec |962233577100002575411212378865322234578898763211157742222355
 subset: SUB sec |L.EE...LLEEEEEE.L..EEEE..LLLL........LLLLLL..EEEE.LL.EEEE...|
 ACC:
 3st:    O_3 acc |eeeeeeee e ebebee e eb ee eeb beebe eeee e e eeee e eb b |
         P_3 acc |ee e ee ebebebeeeeeebbbeebbeeebbee e eeeeeeebbbbbbbbbe ee e|
 10st:   OBS acc |996789984637061785837043578476030672839887585647799585824415
         PHD acc |985736755606060686667000760076600765737976686000000000656757
         Rel acc |963702641314342164236264420143342621707671443251451210524626
 subset: SUB acc |ee.e..ei...e.e..ee..e.bbe...e..b.e..e.eee.ee..b.bb....e.ee.e|

                  ....,....7....,....8....,....9....,....10...,....11...,....1
         AA      |QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF|
         OBS sec |EEEEEE EEEEEEEEEE EE HHHHHHH |
         PHD sec |EEEEEEE EEEEEEEE EEE HHHHHH E |
         Rel sec |599998284379999983998112251789961253009
 detail:
         prH sec |000000000000000000000000114888974322100
         prE sec |799998512688999983000445520000000011440
         prL sec |200001486310000016998444364110024565449
 subset: SUB sec |EEEEEE.L..EEEEEEE.LLL....L.HHHHH..L...L|
 ACC:
 3st:    O_3 acc |e eb beeeebe bbbbe eeb bbe bbee eeeeee|
         P_3 acc |ebebe eeee ebbbbbbe bbebbb ebbeeb ee ee|
 10st:   OBS acc |836051686817350101849623006502774768899
         PHD acc |706065677756000000750060005600670567569
         Rel acc |714352357713224210521235373315462126236
 subset: SUB acc |e.e.e..eee....b...e....b.b...bee...e..e|
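
The relation between the 10-state accessibility digits, the 3-state symbols, and the SUB rows can be sketched as follows. This is a minimal illustration under stated assumptions, not the PHD code: the names acc10_to_3state and reliable_subset are hypothetical, and the handling of the boundary values (9% counted as i, 36% counted as e) is an assumption, since the legend lists overlapping ranges.

```python
def acc10_to_3state(n: int) -> str:
    """Map a 10-state digit n (0-9) to b/i/e via a relative acc. of n*n %.

    Boundary handling (9% -> i, 36% -> e) is an assumption of this sketch.
    """
    pct = n * n
    if pct < 9:
        return "b"
    if pct < 36:
        return "i"
    return "e"


def reliable_subset(pred: str, rel: str, min_rel: int, blank_symbol: str) -> str:
    """Keep predictions whose reliability digit is >= min_rel.

    Residues below the cutoff become ".", and a blank in the prediction
    row is written with blank_symbol ("L" for sec, "i" for acc).
    """
    if len(pred) != len(rel):
        raise ValueError("prediction and reliability rows differ in length")
    out = []
    for symbol, digit in zip(pred, rel):
        if int(digit) < min_rel:
            out.append(".")
        else:
            out.append(blank_symbol if symbol == " " else symbol)
    return "".join(out)


# First eight residues of the PHD acc and Rel acc rows of this report.
phd_acc = "98573675"
rel_acc = "96370264"
three_state = "".join(acc10_to_3state(int(d)) for d in phd_acc)
p3_row = three_state.replace("i", " ")   # the report prints i as a blank
print(reliable_subset(p3_row, rel_acc, min_rel=4, blank_symbol="i"))  # ee.e..ei
```

The printed string matches the first eight columns of the SUB acc row. The same reliable_subset call with min_rel=5 and blank_symbol="L" corresponds to the SUB sec rule.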