Note: the examples for the allowed PP input formats are primarily important when you submit the request by email.
Submitting a single sequence
- INPUT is: your protein sequence,
- OUTPUT is: alignment + prediction
INPUT
You send the following file:
joe@amino.churn.edu
# incredulase from paracoccus dementiae, translated from cDNA
KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD
WWKVEVNDRQGFVPAAYVKKLD
Notes:
- The '#' is a control for PredictProtein.
- The hash (#) is crucial, as the parser interprets anything after this line as a protein sequence. Following the hash, put a one-line description of the protein.
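The layout above (email line, hash line with a one-line description, then the sequence) could be assembled programmatically. The following is a minimal sketch in Python; the helper name `build_single_sequence_request` and the 60-column wrapping are assumptions made for illustration and are not part of PredictProtein.

```python
import textwrap


def build_single_sequence_request(email, description, sequence, width=60):
    """Return the text of a single-sequence request (hypothetical helper)."""
    # The description has to fit on the one line that starts with the hash.
    if "\n" in description:
        raise ValueError("description must be a single line")
    # Remove blanks and line breaks so the residues can be re-wrapped.
    residues = "".join(sequence.split())
    lines = [email, "# " + description]
    # Everything after the hash line is read as protein sequence.
    lines.extend(textwrap.wrap(residues, width))
    return "\n".join(lines) + "\n"


request = build_single_sequence_request(
    "joe@amino.churn.edu",
    "incredulase from paracoccus dementiae, translated from cDNA",
    "KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD WWKVEVNDRQGFVPAAYVKKLD",
)
print(request)
```

The wrap width is arbitrary here; the example above shows only that the sequence may span several lines.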
OUTPUT (detailed example)
If your sequence has at least one non-trivial homologue in the database of protein sequences, you receive a multiple sequence alignment and the annotated prediction in the following form:
Block with multiple sequence alignment.
Block with explanations about the prediction method.
Block with prediction (example for secondary structure prediction follows).
.........1.........2.........3.........4.........5.........6
AA KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
PHD EEEEEE EEEEEE EEEEEE EEEE EEE
Rel 854777641334566643102441577762566642443213663122112234155
- INPUT is: a list with your sequences,
- OUTPUT is: alignment + prediction
INPUT
You send the following file:
joe@amino.churn.edu
# FASTA list incredulase from paracoccus dementiae, translated from cDNA
> Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
> Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:
Notes:
- The string "# FASTA list" is crucial, as the parser interprets anything after this line as a list of sequences in FASTA format (i.e. the actual FASTA format starts in the line after the '#').
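A request in this layout could be written from a list of named sequences. The sketch below is illustrative only: the function name `build_fasta_list_request` and the `(name, sequence)` record shape are assumptions, and the output simply mirrors the example above.

```python
def build_fasta_list_request(email, description, records):
    """Build a '# FASTA list' request from (name, sequence) pairs (sketch)."""
    # The control string must open the hash line; the FASTA part starts below it.
    lines = [email, "# FASTA list " + description]
    for name, sequence in records:
        if not name.strip():
            raise ValueError("every sequence needs a name")
        lines.append("> " + name.strip())
        # Strip blanks and line breaks from the residues.
        lines.append("".join(sequence.split()))
    return "\n".join(lines) + "\n"


fasta_request = build_fasta_list_request(
    "joe@amino.churn.edu",
    "incredulase from paracoccus dementiae, translated from cDNA",
    [
        ("Andr_Mouse", "RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT"),
        ("Prgr_Rabit", "QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK"),
    ],
)
print(fasta_request)
```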
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
- INPUT is: a list with your sequences,
- OUTPUT is: alignment + prediction
INPUT
You send the following file:
joe@amino.churn.edu
# PIR list incredulase from paracoccus dementiae, translated from cDNA
>P1;
Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
>P1;
Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:
Notes:
- The string "# PIR list" is crucial, as the parser interprets anything after this line as a list of sequences in PIR format (i.e. the actual PIR format starts in the line after the '#').
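To illustrate what "anything after this line" means, here is a small Python sketch that splits such a request at the control line and reads the records below it. It handles only the simplified layout shown in the example above (a `>P1;` line, a name line, then sequence lines), not the full PIR format, and the function name `parse_pir_list_request` is hypothetical.

```python
def parse_pir_list_request(text):
    """Return (email, description, records) for a '# PIR list' request (sketch)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    marker = next(
        (i for i, ln in enumerate(lines) if ln.startswith("# PIR list")), None
    )
    if marker is None:
        raise ValueError("no '# PIR list' control line found")
    email = lines[0]
    description = lines[marker][len("# PIR list"):].strip()
    body = lines[marker + 1:]  # everything after the control line
    records = []
    i = 0
    while i < len(body):
        if body[i].startswith(">P1;") and i + 1 < len(body):
            name = body[i + 1]
            i += 2
            chunks = []
            # Collect sequence lines until the next record starts.
            while i < len(body) and not body[i].startswith(">"):
                chunks.append(body[i])
                i += 1
            records.append((name, "".join(chunks)))
        else:
            i += 1
    return email, description, records


pir_text = """joe@amino.churn.edu
# PIR list incredulase from paracoccus dementiae, translated from cDNA
>P1;
Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
>P1;
Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
"""
pir_email, pir_description, pir_records = parse_pir_list_request(pir_text)
print(pir_records)
```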
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
Note: I strongly recommend this format as the option of choice for non-experts (rather than the MSF format).
- INPUT is: your alignment (in the simple alignment format SAF),
- OUTPUT is: prediction
INPUT
You send the following file:
joe@amino.churn.edu
# SAF incredulase from paracoccus dementiae, translated from cDNA
Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT
Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK
Notes:
- Your name and email address are required. The string "# SAF" is crucial, as the parser interprets anything after this line as an alignment in SAF format.
- The '#' is a control for PredictProtein. The actual SAF-format begins after that line!
- Names should contain up to 14 characters and no blanks.
- Please use the same names for the same protein in all rows.
- To mark insertions, please use a point '.'.
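The naming rules in the notes above can be checked before sending an alignment. The following Python sketch reads SAF-style rows (a name followed by blocks of residues), enforces the 14-character limit, and joins all rows that share a name. The function name `parse_saf_rows` is an assumption, and this is not PredictProtein's own parser.

```python
def parse_saf_rows(rows, max_name_len=14):
    """Collect the aligned sequence for each name from SAF-style rows (sketch)."""
    alignment = {}
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # ignore empty lines between blocks
        name, blocks = fields[0], fields[1:]
        # Names: at most 14 characters. A name containing blanks would be
        # split into several fields here, so it cannot be recognised as one.
        if len(name) > max_name_len:
            raise ValueError(f"name longer than {max_name_len} characters: {name}")
        # Rows carrying the same name belong to the same protein.
        alignment[name] = alignment.get(name, "") + "".join(blocks)
    return alignment


saf_rows = [
    "Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT",
    "Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK",
]
saf_alignment = parse_saf_rows(saf_rows)
for saf_name, saf_seq in saf_alignment.items():
    # The point '.' marks an insertion position.
    print(saf_name, len(saf_seq), saf_seq.count("."))
```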
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
Note: For non-experts, I strongly recommend using the SAF format instead (see above).
- INPUT is: your alignment (in the multiple sequence format MSF),
- OUTPUT is: prediction
INPUT
You send the following file:
joe@amino.churn.edu
# MSF incredulase from paracoccus dementiae, translated from cDNA
MSF of: x.hssp from: 1 to: 176
x.msf MSF: 176 Type: P 11-Oct-93 21:17:4 Check: 5859 ..
Name: Andr_Human Len: 176 Check: 750 Weight: 1.00
Name: Prgr_Rabit Len: 176 Check: 3980 Weight: 1.00
//
Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT
Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK
Notes:
- Your name and email address are required. The string "# MSF" is crucial, as the parser interprets anything after this line as an alignment in MSF format.
- The '#' is a control for PredictProtein. The actual MSF-format begins after that line!
- Names should contain up to 14 characters and no blanks.
- Please use the same names for the same protein in all rows.
- All sequences must have the same length.
- To mark insertions, please use a point '.'.
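The equal-length requirement is easy to verify before submission. The Python sketch below reads the aligned rows that follow the `//` line of the example above and checks that all sequences have the same length. Both function names are hypothetical, and the sketch does not validate the MSF header fields (Len, Check, Weight).

```python
def read_msf_alignment(text):
    """Return {name: aligned sequence} from the rows after the '//' line (sketch)."""
    alignment = {}
    in_body = False
    for line in text.splitlines():
        if line.strip() == "//":
            in_body = True  # header ends here, aligned rows follow
            continue
        fields = line.split()
        if in_body and fields:
            name = fields[0]
            alignment[name] = alignment.get(name, "") + "".join(fields[1:])
    return alignment


def check_equal_length(alignment):
    """Raise ValueError unless all aligned sequences have the same length."""
    lengths = {name: len(seq) for name, seq in alignment.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"sequences differ in length: {lengths}")
    return next(iter(lengths.values()), 0)


msf_text = """MSF of: x.hssp from: 1 to: 176
x.msf MSF: 176 Type: P 11-Oct-93 21:17:4 Check: 5859 ..
Name: Andr_Human Len: 176 Check: 750 Weight: 1.00
Name: Prgr_Rabit Len: 176 Check: 3980 Weight: 1.00
//
Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT
Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK
"""
msf_alignment = read_msf_alignment(msf_text)
msf_length = check_equal_length(msf_alignment)
print(msf_length)
```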
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
- INPUT is: a list with your aligned sequences,
- OUTPUT is: prediction
INPUT
You send the following file:
joe@amino.churn.edu
do NOT align
# FASTA list incredulase from paracoccus dementiae, translated from cDNA
> Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
> Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:
Notes:
- The strings '# FASTA list' and 'do NOT align' are both crucial. The first tells the parser to interpret everything after the line with the hash ('#') as a list of sequences in FASTA format (i.e. the actual FASTA format starts in the line after the '#'). The second ('do NOT align') is needed because otherwise your sequences will be re-aligned.
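As a sketch of where the keyword line goes, the following Python function inserts 'do NOT align' directly before the hash line of an existing request, as in the example above. The function name `add_do_not_align` is hypothetical. The case-insensitive check for an already present keyword is a convenience of this sketch and says nothing about how PredictProtein matches the string.

```python
def add_do_not_align(request_text):
    """Insert the 'do NOT align' line before the hash line (hypothetical helper)."""
    lines = request_text.splitlines()
    hash_index = next(
        (i for i, ln in enumerate(lines) if ln.lstrip().startswith("#")), None
    )
    if hash_index is None:
        raise ValueError("request has no '#' control line")
    # Do not add the keyword twice if it already precedes the hash line.
    already_there = any(
        ln.strip().lower() == "do not align" for ln in lines[:hash_index]
    )
    if not already_there:
        lines.insert(hash_index, "do NOT align")
    return "\n".join(lines) + "\n"


plain_request = (
    "joe@amino.churn.edu\n"
    "# FASTA list incredulase from paracoccus dementiae, translated from cDNA\n"
    "> Andr_Mouse\n"
    "RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT\n"
)
no_align_request = add_do_not_align(plain_request)
print(no_align_request)
```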
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
- INPUT is: a list with your aligned sequences,
- OUTPUT is: prediction
INPUT
You send the following file:
joe@amino.churn.edu
do NOT align
# PIR list incredulase from paracoccus dementiae, translated from cDNA
>P1;
Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
>P1;
Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:
Notes:
- The strings '# PIR list' and 'do NOT align' are both crucial. The first tells the parser to interpret everything after the line with the hash ('#') as a list of sequences in PIR format (i.e. the actual PIR format starts in the line after the '#'). The second ('do NOT align') is needed because otherwise your sequences will be re-aligned.
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
- INPUT is: a prediction of secondary structure and accessibility,
- OUTPUT is: an alignment of remote homologues
INPUT
You send the following file:
joe@amino.churn.edu
prediction-based threading
# COLUMN format
AA PSEC PACC RI_SEC RI_ACC
E L 11 9 6
F E 7 9 0
: : : : :
V H 61 3 0
L H 113 3 0
R H 39 1 1
: : : : :
P L 17 9 4
A L 89 9 2
: : : : :
- Delimiters of columns: allowed are spaces, commas, and tabs.
- Compulsory information: (1) sequence (AA) in one-letter code; (2) secondary structure (PSEC) in either of the states H=helix, E=strand, or L=rest; (3) solvent accessibility (PACC) in square Angstrom (note: for prediction-based threading accessibility will be converted to relative accessibility in two states: buried (<15%) or exposed (≥15%)).
- Optional: (1) reliability, or strength for secondary structure (RI_SEC) scaled from 0 (low) to 9 (high); (2) reliability, or strength for relative accessibility (RI_ACC) scaled from 0 (low) to 9 (high).
Notes:
- The string '# COLUMN format' is crucial, as the parser interprets anything after this line as a prediction.
- To receive a PHD prediction in this format, use the output option 'return COLUMN format'.
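The column rules described above (spaces, commas, or tabs as delimiters; PSEC in H, E, or L; reliability indices from 0 to 9) can be sketched as a small Python reader. The function names are hypothetical. Converting PACC from square Angstrom to relative accessibility is not shown, because the reference values it needs are not given here; `accessibility_state` only applies the two-state threshold to a value that is already a relative percentage.

```python
import re

DELIMITERS = re.compile(r"[,\s]+")  # spaces, commas, and tabs


def parse_column_format(text):
    """Read the rows after the '# COLUMN format' line into dictionaries (sketch)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    start = next(
        (i for i, ln in enumerate(lines) if ln.startswith("# COLUMN format")), None
    )
    if start is None:
        raise ValueError("no '# COLUMN format' control line found")
    header = DELIMITERS.split(lines[start + 1])
    rows = []
    for line in lines[start + 2:]:
        row = dict(zip(header, DELIMITERS.split(line)))
        if row["PSEC"] not in ("H", "E", "L"):
            raise ValueError(f"unknown secondary structure state: {row['PSEC']}")
        row["PACC"] = float(row["PACC"])  # square Angstrom
        for key in ("RI_SEC", "RI_ACC"):  # optional columns
            if key in row:
                row[key] = int(row[key])
                if not 0 <= row[key] <= 9:
                    raise ValueError(f"{key} must lie between 0 and 9")
        rows.append(row)
    return rows


def accessibility_state(relative_percent):
    """Two-state accessibility: buried (<15%) or exposed (>=15%)."""
    return "buried" if relative_percent < 15 else "exposed"


column_text = """joe@amino.churn.edu
prediction-based threading
# COLUMN format
AA PSEC PACC RI_SEC RI_ACC
E L 11 9 6
F,E,7,9,0
V\tH\t61\t3\t0
"""
column_rows = parse_column_format(column_text)
print(column_rows[0])
```

The three data rows deliberately use different delimiters to show that they are read the same way.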
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
- INPUT is: a prediction and observation of secondary structure,
- OUTPUT is: an evaluation of prediction accuracy
INPUT
You send the following file:
joe@amino.churn.edu
evaluate prediction accuracy
# COLUMN format
NAME AA PSEC OSEC
first M L L
first Q L L
first T L H
first S H H
first S H H
first I H H
: : : :
second G L L
second V L L
second K E L
second S L H
second I L H
: : : :
- Delimiters of columns: allowed are spaces, commas, and tabs.
- Compulsory information: (1) sequence (AA) in one-letter code; (2) predicted secondary structure (PSEC) in either of the states H=helix, E=strand, or L=rest; (3) observed secondary structure (OSEC) in either of the states H=helix, E=strand, or L=rest (e.g. from DSSP assignment); (4) if more than one protein is used, simply append all requested proteins (in that case make sure that the first column (NAME) lists a unique protein name).
- Optional: (1) name of protein (compulsory for more than one protein).
- (Note: Your email address is required. The string "# COLUMN format" is crucial, as the parser interprets anything after this line as a prediction.)
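To give a feel for what a per-residue score measures, the Python sketch below computes, for each protein, the fraction of residues whose predicted state (PSEC) equals the observed state (OSEC). This is only one simple score, and it is an assumption that it resembles what the server reports. The scores actually returned are defined in the output itself, and the function name `per_residue_accuracy` is hypothetical.

```python
from collections import defaultdict


def per_residue_accuracy(rows):
    """rows: iterable of (name, aa, psec, osec); return {name: fraction correct}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for name, _aa, psec, osec in rows:
        total[name] += 1
        if psec == osec:
            correct[name] += 1
    return {name: correct[name] / total[name] for name in total}


evaluation_rows = [
    ("first", "M", "L", "L"),
    ("first", "Q", "L", "L"),
    ("first", "T", "L", "H"),
    ("first", "S", "H", "H"),
    ("first", "S", "H", "H"),
    ("first", "I", "H", "H"),
    ("second", "G", "L", "L"),
    ("second", "V", "L", "L"),
    ("second", "K", "E", "L"),
    ("second", "S", "L", "H"),
    ("second", "I", "L", "H"),
]
accuracy = per_residue_accuracy(evaluation_rows)
for protein_name, fraction in accuracy.items():
    print(protein_name, round(fraction, 2))
```

The rows are the ones listed in the example above, without the ':' placeholder lines.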
OUTPUT (example)
Block with definition of scores for prediction accuracy.
Tables with per-residue and per-segment prediction accuracy.
- INPUT is: your protein sequence given as a valid SWISSPROT identifier,
- OUTPUT is: alignment + prediction
INPUT
You send the following file:
joe@amino.churn.edu
# SWISSid
paho_chick
Notes:
- The string "# SWISSid" is crucial, as the parser interprets anything after this line as a SWISSPROT identifier.
- Valid identifiers have the form 'name_species', for the example above:
- Only identifiers of the latest SWISSPROT release are accepted.
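The 'name_species' shape can be checked locally before a request is sent. The Python sketch below tests only the shape and splits the identifier at the underscore. It cannot tell whether the identifier exists in the latest SWISSPROT release. The permitted character set in the pattern is an assumption, and the function name `split_swiss_id` is hypothetical.

```python
import re

# Assumed shape: letters or digits, one underscore, letters or digits.
SWISS_ID_PATTERN = re.compile(r"[A-Za-z0-9]+_[A-Za-z0-9]+")


def split_swiss_id(identifier):
    """Return (name, species) for an identifier of the form 'name_species'."""
    identifier = identifier.strip()
    if not SWISS_ID_PATTERN.fullmatch(identifier):
        raise ValueError(f"not of the form 'name_species': {identifier!r}")
    name, species = identifier.split("_", 1)
    return name, species


swiss_name, swiss_species = split_swiss_id("paho_chick")
print(swiss_name, swiss_species)
```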