Here, we report a substantial increase in both the accuracy and quality of secondary structure predictions, using a neural network algorithm. The main improvements come from the use of multiple sequence alignments (better overall accuracy), from 'balanced training' (better prediction of β-strands) and from 'structure context training' (better prediction of helix and strand lengths). The new method, cross-validated on seven different test sets purged of sequence similarity to learning sets, achieves a three-state prediction accuracy of 69.7%, significantly better than previous methods.
In addition, the predicted structures have a more realistic distribution of helix and strand segments. The predictions may be suitable for use in practice as a first estimate of the structural type of newly sequenced proteins.
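To make the two evaluation criteria above concrete, the following minimal sketch shows how a three-state accuracy (the percentage of residues whose predicted state matches the observed one) and a list of segment lengths could be computed. It is an illustration only, not the authors' code: the one-letter labels `H` (helix), `E` (strand) and `C` (other), the function names `q3_accuracy` and `segment_lengths`, and the toy sequences are all assumptions introduced here.

```python
from itertools import groupby


def q3_accuracy(observed: str, predicted: str) -> float:
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are per-residue state strings over the assumed
    alphabet H (helix), E (strand), C (other).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def segment_lengths(states: str, state: str) -> list[int]:
    """Lengths of all uninterrupted runs of `state` in `states`."""
    return [
        sum(1 for _ in run)
        for label, run in groupby(states)
        if label == state
    ]


# Hypothetical toy example, not data from the paper.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"

print(f"Q3 = {q3_accuracy(observed, predicted):.1f}%")
print("observed helix lengths: ", segment_lengths(observed, "H"))
print("predicted helix lengths:", segment_lengths(predicted, "H"))
print("observed strand lengths: ", segment_lengths(observed, "E"))
print("predicted strand lengths:", segment_lengths(predicted, "E"))
```

Comparing the segment-length lists of observed and predicted structures is one simple way to check whether a prediction yields realistically long helices and strands, separately from its per-residue accuracy.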