Training Analysis Page: Error Analysis

 

The Modify Field Dialog allows you to analyze and modify the properties of a field. The Training Analysis page allows you to view an analysis of the prediction results versus the desired values.

For help with predictions, see Predicting and Modeling Financial Data.

Error Analysis Sub-Page Data

This sub-page displays an analysis of the error, correlation, and percent of correct signals for both the raw values and the postprocessed values.

·      Normalized Error

This value compares the difference between the desired values and the predicted values with the difference between each desired value and the previous value. Dividing the first difference by the second produces a normalized error in which 0 indicates no error and 1 indicates that the prediction was no better than simply repeating the previous value. For more information on this value, see the help for the Normalized Error sub-page.

·      Correlation

This value is the statistical correlation between the desired values and the predicted values. Specifically, the reported correlation coefficient measures whether the desired values and the predicted values move in the same direction by similar amounts. If a strong relationship is found, the changes in the predicted values are very similar to the changes in the desired values. For more information on this value, see the help for the Correlation sub-page.

·      Percent Correct Signal

This value indicates how often the predicted signal points in the correct direction overall. This information is analyzed in detail on the Prediction Matrix sub-page.

 

The analysis is performed for the training, cross validation, and accuracy testing sets used for the prediction. Each type of analysis is performed on the entire subset.
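The three values described above can be illustrated with a minimal Python sketch. This is not the program's actual calculation. It assumes that absolute differences are summed for the normalized error, that the correlation is a Pearson coefficient computed on the values themselves, and that the direction of a signal is measured relative to the previous desired value. The function names are hypothetical.

```python
import math


def _sign(x):
    # Returns -1, 0, or 1 depending on the sign of x.
    return (x > 0) - (x < 0)


def normalized_error(desired, predicted):
    """Prediction error divided by the error of repeating the previous value.

    Assumption: absolute differences are summed. The program may aggregate
    the differences another way (for example, with squared differences).
    """
    model_err = sum(abs(d - p) for d, p in zip(desired[1:], predicted[1:]))
    naive_err = sum(abs(d - prev) for d, prev in zip(desired[1:], desired[:-1]))
    return model_err / naive_err if naive_err else float("nan")


def correlation(desired, predicted):
    """Pearson correlation coefficient between the two series (assumed form)."""
    n = len(desired)
    mean_d = sum(desired) / n
    mean_p = sum(predicted) / n
    cov = sum((d - mean_d) * (p - mean_p) for d, p in zip(desired, predicted))
    var_d = sum((d - mean_d) ** 2 for d in desired)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    denom = math.sqrt(var_d * var_p)
    return cov / denom if denom else float("nan")


def percent_correct_signal(desired, predicted):
    """Percent of steps where the predicted and desired directions agree.

    Assumption: direction is the sign of the move from the previous
    desired value.
    """
    steps = len(desired) - 1
    if steps < 1:
        return float("nan")
    hits = sum(
        _sign(d - prev) == _sign(p - prev)
        for d, p, prev in zip(desired[1:], predicted[1:], desired[:-1])
    )
    return 100.0 * hits / steps
```

Under these assumptions, a prediction that simply repeats the previous desired value gives a normalized error of exactly 1, and a perfect prediction gives 0, which matches the interpretation given for the Normalized Error value.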

Error Analysis Sub-Page Analysis

The analysis presented on this page is based on the normalized error and correlation values. It detects common characteristics to look for in the data and is intended only as a starting point for evaluating the model. Some common results include:

·      This might be a good / reasonable / poor predictive model.

This result is based on the normalized error in the testing set. If no testing set is used, the cross validation or training set error is used. The model is classified according to the following table:

error <  0.80     excellent model
error <  0.85     very good model
error <  0.90     good model
error <  0.95     reasonable model
error >= 0.95     poor model

 

Evaluating models in this way is very coarse. Models that are classified as "excellent" or "good" may mirror the desired data well, but may still not predict values that are useful for the intended purpose. Similarly, models that are classified as "poor" may produce values that are still useful. As stated above, this is intended only as a starting point for evaluating the model.

·      It performed better / worse than simply predicting the last value.

This result explains the rationale for the analysis text. As explained above, a normalized error of 1 is equivalent to predicting the previous value. Therefore, the effectiveness of the model is judged by how much it improves on simply predicting the last known value.

·      This might be a good / reasonable / weak correlational model.

This result is based on the correlation in the testing set. If no testing set is used, the cross validation or training set correlation is used. The model is classified according to the following table:

correlation >  0.20     excellent model
correlation >  0.15     very good model
correlation >  0.10     good model
correlation >  0.05     reasonable model
correlation <= 0.05     poor model

 

The same caveat applies to the correlation classifications. Models that are classified as "excellent" or "good" may mirror the desired data well, but may still not predict values that are useful for the intended purpose. Similarly, models that are classified as "weak" or "poor" may produce values that are still useful. As stated above, this is intended only as a starting point for evaluating the model.

·      It produced values with a strong / weak correlation to the desired data.

This result explains the rationale for the analysis text. As explained above, a higher correlation is desirable. An inverse correlation is generally not desirable, since the model is trained to produce results similar to the desired data, not opposite to it.
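The two classification tables and the choice of data set can be sketched in Python as follows. This is only an illustration of the thresholds listed above, not the program's own code. The function names and the dictionary keys are hypothetical, and the order of preference between the cross validation set and the training set is an assumption, since the text does not state which of the two is used first.

```python
# Thresholds copied from the tables above; each entry is (limit, label).
ERROR_TABLE = [(0.80, "excellent"), (0.85, "very good"),
               (0.90, "good"), (0.95, "reasonable")]
CORRELATION_TABLE = [(0.20, "excellent"), (0.15, "very good"),
                     (0.10, "good"), (0.05, "reasonable")]


def classify_error(error):
    """Return the label for a normalized error (lower is better)."""
    for limit, label in ERROR_TABLE:
        if error < limit:
            return label
    return "poor"


def classify_correlation(corr):
    """Return the label for a correlation (higher is better)."""
    for limit, label in CORRELATION_TABLE:
        if corr > limit:
            return label
    return "poor"


def beats_last_value(error):
    """A normalized error below 1 improves on repeating the previous value."""
    return error < 1.0


def pick_value(values_by_set):
    """Pick the value to classify: testing set first, then a fallback.

    Assumption: cross validation is preferred over training when no
    testing set is present. The keys are hypothetical names.
    """
    for name in ("testing", "cross_validation", "training"):
        if values_by_set.get(name) is not None:
            return values_by_set[name]
    return None
```

For example, classify_error(pick_value({"training": 0.97, "cross_validation": 0.88})) uses the cross validation error of 0.88 and returns "good". Note that a negative (inverse) correlation always falls into the "poor" row of the table.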

 

If a problem occurred during the training or calculation phase, the analysis is replaced with a description of the error. A summary of these error messages appears in the help for the Modify Field Dialog: Training Analysis page.

How Did I Get Here?

This is a sub-page of the Modify Field Dialog: Training Analysis page.