Training Analysis Page: Prediction Matrix
The Modify Field Dialog allows you to analyze and modify the properties of a field. The Training Analysis page allows you to view an analysis of the prediction results versus the desired values.
For help with predictions, see Predicting and Modeling Financial Data.
Prediction Matrix Sub-Page Data
This sub-page displays an analysis of the accuracy of predicted entry/exit signals. Specifically, it reports how often each type of signal was predicted as compared to the desired signals. This is useful for determining the nature of incorrectly predicted signals.
There are several sections to the data being presented. Each section is divided into separate columns for training, cross validation, and testing data. Typically, the testing data is of primary importance since it is not used for training and will most closely resemble the behavior on any new data.
Heading
The total and desired sections are present for each type of predicted signal. These signals are displayed using the following symbols:
Enter Long >= 0.50
Exit Short >= 0.20
Hold
Exit Long <= -0.20
Enter Short <= -0.50
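The thresholds above can be read as a simple classification rule. The following is a minimal sketch, assuming the listed thresholds are applied directly to a single output value; the function name and constants are hypothetical and are not part of the program.

```python
# Thresholds copied from the list above. The names are hypothetical.
ENTER_LONG = 0.50
EXIT_SHORT = 0.20
EXIT_LONG = -0.20
ENTER_SHORT = -0.50


def classify_signal(value: float) -> str:
    """Map an output value to one of the five signal types."""
    if value >= ENTER_LONG:
        return "Enter Long"
    if value >= EXIT_SHORT:
        return "Exit Short"
    if value <= ENTER_SHORT:
        return "Enter Short"
    if value <= EXIT_LONG:
        return "Exit Long"
    # Anything strictly between -0.20 and 0.20 is a hold.
    return "Hold"


for v in (0.75, 0.30, 0.0, -0.30, -0.75):
    print(f"{v:+.2f} -> {classify_signal(v)}")
```

The more extreme thresholds are tested first so that a value of 0.75 is reported as an entry rather than as an exit.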
Total
These values are the percentage of the outputs from each data set (training, cross validation, or testing) that produced each type of predicted signal. These values total 100% for each data set, but the displayed values may sum to a slightly different number because of rounding.
Example: A value of 42% in the first sub-column indicates that 42% of the signals produced in the training set were enter long.
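The total values can be reproduced by counting the predicted signals in a data set. This is a minimal sketch with hypothetical names and made-up data, chosen so that the enter long value matches the 42% example.

```python
from collections import Counter

SIGNALS = ["Enter Long", "Exit Short", "Hold", "Exit Long", "Enter Short"]


def total_percentages(predicted):
    """Return the percentage of outputs that produced each signal type."""
    if not predicted:
        raise ValueError("at least one predicted signal is required")
    counts = Counter(predicted)
    return {s: 100.0 * counts[s] / len(predicted) for s in SIGNALS}


# Hypothetical predicted signals for one data set (50 outputs).
predicted = (["Enter Long"] * 21 + ["Exit Short"] * 5 + ["Hold"] * 14
             + ["Exit Long"] * 4 + ["Enter Short"] * 6)
totals = total_percentages(predicted)
for signal in SIGNALS:
    print(f"{signal:12s} {totals[signal]:5.1f}%")
```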
Desired
These values are the percentage of the outputs from each data set (training, cross validation, or testing) that produced each type of predicted signal, broken down by desired signal. These 25 values total 100% for each data set, but the displayed values may sum to a slightly different number because of rounding.
The values are color-coded to differentiate matches in correct and incorrect directions.
Background    Meaning
White         Perfect Match
Light Gray    Correct Direction
Dark Gray     Neutral – Hold Expected or Produced
Light Pink    Incorrect Direction
Dark Pink     Incorrect Entry
Neutral
This value indicates the total percentage of the outputs that were not in the incorrect direction. In other words, all of the white, light gray, and dark gray values in the desired section.
Note: A high neutral percentage does not always indicate a useful prediction. If the prediction produces primarily hold signals, it could be up to 100% neutral.
Correct
This value indicates the total percentage of the outputs that were in the correct direction. In other words, all of the white and light gray values in the desired section.
Perfect
This value indicates the total percentage of the outputs that matched the desired signal. In other words, all of the white values in the desired section.
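The relationship between the 25 desired values and the neutral, correct, and perfect values can be sketched as follows. The cell classification here is an assumption, not the program's documented rule: an exact match (including hold versus hold) is treated as white, any other cell involving a hold as dark gray, and a mismatch with the same sign as light gray. All names and data are hypothetical.

```python
SIGNALS = ["Enter Long", "Exit Short", "Hold", "Exit Long", "Enter Short"]
# Assumed direction of each signal: +1 positive, -1 negative, 0 hold.
DIRECTION = {"Enter Long": 1, "Exit Short": 1, "Hold": 0,
             "Exit Long": -1, "Enter Short": -1}


def desired_matrix(predicted, desired):
    """Percentage of outputs in each (predicted, desired) cell: 25 values."""
    counts = {(p, d): 0 for p in SIGNALS for d in SIGNALS}
    for p, d in zip(predicted, desired):
        counts[(p, d)] += 1
    n = len(predicted)
    return {cell: 100.0 * c / n for cell, c in counts.items()}


def summarize(cells):
    """Sum the cells into the neutral, correct, and perfect values."""
    white = light_gray = dark_gray = 0.0
    for (p, d), pct in cells.items():
        if p == d:
            white += pct                      # perfect match
        elif DIRECTION[p] == 0 or DIRECTION[d] == 0:
            dark_gray += pct                  # hold expected or produced
        elif DIRECTION[p] == DIRECTION[d]:
            light_gray += pct                 # correct direction
        # Remaining cells are the pink (incorrect) cells.
    return {"Neutral": white + light_gray + dark_gray,
            "Correct": white + light_gray,
            "Perfect": white}


# Hypothetical data: ten outputs from one data set.
predicted = ["Enter Long", "Enter Long", "Exit Short", "Hold", "Hold",
             "Exit Long", "Enter Short", "Enter Long", "Hold", "Exit Short"]
desired = ["Enter Long", "Exit Short", "Enter Long", "Hold", "Enter Long",
           "Exit Long", "Enter Long", "Enter Short", "Exit Long", "Hold"]
summary = summarize(desired_matrix(predicted, desired))
print(summary)
```

Because each value is a sum over a superset of the cells in the next, neutral is always at least as large as correct, which is at least as large as perfect.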
Prediction Matrix Sub-Page Analysis
The analysis presented on this page is based on the directional accuracy values. It detects common characteristics to look for in the data and is intended only as a starting point for evaluating the model. Some common results include:
· This might be a good / reasonable / weak directional model.
This is an analysis of the directional accuracy in the testing set. It uses the percent correct direction for the values not in the dark gray cells. In other words, values that were predicted as a hold or desired to be a hold, but did not match, are ignored. If no testing set is used, the cross validation or training set directional accuracy is used. The following table is used:
accuracy > 65% excellent model
accuracy > 55% very good model
accuracy > 45% good model
accuracy > 35% reasonable model
accuracy <= 35% weak model
Evaluating models in this way is very abstract. Models that are classified as "excellent" or "good" may mirror the desired data well, yet still fail to predict values that are useful for the intended purpose. Similarly, models that are classified as "weak" may produce values that are still useful. As stated above, this is intended only as a starting point for evaluating the model.
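The rating table can be expressed as a short lookup. This sketch assumes that directional accuracy is the correct percentage divided by the percentage of outputs that are not in dark gray cells; that formula is an interpretation of the description above, and the function names are hypothetical.

```python
def directional_accuracy(correct_pct, dark_gray_pct):
    """Percent correct direction among outputs outside the dark gray cells.

    Returns None when every output falls in a dark gray cell.
    """
    remaining = 100.0 - dark_gray_pct
    if remaining <= 0:
        return None
    return 100.0 * correct_pct / remaining


def rate_model(accuracy):
    """Apply the thresholds from the table above."""
    if accuracy > 65:
        return "excellent"
    if accuracy > 55:
        return "very good"
    if accuracy > 45:
        return "good"
    if accuracy > 35:
        return "reasonable"
    return "weak"


# Hypothetical values: 50% correct, 30% in dark gray cells.
accuracy = directional_accuracy(50.0, 30.0)
print(f"{accuracy:.1f}% -> {rate_model(accuracy)} model")
```

Note that the comparisons are strict, so an accuracy of exactly 65% is rated "very good" rather than "excellent".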
· It is specialized on the training / cross validation data.
This is an analysis of the directional accuracy for the testing set as compared to the directional accuracy for the training set. If no testing set is used, the directional accuracy for the cross validation set is used.
This is significant when determining whether the training has produced a model that is good at making generalizations outside of the training set. If a model over-trains or simply memorizes the training data, it may perform very well at predicting values in the training set; however, it will tend to perform poorly at predicting the values outside of the training set. A good predictive model will perform equally well on data in the training, cross validation, and testing sets.
Since the postprocessing values may be optimized on the cross validation data, the directional accuracy of the cross validation set will also be compared to that of the testing set if both are used.
This message may appear if the samples-to-weights ratio is not adequate for the type of data being used. A good rule of thumb is to have about ten training samples for each weight in the network. Significantly lower ratios may allow the neural network to simply use the weights to memorize the data, rather than make effective generalizations about its characteristics. See the Prediction Model page for a report of this ratio and help for improving it.
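As a rough illustration of the rule of thumb, the ratio can be estimated from the network layout. This sketch assumes a fully connected feedforward network with one bias weight per non-input node; the program's actual architecture and weight count may differ, so the ratio reported on the Prediction Model page takes precedence. All names are hypothetical.

```python
def count_weights(layer_sizes, include_bias=True):
    """Count weights in a fully connected feedforward network (assumed layout)."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # connection weights between layers
        if include_bias:
            total += n_out             # one bias per node in the next layer
    return total


def samples_to_weights_ratio(n_samples, layer_sizes):
    """Training samples available per weight."""
    return n_samples / count_weights(layer_sizes)


# Hypothetical network: 10 inputs, 5 hidden nodes, 1 output.
layers = [10, 5, 1]
ratio = samples_to_weights_ratio(1000, layers)
print(f"{count_weights(layers)} weights, ratio {ratio:.1f}")
if ratio < 10:
    print("Ratio is below the rule of thumb of about ten samples per weight.")
```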
· Many of its long / short entry signals are contrary to the desired signal.
This is an analysis of the entry signals. Specifically, this message appears if at least 15% of the entry signals of a given type are being produced when the desired signal is in the opposite direction. This is useful for determining the reliability of the entry signals.
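The 15% check can be sketched as follows. It assumes that, for an enter long signal, both exit long and enter short count as desired signals in the opposite direction (and the reverse for enter short); the names and data are hypothetical.

```python
# Assumed set of desired signals that oppose each entry type.
OPPOSITE = {
    "Enter Long": {"Exit Long", "Enter Short"},
    "Enter Short": {"Exit Short", "Enter Long"},
}


def contrary_entry_fraction(predicted, desired, entry_type):
    """Fraction of predicted entries of one type whose desired signal opposes them."""
    matched = [d for p, d in zip(predicted, desired) if p == entry_type]
    if not matched:
        return 0.0
    contrary = sum(1 for d in matched if d in OPPOSITE[entry_type])
    return contrary / len(matched)


def has_contrary_entries(predicted, desired, entry_type, threshold=0.15):
    """True when at least 15% of the entries of this type are contrary."""
    return contrary_entry_fraction(predicted, desired, entry_type) >= threshold


# Hypothetical data: 20 enter long predictions, 4 of them against the desired signal.
predicted = ["Enter Long"] * 20
desired = ["Enter Long"] * 16 + ["Enter Short"] * 3 + ["Exit Long"]
print(contrary_entry_fraction(predicted, desired, "Enter Long"))
print(has_contrary_entries(predicted, desired, "Enter Long"))
```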
· It is only producing hold signals.
This message is generated to indicate that only hold signals are being produced. This results in the model not producing any trades during signal analysis.
This can occur if postprocessing is not used or is not optimized for the current prediction. In that case, the values produced by the neural network all fall between the thresholds for exit long and exit short.
· It is not producing entry signals.
This message is generated to indicate that no entry signals are being produced. This results in the model not producing any trades during signal analysis.
This can occur if postprocessing is not used or is not optimized for the current prediction. In that case, the values produced by the neural network all fall between the thresholds for enter long and enter short.
· It is not producing long / short exit signals.
This message is generated when the indicated type of exit signal is not being generated. This results in the model entering one type of trade and not exiting it. Typically, this results in a delayed buy and hold signal.
This can occur when the underlying data moves primarily in one direction, resulting in a profitable buy and hold. If the model of the signal is not particularly good, the optimization of the postprocessing will simply optimize the entry condition and never exit.
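The three signal-production messages above can be sketched as a check on the set of predicted signals. Two details are assumptions rather than documented behavior: the hold-only message suppresses the others, and a missing exit type is reported only when the matching entry type is being produced. The function name is hypothetical.

```python
# Exit signal that closes each type of entry.
ENTRY_TO_EXIT = {"Enter Long": "Exit Long", "Enter Short": "Exit Short"}


def production_messages(predicted):
    """Return the messages that apply to a sequence of predicted signals."""
    produced = set(predicted)
    if produced <= {"Hold"}:
        return ["It is only producing hold signals."]
    messages = []
    if not produced & set(ENTRY_TO_EXIT):
        messages.append("It is not producing entry signals.")
    for entry, exit_signal in ENTRY_TO_EXIT.items():
        # Assumption: only report a missing exit if that side is ever entered.
        if entry in produced and exit_signal not in produced:
            side = "long" if entry == "Enter Long" else "short"
            messages.append(f"It is not producing {side} exit signals.")
    return messages


print(production_messages(["Hold", "Hold", "Hold"]))
print(production_messages(["Hold", "Exit Long", "Exit Short"]))
print(production_messages(["Enter Long", "Hold", "Hold"]))
```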
· It is specialized on positive / negative signals.
This is an analysis of the directional accuracy in the testing set, comparing the percentage of positive signals to negative signals. This is useful for determining the reliability of the signals.
If a problem occurred during the training or calculation phase, the analysis will be replaced with a description of the error. A summary of these error messages is displayed on the help for the Modify Field Dialog: Training Analysis page.
How Did I Get Here?
This is a sub-page of the Modify Field Dialog: Training Analysis page.