Tutorial: Using Correlation Analysis to Identify Inputs: Run Correlation Analysis for a Single Stock
Tutorial Task
Run a correlation analysis on a Close field.
In this task, we will look for inputs that can help predict the closing price of “Bellsouth Cp” stock using other data in the portfolio. Before we begin, we will need to import the data for “Bellsouth Cp”, along with some other data so that we have different types of values to analyze.
Commentary
In several of the previous tutorials, we used selected informative inputs to train price predictions and models of the optimal signal. You may be wondering how you can identify informative inputs to use in your predictions.
Much of this process is educated trial and error. The best way to start is by using information contained in the data series you are trying to predict. For example, if you are trying to predict the closing price or the optimal signal for a stock, the prices and financial indicators associated with that stock are an excellent source of inputs.
However, stocks are also subject to broader market pressures. If the entire market or a particular sector is advancing or declining, then a stock will typically react to that trend. Some stocks lead trends, other stocks trail behind, and some even move counter to trends.
Determining informative external inputs is also a process of educated trial and error. However, TradingSolutions includes a tool to highlight points of interest that may warrant investigation -- the Correlation Analysis Wizard.
What is the Correlation Analysis Wizard?
The Correlation Analysis Wizard allows you to determine the amount of correlation between one field and a collection of other fields over a range of previous samples. For example, you could determine whether the previous closing prices of stocks and indices in your portfolio are correlated to the closing price of another stock.
As you may remember from the optional activities of the previous tutorials, correlation measures the degree to which two data series move in the same direction by the same percentage. Therefore, previous values with a high correlation to the value you are trying to predict may make informative inputs in its prediction.
It is important to note that both positive and negative correlations can indicate informative inputs. Values with a negative correlation tend to move in opposite directions. However, this is still information that can be exploited when making a prediction.
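As a rough illustration of positive and negative correlation, the sketch below computes a standard Pearson correlation coefficient on made-up percent changes. It assumes the wizard reports something comparable to Pearson correlation; the source does not state the exact formula TradingSolutions uses, and the function name and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up daily percent changes, for illustration only.
stock = [1.0, -0.5, 0.8, -1.2, 0.3]
same_direction = [2.0, -1.0, 1.6, -2.4, 0.6]       # always moves with the stock
opposite_direction = [-1.0, 0.5, -0.8, 1.2, -0.3]  # always moves against it

r_pos = pearson(stock, same_direction)      # close to +1
r_neg = pearson(stock, opposite_direction)  # close to -1
print(r_pos, r_neg)
```

Both series are equally informative about the stock's moves; only the sign of the relationship differs.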
This activity will take you step-by-step through using correlation analysis to determine a set of potential inputs and creating a useful prediction from those inputs.
Step-by-step Instructions
1. Select Analyze Correlations… from the context menu for “Bellsouth Cp” in the Portfolio View.
Right-click on “Bellsouth Cp” in the Portfolio View and select Analyze Correlations… from the context menu. This will display a selection dialog labeled Select a Field to Analyze. This is a list of all of the fields in “Bellsouth Cp” that are available for analysis.
2. On the dialog labeled Select a Field to Analyze, select the “Close” field and press OK.
We want to be able to predict the closing price of “Bellsouth Cp”. To identify inputs for this prediction, we want to determine which fields have higher correlations with the value in the Close field of “Bellsouth Cp”. Select the Close field from the selection tree and press the OK button to continue. This will display the Correlation Analysis Wizard.
Note: If you have previously run correlation analysis for this field, the results from that correlation analysis will be displayed. To follow this tutorial step-by-step, press the Delete Analysis button, then press the Correlation Analysis button on the Overview page. You should now be in the Correlation Analysis Wizard.
3. On the first page of the Correlation Analysis Wizard, select Determine if other fields may be good inputs for predicting this field and press Next.
The Correlation Analysis Wizard allows you to perform two different types of correlation analysis. The default type that is typically performed determines if other fields may be good inputs for predicting the selected field.
The other type of correlation analysis allows you to determine if the selected field may be a good input for predicting other fields. This is useful when you import a new index or calculate a new financial indicator and want to see if it could potentially be used to predict other values in your portfolio.
In this case, we want to predict the selected field, so accept the default type and press the Next button. This will advance you to the Analysis Fields page.
4. On the Analysis Fields page, select “My Portfolio” and press the Add Field from Selection… button.
The Analysis Fields page allows you to select which fields to analyze. We would like to compare the closing price of “Bellsouth Cp” to the previous closing prices of all of the data in the portfolio. To do this, select “My Portfolio” and press the Add Field from Selection… button. This will display a selection dialog named Select a Field Name to Add.
5. On the dialog labeled Select a Field Name to Add, select the “Close” field and press OK.
This dialog allows you to select a specific field name from all of the data that you have selected. In this case, we would like to analyze the closing prices of all of the data, so accept the default selection of the Close field and press the OK button.
This combination of selecting the portfolio and selecting the Close field adds the Close field for all of the data in the portfolio. In this case, we left the default preprocessing of “Percent Change” selected, so the percent change of each Close field was selected.
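To make the “Percent Change” preprocessing concrete, here is a minimal sketch that converts a price series into sample-to-sample percent changes. The exact definition TradingSolutions applies is not given in this tutorial, so the formula and the closing prices below are assumptions for illustration.

```python
def percent_change(values):
    """Return the percent change from each sample to the next."""
    return [
        100.0 * (curr - prev) / prev
        for prev, curr in zip(values, values[1:])
    ]


# Hypothetical closing prices, not taken from the sample data.
closes = [40.0, 42.0, 41.16, 41.16]
changes = percent_change(closes)
print(changes)  # one fewer value than the input series
```

Working with percent changes rather than raw prices puts stocks and indices with very different price levels on a comparable scale.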
Note: If you have imported a significant amount of data other than the data in the “Sample Data” directory, you may want to modify this step to reduce the number of fields being analyzed.
Note: Since correlation analysis analyzes previous values, we want to include the closing price of “Bellsouth Cp” in the list of fields to analyze since previous values of “Bellsouth Cp” are also available as inputs.
6. On the Analysis Fields page, press Next.
We have selected all of the fields we want to analyze, so press the Next button. This will advance you to the Analysis Options page.
7. On the Analysis Options page, look over the options and press Finish.
The Analysis Options page allows you to specify the date range and other options to use for the analysis. Accept the default values on this page and press the Finish button to begin the analysis.
Note: The available date range is determined by the date ranges of the data series for each of the selected fields. The default size is also limited to the default size of a prediction date range. If you use only the sample data, the default will be to analyze 12/31/1994 through 12/31/1999. However, if you have imported and selected other stocks that do not include data for this entire range, the available date range may be different, causing different results.
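One plausible reading of this note is that the available range is the overlap of the selected series' date ranges. The sketch below works under that assumption, which the tutorial does not confirm; the function name and the dates for the second series are hypothetical.

```python
from datetime import date


def common_date_range(ranges):
    """Return the (start, end) overlap of several (start, end) date ranges,
    or None if the ranges do not all overlap."""
    start = max(s for s, _ in ranges)
    end = min(e for _, e in ranges)
    if start > end:
        return None
    return start, end


# Hypothetical series ranges: one covers the full sample period,
# the other starts later.
series_ranges = [
    (date(1994, 12, 31), date(1999, 12, 31)),
    (date(1996, 1, 2), date(1999, 12, 31)),
]
available = common_date_range(series_ranges)
print(available)
```

Under this reading, adding a single short series narrows the range for the whole analysis, which is why the results can change.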
8. Let TradingSolutions perform the requested task.
After the analysis has completed, a Correlation Analysis page for the selected field will be displayed.
9. Examine the Modify Field Dialog: Correlation Analysis page.
The Correlation Analysis page displays the results of the most recent correlation analysis for a field. Like training analysis and signal analysis, it is part of the Modify Field Dialog. However, it is only displayed after a correlation analysis has been run for a field.
Each row on the Correlation Analysis page represents one of the analyzed fields lagged by a given number of samples. For example, if the percent change in the closing price for the S&P 500 Index was analyzed for 0 to 4 lags, then entries for “%Change in S&P 500 Index: Close” as an input with lags from 0 to 4 would be listed individually.
Correlation values range from 1 to -1. A value of 1 indicates the highest level of correlation – when one value increases, the other typically increases. Negative values indicate inverse correlation – when one value increases, the other typically decreases. By default, the entries are listed in order of the absolute value of the correlation since both directions contain information about tendencies.
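The layout of this page can be sketched as a table of (field, lag, correlation) rows sorted by absolute correlation. The example below is an approximation under stated assumptions: it uses Pearson correlation, treats a lag of k as "the input's value k samples earlier", and runs on made-up percent changes. The field names and helper functions are hypothetical, not part of TradingSolutions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(target, candidates, max_lag):
    """Correlate target[t] with candidate[t - lag] for lags 0..max_lag.

    All series must have the same length. Rows are returned sorted by the
    absolute value of the correlation, largest first.
    """
    rows = []
    for name, series in candidates.items():
        for lag in range(max_lag + 1):
            ys = target[lag:]                  # drop the first `lag` targets
            xs = series[:len(series) - lag]    # drop the last `lag` inputs
            rows.append((name, lag, pearson(xs, ys)))
    rows.sort(key=lambda row: abs(row[2]), reverse=True)
    return rows


# Made-up percent changes. The target repeats the leader one sample later.
leader = [0.5, -1.0, 1.5, -0.2, 0.7, -0.9, 0.4, 1.1]
target = [0.0] + leader[:-1]
candidates = {
    "leader": leader,
    "mirror": [-x for x in leader],  # moves opposite to the leader
    "other": [0.3, 0.1, -0.4, 0.9, -0.6, 0.2, 0.8, -0.1],
}

rows = lagged_correlations(target, candidates, max_lag=2)
for name, lag, r in rows:
    print(f"{name:7s} lag {lag}: {r:+.3f}")
```

Sorting by absolute value places the strongly negative "mirror" entry next to the strongly positive "leader" entry at the top, since both carry the same amount of information.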
It is important to note that correlation is only one measure of the informative value of a field as an input. Individually, fields may contain information that is not reflected in this value, such as momentum and non-linear relationships. In addition, fields may contain information that is effective only when combined with other information. For example, volume fields will typically not have a high correlation when compared to price fields, but they may still be useful in providing information about underlying price changes.
What are the Confidence rows?
In addition to entries for the individual lags of the potential inputs, you will also see rows indicating some percentage of confidence that a correlation exists. These rows are provided to indicate the significance of the other values.
If the values are not zero, doesn’t this mean that a correlation must exist? Not necessarily. Correlation analysis is only as good as the data being analyzed. Even if the date range being analyzed has the same characteristics as the data you will be trying to predict, it may still include variability or “noise” that may appear as a correlation when one does not exist.
Because of this, TradingSolutions indicates the values where various levels of confidence exist. For example, the line that indicates “99% Confident a Correlation Exists” displays the correlation value required to be 99% certain that a correlation exists. This is a statistical way of saying that there is only about a 1% chance that unrelated values would appear this strongly correlated purely by chance.
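To give a feel for where such a threshold might come from, the sketch below estimates the correlation magnitude needed for a given confidence level using the Fisher z-transformation, a common textbook approximation. This is not necessarily the method TradingSolutions uses; the function name and the sample counts are assumptions.

```python
from math import sqrt, tanh
from statistics import NormalDist


def correlation_threshold(n_samples, confidence):
    """Approximate |r| needed to conclude, at the given two-sided confidence
    level, that a correlation exists (Fisher z-transformation)."""
    if n_samples <= 3:
        raise ValueError("need more than 3 samples")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Under 'no correlation', atanh(r) is roughly normal with
    # standard deviation 1 / sqrt(n - 3).
    return tanh(z / sqrt(n_samples - 3))


thr_99 = correlation_threshold(1000, 0.99)  # about 0.08 for 1000 samples
thr_95 = correlation_threshold(1000, 0.95)
thr_99_small = correlation_threshold(100, 0.99)
print(thr_99, thr_95, thr_99_small)
```

Two patterns are worth noting: a higher confidence level requires a larger correlation, and a shorter date range (fewer samples) raises the threshold, because chance correlations are more likely in small samples.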