Computational intelligence has been widely used in recent years in many areas, such as speech recognition, image analysis, adaptive control and time series prediction. This research explores the usefulness of neural networks and support vector machines in financial markets. Two popular stock market indexes are studied: the Hong Kong Hang Seng Index and the Dow Jones Transportation Index. The performance of the neural network and the support vector machine is evaluated in two dimensions: forecasting error and trading profit.
A popular technical indicator, the percentage price oscillator (PPO), is selected as the training input and output. The predictive models use the PPO of the previous 8 days to forecast the PPO of the future 5 days. Empirical results on the Hong Kong Hang Seng Index show that the trading system based on a multilayer perceptron optimized with a genetic algorithm (MLPGA) obtains 6.71 times the original capital from 1997-1-29 to 2007-3-8, a total of 2500 trading days, while the trading system based on support vector regression optimized by a genetic algorithm (SVRGA) generates 5.705 times the original capital over the same time horizon. In contrast, a conventional non-predictive trading system produces only 2.064 times the starting equity, and the “Buy and Hold” strategy returns 1.605 times the initial investment. A recently published fuzzy trading system provides 5.781 dollars as final equity for 1 dollar of initial investment.
Further evaluations of the two intelligent trading systems have been made. A back test using the same parameters and the same assumptions on the Dow Jones Transportation Index further confirms the robustness of the proposed trading systems. The MLPGA trading system provides 4.87 times the initial capital and the SVRGA trading system obtains 5.168 as final equity. These two intelligent trading systems again outperform the conventional trading system, which generates 2.805 dollars for 1 dollar of investment.
Acknowledgements
I am very grateful to my final year project supervisor, Associate Professor Wang Lipo, and would like to take this opportunity to thank him for his patient and insightful guidance throughout the project. Professor Wang always offered me detailed and valuable explanations and suggestions in our discussions, and shared useful knowledge about doing research. Not only did Professor Wang enlighten me in the academic area, he also arranged meetings with industry professionals for me to discuss this project. Again, I would like to express my sincere appreciation to Professor Wang.
Zhu Ming
April, 2010.
Stock Trading using Computational Intelligence
Fig 2‑1 A multi layer neural network with L layers
Fig 2‑2 Maximum-margin hyperplane and margins for a SVM trained with samples from two classes.
Fig 2‑3 Genetic Algorithm flowchart, with maximum 100 generation
Fig 2‑4 One point crossover
Fig 2‑5 Roulette-wheel selection
Fig 3‑1 Dow Jones Industrial Average price, with EMA plotted.
Fig 3‑2 Using single EMA
Fig 3‑3 Using two EMA to make decision
Fig 3‑4 A predictive trading system.
Fig 3‑5 Structure of GA optimized MLP
Fig 4‑1 Training performance of MLP
Fig 4‑2 MSE for out of sample data
Fig 4‑3 Linear regression for trained neural network
Fig 4‑4 Linear regression for out of sample data
Fig 4‑5 Equity curve for intelligent and conventional trading systems
Fig 4‑6 Trading signal of NN+GA trading system
Fig 4‑7 Trading signal of conventional trading system
Fig 4‑8 MSE for GA+SVR model
Fig 4‑9 Equity curve for GA+SVR trading system and conventional trading system
Fig 4‑10 Comparison of 4 trading systems
Fig 4‑11 Equity curves of different trading system on DJT
Table 3‑1 Settings for GA and NN
Table 3‑2 Settings for GA and SVR
Table 4‑1 Data distributions for training and testing neural network
Table 4‑2 Total return for different prediction time horizon
Table 4‑3 Trading performance comparison
Introduction
Analyzing the stock market is one of the most important and fascinating issues in finance, as it is closely related to the profitability of investment. There are two main types of analysis in financial markets: technical analysis and fundamental analysis. Fundamental analysis is based on the premise that a stock, bond, fund, commodity, or a market as a whole has an underlying intrinsic value. By analyzing fundamental characteristics such as assets, liabilities, income, supply or demand, this value can be determined [11]. Fundamental analysts normally use a trading strategy called “Buy and Hold”, since they tend to buy the stocks of undervalued companies or of companies with great growth potential. They believe that the share price will eventually rise because the company they bought is growing, so they keep the stocks for a relatively long time. Technical analysis, on the other hand, holds that the market price reflects all relevant information, such as news and events. Thus, price is the only information technical analysts need to analyze. In their view, history repeats itself, so price patterns can be traded for profit. Therefore, technical analysis employs only historical data to build the model for future investment.
Over the past decade, computational intelligence, such as neural networks (NN), has been widely used in stock trading [10]. Computational intelligence gives investors the opportunity to combine the information gathered from fundamental analysis and technical analysis when making trading decisions. Two main types of input data have been used in computational intelligence. One type, prices or technical indicators, belongs to technical analysis. The other type includes macroeconomic indices and information related to a specific company, such as the interest rate and the P/E ratio.
Many pioneering scholars have focused on minimizing the mean square error (MSE) in price direction prediction as well as on producing paper profits in trading financial markets. Patel et al. [10] use a hierarchical coevolutionary fuzzy system (HiCEFS) to predict a technical indicator and hence build a prudent trading strategy. Testing this model with real-world data of the Hong Kong Hang Seng Index and NOL stock on the Singapore Exchange, they achieved a final return of 14.251 times the original capital on NOL stock in 2329 trading days and 5.781 times the original capital on the Hang Seng Index in 2461 trading days.
The objective of this project is to explore and examine the usefulness of computational intelligence in stock trading on the Hong Kong Hang Seng Index and the Dow Jones Transportation Index. The intelligent trading system, built in MATLAB, can analyze historical data and generate buy or sell signals for any given time series.
The main objectives are as follows:
1. Apply the intelligent trading system to the Hong Kong Hang Seng Index to generate buy and sell signals. The intelligent trading system can be constructed with a neural network optimized by a genetic algorithm or with a support vector machine optimized by a genetic algorithm.
2. Examine the entry and exit signals generated by the intelligent trading system and by the non-intelligent trading system, and compare their empirical trading profits.
3. Compare the trading performance of the intelligent trading system with other researchers’ work, using the same data and trading rules.
4. Further validate the trading system’s performance by applying the proposed system to the Dow Jones Transportation Average Index, and compare the trading profits with those of the non-intelligent trading system.
This report is organized into 5 chapters:
Chapter 1 provides background on financial markets and on other researchers’ accomplishments in applying computational intelligence to financial markets. It also gives the detailed project objectives and scope.
Chapter 2 introduces the background knowledge for this project, such as neural networks, support vector machines and genetic algorithms.
Chapter 3 describes the proposed methodology of this project. It introduces the technical indicators, the inputs to the intelligent trading system and the architecture of the trading system. In addition, it provides the settings for each intelligent prediction model, as well as the data preparation for these prediction models.
Chapter 4 presents the empirical results of trading the Hong Kong Hang Seng Index and the Dow Jones Transportation Average Index. Furthermore, it compares the results with the non-intelligent trading system as well as the “buy and hold” strategy.
Chapter 5 summarizes the project and outlines future work.
Literature Review
An artificial neural network (ANN) is inspired by the structure and functions of biological neural networks, and is expressed using mathematical models. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are nonlinear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data. Neural networks are considered highly parallel systems that can learn from past data and apply the learned knowledge to new data.
There are various ANN structures; the multilayer perceptron (MLP) neural network is one of them. It is a feedforward network with a layered structure. Each layer consists of units which receive their input from units in the layer directly below and send their output to units in the layer directly above. There are no connections within a layer (Fig 2‑1). The inputs are fed into the first layer, and each input is associated with a weight. The outputs of the first layer serve as the inputs of the second layer, and so on until the final output is calculated. Within each layer, a unit applies its activation function to the weighted sum of its inputs plus a bias.
Information in MLP networks only moves in the forward direction, from the input nodes through the hidden layers to the output layer. There are also no loops in an MLP network.
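The layer-by-layer computation of an MLP can be sketched in a few lines. This is a minimal illustration, assuming a tanh hidden layer and a linear output layer; the function names and the toy weights are hypothetical and are not taken from the project.

```python
import math

def layer_forward(inputs, weights, biases, activation):
    """Compute one layer: activation(sum_i w[j][i] * x[i] + b[j]) for each unit j."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    """Feed the signal through every layer in order (forward direction only)."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = layer_forward(signal, weights, biases, activation)
    return signal

# Toy network (hypothetical weights): 2 inputs -> 2 tanh hidden units -> 1 linear output.
toy_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.1], lambda v: v),
]
output = mlp_forward([1.0, 1.0], toy_layers)
```

Each layer is stored as a (weights, biases, activation) triple, so adding a layer only means appending one more triple to the list.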
Fig 2‑1 A multi layer neural network with L layers
Back propagation is a common method of teaching artificial neural networks how to perform a given task. It was first described by Arthur E. Bryson and Yu-Chi Ho in 1969 [14]. Back propagation is a supervised learning method, and is an implementation of the Delta rule. It requires a teacher that knows, or can calculate, the desired output for any given input. In other words, it has to be provided with the desired output in order to calculate the errors. The errors propagate backwards from the output nodes to the inner nodes and from the inner nodes to the input nodes. Hence back propagation is a method to calculate the gradient of the network error with respect to the network’s modifiable weights, either in the input layer or in the hidden layer.
In short, the back propagation technique can be summarized as follows:
1. Present a training sample to the neural network.
2. Compare the network’s output to the desired output from that sample. Calculate the error in each output neuron.
3. For each neuron, calculate what the output should have been, and a scaling factor, how much lower or higher the output must be adjusted to match the desired output. This is the local error.
4. Adjust the weights of each neuron to lower the local error.
5. Assign “blame” for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.
6. Repeat from step 3 on the neurons at the previous level, using each one’s “blame” as its error.
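The steps above can be sketched for a very small network. The example below is an illustration only: it assumes a squared-error loss, tanh hidden units, a single linear output unit without biases, and plain gradient descent with a fixed learning rate (not the Levenberg-Marquardt training used in this project). All names and numbers are hypothetical.

```python
import math

def train_step(x, target, w_hidden, w_out, lr=0.1):
    """One back propagation step; returns updated weights and the error before the update."""
    # Forward pass: tanh hidden units, linear output unit.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sum(w * h for w, h in zip(w_out, hidden))
    # Steps 2-3: error of the output neuron (its local error).
    delta_out = output - target
    # Step 5: assign blame to hidden units in proportion to the connecting
    # weights, scaled by the slope of tanh at each unit.
    delta_hidden = [delta_out * w * (1.0 - h * h) for w, h in zip(w_out, hidden)]
    # Step 4: move every weight against the gradient of the squared error.
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - lr * d * xi for w, xi in zip(ws, x)]
        for ws, d in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_out, 0.5 * delta_out ** 2

# Toy example: repeat the step on one sample and record the error each time.
w_h, w_o = [[0.2, -0.1], [0.4, 0.3]], [0.5, -0.5]
errors = []
for _ in range(50):
    w_h, w_o, err = train_step([1.0, 0.5], 0.8, w_h, w_o)
    errors.append(err)
```

The recorded errors shrink as the step is repeated, which is the behaviour the summary describes.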
The Levenberg-Marquardt algorithm is used for training the neural network; it modifies the weights of each layer of the ANN. The algorithm interpolates between the Gauss-Newton algorithm and the method of gradient descent. It is more robust than the Gauss-Newton algorithm, which means that in many cases it finds a solution even if it starts very far from the final minimum. On the other hand, for well-behaved functions and reasonable starting parameters, it tends to be a bit slower than the Gauss-Newton algorithm. The Levenberg-Marquardt algorithm can be expressed as in [15].
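A common way to write the Levenberg-Marquardt weight update is given below. The symbols are assumptions introduced for illustration: $w_k$ is the weight vector at iteration $k$, $J$ is the Jacobian of the network errors with respect to the weights, $e$ is the vector of errors, $I$ is the identity matrix and $\mu$ is a damping parameter.

```latex
w_{k+1} = w_k - \left( J^{T} J + \mu I \right)^{-1} J^{T} e
```

When $\mu$ is close to zero the step approaches the Gauss-Newton step, and when $\mu$ is large it approaches a small gradient descent step, which is the interpolation described above.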
The support vector machine (SVM) is a relatively new learning method developed from statistical learning theory. Compared with traditional statistics, statistical learning theory does not assume infinite samples, but rather focuses on estimation from small samples. The basic idea of the support vector machine is to find a hyperplane which separates the d-dimensional data perfectly into its two classes. The support vector machine is a supervised learning method which maps the input space to the output space (Fig 2‑2).
Given a training set $(x_i, y_i)$, $i = 1, \dots, l$, where $l$ is the number of training samples, the support vector machine is obtained by minimizing the objective function given in [17].
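For reference, the soft-margin formulation that is standard in the SVM literature is sketched below; whether [17] uses exactly this notation is an assumption. Here $l$ is the number of training samples, the labels are $y_i \in \{-1, +1\}$, $w$ and $b$ define the hyperplane, $\xi_i$ are slack variables, $\phi$ maps the inputs to the feature space, and $C > 0$ balances training errors against model complexity.

```latex
\min_{w,\,b,\,\xi} \quad \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \xi_i
\qquad \text{subject to} \quad
y_i \left( w^{T} \phi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0
```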
Fig 2‑2 Maximum-margin hyperplane and margins for a SVM trained with samples from two classes.
The use of the support vector machine for regression, called support vector regression (SVR), was proposed in 1996 by Vladimir Vapnik, Harris Drucker, Chris Burges, Linda Kaufman and Alex Smola [18]. The model produced by a support vector machine for classification depends only on a subset of the training data, called the support vectors, because the cost function for building the model does not care about training points that lie beyond the margin. Similarly, the model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data close to the model prediction.
Given a training set $(x_i, y_i)$, $i = 1, \dots, l$, where $l$ is the number of training samples, the target of SVR is to find a linear function that minimizes the discrepancy between the desired output and the predicted output. The optimal regression function has the same form as in SVM.
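The way SVR ignores training points close to the prediction is usually expressed through the ε-insensitive loss. The sketch below assumes this standard loss; the function name and the sample numbers are illustrative and not taken from the project.

```python
def epsilon_insensitive_loss(desired, predicted, epsilon):
    """Return 0 inside the tube of half-width epsilon, else the excess absolute error."""
    return max(0.0, abs(desired - predicted) - epsilon)

# A point within 0.1 of the prediction costs nothing; a farther point costs the excess.
inside = epsilon_insensitive_loss(1.00, 1.05, epsilon=0.1)
outside = epsilon_insensitive_loss(1.00, 1.30, epsilon=0.1)
```

Only points with a non-zero loss influence the fitted function, which is why the model depends on a subset of the training data.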
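Several kernel functions are commonly used in SVR, including the linear, polynomial, radial basis function and sigmoid kernels. Their respective formulas are as follows [23]:
- Linear: $K(x_i, x_j) = x_i^{T} x_j$
- Polynomial: $K(x_i, x_j) = (\gamma x_i^{T} x_j + r)^{d}$, $\gamma > 0$
- Radial Basis Function (RBF): $K(x_i, x_j) = \exp(-\gamma \| x_i - x_j \|^{2})$, $\gamma > 0$
- Sigmoid: $K(x_i, x_j) = \tanh(\gamma x_i^{T} x_j + r)$
Here, $\gamma$, $r$ and $d$ are kernel parameters.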
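The four kernels can be written directly as small functions. This sketch assumes the standard definitions of these kernels, with `gamma`, `r` and `d` as illustrative parameter names; the vectors and parameter values are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, gamma, r, d):
    return (gamma * dot(u, v) + r) ** d

def rbf_kernel(u, v, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(u, v, gamma, r):
    return math.tanh(gamma * dot(u, v) + r)

u, v = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v, gamma=0.5, r=1.0, d=2),
    "rbf": rbf_kernel(u, u, gamma=0.5),  # identical points give the maximum value
    "sigmoid": sigmoid_kernel(u, v, gamma=0.1, r=0.0),
}
```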
SVM and SVR have some advantages compared to neural networks. For instance, they are less prone to overfitting the training data, since only some of the training data are used as support vectors. However, the parameters of SVR still affect the final results, even though SVR has far fewer parameters than an NN. The main parameters in SVR are the width of the error-insensitive tube around the regression function [19] and the constant that balances training errors against model complexity.
A genetic algorithm (GA) is a search technique for finding exact or approximate solutions to optimization and search problems. It is considered a global search heuristic. A GA uses techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover.
A typical genetic algorithm requires:
1. a genetic representation of the solution domain
2. a fitness function to evaluate the solution domain
In a GA, abstract representations of candidate solutions, called chromosomes, form a population that evolves toward better solutions of an optimization problem. Solutions are represented with some encoding method, such as binary encoding. A fitness function is a particular type of objective function that quantifies the optimality of a solution so that a particular chromosome can be ranked against all the other chromosomes. The evolution usually starts from a population of randomly generated individuals. In each generation, the fitness of every individual in the population is evaluated. Based on their fitness, the fittest individuals are selected and, through reproduction, crossover or mutation, form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. A common genetic algorithm is shown in Fig 2‑3.
Fig 2‑3 Genetic Algorithm flowchart, with maximum 100 generation
When generating the next generation of solutions, the GA uses genetic operators: crossover and/or mutation. For each new solution to be produced, a pair of “parent” solutions is selected for breeding from the pool selected previously. By producing a “child” solution using crossover and mutation, a new solution is created which typically shares many of the characteristics of its “parents”.
Crossover selects genes from the parent chromosomes and creates a new offspring. One common way is to use a single crossover point on both parents’ organism strings. All data beyond that point in either organism string is swapped between the two parent organisms. An illustration of one point crossover is shown in Fig 2‑4.
Fig 2‑4 One point crossover
There are other ways to perform crossover; for example, two crossover points could be chosen. Crossover can be rather complicated and depends heavily on the encoding of the chromosome. In some cases, GA performance can be enhanced by trying out other crossover techniques.
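One point crossover on binary strings can be sketched as follows. The function name is hypothetical, and the cut point is assumed to be drawn uniformly from the interior positions of the string.

```python
import random

def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap everything beyond a random cut point between two equal-length strings."""
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Parents of all ones and all zeros make the cut point visible in the children.
child_a, child_b = one_point_crossover("11111111", "00000000", random.Random(0))
```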
After crossover is performed, mutation takes place. The purpose of mutation in a GA is to preserve and introduce diversity. Mutation helps the search avoid being trapped in local minima, and it keeps the chromosomes in the population from becoming too similar to each other, so that the evolution can continue. Mutation changes the new offspring randomly. For binary encoding, a common way is to switch a few randomly chosen bits from 1 to 0 or from 0 to 1.
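Bit-flip mutation for binary encoding can be sketched as below. The function name is hypothetical, and each bit is assumed to be flipped independently with the same probability.

```python
import random

def bit_flip_mutation(chromosome, mutation_rate, rng=random):
    """Flip each bit of a binary string independently with probability mutation_rate."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < mutation_rate else bit
        for bit in chromosome
    )

rng = random.Random(1)
original = "0000000000"
mutant = bit_flip_mutation(original, mutation_rate=0.02, rng=rng)
unchanged = bit_flip_mutation(original, mutation_rate=0.0, rng=rng)   # never flips
all_flipped = bit_flip_mutation(original, mutation_rate=1.0, rng=rng)  # always flips
```

A small mutation rate changes only an occasional bit, which introduces diversity without destroying good solutions.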
Selection chooses individual genomes from a population for breeding the next generation. There are various selection algorithms, such as roulette-wheel selection, rank selection and tournament selection.
Roulette-wheel selection chooses parents according to their fitness. A chromosome with higher fitness has a higher chance of being selected. The fitness level is used to associate a probability of selection with each individual chromosome. This algorithm can be pictured as a roulette wheel in a casino, where a larger slice has a higher probability of being chosen, as shown in Fig 2‑5. If $f_i$ is the fitness of individual $i$ in the population, its probability of being selected is $p_i = f_i / \sum_{j=1}^{N} f_j$, where $N$ is the number of individuals in the population.
Fig 2‑5 Roulette-wheel selection
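Roulette-wheel selection can be sketched as follows, assuming non-negative fitness values where a larger value is better. The function names and the fitness values are illustrative.

```python
import random

def selection_probabilities(fitness):
    """p_i = f_i / sum_j f_j for non-negative fitness values."""
    total = sum(fitness)
    return [f / total for f in fitness]

def roulette_wheel_select(fitness, rng=random):
    """Return the index of one individual, chosen with probability proportional to fitness."""
    spin = rng.random() * sum(fitness)
    cumulative = 0.0
    for index, f in enumerate(fitness):
        cumulative += f
        if spin < cumulative:
            return index
    return len(fitness) - 1  # guard against floating-point round-off

fitness = [1.0, 3.0, 6.0]
probabilities = selection_probabilities(fitness)
rng = random.Random(42)
counts = [0, 0, 0]
for _ in range(10000):
    counts[roulette_wheel_select(fitness, rng)] += 1
```

Over many spins, the number of times each individual is selected is roughly proportional to its fitness.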
Tournament selection involves running several “tournaments” among a few individuals chosen at random from the population. The winner of each tournament (the one with the best fitness) is selected. Selection pressure is easily adjusted by changing the tournament size. If the tournament size is larger, weak individuals have a smaller chance to be selected.
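A single tournament can be sketched as below, assuming that a larger fitness value is better and that contestants are drawn without replacement. The function name and the fitness values are illustrative.

```python
import random

def tournament_select(fitness, tournament_size, rng=random):
    """Pick tournament_size distinct individuals at random and return the fittest one's index."""
    contestants = rng.sample(range(len(fitness)), tournament_size)
    return max(contestants, key=lambda i: fitness[i])

fitness = [0.2, 0.9, 0.5, 0.1]
rng = random.Random(7)
winner_small = tournament_select(fitness, tournament_size=2, rng=rng)
winner_full = tournament_select(fitness, tournament_size=len(fitness), rng=rng)
```

With a tournament as large as the population the fittest individual always wins, while a tournament of size 2 only excludes the single weakest individual, which illustrates how the tournament size controls selection pressure.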
Intelligent Trading System Design
Technical analysts seek to identify price patterns and trends in financial markets and attempt to exploit those patterns [20]. People who use technical analysis search for archetypal patterns, such as the well-known head and shoulders or double top reversal patterns, study indicators such as moving averages, and look for forms such as lines of support, resistance, channels, and more obscure formations such as flags, pennants or balance days. In this project, only indicators have been studied, since they are quantitative and do not require ambiguous identification.
Among all technical indicators, the moving average is considered the simplest and one of the most useful. It is popular because it reveals trends by smoothing the prices, and investors can make profits by following trends. The exponential moving average (EMA), one of the moving average indicators, is considered more adaptive since it puts more weight on recent prices, e.g., today’s close price, and less weight on earlier days. The equation below shows the calculation of the EMA:
$$\mathrm{EMA}_t = \alpha P_t + (1 - \alpha)\,\mathrm{EMA}_{t-1}, \qquad \alpha = \frac{2}{N + 1}$$
where $P_t$ is the close price on day $t$ and $N$ is the number of days of the EMA.
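The EMA recursion can be sketched as follows. The smoothing factor 2 / (period + 1) and the seeding of the first value with the first price are assumptions; other seeding conventions, such as a simple average of the first prices, are also common. The prices are made up for illustration.

```python
def ema(prices, period):
    """Exponential moving average with smoothing factor 2 / (period + 1).

    The first EMA value is seeded with the first price (an assumption).
    """
    alpha = 2.0 / (period + 1)
    values = [prices[0]]
    for price in prices[1:]:
        # Recent prices get weight alpha; the running average keeps the rest.
        values.append(alpha * price + (1.0 - alpha) * values[-1])
    return values

smoothed = ema([10.0, 11.0, 12.0, 13.0], period=3)
```

A shorter period gives a larger smoothing factor, so the EMA follows recent prices more closely.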
The long-term EMA of 45 days and the short-term EMA of 15 days are plotted together with the close price of the Dow Jones Industrial Average Index in Fig 3‑1. All data and figures are provided by Yahoo Finance.
Fig 3‑1 Dow Jones Industrial Average price, with EMA plotted.
There are many ways of using the EMA, and two common uses are introduced here. First, investors can take a long position, i.e., buy the stock index, when the close price is above the EMA, and take a short position when the close price is below the EMA. An example is shown in Fig 3‑2, using a 30-day EMA on the Dow Jones Industrial Average. Although there are some whipsaws in the middle, a single EMA is helpful to investors when making buy or sell decisions.
Fig 3‑2 Using single EMA
Another way of using the EMA is to take a long position (buy) when the short-term EMA is above the long-term EMA, and to take a short position when the short-term EMA is below the long-term EMA. An example of how to buy or sell is illustrated in Fig 3‑3, using a 15-day EMA and a 45-day EMA. As the chart shows, this method is effective in taking large profits while suffering small losses.
Fig 3‑3 Using two EMA to make decision
It is clear that the EMA can help investors identify the trend. However, being able to discover the trend is not enough; a trading rule must be established to take profits from the trend.
Charts and technical indicators alone are also not sufficient, since they have some serious disadvantages. For example, we do not know whether a technical indicator can bring investors consistent long-term profits, nor do we know how many shares we should buy or sell. Without more information on these points, investors may not dare to trade with real money. A quantitative trading system based on these indicators can overcome these shortcomings. A well-established trading system is able to tell when to buy and when to sell, as well as how many shares to buy and sell. In addition, a trading system can provide back testing results, which present the trading performance to investors, such as the equity curve or the maximum drawdown. Therefore, in this project, a quantitative trading system is built and tested.
This trading system uses a technical indicator named the Percentage Price Oscillator (PPO), which is calculated as follows:
$$\mathrm{PPO} = \frac{\mathrm{EMA}_{\text{short}} - \mathrm{EMA}_{\text{long}}}{\mathrm{EMA}_{\text{long}}} \times 100\%$$
A buy signal is triggered if the PPO is greater than 0, in other words, when the short-term EMA crosses above the long-term EMA. A sell signal is triggered if the PPO is less than 0, which means the long-term EMA is above the short-term EMA. This is a typical trend-following system, designed to catch major trends for profit while suffering only small losses when significant trends are absent from the market.
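The PPO rule can be sketched as follows. The sketch assumes that the PPO is the percentage gap between the short-term and long-term EMA relative to the long-term EMA, that the EMA uses the smoothing factor 2 / (period + 1), and that each EMA is seeded with the first price. The function names and the synthetic price series are hypothetical.

```python
def ema(prices, period):
    """EMA with smoothing factor 2 / (period + 1), seeded with the first price (assumed)."""
    alpha = 2.0 / (period + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(alpha * price + (1.0 - alpha) * values[-1])
    return values

def ppo(prices, short_period=15, long_period=45):
    """Percentage gap between the short-term and the long-term EMA."""
    short, long_ = ema(prices, short_period), ema(prices, long_period)
    return [100.0 * (s - l) / l for s, l in zip(short, long_)]

def signals(ppo_values):
    """+1 (buy) when PPO > 0, -1 (sell) when PPO < 0, 0 when PPO is exactly 0."""
    return [(v > 0) - (v < 0) for v in ppo_values]

# Synthetic series: a steady uptrend and a steady downtrend.
rising = [100.0 + day for day in range(60)]
falling = [160.0 - day for day in range(60)]
rising_signals = signals(ppo(rising))
falling_signals = signals(ppo(falling))
```

In a steady uptrend the short-term EMA stays above the long-term EMA, so the system remains long; in a steady downtrend the opposite holds.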
When using the PPO trading system, there is a lag between the time when the trend starts and the time when the trading system detects it. Failing to compensate for this lag has been a dominant disadvantage of traditional trading systems (without prediction). An intelligent trading system attempts to predict the PPO in the near future, so as to enter the market before the trend starts and to close the position before the market falls. The input of the intelligent trading system studied in this project is the PPO of the last 8 days, and the output is the PPO of the future 5 days. The intelligent model is either an MLP optimized by GA or an SVM optimized by GA. A transaction cost and slippage of 0.2% are included when calculating profits, as indicated in Fig 3‑4.
Fig 3‑4 A predictive trading system.
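Building the training samples from a PPO series can be sketched as below. This is one possible reading of the setup: it assumes the target is the single PPO value 5 days after the last input day, although the description could also be read as predicting all five future values. The function name and the stand-in series are hypothetical.

```python
def make_samples(ppo_values, n_inputs=8, horizon=5):
    """Pair each window of n_inputs consecutive PPO values with the PPO horizon days later."""
    samples = []
    last_start = len(ppo_values) - n_inputs - horizon
    for start in range(last_start + 1):
        window = ppo_values[start:start + n_inputs]
        # Index of the last input day is start + n_inputs - 1; the target lies horizon days later.
        target = ppo_values[start + n_inputs - 1 + horizon]
        samples.append((window, target))
    return samples

series = list(range(20))  # stand-in for a PPO series, so indices are easy to read
samples = make_samples(series)
```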
In this project, a feedforward MLP with one hidden layer is used. The number of hidden neurons is set to 30 by trial and error. The Levenberg-Marquardt algorithm is used to train the MLP. The initial weights of the neural network are determined by GA. The settings for the NN+GA model are listed in Table 3‑1.
Setting | Value
GA settings |
Population size of GA | 300
Maximum generation | 800
Stop criterion | Maximum generation reached
Probability of mutation | 0.02
Neural network settings |
Layers | Single hidden layer with 30 neurons
Transfer functions | tansig, purelin
Training | Levenberg-Marquardt
Performance | MSE (mean square error)
Table 3‑1 Settings for GA and NN
Using a genetic algorithm to determine the initial weights and biases is essential, since they have a great impact on the generalization ability of the neural network. If the weights and biases are initialized with random numbers that happen to be far away from a good solution, or near a local optimum, the neural network may not be trained to achieve good performance, and being trapped in a local extremum is common. On the other hand, an appropriate initialization puts the weights and biases near a good solution, and hence gives the neural network a high chance of reaching a better outcome.
In this project, the genetic algorithm is chosen to provide the initial weights and biases for the neural network. The structure of using GA to optimize the MLP is shown in Fig 3‑5. The fitness in the GA is based on the error between the desired output and the predicted output, so that individuals with a smaller prediction error are preferred.
The main parameters in SVR are the width of the error-insensitive tube around the regression function [14] and the constant that balances training errors against model complexity. In this project, GA is used to determine the best SVR parameters. The structure of the GA optimized SVR is the same as that of the GA optimized MLP, where GA tries to minimize the difference between the desired output and the predicted output. The settings for the GA optimized SVR model are listed in Table 3‑2.
Fig 3‑5 Structure of GA optimized MLP
Setting | Value
GA settings |
Population size of GA | 30
Maximum generation | 200
Stop criterion | Maximum generation reached
Probability of mutation | 0.05
Probability of crossover | 0.4
SVR settings |
Kernel function | Radial basis function
Table 3‑2 Settings for GA and SVR
Once the appropriate raw input data has been selected (in this case, the PPO of the previous 8 days), it must be preprocessed; otherwise, the neural network will not produce accurate forecasts. The decisions made in this phase of development are critical to the performance of a network.
Normalization is commonly used to distribute the input data evenly and scale it into an acceptable range for the network. Knowledge of the domain is important in choosing preprocessing methods that highlight underlying features in the data, which can increase the network’s ability to learn the association between inputs and outputs.
In normalizing data, the goal is to ensure that the statistical distribution of values for each network input and output is roughly uniform. In addition, the values should be scaled to match the range of the input neurons. This means that along with any other transformations performed on network inputs, each input should be normalized as well.
In this project, the normalization method maps the minimum and maximum values of the training input to -1 and 1. This method assumes that the input has only finite real values, and that the elements are not all equal, as indicated below.
$$y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min}$$
Here $y_{\max}$ is 1 and $y_{\min}$ is -1. $x_{\max}$ is the largest value of the training input, while $x_{\min}$ is the smallest value of the training input. $x$ stands for each individual training value, and $y$ is the normalized training value.
The testing set should be scaled to a certain range as well, in the same way as the training set. However, the largest and smallest values of the testing set are not available, since these data are assumed to be unknown in the trading simulation. Therefore, the testing data set is scaled using the parameters of the training input data. Specifically, $x_{\max}$ and $x_{\min}$ are still the largest and smallest values in the training data set.
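The scaling of both data sets with the training parameters can be sketched as follows. The function names and the sample values are illustrative; the mapping assumes a linear scaling of the training range onto the interval from -1 to 1.

```python
def fit_min_max(training_values):
    """Record the smallest and largest training values."""
    return min(training_values), max(training_values)

def scale(value, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Map [x_min, x_max] linearly onto [y_min, y_max]."""
    return (y_max - y_min) * (value - x_min) / (x_max - x_min) + y_min

training = [2.0, 4.0, 6.0]
testing = [3.0, 8.0]
x_min, x_max = fit_min_max(training)  # parameters come from the training data only
scaled_training = [scale(v, x_min, x_max) for v in training]
scaled_testing = [scale(v, x_min, x_max) for v in testing]
```

Because the testing data reuse the training minimum and maximum, a testing value outside the training range is mapped outside the interval from -1 to 1, as happens for the value 8.0 here.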
Results and Evaluation
This chapter presents the experimental results for the 2 intelligent trading models, which use the GA optimized MLP and the GA optimized SVR. In addition, it introduces some evaluation criteria, and evaluates the prediction models according to these criteria. Furthermore, it analyzes the return on capital and the maximum drawdown, and compares them with another publication as well as with the conventional trading method.
The intelligent trading system uses the Hong Kong Hang Seng Stock Index (HSI) from 1986-12-31 to 1997-1-28, a total of 2500 daily close prices, as the in-sample training data, and uses the HSI from 1997-1-29 to 2007-3-8, a total of 2500 daily prices, as the out-of-sample testing data. All the HSI index data was obtained from Yahoo Finance (https://finance.yahoo.com/q/hp?s=^HSI).
The in-sample data used to train the neural network are separated into three sets: training, validation, and testing. In this project, we divide the input data randomly such that the first 60% of the samples are assigned to the training set, the next 20% to the validation set, and the last 20% to the test set. Table 4‑1 summarizes the distribution of the experimental data.
Data Set | Subset | Distribution (%) | Distribution (data)
Training Data | Training set | 60% | 1500
Training Data | Validation set | 20% | 500
Training Data | Test set | 20% | 500
Training Data | Total | 100% | 2500
Testing Data | | 100% | 2500
Table 4‑1 Data distributions for training and testing neural network
The GA optimized MLP model is used to predict the PPO of the future 5 days. The performance of this predictive model can be evaluated by the mean square error (MSE), which is expressed as
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (t_i - y_i)^2$$
where $t_i$ is the target output, $y_i$ is the predicted output and $N$ is the number of samples.
The forecasting performance in terms of MSE is 0.0087 for the out-of-sample data and 0.00213 for the in-sample data. In both cases the MSE is relatively small, which means the prediction is acceptable. Fig 4‑2 plots the difference between the desired output and the predicted output. Although there are some large prediction errors, most of the forecasts are acceptable.
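The MSE calculation can be sketched as follows; the function name and the sample values are illustrative.

```python
def mean_square_error(targets, predictions):
    """Average of the squared differences between target and predicted outputs."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared_errors) / len(targets)

# Errors of 0, 0.5 and 1 give squared errors of 0, 0.25 and 1.
mse = mean_square_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```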
Fig 4‑1 Training performance of MLP
The training results of the neural network can be further evaluated by linear regression. The best network is indicated by a correlation coefficient r close to unity (r ≈ 1). Fig 4‑4 shows the linear regression for the out-of-sample data, for which r is 0.91227. Although this is nearly 8% lower than for the in-sample data, the model can still be considered a well-trained neural network.
Fig 4‑2 MSE for out of sample data
The PPO of the future 5 days is selected as the desired output after careful consideration. In principle, forecasting over a longer time horizon produces more profit through earlier entry and earlier exit. On the other hand, the longer the time horizon, the harder it is to predict, which increases the chance of wrong predictions and decreases the profit. Table 4‑2 shows the total return of investing 1 dollar for different prediction time horizons.
Prediction time horizon | Total return
No prediction | 2.064
Predict future 3 days PPO | 4.357
Predict future 5 days PPO | 6.910
Predict future 7 days PPO | 5.464
Table 4‑2 Total return for different prediction time horizon
In this experiment, reinvesting all capital is selected as the money management strategy: the trading system reinvests the initial capital and all profits in the next buy or sell decision.
Fig 4‑3 Linear regression for trained neural network
Fig 4‑4 Linear regression for out of sample data
The proposed trading system assumes that it is possible to enter the market at the close price of the same day that triggers the trading signal. In addition, it assumes that the initial capital is 1 dollar and that it is valid to buy or sell fractional units of the HSI. The PPO is calculated using a short-term EMA of 15 days and a long-term EMA of 45 days.
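An equity curve under full reinvestment can be sketched as follows. The cost model is an assumption: a proportional cost of 0.2% is charged once each time the position changes, and how the project applies the cost in detail is not specified. The function name, the prices and the positions are hypothetical.

```python
def equity_curve(prices, positions, cost=0.002, initial_capital=1.0):
    """Track capital when all equity is reinvested in every trade.

    positions[t] is +1 (long), -1 (short) or 0 (flat), held from the close
    of day t to the close of day t + 1. A proportional cost is charged each
    time the position changes (an assumed cost model).
    """
    equity = [initial_capital]
    previous = 0
    for t in range(len(prices) - 1):
        capital = equity[-1]
        if positions[t] != previous:
            capital *= 1.0 - cost
            previous = positions[t]
        daily_return = prices[t + 1] / prices[t] - 1.0
        equity.append(capital * (1.0 + positions[t] * daily_return))
    return equity

prices = [100.0, 110.0, 121.0, 108.9]
positions = [1, 1, -1]  # long for two days, then short for one day
curve_no_cost = equity_curve(prices, positions, cost=0.0)
curve_with_cost = equity_curve(prices, positions)
```

Fractional holdings are implicit here, since the capital is multiplied by the return of the index rather than converted into a whole number of units.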
The equity curves of the proposed intelligent trading systems are shown in Fig 4‑5, together with the equity curve of the conventional trading system and the equity curve of the “buy and hold” trading strategy for comparison. The predictive MLP+GA model achieves 6.71 times the original capital from 1997-1-29 to 2007-3-8, while over the same period the non-predictive trading system achieves only 2.064 for 1 dollar of investment and the “buy and hold” trading strategy generates 1.605 as final capital. In comparison, Huang and Quek et al. [10] use a hierarchical coevolutionary fuzzy system (HiCEFS) to achieve 5.781 times the original capital on the Hang Seng Index over the same trading days.
Fig 4‑5 Equity curve for intelligent and conventional trading systems
Sample testing data is shown in Fig 4‑6 and Fig 4‑7. The predictive trading system clearly enters and exits the market earlier than the trading system without prediction. For example, the NN+GA trading system enters the market on day 61 at a price of 13030 and exits on day 138 at a price of 15600, taking a profit of 2570 points. The trading system without prediction enters the market on day 65 at a price of 13630 and exits on day 144 at a price of 13710, taking a profit of only 80 points. This is why the predictive model performs better than the trading system without prediction. However, using prediction has a disadvantage: when the market is not trending, the proposed trading system may make wrong predictions and suffer losses. Around day 400, for instance, the trading system without prediction holds its position, while the intelligent model makes a wrong prediction and the investment incurs some losses.
Fig 4‑6 Trading signal of NN+GA trading system
Fig 4‑7 Trading signal of conventional trading system
Another important criterion for evaluating a trading system is the maximum drawdown (MDD). MDD is defined as the maximum cumulative loss from a market peak to the following trough [22].
The trading system using NN+GA suffers an MDD from 3.079 dollars to 2.443 dollars, which is 20.65% of the peak capital. In contrast, the trading system without prediction has an MDD from 1.705 dollars to 1 dollar, which is 41.34% of the peak capital. The “buy and hold” strategy suffers an MDD from 1 dollar to 0.466 dollar, a 53.4% drop from the peak capital. Thus the NN+GA trading system reduces the risk involved. As Fig 4‑5 shows, the capital of the conventional trading system without prediction is back to the original 1 dollar after 1276 trading days. Such a long period without net gain may discourage investors from following the system. By contrast, the MDD of the NN+GA trading system occurs between day 903 and day 930, a much shorter period, which makes the system easier for investors to follow.
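The MDD figures above can be reproduced from an equity curve with a short routine. The sketch below is one straightforward way to compute the largest peak-to-trough drop as a fraction of the peak; the function name `max_drawdown` and the short equity curve are illustrative, with only the peak and trough values taken from the text.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough loss, as a fraction of the peak equity."""
    peak = equity[0]
    mdd = 0.0
    for value in equity:
        # Track the running peak, then measure the drop from it.
        peak = max(peak, value)
        mdd = max(mdd, (peak - value) / peak)
    return mdd


# Illustrative curve containing the peak and trough quoted for the NN+GA system.
curve = [1.0, 3.079, 2.443, 6.71]
print(f"MDD: {100 * max_drawdown(curve):.2f}%")
```

With the rounded peak and trough values quoted above, the routine gives about 20.7%, which agrees with the reported 20.65% up to rounding of the equity values.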
All the trading records are listed in appendix A.
The performance of forecasting the future 5 days PPO using GA optimized SVR is evaluated in terms of MSE. The MSE is 0.0058 for out of sample data, whereas the GA optimized NN model gives an MSE of 0.0087 on the same data. In other words, GA optimized SVR has the smaller MSE and therefore the better forecasting accuracy. Fig 4‑8 plots the difference between the desired output and the predicted output.
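For reference, the MSE used in this comparison is the average of the squared differences between desired and predicted outputs. A minimal sketch follows; the function name and the sample values are hypothetical and are not taken from the experiments.

```python
from typing import Sequence


def mean_squared_error(desired: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between desired and predicted outputs."""
    if len(desired) != len(predicted) or len(desired) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((d - p) ** 2 for d, p in zip(desired, predicted)) / len(desired)


# Hypothetical desired and predicted PPO values (illustration only).
desired_ppo = [0.10, 0.20, 0.15, 0.05]
predicted_ppo = [0.12, 0.18, 0.10, 0.05]
print(f"MSE: {mean_squared_error(desired_ppo, predicted_ppo):.6f}")
```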
However, better forecasting does not guarantee better profitability. A wrong prediction at a market top or bottom brings larger losses than a wrong prediction in other situations.
The same assumptions are made as for the GA+NN trading system, and the PPO is again formed from the 15-day EMA and the 45-day EMA. The equity curve of the GA+SVR trading system is shown in Fig 4‑9, with the equity curve of the conventional trading system for contrast. The GA+SVR trading system achieves 5.705 times the original capital.
Fig 4‑8 MSE for GA+SVR model
Although this predictive model does not achieve as much profit as the GA+NN model, it has its own advantages. First, the SVR model provides consistent performance after each training session. Second, in terms of prediction accuracy, the GA+SVR model has smaller prediction errors than the GA+NN model. Last, it trades less frequently than the GA+NN model, which gives investors a choice of trading system to suit their style. The GA+NN model could be more suitable for active traders, while less active investors could adopt the GA+SVR model.
The GA+NN trading system, the GA+SVR trading system, the conventional trading system and the “buy and hold” strategy are compared in Table 4‑3, and the equity curves of the four trading systems are plotted together in Fig 4‑10.
Fig 4‑9 Equity curve for GA+SVR trading system and conventional trading system
Trading system | Final equity | MDD | Win ratio | Number of trades | Long positions | Short positions
GA+NN trading system | 6.71 | 20.65% | 49.6% | 127 | 63 | 64
GA+SVR trading system | 5.705 | 28.5% | 44.8% | 67 | 30 | 37
Conventional trading system | 2.064 | 41.34% | 48.7% | 41 | 20 | 21
Buy and hold strategy | 1.605 | 53.4% | 100% | 1 | 1 | 0
Table 4‑3 Trading performance comparison
Fig 4‑10 Comparison of 4 trading systems
In designing a trading system, one of the most important issues is to avoid over curve fitting the system to the back testing data. The more a system is bent around past data to improve its historical performance, the less likely it is to trade profitably in the future. Past performance approximates future performance only to the extent that the system is not over curve fitted. There are several ways to guard against the over curve fitting trap. One way is to back test over a sufficiently long period: the longer the historical period over which a system trades profitably, the more robust it is. Another effective way is to make sure the system works in many markets using the same parameters. Hence, the trading system is further evaluated by applying it to the Dow Jones Transportation Index (DJT).
The in sample training data runs from 1968920 to 197895, 2500 trading days in total, and the out of sample testing data runs from 197896 to 1988726, also 2500 trading days. All data is from Yahoo Finance (https://finance.yahoo.com/q?s=^DJT). The assumptions are the same as for trading the HSI with the intelligent trading systems. The equity curves of the GA+NN trading system and the GA+SVR trading system are shown in Fig 4‑11, with the equity curve of the conventional trading system for contrast. The GA+SVR trading system achieves 5.168 times the original capital and the predictive GA+NN model achieves 4.87 times the original capital, while over the same period the nonpredictive trading system achieves only 2.805 for a 1 dollar investment.
Fig 4‑11 Equity curves of different trading system on DJT
The GA+NN trading system and the GA+SVR trading system again outperform the conventional trading system on DJT. This further supports the claim that computational intelligence can enhance the performance of a conventional trading system. In addition, the proposed intelligent trading systems, whether using GA+NN or GA+SVR, remain profitable in different markets, namely DJT and HSI, when the same parameters are used.
Conclusion
In this project, a predictive trading system is proposed and tested on real market data of the Hong Kong Hang Seng Index, with the Dow Jones Transportation Index used for cross market validation. A neural network optimized by GA and support vector regression optimized by GA are implemented as the predictive models in the trading system. The trading rules are mainly based on a technical indicator, the percentage price oscillator (PPO). The predictive model uses the PPO of the last 8 days as input to predict the PPO of the future 5 days, and trading decisions are made on the predicted PPO.
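The input and output arrangement described above can be sketched as a sliding window over the PPO series. This sketch assumes that "future 5 days PPO" means the single PPO value 5 trading days ahead; if the model instead predicts all five intervening values, the target would be a vector. The function name `make_windows` and its parameters are illustrative.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], n_inputs: int = 8, horizon: int = 5
) -> Tuple[List[List[float]], List[float]]:
    """Build (input, target) pairs from a PPO series.

    Input:  the n_inputs most recent values up to and including day t.
    Target: the value at day t + horizon (assumption: a single value).
    """
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_inputs - 1, len(series) - horizon):
        inputs.append(list(series[t - n_inputs + 1 : t + 1]))
        targets.append(series[t + horizon])
    return inputs, targets


# Synthetic series standing in for daily PPO values (illustration only).
ppo_series = [float(i) for i in range(20)]
X, y = make_windows(ppo_series)
print(len(X), X[0], y[0])
```

Each pair can then be fed to a regression model such as an MLP or an SVR, with the first 2500 days used for training and the following 2500 days held out for testing, as in the experiments above.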
The testing period is 10 years, which is long enough to reduce the possibility of curve fitting. The proposed predictive trading system produces a final equity around 3 times that of the conventional trading system without prediction on HSI, and around 2 times that of the nonpredictive trading system on DJT.
Despite the promising profits generated by the trading system, further improvements can be considered as future research, such as applying the system to emerging markets like the China stock market, or adopting a better money management strategy. Furthermore, due to the randomness introduced by GA, the neural network may not be trained well enough every time. We shall study effective ways to assure reasonable performance in each training session.
[1] E. F. Fama, “The Behavior of Stock-Market Prices,” Journal of Business, vol. 38, pp. 34–105, 1965.
[2] A. P. N. Refenes, A. N. Burgess, and Y. Bentz, “Neural networks in financial engineering: A study in methodology,” IEEE Transactions on Neural Networks, vol. 8, no. 6, pp. 1222–1267, 1997.
[3] K.-Y. Chen and C.-H. Ho, “An Improved Support Vector Regression Modeling for Taiwan Stock Exchange Market Weighted Index Forecasting,” in International Conference on Neural Networks and Brain (ICNN&B ’05), vol. 3, 2005.
[4] L. Cao and F. Tay, “Support Vector Machine with adaptive parameters in financial time series forecasting,” IEEE Transactions on Neural Networks, vol. 14, no. 6, pp. 1506–1518, 2003.
[5] P. B. Patel and T. Marwala, “Forecasting closing price indices using neural networks,” in International Conference on Systems, Man and Cybernetics, Taipei, Taiwan, Oct. 8–11, 2006, pp. 2351–2356.
[6] S. H. Lee, H. J. Kim, and J. S. Lim, “Forecasting short term KOSPI time series based on NEWFM,” in Advanced Language Processing and Web Information Technology (ALPIT), July 2007, pp. 303–307.
[7] B. Doeksen, A. Abraham, J. Thomas, and M. Paprzycki, “Real stock trading using soft computing models,” in Information Technology: Coding and Computing (ITCC), 2005, vol. 2, pp. 162–167.
[8] A. S. Chen, M. T. Leung, and H. Daouk, “Application of neural networks to an emerging financial market: Forecasting and trading the Taiwan Stock Index,” Computers and Operations Research, vol. 30, no. 6, pp. 901–923, May 2003.
[9] K. K. Ang and C. Quek, “Stock Trading Using RSPOP: A Novel Rough Set-Based Neuro-Fuzzy Approach,” IEEE Transactions on Neural Networks, vol. 17, no. 5, pp. 1301–1315, 2006.
[10] H. M. Huang, M. Pasquier, and C. Quek, “Financial Market Trading System With a Hierarchical Coevolutionary Fuzzy Predictive Model,” IEEE Transactions on Evolutionary Computation, vol. 13, no. 1, pp. 56–70, 2009.
[11] H. Bandy, Quantitative Trading Systems. Blue Owl Press, 2007.
[12] B. Kröse and P. van der Smagt, An Introduction to Neural Networks. The University of Amsterdam, 1996.
[13] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, p. 578.
[14] A. E. Bryson and Y.-C. Ho, Applied Optimal Control: Optimization, Estimation, and Control. Xerox College Publishing, p. 481.
[15] P. N. Bahrun and M. N. Taib, “Selected Malaysia Stock Predictions using Artificial Neural Network,” in International Colloquium on Signal Processing & Its Applications (CSPA), 2009, pp. 428–431.
[16] L. Wang (ed.), Support Vector Machines: Theory and Applications. Berlin: Springer, 2005.
[17] C. W. Hsu, C. C. Chang, and C. J. Lin, A Practical Guide to Support Vector Classification, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, 2003. [Online]. Available: https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
[18] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik, “Support Vector Regression Machines,” in Advances in Neural Information Processing Systems 9 (NIPS 1996), MIT Press, pp. 155–161.
[19] A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,” NeuroCOLT2 Technical Report NC2-TR-1998-030, 2003.
[20] J. J. Murphy, Technical Analysis of the Financial Markets. New York Institute of Finance, 1999, pages 15,2431.
[21] M. Magdon-Ismail and A. Atiya, “Maximum Drawdown,” Risk Magazine, vol. 17, no. 10, pp. 99–102, Oct. 2004.
[22] M. Magdon-Ismail, A. Atiya, A. Pratap, and Y. Abu-Mostafa, “On the Maximum Drawdown of a Brownian Motion,” Journal of Applied Probability, vol. 41, no. 1, pp. 147–161, Mar. 2004.
Appendix
The trading details of the NN+GA trading system on HSI are listed below. The exit price and the entry price on the same day may differ because slippage and commissions are taken into account.