FINDING TECHNICAL TRADING RULES IN HIGH-FREQUENCY DATA BY USING
GENETIC PROGRAMMING
by
Cheng Liu
A Thesis Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
MASTER OF SCIENCE
(APPLIED MATHEMATICS)
August 2014
Copyright 2014 Cheng Liu
To my dear parents - for their forever true love and faith to me
To my close friends and professors for support
Paul, Cheng Liu
Table of Contents
Abstract
Introduction
Genetic Algorithms (GA)
  Genetic Programming
Applying GP to Finding Trading Rules
  Definition of structures of trading rules
  Mathematical Models
  Implementation of Genetic Programming
  Data and Numerical Simulations
  Analysis of Results
  Further Discussion
Summary
References
Abstract
I use genetic programming to find technical trading rules for the S&P 500 index, using one-minute
high-frequency intraday data covering about one and a half years. The model also allows short
selling. With zero or very low transaction fees, the model finds several rules that provide a
positive excess return, i.e. a return above that of the passive buy-and-hold strategy. When the
transaction cost is high enough, however, no rule generates a positive excess return, and as the
cost grows it becomes very hard to apply the model to high-frequency data.
Keywords: Genetic Programming; Tree; High Frequency; Technical Trading Rules;
Excess Return; S&P 500 index
Introduction
In stock-market technical analysis, demand is growing rapidly for real-time analysis and for
short-term response to minute-level or even second-level financial data, and trading rules are
now sought with faster methods of analysis. Genetic algorithms are widely regarded as a useful
technique for finding trading rules; this thesis applies them, with some modifications, to a
minute-based high-frequency data set so that the search keeps up with this 'swift' setting.
Technical analysis evaluates securities by analyzing market statistics such as past prices and
volume, and uses historical price movements to forecast future price trends. Although this
approach does not measure the intrinsic value of a security or other asset, it is still widely
used by investment professionals, from hedge funds to individual investors, for making trading
decisions. Despite its long history, technical analysis and its claims have traditionally been
regarded with suspicion. Under the well-known Efficient Market Hypothesis (EMH), technical
analysis is useless if the market is completely efficient. However, with accumulating evidence
that financial markets may be less efficient than was once believed, academic and industrial
interest in forecasting techniques has revived.
Research on technical trading rules has moved back and forth. In the 1960s and early 1970s,
researchers such as (Alexander, 1961) and (Fama, 1966) concluded, from case studies of filter
rules on Dow Jones and Standard & Poor's stocks, that technical trading rules are not
profitable. It was not until the 1990s that researchers such as (Brock, 1992) brought technical
trading rules back onto the stage.
The analysis in this thesis is carried out for the Standard & Poor's composite index (S&P 500).
Although price data available before the test have already been used, this thesis digs deeper
into minute-based data, motivated by real-time analysis and by the growing popularity of
high-frequency financial research. With the development of electronic trading, daily data are no
longer adequate for institutional investors who exploit every trading-related advantage.
High-frequency data imply not only a large amount of data but also short periods of testing and
data collection, which help in finding trading rules in such a 'swift' scenario. In addition,
short selling became easier after the old uptick rule expired and was replaced by the
alternative uptick rule. This change marks a revival of short selling and makes it possible to
take short selling into account in algorithmic trading. Based on genetic algorithms, we can test
whether there is an effective way to find profitable trading rules.
Genetic Algorithms (GA)
A genetic algorithm (GA) is a stochastic global search method that mimics natural
evolution. It applies a survival principle to select the best individuals among the offspring
generated from randomly chosen parents. GAs work on a group, or population, of potential
solutions and search for a local optimum that hopefully approximates the ideal solution. In each
generation, new offspring are created by selecting individuals according to their fitness and by
recombining outstanding parents. This process drives the evolution of populations that are
better adapted to their environment; because it mimics natural adaptation, the method is called
a genetic algorithm. The simple genetic algorithm (SGA) given by (Goldberg, 1989) serves as the
basic component of the GA. The following is a pseudo-code outline of the SGA:
Procedure GA
Begin
    t = 0;
    Initialize P(t);
    Evaluate P(t);
    While not finished
    Begin
        t = t + 1;
        Select P(t) from P(t-1);
        Reproduce pairs in P(t);
        Evaluate P(t);
    End
End
Table 1: Simple Genetic Algorithm
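The outline in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in this thesis: individuals are fixed-length bit strings, selection is a binary tournament, reproduction is one-point crossover followed by bit-flip mutation, and the function name `simple_ga` and all of its parameters are hypothetical.

```python
import random

def simple_ga(fitness, n_bits=16, pop_size=50, generations=40,
              p_mutation=0.01, seed=0):
    """Minimal simple GA on fixed-length bit strings (illustrative only)."""
    rng = random.Random(seed)
    # t = 0: initialize and evaluate P(0)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick(pop):
        # Select from P(t-1): binary tournament on fitness (an assumption)
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):              # "While not finished"
        offspring = []
        while len(offspring) < pop_size:
            # Reproduce pairs: one-point crossover, then bit-flip mutation
            p1, p2 = pick(population), pick(population)
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mutation else g for g in child]
            offspring.append(child)
        population = offspring
        # Evaluate P(t) and remember the best individual seen so far
        best = max(population + [best], key=fitness)
    return best

# Toy problem ("OneMax"): fitness is the number of 1 bits.
best = simple_ga(fitness=sum)
```

The toy fitness function only exercises the loop; in the trading application the fitness of an individual is the excess return of the rule it encodes.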
Genetic Programming
GAs operate on a number of potential solutions, i.e. a population, which is usually
composed of 100-500 individuals depending on the scale of the problem. Individuals can be
represented in several ways, for example as binary strings. For technical trading rules,
however, a real-valued representation is not enough, since trading rules include both real and
Boolean functions. Instead, the population can be represented by tree-like structures: the
successors of each internal node provide the arguments for that node's function, while the
terminal nodes, i.e. nodes without successors, correspond to input values. The entire structure
is evaluated recursively, starting from the root node of the tree. This method, developed by
(Koza, 1992) as an extension of traditional genetic algorithms, is called genetic programming
(GP). It removes the fixed-length restriction of traditional genetic representations. The
function set is chosen according to the specific problem. Genetic programming satisfies a
closure property, which guarantees that every possible recombination of subtrees is well
defined. Furthermore, genetic programming preserves the structure of the population. The initial
population is built from random trees: the root node is chosen randomly from the functions of
the appropriate type, and each argument is in turn selected from the category that is legal for
its function.
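The recursive evaluation of a rule tree can be sketched as follows. This is a minimal illustration, assuming trees are written as nested tuples `(function, child, child, ...)`, variables as strings and constants as numbers; the function set `FUNCTIONS`, the variable names and the sample rule are hypothetical and much smaller than the set used in this thesis.

```python
import operator

# Hypothetical function set: name -> callable applied to evaluated arguments.
FUNCTIONS = {
    "+": operator.add,
    "*": operator.mul,
    "max": max,
    "min": min,
    "<": operator.lt,
    ">": operator.gt,
    "and": lambda a, b: a and b,
}

def evaluate(node, env):
    """Evaluate a rule tree recursively, starting from its root node."""
    if isinstance(node, str):            # terminal: input variable
        return env[node]
    if isinstance(node, tuple):          # internal node: function of successors
        name, *children = node
        return FUNCTIONS[name](*(evaluate(child, env) for child in children))
    return node                          # terminal: numeric or Boolean constant

# Made-up rule: price < max(average, 1.0046) and price > 0.99 * average
rule = ("and",
        ("<", "price", ("max", "average", 1.0046)),
        (">", "price", ("*", 0.99, "average")))
signal = evaluate(rule, {"price": 1.002, "average": 1.001})
```

Because the root of a trading rule is a Boolean function, the value returned for the root is the buy or sell signal.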
In genetic programming, there are two very important concepts.
Definition
Crossover: Crossover is applied to an individual by switching one of its nodes with
a node from another individual in the population.
In genetic programming, the crossover operator recombines two individuals by replacing a
randomly chosen subtree. The operator is guaranteed to choose a node of the same type, according
to the functions on the nodes, so that the structure is maintained.
Definition
Mutation: Mutation affects a single individual in the population; it can replace a whole node in
the selected individual, or just the node's information.
Mutation is implemented by generating a random tree that replaces the second parent, which
provides diversity in the population. Mutation is applied with a fixed probability.
(Koza, 1992) showed the effectiveness of genetic programming: there is a more than 99%
probability of finding the correct solution by searching fewer than 160,000 individuals in a
population of $2^{64}$. This is why the method is attractive for the search for optimal trading
rules.
The following illustrates the recombination, or crossover, process:
Parent 1: and(<(average, price), >(price, max))
Parent 2: and(<(min, price), +(average, max))
Offspring: and(<(average, price), <(min, price))
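A type-preserving crossover of this kind can be sketched in Python. This is a hedged illustration, not the operator used in this thesis: trees are nested tuples, every node is typed as either Boolean or real by the hypothetical helper `node_type`, and the two parents are made-up rules rather than the ones in the figure.

```python
import random

BOOL_OPS = {"and", "or", "<", ">"}       # functions that return a Boolean

def node_type(node):
    """'bool' for Boolean-valued nodes, 'real' for everything else."""
    if isinstance(node, bool):
        return "bool"
    if isinstance(node, tuple) and node[0] in BOOL_OPS:
        return "bool"
    return "real"

def subtrees(node, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, node
    if isinstance(node, tuple):
        for i, child in enumerate(node[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(node, path, new):
    """Return a copy of `node` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return node[:i] + (replace(node[i], path[1:], new),) + node[i + 1:]

def crossover(parent1, parent2, rng):
    """Swap a random subtree of parent1 for a same-type subtree of parent2."""
    path, old = rng.choice(list(subtrees(parent1)))
    donors = [s for _, s in subtrees(parent2) if node_type(s) == node_type(old)]
    if not donors:                       # no compatible subtree: keep parent1
        return parent1
    return replace(parent1, path, rng.choice(donors))

p1 = ("and", ("<", "average", "price"), (">", "price", "max"))
p2 = ("and", ("<", "min", "price"), (">", "average", "max"))
child = crossover(p1, p2, random.Random(1))
```

Restricting the donor subtree to the type of the subtree it replaces is what keeps every offspring a well-defined rule, which is the closure property described above.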
Applying GP to Finding Trading Rules
In this thesis, I use a genetic algorithm to find technical trading rules, mainly based on
moving averages (MA), for the S&P 500. The goal of the algorithm is to find a criterion for
making decisions in the stock market. Each individual, or chromosome, in the genetic algorithm
represents a randomly generated technical trading rule at the start of the algorithm. The most
important design choice is therefore the set of building blocks of the trading rules, which
serve as the genomes of the algorithm.
The chromosomes are trading-rule structures built from past prices, numerical or Boolean values,
logical functions and all their possible combinations. Restricting rules to these building
blocks guarantees that the trading strategy is well defined.
Definition of structures of trading rules
The function set contains real and Boolean functions. Among the real functions, we define the
moving average (MA), which is a function of time and past prices, as well as maximum and minimum
functions over time and a norm function. The real-valued operators +, −, ∗, ÷ are also needed.
The Boolean functions include the logical functions and, or, if-else and not, together with
≤, ≥, true and false.
Category             Members
Real Functions       MA, numerical values, Max, Min, norm, lag
Real Operations      +, −, ∗, ÷
Boolean Functions    ≤, ≥, true, false, and, if-else
Table 2. Category of functions in structure of trading rules
Mathematical Models
The goal is to earn an excess return over the passive buy-and-hold strategy and over the
risk-free market return. The question this thesis investigates is whether there is any trading
rule, built from real and Boolean functions, that provides a stable excess return in the stock
market, in particular for the S&P 500 composite index. Begin with a single trade: let $p_b$ and
$p_s$ be the buying and selling prices respectively, let $c$ be the one-way transaction fee and
let $r$ be the return of this single trade. Then
$$r = \frac{(1-c)\,p_s}{(1+c)\,p_b} - 1 = e^{\log\frac{p_s}{p_b} + \log\frac{1-c}{1+c}} - 1$$
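As a quick numerical check of this identity, the sketch below computes the return of one round trip in both forms. The prices and the fee are made-up values and the function names are hypothetical.

```python
import math

def trade_return(p_buy, p_sell, c):
    """Return of one round trip with a proportional one-way fee c."""
    return (1 - c) * p_sell / ((1 + c) * p_buy) - 1

def trade_return_log_form(p_buy, p_sell, c):
    """Same quantity via the log decomposition: price move plus cost term."""
    return math.exp(math.log(p_sell / p_buy) + math.log((1 - c) / (1 + c))) - 1

# Made-up trade: buy at 100, sell at 101, one-way fee of 0.1%.
r = trade_return(p_buy=100.0, p_sell=101.0, c=0.001)
r_log = trade_return_log_form(p_buy=100.0, p_sell=101.0, c=0.001)
```

With these numbers the 1% price move shrinks to a return of roughly 0.8% after fees. Writing the fee as the additive term $\log\frac{1-c}{1+c}$ is what lets the cost of many trades be summed in log space.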
We consider two market signals, bullish and bearish. If the forecast is bullish, we go long and
close any short position; similarly, if the forecast is bearish, we go short and close any long
position. Short selling is allowed in the US market, and it is supported by the alternative
uptick rule approved by the SEC after Feb 24, 2010, which replaces the old uptick rule. This
makes short selling easy to implement, especially for a very liquid composite stock index such
as the S&P 500, since a composite index is very unlikely to fall by more than 10% in a minute,
a day or even a month. Hence there are four trading signals available: buy-to-open,
buy-to-close, sell-to-open and sell-to-close.
For time $t$, let $p_t$ be the close price of the $t$-th minute. If the position is long, then
$$r_l(t) = \log\frac{p_t}{p_{t-1}}$$
is the continuously compounded return per minute at $t$. Likewise, let $r_s(t)$ be the return of
a short position at time $t$:
$$r_s(t) = \log\frac{p_{t-1}}{p_t}$$
Since a rule cannot signal buy and sell at the same time, we define two indicator functions,
$I_b(t)$ and $I_s(t)$, which equal 1 if the rule signals buy or sell respectively. It follows
that $I_b(t) \times I_s(t) = 0 \ \forall t$. If $I_b(t) = 1$, we buy-to-close the previous
sell-to-open position and buy-to-open; similarly, if $I_s(t) = 1$, we sell-to-close the previous
buy-to-open position and sell-to-open. The continuously compounded return of a set of trades is
therefore
$$r = \sum_{t=t_1}^{T} r_l(t)\,I_b(t) + \sum_{t=t_1}^{T} r_s(t)\,I_s(t) + 2N\log\frac{1-c}{1+c}$$
where $N$ is the total number of trade points and $T$ is the last time.
For the passive strategy, which buys and then holds until the last day, the same method gives
$$r_{passive} = \log\frac{p_T}{p_{t_1}} + \log\frac{1-c}{1+c}$$
where $p_T$ is the close price at the last time and $p_{t_1}$ is the close price at the initial
time.
Therefore, the excess return is
$$r_{excess} = r - r_{passive}$$
and the final return $R$ is
$$R = e^{r_{excess}} - 1$$
Since $e^{x} - 1$ is monotonically increasing and $r_{passive}$ does not depend on the rule,
maximizing $R$ is equivalent to maximizing $r$, so we can work directly with $r$ for convenience.
The problem thus becomes an optimization problem, in very brief form:
$$\text{maximize } R \quad \text{subject to } N,\ c,\ p_t \quad \forall t \in (t_1, T)$$
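The return definitions above can be put together in a short Python sketch. Several points are assumptions made only for illustration: the rule is always either long or short, a trade point is counted each time the position is reversed, the factor 2 on the cost term follows the equation for $r$ as printed, and the prices, flags and function names are made up.

```python
import math

def rule_log_return(prices, long_flags, c):
    """Continuously compounded return r of an always-invested rule.

    prices[t]     : close price of minute t
    long_flags[t] : True if the rule signals buy for minute t (I_b = 1),
                    False if it signals sell (I_s = 1); index 0 is unused
    """
    r = 0.0
    n_trade_points = 0
    for t in range(1, len(prices)):
        step = math.log(prices[t] / prices[t - 1])      # r_l(t)
        r += step if long_flags[t] else -step           # r_s(t) = -r_l(t)
        if t > 1 and long_flags[t] != long_flags[t - 1]:
            n_trade_points += 1                         # position is reversed
    # Cost term, with the factor 2 taken from the equation for r above.
    return r + 2 * n_trade_points * math.log((1 - c) / (1 + c))

def passive_log_return(prices, c):
    """Buy at the first close and sell at the last close."""
    return math.log(prices[-1] / prices[0]) + math.log((1 - c) / (1 + c))

prices = [100.0, 101.0, 100.0, 102.0]     # made-up closes
flags = [None, True, False, True]         # long, then short, then long
r = rule_log_return(prices, flags, c=0.0)
r_excess = r - passive_log_return(prices, c=0.0)
R = math.exp(r_excess) - 1
```

In this toy path the rule is short during the only falling minute, so with zero fees it beats buy-and-hold by a factor of 1.01 × 1.01, i.e. R is about 2.01%. Raising `c` subtracts a fixed amount per trade point, which is why frequent reversals become expensive.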
To solve this problem, we search through trading rules for a local optimum, because this is
clearly a nonlinear optimization problem that is not even convex. Nor can we enumerate all
trading rules, since without restrictions there are infinitely many. Genetic programming is
therefore well suited to this problem.
Implementation of Genetic Programming
According to (Sweeney, 1988), institutional investors could obtain one-way transaction
fees of 0.1% to 0.2%, and traders on brokerage trading floors could get even lower fees. In the
21st century, with the development of high-frequency trading and a much more competitive
brokerage environment, the transaction fee for large trades has decreased significantly. Most
prime brokerages, such as Interactive Brokers and TD Ameritrade, provide fixed trading rates such
as $0.40 per 100 shares for hedge fund companies. For SPY, that is in the range of
0.001% - 0.002%. According to (Brogaard, 2014), the mean transaction cost for the FTSE 250 is
0.15% per day with 102 stocks and 9,129 million shares. I test all of these rates, as well as
zero transaction fee. Since the intraday data are one-minute S&P 500 data, i.e. high-frequency
financial data, the transaction cost significantly influences the results when the number of
trades is very large, because the minute-to-minute variation of a stock index is small.
Sometimes the profit from minute-level volatility, without leverage, cannot cover the
transaction fee.
For each trial, an initial population of trading rules is first generated randomly. The rules
are applied to the one-minute S&P 500 data in a pre-determined training period. The genetic
algorithm then creates a new generation of rules by recombining parent rules. The best rule,
i.e. the rule that provides the maximum excess return, is applied to the selection period, which
serves to validate the inferred trading rules. If the best rule of the new generation provides a
higher excess return than the previous best, it is saved. If there is no improvement in the
selection period for a pre-determined number of generations, the evolution is terminated. The
best rule is then used in the test period. If no rule can beat the passive strategy, or even the
risk-free return, the trial fails and a new trial starts.
The following is the process of the genetic algorithm for one trial in finding trading rules¹:
One trial of the genetic programming to find trading rules
Step 1
  Create a random rule.
  Compute the excess return of this rule in the training period.
  Repeat 500 times to form the initial population.
Step 2
  Apply the rule with the highest excess return to the selection period and compute its
  excess return.
  Save it as the initial best rule.
Step 3
  Pick two rules at random from among the best rules as parents.
  Crossover: create a new rule by breaking the parents apart randomly and recombining the parts.
  Mutation: change parts of the offspring randomly.
  Compute the excess return of the new rule and save it in place of an old one.
  Repeat 500 times.
  This is one generation.
Step 4
  Compute the excess return of the best rule in the selection period. If the return improves,
  save the new rule as the best one. Otherwise go back to Step 3.
  If there is no improvement for 30 generations, or after 50 generations in total, stop the trial.
If a rule is generated by this process, it is applied to the test period, which is fixed so that
results are easy to compare.
¹ I modified the code for my model and data with the permission of (Allen, 1999). All code files
and log files are available on request; please email liu210@usc.edu
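The four steps of one trial can be sketched as a generic loop. This is a hedged illustration rather than the code used in this thesis: the four callables are placeholders supplied by the caller, parents are drawn from the better half of the population (the text only says they are picked among the best rules), and the toy usage at the end replaces trading rules by plain numbers.

```python
import random

def run_trial(random_rule, breed, train_fitness, select_fitness, rng,
              pop_size=500, patience=30, max_generations=50):
    """One trial: evolve on the training period, validate on the selection period.

    random_rule(rng) -> rule; breed(parent1, parent2, rng) -> rule (crossover
    plus mutation); train_fitness(rule) and select_fitness(rule) -> excess return.
    """
    # Step 1: random initial population, scored on the training period
    population = [random_rule(rng) for _ in range(pop_size)]
    # Step 2: the best training rule becomes the initial best rule
    best_rule = max(population, key=train_fitness)
    best_score = select_fitness(best_rule)
    stale = 0
    for _ in range(max_generations):
        # Step 3: breed one generation from the better half (an assumption)
        ranked = sorted(population, key=train_fitness, reverse=True)
        parents = ranked[:max(2, pop_size // 2)]
        population = [breed(*rng.sample(parents, 2), rng)
                      for _ in range(pop_size)]
        # Step 4: keep the generation's champion only if it improves the
        # excess return on the selection period
        champion = max(population, key=train_fitness)
        score = select_fitness(champion)
        if score > best_score:
            best_rule, best_score, stale = champion, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_rule, best_score

# Toy usage: "rules" are numbers and both fitness functions are made up.
rule, score = run_trial(
    random_rule=lambda rng: rng.uniform(-5, 5),
    breed=lambda a, b, rng: (a + b) / 2 + rng.gauss(0, 0.1),
    train_fitness=lambda x: -(x - 1.0) ** 2,
    select_fitness=lambda x: -(x - 1.1) ** 2,
    rng=random.Random(0), pop_size=40, patience=5, max_generations=20)
```

Scoring candidates on the training period but accepting them only on the selection period is what guards against rules that merely fit the training data.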
Data and Numerical Simulations
I use one-minute data for the S&P 500 index from July 1, 2012 to April 20, 2014, provided
by Wharton Research Data Services (WRDS). This period represents the post-crisis era of U.S.
stocks.
Figure 1. The trend graph of SPY from July 1st, 2012 to April 20th, 2014
Since the price of SPY varies from 130 to 190, I use the first day as the base to normalize the
price. Consider the 365-minute moving average, MA(365): prices from the 366th minute to the last
time are divided by MA(365). The normalized price is then around 1, and hence the usable time
covers the whole timeline.
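One way to read this normalization is sketched below. It is an interpretation, not the thesis code: MA(365) is taken here as the rolling mean of the 365 prices preceding each minute, which is what keeps the normalized price near 1 over the whole sample, and the function name and sample prices are hypothetical.

```python
def normalize_by_moving_average(prices, window=365):
    """Divide each price from index `window` onward by the mean of the
    `window` prices that precede it (rolling window, an assumption)."""
    normalized = []
    running_sum = sum(prices[:window])
    for t in range(window, len(prices)):
        normalized.append(prices[t] / (running_sum / window))
        # Slide the window forward by one minute
        running_sum += prices[t] - prices[t - window]
    return normalized

# Small made-up example with a 3-minute window instead of 365.
demo = normalize_by_moving_average([100.0, 100.0, 100.0, 101.0, 102.0], window=3)
```

The first `window` prices serve only to seed the average, so the normalized series starts at the 366th minute when `window=365`, as in the text.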
I set the transaction fee to 0, 0.0005% and 0.001%. For each setting, there were 25
independent trials based on 5 different training periods, each followed by a selection period.
Each training period lasts one month and each selection period lasts one week. For each pair of
training and selection periods, I ran 5 independent trials. After finding the rules, I tested
them all in the same test period, which lasts one year.
Analysis of Results
The results differ considerably across transaction fees.
When $c = 0\%$, i.e. trading is free of cost, the model finds many rules that provide a large
excess return, and the trading frequency is very high.
Table of Results for 1st training period, c = 0%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.3668    0.3668    0.3668    0.3659    0.3740    0.3681
Return       0.2009    0.7331    0.7331    0.7331    0.7315    0.7456    0.7353
Trades/Day   0         185.74    185.74    185.74    185.72    185.78    185.744
Table of Results for 2nd training period, c = 0%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.3668    0.3949    0.1172    0.3728    0.3293    0.3162
Return       0.2009    0.7331    0.7825    0.3503    0.7435    0.6693    0.6476
Trades/Day   0         185.74    139.12    143.38    185.74    115.50    153.896
Table of Results for 3rd training period, c = 0%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.4264    0.1299    -0.0031   0.1340    0.1233    0.1621
Return       0.2009    0.8395    0.3675    0.1972    0.3731    0.3585    0.4123
Trades/Day   0         115.58    62.14     128       31.68     126.44    92.788
Table of Results for 4th training period, c = 0%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.3466    0.3659    0.3467    -0.1713   -0.0804   0.1615
Return       0.2009    0.6984    0.7315    0.6986    0.0119    0.1082    0.4114
Trades/Day   0         156.36    185.72    156.38    75.76     77.84     130.412
Table of Results for 5th training period, c = 0%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.3411    0.3408    0.1821    0.3389    0.3421    0.3090
Return       0.2009    0.6891    0.6886    0.4408    0.6854    0.6908    0.6357
Trades/Day   0         137.2     137.2     146.54    137.22    137.22    139.076
The results above show that when trading is free of cost, the trading frequency is very high
and most rules provide an excess return of more than 30%, i.e. a total return of roughly 60% or
more. Rules that lose to the passive strategy are rare. However, a zero transaction fee is not
realistic outside paper trading.
In addition, the genetic programming run outputs the rules to log files. Here is a simple example:
Id = 1388, fitness = 0.0057, birth = 2, time = 1140517.22:49:14 (parents: 475 and 1224)
0 Boolean >
1 Real maximum
2 Variable price
3 Real constant 1.0046
4 Real data
5 Variable price
This rule is equivalent to
$$data(price) < \max(price, 1.0046)$$
that is, it tests whether the current price is less than the larger of 1.0046 and the current
price, where the price in the rule is the normalized price.
The rule is presented below in the form of a tree:
>
  maximum
    price
    1.0046
  data
    price
Figure 2. The trading rule shown in the form of a tree.
Figure 3. Trading rule and normalized price
The figure above shows when trades are executed. The red line is the trading rule: while the
price is below the red line, the position stays long; when the price rises above the red line,
the position turns short.
Some rules, however, are much more complicated:
Id = 1258, fitness = 0.0057, birth = 2, time = 1140517.22:46:33 (parents: 977 and 718)
0 Boolean <
1 Real data
2 Variable price
3 Real moving-average
4 Variable price
5 Real lag
6 Real +
7 Real data
8 Variable price
9 Real constant 0.8495
10 Real norm
11 Real *
12 Real constant 1.5848
13 Real data
14 Variable price
15 Real *
16 Real *
17 Real data
18 Variable price
19 Real constant 1.1515
20 Real constant 1.1120
This rule is equivalent to:
$$data(price) < MA\bigl(price,\ lag(0.304332 \times price,\ price + 0.8495)\bigr)$$
Shown in the form of a tree:
<
  data
    price
  MA
    price
    lag
      +
        data
          price
        0.8495
      norm
        *
          1.5848
          data
            price
        *
          *
            data
              price
            1.1515
          1.1120
When $c = 0.0005\%$, the results are not very stable, i.e. not all trials generate rules that
provide a positive excess return, although the majority of rules still generate a positive total
return. The number of trades is also very large because of the low cost of trading.
Table of Results for 1st training period, c = 0.0005%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.1179   -0.0235   -0.1120   -0.1188   -0.1333   -0.1011
Return       0.2009    0.0674    0.1730    0.0737    0.0664    0.0511    0.0855
Trades/Day   0         185.73    166.19    185.75    185.72    186       181.878
Table of Results for 2nd training period, c = 0.0005%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.0391    0.0391    0.0336    0.0391    0.0402    0.0382
Return       0.2009    0.2488    0.2488    0.2420    0.2488    0.2502    0.2477
Trades/Day   0         114.27    114.27    115.44    114.27    114.29    114.508
Table of Results for 3rd training period, c = 0.0005%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.0321   -0.1085   0.0587    -0.1586   -0.0077   -0.0497
Return       0.2009    0.1630    0.0755    0.2735    0.0248    0.1917    0.1427
Trades/Day   0         1.33      36.28     32.43     6.80      2.17      15.802
Table of Results for 4th training period, c = 0.0005%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.2839   -0.2836   -0.0620   -0.2839   -0.1096   -0.2046
Return       0.2009    -0.0959   -0.0956   0.1287    -0.0959   0.0763    -0.0213
Trades/Day   0         77.83     77.84     156.39    77.83     127.86    103.55
Table of Results for 5th training period, c = 0.0005%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    0.1041    -0.0488   -0.0437   0.1671    -0.0160   0.0326
Return       0.2009    0.3327    0.1437    0.1496    0.4194    0.1819    0.2407
Trades/Day   0         114.32    156.97    156.33    100.05    137.23    132.98
When $c = 0.001\%$, the results become extremely bad. I therefore increased the maximum number
of generations to 100 in the hope of finding better results. As in the previous case, many
trials show the same return; this is because they have already reached the best rule, i.e. the
global optimum. With this higher transaction cost, no rule is found that has a positive excess
return, and only a few rules provide a positive total return.
Table of Results for 1st training period, c = 0.001%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.6035   -0.6027   -0.6027   -0.6027   -0.5779   -0.5979
Return       0.2009    -0.3432   -0.3427   -0.3427   -0.3427   -0.3262   -0.3395
Trades/Day   0         185.72    185.73    185.73    185.73    168.09    182.2
Table of Results for 2nd training period, c = 0.001%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.2586   -0.3000   -0.2724   -0.2034   -0.1813   -0.2431
Return       0.2009    -0.0727   -0.1103   -0.0854   -0.0201   0.0018    -0.0582
Trades/Day   0         6.18      7.56      4.22      4.27      5.61      5.568
Table of Results for 3rd training period, c = 0.001%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    NaN       -0.0142   -0.4249   -0.0379   -0.0979   -0.1437
Return       0.2009    NaN       0.1840    -0.2148   0.1563    0.0889    0.0402
Trades/Day   0         0         1.92      8.86      0.65      9.22      5.1625
Table of Results for 4th training period, c = 0.001%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.1875   -0.2843   -0.3206   -0.3297   -0.3521   -0.2948
Return       0.2009    -0.0044   -0.0962   -0.1285   -0.1364   -0.1555   -0.1057
Trades/Day   0         6.35      4.41      0.47      6.35      2.14      3.944
Table of Results for 5th training period, c = 0.001%
Rules        Passive   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Average
Compound     0.1831    -0.3752   -0.5968   -0.0916   -0.3742   -0.3742   -0.3624
Return       0.2009    -0.1748   -0.3388   0.0958    -0.1740   -0.1740   -0.1641
Trades/Day   0         137.21    185.75    100.08    137.23    137.23    139.5
When $c = 0.1\%$, the model has to evolve for over 200 generations to find a rule, and even then
it does not find one in every trial. The only rule found still loses to the passive strategy. In
the other training periods, the model cannot find even one rule.
Table of Results for 1st training period, c = 0.1%
Rules        Passive   Trial 5
Compound     0.1811    -0.0269
Return       0.1985    0.1667
Trades/Day   0         0.05
When the transaction cost is even higher, the number of trades becomes very small, i.e. the
trading frequency is very low, on the order of one trade per month. High-frequency data are not
very meaningful in this situation; daily or other longer-period data would be better. Put
differently, high-frequency data show the short-term trend of the market, and the moving average
is a trend-following method, so in high-frequency analysis the number of trades is large. If the
transaction cost is relatively high, high-frequency trading is not worthwhile.
Further Discussion
As the analysis above shows, this genetic-algorithm trading model is not ready for real-life
commercial use, because it is too sensitive to transaction cost. In addition, genetic
programming is not very stable, since its initial population is generated randomly. Compared
with (Allen, 1999), I take short positions into account and work with a high-frequency financial
data set. When facing high-frequency intraday data, institutional investors would use leverage
to overcome the transaction cost. The computation also ignores dividend payments, which may
introduce seasonal distortions into the results; moreover, ignoring dividends underestimates the
return of the passive strategy. We can also see that different training periods generate quite
different results. If the training period falls in a bearish trend, the resulting rules are of
course not suited to a bullish trend. Similarly, if the training period falls in a bullish
trend, the rules will perform badly during a bearish selection or test period.
Apart from the above, this thesis only considers the moving average and other simple functions
of price. There are other common technical indicators such as MACD, RSI and KDJ. Those
indicators are more complicated and harder to incorporate into the tree structure of genetic
programming. The model could also be applied to other liquid markets, such as the futures
market, the forex market and other composite indices. The model also does not consider leverage.
With leverage, the performance of the algorithm under high trading fees would improve greatly,
since any tiny price movement could be multiplied by 10x or more. Of course, the cost of
leverage should also be considered in that case.
Summary
I use genetic programming to find technical trading rules for the S&P 500 index, using
one-minute high-frequency intraday data covering about one and a half years. The model also
allows short selling. With zero or very low transaction fees, the model finds several rules that
provide a positive excess return, i.e. a return above that of the passive buy-and-hold strategy.
When the transaction cost is high enough, however, no rule generates a positive excess return,
and as the cost grows it becomes very hard to apply the model to high-frequency data. It can,
however, work well on daily data.
The model I build is a very direct one, and I use limited information as inputs to the
algorithm. Moreover, the parameters of the model are not necessarily optimal, so the model has
much potential for improvement.
Furthermore, in practical investment, especially with high-frequency data or trading, the
speed requirement on the algorithm is very high. Genetic programming is efficient but not fast
enough: when the training period is long, the speed decreases noticeably, and each run takes
more than one minute. To be used in a real-time model, it has to be optimized to run faster.
In conclusion, the model can find profitable rules when the transaction cost is very low.
When the cost becomes high, there is barely any rule that provides a positive excess return. In
my view, when the cost is very high, the model should be applied to longer-period data rather
than high-frequency data.
References
Alexander, S. (1961). Price Movements in speculative markets: trends or random walks.
Industrial Management Review 2, 7-26.
Allen, F. (1999). Using Genetic Algorithms to Find Technical Trading Rules. Journal of
Financial Economics 51, 245-271.
Bauer, R. J. (1994). Genetic Algorithms and Investment Strategies. New York: Wiley.
Brock, W. L. (1992). Simple technical trading rules and the stochastic properties of stock returns.
Journal of Finance 47, 1731-1764.
Brogaard, J. (2014). High-Frequency Trading and the Execution Cost of Institutional Investors.
The Financial Review 49, 345-369.
Dacorogna, M. M., U. A. Muller and R. B. Olsen. (2001). An introduction to high-frequency
Finance. San Diego: Academic Press.
Fama, E. B. (1966). Filter rules and stock market trading. Security prices: a supplement. Journal
of Business 39, 226-241.
Gencay, R. (1998). The predictability of Security Returns with Simple Technical Trading Rules.
Journal of Empirical Finance 5, 347-359.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning.
Addison Wesley Publishing Company.
Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural
Selection. Cambridge: MIT Press.
Boyd, S. and L. Vandenberghe. (2004). Convex Optimization. Cambridge: Cambridge
University Press.
Sweeney, R. (1988). Some new filter rule tests: methods and results. Journal of Financial and
Quantitative Analysis 23, 285-300.
Asset Metadata
Creator: Liu, Cheng (author)
Core Title: Finding technical trading rules in high-frequency data by using genetic programming
School: College of Letters, Arts and Sciences
Degree: Master of Science
Degree Program: Applied Mathematics
Publication Date: 07/24/2014
Defense Date: 06/23/2014
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: excess return, genetic programming, high frequency, OAI-PMH Harvest, S, technical trading rules, Tree
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Lototsky, Sergey V. (committee chair), Mancera, Ricardo (committee member), Sacker, Robert (committee member)
Creator Email: liu.paul813@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-448943
Unique identifier: UC11287043
Identifier: etd-LiuCheng-2742.pdf (filename), usctheses-c3-448943 (legacy record id)
Legacy Identifier: etd-LiuCheng-2742.pdf
Dmrecord: 448943
Document Type: Thesis
Rights: Liu, Cheng
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA