
I’ve been working on a high-frequency trading strategy project for the past few months. Unfortunately, the project lives on the university GitLab, so I can’t open-source our fantastic repo (my teammate created a fork here!). The project is built around RCM-X’s Strategy Studio, and a full report is here. This article summarizes the different components of the project and hopefully offers some insights for others working with Strategy Studio.

Data

Data was the trickiest part. HFT strategies rely heavily on accurate, nanosecond-scale, order-by-order data that is almost unattainable for individuals. Even when we find candidate data, we still need parsers to convert the raw feed into a format Strategy Studio can consume. In the end, we incorporated three data sources that mimic actual trade data as closely as possible.

  • IEX DEEP
    We relied heavily on this data source because of its high availability. It’s published daily and contains trade and order-book depth data. Also, our professor provided the parser, so it was plug-and-play.

  • NASDAQ ITCH
    This was another useful data source. However, it’s only available on certain days, so it did not help our development much. We created a parser by simplifying this original implementation.

  • SIP (from Alpaca)
    This data source only contained NBBO data. Even though we wanted to incorporate it (due to its high availability), we did not have much luck, since the Strategy Studio documentation has no section on using SIP data. Nevertheless, we still implemented a parser for it.

Strategies

Strategies are the core and most exciting part. We implemented four strategies and backtested them against the April and May market (when it was crashing), and all of them outperformed buy-and-hold. Here are the strategies:

  • Buy last and sell first
    Yes, the strategy is literally to buy in the last two minutes of the current trading day and sell in the first two minutes of the next trading day. It sounds absurd, but we made a profit on SPY while it dropped from 450 to 415.
  • Mean Reversion Strategy
    The strategy assumes that a ticker’s price will move toward the mean price of its last 1000 trades. It looked promising, doubling our assets within three days. However, due to the volume of trade actions, our VM ran out of memory (OOMed), and we could not test more days.
  • Arbitrage
    A classic strategy that trades one ticker based on the movement of another. We picked SPY and AAPL because AAPL is one of the largest components of SPY. However, we only implemented the long side, so we did not gain much profit from it in a downward market.
  • Swing Strategy
    Uses momentum to track local minima and maxima, then trades accordingly, like a real-world swing traveling from high to low.

They are implemented by extending Strategy Studio’s Strategy interface. The interface has several event triggers that allow our strategies to submit/cancel/update orders. For instance, the onBar method is triggered with a bar quote at a specified time interval. Due to data availability, we focused on the onTrade method, which runs whenever a trade is filled.
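
To illustrate the shape of this event-driven pattern (Strategy Studio itself is a C++ SDK, so the class and method names below are an illustrative Python sketch, not the real API), a strategy boils down to a set of callbacks plus order-management calls:

class MeanReversionSketch:
    # illustrative skeleton only -- not the actual Strategy Studio interface
    def __init__(self, window=1000):
        self.window = window
        self.prices = []  # rolling window of recent trade prices

    def on_trade(self, trade):
        # fired for every trade in the market data stream
        self.prices = (self.prices + [trade.price])[-self.window:]
        mean = sum(self.prices) / len(self.prices)
        if trade.price < mean:
            self.send_order(side="BUY", qty=100)   # expect reversion upward
        elif trade.price > mean:
            self.send_order(side="SELL", qty=100)  # expect reversion downward

    def on_bar(self, bar):
        # fired once per configured bar interval (e.g., 1-minute OHLC)
        pass

    def send_order(self, side, qty):
        # in Strategy Studio this would go through the SDK's order interface
        print(side, qty)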

Analysis && Automation

We wrote a Python script to generate visualizations and calculate earnings. We also set up GitLab CI for automatic code linting and wrote a Bash script to test strategies automatically, since we didn’t want to type into the terminal every time.
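
As a flavor of what the analysis script does, here is a minimal sketch (the fill-file columns are assumptions for illustration, not our exact export format):

import pandas as pd
import matplotlib.pyplot as plt

# assumed columns: time, side (+1 buy / -1 sell), price, quantity
fills = pd.read_csv("fills.csv", parse_dates=["time"])
fills["cash_flow"] = -fills["side"] * fills["price"] * fills["quantity"]
# if the strategy ends flat, cumulative cash flow equals realized PnL
fills["cum_pnl"] = fills["cash_flow"].cumsum()

fills.plot(x="time", y="cum_pnl", title="Cumulative PnL")
plt.savefig("pnl.png")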

Were these results ideal?

Definitely not. We did not get the chance to incorporate all the ticker data we wanted, and we could have improved the strategies by tuning parameters and picking better base tickers. I guess I have the excuse of a busy finals week, but I would improve this given more time.

I have recently been working with my research team on analyzing soil characteristics across the entire US Midwest. Since the data involves about 50k images with a combined size of more than 10 TB, it was not an easy task. I wrote an asynchronous data pipeline to speed it up, and I think it is an exciting topic to discuss.

Problem

We need to run image segmentation on about 50k satellite images, each around 6144 × 6144 pixels, and we had Azure VMs (12 CPUs + a Tesla V100) for the job. Clearly, an individual image is too big for our model, so we needed to cut it into 224 × 224 crops. We also wanted overlapped predictions, since the center of a crop gets a higher-quality prediction. This overlap is slow to process because it requires memory copies, while a standard non-overlapping crop can simply be reshaped.
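
A minimal NumPy sketch of the overlapped tiling and stitching (the helper names are made up, and edge padding is omitted for brevity):

import numpy as np

def tile_image(img, crop=224, stride=112):
    # cut an HxW image into overlapping crops (stride < crop means overlap)
    h, w = img.shape[:2]
    crops, coords = [], []
    for y in range(0, h - crop + 1, stride):
        for x in range(0, w - crop + 1, stride):
            crops.append(img[y:y + crop, x:x + crop])
            coords.append((y, x))
    return np.stack(crops), coords

def stitch(preds, coords, out_shape, crop=224):
    # average the overlapping predictions back into a full-size mask
    acc = np.zeros(out_shape, dtype=np.float32)
    cnt = np.zeros(out_shape, dtype=np.float32)
    for p, (y, x) in zip(preds, coords):
        acc[y:y + crop, x:x + crop] += p
        cnt[y:y + crop, x:x + crop] += 1
    return acc / np.maximum(cnt, 1)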

Due to the scale of the data (10 TB+), the images are stored in Azure storage blobs, because no virtual disk can hold that much. Consequently, sequentially reading an image, predicting, and saving will not work: the I/O overhead would starve our most precious resource, the GPU.

With this “single-threaded workflow,” the prediction process would take a couple of weeks and burn a lot of our Azure credit allocation. Therefore, we needed a faster prediction pipeline.

Solution

Make preprocessing and postprocessing asynchronous!

A thread-safe queue is very helpful here: the preprocessors do their work and place data into a queue, and the prediction thread pulls from it. This way, we can have multiple preprocessors working in parallel to match the speed of the GPU. The data pipeline becomes:
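
download + crop (readers) → [queue] → GPU predict → [queue] → postprocess + upload (writers)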

Python provides a built-in multiprocessing queue in the multiprocessing package, which fits our task perfectly. Here is a quick code walk-through:

import os
from multiprocessing import Pool, Manager

def reader(to_process, processed):
    while True:
        fname = to_process.get()
        # -1 means all tasks are exhausted;
        # put it back to notify the other processes
        if fname == -1:
            to_process.put(-1)
            break
        download(fname)                 # grab file from cloud
        res = process(fname)            # cut file into an array of crops
        processed.put(res, block=True)  # send preprocessed array to the predictor
        os.remove(fname)                # remove local "cache" file

def writer(to_save):
    while True:
        tmp = to_save.get()
        if tmp == -1:
            to_save.put(-1)
            break
        postprocess(tmp)
        upload(tmp)
        os.remove(tmp)  # remove local "cache" file

if __name__ == "__main__":
    m = Manager()
    to_preprocess = m.Queue(0)
    preprocessed = m.Queue(50)  # max queue size depends on RAM size
    to_postprocess = m.Queue(50)

    # build a shareable task list using another queue!
    for f in filelist:
        to_preprocess.put(f.link)
    to_preprocess.put(-1)

    # use process pools to launch async preprocessors and postprocessors
    # (the initializer slot runs each worker's loop)
    num_thread = 3
    a = Pool(num_thread, reader, (to_preprocess, preprocessed))
    b = Pool(num_thread, writer, (to_postprocess,))

    while True:
        # get a preprocessed array
        tmp = preprocessed.get(block=True)
        # if we got -1, we have exhausted all tasks
        if tmp == -1:
            to_postprocess.put(-1, block=True)
            break

        # run prediction
        res = predict(tmp)

        # send task to the postprocessor
        to_postprocess.put(res, block=True)

With this parallel data processor, I could finish all prediction tasks within a day using a couple of VM instances. This dramatically saved both time and money.

Is this ideal?

Well, no. Ideally, we would have separate VM instances for preprocessing and postprocessing. That would utilize resources better, because a CPU-heavy VM is much more efficient at preprocessing files than a GPU VM. Then we would have a load balancer to distribute tasks instead of running a bunch of multiprocessing queues. However, we don’t care that much, since this is a use-and-throw data pipeline that runs at most a couple of times.

Nevertheless, this task showed the importance of systems programming, even if you are just a parameter tuner.

I wrote this article in 2022 and manually adjusted the timestamps to align the three-article series. With more knowledge of machine learning, everything I wrote two years ago seems naive and incorrect. This article recaps what went wrong and how we might improve the results.

A quick recap of what I did

  1. Use days 0-59 of a ticker’s OHLC as input.
  2. Ask the model to predict day 60.
  3. No padding in data generation – sample 0 has days [0-59] and sample 1 has days [1-60].

What did the model actually produce?

At first glance, the model output does not look wrong: it fits the actual ticker movements. The reality is that the model learned to repeat the last day’s price as the prediction – given days [0-59] as input, the model spits out the price of day 59. If we modify the model to give a binary output (long vs. short), it performs like a random guess. In addition, we cannot create a long-term forecast, because we fed in actual prices instead of the model’s own predictions during training.
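
An easy way to catch this failure mode is to compare the model against a naive persistence baseline that simply repeats the last observed price (a quick sketch; x_val, y_val, and y_pred stand in for the validation arrays from the original experiments):

import numpy as np

def mape(y_true, y_hat):
    return np.mean(np.abs((y_true - y_hat) / y_true)) * 100

persistence = x_val[:, -1, :]  # last day of each 60-day input window
print("model MAPE:      ", mape(y_val, y_pred))
print("persistence MAPE:", mape(y_val, persistence))
# if the two numbers are close, the model has just learned to copy day 59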

What caused these issues?

Because of the uninterpretable nature of deep learning, these are just hypotheses, but I believe they each contributed to the failure to some degree.

  1. Vanilla percentage loss does not work
    Because a stock only moves 1-2 percent on most days, the model receives no significant penalty for repeating today’s price, so it barely updates its weights. When I switched to a binary loss, it was clear the model had learned nothing and performed no better than a random guess.
  2. Lack of samples
    If we look at the input generation, it’s clear that two consecutive samples are 59/60 = 98.33% identical. This confuses the model, because two virtually identical samples require different outputs.
  3. Lack of features
    The input has only four features (Open, High, Low, Close). This is very few compared to proven work; for instance, a typical CV task has at least 32 × 32 feature dimensions.

How can I make it better?

First of all, pick a better loss function, maybe a binary loss, since direction is the most straightforward indicator.
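
In PyTorch, that would look something like the sketch below (model is a placeholder for any network ending in a single logit; sample_bld and label_b come from the batch sampler shown later):

import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on raw logits

logits = model(sample_bld.float()).squeeze(-1)  # (batch,) raw scores
loss = criterion(logits, label_b.float())       # labels: 1 = up, 0 = down
loss.backward()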

Then, do some feature engineering by adding stock indicators. The fantastic stockstats library can help:

import pandas as pd
from stockstats import StockDataFrame as Sdf

indicators = ['macd', 'rsi_30', 'cci_30', 'dx_30']  # maybe MAs?
df = pd.read_csv(f)            # stock price data containing OHLC and volume
stock = Sdf.retype(df.copy())
for indicator in indicators:
    tmp = stock[indicator]     # this is slow, but it works
    df[indicator] = tmp.array

This way, the model receives more features and is more likely to produce usable results. Another immediate improvement is adding more samples: we can randomly select date ranges from a pool of stocks, which dramatically increases the variance of the training set. For example, we could do something like this (don’t ask me why this one is in torch while the previous ones are in tf):

import numpy as np
import torch

def sample_train_batch(train, seq_len=200, batch_size=64):
    '''
    Sample a random batch of training data
    Args:
    - train: training set of n*l where n is the stock pool and l is the stock movement history
    - seq_len: length of each sampled sequence
    - batch_size: batch size
    Output:
    - sample_bld: Tensor of shape (b, l, d) as samples
    - label_b: Tensor of shape (b,) as labels
    '''
    sample_bld = np.zeros((batch_size, seq_len, train[0].shape[-1]))
    label_b = np.zeros((batch_size,))
    for b in range(batch_size):
        sample_from = train[np.random.randint(0, len(train))]
        start_idx = np.random.randint(0, sample_from.shape[0] - seq_len - 1)
        end_idx = start_idx + seq_len
        sample_bld[b] = sample_from[start_idx:end_idx, :]
        # label: does the close (column 3) rise the day after the window?
        label_b[b] = sample_from[end_idx + 1, 3] > sample_from[end_idx, 3]
    return torch.tensor(sample_bld), torch.tensor(label_b)

Well, the takeaway is: cracking the secrets of the stock market is not that easy. I will give it another try when I have more time & GPU resources.

It has been another lazily busy two weeks before I found more time to investigate my stock prediction model, and there are two major pieces of progress that I should log.

Prediction

Last night, I figured out how to predict ticker prices for today. My model predicted that BILI, BAC, INTC, GILD, and CCL would be down in terms of closing price today. It seems I’m correct so far (approx. 2:30 p.m. CDT). Based on a run from an hour ago, AMD and BAC will be down on Monday, but that might change depending on price movement; I’ll do a batch of predictions on Sunday night.

Also, I will try to apply this model to other timeframes, since technical analysis works on all timeframes.

“Plateau Issue”

In the last post, I mentioned that sometimes the model would just give up, and I spent some time investigating this issue. I found that the model automatically caps the ticker price at around $120. To be more precise, the prediction “stops” once the ticker price hits 120 and resumes when it drops below. Here are some examples.

This is the label (red) vs. prediction (green) for TSLA around 2013; clearly the model just stops after hitting the 120 price level (y-axis).

Similarly, Microsoft (around the end of 2018) has the same issue – and it’s in the training set! So presumably the problem occurs during training, and I could drastically increase accuracy if it were solved. Also, the plateau is not likely caused by extrapolation, since Microsoft stayed well above 120 for more than a year.

This is the most interesting example – J.P. Morgan over the most recent 200 days. The prediction accuracy decreases when the price nears 120, stops above 120, and picks up again below.

A closer look at the prediction results:

print(y_pred[-125:-110,:])
[[118.594986 120.296196 118.37011 119.362076]
[118.92082 120.646355 118.720116 119.714264]
[119.21562 120.96588 119.03798 120.034744]
[119.438065 121.209656 119.27927 120.27842 ]
[119.61156 121.401344 119.468 120.46955 ]
[119.74968 121.55503 119.618164 120.62227 ]
[119.86223 121.680786 119.73938 120.74647 ]
[119.9632 121.79347 119.84477 120.8563 ]
[120.07791 121.92008 119.95497 120.975914]
[120.2704 122.12872 120.11988 121.16532 ]
[120.59798 122.48024 120.384384 121.47831 ]
[120.90217 122.803635 120.627945 121.766914]
[121.13604 123.03682 120.82533 121.98914 ]
[121.4418 123.30764 121.10597 122.28069 ]
[121.679 123.51221 121.327576 122.50713 ]
[121.77057 123.59196 121.413956 122.5952 ]
[121.81259 123.62908 121.45422 122.63607 ]
[121.839066 123.65261 121.47986 122.662 ]
[121.85849 123.669914 121.49878 122.6811 ]
[121.87355 123.683334 121.51347 122.695915]
[121.885666 123.69415 121.52532 122.70786 ]
[121.89534 123.70276 121.53476 122.717384]
[121.903244 123.70979 121.54251 122.72517 ]
[121.909775 123.71563 121.548935 122.73163 ]
[121.91516 123.720474 121.55426 122.736984]]

print(y_pred[-50:-30,:])
[[121.97397 123.77238 121.6122 122.79505 ]
[121.974106 123.77251 121.61233 122.795204]
[121.97473 123.77303 121.6129 122.79578 ]
[121.97554 123.7737 121.61364 122.79653 ]
[121.9775 123.775345 121.61543 122.79836 ]
[121.98193 123.77913 121.61937 122.80246 ]
[121.98416 123.78073 121.62105 122.80426 ]
[121.98341 123.779526 121.61937 122.80286 ]
[121.97371 123.76968 121.60829 122.79213 ]
[121.87557 123.675064 121.50528 122.68991 ]
[114.85866 116.58525 114.249084 115.37254 ]
[102.18071 103.6875 100.93708 102.316696]
[103.84379 105.045616 102.54588 103.7781 ]
[ 99.68439 100.732445 98.26677 99.39317 ]
[ 95.785934 96.767426 94.37234 95.400276]
[104.81848 105.789536 103.509056 104.634575]
[ 93.265015 94.249084 91.780655 92.95141 ]
[ 97.01548 97.91057 95.53687 96.73838 ]
[ 89.63896 90.62987 88.16635 89.38889 ]
[ 86.91128 87.87613 85.50348 86.60942 ]]

Although I think it’s possible to work around this by scaling the entire x_pred by 1/10, I don’t think that’s the optimal solution, since 1. predicting on the training set also produces dubious results, and 2. scaling does not represent the actual market, as different price levels usually have different volatility. I speculate there are a couple of possible causes:

  1. Overflow in NumPy/TensorFlow: not really likely under float64.
  2. Caused by MSFT’s plateau during 2019: but why would the model catch the plateau and not the breakthrough afterward?
  3. A TF source code error: I tried to look into it, but it’s way too hard for me to navigate at this point.

I’ll retrain the model on other tickers like AMZN or TSLA that are way over 120 dollars and see if that makes any difference. But I don’t really like this idea, as these tickers are not typical representatives of the entire market. If anyone has any ideas, please reach out to me.


Sunday update:

Monday night update: It appears that the model got the market completely wrong… So it’s not that easy to lie in bed and let the computer do my trading…

So here are my model’s predictions from Sunday night:

Ticker | Direction | Close price | % prediction change*
Intel | down | 53.57 | -4.41%
JP Morgan | up | 93.69 | -0.01%
AMD | down | 44.08 | -7.06%
Bilibili | down | 24.37 | -1.48%
Delta Air Lines | down | 23.02 | -2.34%
Pinduoduo | down | 39.77 | -7.58%
Gilead Sciences | down | 73.27 | -3.86%
CVS Health | down | 55.84 | -3.79%
Hormel Foods | down | 43.02 | -4.42%
Occidental Petroleum | down | 14.81 | -1.40%

* Percent change is relative to the model’s prediction of Friday’s price (since the LSTM predicts a chain of results), while direction is relative to the actual price.

Based on these results, I’d say Monday’s market should be pretty pessimistic, as these tickers come from different industries. However, they might not be good representatives, considering that big-market-cap companies like Google, Amazon, and Costco have prices way beyond $120, which is off-limits due to the “bug” in my model.

Tackling the “plateau issue” by training with Amazon

I trained the model using Amazon as the sample. The first noticeable problem is that the model is much harder to train: I got around 19% loss at epoch 30 (training with MSFT reached about 2.5%), 14.5% at epoch 50, and 11% at epoch 70. Further training reduced the loss to 8.42% at epoch 90, but the 100th epoch rebounded to 10%, which suggests a likely overshoot and overfit. I should have recorded the losses, but it’s too late now… The analysis of the “Amazon model” follows:

Comparing the predictions on BAC, the model captures the trend very well across the three training stages, and it performs exceptionally well at epoch 70. But at epoch 100, we see extra volatility (well, AMZN’s price is too crazy compared to Bank of America’s).

Now look at the predictions on Microsoft: the plateau is gone (compare with the “MSFT model”). At epoch 70, the “Amazon model” captures all day-to-day changes accurately. Though these might be good indicators of market craziness, I would not consider any of these predictions usable, due to the volatility.

The most interesting part is the plateau issue. The plateau is still there, except that its height changed throughout training: at epoch 50, the model plateaus around 480; at epoch 70, around 620; and at epoch 100, around 800. Note that the 70-epoch model captures TSLA’s price very well below 620 dollars, while the 100-epoch model is just off target.

So what I’m trying to say is that the model doesn’t work well when trained with Amazon either. Even with the best performance, at epoch 70, we still see unusual volatility, and the plateau issue remains unresolved.

Recently I’ve been working on a Long Short-Term Memory (LSTM) model to predict stock prices. I collected ticker price data from Yahoo using yfinance and used TensorFlow with Keras to set up the model. The project is at a very early stage and I haven’t done any tuning yet, but I came up with some rather intriguing results, and some of them don’t make sense at all. So, code is cheap, show me the talk:

Retrieve Data:

Considering that tech companies and traditional consumer businesses differ slightly in terms of technical analysis, I was going to put all available historical data of S&P 500 tickers into my model. But as a prototype, I only used the data for Microsoft:

import numpy as np
import yfinance as yf

msft = yf.download(tickers="msft")
msft = msft.dropna()
msft = msft.drop(columns=["Adj Close", "Volume"])
data = np.array(msft[["Open", "High", "Low", "Close"]])

After keeping open, high, low, and close, we arranged data from approximately 8,600 trading days into an array of shape (8592, 4). The reason is that technical analysis relies on reading the candle chart, so training on just the close price would not create a reliable quant model. Ideally, the data would also be cleaned up for dividends, but that shouldn’t make a significant difference. I also initially intended to include volume, as it’s another indicator used in technical analysis, but it would likely overwhelm the model, and normalizing it would be hard considering splits and share issuance.

Clean Up:

After downloading the data, I used 60 days as the sliding window for recurrence. To clarify, the first pair consists of days [0, 59] as training data and day 60 as the label; the second has training data of days [1, 60] and a label of day 61, and so forth.

x_train = []
y_train = []
timestamp = 60
length = len(data)
for i in range(timestamp, length):
    x_train.append(data[i - timestamp:i, :])
    y_train.append(data[i, :])

x_train = np.array(x_train)
y_train = np.array(y_train)

The result after cleaning is a total of 8,532 training units, which is rather small – another reason I’d like to use all the tickers of the S&P 500.

Model:

After obtaining the training data, I defined my model structure. It consists of four LSTM recurrence layers and one dense layer at the end:

_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_23 (LSTM) (None, 60, 256) 267264
_________________________________________________________________
lstm_24 (LSTM) (None, 60, 512) 1574912
_________________________________________________________________
lstm_25 (LSTM) (None, 60, 512) 2099200
_________________________________________________________________
lstm_26 (LSTM) (None, 256) 787456
_________________________________________________________________
dense_6 (Dense) (None, 4) 1028
=================================================================
Total params: 4,729,860
Trainable params: 4,729,860
Non-trainable params: 0

I used all default hyperparameters, except that I specified the optimizer as “adam” and used “absolute percent error” as the loss function – the stock price ranges from a couple of cents to hundreds of dollars, so what we care about is the percent change. Against the custom for recurrent neural networks, I did not include any dropout in my model. I tried two versions, identical except for dropout/no-dropout, and found that no dropout actually created better price predictions.

Then I trained it for 30 epochs with a batch size of 32, reducing the loss to around 2.5%. It took less than ten minutes on an RTX 2060S – I mean, it’s a small model with small data anyway.
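
For reference, here is a minimal Keras sketch that reproduces the layer summary above (the layer sizes are read off the printout; everything else is assumed):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.LSTM(256, return_sequences=True, input_shape=(60, 4)),  # lstm_23
    layers.LSTM(512, return_sequences=True),                       # lstm_24
    layers.LSTM(512, return_sequences=True),                       # lstm_25
    layers.LSTM(256),                                              # lstm_26
    layers.Dense(4),                                               # dense_6
])
model.compile(optimizer="adam", loss="mean_absolute_percentage_error")
model.fit(x_train, y_train, epochs=30, batch_size=32)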

Evaluation:

Then I tried predicting on other stocks’ data, and here comes the interesting part. For all validation data, I treated it exactly the same as the training data:

intc = yf.download(tickers="intc")
intc = intc.dropna()
intc = intc.drop(columns=["Adj Close", "Volume"])

val_data = np.array(intc[["Open", "High", "Low", "Close"]])
print(val_data.shape)

x_val = []
timestamp = 60
length = len(val_data)
for i in range(timestamp, length):
    x_val.append(val_data[i - timestamp:i, :])

x_val = np.array(x_val)

When evaluating a different ticker, I just change “intc” to “aapl”, “bac”, etc. Here are some interesting results:

The first stock I tried was Intel. I plotted the close price of my model’s prediction in green and the actual price in red. It is easy to see that the prediction captured almost all fluctuations in the recent “monkey market” – which is surprising, as I hadn’t done any tuning and the model is based on only one ticker. And I don’t see much likelihood of overfitting, as the model never saw Intel price data before. The x-axis covers roughly the most recent 100 trading days.

Similarly, the model also predicts Bank of America pretty accurately.

CCL also fits pretty well (how crazily it has moved recently!).

Even Luckin Coffee fits, except for the crazy sell-off after its financial problems surfaced.

Problem:

Seeing the last section, you may say: great, so what’s the problem?

The first problem is that this performance doesn’t really make sense to me. The training data came from only one ticker, yet the model fits other tickers pretty well after such a short training time. One possible validation is running a different time interval (e.g., the hourly chart), since the theory of technical analysis applies across all time intervals. Another good validation would be running a batch of predictions for tomorrow: that would work – just append the last 60 days of data to x_val.
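
That next-day check would look something like this sketch (reusing val_data and model from above):

x_next = val_data[-60:][None, ...]  # most recent 60-day window, shape (1, 60, 4)
tomorrow = model.predict(x_next)    # predicted [open, high, low, close]
print(tomorrow)

But then come the other problems: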

The second is that the model doesn’t care about intraday fluctuations. Even though the close price predictions look promising, digging into the predicted data tells a different story:


[[52.58807 53.006332 52.079365 52.606342]
[50.649292 51.084167 50.1335 50.639133]
[52.35407 52.78058 51.84988 52.354694]
[52.34229 52.763435 51.82691 52.352676]
[57.822975 58.231274 57.346855 57.867443]
[58.394894 58.807693 57.91325 58.456905]
[58.013275 58.439564 57.536503 58.05953 ]
[56.24209 56.678978 55.754234 56.268303]
[56.630985 57.066933 56.14414 56.65341 ]
[59.040394 59.46942 58.568333 59.08561 ]]

It’s obvious that column 0 (the open price) and column 3 (the close price) are very similar – which goes against reality and the actual data. The third issue is that the model sometimes just gives up:

I apologize for the Chinese labels; I was discussing this with friends yesterday. But it’s pretty clear that the model initially fits pretty well, then at some point it just gives up, and I have no idea why. Maybe the evaluation data needs an examination. Note that this chart plots almost 6,000 trading days.

Sometimes it gives up and then picks up again.

As I’ve never systematically studied recurrent models before, I’m kind of stuck on how to analyze and tackle these issues. If you have any ideas/suggestions/discussion, feel free to reach out to me.

Update Apr 6th:

Today the cruise and consumer stocks, represented by CCL and FL, basically confirmed my analysis. I opened a position in CCL and am up about $1 on paper today. Airline stocks were clearly weak today, but the decline has visibly decelerated, so a bottom is still quite likely to form; I’ll keep watching. If I find time, I’ll probably post an update on my view of the broader market.


Update Apr 5th:

Over the weekend everyone was discussing Buffett dumping his airline stocks, so here is a small update.

First, why would Buffett sell? There are two mainstream takes. One says he wants to keep his stakes under 10% to avoid SEC reporting requirements, so that he can later buy the dip quietly without tipping anyone off. The other, based on analysis of the financial reports, argues that the fundamentals have changed and Buffett no longer sees airline stocks as worth investing in. I don’t think I can analyze this better than the value-investing folks online, so I won’t ramble; there are fairly detailed analyses on Xueqiu and YouTube.

Does this affect my view? First of all, I trade off the candlestick chart, so macro changes don’t directly affect my judgment, though they can change my entry logic through their effect on the price. On Friday I mentioned that the bottom structure is only confirmed once the price rallies and the small green MACD bars lengthen. Buffett’s selling will drag the price down (the after-hours selling on Friday), which means this bullish divergence may take longer to form, or the price may simply smash through the bottom. Either way, the next 3-5 trading days are the key window to watch. In short, the candlestick chart will tell me whether I should open a position.

P.S. Many retail stocks are moving very similarly, e.g., FL and M. Macy’s in particular has already completed its bullish divergence, but its earnings reports have been consistently unimpressive, so I remain cautious.


While watching the market today, I noticed that several tickers are close to forming technical bottoms, so let me write down my thoughts on them as my first investment note. I have been following CCL, DAL, and AAL, so this article will center on those three. Since similar stocks have been moving in near lockstep recently, these three charts offer a window into the rest of the sector.

Due to timing issues, I will probably work on an English version later.

1. Review:

First, let’s look back at their earlier candlestick structures. Since my first market was China’s A-shares, I have set my charts to green-for-down and red-for-up; just a heads-up.

First, CCL:

You can clearly see a textbook M-shaped double top from December 2019 to the end of January 2020, with a bearish MACD divergence at the top. On the longer-term weekly chart, there is a long descending trendline, and after the final failed breakout in February 2020 the price started to dive.

This is a daily chart from December to January; the double top clearly levels off, with symmetric gaps at the neckline.

This is CCL’s weekly chart: after repeated failed breakouts, it plunged.

Then DAL:

Similar to CCL, this is another textbook topping-reversal pattern: a head and shoulders.

This stock, which Buffett bought halfway down the mountain, also showed signs before the crash: the left and right shoulders line up clearly, and the break through the neckline was a strong one, with a gap plus a long bearish candle.

Finally, AAL:

Unlike CCL/DAL, AAL made a downside breakout from a long descending trend channel.

This is AAL’s weekly chart: in the first week of March there was a clear downside breakout of the trend channel, which is a very unfavorable signal. A descending channel already means a prolonged decline, and a downside breakout shows the market has even less confidence.

To sum up: even setting aside the pandemic’s impact on Europe and the US, these stocks were already showing signs of topping reversals and accelerating declines in late January and early February. I believe this technical factor resonated with the fundamentals, amplifying the decline, which in turn creates the potential dip-buying opportunity of these past few days.

2. Recent structure:

Recently (especially today), bottoming structures have appeared, or are very likely to appear, in these stocks. In other words, my judgment is that their bottoms are near. The dip-buying opportunity I have been waiting for since mid-February/early March is almost here.

Again, CCL first:

After tanking for this long, the cruise lines have, over yesterday and today, laid the foundation of a W-shaped double bottom, with a bullish divergence. Once the price rallies, this bottom structure will be fully confirmed. The structure mirrors January’s double top, and the first upside target of a rally is the neckline, at around $17. In other words, a potential 100% profit.

I have marked both the divergence and the bottom. As long as the small green MACD bars in the yellow circle lengthen and the price rallies, this is a buy point at the bottom.

Of course, the recent $8 share offering is worth noting. We need to watch whether the offering is what keeps the price hovering around $8; that should become clear on April 6.

Then DAL:

The structure is not very standard: by closing price it is still a double bottom, but there is no new low, so no divergence. I hope it dips once more to lower my entry cost, haha. The first target is still the neckline at $35.

The idea is very similar to CCL: I will only enter once the small green bars grow and the price rallies.

Finally, AAL:

AAL looks the ugliest: the double bottom failed to level off and broke down. And today’s decline accelerated, so it is probably the worst opportunity of the three.

You can see there is a divergence, but because today (April 3) printed a long bearish candle, it needs more time to decelerate and turn up. Better to wait before entering.

3. Fundamentals

!TODO

Technical traders generally don’t look at fundamentals, but given how bad the macro environment is, it’s worth keeping an eye on them. The broad thesis: air travel is a necessity and the cruise lines won’t go bankrupt, so there will certainly be an opportunity. Leaving a placeholder here to decide whether to add a financial analysis.

4. Trading plan

The current plan is to allocate about 10%-15% of assets to cruise/airline stocks, at a ratio of airlines:cruises = 2:1. The logic: air travel is a necessity, so it will certainly recover faster than cruises, which are discretionary spending. Given the bad macro environment and how ugly the index charts look technically (placeholder #2), I will lean toward short- to medium-term trades, take profits early, and strictly enforce stop losses.


Note: this does not constitute investment advice; it is only my own thoughts/investment notes. The stock market is risky; invest with caution.