The "God of Stocks" says buying gold never loses money? Then let's use Python to predict the gold price trend.

Read gold ETF data

This article uses machine learning to predict the price of gold, one of the most important precious metals. We will create a linear regression model that takes information from past gold ETF (GLD) prices and returns a forecast of the next day's gold ETF price. GLD is the largest ETF that invests directly in physical gold.

The first step is to import all the necessary libraries.

# LinearRegression is scikit-learn's estimator for linear regression
from sklearn.linear_model import LinearRegression
# pandas and numpy are used for data manipulation
import pandas as pd
import numpy as np
# matplotlib is used for plotting
import matplotlib.pyplot as plt
# Jupyter magic: remove this line when running as a plain Python script
%matplotlib inline
# On matplotlib 3.6+ this style is named 'seaborn-v0_8-darkgrid'
plt.style.use('seaborn-darkgrid')
# yfinance is used to retrieve data from Yahoo Finance
import yfinance as yf

Next, we read the past 12 years of daily gold ETF price data and store it in Df. We drop the irrelevant columns and use the dropna() function to remove NaN values. Then we plot the gold ETF closing price.

Df = yf.download('GLD', '2008-01-01', '2020-6-22', auto_adjust=True)
Df = Df[['Close']]
Df = Df.dropna()
Df.Close.plot(figsize=(10, 7),color='r')
plt.ylabel("Gold ETF Prices")
plt.title("Gold ETF Price Series")
plt.show()

Define explanatory variables

An explanatory variable is a variable that is manipulated to determine the next day's gold ETF price. In short, the explanatory variables are the features we want to use to predict the gold ETF price.

The explanatory variables in this strategy are the 3-day and 9-day moving averages of the closing price. We use the dropna() function to remove NaN values and store the feature values in X.

However, you can add to X any other variables you think are useful for forecasting the gold ETF price. These could be technical indicators, the prices of other ETFs such as the gold miners ETF (GDX) or the oil ETF (USO), or US economic data.
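As a rough illustration of adding one more variable to X, the sketch below appends a 3-day moving average of a second price series. Everything here is an assumption for demonstration: the two series are synthetic random walks rather than real GLD or GDX data, and the column name GDX_S_3 is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for two closing-price series (NOT real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=30)
gld_close = pd.Series(150 + rng.normal(0, 1, 30).cumsum(), index=dates)
gdx_close = pd.Series(30 + rng.normal(0, 0.5, 30).cumsum(), index=dates)

features = pd.DataFrame({"Close": gld_close})
features["S_3"] = features["Close"].rolling(window=3).mean()
features["S_9"] = features["Close"].rolling(window=9).mean()
# Hypothetical extra feature: 3-day moving average of the second series.
# pandas aligns the two series on their shared date index.
features["GDX_S_3"] = gdx_close.rolling(window=3).mean()
features = features.dropna()

X_extended = features[["S_3", "S_9", "GDX_S_3"]]
print(X_extended.shape)
```

The 9-day window leaves the first eight rows without a value, so dropna() removes them regardless of how many extra columns are added.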

Define the dependent variable

Similarly, the dependent variable depends on the values of the explanatory variables. In short, it is the gold ETF price we are trying to predict. We store the gold ETF price in y.

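# Explanatory variables: 3-day and 9-day moving averages of the close
Df['S_3'] = Df['Close'].rolling(window=3).mean()
Df['S_9'] = Df['Close'].rolling(window=9).mean()
# Dependent variable: the next day's closing price
Df['next_day_price'] = Df['Close'].shift(-1)
# Drop rows left incomplete by the rolling windows and the shift
Df = Df.dropna()
X = Df[['S_3', 'S_9']]
y = Df['next_day_price']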
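To see what rolling(), shift(-1) and dropna() do to the rows, here is a minimal sketch on a five-row toy series. The prices are made up and a 3-day window is used alone so the result is easy to check by hand.

```python
import pandas as pd

# Made-up closing prices, purely for illustration.
toy = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0]})

# The first two rows have no complete 3-day window, so they become NaN.
toy["S_3"] = toy["Close"].rolling(window=3).mean()
# shift(-1) pulls tomorrow's price onto today's row; the last row becomes NaN.
toy["next_day_price"] = toy["Close"].shift(-1)

clean = toy.dropna()
print(toy)
print(clean)
```

Only the rows with both a complete moving average and a known next-day price survive, which is why the feature and target series end up shorter than the raw price series.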

Split the data into training and test data sets

In this step, we split the predictors and the output data into training and test data. The training data is used to create the linear regression model by pairing the inputs with the expected outputs.

The test data is used to estimate how well the model has been trained.

 • The first 80% of the data is used for training and the remaining data for testing

 • X_train & y_train are the training dataset

 • X_test & y_test are the test dataset

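# Fraction of the data used for training
t = .8
# Convert the fraction into the row index where the test set starts
t = int(t*len(Df))
# Slice in time order so the test set is strictly later than the training set
X_train = X[:t]
y_train = y[:t]
X_test = X[t:]
y_test = y[t:]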
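The split above is chronological: no shuffling, so the model is never trained on data that comes after the test period. As a small sketch on made-up data, the same split can be obtained with scikit-learn's train_test_split by passing shuffle=False. The names X_demo and y_demo are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up features and target with 10 ordered rows.
X_demo = pd.DataFrame({"S_3": np.arange(10.0), "S_9": np.arange(10.0) * 2})
y_demo = pd.Series(np.arange(10.0) + 1, name="next_day_price")

# Manual chronological split, as in the article.
split = int(0.8 * len(X_demo))
X_tr, X_te = X_demo[:split], X_demo[split:]
y_tr, y_te = y_demo[:split], y_demo[split:]

# Equivalent split with scikit-learn; shuffle=False keeps the time order.
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.2, shuffle=False)

print(X_tr.equals(X_tr2), X_te.equals(X_te2))
```

Leaving shuffle at its default of True would mix future rows into the training set, which tends to make a time-series model look better than it really is.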

Create a linear regression model

We will now create a linear regression model. But first, what is linear regression?

If we try to capture the mathematical relationship between the variables "x" and "y" by fitting a line through a scatter plot, so that the observations of "x" best explain the observations of "y", then the equation relating x and y is called a linear regression.

To break it down further, regression uses independent variables to explain the variation in a dependent variable. The dependent variable "y" is the variable you want to predict. The independent variables "x" are the explanatory variables you use to predict the dependent variable. The following regression equation describes this relationship:

Y = m1 * X1 + m2 * X2 + C
Gold ETF price = m1 * 3 days moving average + m2 * 9 days moving average + C
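To make "best fit" concrete, the sketch below recovers m1, m2 and C by ordinary least squares on synthetic data generated from known coefficients, and checks that LinearRegression finds the same values. The coefficients 1.5, -0.5 and 2.0 and all variable names are assumptions chosen for illustration; they are not the gold model's values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic inputs and a noiseless target built from known coefficients.
rng = np.random.default_rng(42)
x1 = rng.uniform(100, 200, size=200)
x2 = rng.uniform(100, 200, size=200)
y_syn = 1.5 * x1 - 0.5 * x2 + 2.0

# Least squares on the design matrix [x1, x2, 1]; the column of ones
# carries the constant term C.
A = np.column_stack([x1, x2, np.ones_like(x1)])
solution, *_ = np.linalg.lstsq(A, y_syn, rcond=None)
m1, m2, c = solution

# scikit-learn solves the same problem and stores the constant separately.
model = LinearRegression().fit(np.column_stack([x1, x2]), y_syn)

print(round(m1, 4), round(m2, 4), round(c, 4))
print(model.coef_, model.intercept_)
```

With real prices the target is noisy, so the fitted coefficients are estimates that minimize the squared error rather than exact recoveries.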

We then use the fit method to fit the independent and dependent variables (x and y) and generate the regression coefficients and the constant.

linear = LinearRegression().fit(X_train, y_train)
print("Linear Regression model")
print("Gold ETF Price (y) = %.2f * 3 Days Moving Average (x1) \
+ %.2f * 9 Days Moving Average (x2) \
+ %.2f (constant)" % (linear.coef_[0], linear.coef_[1], linear.intercept_))

Output:

Linear Regression model
Gold ETF Price (y) = 1.20 * 3 Days Moving Average (x1) + -0.21 * 9 Days Moving Average (x2) + 0.43 (constant)
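As a worked example of reading this equation, the sketch below plugs two moving-average values into the rounded coefficients printed above. The input values 120.0 and 118.0 are hypothetical, not taken from the data.

```python
# Rounded coefficients from the printed model output.
m1, m2, c = 1.20, -0.21, 0.43

# Hypothetical moving averages for one day (illustrative values only).
s_3, s_9 = 120.0, 118.0

# 1.20 * 120 = 144.00, -0.21 * 118 = -24.78, plus the constant 0.43
predicted = m1 * s_3 + m2 * s_9 + c
print(f"{predicted:.2f}")
```

Note that the two coefficients sum to roughly 1 (1.20 - 0.21 = 0.99), so the prediction stays close to the recent price level and leans toward the shorter average.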

Predict the gold ETF price

Now it is time to check whether the model works on the test dataset. We use the linear model built from the training dataset to predict the gold ETF price. The predict method returns the gold ETF price (y) for the given explanatory variables X.

predicted_price = linear.predict(X_test)
predicted_price = pd.DataFrame(
    predicted_price, index=y_test.index, columns=['price'])
predicted_price.plot(figsize=(10, 7))
y_test.plot()
plt.legend(['predicted_price', 'actual_price'])
plt.ylabel("Gold ETF Price")
plt.show()

The plot shows the predicted and actual prices of the gold ETF.

Now, let's use the score() function to compute the goodness of fit.

r2_score = linear.score(X[t:], y[t:])*100
float("{0:.2f}".format(r2_score))

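Output:

99.21

As you can see, the model's R-squared is 99.21%. For a linear regression with an intercept scored on its own training data, R-squared lies between 0 and 100%; on held-out test data, as here, score() can also return a negative value when the model predicts worse than the mean of y. A score close to 100% indicates that the model explains the gold ETF price well.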
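R-squared compares the model's squared errors with those of always predicting the mean of the actual values. The following sketch computes it by hand on a few made-up numbers and checks the result against scikit-learn's r2_score; the arrays are illustrative, not gold prices.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up actual and predicted values, for illustration only.
y_true = np.array([100.0, 102.0, 101.0, 105.0, 107.0])
y_pred = np.array([101.0, 101.5, 102.0, 104.0, 106.5])

# R^2 = 1 - (sum of squared residuals) / (total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# A deliberately poor prediction: a constant far away from the data.
y_bad = np.full_like(y_true, 120.0)
r2_bad = r2_score(y_true, y_bad)

print(r2_manual, r2_score(y_true, y_pred))
print(r2_bad)
```

The second score is negative because the constant prediction has larger squared errors than simply predicting the mean.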

Plot cumulative returns

Let's calculate the cumulative returns of this strategy to analyze its performance.

The cumulative returns are calculated as follows:

•   Generate the daily percentage change of the gold price

•   Create a buy signal, represented by "1", when the predicted price for the next day is higher than the predicted price for the current day; otherwise the signal is "0" (no position)

•   Calculate the strategy returns by multiplying the daily percentage change by the trading signal

•   Finally, plot the cumulative returns
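To make these steps concrete, here is a minimal sketch on a hand-made five-day series. The prices and predictions are invented for illustration; the column names mirror the ones used in the article's code.

```python
import numpy as np
import pandas as pd

# Invented prices and model predictions, for illustration only.
toy = pd.DataFrame({
    "price": [100.0, 102.0, 101.0, 103.0, 104.0],
    "predicted_price_next_day": [101.0, 101.5, 102.5, 102.0, 103.0],
})

# Step 1: the return earned from today's close to tomorrow's close.
toy["gold_returns"] = toy["price"].pct_change().shift(-1)
# Step 2: signal is 1 when the prediction rose versus the previous prediction.
toy["signal"] = np.where(
    toy["predicted_price_next_day"].shift(1) < toy["predicted_price_next_day"],
    1, 0)
# Step 3: the strategy earns the next-day return only when it holds a position.
toy["strategy_returns"] = toy["signal"] * toy["gold_returns"]
# Step 4: compound the daily returns into a cumulative growth factor.
toy["cumulative"] = (toy["strategy_returns"] + 1).cumprod()

print(toy)
```

The first row has no previous prediction to compare against, so its signal is 0, and the last row has no next-day return, so its strategy return is NaN.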

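gold = pd.DataFrame()
gold['price'] = Df[t:]['Close']
# Select the single 'price' column explicitly from the predictions
gold['predicted_price_next_day'] = predicted_price['price']
gold['actual_price_next_day'] = y_test
# Return earned from today's close to the next day's close
gold['gold_returns'] = gold['price'].pct_change().shift(-1)
# Signal is 1 (buy) when the prediction rose versus the previous prediction
gold['signal'] = np.where(gold.predicted_price_next_day.shift(1) < gold.predicted_price_next_day,1,0)
gold['strategy_returns'] = gold.signal * gold['gold_returns']
# Compound the daily strategy returns and plot them
((gold['strategy_returns']+1).cumprod()).plot(figsize=(10,7),color='g')
plt.ylabel('Cumulative Returns')
plt.show()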

The output is a chart of the strategy's cumulative returns.

We will also calculate the Sharpe ratio:

sharpe = gold['strategy_returns'].mean()/gold['strategy_returns'].std()*(252**0.5)
'Sharpe Ratio %.2f' % (sharpe)

The output is as follows:

'Sharpe Ratio 1.06'
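The calculation above can be wrapped in a small helper. This is a sketch under two assumptions: the risk-free rate is taken as zero, as in the one-line formula above, and there are 252 trading days per year. The function name annualized_sharpe and the sample returns are hypothetical.

```python
import numpy as np
import pandas as pd

def annualized_sharpe(daily_returns: pd.Series, periods_per_year: int = 252) -> float:
    """Mean over standard deviation of daily returns, scaled to a year.

    Assumes a zero risk-free rate; NaN returns are ignored.
    """
    r = daily_returns.dropna()
    return float(r.mean() / r.std() * np.sqrt(periods_per_year))

# Made-up daily returns, for illustration only.
returns = pd.Series([0.01, -0.005, 0.002, 0.007, -0.003, np.nan])
print(f"Sharpe Ratio {annualized_sharpe(returns):.2f}")
```

Multiplying by the square root of 252 converts a daily ratio into an annual one, on the assumption that daily returns are independent.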

Forecast daily prices

You can use the following code to predict the gold price and generate a trading signal that tells us whether to buy GLD or hold no position:

import datetime as dt
current_date = dt.datetime.now()
data = yf.download('GLD', '2008-06-01', current_date, auto_adjust=True)
data['S_3'] = data['Close'].rolling(window=3).mean()
data['S_9'] = data['Close'].rolling(window=9).mean()
data = data.dropna()
data['predicted_gold_price'] = linear.predict(data[['S_3', 'S_9']])
data['signal'] = np.where(data.predicted_gold_price.shift(1) < data.predicted_gold_price,"Buy","No Position")
data.tail(1)[['signal','predicted_gold_price']].T

The output is a small table with the latest trading signal and the predicted gold price.

