This project aims to forecast the exchange rates of ten emerging-market currencies (USD/BRL, USD/MXN, USD/COP, USD/ARS, USD/NGN, USD/PHP, USD/TRY, USD/RUB, USD/INR, and USD/CNY) by applying machine learning techniques such as Long Short-Term Memory (LSTM) networks, the Extreme Gradient Boosting (XGBoost) regressor, and the PyCaret library. The experiments with these models indicate that each currency should have its own prediction model rather than a single model generalized across all currencies.
Forecasting the exchange rate is a significant concern when comparing the economic development of any nation. The shapes and trends of different foreign exchange markets provide valuable insights into a country's macroeconomic conditions, which any public policymaker, international company, investor, or individual consumer could benefit from. However, predicting currency exchange rates is a difficult task because many developing countries face unstable performance and hectic political or economic conditions. Accordingly, this study contributes to academic research targeted at emerging currencies with highly volatile economic conditions.
# Source: https://finance.yahoo.com/currencies
import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler

start_date = '2020-01-01'
end_date = '2023-07-01'
# Retrieve financial market data for each currency symbol
Currencies = ['USDBRL', 'USDMXN', 'USDCOP', 'USDARS', 'USDNGN', 'USDPHP', 'USDTRY', 'USDRUB', 'USDINR', 'USDCNY']
currencies_names = ['Brazilian Real - USDBRL', 'Mexican Peso - USDMXN', 'Colombian Peso - USDCOP', 'Argentine Peso - USDARS', 'Nigerian Naira - USDNGN',
                    'Philippine Peso - USDPHP', 'Turkish Lira - USDTRY', 'Russian Ruble - USDRUB', 'Indian Rupee - USDINR', 'Chinese Yuan - USDCNY']
dataframes = []
for i in Currencies:
    data_X = yf.download(i + '=X', start=start_date, end=end_date)
    dataframes.append(data_X)
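As a quick check that the download covered the requested window, a short illustrative loop (not part of the original code) can report the row count and date range of each retrieved frame:
# Sanity check (illustrative): rows and date range per currency
for symbol, frame in zip(Currencies, dataframes):
    print(f"{symbol}: {len(frame)} rows, {frame.index.min().date()} to {frame.index.max().date()}")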
# Define cleaning functions:
# Add a new column with the percentage change of Close between rows
def add_pct_change(df):
    df.drop(['Close_Change_Pct'], axis=1, inplace=True, errors='ignore')  # Drop a stale column if present
    df['Close_Change_Pct'] = df['Close'].pct_change()
    return df.sort_values(by='Close_Change_Pct', ascending=True).head(10)  # Show the 10 largest downward changes

# Where the Close change pct is <= -70%, replace Close with the previous value
def replace_close(df):
    df['Close'] = np.where(df['Close_Change_Pct'] <= -0.7, df['Close'].shift(1), df['Close'])
# Feature Creation: Create time-series features per period
def Feature_Creation(df):
    df.drop(['CloseScaled', 'DayOfWeek', 'Month', 'Quarter', 'Year', 'Prediction'], axis=1, inplace=True, errors='ignore')
    df['CloseScaled'] = MinMaxScaler().fit_transform(df.filter(['Close']).values)
    df['DayOfWeek'] = df.index.dayofweek
    df['Month'] = df.index.month
    df['Quarter'] = df.index.quarter
    df['Year'] = df.index.year
    return df
# Apply cleaning functions
for df in dataframes:
    add_pct_change(df)
    replace_close(df)
    replace_close(df)  # A second pass backfills consecutive anomalous drops
    Feature_Creation(df)
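To make the effect of the double replace_close pass concrete, a toy example (hypothetical closing values, not real market data) shows two consecutive corrupted quotes being backfilled from the last valid close:
# Toy illustration with hypothetical values: two consecutive drops of at least 70%
toy = pd.DataFrame({'Close': [100.0, 101.0, 20.0, 5.0, 102.0]},
                   index=pd.date_range('2022-01-03', periods=5, freq='B'))
add_pct_change(toy)   # flags the two corrupted rows (-80% and -75%)
replace_close(toy)    # first pass:  [100.0, 101.0, 101.0, 20.0, 102.0]
replace_close(toy)    # second pass: [100.0, 101.0, 101.0, 101.0, 102.0]
print(toy['Close'].tolist())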
# Visualize Cleaned Data
# Create a figure with 10 subplots arranged in a 2x5 grid
fig, axes = plt.subplots(2, 5, figsize=(20, 10))
plt.suptitle("Historic Closing Exchange Rates - Cleaned Data").set_y(1)
for df, ax, currency, name in zip(dataframes, axes.flatten(), Currencies, currencies_names):
    ax.plot(df['Close'])
    plt.gcf().autofmt_xdate()
    ax.set_title(name)
    ax.grid(True)
plt.tight_layout()
# Visualize Feature / Target Relationship
# Boxplots of the closing rate by year, one panel per currency
fig, axes = plt.subplots(2, 5, figsize=(20, 10))
plt.suptitle("Closing Rate by Year").set_y(1)
for df, ax, currency, name in zip(dataframes, axes.flatten(), Currencies, currencies_names):
    sns.boxplot(y='Close', x='Year', data=df, ax=ax, orient='v').set_title(name)
    ax.set_xticklabels(ax.get_xticklabels(), rotation=90)
    ax.grid(True)
plt.tight_layout()
For the XGBoost regressor, the hyperparameters controlling the model are a learning rate of 0.01, 1,000 gradient-boosted trees, and 50 early-stopping rounds. Once the hyperparameters have been specified, the model is fitted to the training data, which consists of the day of the week, month, quarter, year, and the scaled closing rate for the first 70% of the historical data; the model is then used to make predictions on the remaining 30% held out as test data.
# Define XGB functions:
import xgboost as xgb
from sklearn.metrics import mean_squared_error

def XGB_Model(data, Currency):
    # Apply cleaning functions on data
    add_pct_change(data)
    replace_close(data)
    replace_close(data)
    Feature_Creation(data)
    split_date = int(len(data) * 0.7)  # Index separating the first 70% (train) from the last 30% (test)
    train_df = pd.DataFrame(data.CloseScaled.iloc[:split_date])  # Train on the first 70% of dates
    test_df = pd.DataFrame(data.CloseScaled.iloc[split_date:])   # Test on the 30% after the split
    X_train_df = data[['DayOfWeek', 'Month', 'Quarter', 'Year']].iloc[:split_date]
    y_train_df = data[['CloseScaled']].iloc[:split_date]
    X_test_df = data[['DayOfWeek', 'Month', 'Quarter', 'Year']].iloc[split_date:]
    y_test_df = data[['CloseScaled']].iloc[split_date:]
    reg = xgb.XGBRegressor(n_estimators=1000, early_stopping_rounds=50, learning_rate=0.01)
    reg.fit(X_train_df, y_train_df, eval_set=[(X_train_df, y_train_df), (X_test_df, y_test_df)], verbose=100)
    test_df['Prediction'] = reg.predict(X_test_df)  # Add the predictions in a new column
    # Merge the predictions with the initial df
    data.drop(['Prediction', 'Prediction_x', 'Prediction_y'], axis=1, inplace=True, errors='ignore')
    data = data.merge(test_df['Prediction'], how='left', left_index=True, right_index=True)
    RMSE = np.sqrt(mean_squared_error(test_df['CloseScaled'], test_df['Prediction']))
    print(f'{Currency} - RMSE Score on Test Set: {RMSE:0.3f}')  # Should match validation_1-rmse at the best iteration
    # Optimized visuals
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 5))
    plt.suptitle("XGBRegressor Model").set_y(1)
    axes = [ax1, ax2]
    titles = [f'Scaled and Prediction Data - {Currency}', f'Zoom in Test Raw and Prediction Data - {Currency}']
    data_to_plot = [data[['CloseScaled', 'Prediction']], data.loc[data.index >= data.index[split_date], ['CloseScaled', 'Prediction']]]
    for ax, title, data_to in zip(axes, titles, data_to_plot):
        data_to['CloseScaled'].plot(ax=ax, title=title)
        data_to['Prediction'].plot(ax=ax, style='--', color='red').grid(True)
        ax.axvline(data.index[split_date], color='grey', ls='--')
        ax.legend(['Raw Data', 'Prediction Data'])
    plt.tight_layout()
    return

# Perform XGB across all currencies
for df, Currency in zip(dataframes, Currencies):
    XGB_Model(df, Currency)
The LSTM implementation is initialized as a sequential model, meaning the layers are stacked one after another. The first LSTM layer has 50 units and returns the full sequence of outputs; its input shape corresponds to a 60-step look-back window over the scaled closing rate. The second LSTM layer has another 50 units but returns only the last output. These are followed by two dense layers with 25 and 1 units, respectively. The model is compiled with the Adam optimizer and the mean squared error loss function, and it is trained on the first 70% of the historical series with a batch size of 32 and a single epoch to keep training fast.
# Define LSTM functions:
from math import sqrt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def LSTM_Model(data, Currency):
    # Apply cleaning functions on data
    add_pct_change(data)
    replace_close(data)
    replace_close(data)
    Feature_Creation(data)
    # Training Dataset
    split_date = int(len(data) * 0.7)
    train = np.array(data.CloseScaled.iloc[:split_date])  # Train on the first 70% of dates
    X_train = []
    y_train = []
    for i in range(60, split_date):  # 60-step look-back windows
        X_train.append(train[i-60:i])
        y_train.append(train[i])
    X_train, y_train = np.array(X_train), np.array(y_train)  # Convert the training data into arrays
    X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))  # Reshape to (samples, timesteps, features)
    # Testing Dataset
    test = np.array(data.CloseScaled.iloc[split_date:])  # Test on the 30% after the split
    X_test = []
    y_test = data.Close.iloc[split_date+60:]  # Unscaled values from the original data
    for i in range(60, len(test)):
        X_test.append(test[i-60:i])
    X_test = np.array(X_test)  # Convert the test data into an array
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))  # Reshape to (samples, timesteps, features)
    # Create the LSTM model
    seq = Sequential()  # Initialize the RNN
    seq.add(LSTM(50, return_sequences=True, input_shape=(X_train.shape[1], 1)))  # First LSTM layer
    seq.add(LSTM(50, return_sequences=False))  # Second LSTM layer
    seq.add(Dense(25))
    seq.add(Dense(1))
    seq.compile(optimizer='adam', loss='mean_squared_error')  # Compile the model
    seq.fit(X_train, y_train, batch_size=32, epochs=1)  # Train the model; epochs=10 takes about 10 minutes, 100 takes too long
    # Get model predicted values
    scaler = MinMaxScaler()
    scaler.fit(data.filter(['Close']).values)
    pred = seq.predict(X_test)
    pred = scaler.inverse_transform(pred)  # Inverse-scale predictions back to original values
    # Calculate the root mean squared error on the test data
    mse_seq = mean_squared_error(y_test, pred)
    rmse_seq = sqrt(mse_seq)
    print(f'{Currency} RMSE: {rmse_seq:.2f}')
    # Split the non-scaled Close data into train and valid dataframes
    train_df = pd.DataFrame(data.Close.iloc[:split_date+60])  # Training portion
    valid_df = pd.DataFrame(data.Close.iloc[split_date+60:])  # Validation portion after the split
    valid_df['Prediction'] = pred  # Add a Prediction column with the inverse-scaled values
    # Plot the Training and Testing data sets and zoom in
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 5))
    plt.suptitle("LSTM Sequential Model").set_y(1)
    train_df.plot(ax=ax1, label='Training Set', title=f'Raw and Prediction Data - {Currency}')
    valid_df.plot(ax=ax1, label=['Valid Data', 'Prediction Data']).grid(True)
    ax1.axvline(data.index[split_date+60], color='grey', ls='--')
    ax1.legend(['Training Data', 'Valid Data', 'Prediction Data'])
    valid_df['Close'].plot(ax=ax2, color='darkseagreen', title=f'Zoom in Test Raw and Prediction Data - {Currency}')
    valid_df['Prediction'].plot(ax=ax2, style='--', color='red').grid(True)
    ax2.axvline(data.index[split_date+60], color='grey', ls='--')
    ax2.legend(['Valid Data', 'Prediction Data'])
    plt.tight_layout()
    return

# Perform LSTM across all currencies
for df, Currency in zip(dataframes, Currencies):
    LSTM_Model(df, Currency)
An alternative experiment uses a Support Vector Classifier (SVC) to make predictions on the dataset. This attempt uses the Open - Close and High - Low differences as features and then calculates the daily returns, strategy returns, cumulative returns, and cumulative strategy returns for each currency exchange rate. The daily returns are calculated as the percentage change in the closing rate. The strategy returns are obtained by multiplying the predictions (shifted by one time step) by the daily returns. The cumulative returns and cumulative strategy returns are the cumulative sums of the daily returns and strategy returns, respectively.
# Define SVC functions:
from sklearn.svm import SVC

def SVC_Model(data):
    # Apply cleaning functions
    add_pct_change(data)
    replace_close(data)
    replace_close(data)
    # Create independent and dependent variables
    data['High-Low'] = data['High'] - data['Low']
    data['Open-Close'] = data['Open'] - data['Close']
    X = data[['Open-Close', 'High-Low', 'Close']]  # Independent variables
    # Create signals: 1 if tomorrow's close is greater than today's, 0 otherwise
    y = np.where(data.Close.shift(-1) > data.Close, 1, 0)  # Target variable
    # Training Dataset
    split_date = int(len(data) * 0.9)
    X_train = X[:split_date]
    y_train = y[:split_date]
    # Testing Dataset
    X_test = X[split_date:]
    y_test = y[split_date:]
    # Create the SVC model
    svc = SVC()
    svc.fit(X_train[['Open-Close', 'High-Low']], y_train)  # Train the model
    svc_score_train = svc.score(X_train[['Open-Close', 'High-Low']], y_train)  # Accuracy on the training set
    svc_score_test = svc.score(X_test[['Open-Close', 'High-Low']], y_test)  # Accuracy on the test set
    data['Predictions'] = svc.predict(X[['Open-Close', 'High-Low']])  # Model predictions
    data['Return'] = data['Close'].pct_change()  # Daily returns
    data['Strat_Return'] = data['Predictions'].shift(1) * data['Return']  # Strategy returns
    data['Cumul_Return'] = data['Return'].cumsum()  # Cumulative returns
    data['Cumul_Strat'] = data['Strat_Return'].cumsum()  # Cumulative strategy returns
    return
# Perform SVC across all currencies
fig, axes = plt.subplots(2, 5, figsize=(20, 10))
plt.suptitle("SVC Model").set_y(1)
for df, ax, currency, name in zip(dataframes, axes.flatten(), Currencies, currencies_names):
    SVC_Model(df)
    ax.plot(df['Cumul_Return'], label='Currency Returns')
    ax.plot(df['Cumul_Strat'], label='Strategy Returns')
    plt.gcf().autofmt_xdate()
    ax.set_title(name)
    ax.grid(True)
    ax.legend()
plt.tight_layout()
# Define RFR functions:
from sklearn.ensemble import RandomForestRegressor

def RFR_Model(data, Currency):
    # Split the dataset
    X = data[['Open', 'High', 'Low']]
    X_train = X[:len(data)-1]  # All rows except the last one
    X_test = X.tail(1)  # The last row only
    y = data['Close']
    y_train = y[:len(data)-1]  # All rows except the last one
    y_test = y.tail(1)  # The last row only
    # Create the Random Forest Regressor model
    RFR = RandomForestRegressor()
    # Train the model
    RFR.fit(X_train, y_train)
    # In-sample predictions on the training data (not used further)
    predictions = RFR.predict(X_train)
    # Predict the held-out last day
    prediction = RFR.predict(X_test)
    print(Currency, 'prediction:')
    print('RFR score is:', (RFR.score(X_train, y_train) * 100).round(3), '%')
    print('RFR predicts the last day to be:', prediction.round(3))
    print('Actual value is:', y_test.values[0].round(3))  # The last value of the imported data
    print('Difference between actual and predicted is:', (y_test.values[0] - prediction).round(3))
    print()
    return
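RFR_Model is defined above but not invoked in this excerpt; an illustrative driver loop following the same pattern used for the other models might look like this:
# Perform RFR across all currencies (illustrative, mirroring the other model loops)
for df, Currency in zip(dataframes, Currencies):
    RFR_Model(df, Currency)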
The main PyCaret functions applied in this experiment are the following. The setup function initializes the regression experiment on the training data (the first 70% of the historical data), with the target feature set to the future price, i.e., the closing rate shifted ten days ahead. The compare_models function then trains every regressor available in the PyCaret library with default hyperparameters and evaluates its performance; the metrics used to compare results are MAE, MSE, RMSE, R2, RMSLE, and MAPE, ranked by RMSE. The create_model function retrains the model identified as the most accurate by RMSE and evaluates it with cross-validation. Finally, predict_model is applied to the selected model, passing the test dataset to generate predictions.
# Define Pycaret functions:
from sklearn.model_selection import train_test_split
from pycaret.regression import setup, compare_models, create_model, predict_model

def Comparing_Model(data, Currency):
    # Apply cleaning functions on data
    add_pct_change(data)
    replace_close(data)
    replace_close(data)
    future_days = 10  # Number of days into the future to predict
    data['Future_Price'] = data['Close'].shift(-future_days)  # Target feature: the closing rate shifted 'future_days' ahead
    X = data[['Close', 'Future_Price']]
    X = X[:len(data)-future_days]  # All rows except the last 'future_days' (which have no target)
    y = data['Future_Price']
    y = y[:-future_days]  # All rows except the last 'future_days'
    split_date = int(len(data) * 0.7)  # Change the ratio to adjust the training size
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=split_date, random_state=0, shuffle=False)
    print('\n \033[1m' + Currency + '\033[0m')
    regression_setup = setup(data=X_train, target='Future_Price', session_id=123)  # Initialize the experiment
    comparing = compare_models(sort='RMSE')  # Compare all models, ranked by RMSE
    best_model = create_model(comparing)  # Retrain the best model for this currency with cross-validation
    unseen_predictions = predict_model(best_model, data=X_test)
    data = pd.merge(data, unseen_predictions['prediction_label'], how='left', left_index=True, right_index=True)
    # Visualize Raw / Predictions and Zoom in
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 5))
    plt.suptitle("Best Model").set_y(1)
    data['Close'].plot(ax=ax1, title=f'Raw and Prediction Datasets - {Currency}')
    data[['prediction_label']].plot(ax=ax1, style='--', color='red').grid(True)
    ax1.axvline(data.index[split_date], color='grey', ls='--')
    ax1.legend(['Raw Data', 'Prediction Data'])
    data.loc[data.index >= data.index[split_date]]['Close'].plot(ax=ax2, title=f'Zoom in Test Raw and Prediction Dataset - {Currency}')
    data.loc[data.index >= data.index[split_date]]['prediction_label'].plot(ax=ax2, style='--', color='red').grid(True)
    ax2.axvline(data.index[split_date], color='grey', ls='--')
    ax2.legend(['Raw Data', 'Prediction Data'])
    plt.tight_layout()
    return

# Perform Pycaret model comparison across all currencies
for df, Currency in zip(dataframes, Currencies):
    Comparing_Model(df, Currency)
plt.show()
From the experiments described in the previous section, it can be concluded that no single model fits all currencies; instead, a dedicated model should be fitted to each currency to predict its historical trend. The results obtained with the XGBoost regressor were not satisfactory because it struggles with non-stationary data such as exchange rates, which makes it difficult to train a model that generalizes well to new data. The advantage of the LSTM is that the inputs fed to the network not only pass through several LSTM layers but also propagate through time within each LSTM cell, allowing a thorough processing of the inputs at every time step. Unlike traditional time series models such as ARIMA, RNNs can learn nonlinearities, and specialized nodes such as LSTM cells are even better at this. However, the LSTM is prone to overfitting, which can be mitigated with techniques such as regularization, early stopping, and dropout. From the PyCaret experiment, the models selected most frequently are the Light Gradient Boosting Machine (LightGBM) and Linear Regression. LightGBM is an open-source gradient-boosting framework based on decision trees that can be used for time series forecasting; it is similar to XGBoost but designed to be more efficient and to reduce memory usage. Even so, these models may be affected by the temporal dependence, non-stationarity, and seasonality present in the data.
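As a sketch of the mitigation techniques mentioned above (not part of the original experiments), the network built inside LSTM_Model could be extended with dropout layers and Keras early stopping; X_train and y_train here refer to the window arrays constructed in that function:
# Hypothetical LSTM variant with dropout and early stopping (not run in this project)
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dropout

seq = Sequential()
seq.add(LSTM(50, return_sequences=True, input_shape=(60, 1)))
seq.add(Dropout(0.2))  # Randomly drop 20% of units to reduce overfitting
seq.add(LSTM(50, return_sequences=False))
seq.add(Dropout(0.2))
seq.add(Dense(25))
seq.add(Dense(1))
seq.compile(optimizer='adam', loss='mean_squared_error')
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
seq.fit(X_train, y_train, validation_split=0.1, batch_size=32, epochs=50, callbacks=[early_stop])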
For further analysis and discussion, the full printed version of this project is available upon request.