This repository has been archived by the owner on Dec 14, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 70
/
Copy pathworkflow.Rmd
236 lines (145 loc) · 9.15 KB
/
workflow.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
---
title: "PACTA_analysis workflow"
author: "Clare"
date: "08/02/2021"
output: github_document
---
## Aim
The aim of this document is to outline the workflow for running a PACTA project.
## PACTA_analysis Repo
The PACTA_analysis repo contains the primary PACTA functions required to run a portfolio alignment assessment for investors. The current repository is built in a way that the functions work for both an online application - ie for the P2020 projects and platform; as well as the original application which is running a project or generating results offline. The platform can be found at platform.transitionmonitor.com and is maintained by Wim at Constructiva (w.schuermann@constructiva.de )
Whether online or offline, running an analysis has three components
1. Portfolio Cleaning: Clean the input portfolio and merge in financial data, to categorise each holding and identify whether it's equity (EQ) or corporate bonds (CB)
2. Run Analysis: Merge in ald + scenarios, and group results at company, portfolio and regional level
3. Results: Present the results in a clear output format
Both the online and offline scripts can be used to generate reports offline. The difference is that the offline set of scripts is set up to allow the generation of results for multiple portfolios and investors in one run, whereas online is focussed on quickly generating a report for one portfolio at once. While all steps maintain the grouping by investor and portfolio, to allow for this, it hasn't been optimised for running multiple portflios as once. The online tool however has been designed for the interactive reports, whereas the offline scripts have primarily been used to generate the static pdf results.
### Portfolio Cleaning
This occurs in web_tool_script_1.R (online) or 2_project_input_analysis.R (offline).
Both start with reading in the necessary files. This includes some file cleaning, that should be shifting to the data_preprataion repository.
## Difference between online and offline
There are two sets of scripts that call the functions required for both offline or online functionality.
Online: The "web_tool_script" set from 1 to 3 runs through the components of the analysis
Offline: The 1,2,3,4 scripts run similarly, however have an initialisation script that should be run before each online in order to set up project pathways and naming. This allows for added flexibility as you may online want to be focussing on one of the components of the analysis. Ie you've already generated results and just want to create output files.
# Running a project
## Offline
This steps through the functions required to run a project using the original scripts. The aim of this script is to define project paths and set parameters used in the following scripts.
- Set project name and pathways to data in the first chunk of this code
- Run first chunk
- Get portfolio file with format (Investor.Name, Portfolio.Name, ISIN, MarketValue, Currency, NumberofShares), saved as ".csv";
- Place in the project folder
- Check and edit the parameter file to meet the project requirements
Crticial Set Up Components
- Parameter file (ProjectParameters_GENERAL)
- Portfolio file ("*_Input.csv")
- Data files (multiple)
- File paths to
1. Data location
2. Project location (inputs and outputs)
Necessary Files:
File List Check for External Data Requirements:
security_financial_data.rda
consolidated_financial_data.rda
debt_financial_data.rda
bonds_ald_scenario.rda
equity_ald_scenario.rda
masterdata_ownership_datastore.rda
masterdata_debt_datastore.rda
Optional Files:
fund_data_2019Q4.rda (or relevant time stamp)
revenue_data_member_ticker.rda (if not available, set has_revenue = FALSE in parameter file)
company_emissions.rda (if not available, set inc_emissionfactors = FALSE in parameter file)
average_sector_intensity.rda (if not available, set inc_emissionfactors = FALSE in parameter file)
### Set up parameters and paths
```{r 1_portfolio_check_initialisation, echo = FALSE}
# TODO: get to the stage that this script runs alone
# Could we define parameters here in this case?
# source("1_portfolio_check_initialisation.R")
# project_name <- "TestProject"
# twodii_internal <- FALSE
# # TRUE or FALSE: TRUE means that the code is running on a 2dii laptop with dropbox connection
#
# #####################################################################
# ### ONLY FOR EXTERNAL PROJECTS (twodii_internal <- FALSE):
# # Variables must exist for internal projects
# project_location_ext <- "C:/Users/clare/Desktop/ExternalTest"
# data_location_ext <- "C:/Users/clare/Desktop/PACTA/pacta-data/2019Q4"
# ----------------------------------------------------------------------------
devtools::load_all()
use_r_packages()
## Project Initialisation
working_location <- paste0(getwd(), "/")
source("0_portfolio_test.R")
source("0_graphing_functions.R")
source("0_reporting_functions.R")
source("0_portfolio_input_check_functions.R")
source("0_global_functions.R")
source("0_sda_approach.R")
project_name <- "TestProject"
twodii_internal <- TRUE
# TRUE or FALSE: TRUE means that the code is running on a 2dii laptop with dropbox connection
#####################################################################
### ONLY FOR EXTERNAL PROJECTS (twodii_internal <- FALSE):
# Variables must exist for internal projects
project_location_ext <- "C:/Users/clare/Desktop/ExternalTest"
data_location_ext <- "C:/Users/clare/Desktop/TestData"
#####################################################################
create_project_folder(project_name, twodii_internal, project_location_ext)
set_project_paths(project_name, twodii_internal, project_location_ext)
# Move sample files into the correct locations if they don't already exist.
# To do: copy in correct sample parameter files
copy_files(project_name)
# The ProjectParameter file is based off of the ProjectParameters_GENERAL used online for a user not taking part in a COP project
set_project_parameters(file.path(par_file_path,paste0("ProjectParameters.yml")))
# To do: find and add additional parameters
analysis_inputs_path <- set_analysis_inputs_path(twodii_internal, data_location_ext, dataprep_timestamp)
# ----------------------------------------------------------------------------
```
TO DO:
- Improve the sourcing of files. Setting the internal flag to false to allow for defining this is one solution here.
- Synchronise the parameter files with the online parameter files
### Portfolio input analysis
Aims:
- Reads in necessary data files and cleans them
To Run:
- You need a portfolio to be placed in the folder ```{r} fs::path(project_location_ext, project_name, "20_Raw_Inputs")```. There are sample files found in this repository in the folder "sample_files"
Note:
- Fund file optional. This adds additional holdings to the portfolio (adds in the details of the companies owned within a fund). If the file for the current time stamp is there this should be read in. Often this file is project specific as the file size can be very large.
- this is currently reading in the "cleaned" files found in the pacta-data repo. This workflow needs to be cleaned up so that these "cleaned" files can be created for any project.
```{r 2_project_input_analysis, echo = FALSE}
# this depends on having the "cleaned" fst files saved in the folder analysis_inputs_path, "cleaned_files"
source("2_project_input_analysis.R")
```
TO DO:
- move cleaning to data preparation
### Run Analysis
This step merges in the ald_scen file to the specific portfolio, and aggregates at the different levels of granularity required for the results - portfolio level and company level.
```{r echo=FALSE}
source("3_run_analysis.R")
```
There are options as to where to proceed from here. From this point there are options:
/item Run the stress testing code
/item Print static graphics and the pdf report (older code)
/item Print the interactive report (html) and executive summary (pdf)
This is where there is a different in workflow between the online and offline case.
The offline workflow directs a user to the pdf code that we've been using recently. There should be little stopping a user from printing a html report, however the script to do this may need to be created.
#### SDA Approach
The SDA approach is a methodology that is used for "Other" Sectors that do not have clear technology roadmaps, such as Steel, Cement, and Aviation but do have emission factor models.
This is just a test really to understand how this code works as we may want to incorporate and not lose this work that was done by Vincent.
```{r}
# msci_world <- read_rds(file.path(data_location_ext, "Indices_equity_portfolio.rda")) %>%
# filter(portfolio_name == "iShares MSCI World ETF")
#
# port_all_eq_sda <- sda_portfolio_target(market = msci_world,
# portfolio = port_all_eq,
# scenario = "ETP2017_B2DS",
# geography = "Global",
# ald_sector = c("Cement","Steel", "Aviation"),
# start_year = start_year,
# target_year = 2025)
```
## Results
```{r }
# source("5_interactive_report.R")
```
## QA
There is a QA script written by Jacob Kastl that should also be included here.