Portfolio optimization is an important topic in finance. Modern portfolio theory (MPT) states that investors are risk averse: given a level of risk, they will choose the portfolio that offers the highest return. To find such portfolios, we need to optimize the portfolio weights.
Downloading data
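First, let's load our packages.
library(tidyquant) # To download the data
library(plotly) # To create interactive charts
library(timetk) # To manipulate the data series
library(tidyverse) # For spread(), gather(), fct_reorder() and ggplot(); loaded explicitly in case tidyquant does not attach it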
Next, let's select a few stocks to build our portfolios.
We will choose the following 5 stocks:
 Apple Inc (AAPL)
 Amazon (AMZN)
 Netflix (NFLX)
 Exxon Mobil (XOM)
 AT&T (T)
Let's download the price data.
tick <- c('AMZN', 'AAPL', 'NFLX', 'XOM', 'T')
price_data <- tq_get(tick,
                     from = '2014-01-01',
                     to = '2018-05-31',
                     get = 'stock.prices')
Next we will calculate the daily returns for these stocks. We will use the logarithmic returns.
log_ret_tidy <- price_data %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = 'daily',
               col_rename = 'ret',
               type = 'log')
Let's look at the first few rows.
head(log_ret_tidy)
## # A tibble: 6 x 3
## # Groups:   symbol [1]
##   symbol date           ret
##   <chr>  <date>       <dbl>
## 1 AMZN   2014-01-02 0
## 2 AMZN   2014-01-03 0.00385
## 3 AMZN   2014-01-06 0.00711
## 4 AMZN   2014-01-07 0.0111
## 5 AMZN   2014-01-08 0.00973
## 6 AMZN   2014-01-09 0.00227
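To make the log-return calculation concrete, here is a minimal base-R sketch on a made-up price vector (the prices are illustrative and not taken from the downloaded data). It shows that a daily log return is the difference of consecutive log prices, and that daily log returns add up to the log return over the whole period.

```r
# Hypothetical adjusted closing prices for four consecutive days
prices <- c(100, 102, 101, 105)

# Daily log returns: log(P_t / P_{t-1}) = diff(log(P))
log_ret <- diff(log(prices))
print(round(log_ret, 5))

# Log returns are additive over time: their sum equals
# the log return over the whole period
total_log_ret <- log(prices[length(prices)] / prices[1])
print(all.equal(sum(log_ret), total_log_ret))
```

This additivity is one common reason for preferring log returns when averaging over many days.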
As you can see, this data is in tidy (long) format. We will use the spread() function to convert it to a wide format, and then convert it into a time series object using the tk_xts() function.
log_ret_xts <- log_ret_tidy %>%
  spread(symbol, value = ret) %>%
  tk_xts()
## Warning in tk_xts_.data.frame(data = data, select = select, date_var =
## date_var, : Non-numeric columns being dropped: date
## Using column `date` for date_var.
head(log_ret_xts)
##                   AAPL        AMZN         NFLX            T
## 2014-01-02 0.000000000 0.000000000 0.0000000000 0.0000000000
## 2014-01-03 0.022210680 0.003851917 0.0007714349 0.0043010588
## 2014-01-06 0.005438071 0.007113316 0.0097694303 0.0045870163
## 2014-01-07 0.007177294 0.011115982 0.0574349094 0.0002859575
## 2014-01-08 0.006313210 0.009725719 0.0043791809 0.0072749741
## 2014-01-09 0.012852410 0.002266707 0.0116217990 0.0206557460
##                    XOM
## 2014-01-02 0.000000000
## 2014-01-03 0.002409058
## 2014-01-06 0.001506219
## 2014-01-07 0.014048946
## 2014-01-08 0.003270542
## 2014-01-09 0.009775460
This is better for our purpose.
Next, let's calculate the mean daily returns for each asset.
mean_ret <- colMeans(log_ret_xts)
print(round(mean_ret, 5))
##    AAPL    AMZN    NFLX       T     XOM
## 0.00085 0.00127 0.00173 0.00015 0.00004
Next we will calculate the covariance matrix for all these stocks. We will annualize it by multiplying by 252.
cov_mat <- cov(log_ret_xts) * 252
print(round(cov_mat, 4))
##        AAPL   AMZN   NFLX      T    XOM
## AAPL 0.0523 0.0238 0.0268 0.0089 0.0127
## AMZN 0.0238 0.0869 0.0489 0.0078 0.0129
## NFLX 0.0268 0.0489 0.1759 0.0081 0.0133
## T    0.0089 0.0078 0.0081 0.0260 0.0110
## XOM  0.0127 0.0129 0.0133 0.0110 0.0340
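As a rough illustration of what the annualized covariance matrix contains, the sketch below re-enters the rounded matrix printed above (so the results are approximate) and derives each stock's annualized volatility and the correlation matrix. The name cov_mat_rounded is introduced only for this example.

```r
assets <- c("AAPL", "AMZN", "NFLX", "T", "XOM")

# Rounded annualized covariance matrix, copied from the printed output
cov_mat_rounded <- matrix(
  c(0.0523, 0.0238, 0.0268, 0.0089, 0.0127,
    0.0238, 0.0869, 0.0489, 0.0078, 0.0129,
    0.0268, 0.0489, 0.1759, 0.0081, 0.0133,
    0.0089, 0.0078, 0.0081, 0.0260, 0.0110,
    0.0127, 0.0129, 0.0133, 0.0110, 0.0340),
  nrow = 5, byrow = TRUE, dimnames = list(assets, assets)
)

# Annualized volatility is the square root of the diagonal (the variances)
ann_vol <- sqrt(diag(cov_mat_rounded))
print(round(ann_vol, 3))

# Correlations are often easier to read than covariances
cor_mat <- cov2cor(cov_mat_rounded)
print(round(cor_mat, 2))
```

With these inputs, Netflix is the most volatile of the five stocks (roughly 42% a year) and AT&T the least (roughly 16%).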
Before we apply our methods to thousands of random portfolios, let us demonstrate the steps on a single portfolio.
To calculate the portfolio returns and risk (standard deviation) we will need:
 Mean asset returns
 Portfolio weights
 Covariance matrix of all assets
 Random weights
Let's create random weights first.
wts <- runif(n = length(tick))
print(wts)
## [1] 0.8147560 0.5657122 0.8132951 0.1917444 0.8166054
print(sum(wts))
## [1] 3.202113
We created some random weights, but the problem is that their sum is more than 1. We can fix this by dividing each weight by their total, as shown below.
wts <- wts/sum(wts)
print(wts)
## [1] 0.25444323 0.17666840 0.25398702 0.05988058 0.25502078
sum(wts)
## [1] 1
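The normalization step can be wrapped in a small helper. This is only a sketch: the function name random_weights and the call to set.seed() are additions made here for illustration and reproducibility, and are not part of the original code.

```r
# Hypothetical helper: draw n uniform random numbers and rescale them
# so that they sum to 1 (long-only, fully invested weights)
random_weights <- function(n) {
  w <- runif(n)
  w / sum(w)
}

set.seed(123)  # fix the random seed so the draw is reproducible
wts_example <- random_weights(5)
print(wts_example)
print(sum(wts_example))
```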
Next we will calculate the annualized portfolio returns.
port_returns <- (sum(wts * mean_ret) + 1)^252 - 1
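Written out, the line above compounds the weighted average of the mean daily returns over 252 trading days. The symbols are notation introduced here: w_i is the weight of asset i, \mu_i is its mean daily return, and N is the number of assets.

```latex
R_p = \left(1 + \sum_{i=1}^{N} w_i \,\mu_i\right)^{252} - 1
```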
Next we will calculate the portfolio risk (Standard deviation). This will be annualized Standard deviation for the portfolio. We will use linear algebra to calculate our portfolio risk.
port_risk <- sqrt(t(wts) %*% (cov_mat %*% wts))
print(port_risk)
##           [,1]
## [1,] 0.1878822
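The matrix expression computes the portfolio variance as the sum of all pairwise covariances, each multiplied by the weights of the two assets involved. As a check, the sketch below re-enters the rounded covariance matrix and the normalized weights printed above and compares the matrix form with an explicit double sum. Because the inputs are rounded, it only approximately reproduces the risk printed above; the names cov_mat_rounded and wts_printed are introduced for this example.

```r
# Rounded annualized covariance matrix (columns: AAPL, AMZN, NFLX, T, XOM)
cov_mat_rounded <- matrix(
  c(0.0523, 0.0238, 0.0268, 0.0089, 0.0127,
    0.0238, 0.0869, 0.0489, 0.0078, 0.0129,
    0.0268, 0.0489, 0.1759, 0.0081, 0.0133,
    0.0089, 0.0078, 0.0081, 0.0260, 0.0110,
    0.0127, 0.0129, 0.0133, 0.0110, 0.0340),
  nrow = 5, byrow = TRUE
)

# Normalized weights as printed earlier
wts_printed <- c(0.25444323, 0.17666840, 0.25398702, 0.05988058, 0.25502078)

# Matrix form, as used in the article
risk_matrix <- sqrt(drop(t(wts_printed) %*% cov_mat_rounded %*% wts_printed))

# Equivalent explicit double sum over all asset pairs
variance_sum <- 0
for (i in seq_along(wts_printed)) {
  for (j in seq_along(wts_printed)) {
    variance_sum <- variance_sum +
      wts_printed[i] * wts_printed[j] * cov_mat_rounded[i, j]
  }
}
risk_loop <- sqrt(variance_sum)

print(c(matrix_form = risk_matrix, double_sum = risk_loop))
```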
Next, we will assume a 0% risk-free rate to calculate the Sharpe ratio.
# Since the risk-free rate is 0%
sharpe_ratio <- port_returns/port_risk
print(sharpe_ratio)
##          [,1]
## [1,] 1.317836
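With a non-zero risk-free rate, the numerator of the Sharpe ratio becomes the excess return over that rate. A minimal sketch follows; the function name sharpe, its rf argument, and the input numbers are all hypothetical and chosen only to show the effect.

```r
# Hypothetical helper: Sharpe ratio with an optional risk-free rate
# ret  = annualized portfolio return
# risk = annualized portfolio standard deviation
# rf   = annualized risk-free rate (0 reproduces the article's assumption)
sharpe <- function(ret, risk, rf = 0) {
  (ret - rf) / risk
}

sr_zero_rf <- sharpe(ret = 0.20, risk = 0.16)
sr_two_pct <- sharpe(ret = 0.20, risk = 0.16, rf = 0.02)
print(c(rf_0 = sr_zero_rf, rf_2pct = sr_two_pct))
```

For these made-up inputs the ratio falls from about 1.25 to about 1.125 once a 2% risk-free rate is subtracted.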
Let's put all these steps together.
# Calculate the random weights
wts <- runif(n = length(tick))
wts <- wts/sum(wts)
# Calculate the portfolio returns
port_returns <- (sum(wts * mean_ret) + 1)^252 - 1
# Calculate the portfolio risk
port_risk <- sqrt(t(wts) %*% (cov_mat %*% wts))
# Calculate the Sharpe Ratio
sharpe_ratio <- port_returns/port_risk
print(wts)
## [1] 0.20421152 0.15279108 0.25758987 0.29417104 0.09123649
print(port_returns)
## [1] 0.2396506
print(port_risk)
##           [,1]
## [1,] 0.1778445
print(sharpe_ratio)
##          [,1]
## [1,] 1.347529
We have everything we need to perform our optimization. All we need now is to run this code on 5000 random portfolios. For that we will use a for loop.
Before we do that, we need to create empty vectors and a matrix for storing our values.
num_port <- 5000
# Creating a matrix to store the weights
all_wts <- matrix(nrow = num_port, ncol = length(tick))
# Creating an empty vector to store
# Portfolio returns
port_returns <- vector('numeric', length = num_port)
# Creating an empty vector to store
# Portfolio Standard deviation
port_risk <- vector('numeric', length = num_port)
# Creating an empty vector to store
# Portfolio Sharpe Ratio
sharpe_ratio <- vector('numeric', length = num_port)
Next, let's run the for loop 5000 times.
for (i in seq_along(port_returns)) {
  wts <- runif(length(tick))
  wts <- wts/sum(wts)
  # Storing weight in the matrix
  all_wts[i,] <- wts
  # Portfolio returns
  port_ret <- sum(wts * mean_ret)
  port_ret <- ((port_ret + 1)^252) - 1
  # Storing Portfolio Returns values
  port_returns[i] <- port_ret
  # Creating and storing portfolio risk
  port_sd <- sqrt(t(wts) %*% (cov_mat %*% wts))
  port_risk[i] <- port_sd
  # Creating and storing Portfolio Sharpe Ratios
  # Assuming 0% Risk free rate
  sr <- port_ret/port_sd
  sharpe_ratio[i] <- sr
}
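The same simulation can also be written without an explicit loop by drawing all weights at once. The sketch below is an alternative formulation and not the article's code; it uses made-up inputs for three assets (mu_demo and sigma_demo are hypothetical) so that it runs on its own, and it checks the first portfolio against the matrix formula used in the loop.

```r
set.seed(42)  # added here for reproducibility

# Hypothetical inputs: mean daily returns and annualized covariance matrix
mu_demo <- c(A = 0.0008, B = 0.0012, C = 0.0002)
sigma_demo <- matrix(
  c(0.05, 0.02, 0.01,
    0.02, 0.09, 0.01,
    0.01, 0.01, 0.03),
  nrow = 3, byrow = TRUE
)

num_demo <- 1000

# One row of weights per portfolio, each row rescaled to sum to 1
W <- matrix(runif(num_demo * length(mu_demo)), nrow = num_demo)
W <- W / rowSums(W)

# Annualized returns for all portfolios at once
ret_vec <- (drop(W %*% mu_demo) + 1)^252 - 1

# Row-wise w' Sigma w: multiply (W Sigma) element-wise by W and sum each row
risk_vec <- sqrt(rowSums((W %*% sigma_demo) * W))

# Sharpe ratios with a 0% risk-free rate
sr_vec <- ret_vec / risk_vec

# Check the first portfolio against the matrix formula from the loop
w1 <- W[1, ]
check_risk <- sqrt(drop(t(w1) %*% sigma_demo %*% w1))
print(all.equal(risk_vec[1], check_risk))
```

For 5000 portfolios of five assets the loop is already fast, so this is mainly a matter of style.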
All the heavy lifting has been done and now we can create a data table to store all the values together.
# Storing the values in the table
portfolio_values <- tibble(Return = port_returns,
                           Risk = port_risk,
                           SharpeRatio = sharpe_ratio)
# Converting matrix to a tibble and changing column names
all_wts <- tk_tbl(all_wts)
## Warning in tk_tbl.data.frame(as.data.frame(data), preserve_index,
## rename_index, : Warning: No index to preserve. Object otherwise converted
## to tibble successfully.
colnames(all_wts) <- colnames(log_ret_xts)
# Combining all the values together
portfolio_values <- tk_tbl(cbind(all_wts, portfolio_values))
## Warning in tk_tbl.data.frame(cbind(all_wts, portfolio_values)): Warning: No
## index to preserve. Object otherwise converted to tibble successfully.
Let's look at the first few values.
head(portfolio_values)
## # A tibble: 6 x 8
##    AAPL   AMZN  NFLX     T      XOM Return  Risk SharpeRatio
##   <dbl>  <dbl> <dbl> <dbl>    <dbl>  <dbl> <dbl>       <dbl>
## 1 0.302 0.160  0.206 0.130 0.203     0.231 0.175       1.32
## 2 0.192 0.0303 0.140 0.362 0.276     0.130 0.143       0.906
## 3 0.196 0.227  0.158 0.235 0.184     0.209 0.163       1.28
## 4 0.194 0.214  0.142 0.304 0.146     0.199 0.158       1.26
## 5 0.256 0.106  0.197 0.195 0.245     0.196 0.164       1.20
## 6 0.328 0.320  0.178 0.173 0.000823  0.293 0.192       1.52
We have the weights in each asset with the risk and returns along with the Sharpe ratio of each portfolio.
Next, let's look at the portfolios that matter the most.
 The minimum variance portfolio
 The tangency portfolio (the portfolio with the highest Sharpe ratio)
min_var <- portfolio_values[which.min(portfolio_values$Risk),]
max_sr <- portfolio_values[which.max(portfolio_values$SharpeRatio),]
Let's plot the weights of each portfolio, starting with the minimum variance portfolio.
p <- min_var %>%
  gather(AAPL:XOM, key = Asset, value = Weights) %>%
  mutate(Asset = as.factor(Asset)) %>%
  ggplot(aes(x = fct_reorder(Asset, Weights), y = Weights, fill = Asset)) +
  geom_bar(stat = 'identity') +
  theme_minimal() +
  labs(x = 'Assets', y = 'Weights', title = "Minimum Variance Portfolio Weights") +
  scale_y_continuous(labels = scales::percent)
ggplotly(p)
As we can observe, the minimum variance portfolio has no allocation to Netflix and very little allocation to Amazon. The majority of the portfolio is invested in Exxon Mobil and AT&T stock.
Next, let's look at the tangency portfolio, that is, the portfolio with the highest Sharpe ratio.
p <- max_sr %>%
  gather(AAPL:XOM, key = Asset, value = Weights) %>%
  mutate(Asset = as.factor(Asset)) %>%
  ggplot(aes(x = fct_reorder(Asset, Weights), y = Weights, fill = Asset)) +
  geom_bar(stat = 'identity') +
  theme_minimal() +
  labs(x = 'Assets', y = 'Weights', title = "Tangency Portfolio Weights") +
  scale_y_continuous(labels = scales::percent)
ggplotly(p)
Not surprisingly, the portfolio with the highest Sharpe ratio has very little invested in Exxon Mobil and AT&T. Most of this portfolio is invested in Amazon, Netflix and Apple, three of the best performing stocks of the last decade.
Finally, let's plot all the random portfolios and visualize the efficient frontier.
p <- portfolio_values %>%
  ggplot(aes(x = Risk, y = Return, color = SharpeRatio)) +
  geom_point() +
  theme_classic() +
  scale_y_continuous(labels = scales::percent) +
  scale_x_continuous(labels = scales::percent) +
  labs(x = 'Annualized Risk', y = 'Annualized Returns', title = "Portfolio Optimization & Efficient Frontier") +
  geom_point(aes(x = Risk, y = Return), data = min_var, color = 'red') +
  geom_point(aes(x = Risk, y = Return), data = max_sr, color = 'red') +
  annotate('text', x = 0.20, y = 0.42, label = "Tangency Portfolio") +
  annotate('text', x = 0.18, y = 0.01, label = "Minimum variance portfolio") +
  annotate(geom = 'segment', x = 0.14, xend = 0.135, y = 0.01, yend = 0.06, color = 'red', arrow = arrow(type = "open")) +
  annotate(geom = 'segment', x = 0.22, xend = 0.2275, y = 0.405, yend = 0.365, color = 'red', arrow = arrow(type = "open"))
ggplotly(p)
In the chart above we can observe all 5000 portfolios. As mentioned above, a risk-averse investor will demand the highest return for a given level of risk. In other words, he or she will try to obtain portfolios that lie on the efficient frontier.
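One way to pick out the simulated portfolios closest to the efficient frontier is to keep only those that no lower-risk portfolio beats on return. The sketch below does this on a small made-up table (demo_values is hypothetical); with the article's data the same steps could be applied to the Risk and Return columns of portfolio_values.

```r
# Hypothetical risk/return pairs standing in for portfolio_values
demo_values <- data.frame(
  Risk   = c(0.14, 0.15, 0.16, 0.17, 0.18, 0.20),
  Return = c(0.10, 0.08, 0.15, 0.14, 0.20, 0.19)
)

# Sort by risk, then keep a portfolio only if its return is at least as
# high as that of every portfolio with lower risk
sorted <- demo_values[order(demo_values$Risk), ]
on_frontier <- sorted$Return >= cummax(sorted$Return)
frontier <- sorted[on_frontier, ]
print(frontier)
```

Since the portfolios are random draws, the result only approximates the true frontier, and the approximation improves as more portfolios are simulated.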