I use Google Analytics to track traffic on this site, and I’ve found that the two previous posts on downloading return data from Yahoo! Finance are relatively popular. For this reason, I’m posting some updated R code which can be used to download returns for a batch of stocks or funds.
My first post on this topic included R code for downloading monthly or weekly returns from Yahoo! Finance, and I created a short video tutorial to demonstrate the code. My second post provided a Google Docs spreadsheet which can be used to download monthly returns.
In this post, I’ve updated the original R code to allow monthly returns for multiple stocks or funds to be downloaded into a single CSV file. This is useful if you want to compare factor loadings for multiple funds, or if you simply want to get the monthly returns for all the funds in your portfolio.
The code uses an input file “funds.csv” which is a list of ticker symbols for the target stocks or funds. An example file is available here.
The monthly returns are recorded in a file called “fundreturns.csv” which is stored in R’s working directory.
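If you would rather build the input file from within R, here is a minimal sketch. The tickers are only placeholders, and I write to a temporary folder (my assumption, so the example does not overwrite anything) rather than to the working directory that the main script reads from.

```r
# Placeholder tickers, chosen only for illustration
tickers <- c("SPY", "VTI", "VBR")

# Write one ticker per line; a temporary path is used here as an assumption
input_path <- file.path(tempdir(), "funds.csv")
writeLines(tickers, input_path)

# Read the file back the same way the main script does.
# colClasses keeps the tickers as plain character strings.
testfunds <- read.table(input_path, sep = ";", header = FALSE,
                        colClasses = "character")
ticks <- testfunds[, 1]
print(ticks)
```

To use the file with the main script, write it to "funds.csv" in R's working directory instead of the temporary path.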
Note that I believe Yahoo! Finance limits the amount of data that you can download from the site in a given period of time. I don’t know the specific rules, but if you put a large number of tickers into the CSV file you might find that the downloads start failing and you’ll have to wait for some period of time before they start working again.
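One way to cope with failed downloads is to retry after a pause. The helper below is a hypothetical sketch, not part of tseries: "fetch_with_retry", "max_tries" and "wait_seconds" are names I made up, and the simulated failure stands in for a real download so the example runs offline.

```r
# Hypothetical helper: call fetch() up to max_tries times,
# pausing wait_seconds between failed attempts
fetch_with_retry <- function(fetch, max_tries = 3, wait_seconds = 5) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop("All ", max_tries, " attempts failed")
}

# Offline demonstration: a stand-in download that fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated download failure")
  "ok"
}
result <- fetch_with_retry(flaky, max_tries = 5, wait_seconds = 0)
```

In the batch script, the function passed in could wrap the get.hist.quote call for a single ticker. I don't know how long a wait Yahoo! Finance actually requires, so the pause length is a guess you would need to tune.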
The R code is posted below. The variable “startdate” should be set to a date a few days prior to the start of the first month for which you want to get return data. For example, if the first monthly return you are targeting is January 2006, then you can set the start date to “2005-12-25”. If the start date chosen is earlier than the earliest available price history for some funds, their return series will be shorter than the others and the returns may be misaligned with the dates in the output file, so be sure to choose a date which is compatible with the available price history for all the stocks and funds in your list.
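If you cannot find a start date that suits every fund, one alternative (a different technique from the one used in my script) is to merge the return series by date, so that months with no data show up as NA instead of shifting. Here is a small sketch with made-up returns, where fund B starts two months after fund A:

```r
library(zoo)

# Made-up monthly returns; B has no history before March 2006
a <- zoo(c(0.01, 0.02, -0.01, 0.03), as.yearmon(2006 + (0:3) / 12))
b <- zoo(c(0.005, 0.015), as.yearmon(2006 + (2:3) / 12))

# merge() lines the series up on their shared month index
# and fills the missing months with NA
aligned <- merge(A = a, B = b)
print(aligned)
```

The merged object has one row per month and one column per fund, so each return stays attached to its own date.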
R Code:
This code requires both the “tseries” and “zoo” packages to be installed. For details on installing packages on your specific platform please see the R project documentation on package installation.
# Batch Fund Download
# calcinv: 09/19/2013

# Import tseries and zoo libraries
library(zoo)
library(tseries)

# Uncomment the setInternet2 line if a proxy is required
# setInternet2(TRUE)

# Set Start Date
startdate = "2005-12-25"

# Load CSV file into R
testfunds <- read.table("funds.csv",sep=";",header=FALSE)

# Extract tickers
ticks <- testfunds[,1]

# Setup Equity Fund Variables
funds <- NULL

# Download equity fund data
for(i in seq_along(ticks)){

  # Download Adjusted Price Series from Yahoo! Finance
  # (as.character guards against tickers having been read in as a factor)
  prices <- get.hist.quote(as.character(ticks[i]), quote="Adj", start=startdate, retclass="zoo")

  # Convert daily closing prices to monthly closing prices
  monthly.prices <- aggregate(prices, as.yearmon, tail, 1)

  # Convert monthly prices into monthly returns
  r <- diff(log(monthly.prices))  # convert prices to log returns
  r1 <- exp(r)-1                  # back to simple returns

  # Now shift out of zoo object into ordinary matrix
  rj <- coredata(r1)

  # Put fund returns into matrix
  funds <- cbind(funds, rj)
}

# Dates are taken from the last fund downloaded
fundfile <- cbind(as.character(index(r1)),funds)
header <- c("Dates",as.character(ticks))
fundfile <- rbind(header,fundfile)

# Write output data to csv file
write.table(fundfile, file="fundreturns.csv", sep=",", row.names=FALSE,col.names=FALSE)
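To see what the return calculation in the code above does without downloading anything, here is a small offline sketch using made-up daily prices. It takes the last price of each month, applies the same log-difference and exp steps, and compares the result with the simple return computed directly as this month's price divided by last month's, minus one.

```r
library(zoo)

# Made-up daily prices spanning three month-ends
dates <- as.Date(c("2006-01-30", "2006-01-31",
                   "2006-02-27", "2006-02-28",
                   "2006-03-30", "2006-03-31"))
prices <- zoo(c(99, 100, 104, 105, 101, 102.9), dates)

# Keep the last price in each month: 100, 105, 102.9
monthly.prices <- aggregate(prices, as.yearmon, tail, 1)

# Same two steps as the script: log returns, then back to simple returns
r1 <- exp(diff(log(monthly.prices))) - 1

# Direct calculation of simple returns for comparison
p <- as.numeric(coredata(monthly.prices))
direct <- p[-1] / p[-length(p)] - 1

print(r1)
print(direct)
```

Both approaches give 5% for February and -2% for March, so the round trip through log returns does not change the result.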
I really like your code. But I am having a problem running the “get.hist.quote” command on my computer. It hangs trying to get to this URL:
“trying URL ‘http://chart.yahoo.com/table.csv?s=SPY&a=11&b=25&c=2005&d=11&e=31&f=2013&g=d&q=q&y=0&z=SPY&x=.csv'”
And when I try putting this URL in my browser, I get a 404 error. Seems that my install of R / tseries is referencing an old link. I have a fresh install of R and have tried it on two separate computers.
I have used/developed a work around that uses the right yahoo link:
gethistory = function(symbol){
data = read.csv(paste('http://chart.finance.yahoo.com/table.csv?s=',symbol,sep=''))
data$Date = as.Date(data$Date)
data
}
But this is much less efficient than using tseries and the “get.hist.quote” command. Have you encountered this before? Any ideas as to what may be wrong with “get.hist.quote”?
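For anyone using a workaround like the one above, here is a rough sketch of turning the resulting data frame into a zoo price series, which is the form the batch script works with. The CSV text is made up, and the column layout (including an "Adj Close" column, which read.csv renames to "Adj.Close") is an assumption about what the download returns.

```r
library(zoo)

# Made-up sample in the assumed layout, newest row first
csv_text <- "Date,Open,High,Low,Close,Volume,Adj Close
2006-01-04,11,12,10,11,1000,11
2006-01-03,10,11,9,10,1000,10"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
data$Date <- as.Date(data$Date)

# zoo sorts the observations by date, oldest first
prices <- zoo(data$Adj.Close, order.by = data$Date)
print(prices)
```

From here the aggregate and diff steps from the post can be applied to "prices" unchanged.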
I just tried it, and I’m having the same problem that you are having.
The tseries source code is available on this page: http://cran.r-project.org/web/packages/tseries/index.html
In the zip file listed next to “package source” you will find a file “finance.R” which includes the get.hist.quote function. I modified the URL and saved the function to a new file. The new version seems to work, though I haven’t tested it carefully. I’ll email it to you.
Would you mind emailing me the modified URL as well?
This site was… how do you say it? Relevant!! Finally I’ve found something that helped me. Thanks a lot!