Here is some code to read Google Analytics data into R. My modifications include adding the SSL authentication code and changing the table.id parameter (in bold) to choose the correct website from a GA profile with many websites.
The Google Analytics Package files can be downloaded from http://code.google.com/p/r-google-analytics/downloads/list
It provides access to Google Analytics data natively from the R statistical computing language. You can use this library to retrieve an R data.frame with Google Analytics data and then perform advanced statistical analysis, such as time series analysis and regression.
Supported Features
- Access to v2 of the Google Analytics Data Export API Data Feed
- A QueryBuilder class to simplify creating API queries
- API response is converted directly into R as a data.frame
- Library returns the aggregates, and confidence intervals of the metrics, dynamically if they exist
- Auto-pagination to return more than 10,000 rows of information by combining multiple data requests. (Upper Limit 1M rows)
- Authorization through the ClientLogin routine
- Access to all the profiles ids for the authorized user
- Full documentation and unit tests
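The auto-pagination feature in the list above can be pictured as a loop that requests successive pages and row-binds them into one data.frame. The following is only a minimal sketch of that idea, not the package's actual implementation: `fetch_page` and `fetch_all` are hypothetical names, and `fetch_page` is backed by a small in-memory data frame so the example runs without API access. The 10,000-row page size and the 1M-row cap are taken from the feature list.

```r
# Hypothetical stand-in for a single API request: returns rows
# start.index .. start.index + max.results - 1 of a fake result set.
fake_rows <- data.frame(id = 1:25, visitors = (1:25) * 10)

fetch_page <- function(start.index, max.results) {
  if (start.index > nrow(fake_rows)) {
    return(fake_rows[0, , drop = FALSE])  # past the end: empty page
  }
  end.index <- min(start.index + max.results - 1, nrow(fake_rows))
  fake_rows[start.index:end.index, , drop = FALSE]
}

# Request pages until a short or empty page signals the end, or until
# the row cap is reached, then combine everything into one data.frame.
fetch_all <- function(fetch, page.size = 10000, max.rows = 1e6) {
  pages <- list()
  start.index <- 1
  while (start.index <= max.rows) {
    wanted <- min(page.size, max.rows - start.index + 1)
    page <- fetch(start.index, wanted)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < page.size) break
    start.index <- start.index + nrow(page)
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

# Use a small page size so the fake data needs several requests.
all.rows <- fetch_all(fetch_page, page.size = 10)
```

Passing the fetch function as an argument keeps the loop independent of how a single request is made, so the same loop could wrap a real request function.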
> library(XML)
>
> library(RCurl)
Loading required package: bitops
>
> #Change path name in the following to the folder you downloaded the Google Analytics Package
>
> source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/RGoogleAnalytics.R")
>
> source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/QueryBuilder.R")
> # download the file needed for authentication
> download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
trying URL 'http://curl.haxx.se/ca/cacert.pem'
Content type 'text/plain' length 215993 bytes (210 Kb)
opened URL
downloaded 210 Kb
>
> # set the curl options
> curl <- getCurlHandle()
> options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem",
+ package = "RCurl"),
+ ssl.verifypeer = FALSE))
> curlSetOpt(.opts = list(proxy = 'proxyserver:port'), curl = curl)
An object of class "CURLHandle"
Slot "ref":
<pointer: 0000000006AA2B70>
>
> # 1. Create a new Google Analytics API object
>
> ga <- RGoogleAnalytics()
>
> # 2. Authorize the object with your Google Analytics Account Credentials
>
> ga$SetCredentials("USERNAME", "PASSWORD")
>
> # 3. Get the list of different profiles, to help build the query
>
> profiles <- ga$GetProfileData()
>
> profiles #Error Check to See if we get the right website
$profile
        AccountName       ProfileName     TableId
1    dudeofdata.com    dudeofdata.com ga:44926237
2   knol.google.com   knol.google.com ga:45564890
3 decisionstats.com decisionstats.com ga:46751946

$total.results
  total.results
1             3
>
> # 4. Build the Data Export API query
>
> #Modify the start.date and end.date parameters based on data requirements
>
> #Modify the table.id at table.id = paste(profiles$profile[X,3]) to get the X th website in your profile
> query <- QueryBuilder()
> query$Init(start.date = "2012-01-09",
+ end.date = "2012-03-20",
+ dimensions = "ga:date",
+ metrics = "ga:visitors",
+ sort = "ga:date",
+ table.id = paste(profiles$profile[3,3]))
>
>
> #5. Make a request to get the data from the API
>
> ga.data <- ga$GetReportData(query)
[1] "Executing query: https://www.google.com/analytics/feeds/data?start-date=2012%2D01%2D09&end-date=2012%2D03%2D20&dimensions=ga%3Adate&metrics=ga%3Avisitors&sort=ga%3Adate&ids=ga%3A46751946"
>
> #6. Look at the returned data
>
> str(ga.data)
List of 3
 $ data         :'data.frame': 72 obs. of 2 variables:
  ..$ ga:date    : chr [1:72] "20120109" "20120110" "20120111" "20120112" ...
  ..$ ga:visitors: num [1:72] 394 405 381 390 323 47 169 67 94 89 ...
 $ aggr.totals  :'data.frame': 1 obs. of 1 variable:
  ..$ aggregate.totals: num 28348
 $ total.results: num 72
>
> head(ga.data$data)
   ga:date ga:visitors
1 20120109         394
2 20120110         405
3 20120111         381
4 20120112         390
5 20120113         323
6 20120114          47
>
> #Plotting the Traffic
>
> plot(ga.data$data[,2],type="l")
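Step 4 picks the table id by row position (`profiles$profile[3,3]`), which silently selects a different site if the order of profiles changes. A hedged alternative is to look the id up by profile name. In the sketch below, the `profiles` object is rebuilt by hand from the printed output above so that it runs without API access, and `table_id_for` is a hypothetical helper, not part of the RGoogleAnalytics package.

```r
# Hand-built copy of the profile table printed in step 3 (assumption:
# the real object has the same column names as the printed output).
profiles <- list(
  profile = data.frame(
    AccountName = c("dudeofdata.com", "knol.google.com", "decisionstats.com"),
    ProfileName = c("dudeofdata.com", "knol.google.com", "decisionstats.com"),
    TableId     = c("ga:44926237", "ga:45564890", "ga:46751946"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: return the TableId of the profile with the given
# name, and fail loudly if the name is missing or ambiguous.
table_id_for <- function(profiles, profile.name) {
  rows <- profiles$profile[profiles$profile$ProfileName == profile.name, ]
  if (nrow(rows) != 1) {
    stop("Expected exactly one profile named '", profile.name,
         "', found ", nrow(rows))
  }
  as.character(rows$TableId)
}

table.id <- table_id_for(profiles, "decisionstats.com")
```

The resulting `table.id` could then be passed to `query$Init()` in place of `paste(profiles$profile[3,3])`.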
Update: Some errors come from pasting LaTeX directly into WordPress. Here is the same code, formatted with Pretty-R, in case you want to play with the GA API.
library(XML)
library(RCurl)

#Change path name in the following to the folder you downloaded the Google Analytics Package
source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/RGoogleAnalytics.R")
source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/QueryBuilder.R")

# download the file needed for authentication
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")

# set the curl options
curl <- getCurlHandle()
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem",
                                                 package = "RCurl"),
                            ssl.verifypeer = FALSE))
curlSetOpt(.opts = list(proxy = 'proxyserver:port'), curl = curl)

# 1. Create a new Google Analytics API object
ga <- RGoogleAnalytics()

# 2. Authorize the object with your Google Analytics Account Credentials
ga$SetCredentials("ohri2007@gmail.com", "XXXXXXX")

# 3. Get the list of different profiles, to help build the query
profiles <- ga$GetProfileData()
profiles #Error Check to See if we get the right website

# 4. Build the Data Export API query
#Modify the start.date and end.date parameters based on data requirements
#Modify the table.id at table.id = paste(profiles$profile[X,3]) to get the X th website in your profile
query <- QueryBuilder()
query$Init(start.date = "2012-01-09",
           end.date = "2012-03-20",
           dimensions = "ga:date",
           metrics = "ga:visitors",
           sort = "ga:date",
           table.id = paste(profiles$profile[3,3]))

#5. Make a request to get the data from the API
ga.data <- ga$GetReportData(query)

#6. Look at the returned data
str(ga.data)
head(ga.data$data)

#Plotting the Traffic
plot(ga.data$data[,2],type="l")
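One caveat about the curl options in the code above: they pass the certificate bundle as `capath` and also set `ssl.verifypeer = FALSE`, so the server certificate is not actually verified, and the downloaded `cacert.pem` is never referenced. If you want verification switched on, a possible variant is to point `cainfo` (the curl option for a single bundle file) at the bundle and leave `ssl.verifypeer` enabled. This is a sketch under those assumptions, not the setup used for the transcript above; `ca_bundle_path` and `ssl_options` are hypothetical helper names.

```r
# Hypothetical helper: prefer the freshly downloaded bundle, otherwise
# fall back to the copy that ships with RCurl. system.file() returns ""
# when the package or the file is not installed.
ca_bundle_path <- function(local.file = "cacert.pem") {
  if (file.exists(local.file)) {
    return(local.file)
  }
  system.file("CurlSSL", "cacert.pem", package = "RCurl")
}

# Hypothetical helper: build curl options that keep peer verification
# on, and refuse to continue without a usable CA bundle.
ssl_options <- function(bundle) {
  if (!nzchar(bundle) || !file.exists(bundle)) {
    stop("CA bundle not found: '", bundle, "'")
  }
  list(cainfo = normalizePath(bundle), ssl.verifypeer = TRUE)
}

# Only set the global option when a bundle is actually available.
bundle <- ca_bundle_path()
if (nzchar(bundle)) {
  options(RCurlOptions = ssl_options(bundle))
}
```

Failing early when no bundle is found avoids the temptation to fall back to `ssl.verifypeer = FALSE` just to make the request go through.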
Hey kaushik,
Due to a number of changes in the Google API service, this is deprecated. I would like to recommend this post (http://www.tatvic.com/blog/ga-data-extraction-in-r/) for GA data extraction in R.
Thanks. I will post on this shortly. Let's connect on LinkedIn. I am at http://linkedin.com/in/ajayohri
Hi! I’m using your post here as a reference to access my Google Analytics data natively from R. However, I receive an error message “Error in GetDataFeed(query.builder$to.uri()) : 401 Unauthorized” and the script stops running. I’m wondering if the script you published above still works, as I believe this issue is tied to the deprecation of v2 of the Google Analytics Data Export API Data Feed and the OAuth 2 requirement. Do you know of a workaround? Thanks. (https://groups.google.com/d/topic/google-analytics-data-export-api/ACj9Nlg_E20/discussion)