How To Create Google Analytics Segments [Conditional Data Pulls] In R Studio
Alright! Another day, another R related question.
In the previous examples that I looked at using GA + R, the conditional data pull was based on dates only. Links to the previous posts:
http://analyticslog.com/blog/how-to-find-outliers-in-boxplots-via-r-programming
http://analyticslog.com/blog/how-to-analyze-day-of-week-google-analytics-data-in-r-studio
So, what if we wanted to create a conditional data pull, i.e. apply segments while pulling the data into R Studio? How would we go about it?
I had just begun searching for this topic when the good people at the Digital Analytics Power Hour podcast provided the link on Twitter: http://www.dartistics.com/googleanalytics/simple-dynamic-segment.html
Creating dynamic Google Analytics segments in R Studio
This blog post will heavily use [steal] the code from Dartistics, but I will try to apply a different condition to my segment definition as a test, see what pops up and, mainly, go through the steps and see if I understand anything from the code.
Main bits from the code relating to the query [Full code at bottom of post]
So, here's the bit that defines the segment_element
This uses the GA Reporting API v4 syntax to define the segment. In this case, we are looking at the landingPagePath [landing page] dimension, using a RegEx match where the landingPagePath matches GTM OR Facebook. Straightforward, and as close as it gets to what you'd do in the GA interface.
There are 3 parts to fully defining the segment.
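For reference, here is a minimal sketch of what such a segment element could look like with googleAnalyticsR. The argument layout follows the Dartistics example; the regex string "gtm|facebook" is only my placeholder assumption, so swap in whatever pattern fits your own URLs.

```r
library(googleAnalyticsR)

# One condition: sessions whose landing page matches a regular expression.
# "gtm|facebook" is a placeholder pattern (an assumption), not the exact one used here.
my_segment_element <- segment_element("landingPagePath",
                                      operator = "REGEXP",
                                      type = "DIMENSION",
                                      expressions = "gtm|facebook",
                                      scope = "SESSION")
```

Building the element does not call the API, so this part can be tried before authenticating.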
Executing ?segment_element brings up this in the Help tab.
segment_element is the lowest hierarchy of segment creation, for which you will also need:
segment_define : AND combination of segmentFilters
segment_vector_simple or segment_vector_sequence
segment_element that are combined in OR lists for segment_vectors_*
So, segment_element sits within segment_vector_simple, which sits within segment_define [which is used while defining the data to pull].
Next up, ?segment_vector_simple
Usage
segment_vector_simple(segment_elements)
Arguments
segment_elements
A list of OR lists of segment_element
ok...so it's just wrapping segment_element inside the vector. Reading the comments from Dartistics code now.
# Create a segment vector that has just one element. See ?segment_vector_simple() for details. Note
# that the element is wrapped in a list(). This is how you would include multiple elements in the
# definition.
...
So, segment_vector_simple() lets you add 1 or more segment elements in the definition....in this case, it's just one element...
Next up, ?segment_define
Defines the segment to be a set of SegmentFilters which are combined together with a logical AND operation. segment_define is in the hierarchy of segment creation that also includes segment_vector_simple() and segment_element().
Usage
segment_define(segment_filters, not_vector = NULL)
...
Honestly, I didn't get why the code has list in it...
my_segment_definition <- segment_define(list(my_segment_vector))
Moving on...
Ok, so this bit of the code is creating a variable called my_segment where the name of the segment in this case is "Landing page matches Regex GTM|FB" and the details of the segment are in the variable my_segment_definition.
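Putting the pieces together, this is roughly how the whole chain reads, as a sketch modelled on the Dartistics code. The regex pattern is again a placeholder assumption. As far as I can tell from the help pages, the list() wrappers are there because each level expects a collection: segment_vector_simple() takes a list of OR lists of elements, and segment_define() takes a list of segment vectors that get combined with AND, so a single item still has to be wrapped.

```r
library(googleAnalyticsR)

# Lowest level: one condition (placeholder regex, adjust to your own URLs)
my_segment_element <- segment_element("landingPagePath",
                                      operator = "REGEXP",
                                      type = "DIMENSION",
                                      expressions = "gtm|facebook",
                                      scope = "SESSION")

# Inner list(): elements that are OR-ed together.
# Outer list(): the list of those OR groups.
my_segment_vector <- segment_vector_simple(list(list(my_segment_element)))

# segment_define() expects a list of segment vectors (combined with AND),
# hence list() even when there is only one vector.
my_segment_definition <- segment_define(list(my_segment_vector))

# Give the segment a name and attach the definition as a session-level segment
my_segment <- segment_ga4("Landing page matches Regex GTM|FB",
                          session_segment = my_segment_definition)
```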
# <whew>!!!
[line 50 of code....yes!!]
The data pull itself then returns the dimensions and metrics only where the segment conditions are met.
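A rough sketch of that pull, wrapped in a helper so nothing hits the API until you call it. The helper name pull_segment_data, the metric and dimension choices and the 30 day window are my assumptions for illustration; older versions of googleAnalyticsR exposed the same call as google_analytics_4().

```r
library(googleAnalyticsR)

# Hypothetical helper: pulls sessions by landing page for one segment.
# Metric, dimensions and date range are illustrative assumptions.
pull_segment_data <- function(view_id, segment) {
  google_analytics(viewId = view_id,
                   date_range = c("30daysAgo", "yesterday"),
                   metrics = "sessions",
                   # "segment" labels each returned row with the segment name
                   dimensions = c("landingPagePath", "segment"),
                   segments = segment)
}

# Usage (needs authentication and a real view ID first):
# ga_auth()
# ga_data <- pull_segment_data("YOUR_VIEW_ID", my_segment)
# head(ga_data)
```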
head(ga_data) looks [almost] fine
Now, let's try plotting the data.
Tried running this code but got an error
What does that mean?
?geom_bar()
There are two types of bar charts: geom_bar makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use geom_col instead. geom_bar uses stat_count by default
So, geom_bar uses count...which is what we don't want.
Found a solution on Stackoverflow tackling this.
This one executes!
Went back and re-read the geom_bar() help and found out about geom_col(). From the geom_bar help: If you want the heights of the bars to represent values in the data, use geom_col instead.
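To see the two options side by side, here is a small sketch on made-up data (the toy data frame stands in for ga_data; its values are invented). I'm assuming the usual fix of passing stat = "identity" to geom_bar(), which tells it to use the y values as given instead of counting rows; geom_col() is the shorthand for the same thing.

```r
library(ggplot2)

# Invented numbers standing in for the real ga_data
toy <- data.frame(
  landingPagePath = c("/blog/post-a", "/blog/post-b", "/blog/post-c"),
  sessions = c(120, 45, 80)
)

# geom_bar() counts rows by default; stat = "identity" plots the y values as they are
p_bar <- ggplot(toy, aes(x = landingPagePath, y = sessions)) +
  geom_bar(stat = "identity")

# geom_col() gives the same bars without the extra argument
p_col <- ggplot(toy, aes(x = landingPagePath, y = sessions)) +
  geom_col()

print(p_col)
```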
This led to another search query. How to tilt x axis labels in ggplot2...Again, StackOverflow
https://stackoverflow.com/questions/1330989/rotating-and-spacing-axis-labels-in-ggplot2
X axis labels rotated at 25 degrees.
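The rotation itself is a one-line theme() tweak. A minimal sketch on invented data (the toy data frame is a stand-in, not the real pull); hjust = 1 is my addition so the end of each label lines up with its tick.

```r
library(ggplot2)

# Invented numbers standing in for the real ga_data
toy <- data.frame(
  landingPagePath = c("/blog/post-a", "/blog/post-b", "/blog/post-c"),
  sessions = c(120, 45, 80)
)

# Rotate the x axis tick labels by 25 degrees
p <- ggplot(toy, aes(x = landingPagePath, y = sessions)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 25, hjust = 1))

print(p)
```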
Ok, so we can see that two posts stand out in the last 30 days...GTM sending undefined value to GA and Difference in eng. rate between Ad Manager/FBInsights
Notice that the x axis label is cropped for the first label. Will try and search around for a solution for this and post in a separate blog post.
Full code below with the main chunk from Dartistics + my changes to segment_element and playing around with geom_bar / geom_col