Use IMPORTHTML in Google Sheets + Google Analytics data to know website users per million city population
I got this idea from watching MeasureSchool’s video on BigQuery where the speaker talked about sending external population type data as a custom dimension to Google Analytics.
In this blog post, we’re going to use Google Sheets to send details about a city’s population and find out Google Analytics traffic [users] per 1 M city population. The main aspects of pulling this info are:
Use Google Sheets GA plug-in to run a report by cities [in this example, I’ve used US cities]. This is quite straightforward if you’re familiar with the plug-in.
Using IMPORTHTML formula in Google Sheets to pull the population information on key cities of US.
Finding a way to clean the data and then match the data, connecting the population and web traffic numbers by city.
Creating a Data Studio map chart type with a calculated metric to show Users Per 1M population to make comparisons between cities fairer.
Let’s go!
1. Google Analytics Google Sheets: Let’s skip this one. There' are lots of videos on basic features of a Google Analytics Google Sheets add-on. We will download Users as a metric along with city, country as dimensions with segment country ==United States.
2. Using IMPORTHTML in Google Sheets: Well, this is a pretty handy formula. The syntax is pretty much (link to page that contains the data, table/list depending on how data exists, index number to retrieve the data). In this case, it’s from Wiki page on cities of US, index 5 [it’s a bit of trial and error as there are smaller tables on the page]
=IMPORTHTML("https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population","table",5). All you need to do is open a new tab in the same sheet where you have Google Analytics report. You need to enter this formula in just one cell and it will automatically populate all columns/rows. No need to drag across.
3. Cleaning up the data: So, many of the city names came out with asterisks and [char within parentheses]. Link to the Google Sheet. https://docs.google.com/spreadsheets/d/1eQkYPYLcrgux12BEGpgyZ8lt5_MtUGt9XBUENgUa3k4/edit#gid=1163726676
In order to counter this, let’s add a new column B here: City Clean in order to remove the asterisks. This runs on the below formula:
=if(left(D2,1)="*", mid(D2,2,find("*",D2,2)-2),D2)
How does this formula work? It’s a nested formula using IF condition.
> IF the first character from LEFT formula for cell D2 [which has the city name from Wiki] is a *,
> THEN use MID formula for D2 where start extracting the text from position 2
> In order to know when to stop [Find the second * so that the city name is between the first and second ], use FIND function to find * starting from second character and then subtract 2 positions to know how long should the MID function run till.
> IF the above condition is met, it runs that formula, otherwise, it just picks the value in D2 and shows in B2.
Note: About 10 rows of data had “City name [x]”. I cheated and just manually pasted those ones.
Ok, now to get the traffic numbers in the Wiki Google sheet.
=iferror(index('ImportHTML - example'!C:C,match(B2,'ImportHTML - example'!B:B,0),0, index('ImportHTML - example'!C:C,match(B2,'ImportHTML - example'!B:B,0))))
What does this formula do? It runs an Index /Match to pull the website traffic numbers from the Google Analytics sheet. It looks long because it’s wrapped inside an IFERROR formula - which helps to remove “NA” values when the city is not found.
So, if city is not found, return a 0 , ELSE, find the traffic numbers.
4. Visualize in Google Data Studio: You can now connect your Google Sheet to Data Studio, create a calculated field where you find out the users per 1 M city population and use it in your Google Maps chart type.
Link to data visualization: https://datastudio.google.com/reporting/2c5b80ea-55ba-4c19-ad7a-e033051051a4
That’s about it. Hope you found this post helpful.
Here’s a link to another post that uses Google Sheets Geocode add-on to get lat/longs of addresses and create a Google map chart type in Google Data Studio.