Two years ago, Google introduced the Indexing API to solve an issue that affected job posting and livestream websites: having outdated content in the index. The Google Developers documentation says:
“You can use the Indexing API to tell Google to update or remove pages from the Google index. The requests must specify the location of a web page. You can also get the status of notifications that you have sent to Google. Currently, the Indexing API can only be used to crawl pages with either job posting or livestream structured data.”
Many SEOs are also using the Indexing API for non-job-related websites, which is why I decided to build an R script to try out the API (and it worked).
I’m going to show you how the script works, but don’t forget that there is a default quota of 200 URLs per day!
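If your URL list is longer than that, a simple way to stay within the quota is to split the vector into daily chunks of 200 and submit one chunk per day. Here is a minimal sketch (the example URLs and the all_urls name are just placeholders):

# Minimal sketch: split a longer URL vector into daily batches of 200
all_urls <- paste0("https://www.example.com/page-", 1:450) # placeholder URLs

daily_batches <- split(all_urls, ceiling(seq_along(all_urls) / 200))

length(daily_batches)    # 3 batches: 200, 200 and 50 URLs
daily_batches[[1]][1:3]  # first three URLs of the first batch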
Create the Indexing API Credentials
First of all, you have to generate the client ID and client secret for the API.
Open the Google API Console and go to the API Library.
Open the Indexing API page and enable the API. Then go to the Credentials page, where you’ll find your credentials.
googleAuthR package & Indexing API options
I created the script using the googleAuthR package, which allows you to send requests to Google APIs.
The script takes as input a character vector of up to 200 URLs and returns the API responses in a data frame (see the screenshot below).
You can use the script to update or delete pages. You just have to change the type value inside the indexingApi function:
- Use type = “URL_UPDATED” if you have to update the page
- Use type = “URL_DELETED” if you have to remove the page
# LAST UPDATE: 20-03-2021

# Install & load the packages
install.packages("googleAuthR")
install.packages("tidyverse")
install.packages("readr")

library("googleAuthR")
library("tidyverse")
library("readr")

# Set credentials and scope
clientId <- "PASTE HERE YOUR CLIENT ID"
clientSecret <- "PASTE HERE YOUR CLIENT SECRET"
scope <- "https://www.googleapis.com/auth/indexing"

options("googleAuthR.client_id" = clientId,
        "googleAuthR.client_secret" = clientSecret,
        "googleAuthR.scopes.selected" = scope,
        "googleAuthR.verbose" = 0 # Not mandatory - I just use it to debug the script
)

# Google API OAuth
gar_auth()

# List of URLs - daily limit of 200 URLs
urls <- read_csv("~/Desktop/Your-file-name.csv")
urls <- urls[[1]] # extract the first column as a character vector (one URL per element)

# indexingApi function - use it to send requests to the Indexing API, one URL at a time
# It also gets the API response and stores it in a data frame
indexingApi <- function(page) {
  body <- list(
    url = page,
    type = "URL_UPDATED")

  f <- gar_api_generator("https://indexing.googleapis.com/v3/urlNotifications:publish",
                         "POST")

  result <- f(the_body = body)
  # Extract the relevant part of the parsed response and turn it into a data frame
  result <- as.data.frame(result[[6]][[1]][[2]])

  return(result)
}

# The API responses are stored in a data frame
APIResponse <- map_dfr(
  .x = urls,
  .f = indexingApi)

# You can download the API responses as a .csv file
write.csv(APIResponse, "Your-file-name.csv")
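As the documentation quoted at the beginning mentions, the API can also return the status of notifications you have already sent. The script doesn’t cover that, but a minimal sketch using the same googleAuthR pattern could look like the snippet below (untested, and the example URL is just a placeholder):

# Minimal sketch: check the status of a notification sent earlier
# It calls the getMetadata endpoint of the Indexing API with the same credentials and scope
getMetadata <- gar_api_generator("https://indexing.googleapis.com/v3/urlNotifications/metadata",
                                 "GET",
                                 pars_args = list(url = ""))

status <- getMetadata(pars_arguments = list(url = "https://www.example.com/some-page"))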
Is this code using a regular comma-separated values (csv) file? I keep on getting “Error: API returned: Invalid JSON payload received. Unknown name “url” at ‘url_notification’: Proto field is not repeating, cannot start list.” if I add more than just one URL to the file.
Thanks for your comment!
I realized that one line of code was missing. You can find the updated code snippet above, it should work now 🙂
I have got a problem
Error 400: redirect_uri_mismatch
How can I fix it?
Hi Nick,
It looks like you did something wrong when you set up the API (see https://stackoverflow.com/questions/11485271/google-oauth-2-authorization-error-redirect-uri-mismatch). I would check the settings in the API console.
Hi Ruben, amazing guide, as usual! 🙂
In case the read_csv won’t work, I’ve found a workaround using: urls <- file.choose()
Thank you Alessandro! Yes, there are multiple ways to import different files into R.
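Just keep in mind that file.choose() only returns the file path, so you still have to pass it to read_csv(). Something like this should do it (a quick, untested sketch):

path <- file.choose()   # opens an interactive file picker and returns the path
urls <- read_csv(path)  # read the selected .csv file
urls <- urls[[1]]       # keep the first column as a character vector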
Hi Ruben, thanks for your great work! Despite the fix you’ve made, I still receive an error when trying to load more than just one URL from the csv file: “Error: API returned: Invalid JSON payload received. Unknown name “url” at ‘url_notification’: Proto field is not repeating, cannot start list.” 🙁
Hi Vince, are you sure that the .csv file is in the correct format? Because for me it’s working fine.
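That error usually means that a whole column of URLs is being sent in the url field instead of a single URL. Making sure that urls is a plain character vector before calling map_dfr should fix it, for example (a quick sketch assuming the URLs are in the first column of the file):

urls <- read_csv("~/Desktop/Your-file-name.csv")
urls <- urls[[1]] # a plain character vector, so map_dfr sends one URL per request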