
Scraping data is powerful. But what's even better is automating where that data lands. If you're manually copying and pasting scraped info into spreadsheets, you're stuck in the past. It's time to streamline your workflow.
This guide walks you through saving scraped web data—specifically book details like titles, prices, and availability—directly into Google Sheets using Python. You'll learn how to scrape, organize, and store your data with minimal hassle.
Before diving in, make sure you have:
A Google Cloud account set up with Google Sheets API enabled.
A service account with Editor permissions and its JSON credentials file.
Python installed, along with these libraries: requests, beautifulsoup4, and gspread.
Use this command to install the libraries if you haven't already:
pip install requests beautifulsoup4 gspread
To interact with Google Sheets programmatically, you’ll need to:
Create a Google Cloud Project: Head to Google Cloud Console and spin up a new project.
Enable the Google Sheets API: Find it in the APIs library and enable it for your project.
Create a Service Account: Under IAM & Admin, create a service account with the Editor role.
Download Credentials: Grab the JSON key file for this service account and store it safely. You’ll need its file path soon.
If you get stuck, Google has solid documentation and plenty of tutorials walking through these steps.
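You'll need the service account's email address for the next step. A quick way to find it is to read it straight from the JSON key file (the path below is a placeholder, point it at wherever you saved your key):

import json

# Placeholder path - substitute the location of your downloaded key file
with open("path/to/your/service_account.json") as f:
    print(json.load(f)["client_email"])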
Create a new Google Sheet—let's call it Book Data.
Now, share this sheet with your service account email (found in your JSON file under client_email). Make sure to grant Editor permissions so your script can write data.
Alternatively, you could make the sheet public, but sharing with the service account is more secure.
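Before writing the full script, it's worth a quick sanity check that the service account can actually reach the sheet. A minimal sketch, assuming placeholder values for your key file path and sheet URL:

import gspread

# Placeholder path and URL - substitute your own
client = gspread.service_account(filename="path/to/your/service_account.json")
spreadsheet = client.open_by_url("https://docs.google.com/spreadsheets/d/YOUR_SPREADSHEET_ID")
print(spreadsheet.sheet1.title)  # Should print the first worksheet's name, e.g. "Sheet1"

If this raises a permission error, double-check that the sheet is shared with the client_email address and that the Google Sheets API is enabled for your project.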
We'll scrape data from the Books to Scrape demo site—a perfect sandbox for practicing.
We want three key pieces of info from each book on the first page:
Title
Price
Availability
Here's a Python function that does just that:
from typing import List

import requests
from bs4 import BeautifulSoup


def extract_book_data() -> List[dict]:
    url = "http://books.toscrape.com/"
    response = requests.get(url)
    book_data = []

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        # Each book on the page sits in an <li> with these Bootstrap grid classes
        books = soup.find_all('li', class_='col-xs-6 col-sm-4 col-md-3 col-lg-3')
        for book in books:
            # The full title lives in the <a> tag's title attribute
            title = book.find('h3').find('a')['title']
            price = book.find('p', class_='price_color').text
            availability = book.find('p', class_='instock availability').text.strip()
            book_data.append({
                'title': title,
                'price': price,
                'availability': availability,
            })
    else:
        print(f"Failed to retrieve page: Status code {response.status_code}")

    return book_data
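A quick way to confirm the scraper works before wiring up Google Sheets is to run it on its own and inspect the first result. You should see output along these lines:

books = extract_book_data()
print(f"Scraped {len(books)} books")
print(books[0])  # e.g. {'title': 'A Light in the Attic', 'price': '£51.77', 'availability': 'In stock'}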
Next, push your scraped data into the Google Sheet. The function below connects with your JSON key file, builds a header row plus one row per book, and writes everything to the sheet in a single update:
import gspread


def save_books_to_google_sheet(json_key_path, spreadsheet_url, books_data):
    try:
        # Authenticate with the service account's JSON key file
        client = gspread.service_account(filename=json_key_path)
        sheet = client.open_by_url(spreadsheet_url).sheet1

        # Build the header row plus one row per book, then write it all in a
        # single API call; updating row by row makes one request per book and
        # quickly runs into Google's rate limits
        rows = [["Title", "Price", "Availability"]]
        for book in books_data:
            rows.append([book["title"], book["price"], book["availability"]])
        # Keyword arguments keep this call compatible across gspread versions
        sheet.update(range_name=f"A1:C{len(rows)}", values=rows)

        print("Data successfully written to Google Sheet!")
    except Exception as e:
        print(f"Error: {e}")
Here's how to run everything:
if __name__ == "__main__":
    json_key_path = "path/to/your/service_account.json"  # Update this
    spreadsheet_url = "https://docs.google.com/spreadsheets/d/YOUR_SPREADSHEET_ID"  # Update this

    books_data = extract_book_data()
    if books_data:
        save_books_to_google_sheet(json_key_path, spreadsheet_url, books_data)
    else:
        print("No data scraped. Check the website or your connection.")
Now that you've learned how to scrape book data using Python and automatically save it to Google Sheets, you can streamline your workflow and eliminate tedious manual tasks. Automation not only boosts efficiency but also lets you focus on analyzing and using your data effectively. Give it a try, experiment with the code, and watch how easy managing your scraped data can become.