Gousto Recipe Scraper

A Python script to scrape recipe data from Gousto's website and save it to a JSON file.

Prerequisites

  • Python 3.7+
  • Chrome or Chromium browser (for Selenium)
  • ChromeDriver (installed automatically by webdriver-manager)

Setup

  1. Clone this repository:

    git clone <repository-url>
    cd gousto-scraper
    
  2. Create and activate a virtual environment:

    # On Linux/macOS
    python3 -m venv venv
    source venv/bin/activate
    
    # On Windows
    python -m venv venv
    .\venv\Scripts\activate
    
  3. Install the required packages:

    pip install -r requirements.txt
    

Usage

Run the scraper with the following command:

    python scraper.py

This will:

  1. Scrape recipe data from Gousto's website
  2. Save the results to gousto_recipes.json

Options

  • --use-selenium (default: True): Use Selenium for JavaScript rendering
  • --headless (default: True): Run browser in headless mode
  • --max-pages: Maximum number of recipe pages to scrape (default: all)
  • --output: Output JSON file path (default: gousto_recipes.json)

Example:

    python scraper.py --max-pages 5 --output recipes.json
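
For orientation, the options above could be declared with argparse roughly as follows. This is only a sketch: the actual parsing code in scraper.py is not shown in this README, so `build_parser` and `str_to_bool` are hypothetical names, and the assumption that the boolean options accept an explicit value (for example `--headless false`) may not match the real script.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse common true/false spellings given on the command line."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y"}:
        return True
    if lowered in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options listed in this README (hypothetical)."""
    parser = argparse.ArgumentParser(description="Scrape Gousto recipes to JSON.")
    # Assumption: boolean options take an explicit value, e.g. --headless false.
    parser.add_argument("--use-selenium", type=str_to_bool, default=True)
    parser.add_argument("--headless", type=str_to_bool, default=True)
    # None stands for "all pages".
    parser.add_argument("--max-pages", type=int, default=None)
    parser.add_argument("--output", default="gousto_recipes.json")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(["--max-pages", "5", "--output", "recipes.json"])
print(args.max_pages, args.output, args.headless)
```

Passing an explicit list to `parse_args` keeps the sketch independent of the real command line; a script would normally call `parse_args()` with no arguments.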

Output

The script saves the scraped data to a JSON file containing an array of recipe objects, each including:

  • Title
  • Description
  • Ingredients
  • Cooking time
  • Nutritional information
  • And more
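
A minimal sketch of reading the output file back in Python is shown here. The README does not state the exact JSON key names, so the key `"title"` is an assumption, and `parse_recipes`, `load_recipes`, and `recipe_titles` are hypothetical helpers rather than part of the scraper.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def parse_recipes(text: str) -> List[Dict[str, Any]]:
    """Parse JSON text and check that it is an array of recipe objects."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of recipe objects")
    # Keep only JSON objects; anything else is skipped.
    return [item for item in data if isinstance(item, dict)]


def load_recipes(path: str = "gousto_recipes.json") -> List[Dict[str, Any]]:
    """Read the scraper's output file (default name taken from this README)."""
    return parse_recipes(Path(path).read_text(encoding="utf-8"))


def recipe_titles(recipes: List[Dict[str, Any]], key: str = "title") -> List[str]:
    """Collect titles; "title" is an assumed key name, so missing keys are tolerated."""
    return [str(recipe.get(key, "<untitled>")) for recipe in recipes]
```

After a scrape has finished, `recipe_titles(load_recipes())` would list the titles, provided the assumed key matches the real output.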

Notes

  • This script is for educational purposes only
  • Be respectful of Gousto's website: avoid sending many requests in a short period
  • The website structure might change over time, which could break the scraper
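
One simple way to space out requests is to pause between consecutive pages. The sketch below illustrates the idea only; it does not describe how scraper.py behaves, and the `throttled` helper and the 2-second default delay are assumptions.

```python
import time
from typing import Callable, Iterable, Iterator


def throttled(
    urls: Iterable[str],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield URLs one at a time, pausing before every item except the first."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        yield url
```

A scraping loop could then iterate with `for url in throttled(recipe_urls):` and fetch each page inside the loop, where `recipe_urls` is a hypothetical list of page URLs. The `sleep` parameter exists so the delay can be replaced in tests.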