Gousto Recipe Scraper
A Python script to scrape recipe data from Gousto's website and save it to a JSON file.
Prerequisites
- Python 3.7+
- Chrome or Chromium browser (for Selenium)
- ChromeDriver (will be installed automatically by webdriver-manager)
Setup
- Clone this repository:
    git clone <repository-url>
    cd gousto-scraper
- Create and activate a virtual environment:
    # On Linux/macOS
    python3 -m venv venv
    source venv/bin/activate
    # On Windows
    python -m venv venv
    .\venv\Scripts\activate
- Install the required packages:
    pip install -r requirements.txt
Usage
Run the scraper with the following command:
python scraper.py
This will:
- Scrape recipe data from Gousto's website
- Save the results to gousto_recipes.json
Options
- --use-selenium: Use Selenium for JavaScript rendering (default: True)
- --headless: Run the browser in headless mode (default: True)
- --max-pages: Maximum number of recipe pages to scrape (default: all)
- --output: Output JSON file path (default: gousto_recipes.json)
Example:
python scraper.py --max-pages 5 --output recipes.json
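The options above could be declared with argparse roughly as follows. This is a minimal sketch, not the actual contents of scraper.py: the helper names parse_bool and build_parser are hypothetical, and accepting an optional true/false value for the boolean flags is an assumption about how the defaults of True might be switched off.

```python
import argparse


def parse_bool(value):
    """Interpret common true/false spellings (hypothetical helper)."""
    text = str(value).strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % (value,))


def build_parser():
    """Build a parser mirroring the documented options (assumed layout)."""
    parser = argparse.ArgumentParser(description="Scrape Gousto recipes to JSON")
    # Assumption: the boolean flags take an optional value, e.g. "--headless false".
    parser.add_argument("--use-selenium", type=parse_bool, nargs="?",
                        const=True, default=True,
                        help="Use Selenium for JavaScript rendering")
    parser.add_argument("--headless", type=parse_bool, nargs="?",
                        const=True, default=True,
                        help="Run the browser in headless mode")
    # None stands in for the documented default of "all pages".
    parser.add_argument("--max-pages", type=int, default=None,
                        help="Maximum number of recipe pages to scrape")
    parser.add_argument("--output", default="gousto_recipes.json",
                        help="Output JSON file path")
    return parser


# Parse the example invocation from this README.
args = build_parser().parse_args(["--max-pages", "5", "--output", "recipes.json"])
print(args.max_pages, args.output, args.headless, args.use_selenium)
```

Under this sketch, passing --headless false would turn headless mode off; the real script may use a different convention.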
Output
The script saves the scraped data to a JSON file containing an array of recipe objects, each including:
- Title
- Description
- Ingredients
- Cooking time
- Nutritional information
- And more
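Reading the output back might look like the sketch below. The key names "title" and "ingredients" are assumptions based on the field list above, since the exact JSON keys are not documented here; the functions parse_recipes, load_recipes, and summarize are hypothetical helpers.

```python
import json


def parse_recipes(text):
    """Parse JSON text and check that it is an array of recipe objects."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of recipe objects")
    return data


def load_recipes(path="gousto_recipes.json"):
    """Load recipes from the scraper's output file."""
    with open(path, encoding="utf-8") as handle:
        return parse_recipes(handle.read())


def summarize(recipe):
    """Return a one-line summary; key names are assumed, so use .get()."""
    title = recipe.get("title", "(untitled)")
    ingredients = recipe.get("ingredients") or []
    return "%s (%d ingredients)" % (title, len(ingredients))
```

After a run, something like `for recipe in load_recipes(): print(summarize(recipe))` would list each recipe; adjust the keys to match the actual file.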
Notes
- This script is for educational purposes only
- Be respectful of Gousto's website: avoid sending many requests in a short period
- The website structure might change over time, which could break the scraper
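One way to keep the request rate low is to enforce a minimum gap between requests. The sketch below is illustrative only: the Throttle class is hypothetical and is not known to be part of scraper.py, and a suitable interval is a judgment call.

```python
import time


class Throttle:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each page fetch, with for example `throttle = Throttle(2.0)`, would space requests at least two seconds apart.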