Overview
Dive into advanced Scrapy techniques in this 29-minute EuroPython Conference talk by Juan Riaza. Explore the anatomy of a Scrapy spider, learn to use the interactive shell, and understand items and item loaders. See examples of pipelines and middlewares, techniques to avoid getting banned, and how to deploy Scrapy projects. Gain insights into 'Scrapy-fying' early on, parsing HTML, using item exporters, and working with DjangoItem. The talk also covers visualization, the open-source Scrapy daemon (Scrapyd), cloud-based deployment, API usage, and Python integration, and closes with recommendations for effective web scraping.
Syllabus
Intro
API
Python
HTML Parsers
Example
What I recommend
Interactive Shell
XKB Project
Item Loader
Item exporters
DjangoItem
Under the hood
Item Pipeline
Item Requests
Scrapy
Visualization
How to avoid getting banned
Open Source
Scrapyd (Scrapy daemon)
Scrapy Cloud
About our products
Hiring
Q&A
Taught by
EuroPython Conference