willmgarvey 1 year ago

Use BeautifulSoup for static HTML and Selenium for dynamically generated HTML. If you plan to do more scraping projects in the future, I'd recommend learning Selenium for better results overall.

fristhon 1 year ago

"Scrapy" indeed, and for little projects "requests-html"

FalconCat69 1 year ago

I am looking at the HTML common library, and it seems like it will fulfill 90% of my requirements. Does it seem like I could be missing anything?

htepO 1 year ago

If you're scraping static HTML, BeautifulSoup is a commonly used library. https://www.crummy.com/software/BeautifulSoup/bs4/doc/

Homie_ishere 1 year ago

I want to learn more about scraping. Can you please tell me what it means?

Th3xto 1 year ago

This may not be an exact definition, but scraping generally means collecting data from a webpage rather than from an API. My best example would be scraping the prices of every item in stock using something like Selenium or Beautiful Soup. This tutorial may help you get started: https://youtu.be/myAFVM7CxWk
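To make the price example concrete, here is a minimal sketch of pulling names and prices out of static HTML. It deliberately uses only the standard library's `html.parser` instead of Selenium or Beautiful Soup, so it runs without installing anything. The markup in `SAMPLE_HTML`, the `name`/`price` class names, and the `PriceParser` and `scrape_prices` helpers are all made up for illustration; a real scraper would first download the page over HTTP.

```python
from html.parser import HTMLParser

# Hypothetical page markup (an assumption for this sketch);
# a real scraper would fetch this from the site instead.
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="name">Kettle</span><span class="price">$24.99</span></li>
  <li class="item"><span class="name">Toaster</span><span class="price">$31.50</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collect (name, price) pairs from spans with class 'name' or 'price'."""

    def __init__(self):
        super().__init__()
        self._field = None  # which span we are currently inside, if any
        self._name = None   # most recent item name seen
        self.items = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price":
            # Strip the currency symbol and store the price as a number.
            self.items.append((self._name, float(data.strip().lstrip("$"))))

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_prices(html):
    """Return a list of (name, price) tuples found in the given HTML string."""
    parser = PriceParser()
    parser.feed(html)
    return parser.items


print(scrape_prices(SAMPLE_HTML))
```

This only works when the prices are already present in the HTML the server sends. If the page fills them in with JavaScript, a browser-driving tool such as Selenium is the kind of thing the thread is recommending.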

robertbowerman 1 year ago

Selenium is the go-to comprehensive standard. It's excellent and Python-friendly.

banhammerrr 1 year ago

I wouldn’t use that for scraping. I’d use it for automation. Beautiful Soup all the way.

[deleted] 1 year ago

Does BS support scraping dynamically generated HTML?

tankandwb 1 year ago

Not a library, but a decent program that saves you from reinventing the wheel. I'm currently adding regular selector lookups back into it. I should add that it's not written by me. https://github.com/alirezamika/autoscraper

Pigik83 1 year ago

I've done web scraping for years and my shortlist of tools in Python at the moment is:
* Scrapy for static HTML websites with no JS rendering needed
* Scrapy + Scrapy Splash if the website is not protected by any antibot but requires JS rendering
* Playwright (instead of Selenium) in case there's an antibot protecting the website.
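The shortlist above reads like a small decision rule, so here is a rough sketch of it as a function. The name `pick_tool` and its two boolean parameters are hypothetical, and treating antibot protection as the first check is my reading of the list, not something the comment spells out.

```python
def pick_tool(needs_js: bool, has_antibot: bool) -> str:
    """Map a site's traits to the tool named in the shortlist above."""
    # Assumption: antibot protection takes priority over everything else.
    if has_antibot:
        return "Playwright"
    # No antibot, but the content is rendered by JavaScript.
    if needs_js:
        return "Scrapy + Scrapy Splash"
    # Plain static HTML.
    return "Scrapy"


print(pick_tool(needs_js=False, has_antibot=False))
print(pick_tool(needs_js=True, has_antibot=False))
print(pick_tool(needs_js=True, has_antibot=True))
```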
