Trying to scrape a wedding gallery website for wedding images, what software would you suggest?

Arietty@jlai.lu · 4 months ago

Trying to scrape a wedding gallery website for wedding images, what software would you suggest?

echindod@programming.dev · 4 months ago

I’d probably use Selenium, but that depends on the site.

ExperimentalGuy@programming.dev · 4 months ago

Before scraping, I would check whether the site has an HTTP API you can send requests to directly instead of scraping the pages. The images served through an API might be higher quality than what you can scrape. If there is no easy-to-use HTTP API, fall back to scraping. I generally consider scraping the last option, unless it’s a ridiculously easy website to scrape.
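If the gallery does load its images from a JSON endpoint, pulling the URLs out of that response can be fairly short. This is only a rough sketch in Python: the payload below is made up for illustration (the real site's field names and structure are unknown), and `find_image_urls` is a hypothetical helper, not part of any library.

```python
import json
from urllib.parse import urlparse

# Extensions treated as "image-like"; adjust to whatever the site actually serves.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp", ".gif")

def find_image_urls(node):
    """Recursively collect string values whose URL path ends in an image extension."""
    found = []
    if isinstance(node, dict):
        for value in node.values():
            found.extend(find_image_urls(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_image_urls(item))
    elif isinstance(node, str):
        # Check only the path, so a query string like ?w=2000 does not hide the extension.
        if urlparse(node).path.lower().endswith(IMAGE_EXTENSIONS):
            found.append(node)
    return found

# Made-up response body standing in for whatever the real API returns.
sample = json.loads(
    '{"gallery": {"title": "Wedding", "photos": ['
    '{"id": 1, "full": "https://example.com/img/a.jpg?w=2000"},'
    '{"id": 2, "full": "https://example.com/img/b.PNG"}]}}'
)
urls = find_image_urls(sample)
print(urls)
```

Walking the whole structure avoids guessing field names up front; once you know which key holds the full-size image, reading that key directly is cleaner.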

8263ksbr@lemmy.ml · 4 months ago

Puppeteer and Playwright have not been mentioned yet.

Kissaki@programming.dev · edit-2 4 months ago

You didn’t even describe how the images are presented on the website.

I would use the web browser’s (e.g. Firefox’s) save page functionality.

Or open the browser dev tools, run document.querySelectorAll('img'), get the URLs from the result, and use those.

Or the Page Info media tab.

Or the dev tools network tab, to identify and reuse the image web requests.

Or use Nushell with query module enabled, and http get query html.

Or my own C# util.

But I suspect there’s auth in play, so the only easy access is from within the browser session?
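For what it’s worth, the document.querySelectorAll('img') idea can also be sketched outside the browser, on a page you saved. The following is a minimal Python sketch using only the standard library. The HTML snippet, the base URL, and the cookie value are placeholders, and `collect_image_urls` and `build_request` are hypothetical helper names. It also assumes the images sit in plain `src` attributes; lazy-loading galleries may put them in other attributes instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request

class ImgCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags that have no src attribute
            self.urls.append(urljoin(self.base_url, src))

def collect_image_urls(html, base_url):
    parser = ImgCollector(base_url)
    parser.feed(html)
    return parser.urls

def build_request(url, cookie_header):
    # Assumption: the site accepts the same Cookie header your logged-in browser sends.
    # This only builds the request object; nothing is sent until you call urlopen on it.
    return Request(url, headers={"Cookie": cookie_header})

# Placeholder HTML standing in for the saved gallery page.
html = (
    '<div><img src="/photos/1.jpg">'
    '<img src="https://cdn.example.com/2.jpg" alt="x"/>'
    '<img alt="no src"></div>'
)
urls = collect_image_urls(html, "https://example.com/gallery/")
req = build_request(urls[0], "session=PLACEHOLDER")
print(urls)
```

If auth really is in play, the cookie value would have to be copied from the browser session (for example from a request in the dev tools network tab), and it may expire quickly.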