A Scrapy-based project that crawls and scrapes images and saves them into a three-level folder hierarchy (Girl/Album/img). Pandas is used to create, read, and edit CSV files. Selenium is used to retrieve JS-loaded elements.
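As a minimal sketch of the Girl/Album/img layout described above, a save path could be assembled as follows. The function name `image_path`, the `downloads` root, and the character-sanitising rule are assumptions for illustration only; they are not taken from the repository.

```python
import re
from pathlib import Path


def image_path(root, girl, album, filename):
    """Build root/Girl/Album/img (hypothetical helper, not from the repo)."""

    def clean(part):
        # Replace characters that are unsafe in file names on common
        # filesystems, so a scraped title cannot create extra folder levels.
        cleaned = re.sub(r'[\\/:*?"<>|]', "_", part).strip()
        return cleaned or "unknown"

    return Path(root) / clean(girl) / clean(album) / clean(filename)


# Example with made-up names.
path = image_path("downloads", "Some Girl", "Album 01", "001.jpg")
print(path)
```

Before writing a file, the parent folders would typically be created with `path.parent.mkdir(parents=True, exist_ok=True)`.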

I-m-PhD/meituScrapy

Introduction

This project targets the MEITU131 site.

Specification

Built on macOS 13.1
with the OS's built-in Python 3.9.6
using scrapy, pillow, and pandas
some of the projects also use selenium for JS-loaded elements

Contents

Contains three scrapy projects:

  • meitu_model starts from the page 女神大全 (the full model index)
  • meitu_rank starts from the page 排行榜DIGG/总榜 (DIGG rankings / overall ranking)
  • meitu_rank_diy is similar to meitu_model, but it collects all models, including those not visible on the index page, sorts them by "score" (a JS-loaded element on each model's page), and parses the top-ranked ones.
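
The sort-by-score step in meitu_rank_diy can be sketched with pandas roughly as below. This is an illustration under assumptions, not the repository's code: the CSV columns (`name`, `url`, `score`), the sample rows, and the helper `top_models` are all hypothetical, and the score values are presumed to have been scraped already (for example via Selenium).

```python
import io

import pandas as pd

# Hypothetical CSV layout: one row per model, with the score taken from
# the JS-loaded element on that model's page. A blank score means it
# could not be scraped.
SAMPLE_CSV = """name,url,score
Alice,https://example.com/a,8.7
Bella,https://example.com/b,9.4
Cora,https://example.com/c,
Dana,https://example.com/d,9.1
"""


def top_models(csv_source, n=2):
    """Return the n highest-scoring models (hypothetical helper)."""
    df = pd.read_csv(csv_source)
    # Coerce unparsable scores to NaN, then drop the rows without a score.
    df["score"] = pd.to_numeric(df["score"], errors="coerce")
    df = df.dropna(subset=["score"])
    ranked = df.sort_values("score", ascending=False)
    return ranked.head(n).reset_index(drop=True)


top = top_models(io.StringIO(SAMPLE_CSV), n=2)
print(top["name"].tolist())  # ['Bella', 'Dana']
```

Only the rows returned by such a step would then be passed on for album and image parsing, which keeps the expensive crawl limited to the top-ranked models.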
