This is a Scrapy pipeline that provides an easy way to store files and images using various folder structures.
Given the scraped file 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg, you can choose among the following folder structures:
Using the file name
class: scrapy_folder_tree.ImagesHashTreePipeline
full
├── 0
.   ├── 5
.   .   ├── b
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg

Using the crawling time
class: scrapy_folder_tree.ImagesTimeTreePipeline
full
├── 0
.   ├── 11
.   .   ├── 48
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg

Using the crawling date
class: scrapy_folder_tree.ImagesDateTreePipeline
full
├── 2022
.   ├── 1
.   .   ├── 24
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg

Install the package with pip:

pip install scrapy-folder-tree

Use the following settings in your project:
ITEM_PIPELINES = { 'scrapy_folder_tree.FilesHashTreePipeline': 300 }
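
To make the three layouts concrete, here is a minimal Python sketch of how a file name could be mapped to each folder structure shown above. It is not the package's actual implementation: the function names and the `depth` parameter are hypothetical, and reading the time tree as hour/minute/second and the date tree as year/month/day (unpadded) is an assumption inferred from the example trees.

```python
from datetime import datetime
from pathlib import PurePosixPath


def hash_tree_path(filename: str, depth: int = 3) -> str:
    """Nest the file under one folder per leading character of its name.

    `depth` is a hypothetical parameter; the trees above show three levels.
    """
    folders = list(filename[:depth])
    return str(PurePosixPath("full", *folders, filename))


def time_tree_path(filename: str, when: datetime) -> str:
    """Nest the file under hour/minute/second of the crawl (assumed order)."""
    parts = (str(when.hour), str(when.minute), str(when.second))
    return str(PurePosixPath("full", *parts, filename))


def date_tree_path(filename: str, when: datetime) -> str:
    """Nest the file under year/month/day of the crawl (assumed order)."""
    parts = (str(when.year), str(when.month), str(when.day))
    return str(PurePosixPath("full", *parts, filename))


name = "05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"
crawled_at = datetime(2022, 1, 24, 0, 11, 48)

print(hash_tree_path(name))              # full/0/5/b/05b40af0...jpg
print(time_tree_path(name, crawled_at))  # full/0/11/48/05b40af0...jpg
print(date_tree_path(name, crawled_at))  # full/2022/1/24/05b40af0...jpg
```

Scrapy's built-in files and images pipelines need a storage location set through `FILES_STORE` or `IMAGES_STORE`; these pipelines presumably need the same setting, with the `full` folder created beneath that location.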