Wrote a small script to scrape a forum thread for images. Could be useful with a thread like this.
https://github.com/me2cool/forum-image-scraper
The code is a bit messy for now...
The following property values should work for this (අත්-වැඩ) thread. Just replace the content of the input.properties file with the content below.
Code:
[UserInput]
# Host Site url, without an ending "/"
hostSite = https://elakiri.com
# Specific thread url. Define the complete thread without the ending "/"
thread = https://elakiri.com/threads/%E0%B6%85%E0%B6%AD%E0%B7%8A-%E0%B7%80%E0%B7%90%E0%B6%A9.1791466
# Pagination handling, the text before the page number
pageAppenderBefore = /page-
# Pagination handling, the text after the page number
pageAppenderAfter =
# Pagination handling, some forums increment direct urls to pages by multiplying page number (e.g. page urls can be 1, 10, 20, 30)
pageValueMultiply = 1
# We will start downloading from this page
startPage = 1
# We will end the download from this page
endPage = 20
# Should we also download FB image links? This takes little bit of extra time as we have to look for the direct image link in the source
shouldDownloadSocialLinks = True
# Ignore images below this size. This is defined in Bytes. 1000 Bytes is 1 KB. Default value here is 10KB.
smallImageSize = 10000
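In case it helps to see how the pagination values fit together, here is a rough Python sketch of how the page URLs could be built from those properties. This is only an illustration and not the actual code from the repo: the function name page_urls, the SAMPLE string, and the rule "thread + pageAppenderBefore + (page number x pageValueMultiply) + pageAppenderAfter" are my assumptions based on the comments in the properties file.

```python
import configparser

# Trimmed-down copy of the properties above (endPage lowered to 3 for brevity).
SAMPLE = """\
[UserInput]
thread = https://elakiri.com/threads/%E0%B6%85%E0%B6%AD%E0%B7%8A-%E0%B7%80%E0%B7%90%E0%B6%A9.1791466
pageAppenderBefore = /page-
pageAppenderAfter =
pageValueMultiply = 1
startPage = 1
endPage = 3
"""


def page_urls(config_text):
    """Hypothetical helper: list the page URLs implied by the properties."""
    # interpolation=None is needed because the thread URL contains '%'
    # escapes, which the default interpolation would try to expand.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    cfg = parser["UserInput"]

    multiplier = cfg.getint("pageValueMultiply")
    first = cfg.getint("startPage")
    last = cfg.getint("endPage")

    # Assumed rule: thread + text before + scaled page number + text after.
    return [
        cfg["thread"]
        + cfg["pageAppenderBefore"]
        + str(page * multiplier)
        + cfg["pageAppenderAfter"]
        for page in range(first, last + 1)
    ]


if __name__ == "__main__":
    for url in page_urls(SAMPLE):
        print(url)
```

With the values above this gives one URL per page, each ending in /page-1, /page-2 and so on. For a forum that counts by posts rather than pages, pageValueMultiply would be set to the step size instead of 1.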
If someone finds this useful or gives it a try, let me know.