Bum Kom

Crawling dev.to with Colly while learning Golang

I would like to share a website that I built while learning Golang.
The site, http://techdaily.info, is a project I made to practice the language.
Besides dev.to, it also crawls other websites such as freecodecamp.com, medium.com, hashnode.com, logrocket.com, and infoq.com.
In short, it is a website that specializes in crawling articles from other sites.
These are the technologies I used:

  • Golang
  • Colly
  • Nginx
  • systemd service
  • Docker
  • MySQL
  • GitHub Actions runner to deploy to the server
  • Cron job for daily crawling

Build and Run Locally

Copy app_example.yaml to app.yaml

cp app_example.yaml app.yaml 

Build Docker

docker-compose up --build 

Install Go packages

docker-compose exec crawl go mod tidy 

Create the vendor folder

docker-compose exec crawl go mod vendor 

Run Crawl

docker-compose exec crawl go run cmd/main.go 

Use air for live reload

docker-compose exec crawl air -c .air.conf 

Deploy

Run the Makefile to build the project into the bin folder

make copy_template build_app_web build_app_crawl 

Create services to run in the background

Create the service and run App Web

sudo nano /lib/systemd/system/app_web.service 

Paste the following content

[Unit]
Description=App Web

[Service]
Type=simple
Restart=always
RestartSec=5s
WorkingDirectory=/root/actions-runner/crawl/crawl/crawl/bin
ExecStart=/root/actions-runner/crawl/crawl/crawl/bin/app_web

[Install]
WantedBy=multi-user.target
sudo systemctl enable app_web
sudo systemctl start app_web
sudo systemctl status app_web

Run App Crawl

./app_crawl 

Add CronTab

crontab -e 

Add the cron schedule

0 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article
*/20 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article-detail
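
The first field of each entry is the minute, and it decides how often the job runs. As a small worked example, the sketch below expands a minute field over the range 0-59. It shows that a step of 20 matches three minutes per hour, while a step of 60 can only ever match minute 0, which is the same as writing `0`. This is not a full cron parser: the hypothetical `expandField` helper understands only `*`, `*/n`, and a single number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandField lists the values a simple cron field matches within [lo, hi].
// Supported forms: "*", "*/n", and a single number.
func expandField(field string, lo, hi int) ([]int, error) {
	// A plain number matches exactly that value.
	if n, err := strconv.Atoi(field); err == nil {
		if n < lo || n > hi {
			return nil, fmt.Errorf("value %d outside %d-%d", n, lo, hi)
		}
		return []int{n}, nil
	}

	step := 1
	switch {
	case field == "*":
		// every value in the range
	case strings.HasPrefix(field, "*/"):
		n, err := strconv.Atoi(strings.TrimPrefix(field, "*/"))
		if err != nil || n <= 0 {
			return nil, fmt.Errorf("invalid step in %q", field)
		}
		step = n
	default:
		return nil, fmt.Errorf("unsupported field %q", field)
	}

	// Steps are counted from the start of the range, not from "now".
	var out []int
	for v := lo; v <= hi; v += step {
		out = append(out, v)
	}
	return out, nil
}

func main() {
	for _, field := range []string{"*/20", "*/60", "0"} {
		minutes, err := expandField(field, 0, 59)
		if err != nil {
			fmt.Println(field, "error:", err)
			continue
		}
		fmt.Printf("%s -> %v\n", field, minutes)
	}
	// */20 -> [0 20 40]
	// */60 -> [0]
	// 0 -> [0]
}
```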

Reload cron

sudo service cron reload 

Website

http://techdaily.info/


https://github.com/chieund/crawl
