Documentation for WeHeard Web crawler
April 12, 2024
WH web-crawler
Installation
Crawler
Ensure you have the required .env file.
git clone git@github.com:hoshistech/weheard-crawler.git (SSH URL; requires an SSH key set up with GitHub)
cd weheard-crawler
source venv/bin/activate (activates the virtual environment on macOS/Linux)
pip install -r requirements.txt
Enter the desired usernames in the usernames.py file.
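The layout of usernames.py is not described in this document. As a minimal sketch, assuming the file simply exposes a Python list, it might look like the following. The variable name `usernames`, the sample handles, and the `clean_usernames` helper are all hypothetical; check the real file in the repository before editing it.

```python
# usernames.py (hypothetical layout, not confirmed by the repository)

# Placeholder handles, not real accounts.
usernames = [
    "example_store_one",
    "@Example_Store_Two ",
    "example_store_one",  # duplicate, dropped by clean_usernames
]


def clean_usernames(raw):
    """Strip whitespace and a leading '@', then drop blanks and duplicates, keeping order."""
    seen = set()
    cleaned = []
    for name in raw:
        handle = name.strip().lstrip("@")
        if handle and handle not in seen:
            seen.add(handle)
            cleaned.append(handle)
    return cleaned


if __name__ == "__main__":
    print(clean_usernames(usernames))
```

Tidying the list up front means a stray space or a repeated handle does not turn into a wasted or failed crawl.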
Once the external dependencies are installed (see the last section), you can continue with the steps below.
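Before running the crawler, it can help to confirm that the .env file from the first installation step is actually in place. The sketch below is one way to do that with the standard library only. It assumes the file uses plain KEY=VALUE lines, and the key name `EXAMPLE_API_URL` is a placeholder, since the real variable names are not listed in this document.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    env_path = Path(path)
    if not env_path.is_file():
        raise FileNotFoundError(f"Missing {env_path}")
    values = {}
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Placeholder list: replace with the variables the crawler really needs.
    REQUIRED = ["EXAMPLE_API_URL"]
    try:
        missing = missing_keys(read_env_file(), REQUIRED)
        print("Missing keys:", missing if missing else "none")
    except FileNotFoundError as err:
        print(err)
```

Run it from the directory that holds the .env file. It only reports what is missing and does not modify the file.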
From the current working directory:
cd main
python main.py
The crawler should automatically add products to the database, or return a response describing the account's validity or the error.
NOTE: The API needs to be running for the scraped products to be saved to the database.
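Because the crawler depends on the API being up, a quick reachability check before `python main.py` can save a wasted run. The sketch below assumes the API listens on uvicorn's default address, 127.0.0.1:8000; if the project's .env points the crawler elsewhere, adjust the host and port. The `api_is_up` helper is illustrative and not part of the repository.

```python
import socket


def api_is_up(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if api_is_up():
        print("API port is reachable; scraped products can be saved.")
    else:
        print("API port is not reachable; start the API before running the crawler.")
```

This only confirms that a process is listening on the port, not that it is the WeHeard API, so treat it as a first sanity check rather than a health check.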
API
Built with FastAPI. Why? It is minimalistic: the models and API routes all live in a single file.
cd main/api
source apienv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload (launches the API with hot reload)