How to contribute
Thank you for your interest in contributing to the project! We are very happy to have you here.
How the project is organized
Project structure
The project is organized as follows:
flowchart
. --> docs
. --> travel_pricing_scraper
. --> tests
The project is divided into three main folders: docs, travel_pricing_scraper, and tests. Where each directory has its own purpose.
travel_pricing_scraper
The travel_pricing_scraper folder contains the source code of the project.
flowchart
. --> travel_pricing_scraper
travel_pricing_scraper --> scrapers
travel_pricing_scraper --> flights
travel_pricing_scraper --> models.py
travel_pricing_scraper --> cli.py
The CLI code is in travel_pricing_scraper/cli.py. The scrapers folder contains the code for the scrapers. The flights folder contains the code for the flights. The models.py file contains the code for the models used in the project. The docs of the API are in travel_pricing_scraper as well, with the use of mkdocstrings. and they follow the Google docstring style. so if you change the code, please update the docs as well.
About the lib
The library uses Playwright as scraper to fetch the data and returns the response as Flight that is a pydantic model dataclass with the following structure:
class Flight(BaseModel):
id: uuid.UUID
origin: str
destination: str
departure_date: datetime.datetime
arrive_date: datetime.datetime
price: float
carrier: str
The CLI
The CLI is in travel_pricing_scraper/cli.py and it uses typer to create the CLI. If you want to extend the CLI you can check their docs.
To rich response of the CLI, we use rich if you want to extend the CLI you can check their docs.
An object of Console from rich and app from Typer are passed to the main function in cli.py, it is important to keep this structure to keep the CLI consistent.
console = Console()
app = typer.Typer()
Tests
For the tests we use pytest. The tests are in the tests folder and the test configuration can be found in the file pyproject.toml
Important to note about the tests are that not all tests are only located in the travel_pricing_scraper/tests directory. The addopts = "--doctest-modules" flag is being used. So, if you modify something, be aware that docstrings also run tests and serve as the basis for API documentation, so be careful with changes.
The test coverage is automatically generated using pytest-cov, and it is displayed when the test task is executed:
task test
Similarly, the linters are prerequisites for these tests.
Docs
The entire documentation is based on the usage of mkdocs with the mkdocs-material theme.
flowchart
. --> docs
. --> mkdocs.yml
docs --> files.md
docs --> api
docs --> assets
docs --> templates
docs --> stylesheets
All the docs configuration can be found in the file mkdocs.yml in the root of the project.
To complement the documentation, we are using Jinja templates, which are located in the templates folder in places where instructions can be repeated, if you encounter blocks like:
{ % % }
{{ command }}
You can be sure that it is a jinja template.
The macros are being created using mkdocs-macros and are defined in the mkdocs configuration file:
extra:
vars:
project: travel_pricing_scraper
The assets folder contains the images used in the docs. The stylesheets folder contains the stylesheets used in the docs.. The files.md file contains the docs for the files in the project.
API documentation
The API documentation is generated using mkdocstrings and it is located in the api folder. The API documentation is generated from the docstrings of the code, so if you change the code, please update the docs as well. And the files in docs/apihas a tag:
::: modulo
This means that the code contained in the docstrings in this block will be used.
Tools
This project has some tools to help you to contribute to the project.
-
Poetry - Dependency manager
-
Taskipy - For automating routine tasks. Run the tests, linters, documentation, etc.
So in order to contribute to the project, you need to install poetry:
pipx install poetry
Steps to execute some tasks
Here are listed commands that you can use to perform common tasks. How to clone the repository, how to install dependencies, run tests, etc...
How to clone the project
git clone https://github.com/kalelmartinho/travel-pricing-scraper.git
How to install dependencies
poetry install
playwright install
How to execute the CLI
travels travels [subcommand]
How to execute the linter
task lint
How to execute the tests
task test
How to execute the docs
task docs
TODO
Improvements that can be made to the project.
- Add a
--headlessflag to the CLI.- So that the user can choose whether to run the browser in headless mode or not.
- Add a way to consult codes for airports
- Currently, the user needs to know the airport codes to use the CLI. It would be interesting to add a way to consult the codes for airports.
- Add scraping for other agencies
- Currently, the project only scrapes data from decolar.com. It would be interesting to add other agencies to the project.
To other improvements, you can check the issues
Continuous improvement
This document can be improved by anyone who has an interest in improving it. So feel free to provide more tips for people who want to contribute too