News Feed Compiler

This project compiles a news feed from various news outlets covering the last week. It uses Python with the Flask framework and stores the data in a SQLite database. The feed is generated by scraping articles from popular news websites, including CNN, BBC, Reuters, The Guardian, AP News, and Wikipedia News.

Requirements

Before running the project, ensure that Python and Poetry are installed on your system.

Installation

  1. Clone the repository to your local machine.
  2. Install the dependencies using Poetry:
poetry install
  3. Run everything.py to generate everything.csv, then run init_db.py to create the news_websites SQLite database.
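
The database step can be pictured with a minimal sketch, shown here only to illustrate loading a CSV file into SQLite. The contents of init_db.py are not given in this README, so the table name `articles`, the column names (`title`, `url`, `source`, `published`), and the file name `news_websites.db` are all assumptions.

```python
import csv
import sqlite3


def load_csv_into_db(csv_path, db_path):
    """Load rows from a CSV file into a SQLite table and return the row count.

    Assumed (hypothetical) CSV columns: title, url, source, published.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [
            (r["title"], r["url"], r["source"], r["published"])
            for r in csv.DictReader(f)
        ]

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS articles ("
                "title TEXT, url TEXT UNIQUE, source TEXT, published TEXT)"
            )
            # UNIQUE(url) plus INSERT OR IGNORE makes re-running the load harmless.
            conn.executemany(
                "INSERT OR IGNORE INTO articles VALUES (?, ?, ?, ?)", rows
            )
        return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
    finally:
        conn.close()


# Example call (file names are assumptions):
# load_csv_into_db("everything.csv", "news_websites.db")
```

Keying the table on the article URL is one possible design choice; it lets the load be repeated after each scrape without creating duplicate rows.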

Running the Project

To compile the news feed from the last week, follow these steps:

  1. Run the main Flask script:
poetry run python news_web.py
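
How news_web.py selects articles is not described in this README. As a rough sketch of a "last week" filter that a Flask view could call, the function below queries SQLite for recent rows. The `articles` table, its columns, and the assumption that `published` holds ISO 8601 dates (`YYYY-MM-DD`) are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def recent_articles(conn, now, days=7):
    """Return articles published within the last `days` days, newest first.

    Assumes a hypothetical `articles` table whose `published` column stores
    ISO 8601 date strings, which sort correctly as plain text.
    """
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%d")
    cursor = conn.execute(
        "SELECT title, url, source, published FROM articles "
        "WHERE published >= ? ORDER BY published DESC",
        (cutoff,),
    )
    return cursor.fetchall()


# Example call (database file name is an assumption):
# conn = sqlite3.connect("news_websites.db")
# rows = recent_articles(conn, now=datetime.now())
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the filter easy to test with fixed dates.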

Note: The news.py file contains the link scraper function, and individual news outlet scrapers are stored in separate files (e.g. BBC.py, CNN.py, Reuters.py, TheGuardian.py, APNews.py, WikiNews.py). The data is stored temporarily in CSV files (e.g. news_week.csv and theconversation.csv) before being merged into everything.csv.
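
The merge step might look something like the sketch below. This is not the actual everything.py, whose contents are not shown here; it assumes each intermediate CSV has a header row with a `url` column, and it drops rows whose URL has already been seen.

```python
import csv


def merge_csv_files(input_paths, output_path, key="url"):
    """Merge several CSV files into one, skipping rows with a repeated key.

    The `key` column name ("url") is an assumption about the CSV layout.
    Returns the number of rows written.
    """
    seen = set()
    merged = []
    fieldnames = []

    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            # Build the union of all headers, preserving first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                if row.get(key) in seen:
                    continue
                seen.add(row.get(key))
                merged.append(row)

    with open(output_path, "w", newline="", encoding="utf-8") as f:
        # restval fills columns that a given source file did not provide.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(merged)
    return len(merged)


# Example call, using the file names mentioned in the note above:
# merge_csv_files(["news_week.csv", "theconversation.csv"], "everything.csv")
```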

Disclaimer

This project is intended for educational purposes only. The scraping of news websites may violate the terms of service of those sites. Always review and comply with the terms of service of any website you intend to scrape. Additionally, excessive scraping can put a strain on the servers of the target website, leading to unintended consequences. Use this project responsibly and with the utmost respect for the websites you are accessing.
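
One way to limit the load on a target site is to enforce a minimum delay between requests. The sketch below is illustrative only: the `Throttle` class is a hypothetical helper, not part of this project, and a delay alone does not make scraping compliant with a site's terms of service.

```python
import time


class Throttle:
    """Ensure at least `min_interval` seconds pass between calls to wait().

    `clock` and `sleep` are injectable so the behaviour can be checked
    without real delays; by default they use the time module.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous call, if any

    def wait(self):
        if self.last is not None:
            remaining = self.min_interval - (self.clock() - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()


# Example: call throttle.wait() before each request (interval is an assumption).
# throttle = Throttle(min_interval=2.0)
# for url in urls:
#     throttle.wait()
#     ...fetch url...
```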