Weekly Log - [Week 0]

Goals for the Week

  • Get familiar with using Poetry
  • Get comfortable using headless browsers and drivers instead of urllib
  • Start writing pandas DataFrames to CSV instead of dictionaries to CSV

Accomplishments

  • Scraped weather data and cleaned it up to present it in a more readable format
  • Scraped quotes from a website and checked their frequencies against the most popular tags list
  • Scraped clothing data from scrolling pages

Challenges Faced

  • Challenge 1: Using headless browsers and installing the new tools involved with them
  • Challenge 2: Stale element exceptions
  • Challenge 3: Timeout exceptions when using waits

Lessons Learned

  • Lesson 1: DataFrames >>> dictionaries for storing data; they are also easier to convert to CSV
  • Lesson 2: When scraping from multiple pages, collect a list of links first instead of getting the actual data right away, to avoid stale element exceptions
  • Lesson 3: options.add_argument('--headless') and other options are very useful when using selenium

Goals for Next Week

  • Goal 1: Avoid using bs4 as much as possible and instead use selenium to get the data

Action Items

  • Action Item 1: Re-learn pandas syntax