Goals for the Week
- Get familiar with using Poetry
- Get comfortable using headless browsers and drivers instead of urllib
- Start writing pandas DataFrames to CSV instead of writing dictionaries to CSV
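As a rough illustration of the last goal, here is a minimal sketch that writes the same rows to CSV both ways. The sample rows, the field names (`city`, `temp_c`), and the helper names `dicts_to_csv` and `dataframe_to_csv` are made up for this example and are not from the actual scraper.

```python
import csv

import pandas as pd

# Hypothetical scraped rows; the field names are illustrative only.
rows = [
    {"city": "Oslo", "temp_c": 4.5},
    {"city": "Lima", "temp_c": 19.0},
]


def dicts_to_csv(rows, path):
    """Write a list of dictionaries to CSV with the standard library."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def dataframe_to_csv(rows, path):
    """Write the same rows to CSV through a pandas DataFrame."""
    df = pd.DataFrame(rows)
    # index=False leaves out the row-number column pandas would otherwise add.
    df.to_csv(path, index=False)
    return df
```

Calling `dataframe_to_csv(rows, "weather.csv")` (the filename is arbitrary) should produce the same file as the `csv.DictWriter` version, and the DataFrame is still available for cleaning or filtering before it is written.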
Accomplishments
- Scraped weather data and cleaned it up to present it in a more readable format
- Scraped quotes from a website and checked their frequencies against the most popular tag list
- Scraped clothing data from scrolling pages
Challenges Faced
- Challenge 1: Using headless browsers and installing the new tools involved with them
- Challenge 2: Stale element exceptions
- Challenge 3: Timeout exceptions when using waits
Lessons Learned
- Lesson 1: DataFrames >>> dictionaries when storing data; they are also easier to convert to CSV
- Lesson 2: When scraping from multiple pages, collect a list of links first instead of getting the actual data right away, to avoid stale element exceptions
- Lesson 3: options.add_argument('--headless') and other options are very useful when using Selenium
Goals for Next Week
- Goal 1: Avoid using bs4 as much as possible and use Selenium to get the data instead
Action Items
- Action Item 1: Re-learn pandas syntax