Exegetic Analytics is a Data Science consultancy specialising in data acquisition and augmentation, data preparation, predictive analytics and machine learning. Our consultants are based in Durban and Cape Town, and we engage with clients all over the world. Our products and services are used across a range of industries, including Aerospace, Education, Finance, Food Delivery, Politics, Security and Transport.
Exegetic Analytics also offers training, with experienced and knowledgeable facilitators. Our courses focus on practical applications, working through examples and exercises based on real-world datasets.
All of our training packages include access to:
- our online development environment and
- detailed course material, which participants retain access to after the training has concluded.
For more information about what we do, you can refer to our website.
These are some of the companies that have benefited from our training:
Take a look at our full list of courses to see what other training we have on offer.
If this proposal is of interest to you, or you would like to hear more about what we do, you can get in touch at email@example.com or on +27 (0)73 805 7439.
2. Course Description
|Objectives|In this course you’ll learn how to use Python to selectively, systematically and automatically scrape data from websites.|
|Requirements|You are assumed to have experience with Python. Familiarity with HTML and CSS will be an advantage.|
3. Course Outline
- Introduction to Web Scraping
- Structure of a web page
- Browser tools
- Status codes
- GET request
- The response
- Status code
- Query strings
- Request headers
- Beautiful Soup — Parsing HTML
- Navigating HTML
- Using tag names
- Parents, siblings, descendants and children
- Various forms of
- Scrapy — Creating a Spider
- What is a spider?
- Scrapy shell
- Writing spiders
- Gathering links
- Extracting and storing data
- Following links and recursion
- Spider patterns
- Selenium — Scraping dynamic sites
- Selenium on Docker
- Debug images and VNC connections
- Locating elements
- Waiting (explicit and implicit waits)
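To give a flavour of the "Structure of a web page" and "Gathering links" topics above, here is a minimal sketch of link extraction. It is not taken from the course material. It assumes only Python's standard library (`html.parser` and `urllib.parse`) rather than Beautiful Soup or Scrapy, so that it runs without any installs. The `LinkCollector` class, the `SAMPLE_HTML` string and the example.com URLs are hypothetical and exist purely for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, href))


# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<html>
  <body>
    <h1>Courses</h1>
    <ul>
      <li><a href="/courses/web-scraping">Web Scraping</a></li>
      <li><a href="https://example.org/about">About</a></li>
      <li><a>No link here</a></li>
    </ul>
  </body>
</html>
"""

collector = LinkCollector("https://example.com/")
collector.feed(SAMPLE_HTML)
print(collector.links)
```

The anchor without an `href` attribute is skipped, and the relative link is resolved against the base URL. Libraries such as Beautiful Soup and Scrapy offer more convenient ways to express the same idea.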