Web Scraping in Python

Learning to program in Python is very easy. I think I made that perfectly clear a couple of months back in my article on resources for learning Python. The language has a distinctive syntax that closely resembles English. It’s almost like you’re putting together real words to complete specific tasks.

I started learning Python because I wanted to expand my own development career to new languages. I was told it would be the easiest way in, and that seems about right. By learning to program in Python, you become accustomed to many aspects of programming that are similar across the vast number of programming languages.

You’ll find those who’re more interested in learning the Python programming language for the sake of tinkering with web frameworks, but then there are those (presumably you) who’re more into scraping things from the web, and then making that data look beautiful for everyone else to enjoy. Python is acclaimed as the perfect language to learn when it comes to quick and easy web scraping.

I recently published an article on web scraping tools, in which I discuss some of the most popular scraping apps and tools that have a GUI (Graphical User Interface), which makes them very accessible to beginners and less experienced developers. But the feedback I received implied that I should make another post, dedicated specifically to tutorials on how to scrape in Python. Here we are, ready to explore some examples of how to scrape the web using a simple Python script.

Python Web Scraping Resource

Jake Austwick has put together a great tutorial (resource) on how to get started with scraping in Python. The whole tutorial is based (mainly) on two libraries: lxml and Requests. Jake will guide you through the most common misconceptions and pitfalls that many young scrapers experience, and there is also plenty of sound advice to be found. Remember, if a platform has an API, it’s probably best to use that for gathering info; building a separate scraper can be time-consuming!
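To give a feel for how lxml and Requests fit together, here is a minimal sketch of my own (not code from Jake's tutorial). Requests downloads the page and lxml turns the HTML into a tree you can query with XPath. The sample markup, the `articles` list id, and the function names `fetch` and `extract_links` are all made up for illustration, and the parsing step runs on an inline string so you can try it without hitting a real site.

```python
import requests
from lxml import html

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_PAGE = """
<html><body>
  <ul id="articles">
    <li><a href="/posts/1">First post</a></li>
    <li><a href="/posts/2">Second post</a></li>
  </ul>
</body></html>
"""


def fetch(url, timeout=10):
    """Download a page with Requests and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_links(page_source):
    """Return (title, href) pairs for every link in the #articles list."""
    tree = html.fromstring(page_source)
    links = tree.xpath('//ul[@id="articles"]/li/a')
    return [(a.text_content().strip(), a.get("href")) for a in links]


if __name__ == "__main__":
    # Parse the inline sample; for a live page, pass fetch(url) instead.
    for title, href in extract_links(SAMPLE_PAGE):
        print(title, href)
```

For a real site you would call `extract_links(fetch(url))` and adjust the XPath expression to match that page's markup.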

Extracting NBA data from ESPN

Right, nothing teaches better than practice and tiny snippets! I feel this quick tutorial from Daniel Rodriguez is perfect for learning and seeing how quickly you can build a scraper to scrape anything you like. In this sample, Daniel is scraping some NBA player information from ESPN, alongside player stats, the teams that are playing in the NBA right now, and the game schedules.

Web Scraping 101 with Python

In this Python scraping tutorial, Greg Reda teaches us how to use lxml and BeautifulSoup combined! The tutorial is for Python 2.7 users. It’s a fairly low-level introduction for those who want to see how to select HTML elements, and how to put data back together using database libraries.
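The two steps mentioned above, selecting HTML elements and then storing what you found, can be sketched roughly like this. This is my own illustration rather than Greg's code: it is written for Python 3, the sample table and the `players` names are invented, and I'm assuming SQLite (via the standard `sqlite3` module) as the database. BeautifulSoup is used here with the built-in `html.parser` backend; if you have lxml installed, passing `"lxml"` instead makes BeautifulSoup use lxml as its parser.

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical sample markup; a real scraper would download this instead.
SAMPLE_PAGE = """
<table id="players">
  <tr><th>Name</th><th>Points</th></tr>
  <tr><td class="name">Player A</td><td class="points">27</td></tr>
  <tr><td class="name">Player B</td><td class="points">31</td></tr>
</table>
"""


def parse_rows(page_source):
    """Select the name and points cells from each data row of the table."""
    soup = BeautifulSoup(page_source, "html.parser")
    rows = []
    for tr in soup.find(id="players").find_all("tr"):
        name = tr.find("td", class_="name")
        points = tr.find("td", class_="points")
        if name is None or points is None:
            continue  # the header row has <th> cells only, so skip it
        rows.append((name.get_text(strip=True), int(points.get_text(strip=True))))
    return rows


def save_rows(connection, rows):
    """Store the scraped rows in a small SQLite table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS players (name TEXT, points INTEGER)"
    )
    connection.executemany("INSERT INTO players VALUES (?, ?)", rows)
    connection.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing on disk
    save_rows(conn, parse_rows(SAMPLE_PAGE))
    for row in conn.execute("SELECT name, points FROM players ORDER BY points DESC"):
        print(row)
```

Once the data is in a table, sorting and filtering become plain SQL queries, which is usually easier than juggling lists of scraped strings.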

Simple Web Scraping with Python

I really like this tutorial; it’s small, but complex at the same time. Daniel Forsyth gives us some insight into how to scrape famous ticket-selling websites for the latest tickets! Imagine that, being able to scrape tickets as soon as they become available! Surely you could outperform some human behaviour, and perhaps even snag a ticket you’ve been meaning to snag for so long? Either way, it’s a great tutorial on how simple Python can be.

Fast Scraping in Python with asyncio

“Python 3.4 added a new asynchronous I/O module named asyncio (formerly known as Tulip). The asyncio module provides a new infrastructure with a pluggable event loop, transport and protocol abstractions, a Future class (adapted for use within the event loop), coroutines, tasks, threadpool management, and synchronization primitives to simplify coding concurrent code.” (Dr Dobb’s)

Here, Georges Dubus takes us through the new Python module asyncio. The objective of his tutorial is to scrape a few torrents, and then sort them by their magnet links. Whether you use the scraper for yourself or not, it still has some value for those who’re just starting out.
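If you just want to see why asyncio makes scraping fast, here is a small sketch of the general idea (my own, not Georges' code). It assumes Python 3.7 or newer, so it uses the `async`/`await` syntax and `asyncio.run` rather than the `yield from` style that Python 3.4 required. The `fetch` function is a stand-in that only sleeps to simulate a slow network request, and the URLs and the `max_concurrent` limit are made up, so nothing here touches a real website.

```python
import asyncio
import time


async def fetch(url, delay=0.1):
    """Stand-in for a network request: waits, then returns fake content."""
    await asyncio.sleep(delay)
    return f"<html>content of {url}</html>"


async def fetch_limited(semaphore, url):
    """Run fetch() while holding the semaphore, to cap concurrent requests."""
    async with semaphore:
        return await fetch(url)


async def scrape_all(urls, max_concurrent=5):
    """Fetch all URLs concurrently; results come back in input order."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [fetch_limited(semaphore, url) for url in urls]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    urls = [f"https://example.com/page/{n}" for n in range(10)]
    start = time.perf_counter()
    pages = asyncio.run(scrape_all(urls))
    elapsed = time.perf_counter() - start
    print(f"fetched {len(pages)} pages in {elapsed:.2f}s")
```

Because the waits overlap, ten simulated requests finish in a fraction of the time they would take one after another. In a real scraper you would swap the body of `fetch` for an asynchronous HTTP client, and keep the semaphore so you don't hammer the site you are scraping.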

Web Scraping in Python

I hope these tutorials on how to scrape the web with Python are going to prove useful to you. I couldn’t find any more that were of bigger scope than a few lines of code. Do you know of any good scraping tutorials (in Python!) that I may have missed? Please look in your saved links and drop a comment with what you’ve got; I’m sure the community will appreciate more resources.