Web Scraping TripAdvisor

Since writing this post I have placed the associated code on the
JMP File Exchange …

The problem with the internet is that it gives you too much information, or rather, it takes too long to gather the information you want. I often cross-reference hotel booking sites with TripAdvisor, and it's a laborious process. So this evening I decided to streamline it by writing a script to gather the user reviews into a JMP table and a simple report.

[Image: tripadvisor]

I’m pretty happy with this for a couple of hours of coding. I enter my usual search, the hotel name and location, and get back a JMP table containing the user reviews. From the table I display one review at a time, and can easily step through them with “next” and “previous” buttons. To make it more visually interesting I’ve added a word cloud.
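
A word cloud boils down to counting how often each word appears across the reviews. The sketch below shows that counting step in Python rather than JSL; it is not the script from this post. The function name `word_frequencies`, the tiny stop-word list and the sample reviews are all made up for illustration.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
              "was", "it", "we", "i", "for", "with", "at", "on"}

def word_frequencies(reviews, top_n=10):
    """Return the top_n (word, count) pairs across all review texts."""
    counts = Counter()
    for text in reviews:
        # Lower-case the text and pull out runs of letters/apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        # Skip stop words and single-character fragments.
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return counts.most_common(top_n)

# Made-up sample data standing in for scraped reviews.
sample_reviews = [
    "The room was clean and the staff were friendly.",
    "Friendly staff, great breakfast, clean room.",
]
top_words = word_frequencies(sample_reviews, top_n=10)
```

The resulting counts are what a word cloud would use to scale each word's size.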

Currently the code only picks up the first page of reviews (typically 10 to 15). That’s probably enough, but I’m going to look at stepping through the remaining pages. I probably won’t read all the reviews, but they will provide richer data for the word cloud.
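
One way to step through the pages is to keep requesting the next page until one comes back empty or a page limit is hit. Here is a minimal Python sketch of that loop, again not the JSL from this post. The `fetch_page` callable is hypothetical: it stands in for whatever code downloads and parses a single page of reviews, and is replaced here by an in-memory fake so that the example runs without touching the network.

```python
def collect_reviews(fetch_page, max_pages=5):
    """Gather reviews page by page until a page is empty or max_pages is reached."""
    all_reviews = []
    for page in range(max_pages):
        reviews = fetch_page(page)
        if not reviews:
            # An empty page is taken to mean there are no more reviews.
            break
        all_reviews.extend(reviews)
    return all_reviews

# In-memory stand-in for a real site: two pages of made-up reviews.
FAKE_SITE = [["review 1", "review 2"], ["review 3"]]

def fake_fetch_page(page):
    """Hypothetical fetcher: return the reviews on a page, or [] past the end."""
    return FAKE_SITE[page] if page < len(FAKE_SITE) else []

reviews = collect_reviews(fake_fetch_page)
```

Capping the loop with `max_pages` keeps the number of requests bounded for hotels with a very long review history.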

I’ll tidy the code up a bit and pop it on to the JMP File Exchange.
