Text Processing Workflow

In my previous post I introduced the sample data table Pet Survey.  I created a column formula to classify each respondent according to whether they owned a cat, a dog, or both.  Even in this simple example there were signs of the problems that arise when processing unstructured text data: my classification of “dog” missed responses referring to huskies, and my classification of “cat” incorrectly included references to cattle.  I then looked at the Text Explorer platform, focusing on the output contained in its lists of terms and phrases.  In this post I want to focus on workflow: using the functionality within the Text Explorer platform to gain meaningful insights into my data and to answer specific questions.
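
The cat/cattle and dog/husky problem is easy to reproduce outside JMP.  The sketch below is in Python rather than a JMP column formula, and it is only an illustration: the responses, the synonym lists and the function names are all made up for this example and do not come from the Pet Survey table.  It contrasts plain substring matching with whole-word matching against a small list of terms.

```python
import re

# Hypothetical survey responses; not taken from the Pet Survey table.
responses = [
    "I have a cat",
    "We own two huskies",
    "My family raises cattle",
    "One dog and one cat",
]


def naive_classify(text):
    """Substring matching: finds 'cat' inside 'cattle' and misses 'huskies'."""
    lowered = text.lower()
    return {"cat": "cat" in lowered, "dog": "dog" in lowered}


# Whole-word patterns. The term lists are illustrative assumptions and
# would need extending for real data.
CAT_PATTERN = re.compile(r"\b(?:cats?|kittens?)\b", re.IGNORECASE)
DOG_PATTERN = re.compile(r"\b(?:dogs?|husky|huskies)\b", re.IGNORECASE)


def word_classify(text):
    """Whole-word matching against a small list of terms per animal."""
    return {
        "cat": bool(CAT_PATTERN.search(text)),
        "dog": bool(DOG_PATTERN.search(text)),
    }


if __name__ == "__main__":
    for response in responses:
        print(response, naive_classify(response), word_classify(response))
```

Whole-word matching removes the cattle false positive, but the husky case is only caught because someone thought to add it to the list.  That manual upkeep is the kind of effort a term list helps to reduce.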


Web Scraping TripAdvisor

Since writing this post I have placed the associated code on the
JMP File Exchange …

The problem with the internet is that it gives you too much information, or rather, it takes too long to gather the information.  I often cross-reference hotel booking sites with TripAdvisor, and it's a laborious process.  So this evening I decided to streamline my process by writing a script to gather the user reviews into a JMP table and a simple report.