A neuron is a single node within a neural network. By analogy with neurons in the brain, we can think of a neuron “firing” in response to an input trigger, and we can think of machine learning as the process of training the neuron to recognise that trigger.
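To make the “firing” idea concrete, here is a minimal Python sketch of a single neuron trained with the classic perceptron rule. It is an illustration only: the function names, the step activation, the learning rate and the logical-AND training data are all assumptions chosen for the example, not something taken from JMP.

```python
def fire(weights, bias, inputs):
    """Return 1 if the neuron fires for these inputs, otherwise 0."""
    # Weighted sum of the inputs plus a bias term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step activation: the neuron fires only when the sum is positive.
    return 1 if z > 0 else 0


def train(samples, epochs=20, learning_rate=1.0):
    """Perceptron rule: nudge the weights whenever the neuron gets it wrong."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - fire(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical trigger to recognise: both inputs switched on (logical AND).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_SAMPLES)
```

After training, `fire(weights, bias, (1, 1))` returns 1 and the other three input pairs return 0. A single neuron like this can only learn patterns that are linearly separable, which is one reason networks combine many of them.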
In my previous post I introduced the sample data table Pet Survey. I created a column formula to classify each respondent according to whether they owned a cat, a dog, or both. Even in this simple example, there were signs of the problems that arise when processing unstructured text data. My classification of “dog” missed responses referring to huskies; my classification of “cat” incorrectly included references to cattle. I looked at the Text Explorer platform and focused on the output contained in the lists of terms and phrases. In this post I want to focus on workflow: using the functionality within the Text Explorer platform to gain meaningful insights into my data and to answer specific questions.
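The two misclassifications are typical of plain substring matching: “cat” is found inside “cattle”, and “dog” never appears in “huskies”. The Python sketch below shows one way to avoid both, by matching whole words and listing the related terms explicitly. It is not the column formula from the earlier post; the term lists, the function name and the extra “neither” category are assumptions made for the illustration.

```python
import re

# Hypothetical term lists; a real survey would need many more entries.
DOG_TERMS = ("dog", "dogs", "husky", "huskies")
CAT_TERMS = ("cat", "cats")


def _whole_word_pattern(terms):
    # \b marks a word boundary, so "cat" does not match inside "cattle".
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


DOG_RE = _whole_word_pattern(DOG_TERMS)
CAT_RE = _whole_word_pattern(CAT_TERMS)


def classify(response):
    """Classify a free-text response as 'dog', 'cat', 'both' or 'neither'."""
    has_dog = bool(DOG_RE.search(response))
    has_cat = bool(CAT_RE.search(response))
    if has_dog and has_cat:
        return "both"
    if has_dog:
        return "dog"
    if has_cat:
        return "cat"
    return "neither"
```

With these lists, a response about cattle is classified as “neither” and a response about huskies as “dog”. Maintaining such lists by hand does not scale, which is where a dedicated text-processing tool earns its keep.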
In this post I will walk through some of the common tasks that are undertaken when we process unstructured text-based data. This will also give me the opportunity to introduce the terminology associated with text processing.
Traditionally, statistical methods have focused on the use of numerical data, perhaps partitioned by classification data. A classic example of this would be one-way analysis of variance, or multiple linear regression containing classification variables that have been internally coded as integer values.
JSL is often described as a scripting language. Personally I think that doesn’t do it justice. I prefer to think of it as a programming language. The difference? For me an obvious difference is that instead of using hard-coded values I want to use variables. In particular I want to use variables to handle column references.
It’s been a while – so, since it’s Friday, here is a collection of Friday’s Functions … some of my favourite user-defined functions.
The JMP website has introduced a similar theme, JSL Cookbook, so I will probably change the tag associated with these posts to JSL Cookbook instead of Friday’s Functions.
Most of the work involved in building a predictive model goes into either performance tuning or data preparation.
I’m almost halfway through prepping some data. It’s not necessary to script this, but a script allows me to adjust the data preparation in the future and, more importantly, to document the sequence of steps that I have taken.
I was recently asked a question about updating display boxes. Display boxes are the building blocks of JMP output windows. Fundamentally there are two methods of updating these display boxes, which I will take a closer look at.
I’m sure there is a more technically correct term for this: I use the phrase segmented regression to describe the process whereby I select a segment of data within a curve and build a regression model for just that segment.
I have some code to aid the process. The code illustrates how to perform regression on-the-fly as well as how to utilise the MouseTrap function to handle mouse movement events.
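The core idea of fitting a model to just one segment of a curve can be sketched in a few lines of Python. This is not the JSL code mentioned above and it has no mouse handling; it only shows the regression step, assuming an ordinary least-squares straight line and a segment defined by a range of x values. The function names and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_segment(xs, ys, x_min, x_max):
    """Fit a line using only the points whose x lies in [x_min, x_max]."""
    segment = [(x, y) for x, y in zip(xs, ys) if x_min <= x <= x_max]
    return fit_line([x for x, _ in segment], [y for _, y in segment])


# Made-up curve: flat up to x = 3, then rising steadily.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 0, 2, 4, 6]
slope, intercept = fit_segment(xs, ys, 3, 6)
```

Fitting only the rising segment gives a slope close to 2, whereas a line through all seven points would be flattened by the initial plateau. In an interactive version, the segment limits would come from the user's selection rather than being typed in.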