This is step 5 in the creation of the oneway advisor. In the previous step, code was produced to test whether the data within each level of the grouping (X) variable were normally distributed.
In this step code will be developed to determine whether the residuals are normally distributed.
The code will have the same structure as the prior step: a user-defined function will perform the test and return the result as a p-value. The main code will set the status icon based on this p-value, and a tool-tip will provide a description when the user hovers the mouse over the icon.
Testing for a Normal Distribution
First I want to develop a function that will test whether a set of data contained in a data table column is normally distributed. To do this interactively in JMP I would perform the following steps:
- Analyze>Distribution
- Continuous Fit>Normal
- Fitted Normal>Goodness of Fit
JMP generates the following script for this sequence:
    Distribution(
        Continuous Distribution(
            Column( :height ),
            Fit Distribution( Normal )
        )
    );
This code can be made more flexible by explicitly adding a table reference (dt) and a column variable (col):
    dist = dt << Distribution(
        Continuous Distribution(
            Column( Eval(col) ),
            Fit Distribution( Normal )
        )
    );
Note also that the Distribution object is assigned to the variable dist – this is required in order to extract results from the report window.
Extracting Results from the Report Window
Sending the message Report to the distribution object (dist) returns a reference (rep) to the window containing the results. This window is simply a composition of display boxes, each of which can be referenced individually and manipulated by sending it messages (typically to interrogate its values, e.g. "<< Get").
    rep = dist << Report;
    gof = rep["Goodness-of-Fit Test"];
    tb = gof[TableBox(1)];
    cb = gof[NumberColBox(2)];
    stat = cb << Get;
    pValue = stat[1];
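In the code above, the p-value has been retrieved from the number column box (cb), which sits inside a table box (tb), which in turn sits inside an outline box (gof). Note that cb is looked up directly from gof; the reference tb is shown only to make the nesting explicit.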
If you have been following the sequence of steps for the oneway advisor then you will recognise this code from the previous step.
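When it is unclear which display box holds a value, it can help to look at how the report is nested before writing the subscripts. The sketch below is one way to do that; it assumes the Big Class sample table that ships with JMP (the :height column in the generated script suggests it, but that is a guess) and uses the Show Tree Structure message to display the boxes inside the report.

```jsl
// Assumption: Big Class sample data, chosen only for illustration
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Same launch syntax as used elsewhere in this step
dist = dt << Distribution(
	Continuous Distribution(
		Column( :height ),
		Fit Distribution( Normal( Goodness of Fit( 1 ) ) )
	)
);

rep = dist << Report;

// Display the nested outline, table and column boxes of the report
rep << Show Tree Structure;
```

The displayed structure shows each box type with its index, which is where subscripts such as NumberColBox(2) come from.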
Test Normal
The above code snippets can be combined into a user-defined function, Test Normal, which takes two arguments (a table reference and a column name) and returns a single value: the p-value associated with the hypothesis that the data are normally distributed.
    Test Normal = Function({dt,col},{Default Local},
        dist = dt << Distribution(
            Invisible,
            Continuous Distribution(
                Column( Eval(col) ),
                Fit Distribution(
                    Normal( Goodness of Fit( 1 ) )
                )
            )
        );
        rep = dist << Report;
        gof = rep["Goodness-of-Fit Test"];
        tb = gof[TableBox(1)];
        cb = gof[NumberColBox(2)];
        stat = cb << Get;
        pValue = stat[1];
        rep << Close Window;
        Return(pValue);
    );
To make this function available for use by the oneway advisor, the code needs to be added to the file Analysis Components.jsl.
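As a quick check that the function behaves as expected, it can be called on its own before wiring it into the advisor. This is only a sketch: it assumes Analysis Components.jsl can be found from the current default directory, and it uses the Big Class sample table with the height column purely as an example.

```jsl
// Assumption: Analysis Components.jsl is reachable from the default directory
Include( "Analysis Components.jsl" );

// Assumption: Big Class sample data, used only for illustration
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// The column is passed by name, as the advisor does
pValue = Test Normal( dt, "height" );
Show( pValue );
```

A p-value between 0 and 1 in the log indicates that the report was navigated correctly.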
Accessing Residuals Data
We now have a mechanism for testing whether the residuals are normally distributed – but we have no residuals!
When we perform modelling activities in JMP the residuals only become available to us if we choose to save them to the data table. Using the JMP scripting language this activity can be automated with the message Save Residuals.
Messages are "sent" using the operator "<<". But to send a message we first need an object: the code below creates a oneway object, then sends the message:
    ow = dt << Oneway(
        Invisible,
        Y( Eval(yCol) ),
        X( Eval(xCol) )
    );
    ow << Save Residuals;
    ow << Close Window;
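To see what Save Residuals adds to the table, a short sketch like the one below can be run on its own. It assumes the Big Class sample table that ships with JMP and uses height and sex as stand-ins for the Y and X columns; neither choice comes from the advisor code. For a oneway model the residual is the response minus the mean of its group, so the saved values should average to roughly zero.

```jsl
// Assumption: Big Class sample data; height and sex are illustrative choices
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
nBefore = N Cols( dt );

ow = dt << Oneway( Invisible, Y( :height ), X( :sex ) );
ow << Save Residuals;
ow << Close Window;

// The residuals are added as a new last column
colResids = Column( dt, N Cols( dt ) );
Show( N Cols( dt ) - nBefore );            // number of columns added
Show( colResids << Get Name );             // name JMP gave the new column
Show( Mean( colResids << Get Values ) );   // should be close to zero
```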
Test Normal Oneway Residuals
Everything is now in place to create a user-defined function for testing whether the residuals associated with the oneway analysis are normally distributed:
    Test Normal Oneway Residuals = Function({dt,yCol,xCol},{Default Local},
        ow = dt << Oneway(
            Invisible,
            Y( Eval(yCol) ),
            X( Eval(xCol) )
        );
        ow << Save Residuals;
        ow << Close Window;
        nc = NCols(dt);
        colResids = Column(dt,nc);
        colName = colResids << Get Name;
        pValue = Test Normal(dt,colName);
        dt << Delete Column(colResids);
        Return(pValue);
    );
Notes:
- Lines 9 and 10 (the assignments to nc and colResids): when the residuals are saved to the table they become the last column of the table, so the function NCols is used to determine the position of the residuals data.
- Line 12 (the assignment to pValue): uses the Test Normal function that was defined earlier.
- Line 13 (the Delete Column message): once the test has been performed, the residuals column is deleted to restore the table to its original state.
This function definition needs to be added to the file Analysis Components.jsl.
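A simple way to confirm that the function cleans up after itself is to compare the column count before and after the call. The sketch below assumes Analysis Components.jsl is reachable from the default directory and again borrows the Big Class sample table, with height and sex as illustrative column names.

```jsl
// Assumption: Analysis Components.jsl is reachable from the default directory
Include( "Analysis Components.jsl" );

// Assumption: Big Class sample data; height and sex are illustrative choices
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
nBefore = N Cols( dt );

pValue = Test Normal Oneway Residuals( dt, "height", "sex" );

Show( pValue );
Show( N Cols( dt ) == nBefore );   // 1 if the residuals column was removed
```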
Invoking the Function
The script step4.jsl already has a block of code responsible for checking assumptions. This block can now be extended to include the test for normality of residuals:
    // Check assumptions
    alpha = 0.05;
    dt = Data Table(dtName);

    arrPValues = Test Normal Each Level(dt,yColName,xColName);
    minPValue = Min( arrPValues<<Get Values );
    If (minPValue<=alpha,
        btnNormalLevels << Set Icon(nsICONS:FAIL_ICON)
    ,
        btnNormalLevels << Set Icon(nsICONS:PASS_ICON)
    );
    strTip = "p-value=" || Char(Round(minPValue,4)) || ".\!N" ||
        "This is the smallest p-value for all the tests \!N" ||
        "(one for each level of the grouping variable). \!N" ||
        "For a test of normality, small p-values imply \!N" ||
        "that the data are not normally distributed.";
    btnNormalLevels << Set Tip(strTip);

    pValue = Test Normal Oneway Residuals(dt,yColName,xColName);
    If (pValue<=alpha,
        btnNormalResids << Set Icon(nsICONS:FAIL_ICON)
    ,
        btnNormalResids << Set Icon(nsICONS:PASS_ICON)
    );
    strTip = "p-value=" || Char(Round(pValue,4)) || ".\!N" ||
        "For a normality test, small p-values imply that\!N" ||
        "the data are not normally distributed.";
    btnNormalResids << Set Tip(strTip);
Save the revisions as step5.jsl.