JSL is often described as a scripting language. Personally I think that doesn’t do it justice. I prefer to think of it as a programming language. The difference? For me an obvious difference is that instead of using hard-coded values I want to use variables. In particular I want to use variables to handle column references.
The Column Function: Generating Column References
The starting point for scripting a JMP platform is to create the platform interactively and then save the associated script (from the red triangle: Save Script>To Script Window).
    Bivariate( Y( :weight ), X( :height ) );
If I always want weight on the y-axis and height on the x-axis then I can use this script as-is. But this is where I make a distinction between scripting and programming. When I write programs, I want them to be general purpose, and that means replacing the column references with variables:
    yCol = Column("weight");
    xCol = Column("height");
    Bivariate( Y( yCol ), X( xCol) );
If you have a data table that contains column names that are not programmer-friendly (for example, names containing spaces or special characters), the JMP-generated code looks slightly different:
    Bivariate( Y( :weight ), X( :Name( "height / meters" ) ) );
The Column function can still be used to generate a valid column reference from the literal name of the column:
    yCol = Column("weight");
    xCol = Column("height / meters");
    Bivariate( Y( yCol ), X( xCol) );
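To make "general purpose" a little more concrete, the same pattern can be wrapped in a function that takes the column names as arguments. This is only a sketch: the function name fitPair is made up for illustration, and it assumes the Big Class sample table, which has height and weight columns.

```jsl
// Assumption: the Big Class sample table (contains height and weight columns)
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// fitPair is a hypothetical helper name, not part of JMP
fitPair = Function( {tbl, yName, xName}, {yCol, xCol},
	// build column references from the literal names
	yCol = Column( tbl, yName );
	xCol = Column( tbl, xName );
	// launch the platform against the given table
	tbl << Bivariate( Y( yCol ), X( xCol ) );
);

fitPair( dt, "weight", "height" );
```

Passing the data table explicitly means the function does not depend on whichever table happens to be current when it is called.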
This example is based on the Bivariate platform, but the principle is general. Here is an example using Graph Builder:
JMP-generated code:
    Graph Builder(
        Size( 536, 455 ),
        Show Control Panel( 0 ),
        Variables( X( :Name( "height / meters" ) ), Y( :weight ) ),
        Elements( Points( X, Y, Legend( 3 ) ), Smoother( X, Y, Legend( 4 ) ) )
    );
Code generalised using variables:
    yCol = Column("weight");
    xCol = Column("height / meters");
    Graph Builder(
        Size( 536, 455 ),
        Show Control Panel( 0 ),
        Variables( X( xCol ), Y( yCol ) ),
        Elements( Points( X, Y, Legend( 3 ) ), Smoother( X, Y, Legend( 4 ) ) )
    );
Handling Send To Report Messages
If I perform graphical edits, the JMP-generated code becomes significantly more complex. In the following example I have chosen to add grid lines for both the x-axis and y-axis:
    Graph Builder(
        Size( 536, 455 ),
        Show Control Panel( 0 ),
        Variables( X( :height ), Y( :weight ) ),
        Elements( Points( X, Y, Legend( 3 ) ), Smoother( X, Y, Legend( 4 ) ) ),
        SendToReport(
            Dispatch( {}, "height", ScaleBox, {Label Row( Show Major Grid( 1 ) )} ),
            Dispatch( {}, "weight", ScaleBox, {Label Row( Show Major Grid( 1 ) )} )
        )
    );
Notice in particular that the lines to ‘show major grid’ contain explicit references to the names of the columns. So these also need to be replaced with variables:
    xName = "height";
    yName = "weight";
    xCol = Column(xName);
    yCol = Column(yName);
    Graph Builder(
        Size( 536, 455 ),
        Show Control Panel( 0 ),
        Variables( X( xCol ), Y( yCol ) ),
        Elements( Points( X, Y, Legend( 3 ) ), Smoother( X, Y, Legend( 4 ) ) ),
        SendToReport(
            Dispatch( {}, xName, ScaleBox, {Label Row( Show Major Grid( 1 ) )} ),
            Dispatch( {}, yName, ScaleBox, {Label Row( Show Major Grid( 1 ) )} )
        )
    );
Here is another example, based on changing the font size for the axis labels:
JMP-generated code:
    Bivariate(
        Y( :weight ),
        X( :height ),
        SendToReport(
            Dispatch( {}, "weight", TextEditBox, {Set Font Size( 12 )} ),
            Dispatch( {}, "height", TextEditBox, {Set Font Size( 12 )} )
        )
    );
Code generalised using variables:
    xName = "height";
    yName = "weight";
    xCol = Column(xName);
    yCol = Column(yName);
    Bivariate(
        Y( yCol ),
        X( xCol ),
        SendToReport(
            Dispatch( {}, yName, TextEditBox, {Set Font Size( 12 )} ),
            Dispatch( {}, xName, TextEditBox, {Set Font Size( 12 )} )
        )
    );
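Sometimes the column references come from somewhere other than a literal name, for example from a column's position in the table. In that case the names needed by Dispatch can be recovered from the references using the Get Name message. The following is a sketch rather than a definitive recipe; it assumes the Big Class sample table, where height and weight are the 4th and 5th columns.

```jsl
// Assumption: Big Class sample table; columns 4 and 5 are height and weight
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// column references obtained by position rather than by name
xCol = Column( dt, 4 );
yCol = Column( dt, 5 );

// recover the names so the Dispatch calls stay in step with the references
xName = xCol << Get Name;
yName = yCol << Get Name;

dt << Bivariate(
	Y( yCol ),
	X( xCol ),
	SendToReport(
		Dispatch( {}, yName, TextEditBox, {Set Font Size( 12 )} ),
		Dispatch( {}, xName, TextEditBox, {Set Font Size( 12 )} )
	)
);
```

Deriving the names from the references keeps a single source of truth: changing the column indices is enough to update both the plot and the report edits.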
Handling Lists of Columns
Some platforms take a sequence of column references:
    Multivariate(
        Y( :Sepal length, :Sepal width, :Petal length, :Petal width )
    );
In this example the multivariate analysis is performed on 4 variables. Whilst the code could be generalised using 4 variables to represent the columns, it wouldn’t be fully general: what if I wanted 5 variables in the analysis?
Structurally, from a programming perspective, I want to use a list data structure to contain the column references. But will JMP allow me to do that? There is only one way to find out:
    Multivariate(
        Y( {:Sepal length, :Sepal width, :Petal length, :Petal width} )
    );
This works! The curly braces {…} collect all of the column references together into a single list structure. Now I ought to be able to simply replace the explicit list with a variable reference:
    myList = {:Sepal length, :Sepal width, :Petal length, :Petal width};
    Multivariate(
        Y( myList )
    );
Unfortunately, and perhaps surprisingly, this doesn’t work. We’ve done nothing wrong – everything is consistent with how computer code ought to work.
However, there is a curious feature of the scripting language: expressions are evaluated inwards, starting from the outside, whereas most languages evaluate inner parts first. There is a word for this; frustratingly I have forgotten it. Does anyone know?
Armed with this knowledge I can explicitly ask JMP to evaluate the list variable before evaluating the entire statement:
    myList = {:Sepal length, :Sepal width, :Petal length, :Petal width};
    Multivariate(
        Y( Eval(myList) )
    );
In practice I would also want to write some code to populate the list:
    dt = Open("$SAMPLE_DATA/Iris.jmp");

    myList = {};
    For (i=1,i<=4,i++,
        col = Column(dt,i);
        InsertInto(myList,col)
    );
    Multivariate(
        Y( Eval(myList) )
    );
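The loop above still hard-codes the number of columns. One way to remove that assumption, sketched here rather than offered as the definitive approach, is to loop over every column in the table and keep only the numeric ones. It uses N Cols to count the columns and the Get Data Type message to test each one; the rest follows the same pattern as before.

```jsl
dt = Open( "$SAMPLE_DATA/Iris.jmp" );

myList = {};
// visit every column, however many the table has
For( i = 1, i <= N Cols( dt ), i++,
	col = Column( dt, i );
	// keep only numeric columns (skips character columns such as Species)
	If( (col << Get Data Type) == "Numeric",
		Insert Into( myList, col )
	);
);

Multivariate(
	Y( Eval( myList ) )
);
```

With this version the same script should work on a table with 5, or 50, numeric columns without any edits.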
Using List Variables
Lists of column references can also be used in places where you might not expect. Below is the JMP-generated code for the Distribution platform:
    Distribution(
        Continuous Distribution( Column( :Sepal length ) ),
        Continuous Distribution( Column( :Sepal width ) ),
        Continuous Distribution( Column( :Petal length ) ),
        Continuous Distribution( Column( :Petal width ) ),
        Histograms Only
    );
Generalising this code whilst maintaining this code structure would be difficult. Fortunately a list variable can be used to achieve the same result (for most platforms):
    dt = Open("$SAMPLE_DATA/Iris.jmp");
    myList = {};
    For (i=1,i<=4,i++,
        col = Column(dt,i);
        InsertInto(myList,col)
    );

    Distribution(
        Column( Eval(myList) ),
        Histograms Only
    );
Dynamic Construction of Expressions
The above example of list usage doesn’t work with Graph Builder. Assume I want to plot multiple y-variables overlaid against a common x-variable. If I don’t know how many y-variables will be plotted then I have to resort to building up the appropriate expression using substitution methods (see this earlier post for a more detailed discussion):
    // data source
    dt = Open("$SAMPLE_DATA/Iris.jmp");

    // user-specified list of y columns
    lstNumericCols = dt << Get Column Names(numeric);
    New Window("Select:", <<Modal,
        clb = Col List Box(<<SetItems(lstNumericCols))
    );
    lstColNames = clb << Get Selected;

    // construct a string of y-variable specifications
    yStr = "";
    For (i=1,i<=NItems(lstColNames),i++,
        colName = lstColNames[i];
        str = Eval Insert("Y(:^colName^,Position(1)),");
        yStr ||= str;
    );

    // use string substitution to construct the graph builder expression
    gbStr = Eval Insert("
        Graph Builder(
            Show Control Panel(0),
            Variables(X(:Petal Width),^yStr^)
        )
    ");

    // evaluate the expression
    Eval(Parse(gbStr));