etablesaw

Eclipse integration for the tablesaw data frame library for Java

Xaw Scripting DSL

Although some data manipulation may be done in the editor and views, the full power of the tablesaw library is unleashed by the xaw scripting DSL. The Xaw language is essentially the syntactic sugar for Java that Xbase provides out of the box, plus some extra table- and column-oriented operators, literal syntax, and extension methods for reading and writing files and for linking scripts to the table data registry.

The scripts are translated to Java in the context of the classpath of the projects they’re within, and can be executed within the workbench so they may consume table data from or provide table data to workbench parts.

A Xaw script typically loads table data from one or more files, manipulates tables and columns, derives new tables and outputs the result. In addition, the integration with the table registry makes it possible to use intermediate and resulting table data as the source for views.

Tablesaw extensions

Underlying xaw are some extensions to the tablesaw library, to make it easier to work with. There are two kinds of extensions:

xaw scripts

A xaw script consists of a set of import statements (similar to Java’s), followed by a xaw statement declaring the qualified name of the script (and corresponding Java class) and any number of table (type) declarations, statements and (helper) function declarations.

Table declarations define new typed table classes that give you type-safe access to columns and rows and improve code completion. Consider the following table declaration:

table tab1 {
   String name,
   double age
}

This declaration generates subclasses of TypedTable and Row with type-safe accessor methods for the name and age columns and values. For example, the getNameColumn method in the TypedTable subclass returns a StringColumn, and the getAge method in the Row subclass returns a double.
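To make the shape of these accessors concrete, here is a rough, dependency-free Java sketch. It is not the code etablesaw generates: the class name Tab1Sketch, the append and rowCount methods and the use of java.util lists are assumptions made for illustration, whereas the real classes subclass TypedTable and Row and work with tablesaw column types such as StringColumn.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the classes generated from
//   table tab1 { String name, double age }
// Plain lists replace tablesaw columns so the sketch needs no dependencies.
final class Tab1Sketch {
    private final List<String> nameColumn = new ArrayList<>();
    private final List<Double> ageColumn = new ArrayList<>();

    // One column accessor per declared column, typed by the declaration.
    List<String> getNameColumn() { return nameColumn; }
    List<Double> getAgeColumn() { return ageColumn; }

    // Appends one row; parameter types mirror the declared column types.
    Tab1Sketch append(String name, double age) {
        nameColumn.add(name);
        ageColumn.add(age);
        return this;
    }

    int rowCount() { return nameColumn.size(); }

    // Returns a typed view of the row at the given index.
    Row row(int index) { return new Row(index); }

    // Row view with one typed value accessor per declared column.
    final class Row {
        private final int index;
        Row(int index) { this.index = index; }
        String getName() { return nameColumn.get(index); }
        double getAge() { return ageColumn.get(index); }
    }

    public static void main(String[] args) {
        Tab1Sketch tab1 = new Tab1Sketch().append("Hallvard", 52).append("Marit", 54);
        Tab1Sketch.Row second = tab1.row(1);
        System.out.println(second.getName() + " is " + second.getAge());
    }
}
```

The point of the typed accessors is that a misspelled column name or a wrong value type becomes a compile-time error instead of a runtime lookup failure.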

The new TypedTable subclass may be used in variable declarations using the new table instance syntax (see below).

The xaw editor

The xaw editor is a standard editor as generated by Xtext. Upon save (and build), a Java class is generated for the script itself (a subclass of XawBase), with additional Java classes generated for the table types.

A notable feature is the ability to execute xaw scripts inside the workbench in the context of the enclosing project’s classpath.

First and foremost, this allows scripts to import/export tables from/to the table registry using the importTable and exportTable functions.

Note that when executed as normal Java classes outside the workbench, these methods use the file system, rather than the table registry.

Second, the project may have library dependencies besides tablesaw, e.g. SMILE for statistics and machine learning, and these will be available.

Third, standard output is captured by the console, which simplifies programming and debugging.

Special xaw syntax

To allow creation of typed table instances, xaw provides the # <table-def> # syntax. <table-def> may just name a previously defined table type (see above) or inline the whole table type declaration. E.g.

var tab1 = # tab1 #

will declare a tab1 variable of type tab1 initialised to an (empty) tab1 instance. You can also inline the whole table type as follows:

val tab1 = # tab1: String name, double age # 

This will declare both the tab1 type and the tab1 variable as above. The table instance creation syntax may be used anywhere an expression is allowed. If used as the initial value in a variable declaration, as shown here, the type name may be omitted, in which case the variable name is used as the type name.

The contents of the new table instance may be provided in two ways. If nameColumn and ageColumn columns were prepared in advance, we could populate the table as follows:

val tab1 = # String name = nameColumn, double age = ageColumn #

Alternatively, we could fill the table with specific contents, where each element is an expression of the appropriate type:

val tab1 = # String name,   double age  #
| "Hallvard",   52|
| "Marit",      54|

Since there are specialised columns for time, both date and time of day, corresponding literals are supported:

var day = @16-11-1966 // LocalDate variable
var time = @11:38:05  // LocalTime variable
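For readers more familiar with plain Java, the two literals above correspond to java.time values. The sketch below is only an illustration of that correspondence, not how xaw parses literals internally; the class and method names are hypothetical, and the day-month-year reading of the date literal is an assumption based on 16 not being a valid month.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper showing the java.time values behind the xaw literals.
final class XawTemporalLiterals {
    // Assumption: the date literal is written day-month-year.
    static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("dd-MM-uuuu");

    // Parses the text after '@' in a date literal such as @16-11-1966.
    static LocalDate date(String literal) {
        return LocalDate.parse(literal, DATE);
    }

    // Parses the text after '@' in a time literal such as @11:38:05.
    static LocalTime time(String literal) {
        return LocalTime.parse(literal);
    }

    public static void main(String[] args) {
        LocalDate day = date("16-11-1966");   // same value as LocalDate.of(1966, 11, 16)
        LocalTime time = time("11:38:05");    // same value as LocalTime.of(11, 38, 5)
        System.out.println(day + " " + time);
    }
}
```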

The @ character is also used for URL literals, e.g. @"https://hallvard.github.io/etablesaw". In many cases, a simpler variant may be used, e.g. the same literal could be written as @"hallvard.github.io/etablesaw". Since the double forward slash is used for comments, it was difficult to support the full URL syntax without the quotes.

Helper methods may be defined using the def keyword. These do not see top-level script variables, since the corresponding Java methods are static.
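The scoping rule follows directly from Java's own rules for static methods. The following sketch shows the general shape under stated assumptions: the class name, the run method and the scale helper are all hypothetical, and whether the generated class keeps script variables as locals or as instance fields is not specified here (a static method can see neither).

```java
// Hypothetical shape of a translated script with one helper method.
final class HelperScopeSketch {
    // Stand-in for the script body; "factor" plays the role of a
    // top-level script variable (assumed here to be a local variable).
    double run() {
        double factor = 2.0;
        // The helper cannot read "factor" itself, so it is passed explicitly.
        return scale(21.0, factor);
    }

    // Stand-in for a helper declared with def: static, so it only sees
    // its own parameters and other static members.
    static double scale(double value, double factor) {
        return value * factor;
    }

    public static void main(String[] args) {
        System.out.println(new HelperScopeSketch().run());
    }
}
```

In practice this means any script variable a helper needs should be passed to it as an argument.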

Special xaw operators

The underlying Xbase expression language supports operator overloading and custom operators. In addition to the standard Xbase operators like +, -, +=, -=, >> and <<, we have added & for evaluating a predicate, and &, |, &= and |= for handling (row) selections.

Here’s a list of all the overloaded operators, grouped by operand type(s):