adams.flow.transformer.SpreadSheetQuery

Name

adams.flow.transformer.SpreadSheetQuery

Synopsis

Applies a query (SELECT, UPDATE, DELETE) on a spreadsheet.
Variables are supported as well, e.g., : SELECT * WHERE Blah = @{val} with 'val' being a variable available at execution time.
The following grammar is used for the query:

expr_list ::= expr_list expr_part | expr_part;

expr_part ::= select | update | delete;

select ::= SELECT col_list [limit]
              | SELECT col_list WHERE cond_list [limit]
              | SELECT col_list ORDER BY order_list [limit]
              | SELECT col_list WHERE cond_list ORDER BY order_list [limit]
              | SELECT agg_list
              | SELECT agg_list GROUP BY col_list
              | SELECT agg_list HAVING cond_list
              | SELECT agg_list GROUP BY col_list HAVING cond_list
              ;

update ::= UPDATE SET upd_list
              | UPDATE SET upd_list WHERE cond_list
              ;

delete ::= DELETE WHERE cond_list
              ;

col_list ::= col_list COMMA col
              | col
              | SELECT NUMBER [subsample: <1 = percent; >= 1 number of rows]
              ;

col ::= *
              | COLUMN
              | COLUMN AS COLUMN
              ;

upd_list ::= upd_list COMMA upd | upd;

upd ::= COLUMN = value
              ;

order_list::= order_list COMMA order | order;

order ::= COLUMN
              | COLUMN ASC
              | COLUMN DESC
              ;

cond_list ::= cond_list cond
              | cond
              ;

cond ::= COLUMN < value
              | COLUMN <= value
              | COLUMN = value
              | COLUMN <> value
              | COLUMN >= value
              | COLUMN > value
              | COLUMN REGEXP STRING
              | COLUMN IS NULL
              | CELLTYPE ( COLUMN ) = "numeric|long|double|boolean|string|time|date|datetime|timestamp|object|missing"
              | ( cond )
              | cond:c1 AND cond:c2
              | cond:c1 OR cond:c2
              | NOT cond
              ;

value ::= NUMBER
              | STRING
              | PARSE ( "number" , STRING )
              | PARSE ( "date" , STRING )
              | PARSE ( "time" , STRING )
              | PARSE ( "timestamp" , STRING )
              ;

limit ::= LIMIT NUMBER:max
              | LIMIT NUMBER:offset , NUMBER:max
              ;
agg_list ::= agg_list COMMA agg
              | agg
              ;

agg ::= COUNT [(*)|(COLUMN)] [AS COLUMN]
              | MIN ( COLUMN ) [AS COLUMN]
              | MAX ( COLUMN ) [AS COLUMN]
              | RANGE ( COLUMN ) [AS COLUMN] (= MIN - MAX)
              | MEAN ( COLUMN ) [AS COLUMN]
              | AVERAGE ( COLUMN ) [AS COLUMN]
              | STDEV ( COLUMN ) [AS COLUMN]
              | STDEVP ( COLUMN ) [AS COLUMN]
              | SUM ( COLUMN ) [AS COLUMN]
              | IQR ( COLUMN ) [AS COLUMN]
              | INTERQUARTILE ( COLUMN ) [AS COLUMN]

Notes:
- time format: 'HH:mm'
- date format: 'yyyy-MM-dd'
- timestamp format: 'yyyy-MM-dd HH:mm'
- STRING is referring to characters enclosed by double quotes
- COLUMN is either a string with no blanks (consisting of letters, numbers, hyphen or underscore; eg 'MyCol-1') or a bracket enclosed string when containing blanks (eg '[Some other col]')
- columns used in the ORDER BY clause must be present in the SELECT part; also, any alias given to them in SELECT must be used instead of original column name

Additional information

Flow input/output:
- input: adams.data.spreadsheet.SpreadSheet
- output: adams.data.spreadsheet.SpreadSheet

Options

loggingLevel

The logging level for outputting errors and debugging output.

command-line -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST>

default WARNING

min-user-mode Expert
name

The name of the actor.

command-line -name <java.lang.String>

default SpreadSheetQuery
annotations

The annotations to attach to this actor.

command-line -annotation <adams.core.base.BaseAnnotation>

default
skip

If set to true, transformation is skipped and the input token is just forwarded as it is.

command-line -skip <boolean>

default false
stopFlowOnError

If set to true, the flow execution at this level gets stopped in case this actor encounters an error; the error gets propagated; useful for critical actors.

command-line -stop-flow-on-error <boolean>

default false

min-user-mode Expert
silent

If enabled, then no errors are output in the console; Note: the enclosing actor handler must have this enabled as well.

command-line -silent <boolean>

default false

min-user-mode Expert
query

The query to execute.

command-line -query <adams.parser.SpreadSheetQueryText>

default SELECT *

command-line	`-logging-level <OFF\|SEVERE\|WARNING\|INFO\|CONFIG\|FINE\|FINER\|FINEST>`
default	`WARNING`
min-user-mode	`Expert`

command-line	`-name <java.lang.String>`
default	`SpreadSheetQuery`

command-line	`-annotation <adams.core.base.BaseAnnotation>`
default

command-line	`-stop-flow-on-error <boolean>`
default	`false`
min-user-mode	`Expert`

command-line	`-silent <boolean>`
default	`false`
min-user-mode	`Expert`

command-line	`-query <adams.parser.SpreadSheetQueryText>`
default	`SELECT *`