Package adams.parser

Class SpreadSheetQuery

  • All Implemented Interfaces:
    Destroyable, GlobalInfoSupporter, LoggingLevelHandler, LoggingSupporter, OptionHandler, SizeOfHandler, GrammarSupplier, Serializable

    public class SpreadSheetQuery
    extends AbstractSymbolEvaluator<SpreadSheet>
    Evaluates spreadsheet subset queries.

    The following grammar is used:

    expr_list ::= expr_list expr_part | expr_part;

    expr_part ::= select | update | delete;

    select ::= SELECT col_list [limit]
    | SELECT col_list WHERE cond_list [limit]
    | SELECT col_list ORDER BY order_list [limit]
    | SELECT col_list WHERE cond_list ORDER BY order_list [limit]
    | SELECT agg_list
    | SELECT agg_list GROUP BY col_list
    | SELECT agg_list HAVING cond_list
    | SELECT agg_list GROUP BY col_list HAVING cond_list
    ;

    update ::= UPDATE SET upd_list
    | UPDATE SET upd_list WHERE cond_list
    ;

    delete ::= DELETE WHERE cond_list
    ;

    col_list ::= col_list COMMA col
    | col
    | SELECT NUMBER [subsample: <1 = percent; >= 1 number of rows]
    ;

    col ::= *
    | COLUMN
    | COLUMN AS COLUMN
    ;

    upd_list ::= upd_list COMMA upd | upd;

    upd ::= COLUMN = value
    ;

    order_list ::= order_list COMMA order | order;

    order ::= COLUMN
    | COLUMN ASC
    | COLUMN DESC
    ;

    cond_list ::= cond_list cond
    | cond
    ;

    cond ::= COLUMN < value
    | COLUMN <= value
    | COLUMN = value
    | COLUMN <> value
    | COLUMN >= value
    | COLUMN > value
    | COLUMN REGEXP STRING
    | COLUMN IS NULL
    | CELLTYPE ( COLUMN ) = "numeric|long|double|boolean|string|time|date|datetime|timestamp|object|missing"
    | ( cond )
    | cond:c1 AND cond:c2
    | cond:c1 OR cond:c2
    | NOT cond
    ;

    value ::= NUMBER
    | STRING
    | PARSE ( "number" , STRING )
    | PARSE ( "date" , STRING )
    | PARSE ( "time" , STRING )
    | PARSE ( "timestamp" , STRING )
    ;

    limit ::= LIMIT NUMBER:max
    | LIMIT NUMBER:offset , NUMBER:max
    ;

    agg_list ::= agg_list COMMA agg
    | agg
    ;

    agg ::= COUNT [(*)|(COLUMN)] [AS COLUMN]
    | MIN ( COLUMN ) [AS COLUMN]
    | MAX ( COLUMN ) [AS COLUMN]
    | RANGE ( COLUMN ) [AS COLUMN] (= MAX - MIN)
    | MEAN ( COLUMN ) [AS COLUMN]
    | AVERAGE ( COLUMN ) [AS COLUMN]
    | STDEV ( COLUMN ) [AS COLUMN]
    | STDEVP ( COLUMN ) [AS COLUMN]
    | SUM ( COLUMN ) [AS COLUMN]
    | IQR ( COLUMN ) [AS COLUMN]
    | INTERQUARTILE ( COLUMN ) [AS COLUMN]
    ;
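
    To make the grammar more concrete, here are a few query strings that appear to conform to it. This is only a sketch: the class name ExampleQueries and all column names (name, amount, region, created, checked, [unit price]) are hypothetical, and spacing between tokens is assumed not to matter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical collection of query strings written against the grammar above. */
class ExampleQueries {

  static final List<String> QUERIES = Collections.unmodifiableList(Arrays.asList(
    // select: all columns, at most 10 rows
    "SELECT * LIMIT 10",
    // select: alias, condition, ordering by the alias, offset 5 and at most 10 rows
    "SELECT name, [unit price] AS price WHERE [unit price] > 10 ORDER BY price DESC LIMIT 5, 10",
    // select: regular expression combined with a negated IS NULL check
    "SELECT * WHERE name REGEXP \"^A.*\" AND NOT (amount IS NULL)",
    // select: comparison against a parsed date value
    "SELECT * WHERE created >= PARSE(\"date\", \"2020-01-31\")",
    // select: aggregates with aliases, grouped by a column
    "SELECT COUNT(*) AS n, MEAN(amount) AS avg_amount GROUP BY region",
    // update: set a column for all rows matching a condition
    "UPDATE SET checked = \"yes\" WHERE amount > 100",
    // delete: remove all rows matching a condition
    "DELETE WHERE amount IS NULL"
  ));

  public static void main(String[] args) {
    for (String query : QUERIES)
      System.out.println(query);
  }
}
```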

    Notes:
    - time format: 'HH:mm'
    - date format: 'yyyy-MM-dd'
    - timestamp format: 'yyyy-MM-dd HH:mm'
    - STRING refers to characters enclosed in double quotes
    - COLUMN is either a string without blanks (consisting of letters, digits, hyphens or underscores; e.g., 'MyCol-1') or, when it contains blanks, a bracket-enclosed string (e.g., '[Some other col]')
    - columns used in the ORDER BY clause must be present in the SELECT part; if a column is given an alias in SELECT, that alias must be used instead of the original column name
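
    The notes above can be turned into small helpers for assembling query fragments. The following is a sketch, not part of ADAMS: the class QueryLiterals and its methods are hypothetical, "letters" is assumed to mean ASCII letters, and java.time patterns are assumed to match the listed formats. Escaping of double quotes inside a STRING is not described here, so the sketch does not attempt it.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Pattern;

/** Hypothetical helpers that render COLUMN, STRING and PARSE(...) fragments. */
class QueryLiterals {

  // A column that needs no brackets: letters, digits, hyphen or underscore only.
  private static final Pattern PLAIN_COLUMN = Pattern.compile("[A-Za-z0-9_-]+");

  // Formats taken from the notes: time, date and timestamp.
  private static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("HH:mm");
  private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy-MM-dd");
  private static final DateTimeFormatter TIMESTAMP = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

  /** Returns the name as is, or enclosed in brackets if it contains other characters such as blanks. */
  static String column(String name) {
    return PLAIN_COLUMN.matcher(name).matches() ? name : "[" + name + "]";
  }

  /** Encloses the text in double quotes (no escaping is performed). */
  static String string(String text) {
    return "\"" + text + "\"";
  }

  static String time(LocalTime value) {
    return parse("time", TIME.format(value));
  }

  static String date(LocalDate value) {
    return parse("date", DATE.format(value));
  }

  static String timestamp(LocalDateTime value) {
    return parse("timestamp", TIMESTAMP.format(value));
  }

  // Renders PARSE ( "type" , STRING ) from the value rule of the grammar.
  private static String parse(String type, String formatted) {
    return "PARSE(" + string(type) + ", " + string(formatted) + ")";
  }

  public static void main(String[] args) {
    String query = "SELECT * WHERE " + column("Some other col") + " >= "
      + date(LocalDate.of(2020, 1, 31));
    System.out.println(query);
  }
}
```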


    -logging-level <OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST> (property: loggingLevel)
        The logging level for outputting errors and debugging output.
        default: WARNING
     
    -env <java.lang.String> (property: environment)
        The class to use for determining the environment.
        default: adams.env.Environment
     
    -expression <java.lang.String> (property: expression)
        The spreadsheet query to evaluate.
        default: SELECT *
     
    -symbol <adams.core.base.BaseString> [-symbol ...] (property: symbols)
        The symbols to initialize the parser with, key-value pairs: name=value.
        default: 
     
    -reader <adams.data.io.input.SpreadSheetReader> (property: reader)
        The spreadsheet reader for loading the spreadsheet to work on.
        default: adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.SpreadSheet
     
    -input <adams.core.io.PlaceholderFile> (property: input)
        The input file to load with the specified reader; ignored if pointing to
        a directory.
        default: ${CWD}
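
    The -symbol option takes name=value pairs, whereas the static evaluate method expects a HashMap. Below is a minimal sketch of turning such pairs into a map. The class SymbolPairs is hypothetical, splitting at the first '=' is an assumption, and the values are kept as strings because the value types that the parser expects are not stated here.

```java
import java.util.HashMap;

/** Hypothetical helper that converts name=value strings into a symbol map. */
class SymbolPairs {

  /** Splits each pair at its first '=' and stores name and value in a map. */
  static HashMap<String, String> parse(String... pairs) {
    HashMap<String, String> symbols = new HashMap<>();
    for (String pair : pairs) {
      int eq = pair.indexOf('=');
      if (eq < 1)
        throw new IllegalArgumentException("Expected name=value but got: " + pair);
      symbols.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return symbols;
  }

  public static void main(String[] args) {
    HashMap<String, String> symbols = parse("threshold=10", "label=a=b");
    System.out.println(symbols.get("threshold"));
    System.out.println(symbols.get("label"));
    // With ADAMS on the classpath, such a map could then be handed over as the
    // "symbols" argument: SpreadSheetQuery.evaluate(expr, symbols, sheet)
  }
}
```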
     
    Author:
    FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Field Detail

      • m_Sheet

        protected SpreadSheet m_Sheet
        the spreadsheet to use.
      • m_Reader

        protected SpreadSheetReader m_Reader
        the spreadsheet reader for loading the spreadsheet.
    • Constructor Detail

      • SpreadSheetQuery

        public SpreadSheetQuery()
    • Method Detail

      • getGrammar

        public String getGrammar()
        Returns a string representation of the grammar.
        Returns:
        the grammar, null if not available
      • setReader

        public void setReader(SpreadSheetReader value)
        Sets the spreadsheet reader that loads the sheet.
        Parameters:
        value - the reader
      • getReader

        public SpreadSheetReader getReader()
        Returns the spreadsheet reader that loads the sheet.
        Returns:
        the reader
      • readerTipText

        public String readerTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setInput

        public void setInput(PlaceholderFile value)
        Sets the spreadsheet file to load; ignored if pointing to a directory.
        Parameters:
        value - the input file
      • getInput

        public PlaceholderFile getInput()
        Returns the spreadsheet file to load; ignored if pointing to a directory.
        Returns:
        the input file
      • inputTipText

        public String inputTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the GUI or for listing the options.
      • setSheet

        public void setSheet(SpreadSheet value)
        Sets the underlying spreadsheet.
        Parameters:
        value - the spreadsheet
      • getSheet

        public SpreadSheet getSheet()
        Returns the underlying spreadsheet.
        Returns:
        the spreadsheet
      • loadSheet

        protected void loadSheet()
        Loads the spreadsheet from disk, if possible.
      • evaluate

        public static SpreadSheet evaluate(String expr,
                                           HashMap symbols,
                                           SpreadSheet sheet)
                                    throws Exception
        Parses and evaluates the given query against the given spreadsheet. Returns the spreadsheet that results from the query, based on the given values of the symbols.
        Parameters:
        expr - the expression to evaluate
        symbols - the symbol/value mapping
        sheet - the spreadsheet to run the query on
        Returns:
        the evaluated result
        Throws:
        Exception - if something goes wrong
      • main

        public static void main(String[] args)
        Runs the evaluator from command-line.
        Parameters:
        args - the command-line options, use "-help" to list them
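
        As an illustration of the command-line options listed in the class description, the following sketch assembles an argument array that uses -expression, -input and a repeatable -symbol. The class QueryArgs and the file name data.csv are hypothetical. The call to main is shown only as a comment, since it requires ADAMS on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles command-line options for the evaluator. */
class QueryArgs {

  /** Builds the option array; each symbol (name=value) gets its own -symbol flag. */
  static String[] build(String expression, String inputFile, String... symbols) {
    List<String> args = new ArrayList<>();
    args.add("-expression");
    args.add(expression);
    args.add("-input");
    args.add(inputFile);
    for (String symbol : symbols) {
      args.add("-symbol");
      args.add(symbol);
    }
    return args.toArray(new String[0]);
  }

  public static void main(String[] ignored) {
    String[] args = build("SELECT * LIMIT 10", "data.csv", "threshold=10");
    System.out.println(Arrays.toString(args));
    // With ADAMS on the classpath: adams.parser.SpreadSheetQuery.main(args);
  }
}
```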