Updates 2017/07/21

The new semester started last week, so I was busy with my students. Development has mainly happened around deeplearning4j and prediction support for Microsoft's deeplearning library CNTK.

Fixes

  • LoadBalancer: fixed losing of outer variables; uses a Flow control actor now internally for better encapsulation.

Changes

  • Added support for outputting relative paths with the FileSystemSearch source: LocalDirectorySearch, LocalDirectorySearchWithComparator, LocalDirectorySearchWithCustomSort, LocalFileSearch
  • Panels managed by the DisplayPanelManager now get properly re-used via their unique ID (e.g., when using a variable), not just when mergeable. This allows out-of-order updates of sequence plots.
  • The Min and Max transformers can return 1-based indices now.
  • Added support for ADAMS_LIBRARY_PATH environment variable to adams.core.management.Launcher: its content gets supplied to the JVM via -Djava.library.path (used for native libraries like CNTK, MKL).
  • adams-dl4j:
    • Added the ability to the DL4JTrainModel transformer to test the model on a test set (split off the training data) and output the best model found so far, with associated statistic(s).
    • Added support for stopping criteria to DL4JTrainModel, rather than just having a fixed number of epochs.
  • adams-weka: The WekaFilter transformer can make use of storage and source actor now for obtaining the actual filter to use, not just serialized file or the filter specification.
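
The ADAMS_LIBRARY_PATH handling mentioned above can be pictured roughly as follows. This is only a minimal sketch, not the actual code of adams.core.management.Launcher: the class LibraryPathSketch and its jvmArgs method are hypothetical names, and skipping an empty value is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of ADAMS. */
final class LibraryPathSketch {

  /**
   * Turns the content of ADAMS_LIBRARY_PATH into a JVM argument.
   * Assumption: an unset or blank variable adds no argument at all.
   */
  static List<String> jvmArgs(Map<String, String> env) {
    List<String> args = new ArrayList<>();
    String path = env.get("ADAMS_LIBRARY_PATH");
    if (path != null && !path.trim().isEmpty())
      args.add("-Djava.library.path=" + path);
    return args;
  }

  public static void main(String[] args) {
    // prints the argument derived from the current environment (possibly none)
    System.out.println(jvmArgs(System.getenv()));
  }
}
```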

Additions

  • Added the ArrayNormalize array statistic, which normalizes an array to sum up to 1.0.
  • adams-cntk:
    • added support for applying CNTK models: CNTKModelApplier.
    • added spreadsheet writer for CNTK text file format: CNTKSpreadSheetWriter
    • added image feature generator: DefaultCNTK
  • adams-cntk-weka: Added pseudo-classifier that uses a pre-built model: functions.CNTKPrebuiltModel
  • adams-dl4j: Added transformer for randomizing dataset: DL4JRandomizeDataset
  • adams-imaging:
    • Added ScaleReportObjects transformer for scaling objects defined in reports.
    • Added example flow for training an OpenCV Haar cascade from annotated images: adams-imaging-opencv_train_haar.flow
  • adams-imaging-openimaj: added generic object detector class hierarchy, to be used by adams.flow.transformer.locateobjects.OpenIMAJObjectDetector
  • adams-spreadsheet: added class hierarchy for processors that work on the selected rows in a spreadsheet table, e.g., copying files using the filename from the specified column. Functionality available through SpreadSheetDisplay sink.
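
What the ArrayNormalize statistic computes can be illustrated with a few lines of plain Java. This is a sketch of the idea only; the class name is hypothetical, and rejecting arrays that sum to zero is an assumption, not necessarily what ADAMS does.

```java
import java.util.Arrays;

/** Hypothetical illustration, not the ADAMS implementation. */
final class ArrayNormalizeSketch {

  /** Returns a copy of the values, scaled so that they sum up to 1.0. */
  static double[] normalize(double[] values) {
    double sum = 0.0;
    for (double v : values)
      sum += v;
    // Assumption: a zero sum cannot be normalized and is treated as an error.
    if (sum == 0.0)
      throw new IllegalArgumentException("Cannot normalize: sum is zero");
    double[] result = new double[values.length];
    for (int i = 0; i < values.length; i++)
      result[i] = values[i] / sum;
    return result;
  }

  public static void main(String[] args) {
    // prints [0.25, 0.25, 0.5]
    System.out.println(Arrays.toString(normalize(new double[]{1, 1, 2})));
  }
}
```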

Updates 2017/06/30

A lot of effort has gone into deeplearning4j over the last few weeks: upgraded to the latest version, support for random network generation (who doesn't want to avoid hyperparameter fiddling?) and instructions for using Intel's MKL libraries for speeding up model building.

Filters can be serialized from the Weka Investigator now as well and re-used with the filter called SerializedFilter.

Fixes

  • The ForLoop source now skips the consistency tests if a variable is attached to at least one of the properties: lower/upper/step
  • Downgraded the MySQL driver to 5.1.42, after receiving java.sql.SQLNonTransientConnectionException: CLIENT_PLUGIN_AUTH is required exceptions when using 6.0.6 of the JDBC driver.
  • removed double quotes from default executable of JDeps and JMap control actors.
  • adams-dl4j:
    • The DL4JModelToJson and DL4JModelToYaml conversions now distinguish between Model and MultiLayerNetwork objects, to retrieve the correct configurations to convert.
    • The DL4JModelWriter sink ensures now that MultiLayerNetwork has been initialized to avoid errors.
  • adams-event: fixed forcing of variables in Cron standalone actor.
  • adams-net:
    • JavaMailSendEmail - using the javax.activation.DataHandler class with a URL didn't close the stream of attachments, resulting in locked files on Windows.
    • re-using existing sessions now: FTPConnection, SMBConnection, SSHConnection

Changes

  • The Exec source can now output stdout and stderr at the same time, ignore process errors, and use a working directory for the process.
  • Boolean/Mathematical/StringExpression: added "str(...)" method for converting objects/numbers into strings: str(expr) = any object's toString() method; str(expr,numdec) = any number is output with at most numdec decimals after the decimal point (trailing 0s get chopped off); str(expr,decformat) = applies the format to the number using java.text.DecimalFormat
  • SelectFile and SelectDirectory now support output with forward slashes.
  • adams-dl4j:
    • Upgraded deeplearning4j to 0.8.0
    • DL4JTrainModel now has a monitor variable for resetting the model, allowing for training sequentially on multiple datasets.
    • Added instructions for using Intel MKL libraries to speed up processing.
    • Moved the InMemoryStatsListenerConfigurator to the new adams-dl4j-insight module.
  • adams-weka:
    • The Weka Investigator now allows filters to be serialized in the pre-process panel.
    • The PrincipalComponentsJ filter now has the option -simple-attribute-names, which generates attributes like PCA_1...n instead of compiling them from the other attribute names.
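
The three variants of the str(...) method listed above can be mimicked in plain Java. The following is a sketch under assumptions, not the ADAMS expression parser: the class name is hypothetical, and a fixed US locale is assumed so that the decimal point is always ".".

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical illustration of the str(...) semantics. */
final class StrFunctionSketch {

  // Assumption: fixed symbols, so the output does not depend on the default locale.
  private static final DecimalFormatSymbols SYMBOLS =
      DecimalFormatSymbols.getInstance(Locale.US);

  /** str(expr): the object's toString() representation. */
  static String str(Object obj) {
    return String.valueOf(obj);
  }

  /** str(expr,numdec): at most numDec decimals, trailing zeros chopped off. */
  static String str(Number num, int numDec) {
    StringBuilder pattern = new StringBuilder("0");
    if (numDec > 0) {
      pattern.append('.');
      for (int i = 0; i < numDec; i++)
        pattern.append('#');  // '#' = optional digit, so trailing zeros vanish
    }
    return new DecimalFormat(pattern.toString(), SYMBOLS).format(num);
  }

  /** str(expr,decformat): applies a java.text.DecimalFormat pattern. */
  static String str(Number num, String decFormat) {
    return new DecimalFormat(decFormat, SYMBOLS).format(num);
  }

  public static void main(String[] args) {
    System.out.println(str(3.14159, 2));          // prints 3.14
    System.out.println(str(2.0, 3));              // prints 2
    System.out.println(str(1234.5, "#,##0.00"));  // prints 1,234.50
  }
}
```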

Additions

  • Added simple GUI tool for performing XSLT (XML, XSL and Output panel), available from the main menu under Maintenance.
  • adams-dl4j:
    • Added the CallableActorScoreListenerConfigurator iteration listener, which forwards the iteration count/score pair to a callable actor (eg for plotting).
    • Added conversion for turning DL4J datasets into spreadsheets: DL4JDataSetToSpreadSheet
    • Added conversion for converting spreadsheets into DL4J DataSets: SpreadSheetToDL4JDataSet
    • Added fake configurator, as it only retrieves model from storage: FromStorage
    • DL4JModelGenerator source generates model(s) using the specified generator scheme.
    • Added previews in the Preview browser for DL4J models in JSON and YAML
    • Conversions for recreating models from JSON and YAML: DL4JJsonToModel and DL4JYamlToModel
    • Conversion for creating actual model from configurator: DL4JConfiguratorToModel
  • New module: adams-dl4j-insight for providing insight in model building, which is not necessary when deploying models (avoiding bloat).
  • adams-dl4j-weka: added conversions WekaInstancesToDL4JDataSet and WekaInstanceToDL4JINDArray, using Mark Hall's code from the Weka package for DL4J.
  • adams-imaging: added the RandomBoundingBox left-click processor.
  • adams-spreadsheet:
    • added simple spreadsheet filtering framework via the SpreadSheetFilter transformer and the filter class hierarchy it uses. Initial filters: Normalize, Standardize.
    • The SpreadSheetInsertColumnPosition conversion inserts column position in string (eg BG), replacing the specified placeholder
  • adams-weka:
    • The WekaFilter spreadsheet filter allows applying any Weka filter to a spreadsheet.
    • weka.filters.SerializedFilter is a meta-filter that applies a serialized, trained filter to the data (no further training required).

Updates 2017/06/12

A lot of work has been done on better integration of the deeplearning4j framework. Support for rsync within the flow was added as well, e.g., for syncing local files with ones on a cloud server.

Fixes

  • adams-dl4j:
    • DL4JCrossValidationEvaluator and DL4JTrainTestSetEvaluator now store the model rather than the configurator in the container that they are forwarding.
    • DL4JDatasetIterator now fits preprocessor first if an instance of DataNormalization.
  • adams-weka: Changing the model file in the Investigator's classify/cluster tab now correctly resets any previously loaded model.

Changes

  • Using now processoutput4j library (https://github.com/fracpete/processoutput4j) for capturing the output from processes launched from within Java.
  • The following sources now have an additional conversion option to directly convert their output to a different type: Variable, VariablesArray, CombineVariables, StorageValue, StorageValuesArray, CombineStorage, StringConstants.
  • The IncVariable and IncStorageValue transformers can output the incremented value now instead of forwarding the input token.
  • MessageDigest can operate on arrays now as well, computing a single digest over all of them.
  • Upgraded lanterna to 3.0.0-rc1 (used for terminal-based user interfaces).
  • Added option to the CsvSpreadSheetReader to drop rows with too few/many cells: -skip-differing-rows
  • adams-dl4j:
    • added support for mini-batches to DL4JCrossValidationEvaluator, DL4JTrainModel and DL4JTrainTestSetEvaluator.
    • added support for listeners when training a model
    • added deeplearning4j-ui_2.10 dependency, to monitor training progress using InMemoryStatsListenerConfigurator
    • DL4JTrainModel/DL4JTrainTestEvaluation/DL4JCrossValidationEvaluator now store the final epoch number in the container (model/evaluation).
    • DL4JTrainModel allows incremental training now, outputting the model every X epochs (output interval).
  • adams-rats: added accepted/generated types of Rat input/output to additional information output displayed in the help screen.
  • adams-spreadsheet:
    • SpreadSheetSubset now supports R-like matrix subset expressions ,3:9 instead of specifying row and col ranges.
    • SpreadSheetSplitColumn now uses the header for the generated columns if it can be split into the same number of elements.
  • adams-weka:
    • The SpreadSheetToWekaInstances conversion can enforce STRING attributes now by using -1 as maxLabels.
    • The PartitionedMultiFilter2 (and therefore MetaPartitionedMultiFilter) now filters the data only once during the first batch, resulting in speed improvements.
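
Computing a single digest over several arrays, as the MessageDigest change above describes, boils down to feeding all arrays into the same digest instance before finalizing it. A minimal sketch using the JDK's java.security.MessageDigest follows; the class name, the choice of SHA-256 and the hex output are assumptions made for illustration, not the actor's actual options.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration, not the ADAMS actor. */
final class ArrayDigestSketch {

  /** Computes one SHA-256 digest over all chunks, in the given order. */
  static String sha256Hex(byte[][] chunks) {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
    for (byte[] chunk : chunks)
      md.update(chunk);  // accumulates; nothing is finalized yet
    StringBuilder hex = new StringBuilder();
    for (byte b : md.digest())
      hex.append(String.format("%02x", b));
    return hex.toString();
  }

  public static void main(String[] args) {
    byte[][] parts = {
      "ab".getBytes(StandardCharsets.UTF_8),
      "c".getBytes(StandardCharsets.UTF_8)
    };
    // same digest as for the single string "abc"
    System.out.println(sha256Hex(parts));
  }
}
```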

Additions

  • With the adams.logging.Logging console application, you can connect to an ADAMS instance that is, for instance, running as a daemon/service, listening to its logging output. The logging.sh/logging.bat scripts start the listening application (just outputs the logging to stdout).
  • New boolean conditions for checking boolean flags: StorageFlagSet and VariableFlagSet.
  • Convenience transformer for setting a boolean flag in storage: SetStorageFlag.
  • Added new menu item Full expansion to the Flow editor for creating a fully expanded flow (i.e., pulls in all external actors).
  • Added FileTailer transformer for monitoring text files à la tail -f on Unix systems.
  • Added most of the functionality of the Remote Control Center GUI to the terminal-based interface (can be started up with adams.terminal.Main).
  • Added RightPad conversion for padding strings on the right-hand side.
  • New module adams-rsync for rsync support:
    • RSync - offers all (!) rsync options
    • Rsync4jRsyncBinary - outputs the rsync binary used by rsync4j library
    • Rsync4jSshBinary - outputs the ssh binary used by rsync4j library
    • SimpleRSync - commonly used rsync options
  • adams-dl4j:
    • added NormalizerMinMaxScaler and NormalizerStandardize dataset pre-processors for scaling numeric attributes.
    • added SimpleRegressionMultiLayerNetwork as an example for performing regression.
  • adams-twitter:
    • upgraded twitter4j to 4.0.6
    • Added TwitterUser transformer for retrieving information about a user.
  • adams-visualstats: added MOA-based CUSUM (cumulative sum) and Page-Hinkley test control charts.
  • adams-weka: added Kennard-Stone filter.

Updates 2017/05/12

Being busy with commercial ADAMS projects still results in an ample number of improvements to the base ADAMS system. The last few weeks were no exception.

Fixes

  • The table model for spreadsheets now displays NaN, +/-Infinity as strings.
  • Spreadsheet writers that can use formatting now use 'NaN' and '+/-Infinity' strings for these numbers.
  • Fixed the forceVariables method for Tee/Trigger/LoadBalancer/WhileLoop and derived actors: internally used Sequence actor gets updated correctly now.
  • adams-net: FTPSend and SFTPSend now forward the successful filenames as the documentation says.
  • adams-dl4j: fixed handling of regression problems.

Changes

  • In order to make actor names unique, they now get " (x)" appended, with x being a number starting from 2.
  • The SetVariable standalone/transformer can interpret the variable value now as boolean, string or mathematical expression, making it easier to compute new values.
  • Added ability to use custom dirs/jars for JDeps control actor instead of the application's classpath.
  • The CallableActorScreenshot control can forward screenshot as BufferedImageContainer now as well, not just storing it in a file.
  • The actorFile property can contain now programmatically set variables like flow_dir, enabling the include external actor derived actors to make use of a variable as well (relative to the main flow). Instead of attaching a variable to the property, you have to use mixed notation: @{flow_dir}/some.flow.
  • Added equal frequency calculation to the ArrayHistogram statistic.
  • The RandomNumberGenerator source can output arrays now.
  • Added support for restorable actors, ones that can write/read their state to/from disk during execution; currently supported by: EnterValue, EnterManyValues, SelectDirectory, SelectFile.
  • adams-spreadsheet:
    • SpreadSheetStatistic now supports column names in locations.
    • SpreadSheetExtractArray can output strings now as well, instead of just the native cell object type.
  • adams-weka:
    • WekaInstancesStatistic now supports attribute names in locations.
    • The WekaGeneticAlgorithm transformer can be initialized from a WekaGeneticAlgorithmInitializationContainer container now, containing algorithm and training data.
  • adams-spectral-2dim got renamed to adams-spectral-2dim-core.
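
The scheme for making actor names unique (a numeric suffix in parentheses, counting from 2) can be sketched as below. This is an illustration only: the class and method names are hypothetical, and picking the first free number is an assumption about how collisions are resolved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration, not the Flow editor's code. */
final class UniqueNameSketch {

  /** Returns the name itself if unused, otherwise "name (x)" with the first free x >= 2. */
  static String uniqueName(Set<String> existing, String name) {
    if (!existing.contains(name))
      return name;
    int x = 2;
    while (existing.contains(name + " (" + x + ")"))
      x++;
    return name + " (" + x + ")";
  }

  public static void main(String[] args) {
    Set<String> names = new HashSet<>(Arrays.asList("Display", "Display (2)"));
    System.out.println(uniqueName(names, "Display"));  // prints Display (3)
  }
}
```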

Additions

  • Added HasClass boolean condition that checks whether the specified class is available on the classpath.
  • Added StringExpression source and transformer for evaluating string processing expressions, like left(upper("Hello World!"), 5).
  • Added meta-marker paintlet ByNameMarkerPaintlet that matches the name of the sequence against the supplied regular expression to determine whether to paint the markers or not.
  • With the ArrayHistogramRanges transformer it is possible to output the interval ranges that the ArrayHistogram statistic generates (easier than iterating through the header names of the generated spreadsheet).
  • adams-pdf: added MetaHeadline PDF proclet to insert headline and then apply a base-proclet.
  • adams-spreadsheet: the SpreadSheetHistogramRanges transformer is the equivalent of ArrayHistogramRanges but for SpreadSheet objects.
  • adams-weka:
    • The WekaInstancesHistogramRanges transformer is the equivalent of ArrayHistogramRanges but for Instances objects.
    • Added support for using test data to the WekaGeneticAlgorithm transformer, but only Hermione takes advantage of it.
    • Added convenience transformer WekaGeneticAlgorithmInitializer to generate a WekaGeneticAlgorithmInitializationContainer container for priming a genetic algorithm.
  • Added some modules to the adams-spectral-base framework:
    • adams-spectral-2dim-handheld contains support for some handheld NIR scanners, like the SCiO (https://www.consumerphysics.com/myscio/scio/).
    • adams-spectral-2dim-webservice adds webservice capability
    • adams-spectral-2dim-rats adds RATS support

Have a great weekend!

Updates 2017/04/21

With Easter out of the way, it is time for an update. A lot of incremental improvements as always, but noteworthy are the added associator support in the flow and Investigator, additional paintlets allowing for more sophisticated plots, and websocket support.

Fixes

  • Debugged flows that open another tab with the fully expanded flow (e.g., when using external actors) now show their variables correctly in the tree.
  • adams-weka: The ByNumericValue row finder no longer skips non-missing values.

Changes

  • DOMToString conversion now allows specification of number of spaces for indentation.
  • JDBC URLs are now supported via Favorites in the GenericObjectEditor, e.g., used in the DatabaseConnection standalone actor (JdbcUrl type). The same applies to BaseURL, BaseURI, BaseHostname, EmailAddress, ByteFormatString, DateFormatString and DecimalFormatString types.
  • The SetVariable standalone can initialize the specified variable now from an environment variable. Useful when overriding flow defaults when running a flow from a script.
  • When starting a flow, two boolean variables now get automatically added regarding the headless state of the execution: is_headless, has_gui
  • Rendering spreadsheets when debugging a flow now offers a search box as well.
  • adams-latex: Code generators now list in their help what packages they require.
  • adams-meka: upgraded to MEKA 1.9.1.
  • adams-ml: added ability to the ActualVsPredictedPlot sink to use meta-data color schemes for coloring in the data points based on meta-data.
  • adams-net:
    • Upgraded jsoup dependency to 1.10.2
    • The HttpRequest source now allows specification of request headers.
  • adams-spreadsheet: The SpreadSheetSetCell and SpreadSheetSetHeaderCell transformers now forward the sheet/row even if the columns/rows were not found, to avoid accidental stopping of processing of data.
  • adams-webservice:
    • The XML logging interceptors (in/out) now support pretty-printed XML and optional output to a log file.
    • Interceptor generators for SOAP messages now have an enabled flag to easily turn them on/off without having to replace them.

Additions

  • Added PrettyPrintXML conversion to make pretty-printing of XML easier (combines XMLToDOM and DOMToString).
  • Introduced the MetaDataColorPaintlet interface for paintlets (e.g., use in the SequencePlotter or SimplePlot sinks) that can extract the color to plot a sequence point in from its meta-data.
  • Added meta-paintlet ByNamePaintlet that only applies its base-paintlet if the ID of the sequence matches the specified regular expression.
  • Added meta-paintlet for drawing error data only if ID matches supplied regular expression: ByNameErrorPaintlet.
  • Enabled attaching of breakpoint to the selected actor while the flow is running using the Attach here... sub-menuitem of the Breakpoint menu in the Flow editor.
  • Added support for callable database connections with the CallableDatabaseConnection standalone. This standalone simply references an actual DatabaseConnection which is listed below a CallableActors actor. This allows the centralization of database connections in a flow and then simply referencing them where needed.
  • The ListCallableActors source actor outputs all the callable actors available within this particular scope. Useful when iterating over predefined setups (eg Weka classifiers).
  • The UpdateCallableDisplay control actor forces a refresh of the specified callable display actor whenever a token passes through.
  • adams-latex:
    • The ListRequiredLatexPackages flow processor outputs all the required packages found in a flow.
    • Added the MiniPage code generator.
  • adams-net:
    • Added the HttpRequest transformer which uploads the incoming string payload.
    • Added basic websocket support with WebSocketServer standalone and WebSocketClient sink actors.
  • adams-rats:
    • Added Rat output for websockets called WebsocketOutput.
    • Added a post-processing framework for webservice responses to the Webservice rat output. Allows handling of, e.g., error messages returned by the other end of the webservice.
  • adams-spreadsheet:
    • The ByNumericRange row finder uses array of BaseInterval ranges, i.e., using mathematical intervals like [-1.0;3.5).
    • Added row finder for finding a row that comes closest to a supplied value: ClosestNumericValue.
  • adams-weka:
    • Added the WekaBootstrapping transformer which performs bootstrapping on incoming Evaluation objects/containers and outputs a spreadsheet with each row representing the statistics collected from a subsample run.
    • The ClassifierErrors output of the Weka Investigator Classify tab now allows you to color in the data points using meta-data in conjunction with a meta-data-color scheme.
    • The ByNumericRange row finder uses array of BaseInterval ranges, i.e., using mathematical intervals like [-1.0;3.5).
    • Added basic associations support: WekaAssociatorSetup source and WekaTrainAssociator transformer.
    • Added new pluggable metrics for regression: Bias (mean(predicted) - mean(actual)), R^2 ("r squared") for convenience.
    • Added the ability to perform train/validate/test evaluation to the classify tab of the Weka Investigator. The validation output gets inserted as separate history entry.
    • Added a dedicated Associate tab to Weka Investigator for building associators like Apriori.
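
The Bias metric listed above is defined as mean(predicted) - mean(actual). A minimal sketch of that formula in plain Java follows; the class name is hypothetical, and this is not the Weka plugin itself.

```java
/** Hypothetical illustration of the Bias formula. */
final class BiasSketch {

  /** Arithmetic mean; empty arrays are rejected (an assumption). */
  static double mean(double[] values) {
    if (values.length == 0)
      throw new IllegalArgumentException("No values supplied");
    double sum = 0.0;
    for (double v : values)
      sum += v;
    return sum / values.length;
  }

  /** Bias = mean(predicted) - mean(actual). */
  static double bias(double[] predicted, double[] actual) {
    return mean(predicted) - mean(actual);
  }

  public static void main(String[] args) {
    // predictions are on average 1.0 too high: prints 1.0
    System.out.println(bias(new double[]{2, 4}, new double[]{1, 3}));
  }
}
```

A positive value means the model over-predicts on average, a negative one that it under-predicts.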

Have a good weekend!

Cheers, Peter

Updates 2017/03/31

Many little improvements happened across the modules. But the major new additions are better support for generating LaTeX documents and the control center for remote commands. You can now easily launch and monitor flows on remote machines (as long as ADAMS is running and listening, of course). The main use case for the remote command framework is to manage ADAMS instances that run on servers as daemons/services, i.e., in the background without any user interface. Launching ADAMS through scripts as daemons/services has been possible for quite a while now, but the major drawback was that you could not interact with these processes. This is now possible to some extent.

Fixes

  • The boolean condition of the Breakpoint dialog can be updated at runtime again.
  • The SetPlotContainerValue transformer now handles non-numeric values for X/Y as well, e.g., when using the SimpleFixedLabelTickGenerator.
  • Added some general post-processing to the actor suggestions in the Flow editor, eliminating silly suggestions.
  • Disabled menu items for running/debugging of a flow that is representing a debug view already.
  • adams-spreadsheet: removed implementation of generates() method in the SpreadSheetExtractArray transformer, as the output can be array or element-by-element (the transformer reported the wrong type).
  • The SequencePlotter sink now centers the only data point on X and Y.
  • The SimplePlot/JFreeChart plugins for plotting row/column in a spreadsheet table no longer fail when selecting a subset of points while containing non-numeric data points.
  • The SequencePlotter and SimplePlot now take (optional) errors into account when generating min/max of X/Y for data points, fitting the complete plot into the viewport when using no fixed X or Y ranges.
  • adams-weka:
    • Modified loading of datasets in the Investigator to avoid file locking issues on Windows.
    • Introduced new Action button in the Weka Investigator to avoid the Save action being disabled when selecting more than one dataset and blocking all other actions.
    • The WekaNewExperiment now materializes any variables attached (e.g., datasets or classifiers) before forwarding the experiment object.
    • The WekaEvaluationValues transformer now works with fake Evaluation objects generated by the WekaSpreadSheetToPredictions transformer as well.
    • The SimplePlot/JFreeChart plugins for plotting row/column in instances table no longer fail when selecting a subset of points while containing non-numeric data points.

Changes

  • The DumpStorage/DumpVariables sources now support Properties/Map as output type as well.
  • Upgraded JDBC drivers:
    • sqlite to 3.16.1
    • hsqldb to 2.3.4
    • MySQL to 6.0.6
    • PostgreSQL to 42.0.0
  • adams-maps: Upgraded PostGIS JDBC driver to 2.2.1
  • adams-access: Upgraded MS Access driver jackcess to 2.1.6
  • adams-weka: The WekaSpreadSheetToPredictions transformer now allows recreating fake Weka Evaluation objects for nominal classes, by using columns that store the class distribution.
  • adams-rats: added support for MANUAL mode for Rat actors; default mode is CONTINUOUS. Useful for one-off operations, actionable through the RatControl control interface.

Additions

  • Added conversions for generating key-value pairs: MapToKeyValuePairs and PropertiesToKeyValuePairs.
  • More number conversions: NumberToByte/Int/Long/Float/Double
  • Added transformer for removing a report value based on the supplied boolean expression: DeleteReportValueByExpression
  • Added sink for feeding into Java logging framework: JavaLogging
  • Remote command framework:
    • New interface for managing remote command execution: Remote control center. Available from the main menu Program -> Remote commands sub menu.
    • Added command for sending flow control commands (pause/resume/stop/restart): SendFlowControlCommand.
    • Added command for stopping a remote ADAMS instance gracefully (can stop registered flows first): Stop
    • The RunRemoteFlow command loads a remote flow and executes it, optionally registers it.
  • New color providers for plots:
    • RegExpColorProvider: The plot names are matched against the regexp/color pairs and the regexp that matches determines the color. Useful if the data doesn't arrive in the same order, but you want to get the same color across multiple plots for better comparison.
    • RgbInNameColorProvider: Searches for RGB color definition in the plot name (eg for '#ff0000') which defines the color of the plot.
  • adams-compress: added support for decompressing RAR archives with UnRAR transformer.
  • adams-spreadsheet: added wrappers for array statistics to be applied to numeric spreadsheet row/column.
  • adams-weka:
    • Added the ClassifierCascade meta-classifier, which is inspired by deeplearning, but uses off-the-shelf classifiers as ensemble in each layer.
    • The WekaEvaluationPostProcessor allows the application of post-processors to the collected predictions (eg subrange evaluation).
    • Debugging now has an inspection handler for Weka Evaluation objects.
    • The WekaEvaluationInfo transformer outputs basic information for a Weka Evaluation object.
  • adams-latex: added proper support for generating and compiling LaTeX documents (including BibTeX):
    • NewLatexDocument - creates a new document
    • LatexAppendDocument - appends to the document using a specified code generator
    • LatexCloseDocument - closes a document
    • LatexCompile - compiles a LaTeX document (automatically compiles BibTeX files as well)
  • adams-rats: added remote commands GetRatControlStatus and SendRatControlCommand to remotely control Rat actors in flows.
  • adams-random:
    • ArrayRandomize transformer randomizes a copy of an array.
    • The RandomNumberGenerator transformer replaces the incoming token with a random number.
  • adams-spectral-2dim: Added transformer for removing a report value based on the supplied boolean expression: DeleteReportValueByExpression
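
The idea behind the two new color providers (RegExpColorProvider and RgbInNameColorProvider) can be sketched in plain Java. This is not the ADAMS code: the class and method names are hypothetical, colors are represented as plain strings/integers rather than color objects, and "the first matching regexp wins" is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of name-based plot colors. */
final class PlotColorSketch {

  private static final Pattern RGB = Pattern.compile("#([0-9a-fA-F]{6})");

  /** Returns the color of the first regexp matching the whole plot name, else the fallback. */
  static String colorByRegExp(Map<String, String> regexpToColor, String plotName, String fallback) {
    for (Map.Entry<String, String> entry : regexpToColor.entrySet()) {
      if (plotName.matches(entry.getKey()))
        return entry.getValue();
    }
    return fallback;
  }

  /** Returns the 0xRRGGBB value of the first "#rrggbb" in the name, or -1 if none is present. */
  static int rgbInName(String plotName) {
    Matcher m = RGB.matcher(plotName);
    return m.find() ? Integer.parseInt(m.group(1), 16) : -1;
  }

  public static void main(String[] args) {
    Map<String, String> pairs = new LinkedHashMap<>();  // keeps insertion order
    pairs.put("train.*", "#0000ff");
    pairs.put("test.*", "#ff0000");
    System.out.println(colorByRegExp(pairs, "test-fold1", "#000000"));  // prints #ff0000
    System.out.println(rgbInName("J48 #ff0000") == 0xff0000);           // prints true
  }
}
```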

Have a good weekend!

Mailing list 2017/03/15

The following came through on the ADAMS user mailing list:

Peter, what is your procedure when building ADAMS flows? How do create such novel flows fast? Is there any intuitive steps that help when designing flows in ADAMS to build flows smoothly?

I thought, you might be interested in that, too.

Here is my reply:

Being the main author of ADAMS, it is easy for me to remember what actors I've developed and what they do. ;-)

But here are some strategies:

1. Break down the problem

Like with any other programming language, you need to break down a problem into smaller steps.

E.g., evaluating a Weka model can be broken down into:

  • load dataset -- FileSupplier/SelectFile + WekaFileReader
  • set correct class attribute -- WekaClassSelector
  • evaluate -- WekaCrossValidationEvaluator + CallableActors at start of flow with the classifier you want to use
  • display results -- WekaEvaluationSummary + Display

2. Start with a small flow and grow it

I never set out to write a massive flow from the get-go. I always work on little bits (maybe in separate flows) and then combine them, tweak/adapt them. Rearranging actors, encapsulating actors in other control actors is much easier than in other workflow systems, since you don't have to disconnect/reconnect the operators.

3. Make use of variables and internal storage

Variables and storage are extremely powerful tools. Variables can be used for changing actor options on-the-fly or for generating path names and other output. Storage is normally used when you want to re-use the object several times in a flow, e.g., the same dataset or evaluation object.

NB: For non-ADAMS objects (eg Weka classifiers), you can still change parameters, but it is a bit more cumbersome. The UpdateProperties control actor updates a property of the actor below it based on a property path through the object hierarchy, using the value from the associated variable. A property path is the concatenation of the "property names", e.g., for the WekaFilter actor with a PLSFilter, you'd use "filter.numComponents" to change the number of PLS components to use. Arrays can be navigated as well.
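
To make the notion of a property path more concrete, here is a small sketch of how a path like "filter.numComponents" can be resolved with plain Java reflection. It is not the code behind UpdateProperties: the two bean classes are hypothetical stand-ins, JavaBeans-style getX/setX naming is assumed, and array navigation is left out.

```java
import java.lang.reflect.Method;

/** Hypothetical illustration of resolving a property path. */
final class PropertyPathSketch {

  /** Hypothetical stand-in for a filter with a numComponents option. */
  public static class PlsFilterBean {
    private int numComponents = 20;  // arbitrary default for the example
    public int getNumComponents() { return numComponents; }
    public void setNumComponents(int value) { numComponents = value; }
  }

  /** Hypothetical stand-in for an actor that wraps a filter. */
  public static class FilterActorBean {
    private PlsFilterBean filter = new PlsFilterBean();
    public PlsFilterBean getFilter() { return filter; }
    public void setFilter(PlsFilterBean value) { filter = value; }
  }

  /** Builds an accessor name, e.g. ("get", "filter") -> "getFilter". */
  private static String accessor(String prefix, String property) {
    return prefix + Character.toUpperCase(property.charAt(0)) + property.substring(1);
  }

  /** Follows the getters along the path and calls the setter of the last element. */
  static void setProperty(Object root, String path, Object value) {
    String[] parts = path.split("\\.");
    try {
      Object current = root;
      for (int i = 0; i < parts.length - 1; i++)
        current = current.getClass().getMethod(accessor("get", parts[i])).invoke(current);
      String setter = accessor("set", parts[parts.length - 1]);
      for (Method m : current.getClass().getMethods()) {
        if (m.getName().equals(setter) && m.getParameterCount() == 1) {
          m.invoke(current, value);
          return;
        }
      }
      throw new IllegalArgumentException("No setter found for: " + path);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot resolve: " + path, e);
    }
  }

  public static void main(String[] args) {
    FilterActorBean actor = new FilterActorBean();
    setProperty(actor, "filter.numComponents", 5);
    System.out.println(actor.getFilter().getNumComponents());  // prints 5
  }
}
```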

4. Use the debugger or output debugging information

Set Breakpoint actors or simply step through the flow to see what the value of the current token is and what values variables or storage items have. Like any other debugger, this is the most powerful tool for figuring out what's going on within an application. A workflow is no different there. Outputting debugging or progress information is extremely useful, too. Just "Tee" off the current token and display it in a Display actor.

As a final word, the example flows that come with ADAMS are relatively small flows, demonstrating the usage of certain actors. The idea is to copy/paste the relevant bits into your own flows to build great applications. Even I quite often look up the usage of actors in my example flows, especially if it is stuff that I've written several years ago. ;-)

Updates 2017/03/06

The new year turned out to be extremely busy with loads of projects running parallel. Nonetheless, there are still a number of fixes and improvements happening.

Fixes

  • The HashSetInit transformer now stores the hashset in storage when initializing from array.
  • The EnterValue source uses BaseString now for the message/initial-value options, which updates variable occurrences correctly.
  • adams-weka: The Data and Instance tab now get correctly updated when an undo occurs.
  • adams-webservice: classes implementing the AlternativeUrlSupporter interface now only attempt to instantiate the URL if the provided parameter is neither null nor empty.
  • Added auto-detect-data-type flag to the SetReportValue transformers, previously on by default, now turned off.
  • adams-spreadsheet:
    • The ColumnSubset row score scheme now uses the correct row-index for applying the base row score algorithm; also uses more efficient views for the subset instead of creating copies.
    • Fixed native use of Object[] in LookUpAdd transformer.
  • adams-rats: The Rat and RatControl standalones now listen to flow state changes, i.e., get notified correctly when pausing/resuming the flow.

Changes

  • Added support for gzip-compressed report files (.csv and .report), including display in Preview browser.
  • SelectArraySubset now has an option to allow searching the list.
  • Outsourced dynamic class discovery to jclasslocator library.
  • The storage tab in the control panel of the Breakpoint now has a popup menu for the table listing the storage items. It is now possible to display storage items in multiple dialogs.
  • Filters now get the flow context set if they implement the adams.flow.core.FlowContextHandler interface.
  • The HashSetInit standalone now allows initializing with string values.
  • The HashSet boolean condition now allows specifying the value to check rather than just using the token.
  • Standalones now force a variable update in the preExecute method if any variables are detected.
  • When debugging a flow, the copy of the debugged flow is no longer editable.
  • The Storage can be viewed in real-time in the Flow editor now as well, just like the variables (in the menu: Run -> Storage).
  • Logging messages that appear in the application's Console window (main menu -> Program) are now also logged to the project's home directory, e.g., $HOME/.adams/log/console.log.
  • The SequencePlotter now allows you to change the margins used on the axes as well, not just the ranges.
  • adams-spreadsheet:
    • SpreadSheetExplorer now has an additional plot popup menu item for operating on the containers visible in the current viewport.
    • SpreadSheetMerge transformer now ensures that the specified column with the IDs has only unique IDs, throws an exception otherwise (in strict mode only).
  • adams-spectral-dim:
    • SpectrumExplorer now has an additional plot popup menu item for operating on the containers visible in the current viewport.
  • adams-weka:
    • InstanceExplorer now has an additional plot popup menu item for operating on the containers visible in the current viewport.
    • WekaInstancesMerge transformer now ensures that the specified attribute with the IDs has only unique IDs, throws an exception otherwise (in strict mode only).
    • Filters now get the flow context set if they implement the adams.flow.core.FlowContextHandler interface.
    • Weka Investigator
      • If loading of a serialized model (re-evaluate model in Classify and Cluster tab) fails, the error message displayed on the Start is now more expressive, e.g., stating that the file doesn't actually contain a serialized object.
      • Added Insert as dataset action to the data tab's tables to allow inserting selected rows as a new dataset.
  • adams-imaging:
    • added option to image feature generators for specifying a custom prefix for the feature names, e.g., to avoid duplicate names when using the same feature generator twice but with different parameters.
    • The Histogram feature generators can now group by channel rather than just by bin index.
  • adams-imaging-boofcv: reverted BoofCV back to 0.18 for the time being.
  • adams-imaging-imagej: The Histogram feature generator allows grouping of channels now.
  • adams-ml: ActualVsPredictedPlot now has an option to supply a plot name, allowing multiple data plots in the same plot.
  • adams-rats: The RatControl standalone now allows stopping/restarting of individual Rat actors, not just pausing/resuming.
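
The unique-ID check that SpreadSheetMerge and WekaInstancesMerge now perform in strict mode can be pictured with the following minimal sketch. It is plain Java and does not use the ADAMS API; the class name UniqueIdCheckSketch and its methods are hypothetical and only illustrate the idea of collecting duplicate IDs and failing when strict mode is on.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ADAMS: illustrates a unique-ID check.
class UniqueIdCheckSketch {

  /** Returns the IDs that occur more than once, in order of first repetition. */
  static Set<String> duplicates(List<String> ids) {
    Set<String> seen = new HashSet<>();
    Set<String> dups = new LinkedHashSet<>();
    for (String id : ids) {
      // Set.add returns false if the ID was already present
      if (!seen.add(id))
        dups.add(id);
    }
    return dups;
  }

  /** Throws an exception on duplicate IDs, but only in strict mode. */
  static void check(List<String> ids, boolean strict) {
    Set<String> dups = duplicates(ids);
    if (strict && !dups.isEmpty())
      throw new IllegalStateException("Duplicate IDs found: " + dups);
  }
}
```

In non-strict mode the sketch simply lets duplicates pass; how the actual transformers treat duplicates in that case is not covered here.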

Additions

  • Added MergeReport transformer for merging a report with one obtained from storage or a callable source actor.
  • adams-imaging: added new Histogram feature generator that supports Gray, RGB, YUV, YIQ, HSV color models.
  • adams-ml: The ConfusionMatrix transformer generates a confusion matrix in spreadsheet format from a spreadsheet with actual/predicted labels.
  • adams-spreadsheet: added the SpreadSheetSortColumns transformer for sorting spreadsheet columns.
  • adams-weka:
    • Added a file loader for ARFF files (SimpleArffLoader), to avoid file locking issues under Windows with the Weka one. Does not support incremental loading or relational attributes.
    • Added better support for experiments, making use of classes developed for the MultiExperimenter GUI tool (including multi-core support):
      • WekaNewExperiment (source)
      • WekaExperimentFileReader (transformer)
      • WekaExperimentExecution (transformer)
      • WekaExperimentFileWriter (sink)
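
To give an idea of what the new ConfusionMatrix transformer computes, here is a minimal sketch in plain Java that counts actual/predicted label pairs. It does not use ADAMS spreadsheets; the class ConfusionMatrixSketch and its count method are hypothetical, and the nested map merely stands in for the spreadsheet output (rows are actual labels, columns are predicted labels).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of ADAMS: counts actual/predicted label pairs.
class ConfusionMatrixSketch {

  /** Returns a map of the form matrix.get(actual).get(predicted) = count. */
  static Map<String, Map<String, Integer>> count(List<String> actual, List<String> predicted) {
    if (actual.size() != predicted.size())
      throw new IllegalArgumentException("Number of actual and predicted labels differ!");

    // all labels that occur in either list, sorted alphabetically
    TreeSet<String> labels = new TreeSet<>(actual);
    labels.addAll(predicted);

    // initialize every cell with 0, so that the matrix is always square
    Map<String, Map<String, Integer>> matrix = new LinkedHashMap<>();
    for (String a : labels) {
      Map<String, Integer> row = new LinkedHashMap<>();
      for (String p : labels)
        row.put(p, 0);
      matrix.put(a, row);
    }

    // count the label pairs
    for (int i = 0; i < actual.size(); i++)
      matrix.get(actual.get(i)).merge(predicted.get(i), 1, Integer::sum);

    return matrix;
  }
}
```

The diagonal cells hold the correctly predicted counts, everything off the diagonal is a misclassification.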

Have a good week!

Updates 2017/01/27

The new year started mainly with a lot of work and, whenever I had time, with the upgrade of some libraries, like Weka and deeplearning4j.

Fixes

  • The Utils.doubleToStringFixed method now handles NaN and Inf values correctly.
  • Fixed flushing/closing of compressed serialized models (SerializationHelper class).
  • The CheckVariableUsage flow processor now excludes system-supplied variables like flow_dir and flow_id from the check.
  • adams-dl4j:
    • Added icon for DL4JModelReader transformer.
    • RecordReaderDataSetIteratorConfigurator now allows using 0 as minimum for the 0-based indices.
  • adams-weka: The WekaFileReader transformer now handles filenames without extension as long as there is a custom loader defined.
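
The NaN/Inf fix for Utils.doubleToStringFixed concerns a classic pitfall: special floating-point values have to be handled before any rounding logic gets applied. The following minimal sketch shows the idea in plain Java. It is not the ADAMS implementation; the class FixedFormatSketch, the method toFixed and the use of BigDecimal are assumptions made purely for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not the ADAMS code: fixed-decimal formatting of doubles.
class FixedFormatSketch {

  /** Formats the value with exactly the given number of decimals. */
  static String toFixed(double value, int decimals) {
    if (decimals < 0)
      throw new IllegalArgumentException("Number of decimals must be >= 0!");

    // special values must be handled first: BigDecimal cannot represent
    // them and would throw a NumberFormatException
    if (Double.isNaN(value))
      return "NaN";
    if (Double.isInfinite(value))
      return (value > 0) ? "Infinity" : "-Infinity";

    return BigDecimal.valueOf(value)
        .setScale(decimals, RoundingMode.HALF_UP)
        .toPlainString();
  }
}
```

For instance, toFixed(3.14159, 2) yields "3.14", whereas toFixed(Double.NaN, 2) returns "NaN" instead of failing.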

Changes

  • The Display and HistoryPanel sinks now have options for line-wrap and wrap-style-word.
  • The GridView standalone and the DisplayPanelGrid sink now allow the user to change the grid layout at runtime.
  • adams-ml-app:
    • added example flows/scripts for configuring deeplearning4j networks using Groovy and Jython.
    • added adams-imaging dependency to have basic image processing capability
  • adams-spreadsheet: batch import of spreadsheets now outputs a more detailed error message in case of BatchUpdateException exceptions.
  • adams-weka: Added a popup menu to the dataset table of the Investigator's Preprocess panel and added the Clear action to the menu for removing all datasets at once.
  • Dependency changes
    • Weka 3.9.0 (with patched FilteredClassifier)
    • Apache CXF 3.1.9
    • LIRE 1.0b2
    • deeplearning4j 0.7.2
    • CUDA 8.0 libraries for deeplearning4j
    • ImageJ 1.51h
    • BoofCV 0.26
  • adams-dl4j:
    • DL4JDatasetIterator source now has option to output full dataset instead of batches.
  • adams-spectral-2dim:
    • Added regexp option to CALSpectrumLoader Weka file loader to allow loading of only specific reference value(s).

Additions

  • Added ConditionalSequence control actor, the conditional version of the default Sequence actor.
  • adams-imaging: With the updated version of LIRE, additional feature generators are now available:
    • JointHistogram
    • LocalBinaryPatternsAndOpponent
    • RankAndOpponent
    • SimpleCentrist
    • SpatialPyramidAutoColorCorrelogram
    • SpatialPyramidCEDD
    • SpatialPyramidCentrist
    • SpatialPyramidFCTH
    • SpatialPyramidJCD
    • SpatialPyramidLocalBinaryPatterns
  • adams-dl4j:
    • Added DL4JModelParamsToSpreadSheet conversion for extracting the parameters.
    • Added DL4JModelParamsToSpreadString conversion for extracting the parameters as simple string.
    • Added ImageScaler dataset preprocessor.
    • Added DL4JCrossValidationSplit transformer to generate sequence of train/test set containers.
    • The DL4JCrossValidationEvaluator transformer performs cross-validation on a referenced model using the incoming dataset.
    • The SpreadSheetRecordReaderConfigurator allows reading any spreadsheet that ADAMS can read. However, textual cells get converted to NULLs and date/time ones to their Java epoch equivalent.
    • The DL4JDatasetAppend transformer combines multiple datasets into a single dataset, one after the other.
  • adams-rats:
    • Added the Storage and Variable rat inputs, for getting access to the specified storage item/variable.
  • adams-spreadsheet:
    • Added SpreadSheetToNumeric conversion for turning non-numeric cells in a spreadsheet into numeric ones.
    • Added Unique values column action to the SpreadSheetTable column popup menu to display the unique values of the selected column.
  • adams-spectral-2dim:
    • Added condition for checking whether a spectrum is already in the database: HasSpectrum.
    • Spectra are now rendered in the Breakpoint and can be exported as well.
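
For readers unfamiliar with cross-validation, the sequence of train/test sets that a transformer like DL4JCrossValidationSplit generates can be sketched as follows. This is plain Java working on row indices only, not the deeplearning4j or ADAMS API; the class CrossValidationSplitSketch, its Fold container and the seeded shuffle are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of ADAMS: splits row indices into k folds.
class CrossValidationSplitSketch {

  /** Holds the row indices of the train and test set of a single fold. */
  static final class Fold {
    final List<Integer> train = new ArrayList<>();
    final List<Integer> test = new ArrayList<>();
  }

  /** Generates numFolds train/test pairs from randomized row indices. */
  static List<Fold> split(int numRows, int numFolds, long seed) {
    if (numFolds < 2 || numFolds > numRows)
      throw new IllegalArgumentException("Folds must be >= 2 and <= number of rows!");

    // randomize the row order, reproducibly via the seed
    List<Integer> indices = new ArrayList<>();
    for (int i = 0; i < numRows; i++)
      indices.add(i);
    Collections.shuffle(indices, new Random(seed));

    List<Fold> folds = new ArrayList<>();
    for (int f = 0; f < numFolds; f++) {
      Fold fold = new Fold();
      for (int pos = 0; pos < numRows; pos++) {
        // every numFolds-th position goes into the test set of fold f
        if (pos % numFolds == f)
          fold.test.add(indices.get(pos));
        else
          fold.train.add(indices.get(pos));
      }
      folds.add(fold);
    }
    return folds;
  }
}
```

Each row ends up in exactly one test set, so evaluating a model on all folds covers the complete dataset once.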