Updates 2018/01/12

I hope everyone had a great start to the new year. It's been a busy start here as there were some major changes in terms of Weka support within ADAMS.

The patched versions of Weka that ADAMS uses are now available as separate branches from the following GitHub repository:

waikato-datamining/weka

The adams-weka module has been cloned into adams-weka-stable, which will continue to use our patched version of Weka 3.9.0. adams-weka will use more up-to-date versions of the developer version of Weka from now on, starting with a patched version of 3.9.2.

The reason for this split was to be able to offer a stable Weka version to our commercial clients (not much fun if your serialized models are not compatible with a newer version and you have to rebuild all your models), but regular ADAMS users can still enjoy using the latest version of Weka. The spectral frameworks will continue to use the 3.9.0 version.

Another change affects the startup scripts for ADAMS. To avoid collisions with users' local installations of Weka and their respective packages, the Launcher class used by the startup scripts now sets a custom WEKA_HOME environment variable, pointing to a directory below the project's home directory ($HOME/.adams or %USERPROFILE%\_adams), taking the Weka version into account. This also avoids problems with conflicting Weka packages between different versions of Weka when using different versions of ADAMS. This approach of separating Weka installations is something that I have used for my weka-virtualenv tool as well.
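
To illustrate the idea, here is a minimal Python sketch (not the actual Launcher code, which is Java). Only the project home locations and the WEKA_HOME variable come from the text above; the sub-directory name "wekafiles-<version>" and all function names are assumptions made for this example.

```python
import os
from pathlib import Path


def adams_home(environ=os.environ, windows=(os.name == "nt")):
    """Project home: .adams below HOME, or _adams below USERPROFILE on Windows."""
    if windows:
        return Path(environ["USERPROFILE"]) / "_adams"
    return Path(environ["HOME"]) / ".adams"


def weka_home(weka_version, environ=os.environ, windows=(os.name == "nt")):
    # Assumed layout: one directory per Weka version below the project home.
    # The name "wekafiles-<version>" is hypothetical.
    return adams_home(environ, windows) / f"wekafiles-{weka_version}"


def launch_environment(weka_version, environ=os.environ, windows=(os.name == "nt")):
    """Return a copy of the environment with WEKA_HOME overridden for a child process."""
    env = dict(environ)
    env["WEKA_HOME"] = str(weka_home(weka_version, environ, windows))
    return env
```

Because the Weka version is part of the path, two ADAMS installations that bundle different Weka versions end up with separate package directories, and neither touches the user's default Weka directory.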

Fixes

  • adams-weka:
    • Fixed making predictions with the PLS1 instances analysis algorithm, used by the PLS supervised filter.

Changes

  • Added support for LONG to IncVariable and IncStorageValue transformers.
  • It is now possible to define a dialog width for the EnterManyValues source actor.
  • adams-weka:
    • Upgraded multisearch-weka-package to 2017.10.1
    • Upgraded partialLeastSquares dependency to 1.0.5
  • adams-groovy: removed Weka (and related) dependencies.
  • adams-jython: removed Weka (and related) dependencies.
  • adams-rsync: upgraded rsync4j to 3.1.2-7

Additions

  • Added better support for collections with the NewCollection source and CollectionInsert transformer.
  • Created copy of adams-weka as adams-weka-stable, with adams-weka now using Weka 3.9.2.
  • Created copy of adams-cntk-weka as adams-cntk-weka-stable, using adams-weka-stable dependencies.

Updates 2017/12/20

Last update post before I get stuck into making a new release...

A lot has happened in the image processing space again. A neat new feature in spreadsheet and instances tables (Spreadsheet viewer, Weka Investigator) is the ability to set filter strings or regular expressions per table column. This is column-specific filtering, whereas the search box filters across the whole table.
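
The difference between the two kinds of filtering can be sketched in a few lines of Python. This is only an illustration, not the ADAMS implementation: the function names are made up, and the assumption that simple string filters are case-insensitive substring matches is mine.

```python
import re


def matches(cell, pattern, regexp=False):
    """Simple filters test for a case-insensitive substring; regexps use re.search."""
    text = "" if cell is None else str(cell)
    if regexp:
        return re.search(pattern, text) is not None
    return pattern.lower() in text.lower()


def filter_rows(rows, column_filters=None, search=None):
    """rows: list of dicts; column_filters: {column: (pattern, is_regexp)}."""
    column_filters = column_filters or {}
    result = []
    for row in rows:
        # Column filters: every filtered column must match its own pattern.
        if not all(matches(row.get(col), pat, rx)
                   for col, (pat, rx) in column_filters.items()):
            continue
        # Search box: at least one cell anywhere in the row must match.
        if search is not None and not any(matches(v, search) for v in row.values()):
            continue
        result.append(row)
    return result
```

With a search string, a row is kept as soon as any cell matches; with a column filter, only the named column is inspected.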

Fixes

  • The compact format for saving flows no longer stores the default actors alongside (in some cases).
  • The Preview browser now transfers the search terms as well when opening a new window.
  • Date-based chooser panels now set the current date value before displaying the dialog and automatically dispose of the dialog once closed.
  • adams-net: SimpleApacheSendEmail can handle attachments now.
  • adams-spreadsheet: In the LookUpUpdate rules, variables can contain "." now as well; fixed replacing of '' around variables (replaced [] instead).
  • adams-weka:
    • Fixed rendering of missing values when sorting the data in the data tab of the Weka Investigator.
    • The evaluation statistics Bias, RPD, R^2 and SDR now skip predictions with missing values.
  • adams-rats: The FileLister input now has a flag to turn on reporting of moving errors.
  • adams-video: The FixedIntervalBufferedImageSampler image sampler now handles the offset correctly when set to "-Inf".
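
As a rough sketch of what skipping missing predictions means for the statistics mentioned in the adams-weka fix above: pairs without a prediction are dropped before anything is computed. The formulas below are common textbook definitions and an assumption on my part (bias as mean of predicted minus actual, SDR as sample standard deviation of the residuals, RPD as standard deviation of the actual values divided by SDR); the ADAMS code may define them differently.

```python
import math
import statistics


def paired(actual, predicted):
    """Drop pairs whose prediction is missing (None or NaN)."""
    return [(a, p) for a, p in zip(actual, predicted)
            if p is not None and not math.isnan(p)]


def bias(actual, predicted):
    """Mean of (predicted - actual) over the non-missing pairs."""
    pairs = paired(actual, predicted)
    return sum(p - a for a, p in pairs) / len(pairs)


def sdr(actual, predicted):
    """Sample standard deviation of the residuals (actual - predicted)."""
    return statistics.stdev([a - p for a, p in paired(actual, predicted)])


def rpd(actual, predicted):
    """Standard deviation of the actual values divided by the SDR."""
    pairs = paired(actual, predicted)
    return statistics.stdev([a for a, _ in pairs]) / sdr(actual, predicted)
```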

Changes

  • Upgraded snake yaml to 1.19.
  • Menu items can have their category (ie the menu that they appear in) now reassigned via the Main.props file: Category-<classname>=<menu name>
  • Added the following programmatic variables to the flows: project_name, project_home, project_modules.
  • The Continue control actor got renamed to Block, which is a bit more intuitive in the flow environment (though it works like the continue control statement in Java).
  • Added placeholders for TODAY, TOMORROW, YESTERDAY (using the start of day) to BaseDate, BaseDateTime and BaseDateTimeMsec.
  • LocalScopeTransformer can now finish the processing of data before stopping (atomic execution).
  • The SequencePlotter plot popup menu now has a Hits submenu, which allows you to perform operations on the sequences that the mouse is hovering over.
  • The Flow editor has a new window operation for moving the current flow tab to a new editor window.
  • added MATH_EXPRESSION_ROUND enum to the SetVariable and SetManyVariables actors, which rounds the result to a long.
  • Any ADAMS panel with axes now allows copy/paste of the axis range (in the Range dialog), to speed up the transfer of settings.
  • adams-spreadsheet:
    • The LookUpUpdate parser no longer transfers back local variables (ie ones starting with an underscore).
    • Added "CONNECTION" meta-data type for outputting URL, database, user, password to the DatabaseMetaData source.
    • Added support for supplying a filter (simple string or regexp) per column.
  • adams-imaging:
    • GetImageObjectIndices is more flexible now, by using an object finder for locating the objects of interest.
    • The ImageObjectFilter now allows you to apply an object filter to the images that the object finder located.
    • ScaleReportObjects has been deprecated, instead you can use the ImageObjectFilter in conjunction with the Scale object filter.
    • The generators of the SubImages BufferedImage transformer now limit the objects in a sub-image to the ones that fall into that region.
    • ImageObjectInfo transformer now accepts LocatedObject objects as well (eg as output by GetImageObjects).
    • upgraded zxing to 3.3.1 (used for barcode detection/generation)
  • adams-rats: The Rat standalone can now finish the processing of data before stopping (atomic execution).
  • adams-spectral-2dim-core:
    • The Rebase spectrum filter can change the step size between wave numbers now as well.
    • The plot popup menu of the spectrum panel now has a Hits submenu, which allows you to perform operations on the spectra that the mouse is hovering over.
  • adams-weka: Added support for supplying a filter (simple string or regexp) per column to table displaying Instances.
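
The TODAY, TOMORROW and YESTERDAY placeholders listed above all resolve to the start of the respective day. A minimal Python sketch of that behavior, with a hypothetical function name and the assumption that the placeholders are matched case-insensitively:

```python
from datetime import datetime, time, timedelta


def resolve_placeholder(name, now=None):
    """Map TODAY/TOMORROW/YESTERDAY to 00:00:00 of the respective day."""
    now = now or datetime.now()
    start_of_today = datetime.combine(now.date(), time.min)
    offsets = {"YESTERDAY": -1, "TODAY": 0, "TOMORROW": 1}
    return start_of_today + timedelta(days=offsets[name.upper()])
```

Using the start of day means that the time of day at which a flow runs does not influence the resulting value.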

Additions

  • Added boolean condition for matching strings with a regular expression: RegExp.
  • adams-imaging:
    • added new class hierarchy for filtering objects stored in reports (e.g., annotations).
    • With the ChangeImageObjectPrefix transformer, you can replace the prefix of objects in the image's report. This allows combining annotations and predictions in one report, using separate overlays when displaying the image.
    • GetImageObjects outputs all the objects located by the specified finder.
    • GetImageObjects outputs objects of type LocatedObject identified by the specified finder scheme, instead of their indices like GetImageObjectIndices does. Allows for querying meta-data etc via the ImageObjectInfo transformer.
    • ImageObjectOverlap computes the overlap of objects between two reports.
    • Added the following operations for the Draw transformer: ObjectCentersFromReport, ObjectLocationsFromReport
    • The ImageObjectIndexOffset transformer changes the offset of the objects in the report by the specified amount. Useful when merging multiple reports.
    • MultiQRCode barcode decoder allows detection of multiple QR codes in a single image.
    • The DownSample BufferedImageTransformer is a simple scheme for downsizing images.
    • Added the erode/dilate morphology operations.
  • adams-spectral-2dim-core:
    • The PredictionCheck transformer allows the validation of the evaluations in an EvaluationContainer, e.g., checking whether they are within a predefined range.
    • spectrum filter for adding the output of a prebuilt Weka filter to the spectrum's report: WekaFilterToReport.

Stay tuned for the release!

Updates 2017/11/10

Currently, I'm working mainly on a commercial project that involves meta-flows, i.e., flows that use their parametrization to generate flows on the fly that do the actual work, aka worker-flows. Hence the many tweaks and changes based around this approach, e.g., making it easier to load models into memory and use them directly from there.

Fixes

  • The TryCatch control actor now works again as expected within Rat, LocalScopeTransformer and LocalScopeTrigger actors. It also now properly stops its try and catch branches.
  • Fixed rendering of arrays in the Preview browser: a renderer is now determined for each element individually, which is necessary for mixed object arrays. The PlainText handler no longer crashes if it cannot read a file, e.g., one with binary content.
  • The BaseDirectoryChooser dialog now correctly retrieves the currently selected directory for adding bookmarks. It also displays a text field with the currently selected directory, enabling fast copy/paste.

Changes

  • All help screens in the GUI have been centralized into a separate Help frame, which keeps a history of the screens.

  • Remote commands can be sent and received now in JSON as well, using the JsonProcessor instead of the DefaultProcessor.

  • The Serialize transformer and Deserialize sink now take advantage of the object writer/reader class hierarchies for more flexibility in output/input formats.

  • Updated processoutput4j to 0.0.6

  • The Command source actor now has a timeOut option, which kills the process once exceeded.

  • adams-event: added template support for cron schedules: adams/core/base/CronSchedule.props

  • adams-weka:

    • The attribute index is now displayed in the table header of the WekaInstancesDisplay sink and the object renderer at debugging time.
    • Added support for updating properties via variables to: WekaAssociatorSetup, WekaClassifierSetup, WekaClustererSetup, WekaDataGenerator, WekaFilter, WekaStreamFilter
    • simplified model loading, centralizing it in the adams.flow.core.AbstractModelLoader class, now offering loading from file, source actor and internal storage: WekaFilter, WekaClassifying, WekaClustering
    • The evaluations of classifiers in the Weka Investigator no longer require class values to be present in data sets other than the training set. This makes it possible to simply make predictions.
  • adams-moa: simplified model loading, centralizing it in the adams.flow.core.AbstractModelLoader class, now offering loading from file, source actor and internal storage: MOAClassifying, MOAClustering, MOARegressing

  • adams-meka: simplified model loading, centralizing it in the adams.flow.core.AbstractModelLoader class, now offering loading from file, source actor and internal storage: MekaClassifying

  • adams-spectral-2dim-core:

    • The Cleaner transformer now adds the actual cleaner to the output container as well.
    • The PostProcessor transformer can output a container now, which contains the input and output data, as well as the actual processor in use.
    • The spectrum's wave numbers can now be used as suffix instead of the amplitude indices (affecting only the SimpleInstanceGenerator).
    • renamed IntensityImageSpectrumWriter to SpectrumImageWriter, uses new class hierarchy for image generators: adams.data.spectrumimage.AbstractSpectrumImageGenerator
  • adams-rsync:

    • updated rsync4j to 3.1.2-5.
    • added maxTime option to RSync and SimpleRSync sources, which kills the rsync process once exceeded.
  • adams-cntk-weka: The CNTKSaver converter now turns a nominal class attribute into CNTK's 1-hot encoding format (aka unsupervised NominalToBinary).

  • adams-imaging: The ImageAnnotator transformer now allows the manual selection of objects via a selection processor.
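
For readers unfamiliar with the 1-hot encoding mentioned for the CNTKSaver above: each nominal label becomes a 0/1 vector with a single 1 at the label's position. The Python sketch below shows the encoding only; it makes no claim about how CNTKSaver lays the values out in the output file, and the function names are made up.

```python
def one_hot(label, labels):
    """Return a 0/1 vector with a single 1 at the position of `label`."""
    index = labels.index(label)
    return [1 if i == index else 0 for i in range(len(labels))]


def encode_class(values, labels):
    """Encode a whole column of nominal class values."""
    return [one_hot(v, labels) for v in values]
```

For a class attribute with the labels a, b and c, the value b is encoded as 0 1 0.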

Additions

  • Added reader/writer for objects implementing adams.core.SerializableObject
  • The object export in the debug's object tree or the Flow editor's debug view now allows you to export any serializable object as well.
  • Added OutlierDetector transformer to directly tap into detection messages returned by detection scheme.
  • Added the SetManyVariables standalone and transformer, which allow updating multiple variables at the same time.
  • The DeserializeToStorage standalone simplifies loading serialized models into storage.
  • adams-cntk:
    • Added handler for CNTK models to the Preview browser
  • adams-cntk-weka:
    • Added Weka loader for CNTK text files: CNTKLoader
    • Added Weka saver for CNTK text file format: CNTKSaver
    • Added classifier for building a CNTK model using a BrainScript file and using the final model for making predictions: functions.CNTKBrainscriptModel

Have a good weekend!

Updates 2017/10/13

It is the end of semester and I'm busy marking my students' work and working on some commercial projects. However, there was still time to fix and improve things. The modules that received the most attention are centered around Microsoft's deep learning framework CNTK (https://cntk.ai/).

Fixes

  • The LocalScopeTransformer now outputs the name of the last active actor in the error message instead of the first one's, in case the last active actor is not an actual transformer.
  • If you encounter a problem on MySQL when trying to create a table with a timestamp column that uses DEFAULT '0000-00-00 00:00:00', remove the NO_ZERO_DATE directive from the sql_mode option in your my.cnf/my.ini. E.g., use the following: sql_mode=IGNORE_SPACE,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_IN_DATE,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
  • adams-weka: The SpreadSheetSaver Weka converter no longer loses the configured writer (setOptions now calls the super method first, as it calls resetOptions).
  • adams-cntk-weka: The CNTKPrebuiltModel classifier now handles the flexible structure of CNTK regression models better.

Changes

  • Upgraded lanterna (terminal applications) to 3.0.0.
  • Bumped up CNTK to 2.2.
  • The Preview browser now uses simple file monitoring to update the display (using the LastModified monitor).
  • Spreadsheet writers are now stoppable (at least the ones that write the data incrementally).
  • Introduced StopRestrictor interface for better stopping of flow or parts of flow, affecting the Stop actor, interactive actors and LocalScopeTransformer/LocalScopeTrigger.
  • Added has(sym) operator that checks whether a certain symbol is present, affecting BooleanExpression, MathematicalExpression and ReportMathExpression.
  • The Revert menu item in the Flow editor now detects if a file was modified on disk.
  • The ProgressBar sink now has an optional title option for displaying a title string within the panel itself, which is useful when displaying multiple progress bars at the same time.

Additions

  • OptionProducer transformer for turning objects into option strings using the specified OptionProducer scheme.
  • The CloseCallableDisplay control actor allows closing of frames of callable actors.
  • Added the Command source for executing an external command: it is similar to the Exec source, but outputs stdout and/or stderr as continuous stream instead of waiting for the process to finish before outputting them.
  • adams-meta:
    • Added NewFlow for generating a flow using a template.
    • FlowFileReader reads a flow from disk and FlowFileWriter writes it to disk.
    • FlowDisplay simply displays the incoming flow as tree (like the flow editor).
  • adams-cntk:
    • Added a preference panel for CNTK, at this stage only for specifying the binary.
    • The CNTKSetup standalone allows overriding the global settings (ie binary) for CNTK within a flow.
    • With the CNTKBrainScriptExec flow it is possible to execute a brainscript and display the output. With the usual string processing facilities in ADAMS, you can also plot the performance of the network per epoch, e.g., plotting the RMSE.
    • Added reader for CNTK text files: CNTKSpreadSheetReader
    • Added CNTKModelReader and CNTKModelInfo for reading a model and outputting information about a model.
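
The difference between the new Command source and the Exec source described above (continuous stream versus waiting for the process to finish) can be sketched with Python's subprocess module. This is an analogy only; the function names are hypothetical and nothing here reflects the ADAMS or processoutput4j code.

```python
import subprocess
import sys


def stream_command(args):
    """Command-style: yield stdout lines as soon as the child process emits them."""
    with subprocess.Popen(args, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


def run_and_wait(args):
    """Exec-style: block until the process ends, then return all output at once."""
    return subprocess.run(args, capture_output=True, text=True).stdout.splitlines()


if __name__ == "__main__":
    # Example child process: the current Python interpreter printing two lines.
    for line in stream_command([sys.executable, "-c", "print('a'); print('b')"]):
        print("received:", line)
```

Streaming is what makes it practical to follow long-running processes, for instance plotting a value per epoch while a training run is still in progress.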

Have a great Friday 13th! ;-)

Move on GitHub

Apologies, the ADAMS repositories have moved again, but only from one GitHub organization to another. This was done mainly to align the repositories with our commercial activity. The organization account on GitHub is called waikato-datamining.

You can find the relocated repositories now here:

Don't worry, if you have already cloned the repository, you can simply update the URL in the .git/config file from:

url = https://github.com/Waikato/<repo>.git

to:

url = https://github.com/waikato-datamining/<repo>.git
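
If you have several clones to update, the edit can be scripted. The Python sketch below rewrites the url entries in the text of a .git/config file; the function name is made up and the repository name used in the usage example is only an illustration. For a single clone, running git remote set-url origin with the new URL achieves the same.

```python
import re

OLD_ORG = "Waikato"
NEW_ORG = "waikato-datamining"

# Matches "url = https://github.com/Waikato/<repo>.git" and captures the
# parts before and after the organization name.
_URL = re.compile(
    r"(url\s*=\s*https://github\.com/)" + re.escape(OLD_ORG) + r"(/[^\s/]+\.git)"
)


def update_remote_urls(config_text):
    """Point every matching url entry at the new GitHub organization."""
    return _URL.sub(r"\g<1>" + NEW_ORG + r"\g<2>", config_text)


if __name__ == "__main__":
    sample = '[remote "origin"]\n\turl = https://github.com/Waikato/adams-base.git\n'
    print(update_remote_urls(sample))
```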

OpenHub is back!

Thanks to the move to GitHub, the metrics on OpenHub are back!

https://www.openhub.net/p/theadamsflow

Over a year ago, OpenHub failed to update their copy of the source code via subversion. For some reason, they couldn't handle the anonymous access that was used by our in-house subversion server.

Anyhow, this is sorted now!

Updates 2017/09/15

The most notable change is that external actors now can have a file change monitor defined, which can trigger reloads of the external flow next time it gets executed. Useful for flows running in the cloud, which need to reload settings stored in external flows.

Fixes

  • The ImageViewer now clears selection listeners, left click listeners and image overlays when displaying an image.
  • Fixed suggestions for actor templates in the Flow editor. EndlessLoop template is now the default one.
  • The MultiPaintlet for sequences now lets the base paintlets set the correct stroke themselves.
  • adams-imaging: The MultiImageOverlay now allows selecting the base overlays again.
  • adams-spectral-2dim-handheld: The ScioLabExportSpectrumReader now handles the updated column names in the CSV export files (eg "spectrum_0 + 740" instead of "spectrum_740.0").
  • adams-webservice: The XMLLoggingInInterceptor now handles multi-part messages when pretty-printing the message itself.

Changes

  • Added a file change monitor to the external actors (ExternalStandalone/Source/Transformer/Sink), reloading the external flow at the next execution if the monitor triggered. The first execution of the actor is used for initializing the monitor.
  • Added on-the-fly flag for external actors using flows that get generated at runtime (eg by a script).
  • Added support for BUSINESSDAY in date/time expressions (BaseDate, BaseDateTime, BaseDateTimeMsec) and in ExtractDateTimeField conversion.
  • Most interactive actors now allow the selection of a callable actor to use as parent component (e.g., for relative location); it is also possible to use the enclosing window (frame/dialog) as parent component rather than the callable actor (e.g., when the actor is inside a GridView/TabView).
  • adams-imaging:
    • The Image processor visualization now applies the same flow to all the tabs. Added loading and saving of flow snippets. The zoom for all original/processed images can be selected from the menu.
  • adams-latex: The LatexSpreadSheetWriter can output the header now in bold face.
  • adams-rats:
    • The FileLister rat input can output files as array now as well.
    • The generator used by LabRat now returns an array of Rat actors.

Additions

  • Added simple, single file change monitors, accessible through the FileChanged transformer.
  • Added transformer for updating the 'last modified' timestamp of files: Touch
  • Added PromptUser actor template.
  • adams-spectral-2dim-core:
    • added spectrum reader for MPS XRF files.
    • simple spreadsheet writer that outputs spectra with the specified spreadsheet writer: SpreadSheetSpectrumWriter
    • added several preview handlers for spectra
  • adams-rats: added generic scripting support for rat generators for LabRat, via Scripted generator.
  • adams-spreadsheet: added SpreadSheetSelectSubset interactive actor that allows the user to select a subset from a spreadsheet.

Move to GitHub

Finally came across a migration guide (subversion to git), one that actually worked:

https://git-scm.com/book/en/v2/Git-and-Other-Systems-Migrating-to-Git

Main issue with the migration was the sub-division of the repository into base/addons/incubator/etc. However, I decided to skip really old code and just start at a revision after the sub-division.

You can find the repositories now here:

Contributing should be much easier now, so go on, fork the repos! :-)

Updates 2017/09/01

Been working on an image processing project again, so more work done in related modules. Added some common statistics to the Weka module.

Fixes

  • fixed Variables.extractNames(expr) (used by the CheckVariableUsage actor processor and the green tick in the Flow editor's toolbar): no longer ends up in endless loop if invalid variable names are used in a string expression
  • The filter field of the BaseFileChooser now only works on files, not directories
  • The image overlays, left-click listeners and selection listeners in the ImageViewer sink now get cleared whenever a token arrives (otherwise they get multiplied when using the Inspect control actor).
  • The PromptUser boolean flow condition now expands variables in the message string as well.
  • Removed the cleanUp call in the doExecute method of the AbstractFilter transformer, to avoid problems with trainable filters.
  • Improved unique name generation in Flow editor: when copying Actor (2) it will now generate Actor (3) instead of Actor (2) (2).
  • adams-ml: The ConfusionMatrix transformer now always generates a square matrix, taking all the labels (actual/predicted) into account.
  • adams-weka: The Weka package manager is working again, after backporting some changes to the modified Weka 3.9.0 version that is currently in use.
  • adams-imaging: MergeObjectLocations now correctly performs a merge if there aren't any objects in the current report.
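
The improved unique name generation from the list above (copying Actor (2) yields Actor (3) rather than Actor (2) (2)) boils down to stripping an existing numeric suffix before counting up. A minimal Python sketch of that idea, with a hypothetical function name and not taken from the Flow editor code:

```python
import re

# A trailing " (n)" suffix, e.g. "Actor (2)" -> base "Actor", counter 2.
_SUFFIX = re.compile(r"^(.*) \((\d+)\)$")


def unique_name(name, existing):
    """Return `name` if unused, else the base name with the next free ' (n)' suffix."""
    if name not in existing:
        return name
    match = _SUFFIX.match(name)
    base = match.group(1) if match else name
    counter = int(match.group(2)) + 1 if match else 2
    while f"{base} ({counter})" in existing:
        counter += 1
    return f"{base} ({counter})"
```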

Changes

  • Updated jclasslocator to make ADAMS work with Java 9 (for automatic class discovery).
  • Updated jeneric-cmdline.
  • adams-weka:
    • The Predictions table for classification and the Cluster assignments table for clustering in the Weka Investigator are now searchable.
    • The Preprocess tab in the Weka Investigator now automatically selects the last opened/filtered dataset.
    • The WekaFilter transformer now stores the input data in the WekaFilterContainer (when outputting containers) as well.
    • WekaCrossValidationEvaluator, WekaTrainTestSetEvaluator and WekaTestSetEvaluator now store the test data in the container as well (if possible).
    • WekaPredictionsToInstances and WekaPredictionsToSpreadSheet can now output the test data alongside the measures (if possible).
  • adams-cntk: now uses CNTK 2.1
  • adams-dl4j: upgraded deeplearning4j to 0.9.1
  • adams-imaging: The ImageReader transformer now can load the meta-data directly (optional).

Additions

  • Added support to the adams.flow.FlowRunner class for installing custom JVM shutdown hooks, like executing a remote command.
  • adams-imaging: added channels splitters for HSV, YUV and YIQ color models, acting as BufferedImageTransformer plugins: SplitChannelsHSV, SplitChannelsYUV, SplitChannelsYIQ.
  • adams-weka:
    • Added actor for nearest neighbor search: WekaNearestNeighborSearch.
    • Added SDR (Standard Deviation of Residuals) statistic to evaluation output.
    • Added RPD (Ratio of Performance to Deviation) statistic to evaluation output.
    • Added RowSum filter for replacing all attributes (except class) with sum of numeric attributes in a row.
  • adams-spectral-2dim-core:
    • reader for Nicolet SPA spectral data files: SPASpectrumReader.
    • trainable batch spectrum filter for multiplicative scatter correction (MSC): MultiplicativeScatterCorrection
    • The ApplyMultiplicativeScatterCorrection amplitude transform scheme allows the application of MSC using slopes and intercepts stored in the report.
    • added SegmentedDownSample Weka filter in conjunction with its SegmentedDownSampleNthPoints Hermione handler.
    • added spectrum writer outputting spectra as images, using the amplitude to determine color of pixels in image: IntensityImageSpectrumWriter
  • adams-cntk: added CNTKModelGenerator source for outputting CNTK model blocks using the specified model generator.
  • adams-basic-app: stripped down version sporting CSV spreadsheet support and Groovy scripting.
  • adams-imaging:
    • added CountObjectsInRegion transformer for counting objects in report that fall within the defined region, e.g., when processing annotated objects in images.
    • added ObjectLocationsFromReport preview to the Preview browser, which displays an image with an object locations overlay, using locations obtained from a report (simple format) with the same name as the image.
    • added ImageObjectFilter which utilizes the new object finder class hierarchy for filtering objects in the report attached to an image.
    • added conversions for handling rectangles: StringToRectangle, RectangleToString, RectangleCenter
    • added feature generator that uses an image transformer as filter before applying the base generator: FilteredBufferedImageFeatureGenerator
    • added convenience actors for handling image objects: GetImageObjectIndices, ImageObjectInfo, RemoveImageObject.
    • added image overlay for displaying object locations as circle/ellipse: ObjectCentersOverlayFromReport.
    • added preview to Preview browser using ObjectCentersOverlayFromReport: ObjectCentersFromReport.
  • adams-imaging-boofcv:
    • added feature generator that uses an image transformer as filter before applying the base generator: FilteredBufferedImageFeatureGenerator
  • adams-imaging-imagej:
    • added feature generator that uses an image transformer as filter before applying the base generator: FilteredBufferedImageFeatureGenerator
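
For those who have not come across multiplicative scatter correction (MSC), mentioned in the adams-spectral-2dim-core additions above: in its textbook form, each spectrum is regressed against a reference spectrum (commonly the mean spectrum), and the fitted intercept and slope are then used to correct the amplitudes. The Python sketch below follows that textbook formulation; it is an assumption that the ADAMS filter works the same way, and all function names are made up.

```python
def mean_spectrum(spectra):
    """Point-wise mean over spectra of equal length, used as reference."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def fit_line(reference, amplitudes):
    """Least-squares fit: amplitudes ~ intercept + slope * reference."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_a = sum(amplitudes) / n
    sxy = sum((r - mean_r) * (a - mean_a) for r, a in zip(reference, amplitudes))
    sxx = sum((r - mean_r) ** 2 for r in reference)
    slope = sxy / sxx
    return mean_a - slope * mean_r, slope


def apply_msc(amplitudes, intercept, slope):
    """Correct a spectrum using a previously determined intercept and slope."""
    return [(a - intercept) / slope for a in amplitudes]


def msc(spectra):
    """Fit and correct every spectrum against the mean spectrum."""
    reference = mean_spectrum(spectra)
    return [apply_msc(s, *fit_line(reference, s)) for s in spectra]
```

Keeping the fitting and the application separate mirrors the split between a trainable filter and a transform that applies stored slopes and intercepts.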