Updates 2021/12/02

The lead up to the Weka Conference on November 26th (https://events.waikato.ac.nz/events/2021-international-weka-user-conference/) resulted in a lot of minor bug fixes and improvements. I've also started retiring some of the modules that are no longer being used/useful.

Fixes

  • Fixed issue of the Flow editor locking up (occasionally) when debugging flows; the cause was a race condition of two debugging control panels getting generated, with the incorrect one replacing the other one.

  • Fixed duplicate class cache initialization, resulting in faster start-up time.

  • Position and size of applications launched from main menu now get properly recorded when closing the applications.

  • Reduced startup time of applications that use file choosers, avoiding unnecessary clipboard lookups.

  • adams-imaging:

    • The object annotator now uses the correct zoom when initially displaying the image.

    • Image segmentation annotation no longer has the darkening bug when adding layers.

  • adams-weka and adams-weka-lts:

    • Convert to Date/Nominal/String in the Weka Investigator's Preprocess panel now works with numeric attribute names as well.

    • The SimpleArffSaver and SpreadSheetSaver converters no longer use resetOptions() to initialize their options with default values.

    • The WekaCrossValidationEvaluator now stops execution properly and cleans up without causing NullPointerException.

Changes

  • moved discontinued modules into adams-discontinued repo (https://github.com/waikato-datamining/adams-discontinued):

    • adams-cntk

    • adams-cntk-weka

    • adams-image-webservice

    • adams-imaging-imagemagick

    • adams-imaging-openimaj

    • adams-jooq

    • adams-meka-webservice

    • adams-mongodb

    • adams-weka-nd4j

    • adams-weka-nd4j-lts

    • adams-weka-webservice

  • The GenericObjectEditor now displays the help within the same dialog/frame to avoid help window disappearing behind other windows when opened from a modal dialog/frame. The drop-down button/menu next to the classname now has a Use default menu which reverts the current settings back to the default ones.

  • adams-webservice: upgraded CXF to 3.4.5, xercesimpl to 2.12.1, jaxb to 3.0.2

  • adams-bootstrapp: upgraded bootstrapp to 0.1.11

  • adams-net: upgraded tika-core to 1.27, requests4j to 0.2.2

  • adams-moa: upgraded MOA to 2021.07.0

  • adams-core:

    • upgraded jclasslocator to 0.0.19

    • jfilechooser-bookmarks updated to 0.1.8 for better clipboard handling

    • removed DocBookProducer for generating DocBook XML from setups (not feasible for large flows)

  • adams-weka and adams-weka-lts: The WekaPredictionsToInstances and WekaPredictionsToSpreadSheet transformers now handle attribute names in the test attributes option.

  • adams-redis and adams-rats-redis: switched to the lettuce.io Redis Java client

Additions

  • adams-weka and adams-weka-lts:

    • The LeaveOneOutByValueGenerator split generator generates pairs using the unique values from a specified attribute, with each unique value being in the test set and the remainder in the training set.

    • The MultiRowProcessor instance filter identifies rows with a row selection scheme and then processes them with a row processing scheme, e.g., for averaging multiple scans of the sample.
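
As a rough illustration of the leave-one-out-by-value idea described above, here is a minimal Python sketch. It is not the ADAMS implementation; the function name, the dictionary-based rows and the "batch" attribute are made up for this example.

```python
from collections import defaultdict


def leave_one_out_by_value(rows, attribute):
    """Yield (value, train, test) triples, one per unique value of `attribute`.

    Hypothetical helper: rows with the current value form the test set,
    all remaining rows form the training set.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    for value, test in groups.items():
        train = [r for v, g in groups.items() if v != value for r in g]
        yield value, train, test


# made-up example data: "batch" plays the role of the specified attribute
rows = [
    {"batch": "A", "x": 1.0},
    {"batch": "A", "x": 1.5},
    {"batch": "B", "x": 2.0},
    {"batch": "C", "x": 3.0},
]
splits = list(leave_one_out_by_value(rows, "batch"))
for value, train, test in splits:
    print(value, len(train), len(test))
```

With three unique values in the attribute, three train/test pairs are generated.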

Updates 2021/10/15

This time, it took a bit longer to compile this list of updates. First the 2-year-old SSD in my laptop died and then parts of NZ went in and out of COVID-19 lockdown several times. And not to forget, I've been busy working remotely on a bunch of projects, with quite a bit more Python/Docker these days. As an aside, I laid the foundations for a Python-based and ADAMS-inspired workflow engine called shallowflow (https://github.com/waikato-datamining/shallowflow). But that is still at a very early stage and without a user interface for the time being.

Fixes

  • adams-spreadsheet:

    • The FixedTabularSpreadSheetReader no longer skips the first line.

    • Formulas now work with Long values as well, not just Double values.

Changes

  • ArraySubset is now superseded by the more flexible ArraySubsetGeneration transformer.

  • The Preview browser now has support for favorite content handlers, which allows one to quickly switch between different views for the same file.

  • adams-meka: Upgraded Meka to 3.9.5

  • adams-r: Upgraded RSyntaxArea to 3.1.2

  • adams-pdf: Upgraded pdfbox to 2.0.24

  • adams-jython: Upgraded Jython to 2.7.2

  • adams-rsync: Upgraded rsync4j to 3.2.3-7

  • adams-weka and adams-weka-lts: Upgraded xgboost4j to xgboost4j_2.12/1.4.1

  • adams-spectral-2dim-core: KennardStone now has an invert flag

  • adams-weka and adams-weka-lts:

    • KennardStone unsupervised instance filter now has an invert flag

    • The SavitzkyGolay and SavitzkyGolay2 filters now have a -keep-attribute-names flag.

    • When selecting multiple attributes in the Weka Investigator's Preprocess panel, the Attribute summary and visualization sections now show the data for each selected attribute, not just one.

    • Entries in the history panels of the Weka Investigator can be renamed now.

  • The obsolete modules adams-cntk, adams-cntk-weka and adams-cntk-weka-lts have been removed.

  • adams-spreadsheet: added more string functions to the LookUpUpdate parser: substr, left, mid, right, rept, concatenate, lower[case], upper[case], trim, matches, len[gth], find, contains, replace, replaceall, substitute, str

Additions

  • Added a generic framework for generating indexed split runs from datasets, with concrete implementations for data structures (spreadsheets, Weka Instances). This framework generates splits from datasets, like cross-validation or grouped cross-validation, but only stores the row indices, not the actual data. That way, model performances from different frameworks can be compared directly.

  • Added MultiMapOperation transformer, which allows operations such as: CommonKeys and Merge.

  • Added the ArraySubsetGeneration transformer which has its own plugin hierarchy.

  • Added the adams-db-mysql8 module for accessing MySQL 8 servers

  • adams-compress: Added boolean conditions for byte arrays to check for compression formats: IsBzip2Compressed, IsGzipCompressed, IsRarCompressed, IsXzCompressed, IsZipCompressed, IsZstCompressed.

  • adams-spreadsheet:

    • Added column finder ByExactName.

  • adams-weka and adams-weka-lts:

    • Added Detrend unsupervised attribute filter (Mean, RangeBased).

    • Added the AverageSilhouetteCoefficient and MultiClustererPostProcessor clusterer post-processors.

    • Added attribute finder ByExactName.

    • Added unsupervised attribute filter MultiplicativeScatterCorrection.

    • The Weka Investigator can now generate and apply indexed splits.

    • The WekaCrossValidationEvaluator transformer now cleans up internal data structures after finishing to conserve memory.

  • adams-ml: Added the MeanAbsoluteError summary statistic plugin.

  • adams-spectral-2dim-core:

    • added Detrend spectrum filter (Mean, RangeBased).

    • added the SpectrumPaintletNumericField paintlet that uses a numeric field in the report that stores the color index of a color gradient generator.

    • added the SpectrumPaintletStringField paintlet that uses string extracted via regexp from a string field to determine color from color provider.

    • added the UnscramblerSpectrumReader for Unscrambler files.

    • The Evaluator transformer now has an option to discard the evaluator instance after training (e.g., when only wanting to output a trained instance).

  • adams-r: added better support for R using Renjin (https://renjin.org/).

  • adams-redis: new module with some basic support for the Redis in-memory database.

  • adams-rats-redis: new module with some basic support for the Redis in-memory database.

  • adams-imaging:

  • adams-tensorflow: Added the plugin TfliteModelMakerCSV for generating file-based datasets.
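
To make the indexed splits idea from the list above more concrete, here is a minimal Python sketch of a cross-validation that only produces row indices. It is an assumption-laden illustration, not the ADAMS framework itself; the function name, the seed handling and the dictionary layout of a split are invented for this example.

```python
import random


def indexed_cv_splits(num_rows, folds, seed=1):
    """Return a list of {"train": [...], "test": [...]} row index splits.

    Hypothetical helper: only indices are generated, never the data itself,
    so the same splits can be applied in any framework.
    """
    indices = list(range(num_rows))
    random.Random(seed).shuffle(indices)
    splits = []
    for fold in range(folds):
        # every folds-th shuffled index goes into the test set of this fold
        test = sorted(indices[fold::folds])
        test_set = set(test)
        train = [i for i in range(num_rows) if i not in test_set]
        splits.append({"train": train, "test": test})
    return splits


splits = indexed_cv_splits(num_rows=10, folds=3)
for split in splits:
    print(split["train"], split["test"])
```

Because the splits are just lists of indices, they can be stored once and applied to the same dataset in different tools, which is what makes the resulting model performances directly comparable.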

Updates 2021/05/28

A bunch of changes and updates happened since last time in relation to spreadsheet handling (general and spectral data related). One thing that I've been trying to sort out was creating screenshots of tables. You can now use Send to -> Export table as image on pretty much any native ADAMS table to create a screenshot of a table.

Fixes

  • The ListVariableUsage actor processor now also inspects VariableNameValuePair (and arrays) and BaseObject arrays for variables.

Changes

  • The search panel received an overhaul, including an added clear button. The Preview browser now performs incremental search for a more efficient search.

  • The spreadsheet handlers in the Preview browser now allow the user to select a custom cell rendering, e.g., for highlighting certain values.

  • Most tables should now offer Send to capability and the new Export table as image action to export the current table as an image in various formats. With the Copy table as image action, the table gets copied to the clipboard as an image.

  • The SpreadSheetDisplay sink can now be used in conjunction with the CallableActorScreenshot actor.

  • The Preview browser supports export of tables as images now as well.

  • commons-io/commons-io upgraded to 2.7

  • adams-imaging:

    • The ObjectLocationsFromReport operation for the Draw transformer now allows varying the shape color as well, just like in the Preview browser.

    • Object location overlays can now fall back on the bounding box if the polygon is too small (e.g., from a bad mask). Also applies to the Preview browser and Draw transformer plugin.

  • adams-matlab: improved handling of character cells in the MatlabArrayToSpreadSheet conversion.

  • adams-net: The HttpRequest transformer now has an option for defining the mime-type of the data; the default is application/octet-stream.

  • adams-weka and adams-weka-lts: The WekaInstancesDisplay sink can now be used in conjunction with the CallableActorScreenshot actor.

  • adams-spreadsheet: The SpreadSheetInsertRow transformer can interpret the value now as blank or comma-separated values as well. The value(s) can now be forced to be STRING as well.

  • adams-spectral-2dim-core: the JSON format for spectra has been overhauled, using separate arrays for wave numbers and amplitudes now to reduce file size. Also, either the full report or specific reference+meta-data values can be written/read.
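
Why separate arrays reduce the file size of JSON spectra can be shown with a small Python sketch. The key names ("wave", "ampl", "waves", "amplitudes") and the per-point layout used for comparison are assumptions made for this example only; they are not taken from the actual ADAMS file format.

```python
import json

# made-up spectrum with 100 data points
waves = [float(w) for w in range(1000, 1100)]
amps = [round(0.001 * i, 3) for i in range(len(waves))]

# assumed layout A: one JSON object per data point, keys repeated every time
per_point = json.dumps([{"wave": w, "ampl": a} for w, a in zip(waves, amps)])

# assumed layout B: two parallel arrays, keys written only once
separate = json.dumps({"waves": waves, "amplitudes": amps})

print(len(per_point), len(separate))
```

The parallel-array layout avoids repeating the keys for every data point, so the serialized string is shorter for the same data.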

Additions

  • The HeatmapCellRenderingCustomizer cell rendering customizer for spreadsheet tables allows one to color in the background of numeric cells based on the global min/max and the specified color generator. Makes it easier to spot extreme values.

  • The HasElements boolean flow condition checks whether the incoming object is an array and has at least the specified number of elements.

  • The HasSize boolean flow condition checks whether the incoming collection has at least the specified size.

  • The ReportArrayToMap conversion turns a report array into a map using the specified report field as key.

  • The MergeReportFromMap transformer allows the reports passing through (either Report or MutableReportHandler objects) to be merged with reports available from a map in storage. This way, meta-data can be read separately and stored in a map in storage and then easily attached.

  • adams-imaging:

    • Added general support for point annotations, which can be used for pose estimation.

    • Added support for reading/writing DeepLabCut CSV annotation files

  • adams-spectral-2dim-core:

    • The Flatliner outlier detector picks up on spectra that only consist of one value.

    • The SpreadSheetColumnsToSpectra conversion generates multiple spectra from a spreadsheet, more flexible than SpreadSheetColumnToSpectrum. The SpreadSheetRowsToSpectra conversion works similarly, but row-wise instead of column-wise.

    • The SpreadSheetColumnsToSampleData conversion generates sample data objects (= reports) from columns from a spreadsheet. The SpreadSheetRowsToSampleData conversion does the same using rows from a spreadsheet.

    • The SampleDataArrayToMap conversion generates a map from the array, using the sample ID as key for the map.

    • The SpreadSheetRowGenerator transformer allows you to generate spreadsheet rows from spectra using various generators.

  • adams-heatmap: Added the MultiHeatmapOperation transformer, which allows applying an operation to an array of heatmaps.

  • adams-spreadsheet:

    • The SpreadSheetUseRowAsHeader conversion replaces the values in the header row with the ones from the specified data row.

    • The SpreadSheetColumnsToReport and SpreadSheetRowsToReport conversions convert columns/rows in a spreadsheet containing meta-data into Report objects.
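
The heatmap-style cell coloring mentioned at the top of this list boils down to scaling each numeric cell against the global minimum and maximum. The following Python sketch shows that idea under simplifying assumptions: the function name is hypothetical, and a color is represented only by its index into a gradient with a fixed number of colors.

```python
def gradient_index(value, vmin, vmax, num_colors):
    """Map a value to a color index in [0, num_colors - 1] (hypothetical helper)."""
    if vmax == vmin:
        return 0
    fraction = (value - vmin) / (vmax - vmin)
    # the maximum itself would land on num_colors, so clamp it to the last color
    return min(int(fraction * num_colors), num_colors - 1)


# made-up table with one non-numeric cell, which is left uncolored (None)
table = [[1.0, 5.0], ["n/a", 9.0]]
numeric = [c for row in table for c in row if isinstance(c, (int, float))]
vmin, vmax = min(numeric), max(numeric)
colors = [
    [gradient_index(c, vmin, vmax, 5) if isinstance(c, (int, float)) else None
     for c in row]
    for row in table
]
print(colors)
```

Cells near the global minimum or maximum end up at the two ends of the gradient, which is what makes extreme values stand out.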

Updates 2021/04/12

More work was done on the adams-matlab module and on image annotation support.

Fixes

  • The SelectArraySubset transformer now updates its JList correctly when getting subsequently called, not just the first time round.

  • Very long cell values in tables (> 1000 characters) now get excluded from calculating the optimal column width to avoid slow calculation and extremely wide tables.

  • adams-meta: The InactiveStandalone/Source/Transformer/Sink actors now copy the actors provided to the constructors rather than just using the reference.

  • adams-spreadsheet: Formulas in spreadsheets now resolve values of cell references to their native value, not just trying to convert it to double.

  • adams-weka and adams-weka-lts:

    • The WekaAttributeSelection transformer no longer puts the ranked order of attributes back into sorted order.

    • Changing the attribute weights in an Instances table now gets applied to the correct column when displaying the instance weights column.

    • Exporting the output to PDF no longer duplicates the content of text files (due to two exporters available).

    • The Data generator source in the Weka Investigator works again.

    • String and relational values no longer get rendered as HTML as this turned out to be too slow when displaying a table.
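
The column width fix for very long cell values can be pictured with the Python sketch below. It measures widths in characters rather than pixels and uses a hypothetical function name, so it only mirrors the idea (skip values above the 1000 character threshold), not the actual table code.

```python
MAX_CHARS = 1000  # threshold mentioned in the fix above


def optimal_column_width(values, header, max_chars=MAX_CHARS):
    """Return the widest cell (in characters), ignoring overly long values."""
    width = len(header)
    for value in values:
        text = str(value)
        if len(text) > max_chars:
            # too long: excluded from the calculation
            continue
        width = max(width, len(text))
    return width


print(optimal_column_width(["abc", "x" * 5000], "id"))
```

The 5000 character value is skipped, so the column is sized according to the remaining cells and the header.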

Changes

  • adams-weka: upgraded multisearch-weka-package to 2021.2.17

  • adams-spreadsheet: deprecated SpreadSheetCommonIDs, SpreadSheetDifference in favor of MultiSpreadSheetOperation

  • adams-timeseries: upgraded timeseriesForecasting Weka package to 1.1.27

  • adams-matlab: the Mat5SpreadSheetReader can now retrieve arrays from struct objects as well

  • adams-weka and adams-weka-lts:

    • Attribute weights now get displayed in the header row when showing weights.

    • The Revert action in the Weka Investigator now monitors files for changes (last modified timestamp) and gets enabled when either the file on disk got changed or it has been modified in the Investigator itself.

    • The SimpleArffLoader and SimpleArffSaver now allow the user to specify the file encoding.

Additions

  • The MapToStorageValues transformer transfers a map object into internal storage.

  • adams-imaging:

    • The ConfusionMatrix plugin for the ImageSegmentationContainerOperation transformer generates a confusion matrix of the annotated pixels versus the predicted ones, and the CountPixels one simply outputs a spreadsheet with counts per layer (of non-black pixels).

    • The IndexedPNGImageHandler for the Preview browser allows overriding the colors of PNG files that use an indexed color palette, e.g., when output by image segmentation models.

    • The BlueChannelImageHandler (using the new BlueChannelColorizer image transformer under the hood) interprets the values in the blue channel of an image as color indices and applies the colors from the supplied color provider accordingly.

  • adams-spreadsheet: added the MultiSpreadSheetOperation transformer which applies the specified operation to the incoming spreadsheet array (CommonIDs, Sum, Difference, Merge).

  • The HasProperty boolean condition can be used to check whether objects have a certain property name, e.g., when updating object values via the UpdateProperty transformer.

  • adams-matlab:

    • The MatlabStructToMap conversion turns a Matlab struct object into a map (field/array relation).

    • The IsMatlabStruct boolean condition checks whether the token represents a Matlab struct object.

    • The MatlabStructInfo transformer allows outputting information on a Matlab struct object.

    • With the flow adams-matlab-inspect_file.flow you can inspect .mat files interactively.

  • adams-weka and adams-weka-lts: The Text directory source in the Weka Investigator allows the loading of directories with plain text documents as datasets.
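
A confusion matrix of annotated versus predicted pixels, as generated by the ConfusionMatrix plugin listed above, can be sketched in a few lines of Python. This is only an illustration of the concept: the function name is hypothetical, and the flat lists of per-pixel layer labels stand in for real segmentation layers.

```python
from collections import Counter


def pixel_confusion_matrix(annotated, predicted):
    """Count (annotated label, predicted label) pairs, one pair per pixel."""
    if len(annotated) != len(predicted):
        raise ValueError("Both images must have the same number of pixels")
    return Counter(zip(annotated, predicted))


# made-up per-pixel labels for a tiny 5-pixel image
annotated = ["bg", "bg", "leaf", "leaf", "stem"]
predicted = ["bg", "leaf", "leaf", "leaf", "bg"]
matrix = pixel_confusion_matrix(annotated, predicted)
for (actual, guess), count in sorted(matrix.items()):
    print(actual, guess, count)
```

Pairs with identical labels are the correctly predicted pixels; all other pairs show which layers get confused with each other.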

Updates 2021/01/28

Since I'll be out of the office the next few weeks, just a quick update on what has happened since the last post:

Fixes

  • Restricted stopping is now working as expected in conjunction with LocalScopeTransformer and LocalScopeTrigger.

  • adams-event: The (internal) schedulers are now per flow rather than global, i.e., the same flow can be run in parallel and stopped without affecting the other instance.

Changes

  • Switched to using Java 11 for builds, but source and target version are still Java 8 for the time being.

  • The LineByLineTextReader now has a -max-lines option to limit the number of lines read.

  • adams-webservice: upgraded Apache CXF to 3.3.9

  • adams-excel: upgraded Apache POI to 3.17

  • adams-event: updated quartz-scheduler to 2.3.2

  • adams-imaging: The ObjectCentersOverlayFromReport, ObjectLocationsOverlayFromReport and ObjectLocationsOverlayFromSpreadSheet overlays (and corresponding plugins for the Preview browser) now allow varying the color of the shape as well. This is useful when displaying mostly one type of object in the image.

Additions

  • The ZipArrays source actor allows the row-based iteration through arrays of the same length that are available in storage (like Python's zip operation).

  • adams-weka and adams-weka-lts: added FFT filter (unsupervised, attribute).

  • Added adams-matlab module for reading/writing binary Matlab .mat files.

  • adams-imaging:

    • The ApplyMask multi-image operation applies an image mask to another image (pixels in image get set to black if mask pixel is black).

    • The ObjectAnnotationsMask BufferedImage transformer applies the object annotations in an image's report to the image and sets all other pixels to