updates from a busy week

It's been a busy week and debugging support has seen quite a few improvements.

Debugging

  • PathBreakpoint now allows you to select the actor through a dialog rather than having to enter the path manually

  • due to some API changes to input consumers, it is now possible to inspect the current token when stepping through the flow manually, rather than only when reaching a Breakpoint control actor

  • instead of only outputting a textual representation of tokens (inspection panel), a class hierarchy for custom renderers has been implemented: plain text (used so far; fallback), images, spreadsheets, reports, Weka Instance(s), timeseries

  • these rendering classes are now also used when previewing items from the internal storage ("storage" tab)

  • started work on adding ability to edit items in internal storage; at this stage primitives and arrays of primitives are supported, as well as objects that can be manipulated through the GenericObjectEditor (more will come)

Here is the summary of the remaining updates:

Fixes

  • The vertical marker paintlet used by the SequencePlotter now uses the actual minimum/maximum of the canvas, not just the one derived from the data

Changes

  • control chart plots can now be initialized with a container from the ControlChart transformer and subsequently display data using the last known limits; furthermore, added support for post-processing, e.g., limiting the number of data points visible at any given time

  • spreadsheets are now editable in the viewer; the SpreadSheetDisplay sink offers the new "readOnly" option (default = readonly)

Additions

  • OpenFile sink opens the incoming file with the application associated with it as determined by the operating system

  • MissionControl control actor is a minimalistic interface for pausing/resuming/stopping a flow; useful when starting a flow from the command line using the FlowRunner class instead of from the GUI

  • InputOutputListener control actor allows you to listen in on the tokens going into and coming out of a sub-flow by forwarding them to callable actors for further processing

Thanks for the feedback regarding the debugging functionality and have a good weekend!

updates 5/6

It's been a short, but busy week - Monday was Queen's Birthday holiday. Not only lots of data mining related work, but also lots of changes happening to ADAMS. The changes mainly improved usability in the Flow editor.

Fixes

  • MetaPartitionedMultiFilter now handles duplicate regular expressions, e.g., when filtering the same subset of attributes with several different filters

Changes

  • The notification area in the flow editor (e.g., displaying errors) now offers a "Jump to" menu item in its right-click popup menu. It lists all the actor names that were mentioned in the message, allowing you to quickly get to the actor that generated the error.

  • CopyFile/MoveFile now support the option for multiple attempts, in case the I/O operation fails (useful on Windows with its file locking)

  • adams-spreadsheet: the LookUpAdd transformer now accepts spreadsheets as well for updating the look up table, not just pairs of objects
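
The idea behind the multiple attempts for CopyFile/MoveFile can be sketched in a few lines of plain Java. This is not the ADAMS implementation; the names RetrySketch, IoAction and runWithRetries are made up for this illustration, and the waiting time between attempts is an assumption:

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class RetrySketch {

  /** Hypothetical callback for an I/O operation that may fail, e.g., a file copy. */
  interface IoAction {
    void run() throws IOException;
  }

  /**
   * Runs the action up to maxAttempts times, sleeping waitMillis between
   * failed attempts. Returns the number of attempts that were needed.
   */
  static int runWithRetries(IoAction action, int maxAttempts, long waitMillis) {
    if (maxAttempts < 1)
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        action.run();
        return attempt;  // success, no further attempts necessary
      }
      catch (IOException e) {
        last = e;  // remember the failure, e.g., a locked file on Windows
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(waitMillis);
        }
        catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;  // stop retrying when interrupted
        }
      }
    }
    throw new UncheckedIOException("I/O operation failed after retrying", last);
  }
}
```

A copy with three attempts would then look something like `RetrySketch.runWithRetries(() -> Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING), 3, 500)`.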

Additions

  • Control charts are now available through the "Chart" functionality of the spreadsheet viewer tool (adams-spreadsheet module).

Breakpoints

The breakpoint/debugging support in the flow has been reworked extensively, in order to make it much more user-friendly and efficient. This was possible by combining and extending the functionality of the "Debug" flow execution listener, which offered similar functionality to the "Breakpoint" control actor. However, setting it up was a bit tedious.

The new Breakpoint actors no longer have a GUI at all; they simply configure the "Debug" flow execution listener in a completely automatic fashion. By moving the control panel from the Breakpoint actor into the Debug listener, the added benefit is now that you only have a single window for all your breakpoints. Large flows with multiple breakpoints quickly became hard to debug, with many breakpoints simply called "Breakpoint" (uhm.. which one was this again?). Furthermore, the actor that is currently suspended is now highlighted in the flow, so you can see where you are. A new functionality is that you can now step through a flow, one actor at a time. This allows you to quickly check how variables/storage change, without having to stop the flow and add a new breakpoint at another location.

Breakpoints can be added/removed/modified/disabled/enabled at runtime very easily, using the new "Breakpoints" tab in the control panel (see attachment breakpoints_controlpanel.png).

The line-wrap bug in the token inspection panel has finally been fixed. When right-clicking in the object tree, you can now either copy the string representation to the clipboard or export it to various file formats. At the moment, spreadsheets, images and Weka datasets are supported apart from plain text.

Another minor, but significant modification: I added a "Run (debug)" menu item to the flow editor, which allows you to step through a flow, actor by actor, without having to have any Breakpoint control actors present.

../../images/breakpoints_flow.thumbnail.png ../../images/breakpoints_controlpanel.thumbnail.png

That's all for this week!

weekly roundup 29/5

Time to let you in on the latest changes...

Something that has been in the making for a bit, but was accelerated through the bug-hunt for locked files under Windows (a feature I always hated!):

ADAMS now requires Java 8 for building and running.

And here is the rest:

Fixes

  • SpreadSheetAggregate is now more robust with regard to failures and now handles empty input spreadsheets correctly (no longer adds a phantom row)

Changes

  • TimeseriesDisplay now offers a "plot updater" option, which allows you to specify at what interval to update the display (or only at the end)

  • StringToValidVariableName conversion now has the additional "pad" option to output a full variable like "@{cool_variable}" instead of "cool_variable"

  • Quote conversion now offers "force" (always quote it, even if not necessary) and "double-up" (double quotes rather than using backslashes for escaping them) options

  • UnQuote conversion offers the "double-up" option like Quote as well, reversing the doubling up, of course

  • SpreadSheetToDouble/StringMatrix conversions now allow the user to specify the range of rows/columns to turn into a matrix

  • data container writers (= transformers) now allow you to specify how the file name gets generated: AUTOMATIC, DATABASE_ID, ID, SUPPLIED. The behavior so far was AUTOMATIC, which uses the DB-ID or, if that is not available, the ID of the container. With SUPPLIED you can create your own filename (without path). Affected: TimeseriesFileWriter, HeatmapFileWriter

  • EnterManyValues now supports displaying help via a tip text (when hovering over the field) and can output either a spreadsheet (default so far) or key-value pairs
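
To make the "force" and "double-up" options of the Quote/UnQuote conversions a bit more tangible, here is a minimal sketch in plain Java. It is not the ADAMS code: the class QuoteSketch and its methods are hypothetical, and the rule for when quoting counts as "necessary" (empty string, whitespace or a double quote present) is an assumption made for this example:

```java
final class QuoteSketch {

  /** Wraps s in double quotes if necessary, or always if force is set. */
  static String quote(String s, boolean force, boolean doubleUp) {
    // Assumption: quoting is "necessary" for empty strings and for strings
    // that contain whitespace or a double quote.
    boolean necessary = s.isEmpty() || s.indexOf('"') >= 0
        || s.chars().anyMatch(Character::isWhitespace);
    if (!force && !necessary)
      return s;
    String escaped = doubleUp
        ? s.replace("\"", "\"\"")                         // " becomes ""
        : s.replace("\\", "\\\\").replace("\"", "\\\"");  // \ becomes \\, " becomes \"
    return "\"" + escaped + "\"";
  }

  /** Reverses quote(); strings without surrounding quotes are returned as is. */
  static String unquote(String s, boolean doubleUp) {
    if (s.length() < 2 || s.charAt(0) != '"' || s.charAt(s.length() - 1) != '"')
      return s;
    String inner = s.substring(1, s.length() - 1);
    if (doubleUp)
      return inner.replace("\"\"", "\"");
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < inner.length(); i++) {
      char c = inner.charAt(i);
      if (c == '\\' && i + 1 < inner.length())
        c = inner.charAt(++i);  // drop the backslash, keep the escaped character
      result.append(c);
    }
    return result.toString();
  }
}
```

With "double-up" the embedded quotes get doubled, which is the style that CSV files commonly use; without it, they are escaped with a backslash.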

Additions

Have a good weekend!

../../images/performance.thumbnail.png ../../images/uchart.thumbnail.png

weekly roundup 22/5

Time flies by and so do the updates...

Additions

  • adams-timeseries: new TimeseriesAutocorrelation filter

  • added array statistic: ArrayLinearRegression computes slope/intercept

  • adams-weka now has a simple wizard for batch-filtering datasets

  • added overlays: LinearRegressionOverlay and LOWESSOverlay
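
For those wondering what "slope/intercept" boils down to: below is a minimal least-squares sketch in plain Java. It is only an illustration, not the code behind ArrayLinearRegression or the overlays; the class LinearRegressionSketch and the method fit are hypothetical names, and passing in explicit x values is an assumption:

```java
final class LinearRegressionSketch {

  /**
   * Fits y = intercept + slope * x using ordinary least squares.
   * Returns a two-element array: {intercept, slope}.
   */
  static double[] fit(double[] x, double[] y) {
    if (x.length != y.length || x.length < 2)
      throw new IllegalArgumentException("need two arrays of equal length (>= 2)");
    double meanX = 0.0;
    double meanY = 0.0;
    for (int i = 0; i < x.length; i++) {
      meanX += x[i] / x.length;
      meanY += y[i] / y.length;
    }
    double sxy = 0.0;  // sum of (x - meanX) * (y - meanY)
    double sxx = 0.0;  // sum of (x - meanX)^2
    for (int i = 0; i < x.length; i++) {
      sxy += (x[i] - meanX) * (y[i] - meanY);
      sxx += (x[i] - meanX) * (x[i] - meanX);
    }
    if (sxx == 0.0)
      throw new IllegalArgumentException("x values must not all be identical");
    double slope = sxy / sxx;
    double intercept = meanY - slope * meanX;
    return new double[]{intercept, slope};
  }
}
```

For the points (0,1), (1,3), (2,5), (3,7) this yields an intercept of 1 and a slope of 2.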

Fixes

  • further fixes regarding locked files under Windows, reworked the text file readers in the process (used by the TextFileReader transformer)

Changes

  • Weka/MOA/Meka/Classifying/Clustering transformers now have a variable for monitoring purposes. Whenever this variable changes, the model gets reset and needs to be loaded again. Useful when using multiple models.

  • upgraded JSch library to 0.1.52 (used for ssh, scp, ...)

  • System property "adams.io.tmpdir" allows overriding of Java's default temp directory

  • added support for intercept/slope formulas in spreadsheets

  • the SimplePlot sink now allows you to specify an overlay as well

  • adams-weka: the FixedClassifierErrorsPlot plugin for the Weka Explorer now allows you to specify the meta-data attribute range and whether to overlay a "trend" (= a linear regression line derived from the data points)
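
As for the "adams.io.tmpdir" system property: a lookup with a fallback to Java's default could look roughly like the sketch below. Only the property name is taken from the list above; the class TempDirSketch and the exact lookup logic are assumptions and not necessarily what ADAMS does internally:

```java
import java.io.File;

final class TempDirSketch {

  /** The property mentioned above. */
  static final String PROPERTY = "adams.io.tmpdir";

  /** Returns the custom temp directory if set, otherwise Java's default one. */
  static File tempDir() {
    String custom = System.getProperty(PROPERTY);
    if (custom != null && !custom.trim().isEmpty())
      return new File(custom);
    return new File(System.getProperty("java.io.tmpdir"));
  }
}
```

System properties are usually supplied with the -D flag of the java command, e.g., -Dadams.io.tmpdir=/some/dir (the directory is just a placeholder).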

Have a good weekend!

changes

As always, there have been a number of changes again.

Additions

  • adams-imaging can generate UPC-A barcodes now

  • adams-timeseries now has FixedTimestampPaintlet, FixedTimestampRangePaintlet paintlets and TimeseriesShiftTimestamp filter

Changes

  • upgraded Groovy to 2.4.0, to bring it in line with Weka packages

  • prefixed timeseries filters, baseline correction, outlier detectors, smoothers with "Timeseries" to avoid potential name clashes

  • added support for "flow" aware paintlets: TextOverlayPaintlet and MultiPaintlet -> affected: Canvas, SimplePlot, SequencePlot

  • TimedTee/TimedTrigger/TimedSubProcess control actors now forward a TimingContainer object instead of just the millisecond value (contains prefix and original as well)

  • SpreadSheetAggregate/Query transformers now support the "RANGE" function, which is just a shortcut for MAX-MIN
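
The "RANGE" shortcut is simple enough to spell out. The following sketch merely illustrates the MAX-MIN arithmetic on a plain array; RangeSketch is a hypothetical name and has nothing to do with how the spreadsheet transformers are implemented:

```java
final class RangeSketch {

  /** Returns max(values) - min(values). */
  static double range(double[] values) {
    if (values.length == 0)
      throw new IllegalArgumentException("at least one value required");
    double min = values[0];
    double max = values[0];
    for (double v : values) {
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    return max - min;
  }
}
```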

Fixes

  • "Scripted" feature generators now work with Groovy (Groovy didn't like generics in return value of method)

  • code review for fixing potential locked files on Windows (explicitly closing FileInputStream/FileOutputStream)

round-up of changes

It's that time of the week again: an overview of what's happened since the last update...

Additions

  • new BufferedImage feature generators Min and Dimensions in the adams-imaging module

  • new TextAreaPage wizard component

  • new "Mark location" plugin for Image viewer, allows you to highlight x/y positions in an image

  • it is now possible to retrieve the setup of a flow that is running on a remote server. You attach the RemoteFlowListener to the remote flow before executing it and then use the new "Open remote flow" menu in the flow editor on your local machine to retrieve the setup (you may have to remove your FlowEditor.props file in the $HOME/.adams or %USERPROFILE%\_adams directory to make this menu item show up)

Fixes

  • added distributionsForInstances method to FilteredClassifierExt meta-classifier

  • DumpFile sink should no longer leave lock files on Windows (writer.close() got moved into a finally{} clause)

  • SpreadSheetDbReader now forces cells to be strings for string types loaded from database

  • Image viewer "Barcode" plugin no longer requires an unmodified image to be loaded

  • the output buffer for predictions didn't get reset properly when evaluating multiple datasets; results were simply appended. Affected: WekaCrossValidationEvaluator, WekaStreamEvaluator, WekaTestSetEvaluator, WekaTrainTestSetEvaluator
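
A quick note on the DumpFile fix: the point of closing the writer in a finally{} clause is that the file handle gets released no matter whether writing succeeded or failed. The sketch below shows the same guarantee using try-with-resources (available since Java 7) instead of an explicit finally block. It is an illustration only; DumpSketch and dump are hypothetical names, not the DumpFile code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class DumpSketch {

  /** Writes the content to the file and always closes the writer afterwards. */
  static void dump(Path file, String content) {
    // try-with-resources closes the writer on success and on failure,
    // just like a close() call inside a finally block would
    try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      writer.write(content);
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Once the writer is closed, the file can be deleted or moved again, which is exactly what used to fail on Windows.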

Changes

  • the JAI histogram feature generator received the option "numBins" that allows you to generate fewer than the default 256 bins

  • RelativeCrop now allows you to use the x/y location as the anchor point as well, placing the crop rectangle relative to this location

Have a good weekend!

update 30/4

It's time for a round-up of all the changes that have happened...

Fixes

  • the ORDER BY clause of the SpreadSheetQuery transformer got applied to the input spreadsheet rather than the final one, screwing up the WHERE clause

  • the abstract class AbstractProcessWekaInstanceWithModel did not clear the internal model when its options got changed; affected transformers: MekaClassifying, Weka/MOAClassifying/Clustering

Modifications

  • SpreadSheetTable now uses plugin architecture for processing/plotting cols/rows/cells in its popup menus

  • Min/Max heatmap feature generators can output the position now as well

  • spreadsheet plot generators can now also generate the plot name based on a range of cell values

  • Excel and ODF spreadsheet readers now support "no" or "custom" headers

  • ForLoop source can output integers now as array as well

  • the message of the Stop control actor can contain variables now as well

Additions

  • added NewHeatmap source to adams-heatmap

  • added HeatmapGetValue/HeatmapSetValue transformers to adams-heatmap

  • added VotedImbalance meta-classifier to adams-weka: the classifier generates X resampled datasets from the input data, limiting the size to the smallest class, builds the base classifier on each of them and then uses the Vote meta-classifier for making predictions

  • added PromptUser boolean flow condition, asking user whether to execute conditional actor or not

  • added "Append datasets" and "Merge datasets" to the main menu of the adams-weka module; they use wizards to guide the user through combining datasets

  • added specialized pages to the wizard: SelectFilePage, SelectMultipleFilesPage, WekaSelectDatasetPage, WekaSelectMultipleDatasetsPage

  • added new (optional) Look'n'Feel: PgsLookAndFeel, which can be activated in the GUIHelper.props file

  • new source for outputting integers: IntegerRange, which uses a range string and a maximum to generate these integers (e.g., "3-5,7,12,18-last")
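
To illustrate how a range string like "3-5,7,12,18-last" can be turned into integers, here is a small sketch in plain Java. It is not the IntegerRange source itself: the class IntegerRangeSketch is hypothetical, and the behavior assumed here (1-based values, "last" resolving to the maximum, values above the maximum being dropped) is a guess at sensible semantics:

```java
import java.util.ArrayList;
import java.util.List;

final class IntegerRangeSketch {

  /** Expands a range string such as "3-5,7,12,18-last" using the given maximum. */
  static List<Integer> expand(String range, int max) {
    List<Integer> result = new ArrayList<>();
    for (String part : range.split(",")) {
      part = part.trim();
      if (part.isEmpty())
        continue;
      int dash = part.indexOf('-');
      int from;
      int to;
      if (dash < 0) {
        from = parse(part, max);  // single value, e.g., "7"
        to = from;
      }
      else {
        from = parse(part.substring(0, dash), max);  // sub-range, e.g., "3-5"
        to = parse(part.substring(dash + 1), max);
      }
      // assumption: anything beyond the maximum gets silently dropped
      for (int i = from; i <= Math.min(to, max); i++)
        result.add(i);
    }
    return result;
  }

  /** Parses a single value, with "last" standing for the maximum. */
  private static int parse(String s, int max) {
    s = s.trim();
    if (s.equalsIgnoreCase("last"))
      return max;
    return Integer.parseInt(s);
  }
}
```

With a maximum of 20, the example string expands to 3, 4, 5, 7, 12, 18, 19, 20.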

update

Quite a number of changes have made their way into ADAMS since the last post...

Libraries

  • MEKA upgraded to 1.7.5 (module: adams-meka)

  • WekaExcel upgraded to 1.0.5 (module: adams-weka); ExcelLoader always omitted the last row in spreadsheets

Fixes

  • Weka Explorer plugins FixedClassifierErrorPlot, ThresholdCurve now free up memory when the dialog gets closed

  • SpreadSheetColumnRange/Index sometimes failed in conjunction with variables

  • WhileLoop control actor did not re-apply variables once the looping started, i.e., conditions couldn't use variables that got updated within the loop

Extensions

  • the adams-heatmap module received a fair number of changes:

    • heatmaps no longer have the restriction of having a minimum of 0

    • heatmap filters are now prefixed with "Heatmap" to avoid name clashes

    • HeatmapInfo transformer for outputting some info about a heatmap

    • added a number of feature generators

    • removed instance generators (and adams-weka dependency)

  • Weka filter "Scale" (unsupervised/instance) allows you to scale the values of a row, e.g., to the interval 0 to 1

  • any spreadsheet table should now allow you to not only display a histogram of a row/column, but also simply plot the row/column values

  • math/boolean expressions now offer additional functions: cbrt (cube root), sinh, cosh, tanh, atan, atan2, hypot, log10, signum

  • SimplePlot sink - a simplified version of the SequencePlotter sink
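
Scaling the values of a row to an interval, as the "Scale" filter mentioned above does, can be illustrated with a simple min-max scaling in plain Java. Whether the filter uses exactly this formula is an assumption; RowScaleSketch is a hypothetical name, and mapping a constant row to the lower bound is just one possible convention:

```java
final class RowScaleSketch {

  /** Linearly maps the values of a row onto the interval [lower, upper]. */
  static double[] scale(double[] row, double lower, double upper) {
    double[] result = new double[row.length];
    if (row.length == 0)
      return result;
    double min = row[0];
    double max = row[0];
    for (double v : row) {
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    double span = max - min;
    for (int i = 0; i < row.length; i++) {
      // assumption: a constant row (span of 0) is mapped to the lower bound
      result[i] = (span == 0.0)
          ? lower
          : lower + (row[i] - min) / span * (upper - lower);
    }
    return result;
  }
}
```

The row 2, 4, 6 scaled to the interval 0 to 1 becomes 0, 0.5, 1.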

That's it for now. Have a nice weekend!

fixes and additions

Time for a little round-up of what's happened to ADAMS lately...

Fixes

  • sluggish behavior of flow editor (opening/saving/undo) due to regression in EnumOption option handling class (finding/registering property editor on demand was very slow)

  • obtaining subsets from Notes objects only returned the first element of each type

  • TryCatch control actor properly flushes the tokens of its subflow when an error occurs

  • Upper/LowerCase conversions take locale into account

  • ImageProcessor tool works properly with the revamped ImageFileChooser dialog

  • PreviewBrowser tool displays arrays in a more sensible fashion

  • WekaFileReader didn't output an Instances object if no rows were present
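
Why the locale matters for the Upper/LowerCase conversions: in plain Java, the result of changing the case depends on the locale that gets used, with Turkish being the classic example. The sketch below only demonstrates the standard String methods; CaseSketch is a hypothetical name and how the conversions pick their locale is not shown here:

```java
import java.util.Locale;

final class CaseSketch {

  /** Upper-cases the string using the rules of the given locale. */
  static String upper(String s, Locale locale) {
    return s.toUpperCase(locale);
  }

  /** Lower-cases the string using the rules of the given locale. */
  static String lower(String s, Locale locale) {
    return s.toLowerCase(locale);
  }
}
```

In a Turkish locale, "title" turns into "TİTLE" (with a dotted capital I) and "TITLE" into "tıtle" (with a dotless i), whereas Locale.ROOT gives the plain "TITLE" and "title".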

Additions

  • print support got added to the PDF viewer (tool and sink)

  • ImageHistogram sink added to adams-imaging

  • additional actors for the adams-heatmap module:

    • HeatmapArrayStatistic transformer

    • HeatmapLocateObjects transformer

    • HeatmapHistogram sink

  • Heatmap viewer can display a histogram of current heatmap