java.lang.Object
  weka.classifiers.AbstractClassifier
    weka.classifiers.functions.SMO
public class SMO
Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.
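The core of sequential minimal optimization is an analytic update of two Lagrange multipliers at a time. The following is a minimal sketch of that single step, written from the textbook formulation rather than taken from this class; the class name `SmoStepSketch`, the method `step`, and all parameter names are illustrative assumptions, and the pair-selection heuristics and threshold updates are omitted.

```java
/** Sketch of the two-multiplier step used by SMO-style solvers (illustrative names, not Weka's code). */
final class SmoStepSketch {

    /**
     * Jointly optimizes two multipliers a1 and a2 whose labels y1 and y2 are in {-1, +1}.
     * e1 and e2 are the current errors f(x_i) - y_i, k11, k12 and k22 are kernel values,
     * and c is the complexity constant. Returns {a1New, a2New}, or the inputs unchanged
     * when no progress is possible.
     */
    static double[] step(double a1, double a2, int y1, int y2,
                         double e1, double e2,
                         double k11, double k12, double k22, double c) {
        // The box constraint 0 <= a <= c and the linear constraint restrict a2 to [lo, hi].
        double lo;
        double hi;
        if (y1 != y2) {
            lo = Math.max(0.0, a2 - a1);
            hi = Math.min(c, c + a2 - a1);
        } else {
            lo = Math.max(0.0, a1 + a2 - c);
            hi = Math.min(c, a1 + a2);
        }
        // Curvature of the objective along the constraint line.
        double eta = k11 + k22 - 2.0 * k12;
        if (lo >= hi || eta <= 0.0) {
            return new double[] {a1, a2}; // degenerate case, skipped in this sketch
        }
        // Unconstrained optimum for a2, clipped to the feasible segment.
        double a2New = a2 + y2 * (e1 - e2) / eta;
        a2New = Math.max(lo, Math.min(hi, a2New));
        // Keep y1 * a1 + y2 * a2 constant.
        double a1New = a1 + y1 * y2 * (a2 - a2New);
        return new double[] {a1New, a2New};
    }

    public static void main(String[] args) {
        // Two 1-D points, x1 = +1 (class +1) and x2 = -1 (class -1), linear kernel,
        // both multipliers starting at 0, so the errors are -1 and +1.
        double[] r = step(0.0, 0.0, +1, -1, -1.0, 1.0, 1.0, -1.0, 1.0, 1.0);
        System.out.println(r[0] + " " + r[1]); // prints "0.5 0.5"
    }
}
```

In this toy case one step already yields the maximum-margin solution: both multipliers become 0.5, which gives a weight of 1 and outputs of +1 and -1 on the two points.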
This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. (In that case the coefficients in the output are based on the normalized data, not the original data; this is important for interpreting the classifier.)
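To make the difference between normalizing and standardizing concrete, here is a small sketch of both transformations applied to a single attribute column. It is not the filter code used by this class: the class name `ScalingSketch` is hypothetical, normalization is assumed to mean min-max scaling to [0, 1], and standardization is assumed to use the sample standard deviation.

```java
import java.util.Arrays;

/** Sketch of the two attribute scalings (assumed definitions, not Weka's filter code). */
final class ScalingSketch {

    /** Min-max scaling of one attribute column to [0, 1]; a constant column maps to 0. */
    static double[] normalize(double[] col) {
        double min = Arrays.stream(col).min().orElse(0.0);
        double max = Arrays.stream(col).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[col.length];
        for (int i = 0; i < col.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (col[i] - min) / range;
        }
        return out;
    }

    /** Zero-mean, unit-variance scaling using the sample standard deviation (n - 1). */
    static double[] standardize(double[] col) {
        double mean = Arrays.stream(col).average().orElse(0.0);
        double sumSq = 0.0;
        for (double v : col) {
            sumSq += (v - mean) * (v - mean);
        }
        double sd = col.length > 1 ? Math.sqrt(sumSq / (col.length - 1)) : 0.0;
        double[] out = new double[col.length];
        for (int i = 0; i < col.length; i++) {
            out[i] = sd == 0.0 ? 0.0 : (col[i] - mean) / sd;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] col = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(normalize(col)));   // prints "[0.0, 0.5, 1.0]"
        System.out.println(Arrays.toString(standardize(col))); // prints "[-1.0, 0.0, 1.0]"
    }
}
```

Under the min-max assumption, a coefficient w reported for a normalized attribute corresponds to w / (max - min) on the original scale, plus a shift absorbed by the bias.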
Multi-class problems are solved using pairwise classification (1-vs-1). If logistic models are built, the pairwise estimates are combined using pairwise coupling according to Hastie and Tibshirani (1998).
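As an illustration of 1-vs-1 classification, the sketch below counts votes from the k(k-1)/2 binary machines and picks the class with the most votes. The class `PairwiseVotingSketch` and the matrix `pairOutput` are hypothetical, and the sign convention (a positive output for the pair (i, j) favors class j) is an assumption made for this example.

```java
/** Sketch of 1-vs-1 voting over pairwise binary outputs (illustrative, not Weka's code). */
final class PairwiseVotingSketch {

    /**
     * pairOutput[i][j] for i < j holds the raw output of the binary machine
     * trained on classes i and j; entries with i >= j are ignored.
     */
    static int[] votes(double[][] pairOutput) {
        int k = pairOutput.length;
        int[] votes = new int[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                // Assumed convention: positive output votes for j, otherwise for i.
                if (pairOutput[i][j] > 0.0) {
                    votes[j]++;
                } else {
                    votes[i]++;
                }
            }
        }
        return votes;
    }

    /** Index of the class with the most votes; ties go to the lowest index. */
    static int argMax(int[] votes) {
        int best = 0;
        for (int i = 1; i < votes.length; i++) {
            if (votes[i] > votes[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] out = {
            {0.0, -1.0, -0.5}, // 0 vs 1 -> class 0, 0 vs 2 -> class 0
            {0.0,  0.0,  2.0}, // 1 vs 2 -> class 2
            {0.0,  0.0,  0.0}
        };
        int[] v = votes(out);
        System.out.println(java.util.Arrays.toString(v) + " -> class " + argMax(v));
        // prints "[2, 0, 1] -> class 0"
    }
}
```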
To obtain proper probability estimates, use the option that fits logistic regression models to the outputs of the support vector machine. In the multi-class case the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method.
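The two steps described here can be sketched as follows: a logistic function maps a raw SVM output to a pairwise probability, and an iterative scheme in the spirit of Hastie and Tibshirani's pairwise coupling combines the pairwise probabilities into one class distribution. This is a simplified sketch under stated assumptions, not the code of this class: the names `PairwiseCouplingSketch`, `sigmoid` and `couple` are hypothetical, the parameters `a` and `b` stand in for a fitted logistic model, and the update is applied to all classes simultaneously with a fixed iteration cap.

```java
import java.util.Arrays;

/** Sketch of logistic calibration plus pairwise coupling (illustrative, not Weka's code). */
final class PairwiseCouplingSketch {

    /** Logistic link for a raw output f; a and b are hypothetical fitted parameters. */
    static double sigmoid(double f, double a, double b) {
        return 1.0 / (1.0 + Math.exp(-(a * f + b)));
    }

    /**
     * Couples pairwise estimates r[i][j], read as P(class i | class i or j), into one
     * distribution. n[i][j] is the weight (for example the training count) of pair (i, j).
     * Assumes every r[i][j] lies strictly between 0 and 1.
     */
    static double[] couple(double[][] r, double[][] n) {
        int k = r.length;
        double[] p = new double[k];
        Arrays.fill(p, 1.0 / k);
        for (int iter = 0; iter < 1000; iter++) {
            double[] next = new double[k];
            double sum = 0.0;
            for (int i = 0; i < k; i++) {
                double observed = 0.0; // sum of n_ij * r_ij
                double implied = 0.0;  // sum of n_ij * p_i / (p_i + p_j)
                for (int j = 0; j < k; j++) {
                    if (j == i) {
                        continue;
                    }
                    observed += n[i][j] * r[i][j];
                    implied += n[i][j] * p[i] / (p[i] + p[j]);
                }
                next[i] = p[i] * observed / implied;
                sum += next[i];
            }
            double change = 0.0;
            for (int i = 0; i < k; i++) {
                next[i] /= sum; // renormalize so the estimates sum to 1
                change += Math.abs(next[i] - p[i]);
            }
            p = next;
            if (change < 1e-12) {
                break;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // Pairwise estimates that are exactly consistent with p = (0.5, 0.3, 0.2).
        double[] truth = {0.5, 0.3, 0.2};
        double[][] r = new double[3][3];
        double[][] n = new double[3][3];
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                if (i != j) {
                    r[i][j] = truth[i] / (truth[i] + truth[j]);
                    n[i][j] = 1.0;
                }
            }
        }
        System.out.println(Arrays.toString(couple(r, n)));
    }
}
```

When the pairwise estimates are mutually consistent, as in the `main` method above, the coupled distribution recovers the underlying class probabilities; with inconsistent estimates it returns a compromise.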
Note: for improved speed, normalization should be turned off when operating on SparseInstances.
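A sketch of why sparsity matters for speed: a dot product over sparse vectors only visits the stored (nonzero) entries. The class `SparseDotSketch` and its index/value array layout are assumptions made for illustration and do not reproduce the SparseInstance API. One plausible reading of the note above is that a scaling step which shifts values can turn zeros into nonzeros and so lose this advantage.

```java
/** Sketch of a dot product over two sparse vectors (illustrative layout, not Weka's API). */
final class SparseDotSketch {

    /**
     * Each vector is given as sorted, strictly increasing attribute indices
     * and the matching nonzero values.
     */
    static double dot(int[] idx1, double[] val1, int[] idx2, double[] val2) {
        double sum = 0.0;
        int p1 = 0;
        int p2 = 0;
        // Merge-style walk: only indices present in both vectors contribute.
        while (p1 < idx1.length && p2 < idx2.length) {
            if (idx1[p1] == idx2[p2]) {
                sum += val1[p1] * val2[p2];
                p1++;
                p2++;
            } else if (idx1[p1] < idx2[p2]) {
                p1++;
            } else {
                p2++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] idxA = {0, 3, 7};
        double[] valA = {1.0, 2.0, 3.0};
        int[] idxB = {3, 5, 7};
        double[] valB = {4.0, 5.0, 6.0};
        System.out.println(dot(idxA, valA, idxB, valB)); // prints "26.0"
    }
}
```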
For more information on the SMO algorithm, see
J. Platt: Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf and C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, 1998.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.
Trevor Hastie, Robert Tibshirani: Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems, 1998.
@incollection{Platt1998,
author = {J. Platt},
booktitle = {Advances in Kernel Methods - Support Vector Learning},
editor = {B. Schoelkopf and C. Burges and A. Smola},
publisher = {MIT Press},
title = {Fast Training of Support Vector Machines using Sequential Minimal Optimization},
year = {1998},
URL = {http://research.microsoft.com/\~jplatt/smo.html},
PS = {http://research.microsoft.com/\~jplatt/smo-book.ps.gz},
PDF = {http://research.microsoft.com/\~jplatt/smo-book.pdf}
}
@article{Keerthi2001,
author = {S.S. Keerthi and S.K. Shevade and C. Bhattacharyya and K.R.K. Murthy},
journal = {Neural Computation},
number = {3},
pages = {637-649},
title = {Improvements to Platt's SMO Algorithm for SVM Classifier Design},
volume = {13},
year = {2001},
PS = {http://guppy.mpe.nus.edu.sg/\~mpessk/svm/smo_mod_nc.ps.gz}
}
@inproceedings{Hastie1998,
author = {Trevor Hastie and Robert Tibshirani},
booktitle = {Advances in Neural Information Processing Systems},
editor = {Michael I. Jordan and Michael J. Kearns and Sara A. Solla},
publisher = {MIT Press},
title = {Classification by Pairwise Coupling},
volume = {10},
year = {1998},
PS = {http://www-stat.stanford.edu/\~hastie/Papers/2class.ps}
}
Valid options are:
-D If set, classifier is run in debug mode and may output additional info to the console
-no-checks Turns off all checks - use with caution! Turning them off assumes that data is purely numeric, doesn't contain any missing values, and has a nominal class. Turning them off also means that no header information will be stored if the machine is linear. Finally, it also assumes that no instance has a weight equal to 0. (default: checks on)
-C <double> The complexity constant C. (default 1)
-N Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-L <double> The tolerance parameter. (default 1.0e-3)
-P <double> The epsilon for round-off error. (default 1.0e-12)
-M Fit logistic models to SVM outputs.
-V <int> The number of folds for the internal cross-validation. (default -1, use training data)
-W <int> The random number seed. (default 1)
-K <classname and parameters> The Kernel to use. (default: weka.classifiers.functions.supportVector.PolyKernel)
Options specific to kernel weka.classifiers.functions.supportVector.PolyKernel:
-D Enables debugging output (if available) to be printed. (default: off)
-no-checks Turns off all checks - use with caution! (default: checks on)
-C <num> The size of the cache (a prime number), 0 for full cache and -1 to turn it off. (default: 250007)
-E <num> The Exponent to use. (default: 1.0)
-L Use lower-order terms. (default: no)
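The effect of the PolyKernel options -E and -L can be sketched with a few lines of plain Java. The class `PolyKernelSketch` is hypothetical and only illustrates the usual polynomial kernel form: the dot product raised to the exponent, with 1 added to the dot product first when lower-order terms are enabled. Caching and sparse handling are left out.

```java
/** Sketch of a polynomial kernel evaluation (illustrative, not the PolyKernel class itself). */
final class PolyKernelSketch {

    /**
     * Computes (x . y)^exponent, or (x . y + 1)^exponent when lowerOrder is true.
     * Both vectors are assumed to have the same length.
     */
    static double poly(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double dot = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
        }
        if (lowerOrder) {
            dot += 1.0; // adds all terms of lower degree to the expansion
        }
        return exponent == 1.0 ? dot : Math.pow(dot, exponent);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] y = {3.0, 4.0};
        System.out.println(poly(x, y, 1.0, false)); // linear kernel: prints "11.0"
        System.out.println(poly(x, y, 2.0, false)); // prints "121.0"
        System.out.println(poly(x, y, 2.0, true));  // prints "144.0"
    }
}
```

With the default exponent of 1.0 and no lower-order terms the kernel reduces to a plain dot product, that is, a linear machine.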
| Nested Class Summary | |
|---|---|
| class | SMO.BinarySMO - Class for building a binary support vector machine. |
| Field Summary | |
|---|---|
| static int | FILTER_NONE - filter: No normalization/standardization |
| static int | FILTER_NORMALIZE - filter: Normalize training data |
| static int | FILTER_STANDARDIZE - filter: Standardize training data |
| static Tag[] | TAGS_FILTER - The filter to apply to the training data |
| Constructor Summary |
|---|
| SMO() |
| Method Summary | |
|---|---|
| String[][][] | attributeNames() - Returns the attribute names. |
| double[][] | bias() - Returns the bias of each binary SMO. |
| void | buildClassifier(Instances insts) - Method for building the classifier. |
| String | buildLogisticModelsTipText() - Returns the tip text for this property |
| String | checksTurnedOffTipText() - Returns the tip text for this property |
| String[] | classAttributeNames() |
| String | cTipText() - Returns the tip text for this property |
| double[] | distributionForInstance(Instance inst) - Estimates class probabilities for given instance. |
| String | epsilonTipText() - Returns the tip text for this property |
| String | filterTypeTipText() - Returns the tip text for this property |
| boolean | getBuildLogisticModels() - Get the value of buildLogisticModels. |
| double | getC() - Get the value of C. |
| Capabilities | getCapabilities() - Returns default capabilities of the classifier. |
| boolean | getChecksTurnedOff() - Returns whether the checks are turned off or not. |
| double | getEpsilon() - Get the value of epsilon. |
| SelectedTag | getFilterType() - Gets how the training data will be transformed. |
| Kernel | getKernel() - Returns the kernel to use |
| int | getNumFolds() - Get the value of numFolds. |
| String[] | getOptions() - Gets the current settings of the classifier. |
| int | getRandomSeed() - Get the value of randomSeed. |
| String | getRevision() - Returns the revision string. |
| TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| double | getToleranceParameter() - Get the value of tolerance parameter. |
| String | globalInfo() - Returns a string describing classifier |
| String | kernelTipText() - Returns the tip text for this property |
| Enumeration | listOptions() - Returns an enumeration describing the available options. |
| static void | main(String[] argv) - Main method for testing this class. |
| int | numClassAttributeValues() |
| String | numFoldsTipText() - Returns the tip text for this property |
| int[] | obtainVotes(Instance inst) - Returns an array of votes for the given instance. |
| String | randomSeedTipText() - Returns the tip text for this property |
| void | setBuildLogisticModels(boolean newbuildLogisticModels) - Set the value of buildLogisticModels. |
| void | setC(double v) - Set the value of C. |
| void | setChecksTurnedOff(boolean value) - Disables or enables the checks (which could be time-consuming). |
| void | setEpsilon(double v) - Set the value of epsilon. |
| void | setFilterType(SelectedTag newType) - Sets how the training data will be transformed. |
| void | setKernel(Kernel value) - sets the kernel to use |
| void | setNumFolds(int newnumFolds) - Set the value of numFolds. |
| void | setOptions(String[] options) - Parses a given list of options. |
| void | setRandomSeed(int newrandomSeed) - Set the value of randomSeed. |
| void | setToleranceParameter(double v) - Set the value of tolerance parameter. |
| int[][][] | sparseIndices() - Returns the indices in sparse format. |
| double[][][] | sparseWeights() - Returns the weights in sparse format. |
| String | toleranceParameterTipText() - Returns the tip text for this property |
| String | toString() - Prints out the classifier. |
| void | turnChecksOff() - Turns off checks for missing values, etc. |
| void | turnChecksOn() - Turns on checks for missing values, etc. |
| Methods inherited from class weka.classifiers.AbstractClassifier |
|---|
| classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug |

| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final int FILTER_NORMALIZE
public static final int FILTER_STANDARDIZE
public static final int FILTER_NONE
public static final Tag[] TAGS_FILTER
| Constructor Detail |
|---|
public SMO()
| Method Detail |
|---|
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public void turnChecksOff()
public void turnChecksOn()
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Classifier
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class AbstractClassifier
See Also: Capabilities
public void buildClassifier(Instances insts)
throws Exception
Specified by: buildClassifier in interface Classifier
Parameters: insts - the set of training instances
Throws: Exception - if the classifier can't be built successfully
public double[] distributionForInstance(Instance inst)
throws Exception
Specified by: distributionForInstance in interface Classifier
Overrides: distributionForInstance in class AbstractClassifier
Parameters: inst - the instance to compute the probabilities for
Throws: Exception - in case of an error
public int[] obtainVotes(Instance inst)
throws Exception
Parameters: inst - the instance
Throws: Exception - if something goes wrong
public double[][][] sparseWeights()
public int[][][] sparseIndices()
public double[][] bias()
public int numClassAttributeValues()
public String[] classAttributeNames()
public String[][][] attributeNames()
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class AbstractClassifier
public void setOptions(String[] options)
throws Exception
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class AbstractClassifier
Parameters: options - the list of options as an array of strings
Throws: Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class AbstractClassifier
public void setChecksTurnedOff(boolean value)
Parameters: value - if true turns off all checks
public boolean getChecksTurnedOff()
public String checksTurnedOffTipText()
public String kernelTipText()
public void setKernel(Kernel value)
Parameters: value - the kernel to use
public Kernel getKernel()
public String cTipText()
public double getC()
public void setC(double v)
Parameters: v - Value to assign to C.
public String toleranceParameterTipText()
public double getToleranceParameter()
public void setToleranceParameter(double v)
Parameters: v - Value to assign to tolerance parameter.
public String epsilonTipText()
public double getEpsilon()
public void setEpsilon(double v)
Parameters: v - Value to assign to epsilon.
public String filterTypeTipText()
public SelectedTag getFilterType()
public void setFilterType(SelectedTag newType)
Parameters: newType - the new filtering mode
public String buildLogisticModelsTipText()
public boolean getBuildLogisticModels()
public void setBuildLogisticModels(boolean newbuildLogisticModels)
Parameters: newbuildLogisticModels - Value to assign to buildLogisticModels.
public String numFoldsTipText()
public int getNumFolds()
public void setNumFolds(int newnumFolds)
Parameters: newnumFolds - Value to assign to numFolds.
public String randomSeedTipText()
public int getRandomSeed()
public void setRandomSeed(int newrandomSeed)
Parameters: newrandomSeed - Value to assign to randomSeed.
public String toString()
Overrides: toString in class Object
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractClassifier
public static void main(String[] argv)