SvmLightLib

A Windows DLL wrapper for Thorsten Joachims’ SVM implementations.

Support vector machines (SVMs) are a set of supervised learning methods used for classification and regression. One of the best-known implementations freely available on the Web is Thorsten Joachims’ SVMlight (accompanied by SVMperf and variants of SVMstruct).

The original library is implemented in C. Several command-line utilities are implemented on top of it to expose the functionality to end users. The utilities are easy to use, in contrast to the underlying library, which has a relatively steep learning curve. This motivated us to develop a DLL wrapper for the library. The interface of the resulting DLL looks much like that of the command-line utilities. In short, the DLL wrapper has the following properties:

Downloads

Source Code

Binaries

Change Log

Version 0.9 (October 2008)

The first release. Supports binary induction, binary transduction, regression (not tested), and multiclass induction. The binary models are based on SVMlight. The multiclass model is based on SVMstruct and is thus much faster. Saving and loading binary models is not (yet) supported.

Version 1.0 (April 2009)

A C# usage example is included. It is now possible to save and load binary models. Also, it is easier to build 64-bit applications: a 64-bit target platform option is available in the configuration menu.

License

The license statement is included in the corresponding downloadable packages (look for the file named License.txt). In short, SvmLightLib is available for non-commercial use only. It must not be modified and distributed without prior permission of the author of SVMlight and SVMstruct (Thorsten Joachims). None of the authors is responsible for implications from the use of this software.

18 Comments »

  1. Milan said

    Hi.
    I do not know if you have time for this, but I will try.
    I only know how to program in Visual Basic 6 (or VBA). I know a little about how to declare a DLL in it, but as far as I know, I have to declare each function, for instance the function ENsolveH from epanet2.dll, like this:
    Declare Function ENsolveH Lib "C:\Program Files\Epanet2\epanet2.dll" () As Long
    But I do not know which functions SvmLightLib offers or what I should call…
    Please just point me in the right direction; maybe I can then find the rest myself.
    Thanks,
    Milan

    • Miha said

      Hi Milan!

      I prepared a code snippet that works in Visual Basic 2005 (with some minor modifications, it should also work in VBA). You will probably need to change the path to the DLL file. Note that this snippet doesn’t do anything useful – it just creates a short feature vector and gets its size back from the DLL. To see how to use these functions for some real action, please take a look at SvmLightLib1.0Src-1.zip\SvmLightLib\SvmLightLibDemo\SvmLightLibDemo.cs. It is in C# but should give you directions on how to use the library in VBA. If you get stuck, let me know :-)

      *** Code snippet: ***

      Option Explicit On

      Module SvmLightLib

      Private Declare Function NewFeatureVector Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_count As Integer, ByVal features As Integer(), ByVal weights As Single(), ByVal label As Double) As Integer
      Private Declare Sub DeleteFeatureVector Lib "c:\svmlib\svmlightlib.dll" (ByVal id As Integer)
      Private Declare Function GetFeatureVectorFeatureCount Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer) As Integer
      Private Declare Function GetFeatureVectorFeature Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer, ByVal feature_idx As Integer) As Integer
      Private Declare Function GetFeatureVectorWeight Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer, ByVal feature_idx As Integer) As Single
      Private Declare Function GetFeatureVectorLabel Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer) As Double
      Private Declare Sub SetFeatureVectorLabel Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer, ByVal label As Double)
      Private Declare Function GetFeatureVectorClassifScoreCount Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer) As Integer
      Private Declare Function GetFeatureVectorClassifScore Lib "c:\svmlib\svmlightlib.dll" (ByVal feature_vector_id As Integer, ByVal classif_score_idx As Integer) As Double

      Private Declare Sub _TrainModel Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String)
      Private Declare Function TrainModel Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String, ByVal feature_vector_count As Integer, ByVal feature_vectors As Integer()) As Integer
      Private Declare Sub SaveModel Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal file_name As String)
      Private Declare Function LoadModel Lib "c:\svmlib\svmlightlib.dll" (ByVal file_name As String) As Integer
      Private Declare Sub SaveModelBin Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal file_name As String)
      Private Declare Function LoadModelBin Lib "c:\svmlib\svmlightlib.dll" (ByVal file_name As String) As Integer
      Private Declare Sub _Classify Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String)
      Private Declare Sub Classify Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal feature_vector_count As Integer, ByVal feature_vectors As Integer())
      Private Declare Sub DeleteModel Lib "c:\svmlib\svmlightlib.dll" (ByVal id As Integer)

      Private Declare Sub _TrainMulticlassModel Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String)
      Private Declare Function TrainMulticlassModel Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String, ByVal feature_vector_count As Integer, ByVal feature_vectors As Integer()) As Integer
      Private Declare Sub SaveMulticlassModel Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal file_name As String)
      Private Declare Function LoadMulticlassModel Lib "c:\svmlib\svmlightlib.dll" (ByVal file_name As String) As Integer
      Private Declare Sub SaveMulticlassModelBin Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal file_name As String)
      Private Declare Function LoadMulticlassModelBin Lib "c:\svmlib\svmlightlib.dll" (ByVal file_name As String) As Integer
      Private Declare Sub _MulticlassClassify Lib "c:\svmlib\svmlightlib.dll" (ByVal args As String)
      Private Declare Sub MulticlassClassify Lib "c:\svmlib\svmlightlib.dll" (ByVal model_id As Integer, ByVal feature_vector_count As Integer, ByVal feature_vectors As Integer())
      Private Declare Sub DeleteMulticlassModel Lib "c:\svmlib\svmlightlib.dll" (ByVal id As Integer)

      Sub Main()
      Dim id As Integer
      Dim features() As Integer = {1, 2, 3}
      Dim weights() As Single = {4, 5, 6}
      id = NewFeatureVector(3, features, weights, 0)
      Console.WriteLine(id) ' should say 1
      Console.WriteLine(GetFeatureVectorFeatureCount(id)) ' should say 3
      End Sub

      End Module
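
For readers calling the DLL from a scripting language, the same call can be sketched in Python with the standard `ctypes` module. This is only a sketch under assumptions: the argument types are inferred from the Visual Basic declaration of NewFeatureVector above (32-bit integers, single-precision weights, a double label), the helper name `new_feature_vector` is made up for illustration, and it has not been run against the real DLL.

```python
import ctypes


def new_feature_vector(lib, features, weights, label=0.0):
    """Create a feature vector through the DLL and return its integer id.

    `lib` is any object exposing NewFeatureVector, e.g. a ctypes library
    handle loaded by the caller (how and from where it is loaded is an
    assumption left to the caller).
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    n = len(features)
    # 32-bit ints and floats, mirroring Integer() and Single() in the VB snippet.
    c_features = (ctypes.c_int * n)(*features)
    c_weights = (ctypes.c_float * n)(*weights)
    return lib.NewFeatureVector(
        ctypes.c_int(n), c_features, c_weights, ctypes.c_double(label)
    )
```

On Windows, `lib` would presumably come from `ctypes.CDLL(path)` or `ctypes.WinDLL(path)`, depending on the calling convention of the DLL build in use; which one applies is not stated here and should be checked.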

      • Milan said

        Hi Miha.
        I tried your snippet, but I had to make some adjustments because of VB6. For one function, my code looks like this:

        Private Declare Function NewFeatureVector Lib "c:\svmlib\svmlightlib.dll" (ByVal featurecount As Integer, features() As Integer, weights() As Single, ByVal label As Double) As Integer

        Sub Main()
        Dim id As Integer
        Dim Features1(1 To 3) As Integer
        Dim Weights1(1 To 3) As Single

        'assigning some values to vectors above:
        For id = 1 To 3
        Features1(id) = id
        Weights1(id) = id + 3
        Next

        id = NewFeatureVector(3, Features1, Weights1, 0)

        End Sub

        It always reports a bad DLL calling convention. Maybe it is because you declared features and weights ByVal, which is not possible for arrays in VB6. I also tried declaring these two variables as Variant and then, in Sub Main, assigning the same arrays as above to them. I also tried dimensioning the arrays from 0 to 2. But nothing works.
        If you have time to look at what could be wrong, I would be happy.
        Thanks,
        Milan

      • Miha said

        Hi Milan!

        I have recompiled the library to change the calling convention to stdcall and to remove the dependencies on the MFC DLLs. You can download the new DLL from here.

        The code snippet below now works in VB6. Note that I had to change all “Integer” to “Long”, because a VB6 Integer is only 16 bits wide, whereas the DLL expects 32-bit integers (Long in VB6). You will have to do the same. Let me know if this works for you :-)

        Private Declare Function NewFeatureVector Lib "e:\svmlightlib.dll" (ByVal featurecount As Long, features() As Long, weights() As Single, ByVal label As Double) As Long
        Private Declare Function GetFeatureVectorFeatureCount Lib "e:\svmlightlib.dll" (ByVal vecid As Long) As Long

        Sub Main()
        Dim id As Long
        Dim Features1(1 To 3) As Long
        Dim Weights1(1 To 3) As Single

        'assigning some values to vectors above:
        For id = 1 To 3
        Features1(id) = id
        Weights1(id) = id + 3
        Next

        id = NewFeatureVector(3, Features1, Weights1, 0)
        MsgBox (Str$(id)) ' should say 1

        MsgBox (Str$(GetFeatureVectorFeatureCount(id))) ' should say 3
        End Sub

  2. DH said

    I understand how to create new feature vectors,
    but I find it hard to train a model.
    Say I have feature1 = 1 (t = 1) and feature2 = 3 (t = -1).
    How can I get a decision line, i.e. train a model?
    I then want to classify input = 4 as t = -1.
    Could you show me a simple example?
    I use VC++.

    • Miha said

      Hi!

      Sorry for this very late reply. I don’t fully understand your question. In your example, I can see that feature1 = 1 and feature2 = 3 but what are those t’s? Targets? Features do not have targets (labels), feature vectors do. So, if your feature vector is as follows: vec = (feature1 = 1)(feature2 = 3), it can be labeled with a particular target class, e.g. t = 1.

      I think you misunderstand how machine-learning classifiers work. Please take a look at Thorsten’s page http://svmlight.joachims.org/ and read under “How to use” about the input file format. You might also want to take a look at the C# example provided with the latest release of SvmLightLib.
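
To make the distinction concrete, here is a minimal sketch of how labeled feature vectors map onto the SVMlight input file format, where each line has the form `<target> <feature>:<value> <feature>:<value> ...` with feature numbers starting at 1 and listed in increasing order. The helper name `to_svmlight_line` and the second example vector are made up for illustration; the point is that the target belongs to the whole vector, not to an individual feature.

```python
def to_svmlight_line(label, features):
    """Format one labeled example as '<target> <feature>:<value> ...'.

    `features` maps 1-based feature numbers to values; they are written
    in increasing order of feature number.
    """
    parts = [f"{label:g}"]
    for index in sorted(features):
        if index < 1:
            raise ValueError("feature numbers start at 1")
        parts.append(f"{index}:{features[index]:g}")
    return " ".join(parts)


# Two hypothetical training examples, each a whole feature vector with ONE label.
examples = [
    (+1, {1: 1.0, 2: 3.0}),  # vec = (feature1 = 1)(feature2 = 3), target +1
    (-1, {1: 4.0, 2: 0.5}),  # a second, made-up vector with target -1
]
lines = [to_svmlight_line(label, feats) for label, feats in examples]
print("\n".join(lines))
```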

      Hope this helps.

      Best regards,
      mIHA

  3. darel said

    I’m doing an image retrieval task for my project and settled on your library, and by extension T. J.’s SVMlight, as I believe it’s the most popular and consequently the best tested. It will give me better results than I could get if I were to develop my own within the time constraints I have. Please advise: since I will need to parallelize my system in a grid, and considering the cost of classification (which makes it a candidate for parallelization too), how would I do this with your library? Any recommendations whatsoever?

    Regards,
    D.

    • Miha said

      Hi Darel!

      Yes, you should avoid implementing your own SVMs : ) What do you plan to parallelize? The training or the classification? If it’s the training, SvmLightLib unfortunately does not support using multiple machines/cores for building a single model. It only supports building multiple models in parallel on a multi-core system. For training SVMs in a multi-core or grid environment, please check out these references (note that I don’t know what state this software is in):

      * PSVM: http://code.google.com/p/psvm/
      * Bootstrapped SVM for Hadoop: http://www.analytics1305.com/documentation/hadoop_svm.html

      But if you only want to speed up the classification phase, this is relatively straightforward and also possible with SvmLightLib (although I have never tried it in practice ; )). In a multi-core environment, you just need to create several threads and load-balance the unlabeled feature vectors among them for classification. In a multi-machine environment, you need to distribute the trained model to all the machines, and then each machine is able to do the classification.
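
The multi-core variant can be sketched as follows. This is only an illustration of the load-balancing idea, not SvmLightLib code: `stand_in_classify` is a hypothetical scorer with made-up weights that takes the place of the real classification call, and whether the DLL can safely be called from several threads at once is an assumption that would need to be verified.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_count):
    """Split items into at most chunk_count contiguous, near-equal chunks."""
    chunk_count = max(1, min(chunk_count, len(items)))
    base, extra = divmod(len(items), chunk_count)
    chunks, start = [], 0
    for i in range(chunk_count):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def classify_parallel(classify_batch, vectors, worker_count=4):
    """Score vectors by handing one chunk to each worker thread."""
    chunks = split_into_chunks(vectors, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # map() returns the per-chunk results in the order of the chunks.
        results = list(pool.map(classify_batch, chunks))
    return [score for batch in results for score in batch]


def stand_in_classify(batch):
    """Hypothetical scorer: a fixed linear model standing in for the DLL call."""
    weights = [0.5, -1.0]
    return [sum(w * x for w, x in zip(weights, vec)) for vec in batch]
```

Because every chunk is scored independently with the same trained model, the combined scores come back in the same order as the input vectors. The multi-machine case follows the same pattern, with machines taking the place of threads.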

      Good luck with your project,
      mIHA

  4. Vasileios Anagnostopoulos said

    I have downloaded SvmLightLib and used it successfully with C# and MonoDevelop on Windows. But I have a question. When I complete the learning phase, I would like to get the value of the objective function that is minimized (I want to implement a PSO algorithm for feature selection). Is it possible to do this? Any ideas on how to proceed?

    • Miha said

      Hi Vasileios!

      PSO, I guess, stands for the particle swarm optimization method? So, the objective function should reflect the quality of the model. One possibility is to perform evaluation (e.g. in a 10-fold cross validation setting) and measure accuracy, F1, or any other standard evaluation metric as your objective function for PSO.
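
A minimal sketch of such an objective, assuming k-fold cross-validated accuracy is the quantity to maximize. The function names and the `train_fn` / `predict_fn` callables are hypothetical placeholders for whatever trains and applies the model; nothing here calls SvmLightLib.

```python
def k_fold_indices(example_count, k=10):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= example_count:
        raise ValueError("k must be between 2 and the number of examples")
    # Fold i holds examples i, i + k, i + 2k, ...
    folds = [list(range(i, example_count, k)) for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = sorted(
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        )
        yield train, validation


def cross_validated_accuracy(train_fn, predict_fn, vectors, labels, k=10):
    """Average validation accuracy over k folds."""
    fold_accuracies = []
    for train_idx, val_idx in k_fold_indices(len(vectors), k):
        model = train_fn(
            [vectors[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        hits = sum(predict_fn(model, vectors[i]) == labels[i] for i in val_idx)
        fold_accuracies.append(hits / len(val_idx))
    return sum(fold_accuracies) / k
```

A PSO particle encoding a candidate feature subset would then be scored by calling `cross_validated_accuracy` on the vectors restricted to that subset.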

      Let me know if I misunderstood your inquiry.

      Best regards,
      mIHA

      • Vasileios Anagnostopoulos said

        The idea behind my work is that I am tackling a computer vision problem. I have a big feature vector x with entries from combined descriptors for a region. I also have a function f(x, b) that transforms the features into new features, where b is a binary vector holding the parameters of the function.

        So I feed the new descriptors to an SVM, and the idea is to find the “best” b vector, i.e. the one that gives me the biggest margin (equivalently, the lowest w^{T}*w).
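
The equivalence can be written out explicitly. This is a sketch assuming the standard hard-margin formulation, with the separating hyperplane $w^{T}x + c = 0$ scaled so that the closest training points satisfy $y_i\,(w^{T}x_i + c) = 1$; the offset is called $c$ here only to avoid a clash with the binary vector b.

```latex
\[
\text{margin} = \frac{2}{\lVert w \rVert}
\quad\Longrightarrow\quad
\arg\max_{w,\,c} \frac{2}{\lVert w \rVert}
= \arg\min_{w,\,c} \tfrac{1}{2}\, w^{T} w
\quad \text{subject to } y_i\,(w^{T} x_i + c) \ge 1 \ \text{for all } i .
\]
```

With a soft margin, the objective additionally contains a penalty term for the slack variables, so $w^{T}w$ alone is then only one part of the value being minimized.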

        I wanted the C# version since my code is in Emgu CV. Anyway, I save everything to a file, and I am developing a Java version with libsvm-java, which I “tweaked” to give me the value of the objective function. PSO is indeed particle swarm optimization.

        Maybe I am wrong, but accuracy is measured on the test data, so optimizing it is not an option because it leads to overfitting; the margin is better.

        Regards
        Vasileios

      • Miha said

        Hi Vasileios!

        There are at least two good reasons why you should optimize your model (i.e. do your feature selection) on *the validation set* (i.e. part of the training set that you don’t use for training):

        1.) To avoid overfitting. Overfitting means that the model performs well on the training set, but (much) worse on the test set. So, you need to select those features that increase the model’s performance on the validation set therefore avoiding overfitting to the training set.
        2.) The model is useless if it doesn’t perform well on the test set : ) You can get 100% accuracy on the training set and it doesn’t mean anything if the accuracy is below 50% on the test set. If you optimize your model’s performance on the validation set, you make sure that your model will perform well in practice.

        I also have a doubt about your idea of margin maximization. In my opinion, if you build one model with one feature set (model A) and then another one with another feature set (model B), the sizes of the two margins are not directly comparable. It can happen that model A performs better in practice despite the fact that its margin is smaller than that of model B. I’m not 100% sure about this, but you should definitely check the appropriate literature.
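
One concrete way to see why margins need not be comparable is a toy case, sketched here under strong assumptions: a hard-margin linear SVM on two one-dimensional points, one per class, for which the solution can be written down by hand. The function `max_margin_1d` is made up for this illustration. Rescaling the single feature changes the margin although the separation is equally clean.

```python
def max_margin_1d(x_negative, x_positive):
    """Hard-margin linear SVM for two 1-D points, one per class.

    Returns (w, b, margin) such that w*x + b equals -1 at x_negative
    and +1 at x_positive; the margin is 2 / |w|.
    """
    if x_positive <= x_negative:
        raise ValueError("expected x_negative < x_positive")
    w = 2.0 / (x_positive - x_negative)
    b = -(x_positive + x_negative) / (x_positive - x_negative)
    return w, b, 2.0 / abs(w)


# The same two examples, once in original units and once with the feature
# multiplied by 4 (a hypothetical change of representation).
_, _, margin_original = max_margin_1d(-1.0, 1.0)
_, _, margin_rescaled = max_margin_1d(-4.0, 4.0)
print(margin_original, margin_rescaled)
```

The rescaled representation has a margin four times as large, yet both models separate the two examples perfectly. This does not settle the general question, but it shows that the raw margin depends on how the features are scaled and constructed.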

        Best regards,
        mIHA

  5. hutao said

    hi Miha,

    I wonder how to run the demo (in c#) in Linux. I have no knowledge about C# at all.

    Also, I’d like to know how to make use of your library in C/C++ (in Linux environment).
    Could you give an example?

    Thanks a lot!!

    • Miha said

      Hi!

      If you have no knowledge of C# and you plan to use SvmLightLib on Linux, I suggest you simply use C++ (alternatively, you could try C# under Mono, but I guess this would only complicate things). The core of SvmLightLib is implemented in C/C++ anyhow, so it should compile with gcc (probably with minor modifications). I don’t have the time right now to prepare a C++ usage example, but I will do it as soon as time permits.

      Let me know if you manage to compile it with gcc : )

      Best regards,
      mIHA

  6. yinggong zhao said

    Hi, I want to use the DLL file in a C# project, but each time I try to add it, the VS IDE shows a dialog: “A reference to svmlightlib.dll could not be added. Please make sure that the file is accessible, and that it is a valid assembly or COM component.” Thanks very much!

    • Miha said

      Hi!

      I guess you get this error message when you try to add svmlightlib.dll as a reference to the project. You shouldn’t reference it explicitly at all: it is a native DLL, not a .NET assembly or COM component. Simply put the file in a folder on your disk (e.g. c:\svmlightlib\bin) and add the folder to the PATH environment variable. To do this, go to Control Panel/System/Advanced/Environment Variables and look up PATH under System variables. Double-click the entry and append something like “;c:\svmlightlib\bin” to the variable value. You might need to restart your computer; you will definitely need to restart the IDE.
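
To check whether the change took effect, one can look for the file in the folders listed in PATH. A small sketch; the helper name `find_on_path` is made up, and the check only covers PATH, not the other places Windows searches for DLLs.

```python
import os


def find_on_path(file_name, path_value=None):
    """Return the first folder in PATH that contains file_name, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for folder in path_value.split(os.pathsep):
        folder = folder.strip('"')  # PATH entries are sometimes quoted
        if folder and os.path.isfile(os.path.join(folder, file_name)):
            return folder
    return None
```

Calling `find_on_path("svmlightlib.dll")` from a freshly started console should return the folder that was appended; `None` suggests the new PATH value has not been picked up yet.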

      Best regards,
      mIHA

  7. Vivek said

    Hey Miha,

    Thanks for a wonderful wrapper for SVMLight. The API is very useful and easy to work with. The featureVector unloading routine has a memory leak (detected by VLD). In “void DeleteFeatureVector(int id)” call, you forgot to delete the feature_vector itself. Just adding “delete feature_vector;” before calling “feature_vectors.erase” did the job.

    Hope this helps
    cheers
    Vivek

    • Miha said

      Thanks, Vivek! : )

      I will include your bug fix into the code.

      Thanks again and best regards,
      mIHA
