1999 ConvolutionKernelsOnDiscreteStructures

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Convolution Kernel Function

Notes

Cited By

Quotes

Abstract

We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on a infinite set from kernels involving generators of the set. The family of kernels generated generalizes the family of radial basis kernels. It can also be used to define kernels in the form of joint Gibbs probability distributions. Kernels can be built from hidden Markov random fields, generalized regular expressions, pair-HMMs, or ANOVA decompositions. Uses of the method lead to open problems involving the theory of infinitely divisible positive definite functions. Fundamentals of this theory and the theory of reproducing kernel Hilbert spaces are reviewed and applied in establishing the validity of the method.

Many problems in statistics and pattern recognition demand that discrete structures like strings, trees, and graphs be classified or clustered based on similarity. To do this, it is desirable to have a method to extract real-valued features [math]\displaystyle{ \phi_1(x), \phi_2(x), ... }[/math] from any structure [math]\displaystyle{ x }[/math] in a class [math]\displaystyle{ X }[/math] of discrete structures. If finitely many features are extracted, the feature extraction process can be represented by a mapping from [math]\displaystyle{ X }[/math] into d-dimensional Euclidean space [math]\displaystyle{ R^d }[/math] and if infinitely many features are extracted, by a mapping into the Hilbert space of all square-summable sequences, [math]\displaystyle{ l_2 }[/math] In the latter case we get an infinite series representation of [math]\displaystyle{ x }[/math] much like the Fourier series representation of a function in [math]\displaystyle{ L_2 }[/math]. Here we present some methods for defining series representations for discrete structures using a general type of kernel function we call a convolution kernel.

References

  • C. Berg, J. Christensen, and P. Ressel. HarmonicAnalysisonSemi­groups: Theory of Positive Definite and Related Functions. Springer, (1984).
  • C.J.C. Burges. Atutorialonsupportvectormachinesforpattern recognition.DataMiningandKnowledgeDiscovery,2(2):121{167,1998.
  • D.Dudley.RealAnalysisandProbability.Wiley,1989.
  • R.Durbin,S.Eddy,A.Krogh,andG.Mitchison.BiologicalSequence Analysis:ProbabilisticModelsofProteinsandNucleicAcids.Cambridge UniversityPress,1998.
  • R.Elliott,L.Aggoun,andJ.Moore.HiddenMarkovModels,Estimation andControl.SpringerVerlag,1995.
  • WilliamFeller.AnIntroductiontoProbabilityTheoryanditsApplica­tions,volume1.JohnWiley,1971.
  • WilliamFeller.AnIntroductiontoProbabilityTheoryanditsApplica­tions,volume2.JohnWiley,1971.
  • C.FitzgeraldandR. Horn. Onfractionalhadamardpowersofpositive de.nitematrices.JournalofMathematicalAnalysisandApplications, 61:633{642,1977.
  • K.S.Fu.Syntacticpatternrecognitionandapplications.Prentice-Hall, EnglewoodCli.s,NJ,1982.
  • King-SunFuandTaylorL.Booth.Grammaticalinference:Introduc­tionandsurvey{parti.IEEETransactionsonSystems,Man, and Cybernetics,SMC-5(1):95{111,January1975.
  • King-SunFuandTaylorL.Booth.Grammaticalinference:Introduc­tionandsurvey{partii.IEEETransactionsonSystems,Man, and Cybernetics,SMC-5(4):409{423,July1975.
  • SGemanandDGeman.Stochasticrelaxations,Gibbsdistributionsand theBayesianrestorationofimages.IEEETrans.onPatternAnalysis andMachineIntelligence,6:721{742,1984.
  • F. Girosi. Anequivalencebetweensparseapproximationandsupport vectormachines.NeuralComputation,10(6):1455{1480,1998.
  • J.E.HopcroftandJ.D.Ullman.IntroductiontoAutomataTheory, Languages,andComputation.Addison-Wesley,Reading,Massachus­setts,1979.
  • T.Jaakkola,M.Diekhans,andD.Haussler.UsingtheFisherkernel methodtodetectremoteproteinhomologies.InProceedings oftheSev­enthInternationalConferenceonIntelligentSystemsforMolecularBi­ology,Aug1999.
  • T.JaakkolaandD.Haussler.Exploitinggenerativemodelsindiscrimi­nativeclassi.ers.InAdvancesinNeuralInformationProcessingSystems 11,SanMateo,CA,1998.MorganKau.mannPublishers.Toappear.
  • T.JaakkolaandD.Haussler.Probabilistickernelregressionmodels. InProc.oftheSeventhInt.WorkshoponAIandStatistics,1998.To appear.
  • S.Lauritzen.GraphicalModels.OxfordUniversityPress,1996.
  • M.Linial,N.Linial,N.Tishby,andG.Yona.Globalself-organizationof allknownproteinsequencesrevealsinherentbiologicalsignatures.jmb, 268(2):539{556,1997.
  • N.Linial,E.London,andY.Rabinovich. Thegeometryofgraphs andsomeofitsalgorithmicapplications.Combinatorica,15(2):215{245, (1995).
  • D.J.C.MacKay.Introductiontogaussianprocesses, 1997. Available fromhttp://wol.ra.phy.cam.ac.uk/mackay/.
  • L.M.at.e.HilbertSpaceMethodsinScienceandEngineering. Adam Hilger,1989.
  • T.PoggioandF.Girosi.Asparserepresentationforfunctionapproxi­mation.NeuralComputation,10(6):1445{1454,1998.
  • LawrenceR.Rabiner.AtutorialonhiddenMarkovmodelsandselected applicationsinspeechrecognition.Proc.oftheIEEE,77(2):257{286, February1989.
  • L.Rosen.Positivepowersofpositivepositivede.nitematrices.Cana­dianJournalofMathematics,48:196{209,1996.
  • DavidSanko.andJosephB.Kruskal.Timewarps,stringedits, andmacromolecules:thetheoryandpracticeofsequencecomparison. Addison-Wesley,1983.
  • B.Sch.olkopf,K.Sung,C.Burges,F.Girosi,P.Niyogi,T.Poggio, and Vladimir N. Vapnik. ComparingsupportvectormachineswithGaussiankernelsto radialbasisfunctionclassi.ers.IEEETransactionsonSignalProcessing, 45(11):2758{2765,November1997.
  • C. Scholkopf,J.C.Burges,andA.J.Smola.AdvancesinKernelMethods: SupportVectorLearning.MITPress,1999.
  • GilbertStrang.LinearAlgebraanditsApplications.HarcourtBrace Jovanovich,1986.

,

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
1999 ConvolutionKernelsOnDiscreteStructuresDavid HausslerConvolution Kernels on Discrete Structureshttp://www.cbse.ucsc.edu/sites/default/files/convolutions.pdf1999