Accurate and Efficient Gesture Spotting via Pruning and Subgesture Reasoning





Jonathan Alon, Vassilis Athitsos, and Stan Sclaroff

Computer Science Department
Boston University
Boston, MA 02215, USA

This research was supported in part through U.S. grants ONR N00014-03-1-0108, NSF IIS-0308213 and NSF EIA-0202067.

Abstract. Gesture spotting is the challenging task of locating the start and end frames of the video stream that correspond to a gesture of interest, while at the same time rejecting non-gesture motion patterns. This paper proposes a new gesture spotting and recognition algorithm that is based on the continuous dynamic programming (CDP) algorithm, and runs in real-time. To make gesture spotting efficient, a pruning method is proposed that allows the system to evaluate a relatively small number of hypotheses compared to CDP. Pruning is implemented by a set of model-dependent classifiers that are learned from training examples. To make gesture spotting more accurate, a subgesture reasoning process is proposed that models the fact that some gesture models can falsely match parts of other longer gestures. In our experiments, the proposed method with pruning and subgesture modeling is an order of magnitude faster and 18% more accurate compared to the original CDP algorithm.

1 Introduction

Many vision-based gesture recognition systems assume that the input gestures are isolated or segmented, that is, the gestures start and end in some rest state. This assumption makes the recognition task easier, but at the same time it limits the naturalness of the interaction between the user and the system, and therefore negatively affects the user's experience. In more natural settings the gestures of interest are embedded in a continuous stream of motion, and their occurrence has to be detected as part of recognition. This is precisely the goal of gesture spotting: to locate the start point and end point of a gesture pattern. Common applications of gesture spotting include command spotting for controlling robots [1], televisions [2], computer applications [3], and video games [4, 5].

Arguably, the most principled methods for spotting dynamic gestures are based on dynamic programming (DP) [3, 6, 7]. Finding the optimal matching between a gesture model and an input sequence using brute-force search would involve evaluating an exponential number of possible alignments. The key advantage of DP is that it can find the best alignment in polynomial time.



Fig. 1. Pruning (a, b): example dynamic programming table for matching an input stream (x axis) to a model gesture for the digit "6" (y axis). Likely observations are represented by black cells in the table (a). The cells remaining after pruning are shown in (b); in this example 87% of the cells (shown in white) were pruned. Subgesture reasoning (c): example false detection of the digit "5", which is similar to a subgesture of the digit "8".

This is achieved by reducing the problem of finding the best alignment to many subproblems that involve matching a part of the model to parts of the video sequence. The main novelty of our method is a pruning technique that eliminates the need to solve many of these subproblems. As a result, gesture spotting and recognition become both faster and more accurate: faster because a smaller number of hypotheses need to be evaluated; more accurate because many of the hypotheses that could have led to false matches are eliminated at an early stage. In Figure 1(b) the number of hypotheses evaluated by the proposed algorithm is proportional to the number of black pixels, and the number of hypotheses that are evaluated by a standard DP algorithm but are pruned by the proposed algorithm is proportional to the number of white pixels.

Comparing the matching scores and using class-specific thresholds, as is typically done [3, 6], is often insufficient for picking out the right model. We propose identifying, for each gesture class, the set of "subgesture" classes, i.e., the set of gesture models that are similar to subgestures of that class. While a gesture is being performed, it is natural for these subgesture classes to cause false alarms. For example, in the online digit recognition example depicted in Figure 1(c), the digit "5" may be falsely detected instead of the digit "8", because "5" is similar to a subgesture of the digit "8". The proposed subgesture reasoning can reliably recognize and avoid the bulk of those false alarms.

2 Related Work

Gesture spotting is a special case of the more general pattern spotting problem, where the goal is to find the boundaries (start points and end points) of patterns of interest in a long input signal. Pattern spotting has been applied to different types of input including text, speech [8], and image sequences [6].


There are two basic approaches to detection of candidate gesture boundaries: the direct approach, which precedes recognition of the gesture class, and the indirect approach, where spotting is intertwined with recognition. Methods that belong to the direct approach first compute low-level motion parameters such as velocity, acceleration, and trajectory curvature [5] or mid-level motion parameters such as human body activity [9], and then look for abrupt changes (e.g., zero-crossings) in those parameters to find candidate gesture boundaries.

In the indirect approach, the gesture boundaries are detected using the recognition scores. Most indirect methods [3, 7] are based on extensions of dynamic programming (DP) algorithms for isolated gestures (e.g., HMMs [10] and DTW [11]). In those methods, the gesture end point is detected when the recognition likelihood rises above some fixed or adaptive [3] threshold, and the gesture start point can be computed, if needed, by backtracking the optimal DP path. One such extension, continuous dynamic programming (CDP), was proposed by Oka [7]. In CDP, an input sequence is matched with a gesture model frame-by-frame. To detect a candidate gesture, the cumulative distance between them is compared to a threshold.

After a provisional set of candidates has been detected, a set of rules is applied to select the best candidate, and to identify the input subsequence with the gesture class of that candidate. Different sets of rules have been proposed in the literature: peak finding rules [6], spotting rules [12], and the user interaction model [13].

One problem that occurs in practice but is often overlooked is the false detection of gestures that are similar to parts of other longer gestures. To address this problem, [3] proposed two approaches. One is limiting the response time by introducing a maximum length of the non-gesture pattern that is longer than the largest gesture. Another is taking advantage of heuristic information to catch one's completion intentions, such as moving the hand out of the camera range or freezing the hand for a while. The first approach requires a parameter setting, and the second approach limits the naturalness of the user interaction. We propose instead to explicitly model the subgesture relationship between gestures. This is a more principled way to address the problem of nested gestures, which does not require any parameter setting or heuristics.

3 Gesture Spotting

In this section we will introduce the continuous dynamic programming (CDP) algorithm for gesture spotting. We will then present our proposed pruning and subgesture reasoning methods, which result in an order of magnitude speedup and an 18% increase in recognition accuracy.

3.1 Continuous Dynamic Programming (CDP)

Let M = (M_1, ..., M_m) be a model gesture, in which each M_i is a feature vector extracted from model frame i. Similarly, let Q = (Q_1, ..., Q_j, ...) be a continuous stream of feature vectors, in which each Q_j is a feature vector extracted from input frame j.


We assume that a cost measure d(i, j) ≡ d(M_i, Q_j) between two feature vectors M_i and Q_j is given. CDP computes the optimal path and the minimum cumulative distance D(i, j) between the model subsequence M_{1:i} and the input subsequence Q_{j':j}, j' ≤ j. Several ways have been proposed in the literature to recursively define the cumulative distance. The most popular definition is:

D(i, j) = min{D(i-1, j), D(i-1, j-1), D(i, j-1)} + d(i, j).    (1)

For the algorithm to function correctly, the cumulative distance has to be initialized properly. This is achieved by introducing a dummy gesture model frame 0 that matches all input frames perfectly, that is, D(0, j) = 0 for all j. Initializing this way enables the algorithm to trigger a new warping path at every input frame.

In the online version of CDP the local distance d(i, j) and the cumulative distance D(i, j) need not be stored as matrices in memory. It suffices to store for each model (assuming backtracking is not required) two column vectors: the current column col_j corresponding to input frame j, and the previous column col_{j-1} corresponding to input frame j-1. Every vector element consists of the cumulative distance D of the corresponding cell, and possibly other useful data such as the warping path length.
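To make Eq. 1 and the two-column online update concrete, here is a minimal Python sketch for a single gesture model; the Euclidean local cost, the toy data, and the acceptance threshold of 1.0 are our own placeholders rather than values from the paper.

```python
import numpy as np

def cdp_update(model, q_j, prev_col):
    """One online CDP step (Eq. 1) for a single gesture model.

    model    : list of m model feature vectors M_1..M_m
    q_j      : feature vector of the current input frame
    prev_col : cumulative distances D(0..m, j-1) from the previous input frame
    Returns the current column of cumulative distances D(0..m, j).
    """
    m = len(model)
    cur_col = np.empty(m + 1)
    cur_col[0] = 0.0                               # dummy frame 0 matches every input frame
    for i in range(1, m + 1):
        d_ij = np.linalg.norm(model[i - 1] - q_j)  # local cost d(i, j)
        cur_col[i] = min(prev_col[i],              # D(i,   j-1)
                         prev_col[i - 1],          # D(i-1, j-1)
                         cur_col[i - 1]) + d_ij    # D(i-1, j)
    return cur_col

# Stream the input frame by frame, keeping only two columns per model in memory.
model = [np.array([x, 0.0]) for x in range(5)]           # toy 5-frame model
prev = np.full(len(model) + 1, np.inf)
prev[0] = 0.0                                            # D(0, j) = 0 for all j
for q_j in (np.array([x, 0.1]) for x in range(20)):      # toy input stream
    prev = cdp_update(model, q_j, prev)
    if prev[-1] < 1.0:                                   # complete-path cost below a threshold
        print("candidate gesture ends here, cost =", round(float(prev[-1]), 3))
```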

3.2 CDP with Pruning (CDPP)

The CDP algorithm evaluates Eq. 1 for every possible i and j. A key observation is that for many combinations of i and j, either the feature-based distance d(i, j) or the cumulative distance D(i, j) can be sufficiently large to rule out all alignments going through cell (i, j). Our main contribution is that we generalize this pruning strategy by introducing a set of binary classifiers that are learned from training data offline. Those classifiers are then used to prune certain alignment hypotheses during online spotting. In our experiments, this pruning results in an order of magnitude speedup.

The proposed pruning algorithm is depicted in Algorithm 1. The input to the algorithm is input frame j, input feature vector Q_j, a set of model-dependent classifiers C_i, and the previous sparse column vector. The output is the current sparse column vector.

The concept of model-dependent classifiers C_i that are learned from training data offline, and are used for pruning during online spotting, is novel. Different types of classifiers can be used, including: subsequence classifiers, which prune based on the cumulative distance (or likelihood); transition classifiers, which prune based on the transition probability between two model frames (or states); and single observation classifiers, which prune based on the likelihood of the current observation. In our experiments we use single observation classifiers:

C_i(Q_j) = +1 if d(i, j) ≤ τ(i), and C_i(Q_j) = -1 if d(i, j) > τ(i),    (2)

where each τ(i) defines a decision stump classifier for model frame i.
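Eq. 2 amounts to a per-frame decision stump; a trivial Python rendering (names are ours):

```python
def observation_classifier(d_ij: float, tau_i: float) -> int:
    """Single observation classifier of Eq. 2: +1 keeps cell (i, j), -1 prunes it."""
    return +1 if d_ij <= tau_i else -1
```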


input: input frame j, input feature vector Q_j, classifiers C_i, and the previous sparse column vector <ind_{j-1}, list_{j-1}>.
output: current sparse column vector <ind_j, list_j>.

 1  i = 1;
 2  ptr = ind_{j-1}(0);
 3  while i ≤ m do
 4      if C_i(Q_j) == +1 then
 5          nl = new element;    // nl will be appended to the end of list_j
 6          nl.D = min{ind_j(i-1).D, ind_{j-1}(i-1).D, ind_{j-1}(i).D} + d(i, j);
 7          nl.i = i;
 8          append(list_j, nl);
 9          ind_j(i) = &list_j(i);    // & is the address-of operator, as in C
10          i = i + 1;
11      else
            // previous column empty
12          if isempty(list_{j-1}) then
13              break;
14          if ind_{j-1}(i) == NULL then
15              while ptr→next != NULL and ptr→next→i ≤ i do
16                  ptr = ptr→next;
17              end
                // reached the end of the previous column
18              if ptr→next == NULL then
19                  break;
20              i = ptr→next→i;
21          else
22              i = i + 1;
23          end
24      end
25  end

Algorithm 1: The CDPP algorithm.

The threshold τ(i) is estimated as follows: the model is aligned, using DTW, with all the training examples of gestures from the same class. The distances between model observation i and all the observations (in the training examples) which match observation i are saved, and the threshold τ(i) is set to the maximum among those distances. Setting the thresholds as specified guarantees that all positive training examples, when embedded in longer test sequences, will be detected by the spotting algorithm.
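The following Python sketch illustrates this threshold estimation step, assuming Euclidean distances between feature vectors; the small DTW routine and all names are ours, not the authors' implementation.

```python
import numpy as np

def dtw_path(model, example):
    """Standard DTW alignment; returns the optimal path as (model_frame, example_frame) pairs."""
    m, n = len(model), len(example)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for k in range(1, n + 1):
            d = np.linalg.norm(model[i - 1] - example[k - 1])
            D[i, k] = d + min(D[i - 1, k - 1], D[i - 1, k], D[i, k - 1])
    path, i, k = [], m, n                      # backtrack the optimal warping path
    while i > 0 and k > 0:
        path.append((i - 1, k - 1))
        step = int(np.argmin([D[i - 1, k - 1], D[i - 1, k], D[i, k - 1]]))
        if step == 0:
            i, k = i - 1, k - 1
        elif step == 1:
            i -= 1
        else:
            k -= 1
    return path[::-1]

def estimate_thresholds(model, training_examples):
    """tau(i) = maximum distance between model frame i and the training observations
    that DTW aligns to it, over all positive training examples of the same class."""
    tau = np.zeros(len(model))
    for example in training_examples:
        for i, k in dtw_path(model, example):
            tau[i] = max(tau[i], np.linalg.norm(model[i] - example[k]))
    return tau
```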

In order to maximize efficiency, we chose a sparse vector representation that enables fast individual element access, while keeping the number of operations proportional to the sparseness of the DP table (the number of black pixels in Fig. 1(b)). The sparse vector is represented by a pair <ind, list>, where ind is a vector of pointers of size m (the model sequence length), and is used to reference elements of the second variable, list.


The variable list is a singly linked list, where each list element is a pair that includes the cumulative distance D(i, j) and the index i of the corresponding model frame. The length of list corresponds to the number of black pixels in the corresponding column in Fig. 1(b).

We note that in the original CDP algorithm there is no pruning: only lines 5-10 are executed inside the while loop, and i is incremented by 1. In contrast, in CDPP, whenever the classifier outputs -1 and a hypothesis is pruned, i is incremented by an offset, such that the next visited cell in the current column will have at least one active neighbor from the previous column.
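A simplified Python sketch of the pruned column update follows. It stores each sparse column as a plain dict instead of the <ind, list> structure, and replaces the pointer-based skipping of Algorithm 1 with an explicit classifier test per cell; this produces the same sparse columns, though without the constant-factor savings of the linked-list representation. Names and the Euclidean local cost are ours.

```python
import numpy as np

INF = float("inf")

def cdpp_column(model, q_j, tau, prev_col):
    """One CDPP step for input feature vector q_j.

    prev_col, cur_col : sparse columns mapping a model frame index i (1-based)
    to the cumulative distance D(i, j-1), respectively D(i, j).
    """
    cur_col = {}
    for i in range(1, len(model) + 1):
        d_ij = np.linalg.norm(model[i - 1] - q_j)
        if d_ij > tau[i - 1]:
            continue                               # classifier outputs -1: prune cell (i, j)
        # Neighbours of cell (i, j); the dummy model frame 0 contributes D(0, .) = 0.
        left  = prev_col.get(i, INF)                            # D(i,   j-1)
        diag  = 0.0 if i == 1 else prev_col.get(i - 1, INF)     # D(i-1, j-1)
        below = 0.0 if i == 1 else cur_col.get(i - 1, INF)      # D(i-1, j)
        best = min(left, diag, below)
        if best < INF:                             # unreachable cells are simply not stored
            cur_col[i] = best + d_ij
    return cur_col

def is_firing(cur_col, m, accept_threshold):
    """A model fires at frame j if its complete-path cost D(m, j) is below the threshold."""
    return m in cur_col and cur_col[m] < accept_threshold
```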

Algorithm 1 is invoked separately for every gesture model M^g. For illustration purposes we show it for a single model. After the algorithm has been invoked for the current input frame j and for all the models, the end-point detection algorithm of Sec. 3.3 is invoked.

3.3 Gesture End Point Detection and Gesture Recognition

The proposed gesture end point detection and gesture recognition algorithm consists of two steps: the first step updates the current list of candidate gesture models; the second step uses a set of rules to decide if a gesture was spotted, i.e., if one of the candidate models truly corresponds to a gesture performed by the user. The end point detection algorithm is invoked once for each input frame j. In order to describe the algorithm we first need the following definitions:

– Complete path: a legal warping path W(M_{1:m}, Q_{j':j}) matching an input subsequence Q_{j':j} ending at frame j with the complete model M_{1:m}.
– Partial path: a legal warping path W(M_{1:i}, Q_{j':j}) that matches an input subsequence Q_{j':j} ending at the current frame j with a model prefix M_{1:i}.
– Active path: any partial path that has not been pruned by CDPP.
– Active model: a model g that has a complete path ending in frame j.
– Firing model: an active model g with a cost below the detection acceptance threshold.

– Subgesture relationship: a gesture g1 is a subgesture of gesture g2 if it is properly contained in g2. In this case, g2 is a supergesture of g1.

At the beginning of the spotting algorithm the list of candidates is empty. Then, at every input frame j, after all the CDP costs have been updated, the best firing model (if such a model exists) is considered for inclusion in the list of candidates, and existing candidates are considered for removal from the list. The best firing model will be different depending on whether or not subgesture reasoning is carried out, as described below. For every new candidate gesture we record its class, the frame at which it has been detected (the end frame), the corresponding start frame (which can be computed by backtracking the optimal warping path), and the optimal matching cost. The algorithm for updating the list of candidates is described below. The input to this algorithm is the current list of candidates, the state of the DP tables at the current frame (the active model hypotheses and their corresponding scores), and the lists of supergestures.


The output is an updated list of candidates. Steps that involve subgesture reasoning are used in the algorithm CDPP with subgesture reasoning (CDPPS) only, and are marked appropriately. A sketch of the update procedure is given after the list.

1. Find all firing models, and continue with the following steps only if the list of firing models is nonempty.

2. CDPPS only: conduct subgesture competitions between all pairs of firing models. If a firing model g1 is a supergesture of another firing gesture model g2, then remove g2 from the list of firing models. After all pairwise competitions the list of firing models will not contain any member which is a supergesture of another member.

3. Find the best firing model, i.e., the model with the best score.

4. For all candidates g_i perform the following four tests:

(a) CDPPS only: if the best firing model is a supergesture of any candidate g_i, then mark candidate g_i for deletion.

(b) CDPPS only: if the best firing model is a subgesture of any candidate g_i, then flag the best firing model to not be included in the list of candidates.
(c) If the score of the best firing model is better than the score of a candidate g_i, and the start frame of the best firing model occurred after the end frame of the candidate g_i (i.e., the best firing model and candidate g_i are non-overlapping), then mark candidate g_i for deletion.

(d) If the score of the best firing model is worse than the score of a candidate g_i, and the start frame of the best firing model occurred after the end frame of the candidate g_i (i.e., the best firing model and candidate g_i are non-overlapping), then flag the best firing model to not be included in the list of candidates.

5. Remove all candidates g_i that have been marked for deletion.

6. Add the best firing model to the list of candidates if it has not been flagged to not be included in that list.
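The candidate-list update above can be restated as the following Python sketch. Data-structure names are ours; scores are taken to be matching costs (lower is better), and supergestures[g] is the set of classes that contain class g as a subgesture.

```python
def update_candidates(candidates, firing, supergestures):
    """Steps 1-6 of the candidate update, including the CDPPS subgesture steps.

    candidates    : list of dicts {"class", "score", "start", "end"}
    firing        : dict mapping class -> {"score", "start", "end"} of models firing now
    supergestures : dict mapping class -> set of classes that contain it as a subgesture
    """
    if not firing:                                                      # step 1
        return candidates

    # Step 2 (CDPPS only): remove any firing model that has a firing supergesture.
    firing = {g: h for g, h in firing.items()
              if not any(g2 in supergestures.get(g, set()) for g2 in firing if g2 != g)}

    # Step 3: the best firing model is the one with the lowest matching cost.
    best_class = min(firing, key=lambda g: firing[g]["score"])
    best = firing[best_class]

    keep, include_best = [], True
    for cand in candidates:                                             # step 4
        non_overlapping = best["start"] > cand["end"]
        delete_cand = False
        if best_class in supergestures.get(cand["class"], set()):       # 4a: best is a supergesture of cand
            delete_cand = True
        if cand["class"] in supergestures.get(best_class, set()):       # 4b: best is a subgesture of cand
            include_best = False
        if non_overlapping and best["score"] < cand["score"]:           # 4c
            delete_cand = True
        if non_overlapping and best["score"] > cand["score"]:           # 4d
            include_best = False
        if not delete_cand:                                             # step 5
            keep.append(cand)

    if include_best:                                                    # step 6
        keep.append({"class": best_class, **best})
    return keep
```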

After the list of candidates has been updated, and if the list of candidates is nonempty, a candidate may be "spotted", i.e., recognized as a gesture performed by the user, if:

1. CDPPS only: all of its active supergesture models started after the candidate's end frame j'. This includes the trivial case, where the candidate has an empty supergesture list, in which case it is immediately detected.

2. all current active paths started after the candidate's detected end frame j'.

3. a specified number of frames have elapsed since the candidate was detected. This detection rule is optional and should be used when the system demands a hard real-time constraint. This rule was not used in our experiments.

Once a candidate has been detected, the list of candidates is reset (emptied), all active path hypotheses that started before the detected candidate's end frame are reset, and the entire procedure is repeated. To the best of our knowledge, the idea of explicit reasoning about the subgesture relationship between gestures, as specified in steps 2, 4a, and 4b of the candidate update procedure and step 1 of the end-point detection algorithm, is novel.


Fig. 2. Palm's Graffiti digits [14].

Fig. 3. Example model digits extracted using a colored glove.

4 Experimental Evaluation

We implemented continuous dynamic programming (CDP) [7] with a typical set of gesture spotting rules. In particular, we used a global acceptance threshold for detecting candidate gestures, and we used the gesture candidate overlap reasoning described in Sec. 3.3. This is the baseline algorithm, to which we compare our proposed algorithms. The proposed CDP with pruning algorithm (CDPP) is implemented as described in Sec. 3.2, with the same gesture spotting rules used in the baseline algorithm. The second proposed algorithm, CDPP with subgesture reasoning (CDPPS), includes the additional steps marked in Sec. 3.3.

We compare the baseline algorithm and the proposed algorithms in terms of efficiency and accuracy. Algorithm efficiency is measured by CPU time. Accuracy is evaluated by counting, for every test sequence, the number of correct detections and the number of false alarms. A correct detection corresponds to a gesture that has been detected and correctly classified. A gesture is considered to have been detected if its estimated end frame is within a specified temporal tolerance of 15 frames from the ground truth end frame. A false alarm is a gesture that either has been detected within tolerance but incorrectly classified, or whose end frame is more than 15 frames away from the correct end frame of that gesture.
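One way to score a single test sequence under these criteria is sketched below; the greedy matching of detections to ground-truth gestures is our own simplification of the protocol.

```python
def evaluate_sequence(detections, ground_truth, tolerance=15):
    """detections, ground_truth: lists of (gesture_class, end_frame) pairs.
    Returns (correct_detections, false_alarms) for one test sequence."""
    correct, false_alarms, used = 0, 0, set()
    for cls, end in detections:
        match = next((k for k, (g_cls, g_end) in enumerate(ground_truth)
                      if k not in used and g_cls == cls and abs(end - g_end) <= tolerance),
                     None)
        if match is None:
            false_alarms += 1     # wrong class, or end frame off by more than 15 frames
        else:
            used.add(match)
            correct += 1
    return correct, false_alarms
```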

To evaluate our algorithm we collected video clips of two users gesturing the ten digits 0-9 in sequence. The video clips were captured with a Logitech 3000 Pro camera using an image resolution of 240×320, at a frame rate of 30 Hz. For each user we collected two types of sequences, depending on what the user wore: three colored glove sequences and three long sleeves sequences (a total of six sequences per user). The model digit exemplars (Fig. 3) were extracted from the colored glove sequences, and were used for spotting the gestures in the long video streams. The range of the input sequence lengths is [1149, 1699] frames.


The range of the digit sequence lengths is [31, 90] frames. The range of the non-gesture sequence lengths (in between digits) is [45, 83] frames.

For the glove sequences the hand was detected and tracked using the glove color distribution. For the other sequences the hand was detected and tracked using color and motion. A hand mask was computed using skin and non-skin color distributions [15], and was applied to an error residual image obtained by a block-based optical flow method [16]. For every frame we computed the 2D hand centroid location and the angle between two consecutive hand locations. The feature vectors (M_i and Q_j) used to compute the local distance d(i, j) are the 2D positions only. The classifier used for pruning was a combination of two classifiers: one based on the 2D positions and the other based on the angle feature. Those classifiers were trained on the model digits in the offline step. To avoid overpruning we added 20 pixels to the thresholds of all position classifiers and an angle of 25 degrees to the thresholds of all angle classifiers.
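A sketch of the feature computation and of the combined pruning test with the added slack is given below; the angle wrap-around handling and all function signatures are ours, under the assumption of Euclidean distance for positions and absolute angular difference for angles.

```python
import numpy as np

def trajectory_features(centroids):
    """2D hand centroid positions and the angle (degrees) between consecutive positions."""
    pos = np.asarray(centroids, dtype=float)
    diffs = np.diff(pos, axis=0)
    angles = np.degrees(np.arctan2(diffs[:, 1], diffs[:, 0]))
    return pos[1:], angles                       # the first frame has no preceding position

def combined_prune_test(pos, ang, model_pos_i, model_ang_i, tau_pos_i, tau_ang_i,
                        pos_slack=20.0, ang_slack=25.0):
    """Keep model frame i only if both the position and the angle classifiers accept.
    pos_slack / ang_slack are the 20-pixel and 25-degree margins added to avoid overpruning."""
    pos_ok = np.linalg.norm(pos - model_pos_i) <= tau_pos_i + pos_slack
    ang_diff = abs((ang - model_ang_i + 180.0) % 360.0 - 180.0)
    ang_ok = ang_diff <= tau_ang_i + ang_slack
    return +1 if (pos_ok and ang_ok) else -1
```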

For the end-point detection algorithm we specified the following supergesture lists, which capture the subgesture relationship between digits:

Subgesture   Supergestures
"1"          {"4", "7", "9"}
"4"          {"2", "5", "6", "8", "9"}
"5"          {"8"}
"7"          {"2", "3", "9"}
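These lists can be encoded directly as the supergestures mapping used in the candidate-update sketch of Sec. 3.3 (the encoding itself is ours):

```python
# SUPERGESTURES[g] is the set of digit classes that contain digit g as a subgesture.
SUPERGESTURES = {
    "1": {"4", "7", "9"},
    "4": {"2", "5", "6", "8", "9"},
    "5": {"8"},
    "7": {"2", "3", "9"},
}

def is_subgesture(g1: str, g2: str) -> bool:
    """True if gesture g1 is a subgesture of gesture g2."""
    return g2 in SUPERGESTURES.get(g1, set())
```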

The experimental results are summarized in Table 1. For the baseline CDP algorithm we obtained 47 correct detections and 13 false matches. For the proposed CDPP algorithm without subgesture reasoning we obtained 51 correct detections and 9 false matches, and for the proposed CDPP with subgesture reasoning (CDPPS) we obtained 58 correct detections and 2 false matches. Compared to CDPP without subgesture reasoning, CDPPS corrected a single instance of the digit "3" initially confused as its corresponding subdigit "7", four instances of the digit "8" initially confused as its corresponding subdigit "5", and two instances of the digit "9" initially confused as its corresponding subdigit "1".

Table 1. Comparison of gesture spotting accuracy results between the baseline and the proposed gesture spotting algorithms. The accuracy results are given in terms of correct detections and false matches. The total number of gestures is 60.

Method               CDP   CDPP   CDPPS
Correct Detections    47     51      58
False Matches         13      9       2


In our experiments CDPP executed 14 times faster compared to CDP in terms of CPU time, assuming the features have already been extracted. The overall vision-based recognition system runs comfortably in real-time.

5 Conclusion and Future Work

This paper presented a novel gesture spotting algorithm. In our experiments, this novel algorithm is an order of magnitude faster and 18% more accurate compared to continuous dynamic programming. Our current work explores other classifiers that can be used for pruning. In order to further improve our system's accuracy, we plan to incorporate a module that can make use of the DP alignment information to verify that the candidate gesture that has been detected and recognized indeed belongs to the estimated class. This is commonly known as verification in word spotting for speech [8]. Finally, rather than specifying the subgesture relationships manually, we plan to learn them from training data.

References

1. Triesch, J., von der Malsburg, C.: A gesture interface for human-robot-interaction. In: Automatic Face and Gesture Recognition. (1998) 546–551
2. Freeman, W., Weissman, C.: Television control by hand gestures. Technical Report 1994-024, MERL (1994)
3. Lee, H., Kim, J.: An HMM-based threshold model approach for gesture recognition. PAMI 21 (1999) 961–973
4. Freeman, W., Roth, M.: Computer vision for computer games. In: Automatic Face and Gesture Recognition. (1996) 100–105
5. Kang, H., Lee, C., Jung, K.: Recognition-based gesture spotting in video games. Pattern Recognition Letters 25 (2004) 1701–1714
6. Morguet, P., Lang, M.: Spotting dynamic hand gestures in video image sequences using hidden Markov models. In: ICIP. (1998) 193–197
7. Oka, R.: Spotting method for classification of real world data. The Computer Journal 41 (1998) 559–565
8. Rose, R.: Word spotting from continuous speech utterances. In: Automatic Speech and Speaker Recognition - Advanced Topics. Kluwer (1996) 303–330
9. Kahol, K., Tripathi, P., Panchanathan, S.: Automated gesture segmentation from dance sequences. In: Automatic Face and Gesture Recognition. (2004) 883–888
10. Starner, T., Pentland, A.: Real-time American Sign Language recognition from video using hidden Markov models. In: SCV95. (1995) 265–270
11. Darrell, T., Pentland, A.: Space-time gestures. In: Proc. CVPR. (1993) 335–340
12. Yoon, H., Soh, J., Bae, Y., Yang, H.: Hand gesture recognition using combined features of location, angle and velocity. Pattern Recognition 34 (2001) 1491–1501
13. Zhu, Y., Xu, G., Kriegman, D.: A real-time approach to the spotting, representation, and recognition of hand gestures for human-computer interaction. CVIU 85 (2002) 189–208
14. Palm: Graffiti alphabet. (/us/products/input/)
15. Jones, M., Rehg, J.: Statistical color models with application to skin detection. IJCV 46 (2002) 81–96
16. Yuan, Q., Sclaroff, S., Athitsos, V.: Automatic 2D hand tracking in video sequences. In: WACV. (2005)

