Error correction via a postprocessor for continuous speech recognition


This paper presents a new technique for overcoming several types of speech recognition errors by post-processing the output of a continuous speech recognizer. The post-processor output contains fewer errors, thereby making interpretation by higher-level mo

Eric K. Ringger, James F. Allen

Department of Computer Science; University of Rochester; Rochester, New York 14627-0226

{ringger, james}@cs.rochester.edu

To appear in Proc. ICASSP-96, May 7-10, Atlanta, GA. © IEEE 1996

ABSTRACT

This paper presents a new technique for overcoming several types of speech recognition errors by post-processing the output of a continuous speech recognizer. The post-processor output contains fewer errors, thereby making interpretation by higher-level modules, such as a parser, in a speech understanding system more reliable. The primary advantage to the post-processing approach over existing approaches for overcoming SR errors lies in its ability to introduce options that are not available in the SR module's output. This work provides evidence for the claim that a modern continuous speech recognizer can be used successfully in "black-box" fashion for robustly interpreting spontaneous utterances in a dialogue with a human.

1. INTRODUCTION

Existing methods for continuous speech recognition do not perform as well on spontaneous speech as we would hope. Even state-of-the-art recognizers such as Sphinx-II [7] (note 1) and a recognizer built using HTK [14] (note 2) achieve less than 60% word accuracy on fluent speech collected from conversations about a specific problem with the Trains-95 system [1]. Here are a few examples of the kinds of errors that occur when recognizing spontaneous utterances. They are drawn from problem-solving dialogues that we have collected from users interacting with the Trains-95 system (note 3).

Some errors are simple one-for-one replacements, such as this one:

REF: RIGHT SEND THE TRAIN FROM MONTREAL TO CHARLESTON
HYP: RATE SEND THAT TRAIN FROM MONTREAL TO CHARLESTON

Here is an utterance with a replacement of a single word by multiple smaller words:

REF: GO FROM CHICAGO TO TOLEDO
HYP: GO FROM CHICAGO TO TO LEAVE AT

The following utterance contains a more complex example in which adjacent words are misrecognized and in which the hypothesized words overlap the boundary between the reference words:

REF: GREAT OKAY NOW WE COULD GO FROM SAY MONTREAL TO WASHINGTON
HYP: I'M GREAT OKAY NOW WEEK IT GO FROM CITY MONTREAL TO WASHINGTON

This work was supported by the University of Rochester CS Department and ONR/ARPA research grant number N00014-92-J-1512.

Note 1: For this experiment involving Sphinx-II, the acoustic model and the class-based language model were trained on ATIS data. Hence, some of the errors are attributable to the moderate occurrence of out-of-vocabulary (OOV) words.

Note 2: For this experiment involving the HTK-based recognizer, the acoustic model and the word-based language model were trained on the Trains Dialogue Corpus [6] (collected prior to the creation of the Trains-95 system).

Note 3: In the examples, the HYP tag indicates the SR system's hypothesis, and the REF tag indicates the reference transcription.

In addition, speech recognizers are increasingly being used as "black-boxes," having a clearly specified function and well-defined inputs and outputs but otherwise providing no hooks for altering or tuning internal operations, with the notable exception of the ability to add words to the recognizer's vocabulary. As an example of speech recognition as a black-box, several research labs have announced plans to make speech recognition available to the research community by running publicly accessible speech servers on the Internet. Such servers would likely employ a general-purpose language model and acoustic model. In order to employ them for a task involving words not available to the server's language model, a remote user would need some way to correct the errors committed by the black-box SR server.

This paper presents a new technique for overcoming several types of speech recognition errors by post-processing the output of a continuous speech recognizer. The post-processor output contains fewer errors, thereby making interpretation by higher-level modules, such as a parser, in a speech understanding system more reliable. The goal of this work is to contribute to successful understanding of spontaneous spoken utterances in human-computer dialogue by a conversational planning assistant called the Trains-95 system. Our objective is to reduce speech recognition errors by refining or even modifying the effective vocabulary of a speech recognizer. To achieve this, we regard the channel from the speaker to the output of the SR module as a noisy channel, and we adopt statistical techniques (some of them borrowed from statistical machine translation) for modeling that channel in order to correct some of the errors introduced there.

Why reduce recognition errors by post-processing the SR output? Why not simply better tune the SR's language model for the task? First, if the SR is a general-purpose black-box (running either locally or on the other side of a network on someone else's machine), modifying the decoding algorithm to incorporate the post-processor's model might not be an option. Using a general-purpose SR engine makes sense because it allows a system to deal with diverse utterances. If needed, the post-processor can tune the general-purpose hypothesis in a domain-specific or user-specific way (there is also room for adapting to domains and users on-line if the engine was not designed to do so). Porting an entire system to new domains only requires tuning the post-processor, and the general-purpose component with its models can be reused with little or no change. Because the post-processor is light-weight by comparison, the savings may be significant. Second, even if the SR engine's language model can be updated with new domain-specific data, the post-processor trained on the same new data can provide additional improvements in accuracy. Third, several human speech phenomena are poorly modeled by current continuous SR technology, and recognition is accordingly impaired. This suggests that the SR module does indeed belong as a component of the noisy channel. One poorly modeled phenomenon is assimilation of phonetic features. Most SR engines model phonemes in a context-dependent fashion (e.g., see [10]), and some attempt to model cross-word co-articulation effects (cf. [10] also). However, as speaking speeds vary, the SR's models may not be well suited to the affected speech signal. Such errors can be corrected by the post-processing techniques discussed here. Finally, the primary advantage to the post-processing approach over existing approaches for overcoming SR errors lies in its ability to introduce options that are not available in the SR module's output. Existing rescoring tactics cannot do so (cf. [4, 12]).

2. THE MODELS AND ALGORITHM

A statistical model for automatically translating individual sentences between two human languages was proposed by Brown et al. [3]. While this approach to translation has its critics, we can adapt the same idea to the process of transcribing a spoken utterance. We simply posit the existence of a string of English words $e_{1,n}$ ($= w_1, w_2, \ldots, w_n$) in the mind of the speaker. Those words are uttered and transmitted to the listening system's microphone. The sounds are then transcribed as a string of English words $e'_{1,n}$ by the SR component of the system. The channel beginning at the speaker and ending at the output of the SR module is a noisy channel, in which errors are frequently introduced in all segments of the channel, including the SR module, essentially at the word level. We adapt the statistical MT techniques to recover the original string of words and thereby correct some of the errors introduced in the channel. Figure 1 illustrates the relationship of the speaker, the channel, and the error-correcting post-processor. Brown et al. delineate their approach into three parts: a translation (or channel) model, a language model, and a search among possible source word sequences. We will describe each component of our approach to SR post-processing.

We adopt a channel model that describes some of the effects on utterances that pass through the noisy channel ending with the speech recognizer. Specifically, it accounts for frequent errors such as simple word/word confusions and short phrasal and segmentation problems (e.g., one-to-many word substitutions and many-to-one word concatenations). In addition to the channel model, we present a suitable search algorithm that uses the model (together with a source language model) to find the most likely correction for a given word sequence from the SR module. We have built a post-processor that employs these models and have wedged it into the interpretation pipeline of the Trains-95 system just behind the SR module. This implementation of the post-processor can receive input from the SR module incrementally as the SR decoder improves its primary hypothesis. The post-processor also communicates with the Trains-95 parser in an incremental fashion, backing up occasionally where partial solutions change on the fly.

The post-processor repairs utterances according to the probability estimates acquired from training data. If the training set consists of words from a task-specific vocabulary, then the post-processor will map the general-purpose vocabulary of the SR module to the task-specific vocabulary. If the training set consists of words from another domain, then the post-processor will map the SR vocabulary to the vocabulary of the other domain. If the recognizer suggests a word that was not observed as a misrecognition in the post-processor's training set, then the post-processor will simply forward the unknown word to subsequent components. If, however, that word is known to be frequently misrecognized, then the post-processor will correct it to the appropriate in-domain word.

By applying Bayes' rule, we derive a simple expression for the most likely pre-channel sequence $\hat{e}_{1,n}$. The derivation is similar to the derivation of the statistical approach to SR (as explained in [2, 8]):

$$\hat{e}_{1,n} = \arg\max_{e_{1,n}} P[e_{1,n}] \, P[e'_{1,n} \mid e_{1,n}] \qquad (1)$$
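For readers who want the intermediate steps, the expression in (1) can be reached in the usual noisy-channel way. The lines below are a sketch that uses only the symbols already defined ($e_{1,n}$ for the pre-channel words, $e'_{1,n}$ for the post-channel words); the denominator $P[e'_{1,n}]$ is dropped in the last step because it does not depend on the candidate $e_{1,n}$.

```latex
\begin{align*}
\hat{e}_{1,n}
  &= \arg\max_{e_{1,n}} P[e_{1,n} \mid e'_{1,n}] \\
  &= \arg\max_{e_{1,n}} \frac{P[e_{1,n}] \, P[e'_{1,n} \mid e_{1,n}]}{P[e'_{1,n}]} \\
  &= \arg\max_{e_{1,n}} P[e_{1,n}] \, P[e'_{1,n} \mid e_{1,n}]
\end{align*}
```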

The first factor, $P[e_{1,n}]$, models the formation of English utterances by the speaker. It is the listener's model of the speaker's language. The second factor, $P[e'_{1,n} \mid e_{1,n}]$, models the behavior of the channel. For a sizable vocabulary, adequately estimating the probability distributions that model the channel and the speaker's language requires mammoth amounts of data; therefore, it is necessary to approximate through independence assumptions. Several assumptions are possible, and we will begin with a basic set of assumptions before suggesting others.

2.1. First Approximation

For a first approximate language model, we use a word bigram model.

$$\hat{P}[e_{1,n}] = \prod_{i=0}^{n-1} P[w_{i+1} \mid w_i] \qquad (2)$$

As a first approximation channel model, we assume that each word in $e'_{1,n}$ is simply a transmitted version of the word with the corresponding position in $e_{1,n}$. Thus,

$$\hat{P}[e'_{1,n} \mid e_{1,n}] = \prod_{i=1}^{n} P[w'_i \mid w_i] \qquad (3)$$

(We say that a word is aligned with the word it produces.) We also require a method for searching among possible source utterances $e_{1,n}$ for the most likely correction of the given word sequence, i.e., the one that yields the greatest value of $P[e_{1,n}] \, P[e'_{1,n} \mid e_{1,n}]$. We use a Viterbi beam-search for this purpose (cf. [5, 11]).
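The first-approximation models and the search can be illustrated with a small Python sketch. It is not the authors' implementation: the function name `correct`, the start token `<s>` standing in for $w_0$, the probability floor for unseen bigrams, and every probability in the toy tables are assumptions made for illustration. The sketch scores each candidate source word with the bigram model (2) times the one-for-one channel model (3), keeps the best path per last word (the Viterbi step), and prunes to a fixed beam width.

```python
import math

# Toy tables with made-up probabilities (assumptions, not values from the paper).
# CHANNEL[w][w_prime] approximates P[w' | w]; BIGRAM[prev][w] approximates P[w | prev].
CHANNEL = {
    "RIGHT": {"RIGHT": 0.7, "RATE": 0.3},
    "RATE": {"RATE": 1.0},
    "SEND": {"SEND": 1.0},
    "THE": {"THE": 0.6, "THAT": 0.4},
    "THAT": {"THAT": 1.0},
    "TRAIN": {"TRAIN": 1.0},
}
BIGRAM = {
    "<s>": {"RIGHT": 0.5, "RATE": 0.01},
    "RIGHT": {"SEND": 0.5},
    "RATE": {"SEND": 0.01},
    "SEND": {"THE": 0.6, "THAT": 0.05},
    "THE": {"TRAIN": 0.5},
    "THAT": {"TRAIN": 0.1},
}


def correct(hyp, channel, bigram, beam_width=5, floor=1e-6, start="<s>"):
    """Return the most likely source word sequence for the recognizer output `hyp`."""
    # Invert the channel table: which source words can produce each observed word?
    sources = {}
    for w, outputs in channel.items():
        for w_prime in outputs:
            sources.setdefault(w_prime, set()).add(w)

    # Beam state: last source word -> (log score, source sequence so far).
    beam = {start: (0.0, [])}
    for w_prime in hyp:
        if w_prime in sources:
            candidates = {w: channel[w][w_prime] for w in sources[w_prime]}
        else:
            # A word never seen as a misrecognition is forwarded unchanged.
            candidates = {w_prime: 1.0}
        new_beam = {}
        for w, p_channel in candidates.items():
            best = None
            for prev, (score, seq) in beam.items():
                p_lm = bigram.get(prev, {}).get(w, floor)
                total = score + math.log(p_lm) + math.log(p_channel)
                if best is None or total > best[0]:
                    best = (total, seq + [w])
            new_beam[w] = best  # Viterbi: keep only the best path ending in w
        # Beam pruning: keep the highest-scoring states only.
        ranked = sorted(new_beam.items(), key=lambda kv: kv[1][0], reverse=True)
        beam = dict(ranked[:beam_width])
    return max(beam.values(), key=lambda item: item[0])[1]


print(correct(["RATE", "SEND", "THAT", "TRAIN"], CHANNEL, BIGRAM))
# prints ['RIGHT', 'SEND', 'THE', 'TRAIN']
```

Working in log space avoids underflow on long utterances, and keeping one path per last word is what makes the bigram search exact before pruning is applied.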

2.2. Enhancements to the Models

To improve the language model, we use higher-order n-grams, thereby assuming that each word in $e_{1,n}$ is dependent on its $n-1$ predecessors. We also use back-off n-gram models for combating the problem of sparse training data [9]. For the channel model, we relax the constraint that replacement errors be aligned on a word-by-word basis, since not all recognition errors consist of simple replacement of one word by another. Some errors appear as the break-up of one word into shorter words. Other errors involve the erroneous concatenation of two or more words to make a longer word. We will use the following utterance from the Trains-95 dialogues as an example.

REF: GO FROM CHICAGO TO TOLEDO
HYP: GO FROM CHICAGO TO TO LEAVE AT

[Figure 1. Recovering Word-Sequences Corrupted in a Noisy Channel. Diagram: Speaker -> e -> Noisy Channel -> e' -> Error-Correcting Post-Processor -> ê]

Following Brown et al., we refer to a picture such as Figure 2 as an alignment. We use an alignment to indicate the source words in the REF sequence for each of the words in the HYP sequence.

REF: GO FROM CHICAGO TO TOLEDO
HYP: GO FROM CHICAGO TO TO LEAVE AT

Figure 2. Alignment of a Hypothesis and the Reference Transcription.

For alignments, we use the following notation: we write the post-channel transcription ($e'_{1,n}$) followed by the pre-channel transcription ($e_{1,n}$), separated by a vertical bar and enclosed in parentheses. We also refer to the number of post-channel words produced by a pre-channel word in a particular alignment as the fertility of that pre-channel word. Following each of the pre-channel words, we provide its fertility in the current alignment in parentheses. Alignments are easily computed using a dynamic programming algorithm for word sequence alignment. Returning to our example, we have the alignment:

(GO FROM CHICAGO TO TO LEAVE AT | GO(1) FROM(1) CHICAGO(1) TO(1) TOLEDO(3))

To augment our channel model, we require a fertility model $P[k \mid w]$ that indicates how likely each word $w$ in the pre-channel vocabulary will have a particular fertility $k$ (note 4). When a word's fertility $k$ is an integer value between two and five, it indicates that the pre-channel word resulted in multiple post-channel words. When a word's fertility is one, then the word accounts for exactly one post-channel word. When a word's fertility is a fraction $1/n$ (for $2 \le n \le 5$), then the word and $n-1$ neighboring words have grouped together to result in a single post-channel word. We call this situation fractional fertility. For example, a word with $k = 1/3$ indicates the situation in which this word and two neighboring source words contribute to one word in the hypothesis; i.e., each word accounts for one-third of the post-channel word. When a word's fertility is a fraction $m/n$ (for $2 \le m \ne n \le 5$), then the word and $n-1$ neighboring pre-channel words have grouped together to result in $m$ post-channel words. The latter case can be used to handle arbitrary segmentation errors. For example, a word with $k = 3/2$ indicates that this word and a neighboring source word contribute to three words in the hypothesis; thus, we can imagine each word accounting for three-halves of the post-channel words. A concrete example of this alignment is (TO LEAVE DOING | TOLEDO(3/2) IN(3/2)).

Note 4: Values higher than five are ignored, since they are very rare.

To understand how fertility models are used, we need to extend the basic search algorithm. As before, the algorithm searches for an optimal source utterance $e_{1,n}$, modulo the beam pruning. This extended search builds possible source sequences $e_{1,n}$ one word at a time using $e'_{1,n}$ for guidance as before. Each word in $e'_{1,n}$ is exploded (or collapsed with neighbors) using all possible combinations. The hypotheses are scored according to (1) the LM and (2) the channel model for one-for-one replacements or the fertility model for other kinds of replacements. As before, dynamic programming on partial source sentences and beam pruning will make the search efficient. Observe that the fertility model scores only the number of words used to replace a particular word. It actually relies on the language model to score the contents of the replacement. This is motivated by the related approach of Brown et al., who appear to have taken this direction in order to avoid the problems of gathering statistics from hopelessly sparse data.

3. EXPERIMENTAL RESULTS

The post-processor has been implemented to use the simple one-for-one channel model and a back-off bigram language model. The channel model incorporating fertility is work in progress. The language model was trained on hand-transcribed utterances from the Trains-95 dialogues. The channel model was constructed by automatically aligning the output of Sphinx-II (having fixed language and acoustic models) with the hand transcriptions and by tabulating substitutions. To test the post-processor, an independent set of utterances was held out for evaluation. The cross-validated performance of Sphinx-II alone and in tandem with the post-processor is depicted in Figure 3. Sphinx-II's class-based language model was trained only on data from the ATIS Spoken Language corpora. Also illustrated are the amounts of training data required by the post-processor to make a particular contribution to word recognition accuracy. This validates the claim that the post-processor can make a significant impact in tuning the SR if the SR cannot be modified, as we have discussed. Also, equivalent amounts of training data can be used with comparable impact in the post-processor as in the language model of the SR. Furthermore, preliminary results indicate that if the language model of the SR can indeed be modified, then the post-processor can still significantly improve word recognition accuracy. Hence the post-processor is in neither case redundant.
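The channel-model construction described in Section 3 (align recognizer output with hand transcriptions, then tabulate substitutions) might look roughly like the following Python sketch. The paper does not specify the alignment costs or the estimator, so the unit-cost edit-distance alignment, the relative-frequency estimate, and the names `align` and `tabulate_channel` are all assumptions; insertions and deletions are simply skipped here, which matches only the one-for-one channel model.

```python
from collections import Counter, defaultdict


def align(ref, hyp):
    """Align two word lists by unit-cost edit distance (dynamic programming).

    Returns (ref_word, hyp_word) pairs; None marks an insertion or deletion.
    """
    n, m = len(ref), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diagonal, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner, preferring the diagonal move.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))  # reference word with no counterpart
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # extra hypothesis word
            j -= 1
    pairs.reverse()
    return pairs


def tabulate_channel(utterance_pairs):
    """Estimate P[w' | w] by relative frequency over aligned (REF, HYP) pairs."""
    counts = defaultdict(Counter)
    for ref, hyp in utterance_pairs:
        for ref_word, hyp_word in align(ref, hyp):
            if ref_word is not None and hyp_word is not None:
                counts[ref_word][hyp_word] += 1
    return {
        w: {w_prime: c / sum(outs.values()) for w_prime, c in outs.items()}
        for w, outs in counts.items()
    }


REF = "RIGHT SEND THE TRAIN FROM MONTREAL TO CHARLESTON".split()
HYP = "RATE SEND THAT TRAIN FROM MONTREAL TO CHARLESTON".split()
model = tabulate_channel([(REF, HYP)])
print(model["RIGHT"], model["THE"])
# prints {'RATE': 1.0} {'THAT': 1.0}
```

With a single training pair every observed substitution gets probability 1.0; a realistic table would need many aligned utterances, which is the training-data dependence that Figure 3 reports.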


[Figure 3. Post-Processor Performance. Plot of % Word Accuracy (axis from 56 to 68) against # Trains-95 Words in Training Set (axis from 0 to 12000).]
