Empirical project monitor A tool for mining multiple project data
Empirical Project Monitor: A Tool for Mining Multiple Project Data
Masao Ohira, Reishi Yokomori, Makoto Sakai, Ken-ichi Matsumoto, Katsuro Inoue, Koji Torii
Nara Institute of Science and Technology
ohira@empirical.jp, {matumoto,torii}@is.aist-nara.ac.jp
Graduate School of Information Science and Technology, Osaka University
{yokomori,inoue}@ist.osaka-u.ac.jp
SRA Key Technology Laboratory, Inc.
sakai@sra.co.jp
Abstract
Project management for effective software process improvement must be based on quantitative data. However, because data collection for measurement requires high costs and collaboration with developers, it is difficult to collect coherent, quantitative data continuously and to utilize the data for practicing software process improvement. In this paper, we describe Empirical Project Monitor (EPM), which automatically collects and measures data from three kinds of repositories in widely used software development support systems such as configuration management systems, mailing list managers, and issue tracking systems. By providing integrated measurement results graphically, EPM helps developers and managers keep projects under control in real time.
1 Introduction
In recent years, software process improvement has been gaining increasing attention in software development. Its practice in software organizations consists of repeatedly measuring development activities, finding potential problems in the processes, assessing improvement plans, and providing feedback into the processes. Project management for effective software process improvement must be based on quantitative data.
Many software measurement methods have been proposed to better understand, monitor, control, and predict software processes and products [4]. For instance, the Goal-Question-Metric (GQM) paradigm [2] provides a sophisticated measurement technique. GQM guides practitioners to set up measurement goals, create questions based on the goals, and determine measurement models and procedures based on the questions. Measurement based on GQM is a logical and reasonable method.
In practice, however, members who participate in measurement activities must attend to every last detail of the measurement processes. Data collection for measurement generally requires high costs and collaboration with developers. It is difficult to collect coherent, quantitative data continuously and, moreover, to utilize the collected data for practicing software process improvement. Few studies have proposed measurement tools for dealing with data from many projects, especially in large-scale software organizations.
As a measurement-based approach to the above issues, we have been studying empirical software engineering [1, 3], which evaluates various technologies and tools based on quantitative data obtained through actual use. Our goal is to develop an environment composed of a variety of tools for supporting measurement-based software process improvement, which we call the Empirical Software Engineering Environment (ESEE).
In this paper, we introduce Empirical Project Monitor (EPM) as a partial implementation of ESEE. EPM automatically collects and measures quantitative data from three kinds of repositories in widely used software development support systems such as configuration management systems, mailing list managers, and issue tracking systems. By collecting such data automatically and providing integrated measurement results graphically, EPM helps developers and managers keep their projects under control in real time.
2 Empirical Project Monitor (EPM)
We have developed Empirical Project Monitor (EPM) [9], which automatically collects and analyzes data from multiple software repositories. Figure 1 shows the architecture of EPM in the ESEE framework. The ESEE framework is designed to support measurement-based process improvement in software organizations by providing various pluggable tools. Following the ESEE framework, EPM consists of four components: data collection, format translation, data store, and data analysis/visualization. This section gives an overview of EPM and the basic data flow through EPM.
Figure 1. The architecture of EPM in the ESEE framework
Automatic data collection: EPM automatically collects multiple project data from three kinds of repositories in widely used software development support systems. For instance, EPM collects versioning histories from configuration management systems (e.g. CVS¹), mail archives from mailing list managers (e.g. Mailman², Majordomo³, fml⁴), and issue tracking records from (bug) issue tracking systems (e.g. GNATS⁵, Bugzilla⁶). Because these data are accumulated through everyday development activities using common GUI tools (e.g. SourceShare™⁷, WinCVS⁸), developers and managers do not need to do additional work for data collection. Also, introducing EPM into projects or organizations does not incur high costs, because the systems that serve as the sources of data collection are open source freeware.
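As an illustration of the collection step, the sketch below extracts checkin events from text in the classic `cvs log` layout (a `revision` line followed by a `date: ...; author: ...;` line). It is not EPM's actual collector: the function name `parse_checkins`, the regular expressions, and the sample log are assumptions made for this example, and real `cvs log` output differs in detail between CVS versions.

```python
import re
from datetime import datetime

# Assumed line shapes of `cvs log` output; adjust for the CVS version in use.
REVISION_RE = re.compile(r"^revision (\S+)$")
DATE_RE = re.compile(
    r"^date: (\d{4})[/-](\d{2})[/-](\d{2}) (\d{2}):(\d{2}):(\d{2})[^;]*;"
    r"\s+author: ([^;]+);"
)


def parse_checkins(log_text):
    """Return one dict per revision found in the log text (hypothetical helper)."""
    checkins = []
    revision = None
    for line in log_text.splitlines():
        match = REVISION_RE.match(line)
        if match:
            revision = match.group(1)
            continue
        match = DATE_RE.match(line)
        if match and revision is not None:
            year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
            checkins.append({
                "revision": revision,
                "time": datetime(year, month, day, hour, minute, second),
                "author": match.group(7).strip(),
            })
            revision = None  # wait for the next "revision" line
    return checkins


# Invented sample data, for illustration only.
SAMPLE_LOG = """\
revision 1.2
date: 2004/03/02 10:15:00;  author: dev_a;  state: Exp;  lines: +10 -2
fix parser
----------------------------
revision 1.1
date: 2004/03/01 09:00:00;  author: dev_b;  state: Exp;
initial import
"""

checkins = parse_checkins(SAMPLE_LOG)
for checkin in checkins:
    print(checkin["revision"], checkin["time"].isoformat(), checkin["author"])
```

Because the input is only the log that developers already produce through everyday work, a collector of this kind adds no extra steps for them, which is the point the paragraph above makes.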
Format translation and data store: EPM converts the collected data into an XML format called the standardized empirical software engineering data, so that EPM can deal with not only the above three kinds of software repositories but also various other kinds of repositories according to the purposes of measurement. Data from other systems become available through small adjustments of parameters. The data converted into the XML format are stored in a PostgreSQL⁹ database.
Analysis and visualization: EPM analyzes the data stored in the PostgreSQL database. For instance, to analyze data related to CVS, EPM extracts process data about events such as checkins and checkouts, transitions of source code size, version histories of components, and so forth. EPM then visualizes various measurement results, such as the growth of lines of code and the relationship between checkins and checkouts. EPM also provides summaries of each repository, such as information from CVS logs. All the measurement results are available through common web browsers (e.g. see Figure 2), so that users can easily share the results.
¹ CVS, /
² Mailman, /
³ Majordomo, /majordomo/
⁴ fml, /index.html.en
⁵ GNATS, /software/gnats/
⁶ Bugzilla, /
⁷ SourceShare™, /
⁸ WinCVS, /
In this way, EPM helps users obtain quantitative data at low cost in real time and provides them with various measurement results for understanding the current development status. This would help users keep their projects under control.
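The format translation and data store steps described in this section could be sketched as follows. The paper does not show EPM's XML schema or database layout, so the element name `event`, its attributes, and the table `event` are hypothetical. The standard-library `sqlite3` module stands in for PostgreSQL only so that the example runs without a database server.

```python
import sqlite3
import xml.etree.ElementTree as ET


def to_xml(events):
    """Serialize repository events into one XML document (hypothetical schema)."""
    root = ET.Element("events")
    for event in events:
        ET.SubElement(
            root, "event",
            source=event["source"], kind=event["kind"],
            time=event["time"], actor=event["actor"],
        )
    return ET.tostring(root, encoding="unicode")


def store(xml_text, conn):
    """Load the XML document into a relational table; return the number of rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event "
        "(source TEXT, kind TEXT, time TEXT, actor TEXT)"
    )
    rows = [
        (el.get("source"), el.get("kind"), el.get("time"), el.get("actor"))
        for el in ET.fromstring(xml_text).iter("event")
    ]
    conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Invented events from the three kinds of repositories.
events = [
    {"source": "cvs", "kind": "checkin", "time": "2004-03-02T10:15:00", "actor": "dev_a"},
    {"source": "mailman", "kind": "mail", "time": "2004-03-02T11:00:00", "actor": "dev_b"},
    {"source": "gnats", "kind": "issue_raised", "time": "2004-03-03T09:30:00", "actor": "dev_b"},
]

conn = sqlite3.connect(":memory:")
stored = store(to_xml(events), conn)
per_source = dict(conn.execute("SELECT source, COUNT(*) FROM event GROUP BY source"))
print(stored, per_source)
```

Routing every repository through one intermediate format means a new data source only needs its own converter, while the storage and analysis code stays unchanged.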
3 Visualizations of measurement results
Data mining techniques for software repositories have been proposed to understand the reasons for software changes [7], to identify how communication delays among developers in physically distributed environments affect software development [8], to detect potential software changes and incomplete changes [11], and so forth. In contrast to these tools, the distinguishing features of EPM are that it visualizes combinations of measurement results from three kinds of software repositories and that it can deal with data from multiple projects simultaneously.
Figure 2. Measurement results through web browsers
⁹ PostgreSQL, /
3.1 Combinations of measurement results
In addition to providing visualizations of measurement results from each software repository, EPM also visualizes combinations of measurement results from the three kinds of repositories. Two examples follow.
Bug issues and checkins: Figure 3 represents the relationship between the transition of the cumulative total of issues (the line graph) and the times of checkins (the gray vertical lines on the X-axis) in our EASE project [6]. The numbers of issues and checkins are measured from the repositories in GNATS and CVS, respectively. A checkin often occurs after bug issues are reported, because developers try to modify or resolve the issues. The graph helps users (developers and managers) recall the situation in which issues were raised against each file version. Conversely, if the graph indicates that issues were raised after checkins, the files checked in to CVS may themselves include bugs.
Figure 3. Relationship between issues and checkins
Bug issues and e-mails among developers: Figure 4 illustrates the communication history among developers in the EASE project. The black line graph is the transition of the cumulative total of e-mails exchanged through Mailman. The shorter and longer dashed vertical lines represent when bug issues were raised and resolved, respectively. The light-gray vertical lines indicate when developers checked files in to CVS. From the graph, users can confirm the state of communication among developers and identify the file versions that might have problems. Because discussions on issues usually become active when issues are reported to an issue tracking system, [...] Communication problems among developers decrease software productivity and reliability [8].
Figure 4. History of bug issues and e-mails among developers
The integrated measurement results based on data from configuration management systems, mailing list managers, and issue tracking systems help developers understand current and past events in development activities.
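The combination of issue and checkin data discussed in this section can be made concrete with a small sketch. It computes the cumulative total of issues per day (the line graph of Figure 3) and lists the checkins that were followed by a new issue within a short window. The two-day window, the function names, and the sample dates are assumptions for illustration and do not come from EPM.

```python
from datetime import date, timedelta
from itertools import accumulate


def cumulative_by_day(days, start, end):
    """Cumulative count of events for each day from start to end, inclusive."""
    counts = [0] * ((end - start).days + 1)
    for day in days:
        if start <= day <= end:
            counts[(day - start).days] += 1
    return list(accumulate(counts))


def checkins_followed_by_issues(checkin_days, issue_days, window_days=2):
    """Checkins with at least one issue raised in the following window (assumed rule)."""
    window = timedelta(days=window_days)
    return [
        checkin for checkin in checkin_days
        if any(checkin < issue <= checkin + window for issue in issue_days)
    ]


# Invented sample data.
issue_days = [date(2004, 3, 1), date(2004, 3, 1), date(2004, 3, 4)]
checkin_days = [date(2004, 3, 2), date(2004, 3, 3), date(2004, 3, 6)]

cumulative = cumulative_by_day(issue_days, date(2004, 3, 1), date(2004, 3, 6))
flagged = checkins_followed_by_issues(checkin_days, issue_days)
print(cumulative)
print(flagged)
```

A flagged checkin is only a hint that the checked-in files may contain bugs; as the text notes, the graph supports the developers' own judgment rather than replacing it.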
3.2 Visualizations of multiple project data
Comparing current projects with past ones would help managers estimate the progress of projects and detect unusual status in projects.
Comparison of measurement results among multiple projects: EPM makes measurement results comparable across multiple projects. Figure 5 represents the growth of lines of code in two projects (the upper line: SPARS [10]; the lower line: EASE). Both projects have been proceeding as collaborative research between the authors' universities and several software companies, and some researchers and developers have been participating in both. Although the two projects actually have different purposes and aspects, suppose here that they have been developing software systems under similar conditions. The project managers can confirm some common characteristics and roughly estimate the progress of the later project (EASE) from the graph. For instance, SPARS has two phases in which it evolved rapidly toward releasing major versions. EASE has just released its first major version. The managers can easily guess the near future of EASE's progress: the development of EPM will stop for a while in order to test EPM, to reconsider the design, and so forth.
Figure 5. Comparison of two projects
Distribution maps of multiple projects: Using measurement results from the three kinds of repositories in multiple projects, EPM can generate distribution maps. Figure 6 is a distribution map of 100 Open Source Software Development (OSSD) projects¹⁰,¹¹, which represents the relationship between lines of code (the X-axis) and the number of checkins (the Y-axis). Suppose here that these projects are managed by one software organization. The graph can be used to help managers identify "unusual" projects that show extremely high or low values.
¹⁰ [...], /
¹¹ We selected the 100 projects in Figure 6 randomly from the list of most active projects in [...].
Figure 6. Distribution map of 100 OSSD projects
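The paper leaves the identification of "unusual" projects to visual inspection of the distribution map. One possible automated counterpart, shown here purely as an assumption and not as EPM's method, is to compare each project's ratio of checkins to lines of code with the median ratio and to flag projects that deviate by more than a chosen multiple of the median absolute deviation. The threshold of 3.0 and the project data are invented.

```python
from statistics import median


def unusual_projects(projects, threshold=3.0):
    """Flag projects whose checkins-per-LoC ratio is far from the median.

    projects maps a project name to a (lines_of_code, checkins) pair.
    """
    ratios = {name: checkins / loc for name, (loc, checkins) in projects.items()}
    center = median(ratios.values())
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(ratio - center) for ratio in ratios.values())
    if mad == 0:
        return []
    return sorted(
        name for name, ratio in ratios.items()
        if abs(ratio - center) / mad > threshold
    )


# Invented sample: project "d" has far more checkins for its size than the rest.
projects = {
    "a": (10000, 200),
    "b": (20000, 420),
    "c": (5000, 95),
    "d": (8000, 1600),
    "e": (15000, 310),
}

print(unusual_projects(projects))
```

A rule like this would only shortlist candidates; a manager would still look at the map to decide whether a flagged project is a real problem or, for example, one where a developer uses checkins for daily backups.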
3.3 Customizations of measurement parameters
EPM currently provides users with the [...] Using the database schema for EPM, which is open to the public, users can enter SQL statements to create bar graphs, line graphs, and distribution maps such as Figure 6. Because we would like to support various projects and organizations, each of which has its own problems, we decided to provide a minimal set of graph types and summary information rather than a large set in advance. After receiving feedback from software organizations using EPM, we will add other types of graphs in the near future. Currently EPM can be viewed as a tool for exploratory data analysis [5].
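To give a sense of what a user-supplied SQL statement might look like, the sketch below counts checkins per project and month, which is the kind of series a bar or line graph could be drawn from. The table `checkin` and its columns are hypothetical, since the EPM schema is not reproduced in this paper, and an in-memory SQLite database again stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the real EPM schema is not shown in the paper.
conn.execute("CREATE TABLE checkin (project TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO checkin VALUES (?, ?)",
    [
        ("EASE", "2004-01-10"),
        ("EASE", "2004-01-20"),
        ("EASE", "2004-02-05"),
        ("SPARS", "2004-01-15"),
    ],
)

# Checkins per project and month, ordered so each project forms one series.
QUERY = """
SELECT project, substr(day, 1, 7) AS month, COUNT(*) AS checkins
FROM checkin
GROUP BY project, substr(day, 1, 7)
ORDER BY project, month
"""

series = conn.execute(QUERY).fetchall()
for project, month, count in series:
    print(project, month, count)
```

Each row of the result maps directly to one point of a graph, so changing the grouping or the filter in the query changes the graph without any change to the tool.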
4 Discussion
In this section, we report a case study of applying EPM to our own project in order to observe the actual usage of the five pre-defined types of graphs mentioned above. We interviewed four developers on the advantages and disadvantages of using EPM. The development environment of this project is summarized in Table 1.
Table 1. EPM development project
One advantage is that the graphs make it easy for developers to understand the status of the project by identifying distinctive parts of the graphs. For instance, the flat part of the line in the LoC graph reminded them why development seemed to have stopped: in fact, all the developers were on a business trip at the time. This could help them increase their accountability to their managers. Another advantage is that the graphs generated in real time motivated developers to fix bugs, since they could see that there were still unresolved issues.
In contrast to these advantages, some problems related to the usage of EPM were found. One is that the visualizations are in some cases too complicated for understanding the status of the project. For instance, developers could not distinguish which file versions corresponded to which vertical lines in Figure 3, because one developer checked his files in to CVS every day for backup and therefore a large number of checkins occurred. In this case, developers might need to use two CVS repositories (e.g. one for software release and another for backup).
The above results are still initial evaluations of EPM. EPM will be introduced in some software companies in the near future. We intend to evaluate the usefulness of EPM with respect to (1) the effects on software development and process improvement of providing measurement results from multiple software repositories, and (2) the benefit of the capability to manage multiple projects.
5 Conclusion and Future Work
The goal of this research is to construct an environment for supporting measurement-based software development according to the ESEE framework. In this paper, we introduced Empirical Project Monitor (EPM) as a partial implementation of ESEE, which helps developers and managers keep projects under control by providing various visualizations of measurement results related to project activities. Nowadays, we can gather and analyze massive data on software development at a large scale using rapidly growing hardware capabilities. By analyzing such huge data collected from thousands of software development projects, we would like to provide useful knowledge and benefits not only to individual developers and managers but also to organizations. Empirical study of software development is an active area in the field of Empirical Software Engineering (ESE). However, the approaches of ESE have not been sufficiently applied to software development in the software industry, although companies face many problems. Data related to software development in industry have seldom been provided to university research. We are collaborating with several software development companies in the EASE project. This collaboration could therefore be a strong trigger for overcoming obstacles to technical progress in software engineering.
Acknowledgment
This work is supported by the Comprehensive Development of e-Society Foundation Software program of the Ministry of Education, Culture, Sports, Science and Technology. We thank Satoru Iwamura, Eiji Ono and Taira Shinkai for supporting the development of Empirical Project Monitor.
References
[1] A. Aurum, R. Jeffery, C. Wohlin, and M. Handzic. Managing Software Engineering Knowledge. Springer, Germany, 2003.
[2] V. Basili. Goal Question Metric Paradigm, in Encyclopedia of Software Engineering (J. Marciniak ed.), pages 528–532. John Wiley and Sons, 1994.
[3] V. Basili. The experimental software engineering group: A perspective. ICSE'00 award presentation, June 2000. Limerick, Ireland.
[4] L. Briand, C. Differding, and D. Rombach. Practical guidelines for measurement-based process improvement. Technical Report ISERN-96-05, Department of Computer Science, University of Kaiserslautern, Germany, 1996.
[5] S. Card, J. Mackinlay, and B. Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers, San Mateo, CA, 1999.
[6] EASE. The EASE (Empirical Approach to Software Engineering) project, http://www.empirical.jp/intex-e.html.
[7] D. German and A. Mockus. Automating the measurement of open source projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering, pages 63–67, Portland, Oregon, 2003.
[8] J. D. Herbsleb, A. Mockus, T. A. Finholt, and R. E. Grinter. An empirical study of global software development: Distance and speed. In Proceedings of the 23rd International Conference on Software Engineering (ICSE'01), pages 81–90, Toronto, Canada, 2001.
[9] M. Ohira, R. Yokomori, M. Sakai, K. Matsumoto, K. Inoue, and K. Torii. Empirical project monitor: Automatic data collection and analysis toward software process improvement. In Proceedings of 1st Workshop on Dependable Software System, pages 141–150, Tokyo, Japan, 2004.
[10] SPARS. The SPARS (Software Product Archiving and Retrieving System) project, http://iip-lab.ics.es.osaka-u.ac.jp/SPARS/index.html.en.
[11] T. Zimmermann, P. Weissgerber, S. Diehl, and A. Zeller. Mining version histories to guide software changes. In Proceedings of the 26th International Conference on Software Engineering (ICSE'04), Edinburgh, Scotland, UK, 2004 (to appear).