Intelligent Classification and Clustering Techniques for Text Data Mining
Small Business Information
500 West Cummings Park, Suite 3000, Woburn, MA, 01801
Abstract
This SBIR effort will develop an integrated information classification and document management system, applicable to complex weapons systems software. Currently, software engineers at Army's Tank-automotive & Armaments Command rely on Software Trouble Reports (STRs) that contain unstructured text describing operational problems filed by soldiers for troubleshooting of computer-controlled weapons systems. Past STRs and maintenance records provide a valuable source of information that can help software engineers to understand new problems, identify the faulty modules, and eventually provide valuable guidance on how to fix the problem. The overall objective of the Phase II effort is to develop a prototype Software Report Management System (SRMS) that will automatically manage STRs and associated maintenance records, extract useful information from the document archive, and discover previously unknown domain knowledge that will assist maintenance of the system. It will also facilitate focused and accurate search for problems/solutions/case-studies. To achieve the above objective, we propose to develop advanced clustering, information extraction, and data fusion algorithms for the document collection using textual analysis and machine learning techniques. Such algorithms will be used to group the STRs into meaningful clusters and extract useful information from them to build a knowledge base for software problems. We will then integrate these algorithms in
* information listed above is at the time of submission.