BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Leveraging Text Content for Management of Construction Project Doc
 uments - Amr A. Kandil\, Assistant Professor\, School of Civil Engineering
 \, Purdue University
DTSTART:20140509T140000Z
DTEND:20140509T150000Z
UID:TALK52531@talks.cam.ac.uk
CONTACT:Lorna Everett
DESCRIPTION:The construction industry is a knowledge-intensive industry. T
 housands of documents are generated by construction projects. Documents\, 
 as information carriers\, must be managed effectively to ensure successful
  project management. The fact that a single project can produce thousands
  of documents and that a lot of the documents are generated in a textual/u
 nstructured format greatly complicates the task of information management.
  Conventionally\, project documents are organized based on classifying doc
 uments according to fixed/predefined classes and document metadata\, e.g. 
 according to document type\, originator\, project attribute\, specificatio
 n division\, date\, etc. While this classification method is easy to imple
 ment\, it is only advantageous for document search and retrieval if the do
 cument seeker has prior knowledge of the content of the document corpus. I
 n many cases and for various project management activities this is not the
  case\, resulting in frustration of the search task with delayed or incomp
 lete search results.\n\nAn alternative framework for organizing project d
 ocuments based on document content is proposed. The framework takes into a
 ccount important characteristics of construction project documents and lev
 erages such characteristics to facilitate document search and retrieval. T
 he premise for the framework is the fact that documents are not produced h
 aphazardly\, but are generated as a result of certain events or circumstan
 ces occurring in the project. Thus\, documents can be linked to each othe
 r on the semantic level\, a point that is overlooked by document managemen
 t systems which generally manage documents in vacuo by disregarding or fai
 ling to utilize such semantic connections between the documents. Organizin
 g project documents based on the semantic relations that exist between the
 m (revealed from the document content and not just the document attributes
 ) facilitates information retrieval and retains the knowledge of the actua
 l project participants\, thereby supporting knowledge reuse.\n\nAnother a
 spect of this research investigates the use of document content analysis t
 o enable automated document management. If textual similarities between do
 cuments correlate with what human users recognize through their semantic a
 bilities\, then content analysis of documents can be used to automatically
  organize documents according to the proposed framework. Text classifiers 
 based on machine learning techniques were evaluated to determine their per
 formance in identifying which group of semantically similar documents a te
 st document belongs to. Also\, an unsupervised learning method was adapted
  an
 d evaluated for the task of clustering documents based on textual similari
 ty into sets of documents that are semantically related. The purpose of su
 ch evaluations is to equip electronic document management systems with con
 tent analysis capabilities that facilitate document search and retrieval.
LOCATION:Cambridge University Engineering Department\, LR3B
END:VEVENT
END:VCALENDAR