Knowledge and result engineering
|Master:||Information Sciences|
|Student name:||Youssef Ibrahimi|
|Supervisor:||Marieke van Erp|
|Second reader:||Tina Mioch|
In 2011, approximately 60 projects will be carried out within the Perceptual and Cognitive Systems department of TNO.
Within these projects, methodologies, approaches, problems, solutions, and obtained results can differ from each other. This content-specific project information can, however, be reused as a starting point for other (new) projects or as an information source for researchers. By content-specific project information we mean project information such as methodologies, approaches, and obtained results. Project information such as budget and timeframe is out of scope of this study.
In the present situation, researchers can use a wide range of information systems to find general project-related information. These systems include: (1) network shares with hundreds of documents; (2) a digital library containing primarily publications; (3) “TNO Spider/City”, an intranet portal focusing on general project information, such as project members and internal project numbers, with the possibility to look up financial information regarding projects; (4) “Yammer”, a Facebook-like internal communication system; and (5) “ScienceDirect/Scopus”, searchable databases with scientific content. The documents on the network shares can be final versions (e.g. deliverables) or documents in progress (e.g. draft versions of work documents). However, a researcher who wants to find content-specific information about a project quickly and easily currently has no accessible way to do so. For example, questions such as “What are the obtained results, and which approaches contributed to them?” and “Which projects are similar to my project?” are relevant for researchers.
Within TNO, applied science is the core business; each project milestone, idea, method, or intermediate result can therefore be valuable to the entire organisation. Insight into what has been done in a project is essential for other projects, for example so that employees do not “reinvent the wheel”.
The present situation lacks content-specific information access: researchers do not have a quick and easy way to access this kind of information. A lack of access to relevant previous work can result in the same method, approach, or idea being developed multiple times. In turn, this duplication can have a negative effect on the time, effort, and money spent within a project.
Without a formal model to structure content-specific project information, there is a risk that the same method or approach is developed multiple times. Therefore, we formulate the following challenge for this project:
Can we capture project information in such a way that it becomes more accessible, and thus reusable, for researchers?
This study extends previous work on structuring research information within the Information Retrieval domain [1, 2, 3]. In particular, we will explore how research information can be structured within TNO using two Information Retrieval techniques: automatic document classification and automatic document segmentation.
The sub-questions derived from the main problem statement are:
1a. Can we automatically identify units of project information among documents?
1b. Can we automatically identify units of project information within documents?
We consider units of project information on two levels. The first level is the document level, at which we consider documents as a whole. The second level is the content level, at which we define units of project information within a single document. These two levels represent two different but parallel phases within this master's project.
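The two levels can be illustrated with a minimal Python sketch. It is not the method of this project: the unit types ("method", "result"), the cue words, and the function names are assumptions made purely for illustration, and a simple cue-word count stands in for a real classifier. The same labelling function is applied once to the whole document (document level) and once to each paragraph (content level).

```python
import re

# Hypothetical cue words per unit type; illustrative assumptions only,
# not categories defined by the project.
CUES = {
    "method": {"method", "approach", "procedure", "technique"},
    "result": {"result", "results", "finding", "outcome"},
}


def label(text):
    """Return the unit type whose cue words occur most often, else 'other'."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {unit: sum(w in cues for w in words) for unit, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def segment(document):
    """Split a document into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def label_document(document):
    """Document level: one label for the document as a whole."""
    return label(document)


def label_segments(document):
    """Content level: one label per paragraph within the document."""
    return [(label(p), p) for p in segment(document)]


doc = (
    "We used a clustering method and a new approach.\n\n"
    "The results show a clear outcome and one new finding.\n\n"
    "Thanks to all colleagues."
)
document_label = label_document(doc)
segment_labels = [unit for unit, _ in label_segments(doc)]
```

In a real setting the cue-word count would be replaced by a trained classifier and a more robust segmentation step; the sketch only shows how the two levels relate to each other.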
2. Can we automatically map units of project information to an ontology?
We consider a map of project information units as a collection of core concepts. Identifying and defining these core concepts, and mapping them to an ontology, is the aim of this sub-question.
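As a rough illustration of what such a mapping could look like, the Python sketch below turns labelled units into subject-predicate-object triples. The concept names (`Method`, `Result`), the relation names (`usesMethod`, `hasResult`), and the project identifier are hypothetical; the actual core concepts are precisely what this sub-question sets out to identify.

```python
# Hypothetical mini-ontology: concept -> relation linking a project to it.
# These names are assumptions for illustration, not the project's ontology.
ONTOLOGY = {
    "Method": "usesMethod",
    "Result": "hasResult",
}

# Assumed mapping from unit labels to ontology concepts.
LABEL_TO_CONCEPT = {"method": "Method", "result": "Result"}


def to_triples(project, units):
    """Turn (label, text) units into (subject, predicate, object) triples.

    Units whose label has no concept in the ontology are skipped.
    """
    triples = []
    for unit_label, text in units:
        concept = LABEL_TO_CONCEPT.get(unit_label)
        if concept is None:
            continue
        triples.append((project, ONTOLOGY[concept], text))
    return triples


units = [
    ("method", "We used a clustering method."),
    ("other", "Thanks to all colleagues."),
    ("result", "The results show a clear outcome."),
]
triples = to_triples("project-A", units)
```

Representing the mapping as triples keeps the sketch close to how ontologies are commonly serialised, but any structure that links a project to its core concepts would serve the same purpose.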