Temporary title: Measuring the quality of source code in a whole company's portfolio using a tolerant parser and XQuery
Master: Software Engineering
Student name: Michal Masiarek
Second reader: Lukasz Kwiatkowski
Any attempt at automated software analysis or modification must be preceded by a comprehension step, i.e. parsing. This task, while often considered straightforward, can in fact be very challenging, depending on the source code in question. Source code can be parsed using either a syntactic or a lexical approach.

The syntactic approach parses the source code according to a baseline grammar and builds (abstract) syntax trees; the relevant data is then derived by traversing those trees. Its advantage is precise and complete information about the source code. Its disadvantage is that it is complex and requires considerable engineering effort and time. Moreover, syntax trees are usually created per file or per class, which calls into question their applicability to a large software portfolio.

The lexical approach analyzes the source code with predefined expressions in order to extract the information of interest (for example, capturing all includes). Its advantage is that it is lightweight (it requires only a text analyzer) and scalable. Its drawback is that obtaining more detailed information, such as nested structures, becomes burdensome.

The solution presented in this thesis combines both approaches. First, tolerant parsing generates an XML file. The generated file is then queried using XQuery and XPath to obtain the required information.
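The two-step pipeline described above (tolerant parsing into XML, then querying) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the thesis implementation: the regular expressions, the `portfolio`/`file`/`include`/`function` element names, and the sample C sources are all hypothetical. Python's standard library has no XQuery engine, so the limited XPath subset of `xml.etree.ElementTree` stands in for the query step.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns: recognize only the constructs of interest.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')
FUNC_RE = re.compile(r'^\s*(?:\w+\s+)+(\w+)\s*\([^;]*\)\s*\{')


def tolerant_parse(name, source):
    """Build a <file> element from recognized lines; skip everything else."""
    file_el = ET.Element("file", {"name": name})
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = INCLUDE_RE.match(line)
        if m:
            ET.SubElement(file_el, "include",
                          {"target": m.group(1), "line": str(lineno)})
            continue
        m = FUNC_RE.match(line)
        if m:
            ET.SubElement(file_el, "function",
                          {"name": m.group(1), "line": str(lineno)})
        # Unrecognized lines are ignored instead of raising a parse error.
    return file_el


def build_portfolio(sources):
    """Collect all files of the (hypothetical) portfolio under one root."""
    root = ET.Element("portfolio")
    for name, text in sources.items():
        root.append(tolerant_parse(name, text))
    return root


# Made-up sample input, including a line no C grammar would accept.
SAMPLE = {
    "a.c": '#include <stdio.h>\n#include "util.h"\n'
           'int main(void) {\n  @@ garbage the parser skips @@\n  return 0;\n}\n',
    "b.c": '#include "util.h"\nstatic int helper(int x) {\n  return x + 1;\n}\n',
}

portfolio = build_portfolio(SAMPLE)

# Query step: which files include util.h, and which functions exist?
files_using_util = [
    f.get("name")
    for f in portfolio.findall("file")
    if f.find("include[@target='util.h']") is not None
]
function_names = [fn.get("name") for fn in portfolio.findall("./file/function")]

# Serialized XML that a real XQuery processor could consume instead.
xml_text = ET.tostring(portfolio, encoding="unicode")
```

The sketch shows the trade-off from the text: the scanner tolerates input it does not understand, yet the resulting XML still preserves enough structure (files containing includes and functions) to be queried declaratively across the whole portfolio.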