Program comprehension
Program comprehension is the study of how software engineers understand programs. It is needed for:
Debugging, code inspection, test case design, re-documentation, design recovery, and code revisions.
Existing knowledge: programming languages, computing environment, programming principles, architectural models, possible algorithms and solution approaches, domain-specific information, and any previous knowledge about the code.
New knowledge: code functionality, architecture, algorithm implementation details, control flow, and data flow.
Comprehension techniques
Reading by step-wise abstraction
Determine the function of critical subroutines, then work through the program hierarchy until the function of the whole program is determined.
Checklist-based reading
Readers are given a checklist to focus their attention on particular issues within the document. Different readers are given different checklists, so each reader concentrates on different aspects of the document.
Defect-based reading
Defects are categorized and characterized (e.g., data type inconsistency, incorrect functionality, missing functionality). A set of steps (a scenario) is then developed for each defect class to guide the reader in finding those defects.
Perspective-based reading
Similar to defect-based reading, but instead of different defect classes, readers have different roles (tester, designer, and user) to guide them in reading.
Sources of variation
Aside from the issue of how comprehension occurs, comprehension performance and effectiveness are affected by many factors:
Maintainer characteristics
Familiarity with the code base, application domain knowledge, programming language knowledge, programming expertise, tool expertise, individual differences.
Program characteristics
Application domain, programming domain, quality of the problem to be understood, program size and complexity, availability and accuracy of documentation.
Task characteristics
Task type
Models
Mental models
Cognitive models
[Figure: Program, Cognitive Model, Mental Model]
Mental models
Static elements
Dynamic elements
Supporting elements
Text structure
The program text and its structure
Control structure: iterations, sequences, conditional constructs
Variable definitions
Calling hierarchies
Parameter definitions
Chunks
Contain various levels of text structure abstractions. Also called macrostructure. Can be identified by a descriptive label. Can be composed into higher level chunks.
Plans (objects)
Knowledge elements for developing and validating expectations, interpretations, and inferences. Include causal knowledge about information flow and relationships between parts of a program.
Programming plans
Based on programming concepts.
Low level: iteration and conditional code segments.
Intermediate level: searching, sorting, and summing algorithms; linked lists and trees.
High level
Domain plans
All knowledge about the problem area. Examples: problem domain objects, system environment, domain-specific solutions and architectures.
Hypotheses
Conjectures that are results of comprehension activities that can take seconds or minutes to occur. Three types:
Why: hypothesize the purpose or rationale of a function or design choice.
How: hypothesize the method for accomplishing a certain goal.
What: hypothesize a classification.
Hypotheses are drivers of cognition. They help to define the direction of further investigation. Code cognition formulates hypotheses, checks whether they are true or false, and revises them when necessary. Hypotheses fail for several reasons:
Code to support a hypothesis cannot be found.
Confusion arises because one piece of code satisfies different hypotheses.
Code cannot be explained.
Supporting elements
Beacons
Cues that index into existing knowledge. A swap routine can be a beacon for a sorting function. Experienced programmers recognize beacons much faster than novice programmers. Used commonly in top-down comprehension.
Rules of discourse
Rules that specify programming conventions. Examples: coding standards, algorithm implementations, expected use of data structures.
Actions
Episodes
Sequences of actions.
Processes
Aggregations of episodes.
Strategies
Guide the sequence of actions while following a plan to reach a goal. Match programming plans to code.
Shallow reasoning: does not perform in-depth analysis; stops upon recognition of familiar idioms and programming plans.
Deep reasoning: performs detailed analysis.
Chunking and cross-referencing.
Chunking
Creates new, higher-level abstraction structures. Labels replace the detail of the lower-level chunks.
Cross-referencing
Map program parts to functional descriptions
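Example: the code fragment "temp = a; a = b; b = temp;" maps to the functional description "swap". Another functional description of this kind is "sequential search".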
Cognitive models
Letovsky
Shneiderman and Mayer
Brooks
Soloway, Adelson and Ehrlich
Pennington
von Mayrhauser and Vans (Integrated)
Letovsky model
Shneiderman model
Brooks model
Soloway model
Pennington model
Integrated model
Distributed cognition
Traditional cognitive models deal with the cognitive processes inside one person's brain. On real projects, software developers:
Work in teams
Can ask people questions
Can surf the web for answers
Program Slicing
SOEN 6431
Solution?
More descriptively, program slicing is a decomposition technique that extracts from a program the statements relevant to a particular computation.
Slicing criterion: <s, v>, where s is a statement (a point in the program) and v is a variable.
Program slices as originally introduced by Weiser [1] are known as executable backward static slices.
Basic Idea
Given: (1) a program and (2) a variable v at some point P in the program.
Goal: find the part of the program that is responsible for the computation of variable v at point P.
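The basic idea can be sketched for the simplest case, straight-line code with no branches or loops. This is a minimal illustration, not the algorithm from the slides: the five-statement program, its def/use sets, and the name `backward_slice` are all assumptions introduced here.

```python
# Hypothetical straight-line program (an assumption for illustration):
#   1: a = read()
#   2: b = read()
#   3: c = a + 1
#   4: d = b * 2
#   5: e = c + a
# Each statement maps to (defined variable, set of used variables).
PROGRAM = {
    1: ("a", set()),
    2: ("b", set()),
    3: ("c", {"a"}),
    4: ("d", {"b"}),
    5: ("e", {"c", "a"}),
}


def backward_slice(stmts, point, var):
    """Return the statements that can affect `var` right after `point`."""
    relevant = {var}          # variables whose values still matter
    in_slice = []
    for num in range(point, 0, -1):
        defined, used = stmts[num]
        if defined in relevant:
            in_slice.append(num)
            relevant.discard(defined)   # this definition satisfies the need
            relevant |= used            # but its operands now matter
    return sorted(in_slice)


print(backward_slice(PROGRAM, 5, "e"))  # [1, 3, 5]
```

Statements 2 and 4 drop out because nothing that reaches `e` ever reads `b` or `d`. Branches and loops need control dependences as well, which this sketch deliberately leaves out.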
Program debugging: that's how slicing was discovered!
Testing: reduce the cost of regression testing after modifications (only run those tests that are needed).
Parallelization.
Integration: merging two programs A and B that both resulted from modifications to BASE.
Reverse engineering: comprehending the design by abstracting the design decisions out of the source code.
Software maintenance: changing source code without unwanted side effects.
Software quality assurance: validating interactions between safety-critical components.
Static backward program slicing was originally introduced by Weiser in 1982. A static program slice consists of those parts of a program P that could potentially affect the value of a variable v at a point of interest.
For all possible program inputs (executions), the value of v computed by the static slice equals the value of v computed by program P.
Slicing Properties:
Static Slicing
Uses statically available information only.
Makes no assumptions about the input.
A minimal (fully accurate) slice cannot be computed in general: the problem is undecidable, by reduction to the halting problem.
Current static methods can only compute approximations.
The result may not be useful.
Creating a PDG
 1  input(n, a);
 2  max := a[1];
 3  min := a[1];
 4  i := 2;
 5  s := 0;
 6  while i <= n do begin
 7    if max < a[i] then begin
 8      max := a[i];
 9      s := max; end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min; end;
13    output(s);
14    i := i + 2; end;
15  output(max);
16  output(min);
Data Dependence:
Represents a data flow (definition-use chain).
=> Data dependence between 2 and 7 but not between 2 and 8.
Control Dependence:
The execution of a node depends on the outcome of a predicate node.
=> Control dependence between node 6 and 8, but not between 6 and 15.
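The two kinds of dependence can be computed and combined into a backward slice roughly as follows. This is a sketch under stated assumptions: the def/use sets, the control-flow successors, and the direct control parents are one reading of the 16-statement max/min program (with the array `a` treated as a single variable and the loop condition read as `i <= n`), and the function names are hypothetical. Only direct control parents are listed, so node 8 depends on node 6 transitively through node 7.

```python
# Assumed def/use sets for statements 1..16 of the max/min program.
DEFS = {1: {"n", "a"}, 2: {"max"}, 3: {"min"}, 4: {"i"}, 5: {"s"}, 6: set(),
        7: set(), 8: {"max"}, 9: {"s"}, 10: set(), 11: {"min"}, 12: {"s"},
        13: set(), 14: {"i"}, 15: set(), 16: set()}
USES = {1: set(), 2: {"a"}, 3: {"a"}, 4: set(), 5: set(), 6: {"i", "n"},
        7: {"max", "a", "i"}, 8: {"a", "i"}, 9: {"max"},
        10: {"min", "a", "i"}, 11: {"a", "i"}, 12: {"min"}, 13: {"s"},
        14: {"i"}, 15: {"max"}, 16: {"min"}}
# Assumed control-flow successors and direct control parents.
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [6], 6: [7, 15], 7: [8, 10],
        8: [9], 9: [10], 10: [11, 13], 11: [12], 12: [13], 13: [14],
        14: [6], 15: [16], 16: []}
CONTROL_PARENT = {7: 6, 10: 6, 13: 6, 14: 6, 8: 7, 9: 7, 11: 10, 12: 10}


def data_dependences(succ, defs, uses):
    """Return edges (d, u): a definition at d reaches a use at u."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)
    reach_in = {n: set() for n in succ}    # (defining node, variable) pairs
    reach_out = {n: set() for n in succ}
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for n in succ:
            new_in = set()
            for p in preds[n]:
                new_in |= reach_out[p]
            new_out = {(n, v) for v in defs[n]}
            new_out |= {(d, v) for d, v in new_in if v not in defs[n]}
            if new_in != reach_in[n] or new_out != reach_out[n]:
                reach_in[n], reach_out[n] = new_in, new_out
                changed = True
    return {(d, n) for n in succ for d, v in reach_in[n] if v in uses[n]}


def static_backward_slice(node, data_edges, control_parent):
    """Collect every node reachable backward over data and control edges."""
    in_slice, worklist = set(), [node]
    while worklist:
        n = worklist.pop()
        if n in in_slice:
            continue
        in_slice.add(n)
        worklist.extend(d for d, u in data_edges if u == n)
        if n in control_parent:
            worklist.append(control_parent[n])
    return sorted(in_slice)


DATA_EDGES = data_dependences(SUCC, DEFS, USES)
print((2, 7) in DATA_EDGES, (2, 8) in DATA_EDGES)   # True False
print(static_backward_slice(15, DATA_EDGES, CONTROL_PARENT))
# [1, 2, 4, 6, 7, 8, 14, 15]
```

The first printed line reproduces the data dependence claim (2 to 7, but not 2 to 8). The slice for `output(max)` at statement 15 leaves out everything that only concerns `min` and `s`.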
Slicing Example
 1  main( )
 2  {
 3    int i, sum;
 4    sum = 0;
 5    i = 1;
 6    while (i <= 10)
 7    {
 8      sum = sum + 1;
 9      ++i;
10    }
11    cout << sum;
12    cout << i;
13  }
[Figure: slicing example with slice points; statement labels 8, 9, 11, 12]
Loops
read(n);
i := n;
sum := 0;
product := 1;
while (i > 0) {
  sum := sum + i;
  product := product * i;
  i := i - 1;
}
write(sum);
write(product);
Note: It is not necessarily value-preserving; the value of the variable in the slice might not be the same as in the original program.
Objective: determine what parts of a program are affected by a modification to the variable specified in the slicing criterion.
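A minimal sketch of this forward direction, again restricted to straight-line code. The seven-statement program, its def/use sets, and the name `forward_slice` are assumptions made for illustration only.

```python
# Hypothetical straight-line program (an assumption for illustration):
#   1: a = read()
#   2: b = read()
#   3: c = a + 1
#   4: d = b * 2
#   5: e = c + d
#   6: a = 0
#   7: f = a
# Each statement maps to (defined variable, set of used variables).
PROGRAM = {
    1: ("a", set()),
    2: ("b", set()),
    3: ("c", {"a"}),
    4: ("d", {"b"}),
    5: ("e", {"c", "d"}),
    6: ("a", set()),
    7: ("f", {"a"}),
}


def forward_slice(stmts, point, var):
    """Return the statements affected by changing `var` at `point`."""
    affected = {var}          # variables carrying the modified value
    in_slice = [point]
    for num in sorted(n for n in stmts if n > point):
        defined, used = stmts[num]
        if used & affected:
            in_slice.append(num)
            affected.add(defined)       # the effect propagates
        elif defined in affected:
            affected.discard(defined)   # overwritten by an unaffected value
    return in_slice


print(forward_slice(PROGRAM, 1, "a"))  # [1, 3, 5]
```

Statement 7 is not in the slice even though it reads `a`, because statement 6 overwrites `a` with a value that does not depend on the modification.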
Controversial statement:
Slicing classifications
Types of slices: static, dynamic
Direction of slicing: backward, forward
Executability of slice: executable, closure
Levels of slices: intraprocedural, interprocedural
Slicing Properties
Dynamic Slicing
Computed for a single input scenario.
Deterministic instead of probabilistic.
Useful for applications that are input driven (debugging, testing).
Slicing criterion: <i, p, v>, where i is the program input, p is a point in the execution, and v is a variable.
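[Figure: execution trace nodes 13^11, 14^12, 6^13, 15^14, 16^15 (statement number, with its position in the execution trace as superscript)]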
Walks the execution trace backwards to derive dynamic data and control dependencies.
Creates an individual node in the PDG for each executed statement.
Backward Algorithm
Program Execution for n=2, a[1,2] at statement 15
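A rough sketch of trace-based dynamic slicing for this run. Several things here are assumptions rather than content from the slides: the def/use sets and control parents are one reading of the 16-statement max/min program, the trace assumes a run in which the loop body executes once with the first `if` taken and the second not, and the function name is hypothetical. The sketch records dependences in one forward pass over the trace and then follows them backward from the criterion, which gives the same result as deriving them while walking the trace backward.

```python
# Assumed def/use sets and direct control parents (statements 1..16).
DEFS = {1: {"n", "a"}, 2: {"max"}, 3: {"min"}, 4: {"i"}, 5: {"s"}, 6: set(),
        7: set(), 8: {"max"}, 9: {"s"}, 10: set(), 11: {"min"}, 12: {"s"},
        13: set(), 14: {"i"}, 15: set(), 16: set()}
USES = {1: set(), 2: {"a"}, 3: {"a"}, 4: set(), 5: set(), 6: {"i", "n"},
        7: {"max", "a", "i"}, 8: {"a", "i"}, 9: {"max"},
        10: {"min", "a", "i"}, 11: {"a", "i"}, 12: {"min"}, 13: {"s"},
        14: {"i"}, 15: {"max"}, 16: {"min"}}
CONTROL_PARENT = {7: 6, 10: 6, 13: 6, 14: 6, 8: 7, 9: 7, 11: 10, 12: 10}

# Assumed execution trace: statement numbers in execution order.
TRACE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]


def dynamic_backward_slice(trace, position, defs, uses, control_parent):
    """Slice on the variables used at the 1-based trace `position`."""
    last_def = {}    # variable -> trace position of its latest definition
    last_exec = {}   # statement -> trace position of its latest execution
    deps = {}        # trace position -> trace positions it depends on
    for pos, stmt in enumerate(trace, start=1):
        # One node per executed statement occurrence.
        deps[pos] = {last_def[v] for v in uses[stmt] if v in last_def}
        parent = control_parent.get(stmt)
        if parent in last_exec:
            deps[pos].add(last_exec[parent])
        for v in defs[stmt]:
            last_def[v] = pos
        last_exec[stmt] = pos

    reached, worklist = set(), [position]
    while worklist:
        pos = worklist.pop()
        if pos not in reached:
            reached.add(pos)
            worklist.extend(deps[pos])
    return sorted({trace[pos - 1] for pos in reached})


# Statement 15 (output(max)) is the 14th entry of the trace.
print(dynamic_backward_slice(TRACE, 14, DEFS, USES, CONTROL_PARENT))
# [1, 2, 4, 6, 7, 8, 15]
```

Under these assumptions the dynamic slice omits statement 14, which a static slice for the same statement has to keep: in this particular run the update of `i` happens after the last assignment to `max`.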
Any problems?
Q: How many nodes do we have in a Dynamic Dependency Graph? A: ???
Q: How many dynamic slices can we compute? A: ???
77
78
79
Sample program
input(n, a);
max := a[1];
min := a[1];
i := 2;
s := 0;
i < n
max < a[i]
max := a[i];
s := max;
min > a[i]
output(s);
i := i + 2;
i < n
output(max);
Please note that the variables a[] and n are omitted to reduce the complexity of the table.
Explicit control transfer statements (goto, return, exit, break, continue) complicate the construction of the control set.
A conservative solution: if a goto statement has a nonempty relevant set, include the goto and its target in the slice.
An alternative approach: look for labeled statements in the slice, then include the goto statements that branch to these labels.
Arrays:
Simple approach: treat each array assignment as both a definition and a use. Problem: too conservative.
To determine whether a use of a[g(j)] depends on a definition of a[f(i)], we need to test whether f(i) can be equal to g(j).
This is undecidable in general but can be solved for some expression types.
The solutions are one-sided: they can determine that f(i) and g(j) cannot be equal, but give no information otherwise.
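One well-known one-sided test of this kind is the GCD test for linear subscripts. The slides do not name a specific test, so this is a technique brought in for illustration, and the function name `may_overlap` is hypothetical. It assumes f(i) = a*i + b and g(j) = c*j + d with integer coefficients, and it ignores loop bounds.

```python
from math import gcd


def may_overlap(a, b, c, d):
    """GCD test for subscripts f(i) = a*i + b and g(j) = c*j + d.

    Returns False when f(i) == g(j) has no integer solution, so the two
    array accesses can never touch the same element. Returns True when a
    solution may exist; because loop bounds are ignored, True gives no
    definite information.
    """
    g = gcd(a, c)
    if g == 0:                  # both subscripts are constants
        return b == d
    # a*i - c*j = d - b is solvable in integers iff gcd(a, c) divides d - b.
    return (d - b) % g == 0


print(may_overlap(2, 0, 2, 1))  # False: a[2*i] and a[2*j + 1] never meet
print(may_overlap(2, 0, 4, 2))  # True: a[2*i] and a[4*j + 2] may meet
```

A slicer could use a False answer to drop the data dependence between the two accesses, and fall back to the conservative treatment whenever the answer is True.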
Records:
Pointers:
Example Procedural