Major contributor of complexity is handling of state
Others: code volume, concern with flow of control
Typical approaches
Object oriented: couple state w/ relational behavior
Functional: no state, no side-effects
II. Complexity
Root cause of software problems is unmanageable complexity
Dijkstra: “…we have to keep it crisp, disentangled, and simple if we refuse to be crushed by the complexities of our own making…”
Complexity = “that which makes large systems hard to understand”
III. Approaches to Understanding
Two approaches to understanding
Testing: system is black box
Leads to more erros being detected
Informal reasoning: examine from the inside
Leads to less errors being created
All methods to understand system have limits
Invest in simplicity over testing
IV. Causes of Complexity
Complexity caused by State
State makes programs hard to understand and therefore complex.
Too difficult to enumerate all possible states which causes unreliability
Impact of State on Testing
Test in one state tells you nothing about behavior in another
Tests start w/ clean state
Num of inputs to system large, num of states larger
Impact of State on Informal Reasoning
Num states grows exponentially, each bit of state doubles total
Complexity caused by Control
Choosing an ordering of statements that do not depend on that ordering introduces unnecessary complexity
Concurrency
Shared state sucks
Testing no longer is consistent
Complexity caused by Code Volume
Volume increases nonlinearly with size
Dijkstra: People tend to think intellectual effort to understand system grows by square of size .. i.e. because components of the system could all interact
Abstraction reduces interaction
Other causes of complexity
Duplicated code, dead code, unnecessary abstraction, …
Complexity breeds complexity
If it is not clear that functionality exists, tend to dupe
Simplicity is Hard
Power corrupts
In the absence of language-enforced guarantees, mistakes will happen.
The more possible in a language the harder it is to understand.
V. Classical approaches to managing complexity
Object-Orientation
State access procedures must enforce constraint for every piece of state manipulated
Where does constraint enforcement go for multiple-object relationships?
Identity and State
Every obj is independent entity regardless of attributes
Leads to awkward middle ground “Value Objects” that you’d like to have extensional identity
Functional Programming
Roots in stateless lambda calculus of Church
State
No state gives referential transparency: given set of args, always return same result
Control
Encourage more abstract use of control (map) than explicit looping
Kinds of State
Mutable vs. immutable state.
Functional way to write counter method below.
Caller becomes responsible for making sure fn called appropriately.
function (int, int) getNextCounter(int oldCounter) {
let int result = oldCounter + 1
let int newCounter = oldCounter + 1
return (newCounter, result)
}
Cont’d…
State and Modularity
Trade-off between modularity given by state and explicitness given by functional
Trade-off is between complexity and simplicity..i.e. you know the outcome of a procedure by its args
Logic Programming
Specify what needs to be done rather than how to do it
State a set of axioms along with attributes required of a solution
VI. Accidents and Essence
Essential complexity: Inherent in the problem as seen by the users
Accidental complexity: All the rest (performance, etc)
In ideal world, we could use lang and infra which would let us express users’ problem directly
VII. Recommended General Approach
What complexity can’t be avoided even in ideal world?
Ideal World
Start with informal requirements
Translate to formal ones
Must be done w/ no view to execution (i.e. only ensure there are no omissions)
State in ideal world
All data either provided directly (input) or derived
Derived data immutable if only for display or mutable if users can update
Input data
Derived data is always accidental state
For mutable state, function must have inverse
Theoretical and Practical Limitations
Systems have state as part of essence
Control generally is accidental
Practical (e.g. efficiency) dictates use of accidental state
Formal Spec Languages
Property based: what is required rather than how
Model-based: construct a potential model for the system and specify how it must behave
Generally formal langs not expressive enough for all specs
Ease of Expression
Sometimes it is more natural to express derived state (e.g. loc on a map rather than seq of moves)
Required Accidental Complexity
Performance
Ease of expression
Recommendations
Avoid and separate
Avoid state and control where not essential
Separate it out from rest of system
Required accidental complexity
Performance: leave to separate system
Ease of expression
Occurs when derived state is most natural way to express parts of system
Use an external component to mock derived state
Separation and the relationship between components
Separate all complexity from pure logic of sys
Divide complexity into accidental and essential
3 expected components
Accidental state and control: Never should affect other parts of system
Essential Logic: “heart” of the system, expresses in terms of state what must be true, must not be able to change state
Essential State: makes no reference to either of the other parts
VII. The Relational Model
Four aspects
Structure: relations are means to represent data
Manipulation: means to specify derived data
Integrity: inviolable restrictions on data
Data Independence: sep logical from physical
Structure
Set of records each of which has uniquely named attrs
Base relations vs derived relations
Access path independence
No up-front decisions have to be made about access paths used to query and process
Contrast w/ hierarchical
e.g. Department chosen to contain employees so to find employee you must access through department