Table of contents
- Thoughts before reading the article
- Is a big ball of mud the most efficient structure in some situations?
- Term: bounded rationality
- Is the BBOM pattern so common because the real world is complex? I.e. is "essential complexity" its cause?
- Is communication the main goal of clean architecture?
- What things most affect the choice of architecture?
- Forces
- Big ball of mud
- Throwaway code
- Piecemeal growth
- Keep it working
- Shearing layers
- Sweeping it under the rug
- Reconstruction
- Conclusion
These are my notes about the article Big Ball of Mud.
A big ball of mud is a very common software architecture style where the dependencies between different parts of a program are accidental and where the program's structure is unplanned. It causes the program to be difficult to understand.
Thoughts before reading the article
Is a big ball of mud the most efficient structure in some situations?
A small ball of mud can be, e.g. when you're creating a one-off hacky script that you know won't be reused or developed further in the future.
But a big ball of mud?
- If you have no other option. E.g. if you don't have any skilled workers, and only unskilled workers, then a big ball of mud is possibly the only realistic approach that works (similarly to a "shanty town" when there are no skilled builders).
- Maybe if you're in a Cynefin-chaotic environment for a long time, like a very fast-changing technology landscape, or a customer who only incentivizes new features instead of maintenance/refactoring for a long time. But even then a BBOM wouldn't be the most efficient solution, since "efficient" here means e.g. dollars spent per feature or capability — the existing code in a BBOM would surely slow down the creation of new code. Therefore software development in a Cynefin-chaotic environment would IMO require constant refactoring (i.e. deletion of old code and creation of new code) to be most efficient, which would limit the size of the ball of mud.
- If the time horizon is known, and if time is the most important factor, i.e. you need to get to market ASAP. Then it might be defensible to create even a big ball of mud if you know that the technical debt incurred while doing that can be paid back later by getting to the market first.
I guess you can generalize the above like this:
- If the creators of software (including the customers who create specs) have limited visibility in
- time (i.e. they can't see second-order effects and have no past data to learn from, due to being very junior or due to a high churn rate)
- or space (i.e. they can't see other projects and programs apart from their own),
- and there's realistically no possibility of increasing visibility,
- then a BBOM is the only option and therefore the most efficient option.
Term: bounded rationality
We can't predict the consequences of our actions past a certain time horizon. Therefore we sometimes choose a locally optimal solution rather than a globally optimal solution.
The more knowledge a person has (e.g. via learning from experience and books), the better he should be able to reach a globally optimal solution. More time and energy for decision-making should also help, since quick and tired decision-making can lead to very short-sighted solutions.
Is the BBOM pattern so common because the real world is complex? I.e. is "essential complexity" its cause?
In practice it surely is a cause — the more complex your specs are, the more it forces hacky and complex solutions in code.
But in a perfect world the program would be modularized in such a way that you can look at each module and be able to understand it without understanding the whole program. You would have no global state, and each function would be pure. Each module would represent a specific concept/domain/service/etc.
But even then, simply having a huge number of modules and complex communication between them can create a BBOM (at a higher abstraction level), even if you don't have the wrong abstractions.
Pessimistic view: there is no single best architecture. The optimal architecture changes as the program receives new requirements and is developed by various developers. Across time, a BBOM is the lowest common denominator and therefore the easiest solution.
Optimistic view: any architecture is better than no architecture (= BBOM), as long as the architecture is consistent, since it introduces structure and therefore understandability.
Is communication the main goal of clean architecture?
A program that is only developed by one person probably doesn't require as much structure, as that person would be deeply familiar with each piece of code in the program, and can probably compensate for lack of apparent structure with his memory. Is a clean architecture useful mainly for communication, learnability and teachability?
If all software was written as black boxes by an AI, and humans only knew the inputs and outputs of the programs, would a BBOM inside the black boxes be OK?
- Even a one-person project benefits from a clean architecture, since that one person forgets things and is therefore essentially multiple different persons after a few years. ("How does this code work again..?")
- AI and black boxes: those black boxes need to be composed somehow, which creates a structure of programs, i.e. one bigger program. Humans would need to understand how those programs fit together, i.e. the inputs and outputs of the program and its sub-programs. Therefore a clean architecture would remain important — but the insides of the AI-generated black boxes could be BBOM for all we care, as long as the outputs are always correct for the given inputs.
- Here you can just substitute "AI" for "someone else", i.e. another company or programmer. For example, end-users don't care about the internals of a program as long as it works (i.e. gives the correct output for a given input).
I think you can say that, for humans, a clean architecture is mainly important for understandability. Learnability and teachability are parts of understandability.
Foote (one of the writers of the BBOM article) said something along these lines as well. As long as we need human programmers to read and write code, then our codebases should be "livable" (i.e. the codebase needs to be survivable enough so that humans can "live" inside it): https://youtu.be/LH_e8NfNV-c?t=3139
And see here for Foote's comments about black boxes: https://youtu.be/LH_e8NfNV-c?t=3639
What things most affect the choice of architecture?
There are multiple types of architectures (i.e. program structures). How to choose the correct one?
Some of the things that affect choice of architecture in my opinion:
- Expected lifetime of the program
- Expected size/scope of the program
- Expected changes to the program during its lifetime
- Quality of specifications (i.e. skills of the customer)
- Skills and experience of the developers
- Number of teams. 10 teams will probably create 10 programs or services with interfaces. (See Conway's law.)
- How set in stone the specs are, i.e. can we simplify them to simplify the architecture
What are the most important aspects? It probably depends a lot on the context.
Architecture is always a matter of trade-offs: see chapter 18 of Fundamentals of Software Architecture.
Forces
Code duplication: when is it good or bad?
Some of my own thoughts.
Code duplication is good when the alternative, code reuse, would be worse.
Code reuse has at least these negative aspects:
- Reduces customizability/flexibility: makes changing code slower
- If two pieces of code are similar for a while, but one or both of them change very often in different ways, then it's difficult to reuse that code. In that case it makes sense to wait for the code to stabilize, and then only reuse the stable parts of those two pieces of code.
- Increases dependencies: makes changing code slower
- Increases layers of indirection: makes reading code slower
- Increases abstraction: might be difficult for less skilled programmers to understand
- Sometimes reusable code might use inheritance, template-metaprogramming, decorators or other abstractions that other programmers in the team aren't familiar with
- This is similar to using a single fancy term (e.g. "canonical data model") for an idea that would take multiple paragraphs to explain to people who haven't read about that term yet. In that case it might be better to use simpler but more verbose terminology (or programming constructs).
My first knee-jerk reaction is this: code that changes often is OK to duplicate. Examples: CSS code, website assets (e.g. logos, icons, fonts), frontend JavaScript code, frontend HTML templating, configuration files.
The question is: at what point do the bad aspects of reuse outweigh the good aspects of reuse?
Or an easier question: how do we reduce the bad aspects of reuse, so that we maximize the likelihood of reuse being a better option than code duplication?
- Create more unit tests and system tests, so that changes made to the reused code don't have to be tested manually in each dependent of that code, and will instead be tested automatically
- IMO this has to be done sparingly in frontend code, since frontends are usually modified often, and you'd need to be modifying the frontend unit tests often. And frontend testing solutions are usually very slow to run.
- Use a monorepo-like Git repository for your projects, so that all of the dependencies and their dependents are in the same Git repository. This reduces layers of indirection when programmers have to modify the reused code and its dependents.
- Wait until you reuse code, so that the code has a chance to settle. Then you can reuse the stuff that doesn't change all the time. This is a bottom-up approach, instead of top-down where you try to guess (prematurely) what needs to be reused. Delay abstraction.
- See also: https://wiki.c2.com/?CodeHarvesting
- Use good development tools that can easily jump between the definition and usage of functions and classes. Use a code search tool that allows searching code from all Git repositories in your organisation with one search query.
- Version the code that you reuse. Use dependency management to your advantage. Move breaking changes into a new version of the library/module/class/function, so that you represent a stable interface to the dependents of the code. If you need any larger changes to the code, you can duplicate the code again, or create another version of the reused code.
- Reuse the code by using library-like constructs instead of framework-like constructs. In other words, don't force your program to use the reusable code at runtime (like a framework would), but instead let others choose whether to use the reusable code (like a library would). (See the sketch after this list.)
- This is related to the idea of "sane defaults", where you shouldn't need to turn off specific default features of a framework(-like construct) in normal use, but instead only in special cases
- Use simple language constructs when you reuse code. Use (pure) functions first and foremost.
- Conway's law: combine teams to reduce the need for duplication
- If you have two wholly separate teams, then reusing code between those teams will increase the need for communication, which slows down development. (Imagine reusing code between Google and Microsoft.)
- If you have a single team, then reusing code should be faster, since communication inside that team is more efficient than between two different teams
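To make the library-vs-framework point concrete, here's a minimal Python sketch (all names here are hypothetical): the library-style helper leaves the caller in control and runs only when explicitly called, while the framework-style base class owns the control flow and calls your code back.

```python
# Library style: the caller stays in control and opts in explicitly.
def retry(func, attempts=3):
    """Call func until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise

# Framework style: the framework owns the run() lifecycle and calls
# your code back; reusers must fit into its fixed structure.
class JobFramework:
    def run(self):
        self.setup()      # the framework decides when the hooks run
        self.execute()
        self.teardown()

    def setup(self): ...
    def execute(self): raise NotImplementedError
    def teardown(self): ...

# Library style in use: nothing runs unless the caller asks for it.
print(retry(lambda: 42))  # -> 42
```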
Term: code entropy, architecture erosion
Entropy (disorder) increases constantly. You need to fight against it by applying force (i.e. refactoring).
Loops
Reinforcing loop:
Increasing "Churn rate"
increases "Structural erosion or entropy"
which increases "Maintenance difficulties"
which decreases "Developer morale"
which increases "Churn rate" .
Related: the broken windows theory.
What analogues of city maintenance can you think of related to software maintenance and refactoring?
- Zoning: top-down design; architecting; software architecture roadmaps
- Facade renovation: frontend touch-up
- Garbage collection: scheduled removing of redundant data in databases and filesystems; fixing TODOs and FIXMEs in code
- Re-paving roads: upgrading CI/CD pipelines to the latest hardware and software
- Installing streetlights: adding monitoring, logging and other instrumentation to software
- Re-routing roads: refactoring the runtime call-graph and/or coding-time structure of functions, classes and modules
- Planting trees: improving coding style rules (whitespace, variable names, etc.)
- Adding noise barriers: encapsulation; information hiding; modularization
- Plowing snow on streets: performance optimization; defensive programming (preparing software to handle rare edge cases)
What are the main factors behind why a big ball of mud forms?
- Time constraints (e.g. via a limited budget)
- Lack of skill or experience
- Lack of planning (i.e. Extreme Programming gone bad)
- Lack of refactoring (e.g. prototypes that were deployed to production and never rewritten)
- The architecture is already too complex and new code tends to make it even more complex, but sunk cost sets in, and the software is never rewritten (or majorly refactored)
- Accidental: piecemeal growth; see bounded rationality (not able to see the effects of a change far enough)
- Visibility: only developers can see how dirty the code is. Managers and users can't see the dirtiness of code; they can only see the frontends of the software.
- Also, Conway's law: if multiple teams create the software, and the teams don't communicate, then the teams don't share a common view of how to develop the software
- Note: if there are no code reviews or other code sharing, then even some developers will probably not see certain pieces of code
- Essential complexity: if the specs (or the real world) are complex, then the architecture will become complex
Big ball of mud
You need to deliver quality software on time, and under budget.
Therefore, focus first on features and functionality, then focus on architecture and performance.
Top-down design doesn't work. You really only know what the architecture should look like in hindsight:
Make it work, make it right, make it fast.
Make it right = refactor the architecture to fit what you've learned.
What "forcing functions" are there that guide the architecture towards better quality?
- Simplifying and tweaking of specs
- Static analysis and code complexity metrics
- Code reviews
- Test automation
- Manual testing (e.g. user acceptance tests)
- Monitoring of errors in production environments
- Measurement of cost (i.e. time taken) of new modifications to the program, and taking active steps to improve that metric
- Post-mortems and retrospectives, to maximize learning
- Prototypes and shorter iterations, to speed up learning
- Architecture decision records (ADRs), i.e. document the architecture decision, make a decision together, justify why the decision was made, and communicate and ensure compliance of the decision via the documentation
- Architecture governance: periodically check that development teams are still on the path towards the desired architecture
Most of these things are really about learning, and making sure that the feedback cycle between changes and learning from those changes is as short as possible.
Throwaway code
You need an immediate fix for a small problem, or a quick prototype or proof of concept.
Therefore, produce, by any means available, simple, expedient, disposable code that adequately addresses just the problem at-hand.
When is it acceptable to not care about maintainability of code?
When the code will never be read again. But what code is never read again? Mainly short shell scripts that automate a very specific task that never recurs — or if it does, it's easier to re-create the script than to make it reusable.
- Code that is easy to re-create from scratch when needed
- Code that will never be used again with high probability
What ensures that prototypes can be easily refactored into maintainable code?
- Keep functions short and pure, so that they're easy to move around (see the sketch after this list)
- Use code templates that provide developers with the team's common baseline for directory structure, coding style, unit testing, linting, etc.
- Paradoxically: don't care about architecture too much — if it makes you prematurely create the wrong abstractions. YAGNI. Keep code simple and therefore easy to refactor.
- Use code reviews, with a sidenote about the code being a prototype instead of production-ready code
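To illustrate the "short and pure" point, a tiny hypothetical sketch: the pure version has no hidden state, so it can be moved to another module or covered by a one-line test without dragging anything along.

```python
# Impure prototype style: depends on hidden module-level state,
# so moving or testing the function means moving the state too.
cart = []

def add_and_total(price):
    cart.append(price)
    return sum(cart)

# Pure equivalent: all inputs are explicit, so the function can be
# relocated and reused anywhere as-is.
def total(prices):
    return sum(prices)

assert total([10.0, 5.0]) == 15.0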
Piecemeal growth
Master plans are often rigid, misguided and out of date. Users’ needs change with time.
Therefore, incrementally address forces that encourage change and growth. Allow opportunities for growth to be exploited locally, as they occur. Refactor unrelentingly.
"Successful software attracts a wider audience":
The point: you'll need a refactor step somewhere in there. Otherwise a BBOM will form.
What problems does city planning versus no planning have?
- City planning: when taken too far, top-down design doesn't allow for changing the city plan during construction, and causes the layout to become too awkward for future changes
- No planning: when taken too far, bottom-up design causes problems with noise, pollution, distance from homes to shops, etc.
Both up-front planning and agile refactoring/adaptation are important!
What tools make refactoring as easy as possible?
- Unit tests: easily make sure that refactoring doesn't change the behavior of the program
- In my opinion you should test the program at as high an abstraction level as you can. Writing unit tests for every low-level function would slow down refactoring, since you would need to refactor the unit tests as well. Testing the output of the whole program or a major component of it (e.g. a REST API) would be better — it automatically covers part of the lower-level code, since the program execution path travels through it. You often don't need to refactor such high-level tests when you refactor low-level code. (See the sketch after this list.)
- Static analysis: prevent stupid errors
- Code search: find function calls and class extensions easily. Preferably an organization-wide search that searches from all projects at once.
- An efficient local development environment that matches the production environment closely
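Here's a minimal sketch of the high-abstraction-level testing idea (the pricing functions are invented for illustration): the test exercises only the public entry point, so the private helpers below it can be refactored freely without the test changing.

```python
import unittest

# Hypothetical low-level helpers: freely refactorable, because the
# test below never references them directly.
def _subtotal(items):
    return sum(price * qty for price, qty in items)

def _apply_discount(total, code):
    return total * 0.9 if code == "SAVE10" else total

# Public entry point of the component; this is the level the test targets.
def price_order(items, discount_code=None):
    return round(_apply_discount(_subtotal(items), discount_code), 2)

class PriceOrderTest(unittest.TestCase):
    def test_discount_applies(self):
        # Exercises the whole path through the low-level helpers.
        self.assertEqual(price_order([(10.0, 2)], "SAVE10"), 18.0)

if __name__ == "__main__":
    unittest.main()
```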
Term: premature abstraction
Creating abstractions before you know what abstractions are suitable for your program.
A thought: too much up-front design (i.e. top-down design) might itself be a reason for a BBOM. If the designer created the wrong abstractions or too much abstraction up front, then the result can be more complex than a more bottom-up design (i.e. YAGNI-style design or Extreme Programming).
Term: YAGNI
You aren't gonna need it.
Don't predict (i.e. don't do too much top-down design). Learn instead (i.e. implement and refactor). Adaptation is learning.
Term: adaptation
Maintenance is learning. — Brand
This is how software projects often go:
- Experienced designers and programmers are reserved for the start of the project
- The project is implemented
- Original designers and programmers leave to start other new projects, and maintenance programmers (who are often less experienced with the project) take over
- The project keeps receiving new requirements, which require changes to the architecture
- Maintenance programmers implement features for the new requirements, but the (less experienced) programmers probably don't have a full understanding of the original architecture, or how the architecture should be evolved for the new requirements
- Eventually the original architecture erodes, and the project risks becoming a BBOM
- Eventually a full rewrite is seen as the best option
The point is: experienced designers (optimally the original designers) should be kept in the project even when the project enters maintenance mode, since that is often the best time to learn (with hindsight) what the architecture should look like. Then the architecture should be refactored according to what has been learned.
Writing documentation about the architectural decisions and keeping other designers and programmers informed about those decisions would probably help reduce this problem, i.e., help slow down architectural erosion.
What are the three levels of learning (in the context of systems) according to Brand?
- Habit: the system serves only the function for which it was designed
- Modification: the system is adapted to allow modifications. The ease of modification of the system is constant (i.e. some systems are easier than others to modify).
- Learning to learn: the system adapts to make future modifications easier
The last one is related to refactoring: when the designers and maintainers learn what parts of the system change and how they change, the system should be refactored to make those changes easier in the future. See Adaptation above.
Term: homeostasis and feedback
Most adaptive systems don't rely on prediction. Instead, they rely on homeostasis and feedback.
If you can adapt quickly to changes, then you don't need prediction.
Homeostasis
Shield the system from short-term fluctuations.
E.g. YAGNI: ignore nice-to-have feature requests.
Any self-regulating process by which systems tend to maintain stability while adjusting to conditions that are optimal for survival. If homeostasis is successful, life continues; if unsuccessful, disaster or death ensues.
Any system in dynamic equilibrium tends to reach a steady state, a balance that resists outside forces of change. When such a system is disturbed, built-in regulatory devices respond to the departures to establish a new balance; such a process is one of feedback control.
Loops
Balancing loop:
Increasing "Body temperature"
increases "Difference between current body temperature and 37 °C"
which increases "Sweating"
which decreases "Body temperature" .
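A toy simulation of this balancing loop, just to show the feedback converging back to the steady state (the 0.5 gain and the starting temperature are arbitrary assumptions):

```python
TARGET = 37.0

def step(temperature):
    """One feedback step: sweating is proportional to the deviation."""
    deviation = temperature - TARGET
    sweating = max(0.0, 0.5 * deviation)  # sweating only cools, never heats
    return temperature - sweating

temp = 40.0  # disturbance: the system starts off balance
for _ in range(10):
    temp = step(temp)
print(round(temp, 3))  # -> 37.003, i.e. back near the steady state
```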
Feedback
Compare the current behavior to the desired long-term behavior, and adjust as quickly as you can.
I.e. develop the system with a prioritized task list (i.e. long-term roadmap) based on user feedback. Keep a balance between planning and responding to feedback from users.
Keep it working
Maintenance needs have accumulated, but an overhaul is unwise, since you might break the system.
Therefore, do what it takes to maintain the software and keep it going. Keep it working.
The more business-critical the system is, the more important it is that you don't break the system. Therefore you should only make incremental changes.
How can you best drive good/active maintenance?
- Sense of ownership. Incentives.
- Dogfooding: if the programmer is also a user of the program, it should improve quality
- Daily build: faster feedback. Fail fast.
- Test automation: reduce the work needed for you to test whether the program actually works. Very important for efficient refactoring.
- Reminders/habit: scheduled maintenance days or weeks. We tend to forget things.
- Client buy-in/understanding: client must understand that paying a "maintenance tax" is good for the long-term health of a program
- By doing maintenance — it's perhaps a self-reinforcing loop. The more maintenance you do, the cleaner the code-base is, which should make future maintenance efforts easier.
- Visibility
- Make bad code visible to programmers, e.g. via static analysis and code quality metrics
- How do you make bad architecture visible to clients and non-technical managers? I think this is one of the hardest parts of software development. I think the solution is clear communication, in a form that the receiving party prefers (e.g. text, spoken words, a PowerPoint presentation). Trust and skill are important, so that even if the client doesn't understand the technical details of some particular architectural problem, they trust you to have the skills to know what the problem and its solution are. You could also show code quality metrics to clients — but make sure that such a metric doesn't become a singular goal, since other business-related metrics (e.g. cost, performance) might be more important in the architecture. You would optimally need a set of metrics, so that you can communicate the idea that architecture requires compromises.
Term: design space
The possible directions where you can develop a program. Each change risks breaking the consistency of the current program and adding bugs to it.
Experience and knowledge should help give you a "map" for the directions that keep the program consistent and running. But if you're creating something new (i.e. a thing for which you don't have a map), then you must proceed in small increments.
Term: incremental development (Harlan Mills)
Harlan Mills said that any software system should be grown via incremental development.
- Stub out the components of the system
- Run the system with only the component stubs in place
- Start filling in the stubs
- Keep running the system as you develop it
Result: you always have a running system. At first it doesn't do anything, but incrementally it starts being able to do more and more of its desired things.
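A minimal sketch of what this could look like in code (the components and their stubbed behavior are invented for illustration): the end-to-end path runs from the very first version, and each stub is filled in while the system keeps running.

```python
def fetch_orders():
    # Stub: returns canned data until the real data source is wired in.
    return [{"id": 1, "total": 0.0}]

def price_orders(orders):
    # Stub: pass-through for now; the pricing logic gets filled in later.
    return orders

def report(orders):
    # This component is filled in first; the others catch up incrementally.
    for order in orders:
        print(f"order {order['id']}: {order['total']:.2f}")

if __name__ == "__main__":
    # The whole pipeline runs from day one, stubs and all.
    report(price_orders(fetch_orders()))
```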
Fred Brooks found that teams could build more complex systems this way than otherwise.
Apparently even large changes can be made this way, if you create the change on the side as one larger increment, and then integrate it into the running system.
I guess the opposite of this would be depth-first development: you first finish a single component in one large increment, and then the next, etc., and then you integrate those at the end in one large increment.
I think the point here is that incremental development forces you to think about the whole program (e.g. component interfaces, end-user roles, modularization) for the whole duration of the project, rather than only at the end of the project.
Shearing layers
Different artifacts change at different rates.
Therefore, factor your system so that artifacts that change at similar rates are together.
This point goes back to the idea of homeostasis and feedback (see above). As you learn how things change in the system, the stable part of the codebase should be refactored into a stable base layer, and the varying part (that requires maximum adaptability) should be refactored into a separate upper layer (e.g. a polymorphic class with sub-classes).
Resulting architecture: Stuff at the top changes most often. Stuff at the bottom changes least often.
Long-living systems must be optimized for change (adaptation).
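In code, the shearing-layers idea could look something like this hypothetical sketch: the slow-changing flow lives in a base class, and the fast-changing details live in small subclasses layered on top of it.

```python
from abc import ABC, abstractmethod

class ReportRenderer(ABC):
    """Stable base layer: the overall rendering flow rarely changes."""
    def render(self, rows):
        header = self.format_header()
        body = "\n".join(self.format_row(row) for row in rows)
        return f"{header}\n{body}"

    @abstractmethod
    def format_header(self): ...

    @abstractmethod
    def format_row(self, row): ...

class CsvRenderer(ReportRenderer):
    """Fast-changing upper layer: output formats come and go."""
    def format_header(self):
        return "id,total"

    def format_row(self, row):
        return f"{row['id']},{row['total']}"

print(CsvRenderer().render([{"id": 1, "total": 9.5}]))
```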
How would you create a program that lists the rate of change of each project or file, so that you can see what changes and what doesn't?
- Get Git logs from the past year per project
- Metric: calculate the number of code lines changed per file during that period
- Ignore configuration and data files, such as JSON files
- List projects and files based on that metric in descending order
- Automatically suggest moving the stable code or the varying code into a separate module
If you want to get fancy, parse the code into an abstract syntax tree and calculate changes per function. Then you can see if the same file contains functions that change at different frequencies, which might mean that those functions should be in different files or components.
I'm sure there are existing tools for this if you search GitHub long enough.
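For reference, a rough Python sketch of the steps above, built on `git log --numstat` (the one-year window, the ignore list and the top-20 cutoff are arbitrary assumptions):

```python
import subprocess
from collections import Counter

IGNORED_SUFFIXES = (".json", ".yaml", ".lock")

def churn_per_file(since="1 year ago"):
    """Sum the lines added and deleted per file since the given date."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--numstat", "--format="],
        capture_output=True, text=True, check=True,
    ).stdout
    churn = Counter()
    for line in out.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank lines between commits
        added, deleted, path = parts
        if added == "-" or path.endswith(IGNORED_SUFFIXES):
            continue  # "-" means a binary file; also skip config/data files
        churn[path] += int(added) + int(deleted)
    return churn

if __name__ == "__main__":
    # Run inside a Git repository; prints the 20 most-changed files.
    for path, lines_changed in churn_per_file().most_common(20):
        print(f"{lines_changed:8d}  {path}")
```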
Term: wisdom
How do you know what is true and what works?
- Lindy effect: stuff that has worked for a long time
- Reasoning up from first principles (i.e. axioms, self-evident truths, for which you don't need supporting evidence)
- Have supporting evidence
- E.g. statistical evidence (of good quality, since statistics can sometimes lie)
- See levels of evidence for a ranking of types of evidence from strongest to weakest. Also see evidence-based practice.
The stuff that was learned a long time ago, and that keeps working, is formalized into reusable frameworks and libraries (and institutions, laws, etc.).
Those formalizations (e.g. frameworks) should also have formalized the knowledge about how to adapt to common changes — i.e. what things usually change, and how those changes can be implemented quickly.
This is related to homeostasis and feedback (see above). When you use a framework and receive a change request from a client to change the authentication logic, the framework should have a formal way to implement common authentication flows quickly. If so, the framework (i.e. its creator) has learned that authentication flow varies in specific ways between projects, and that the framework must support those use-cases.
What is the most adaptable layer of a software system?
Data.
Metadata is useful in that it pushes control to users (i.e. domain experts).
Sweeping it under the rug
Overgrown, tangled, haphazard spaghetti code is hard to comprehend, repair, or extend, and tends to grow even worse if it is not somehow brought under control.
Therefore, if you can’t easily make a mess go away, at least cordon it off. This restricts the disorder to a fixed area, keeps it out of sight, and can set the stage for additional refactoring.
Don't underestimate the importance of developer morale. See the broken windows theory. If you can't replace all windows, at least cover the broken windows with something prettier for the time being.
Term: facade, proxy
If refactoring right now takes too much time, wrap the messy code with the desired interface. Present that interface to clients of that code, rather than the messy code's interface. Then refactor the messy code inside the wrapper later.
I think this method can be combined with Fred Brooks's comment above about incremental development. Always try to think: what would be the optimal architecture currently? Then steer the architecture towards that goal incrementally, in a breadth-first manner. (Rather than creating an empty stub for a component, like in the Fred Brooks example, in this case you would wrap an existing component with a new interface — but the idea is similar.)
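A minimal facade sketch along these lines (all class and method names are hypothetical): clients program against the interface we actually want, and the mess stays cordoned off behind it until there's time to refactor.

```python
class LegacyBilling:
    """The messy code: awkward interface, magic arguments, etc."""
    def do_calc_v2(self, cust_id, flags, legacy_mode=7):
        return 42.0  # stands in for tangled legacy logic

class Billing:
    """The facade: the clean interface we want clients to depend on."""
    def __init__(self):
        self._legacy = LegacyBilling()  # swap out once refactored

    def invoice_total(self, customer_id):
        # Translate the clean call into the legacy incantation.
        return self._legacy.do_calc_v2(customer_id, flags=0)

# Clients depend only on Billing, so the internals behind
# invoice_total() can be refactored later without touching callers.
print(Billing().invoice_total("cust-123"))
```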
Reconstruction
Your code has declined to the point where it is beyond repair, or even comprehension.
Therefore, throw it away and start over.
Note: an old system might still be profitable — even more profitable than some more technically superior alternative.
What things affect the decision between refactoring and rewriting?
In my opinion:
- Lifetime: Expected lifetime of the system. If the system will live for a long time after the rewrite, then there is a higher probability that the rewrite pays for itself over time.
- Money/TCO: Expected difference in income between the current and the rewritten system over the lifetime of the system
- Architectural quality: Difference in skill (technical skills or domain knowledge) between the original system's designer and the new system's designer (even if they're the same person). For example, if the original system was created with the wrong abstractions and never refactored, then it might make sense to rewrite the whole system with proper abstractions.
- Lack of knowledge transfer between old and new developers: If the old system will be maintained by a new team, and the old system lacks documentation and test automation, then it might be too difficult for the new team to understand what the system does, and what it should do. In that case it might be best to rewrite the system with a full set of new requirements from the client.
- Staffing: Whether the old and new system will be developed by different teams, or the same team. If the old system still receives many changes, it might be difficult for the same team to maintain the old system and create the new system at the same time.
- Stability of requirements: If the system doesn't receive any changes anymore, and only needs to be kept running, then it might make sense to not refactor nor rewrite, but instead wrap it with virtualization (e.g. an emulator, like big banks might do with their decades old COBOL code). On the other hand, if the old system is constantly changing, then the rewrite might not be based on a stable set of requirements.
- Technology: If the old system is using very old technology that is not supported anymore, then the only way to keep the system running might be to rewrite it with newer technology
- Ecosystem: There might not be enough developers, support, libraries, etc. for the old system's technology. In that case rewriting with new technology might make economic sense.
- Risk: The business risk of replacing the old system with the new system. If risk is high, it might be better to incrementally refactor/replace specific parts of the system while you keep it running, instead of a big-bang replacement.
How do you rewrite as efficiently as possible?
In my opinion, if good architecture is a result of learning and refactoring across the lifetime of the system, then a shortcut to good architecture would be to learn from existing systems. Especially if you have ADRs (architecture decision records) that tell you why a specific architectural decision was made during the development or maintenance of the old system.
- Use "formalized knowledge" from the old system (i.e. its code, documentation, developers and users) as much as possible: data models, terminology, the most important functionality, the most important algorithms and functions, etc.
- Ask the designers and developers to write a post-mortem about what went right, and what went wrong with the old system
- Prevent second system effect: get the first iteration of the new system running ASAP, without any extra features
- Don't rewrite. Instead, refactor the current system wholly by replacing individual parts of it one by one.
A thought: "Always be rewriting." Even if you're not going to rewrite a system, always thinking of how you'd do it should lead to better refactoring ideas. For example, write post-mortem reports periodically for the current system.
You could say that all refactoring is rewriting, just on a smaller scale. Nowadays OOP, modularization, service-oriented architecture etc. allow us to rewrite smaller parts of a program, instead of having to rewrite the whole program.
Term: durable good
https://en.wikipedia.org/wiki/Durable_good
Items like bricks could be considered perfectly durable goods because they should theoretically never wear out. Highly durable goods such as refrigerators or cars usually continue to be useful for several years of use, so durable goods are typically characterized by long periods between successive purchases.
Software can also be thought of as a durable good, since a lot of software lives for a very long time nowadays. https://en.wikipedia.org/wiki/Software_durability
Conclusion
The key is to ensure that the system, its programmers, and, indeed the entire organization, learn about the domain, and the architectural opportunities looming within it, as the system grows and matures.
Quality of architecture is a function of domain knowledge and learning in all domains that the system touches, by everyone who touches the system.
Domain knowledge: knowledge of the whole problem space, i.e. programming language, industry, processes, data, people who you work with, users who use the system, etc.
Everyone who touches the system: programmers, managers, clients, users, etc. — the whole organization.
Top-down design doesn't work. You really only know what the architecture should look like in hindsight:
Make it work, make it right, make it fast.
Make it right = refactor the architecture to fit what you've learned.
Existing frameworks and libraries are the results of prior learning. Utilize them whenever you can.
In some projects, incentives related to non-technical factors, such as time-to-market or a limited budget, might force the architecture into a BBOM:
People build BIG BALLS OF MUD because they work. In many domains, they are the only things that have been shown to work.
It may well be that the economics of the software world are such that the market moves so fast that long term architectural ambitions are foolhardy, and that expedient, slash-and-burn, disposable programming is, in fact, a state-of-the-art strategy.