Wednesday, 26 June 2019

Software Design

This is a collection of online references and some of my own thoughts on software design and development in general.

Main Principles

Code should be correct, clear and efficient.
Prefer simple. Avoid clever.
(taken from https://yourbasic.org/)

Code exposed for use by others should be written in such a way that:

it is unambiguous, so using it feels intuitive
it prevents errors when it is used

Get the MVP working (correctly) first. Release it and analyze the feedback. Revenue, not the beauty of the code, should drive development...but the code should still end up well-designed if TDD is followed. Refactor in an evolutionary, not revolutionary, way.

Software design

Software design pattern

SOLID Principles

12 Factor Applications
Twelve-Factor Apps In node.js
Twelve-Factor App methodology

TDD - Test-Driven Development

I feel comfortable replacing one implementation of a function with another only if that function is covered by unit tests.

POD - Performance-Oriented Development

Software Product Architecture

Plugins

Break up a monolithic application into a plugin-based one. Benefits:

a single plugin can be updated instead of the entire application
a bug in a single plugin does not affect other plugins or the entire application
some plugins can be free, some can be sold

Abstract 3rd Party Dependencies

If your application needs to use a 3rd party API or library, don't wire your code directly to it, as your code will then become dependent on (hard-wired, tightly coupled to) code you don't control. Create an abstraction of that 3rd party component: define an API that your code will call. This generic API is a facade behind which you can silently switch the actual implementation should you decide to use some other 3rd party library.
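A minimal sketch of such a facade in Go. All names here (Notifier, vendorClient, PushMessage, recordingNotifier, notifyAdmin) are hypothetical and only illustrate the idea; vendorClient stands in for a real 3rd party SDK.

```go
package main

import "fmt"

// Notifier is the API our own code calls (hypothetical name).
type Notifier interface {
	Send(recipient, message string) error
}

// vendorClient stands in for a 3rd party SDK type we do not control.
type vendorClient struct{ sent []string }

// PushMessage mimics a vendor call that reports success as status code 0.
func (c *vendorClient) PushMessage(payload string) int {
	c.sent = append(c.sent, payload)
	return 0
}

// vendorNotifier adapts the vendor SDK to our Notifier interface.
type vendorNotifier struct{ client *vendorClient }

func (n vendorNotifier) Send(recipient, message string) error {
	if code := n.client.PushMessage(recipient + ":" + message); code != 0 {
		return fmt.Errorf("vendor push failed with code %d", code)
	}
	return nil
}

// recordingNotifier is a second implementation, e.g. for tests.
type recordingNotifier struct{ messages []string }

func (r *recordingNotifier) Send(recipient, message string) error {
	r.messages = append(r.messages, recipient+":"+message)
	return nil
}

// notifyAdmin is application code: it only knows about Notifier.
func notifyAdmin(n Notifier, message string) error {
	return n.Send("admin", message)
}

func main() {
	client := &vendorClient{}
	if err := notifyAdmin(vendorNotifier{client: client}, "disk full"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(client.sent)
}
```

Switching to another library then means writing one more small adapter; notifyAdmin and the rest of the application stay untouched.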

Metric for Good Source Code

How easy (or difficult) it is to:

navigate the code
isolate the code

naming (of packages, namespaces, functions, classes, structs, variables etc...)
TBC...

Application Configuration (Settings)

Make the application as configurable as possible. The application should read its configuration from an external source, e.g. command line arguments, a config file, environment variables or user input from stdin. If some settings are missing, the application can use hardcoded fallback values.

Example:

import os

if 'SETTING_X' not in os.environ:
    os.environ["SETTING_X"] = "SettingX_default_value"

# use/read os.environ["SETTING_X"] value
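The same idea can be sketched in Go without modifying the process environment. The helper name settingOr is an assumption made for this illustration; os.LookupEnv is the standard library call that reports whether a variable is set.

```go
package main

import (
	"fmt"
	"os"
)

// settingOr returns the value found by lookup for key, or fallback when the
// setting is missing. lookup has the same shape as os.LookupEnv, which makes
// the function easy to exercise without touching the real environment.
func settingOr(lookup func(string) (string, bool), key, fallback string) string {
	if value, ok := lookup(key); ok {
		return value
	}
	return fallback
}

func main() {
	// SETTING_X and its default value are placeholder names.
	fmt.Println(settingOr(os.LookupEnv, "SETTING_X", "SettingX_default_value"))
}
```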

When to use environment variables instead of CLI?

process - Argument passing strategy - environment variables vs. command line - Stack Overflow

Command-line Arguments

Usage message

To indicate optional arguments, square brackets are commonly used; they can also be used to group parameters that must be specified together.
To indicate required arguments, angle brackets are commonly used, following the same grouping conventions as square brackets.
Mutually exclusive parameters can be indicated by separating them with vertical bars within groups.
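As an illustration of these conventions, here is a usage message for a made-up tool. The program name, flags and arguments are all hypothetical: -q and -v are optional and mutually exclusive, --owner is optional but needs both of its values, and the two paths are required.

```go
package main

import "fmt"

// usage returns the usage message for a hypothetical file tool.
func usage(program string) string {
	return "Usage: " + program + " [-q | -v] [--owner <user> <group>] <source> <destination>"
}

func main() {
	fmt.Println(usage("mytool"))
}
```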

Argument passing strategy - environment variables vs. command line

Logical Expressions

Use Boolean Algebra laws to simplify complex conditions (logical expressions).
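For example, De Morgan's law turns a negated conjunction of negations into a plain disjunction. The function and parameter names below are hypothetical; the loop in main checks that both forms agree for every input.

```go
package main

import "fmt"

// canSkipOriginal mirrors a condition that is hard to read.
func canSkipOriginal(isEmpty, isHidden bool) bool {
	return !(!isEmpty && !isHidden)
}

// canSkipSimplified applies De Morgan's law: !(!a && !b) == a || b.
func canSkipSimplified(isEmpty, isHidden bool) bool {
	return isEmpty || isHidden
}

func main() {
	// Compare both forms over the full truth table.
	for _, a := range []bool{false, true} {
		for _, b := range []bool{false, true} {
			fmt.Println(a, b, canSkipOriginal(a, b) == canSkipSimplified(a, b))
		}
	}
}
```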

Global Variables

They should be avoided unless they are statics/singletons representing an object that provides cross-cutting functionality.

Global Variables Are Bad

Functions

Functions should be simple, short and follow the Single Responsibility Principle (SRP). E.g. if a function has to create a file at some path, don't make it also create that path (if the path does not exist). Create another function which is responsible ONLY for creating paths instead.

If a function has to perform a task depending on the value of its argument, move that argument check out of the function: the function should only do the task, not verify whether the task should be performed.

Example:

func processAttachment(attachment attachment, ...) error {
    if attachment.MimeType != "application/x-msdownload" {
        return nil
    }
    // do something with the attachment here
}

should be:

if attachment.MimeType == "application/x-msdownload" {
    err := processAttachment(attachment)
}

This is more explicit, and the function is responsible for one thing only.

Don't make library/package functions asynchronous by default. Let users choose how they want to consume them: synchronously or asynchronously. They can always create an async wrapper around them.

The same goes for functions in Go. We could make them accept a sync.WaitGroup argument so they can be awaited...but a function should only do its main job: fiddling with the wait group pollutes the function's main functionality and thus breaks SRP.

// Not recommended: the function manages the caller's wait group itself
func foo(arg1 T1, arg2 T2, ..., wg *sync.WaitGroup) {
    wg.Add(1)
    defer wg.Done()
    ...
}
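A sketch of the alternative: the function stays synchronous and the caller decides whether to run it concurrently. The names square and squareAll are hypothetical; the wait group lives entirely in caller-side code.

```go
package main

import (
	"fmt"
	"sync"
)

// square does only its main job and knows nothing about goroutines.
func square(n int) int {
	return n * n
}

// squareAll is caller-side code that chooses to run square concurrently.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1) // Add is called before the goroutine starts
		go func(i, n int) {
			defer wg.Done()
			results[i] = square(n) // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3}))
}
```

A caller that does not need concurrency simply calls square directly.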

In the same way, don't add logging to library/package functions. Return errors/throw exceptions with error messages/codes instead. The user of the library should decide what they want to see in the log output.
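A small sketch of that split, with hypothetical names (greet, ErrEmptyName): the library function only returns an error, and the application decides whether and how to log it.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrEmptyName is a sentinel error that callers can detect with errors.Is.
var ErrEmptyName = errors.New("empty name")

// greet is a hypothetical library function: it reports failure through its
// return value and never writes to a log itself.
func greet(name string) (string, error) {
	if name == "" {
		return "", fmt.Errorf("greet: %w", ErrEmptyName)
	}
	return "Hello, " + name, nil
}

func main() {
	if _, err := greet(""); err != nil {
		// The application, not the library, decides what reaches the log.
		log.Printf("greeting failed: %v", err)
	}
}
```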

If a function has multiple parameters and e.g. one parameter is used in only one part of the function, check whether that part is doing a task (or...has responsibility for one "thing") that could be extracted into a separate function.

Indentation & Single Point of Return

There are two schools of thought here: one recommends that each function have a single point of return, the other allows multiple points of return.

Single point of return:

if the function is long, this increases the chances of having multiple levels of nested conditions
the returned value (error) is assigned in multiple places and at multiple levels
it's difficult to track the positive execution path

Multiple points of return:

prevents deep levels of indentation (such functions usually have only two)
it is easy to track which expression makes the function return which error
we can use indentation to visually separate the positive and error paths: the positive path of execution consists of the expressions at the 1st indentation level, while error handling sits at the 2nd (indented) level (see Code: Align the happy path to the left edge)

Here are some more Tips for a good line of sight from Mat Ryer:

Align the happy path to the left; you should quickly be able to scan down one column to see the expected execution flow
Don’t hide happy path logic inside a nest of indented braces
Exit early from your function
Avoid else returns; consider flipping the if statement
Put the happy return statement as the very last line
Extract functions and methods to keep bodies small and readable
If you need big indented bodies, consider giving them their own function
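A short sketch of these tips in Go. The function parsePort is a hypothetical example: every error exits early at the second indentation level, and the happy path runs down the left edge and ends with the successful return.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePort converts s to a TCP port number, returning early on each error.
func parsePort(s string) (int, error) {
	if s == "" {
		return 0, errors.New("port is empty")
	}
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("port %q is not a number", s)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d is out of range", port)
	}
	return port, nil // the happy return is the very last line
}

func main() {
	fmt.Println(parsePort("8080"))
	fmt.Println(parsePort("70000"))
}
```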

How small should a function be?

How small should functions be?
Small Functions considered Harmful
What should be the maximum length of a function?
How can wrapping an expression as a function be Clean Code?

Terminology: Parameters VS Arguments

A parameter is a variable which is part of the method's signature (method declaration).
An argument is an expression used when calling the method.

From MSDN: "...the procedure defines a parameter, and the calling code passes an argument to that parameter. You can think of the parameter as a parking space and the argument as an automobile."

“Parameter” vs “Argument”
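A tiny Go illustration of the distinction, using a hypothetical area function:

```go
package main

import "fmt"

// area declares two parameters: width and height.
func area(width, height int) int {
	return width * height
}

func main() {
	side := 4
	// side and side+1 are the arguments passed to those parameters.
	fmt.Println(area(side, side+1))
}
```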

Arguments Validation

Arguments should be validated if their values come from the outside world, which happens in the public API. Contract validation frameworks can be used.

In private/internal functions we can use assertions (in C++/C#) or no validation at all, and allow the application to crash.
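A sketch of this split in Go, with hypothetical function names. Go has no built-in assert statement, so a panic in the internal function plays the role of an assertion here; that choice is an assumption of this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Resize is a public entry point, so it validates its arguments and
// reports bad input as an error.
func Resize(width, height int) (int, error) {
	if width <= 0 || height <= 0 {
		return 0, errors.New("width and height must be positive")
	}
	return pixelCount(width, height), nil
}

// pixelCount is internal: it trusts its callers, and a broken contract is a
// programming error that is allowed to crash the application.
func pixelCount(width, height int) int {
	if width <= 0 || height <= 0 {
		panic("pixelCount: caller passed a non-positive dimension")
	}
	return width * height
}

func main() {
	fmt.Println(Resize(2, 3))
	fmt.Println(Resize(0, 3))
}
```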

Structs & Classes

Prefer composition over inheritance.

For data members, use the natural types which best describe them, rather than the types used to present them. E.g.

Use

type Directory struct {
    ...
    Created time.Time
    LastModified time.Time
    ...
}

instead of

type Directory struct {
    ...
    Created string
    LastModified string
    ...
}

It might be necessary at some point to manipulate these values, e.g. comparisons, period calculations etc..., for which the Time type provides an API. Otherwise we'd first need to convert the strings to Time, do the manipulation and then convert the result back to a string.

Perform the conversion to string at the point where time values have to be displayed in the UI.
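A sketch of how this plays out with time.Time fields. The method names, the sample dates and the display layout are assumptions made for this example; After, Sub and Format are standard methods of time.Time.

```go
package main

import (
	"fmt"
	"time"
)

// Directory keeps timestamps in their natural type.
type Directory struct {
	Created      time.Time
	LastModified time.Time
}

// ModifiedAfterCreation compares the two timestamps directly.
func (d Directory) ModifiedAfterCreation() bool {
	return d.LastModified.After(d.Created)
}

// Lifetime is the period between creation and the last modification.
func (d Directory) Lifetime() time.Duration {
	return d.LastModified.Sub(d.Created)
}

// CreatedLabel converts to a string only at the point of display.
func (d Directory) CreatedLabel() string {
	return d.Created.Format("2006-01-02 15:04")
}

// sampleDirectory builds a value with made-up timestamps.
func sampleDirectory() Directory {
	created := time.Date(2019, time.June, 26, 9, 0, 0, 0, time.UTC)
	return Directory{Created: created, LastModified: created.Add(36 * time.Hour)}
}

func main() {
	d := sampleDirectory()
	fmt.Println(d.ModifiedAfterCreation(), d.Lifetime(), d.CreatedLabel())
}
```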

Interfaces

A fluent API design can make code even more human-readable.
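A minimal sketch of a fluent interface in Go. The Query type and its methods are hypothetical and exist only to show method chaining; each method returns the receiver so calls read like a sentence.

```go
package main

import (
	"fmt"
	"strings"
)

// Query collects the parts of a simple SELECT statement.
type Query struct {
	table      string
	conditions []string
	limit      int
}

// From starts a query on the given table.
func From(table string) *Query {
	return &Query{table: table}
}

// Where adds a condition and returns the query to allow chaining.
func (q *Query) Where(condition string) *Query {
	q.conditions = append(q.conditions, condition)
	return q
}

// Limit caps the number of rows and returns the query to allow chaining.
func (q *Query) Limit(n int) *Query {
	q.limit = n
	return q
}

// String renders the query as text.
func (q *Query) String() string {
	s := "SELECT * FROM " + q.table
	if len(q.conditions) > 0 {
		s += " WHERE " + strings.Join(q.conditions, " AND ")
	}
	if q.limit > 0 {
		s += fmt.Sprintf(" LIMIT %d", q.limit)
	}
	return s
}

func main() {
	fmt.Println(From("users").Where("age > 18").Where("active = 1").Limit(10))
}
```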

Invariants

Know your Invariants!

invariant (computer science) - La Lojban

What are invariants, how can they be used, and have you ever used it in your program? - Software Engineering Stack Exchange

Class invariant - Wikipedia

Invariants in Code Design. Invariant, quite literally, means… | by Manik Jindal | code-design | Medium

Invariant-based programming - Wikipedia

language agnostic - What is an invariant? - Stack Overflow

Logging

Don't use logging as a substitute for proper debugging. Logging is the poor man's debugging. Learn how to use the debugging and profiling tools relevant to your development stack and IDE.

Think carefully about what goes into the log. If there is no error, don't write a log message like this:

Created symlink ./.../myapp--setup.exe --> myapp-setup.exe. Error: <nil>

When you later analyze the log file and search for the word "error", you'll get tons of false positives.
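One way to avoid this, sketched in Go with a hypothetical helper symlinkLogLine: the word "Error" appears in the message only when there actually is an error.

```go
package main

import (
	"errors"
	"fmt"
)

// errDemo is a made-up error used to demonstrate the failure case.
var errDemo = errors.New("permission denied")

// symlinkLogLine builds the log line and mentions an error only if one occurred.
func symlinkLogLine(link, target string, err error) string {
	if err != nil {
		return fmt.Sprintf("Failed to create symlink %s --> %s. Error: %v", link, target, err)
	}
	return fmt.Sprintf("Created symlink %s --> %s", link, target)
}

func main() {
	fmt.Println(symlinkLogLine("myapp--setup.exe", "myapp-setup.exe", nil))
	fmt.Println(symlinkLogLine("myapp--setup.exe", "myapp-setup.exe", errDemo))
}
```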

Working with Data

Data Access Object (DAO)

Documentation

The older I get, the less I like having documentation about the software I write anywhere but in the source code itself. This reduces information redundancy, duplication and situations where the documentation is out of sync with the implementation. Brief but comprehensive comments, plus tools (example1) which can extract the desired information from them, should do the job. High unit test coverage (and BDD-style tests if you are kind to non-tech members of the team) should also help, as reading test names should be as informative and as easy as reading a requirement specification.

Some Common Patterns and Anti-Patterns

MVC (Model - View - Controller)

Model

processes data

View

shows the results
many dynamic languages generate data for views by writing code in static HTML files

Controller

handles user requests

Producer - Consumer

What is the benefit of writing to a temp location, And then copying it to the intended destination?

TBD...

Resource Management

Resources (data, files, directories, handles etc...) should be managed in a secure manner. Security is based on three principles (the CIA triad):

confidentiality

set of rules that limits access to information

integrity

the assurance that the information is trustworthy and accurate
involves maintaining the consistency, accuracy, and trustworthiness of the information

availability

guarantee of reliable access to the information by authorized people

Unit Testing

Equivalence partitioning - Wikipedia

Be explicit when using data structures for representing data

Array elements should all mean the same thing. If an element has a special meaning, move it out of the array and make it a member field of some other data structure. Example:

const (
    copyFuncName string = "copy"
)

// Function parses CLI arguments and returns a string slice:
// first element = function name
// subsequent elements = function arguments
func parseArgs() ([]string, error) {
    ...
}

...and later we have:

switch funcDetails[0] {
case copyFuncName:
    err = copy(funcDetails[1], funcDetails[2])
...
}

The red flag here was:

switch funcDetails[0]

and

copy(funcDetails[1], funcDetails[2])

When you read it, you wonder "What is the meaning of funcDetails[i], what value does it contain?". This should be explicit, something like:

switch funcDetails.name {

So, we can transform the slice into a struct with two fields: name (string) and args (string slice).

type funcInfo struct {
    name string
    args []string
}

Our switch statement now requires less mental effort to read and understand:

switch funcDetails.name {
case copyFuncName:
    err = copy(funcDetails.args[0], funcDetails.args[1])
...
}
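Put together, a self-contained sketch could look like the following. It is an illustration under assumptions, not the original program: parseArgs takes the raw arguments as a parameter, and a hypothetical describe function stands in for the real copy call so the example has no side effects.

```go
package main

import (
	"errors"
	"fmt"
)

const copyFuncName = "copy"

// funcInfo makes the meaning of each piece of data explicit.
type funcInfo struct {
	name string
	args []string
}

// parseArgs turns raw CLI arguments into a funcInfo.
func parseArgs(raw []string) (funcInfo, error) {
	if len(raw) == 0 {
		return funcInfo{}, errors.New("missing function name")
	}
	return funcInfo{name: raw[0], args: raw[1:]}, nil
}

// describe dispatches on the function name and reports what would be done.
func describe(info funcInfo) (string, error) {
	switch info.name {
	case copyFuncName:
		if len(info.args) != 2 {
			return "", fmt.Errorf("%s expects 2 arguments, got %d", copyFuncName, len(info.args))
		}
		return fmt.Sprintf("copy %s to %s", info.args[0], info.args[1]), nil
	default:
		return "", fmt.Errorf("unknown function %q", info.name)
	}
}

func main() {
	info, err := parseArgs([]string{"copy", "a.txt", "b.txt"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(describe(info))
}
```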

Meta-Data

It is often useful to add some meta-data to data or services consumed by external clients. This helps with client management.

Version

It is easier to manage clients if the data format or services (e.g. public API) they consume are versioned. If we bump the version, clients can detect that and request new data or a client software update.
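A sketch of a client-side version check, assuming a JSON payload that carries a version field. The envelope layout, the supportedVersion constant and the returned status strings are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// supportedVersion is the data format version this client understands.
const supportedVersion = 2

// envelope wraps the payload with its format version.
type envelope struct {
	Version int             `json:"version"`
	Data    json.RawMessage `json:"data"`
}

// checkVersion decides what the client should do with a received payload.
func checkVersion(payload []byte) (string, error) {
	var e envelope
	if err := json.Unmarshal(payload, &e); err != nil {
		return "", err
	}
	switch {
	case e.Version == supportedVersion:
		return "ok", nil
	case e.Version > supportedVersion:
		return "client update required", nil
	default:
		return "request newer data", nil
	}
}

func main() {
	fmt.Println(checkVersion([]byte(`{"version": 3, "data": {}}`)))
}
```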