shpool is a tool like tmux and screen that keeps persistent shell sessions alive across hangups.
It’s quite new (first released in 2023) and extremely simple to use.
Usually I only need one persistent shell on a remote machine, and this tool, unlike tmux,
is designed to support exactly one shell per session.
That makes me very satisfied.
Currently there is no Bash auto-completion setup provided by the official repo,
so I implemented one myself. With that, I can make my life easier. 🤣
```bash
shpool <tab>         # complete sub-commands like attach / list / help ...
shpool attach <tab>  # complete available sessions
```
Here is the auto-completion setup script for Bash, for anyone who has the same requirement.
```bash
# The completion function (the function name here is arbitrary).
_shpool_complete() {
    if [[ ${#COMP_WORDS[@]} -eq 2 ]]; then
        local commands=(attach detach kill list version help)
        local partial="${COMP_WORDS[$COMP_CWORD]}"
        mapfile -t COMPREPLY < <(compgen -W "${commands[*]}" -- "${partial}")
        return 0
    fi

    [[ ${#COMP_WORDS[@]} -ne 3 ]] && return 0

    # Now, try completing session names for commands that take them
    local partial="${COMP_WORDS[$COMP_CWORD]}"
    local command="${COMP_WORDS[$COMP_CWORD-1]}"

    case "${command}" in
        attach|detach|kill)
            local -a sessions
            mapfile -t sessions < <(shpool list | tail -n +2 | awk '{print $1}')
            mapfile -t COMPREPLY < <(compgen -W "${sessions[*]}" -- "${partial}")
            return 0
            ;;
        *)
            # No need for the other commands; they don't take arguments
            COMPREPLY=()
            return 0
            ;;
    esac
}

# Register the function for the shpool command.
complete -F _shpool_complete shpool
```
In Python, we can make a container type iterable by implementing the __iter__ method.
For example, if we want to provide all values in a range [lo, hi) …
```python
class TwoTillSix:
    def __iter__(self):
        for val in range(2, 6):
            yield val

list(TwoTillSix())  # [2, 3, 4, 5]
```
With the help of this generator function, we don’t need a separate iterator type;
the state during iteration is kept directly in local variables.
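For contrast, here is what the same range looks like without a generator, using the classic iterator protocol with a separate iterator class (the class names here are made up for illustration):

```python
class TwoTillSixIterator:
    """A hand-written iterator: the iteration state lives in an attribute."""
    def __init__(self):
        self.val = 2

    def __iter__(self):
        return self

    def __next__(self):
        if self.val >= 6:
            raise StopIteration
        val = self.val
        self.val += 1
        return val


class TwoTillSixExplicit:
    def __iter__(self):
        return TwoTillSixIterator()


assert list(TwoTillSixExplicit()) == [2, 3, 4, 5]
```

The generator version collapses both classes into one method, with the state held in the suspended frame instead of an attribute.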
In this post, let’s see how we can do this in C++, with the help of std::generator<T>.
A shorter version of this post can be found in Gist and Compiler Explorer.
What happens in the example above is that the Python interpreter binds the execution state of
that function to a generator object. Every time we want a value from that generator,
execution resumes until the function yields the next value (or returns).
This is doable because Python provides built-in support for coroutines (interruptible/resumable
functions), so control flow can interleave between callee and caller functions. And that’s a
primitive building block of concurrent programming.
```python
values = TwoTillSix()

for val in values:
    print(val)
    if val == 3:
        break

# Prints 2 and 3. The body of `TwoTillSix.__iter__` is only executed twice.
```
As we can see here, the generator makes it possible for some values to be evaluated lazily,
and we can even stop the evaluation at any time once we lose interest.
All of this is possible because of coroutine support.
C++ Coroutine and std::generator
Fortunately, C++ has had language-level support for coroutines since C++20.
When the compiler sees a function using the new keywords co_await / co_yield / co_return,
it generates the corresponding code to provide partial execution, plus the storage
for that function’s state while it is suspended.
Also, starting from C++23, std::generator is included in the standard library, as the
coroutine (promise) type that captures the common usage of… yes, generators!!
(Actually I’m surprised that there was no high-level coroutine type at all in the standard library
when coroutines were accepted as a language feature in the finalized C++20.)
```cpp
std::generator<int> fibonacci() {
    int a = 1, b = 0;
    while (true) {
        co_yield a;
        b = std::exchange(a, a + b);
    }
}

int main() {
    for (int num : fibonacci()) {
        if (num > 10) break;
        std::println("{}", num);  // prints 1 1 2 3 5 8
    }
    return 0;
}
```
When fibonacci is called here, an interruptible function is created and wrapped in the
std::generator return value. Every time we take the next value out in the for-loop, the function
resumes and yields the next value (and is then suspended again).
So now we have the ability to interleave control flow, just like the Python example earlier!!
But… that’s a free function, and usually what we encounter is a container class.
A class must have begin/end methods, with corresponding increment/dereference operators on
the returned iterator type, in order to be iterable.
(It’s modeled as the named requirement Container in the standard.)
If a class needs some extra method call in order to be iterated, it just looks… not like C++.
```cpp
struct Factorial {  // a virtual container for the factorial series
    std::generator<int> generate() {
        int product = 1;
        int val = 1;
        while (true) {
            co_yield product;
            product *= val++;
        }
    }
};

int main() {
    Factorial fac {};
    for (int num : fac.generate()) {  // well.. can we iterate on fac directly?
        if (num > 10) break;
        std::println("{}", num);
    }
}
```
So… can we keep the implementation simple (using the std::generator coroutine type),
while still iterating over the container in the usual way?
CRTP: Curiously Recurring Template Pattern
Now our goal becomes clear:
“How can we make a class usable as before, if the class only provides a method
that returns a std::generator?”
That’s where we adopt CRTP to solve the problem.
```cpp
/// @brief HasGenerator identifies the customization point of the CRTP base AutoContainer.
/// @tparam C the child class type.
/// @tparam T the value type.
template <typename C, typename T>
concept HasGenerator = requires(C c) {
    { c.generate() } -> std::same_as<std::generator<T>>;
};

/// @brief A CRTP base type that allows an easy iteration implementation.
/// @tparam T the value type of the element being iterated. Not the child class type.
/// @details The child class must provide a std::generator<T> generate() member function.
template <std::semiregular T>
struct AutoContainer {
    // Iter (definition omitted here) wraps the std::generator<T> and forwards
    // the increment/dereference operations to it.

    // deducing-this so that we can do CRTP.
    // Also use the concept to give a better error message on mistakes.
    Iter begin(this HasGenerator<T> auto&& self) {
        return Iter(self.generate());
    }

    // deducing-this not required; just be symmetric with begin.
    std::default_sentinel_t end(this auto&& self) {
        return std::default_sentinel;
    }
};
```
AutoContainer can be inherited by any class that wants all the implementation for iteration.
The base class here provides an iterator type to capture the generator, and forwards the
increment/dereference operations to it. With this iterator type, the remaining
effort is in the begin method: just construct the iterator from self.generate().
Here self is typed via deducing-this, so generate() can be provided later by the child class.
And that’s the place where the base class connects to its child.
Also, on the begin method, we add the concept constraint HasGenerator.
This way we avoid the error waterfall that’s common when misusing a templated library.
And the interesting thing here is: the thing being templated so that we achieve CRTP
is the templated begin method, not the AutoContainer class itself.
If two child classes both inherit from AutoContainer<int>,
only one AutoContainer type is instantiated, but two different begin methods are instantiated.
```cpp
/// @brief Factorial sequence generator.
struct Factorial : public AutoContainer<int> {
    // Implement `generate` so this class can be iterated like a container.
    std::generator<int> generate() const {
        // The mutable state lives in the local variables of this coroutine,
        // so the method receiver can be const.
        int product = 1;
        int val = 1;
        while (true) {
            co_yield product;
            product *= val++;
        }
    }
};

int main() {
    const Factorial fac {};
    for (int value : fac) {  // notice that we don't call generate() here
        if (value > 10) break;
        std::println("{}", value);  // prints 1, 1, 2, 6
    }
}
```
Now we can use AutoContainer by inheriting from it in any child class.
In the example above, Factorial inherits from AutoContainer<int> (since it generates integers).
All we need to do is implement the generate method. Being a coroutine, its logic stays clean.
Now everything works and looks nice!! :D
To make this class recognized as a std::ranges::input_range, that is, to make it
work with the <ranges> library, the base AutoContainer actually needs more, like
value_type/difference_type on its iterator, support for the post-increment operator, and so on…
But that’s probably too much for a tutorial, so let’s stop here.
More Standard Library Support, Maybe?
We described the minimal concepts of coroutines, one important application of coroutines,
the Generator (std::generator), and how to use it in a beautiful way.
But there is another important application of coroutines: the Promise (with an event loop).
JS developers are already used to the Promise type (or the async/await keywords).
Python has also supported this kind of usage since Python 3.5.
But unfortunately, there is still no such thing in the C++ standard library, even though
coroutines were standardized in C++20.
The good news is that there is already a proposal, P2300, and a reference implementation, NVIDIA/stdexec.
(They don’t use the term Promise, probably because promise_type already has another meaning
in the context of C++ coroutines.)
Hope we can see more of these in the future.
It’s almost impossible to write a frontend without any UI framework nowadays.
All these frameworks provide some mechanism by which, whenever we change a value in JavaScript,
some element in the UI changes automatically. For example, adding a new row to a table
when we push an item into a list.
But how does this work under the hood?
That’s the Proxy object.
Internal Method and Proxy
We need to talk about internal methods in JS first.
An internal method is a special method/function that decides how an object behaves in the JS runtime.
For example, the [[Get]] internal method determines the result when we access some property prop
on an object obj, e.g. obj.prop. And the [[Call]] internal method determines what
happens when we call the object obj as a function, e.g. obj().
These internal methods can be intercepted and customized for an existing object
with a Proxy instance.
For example, say we want to intercept the property-read action on some object. Here is the way to do it.
```javascript
var person = { name: "Alice" };

var agent = new Proxy(
    person,
    {
        get: function(target, property, receiver) {
            console.log(`access "${property}" as ${target[property]}`);
            return target[property];
        },
    },
);

var _ = agent.name;  // prints: access "name" as Alice
```
In the snippet above, agent is the proxy object instance.
The second argument to the Proxy constructor is the handler, and the get function in that handler
is a trap for the [[Get]] internal method.
The set Trap for Binding
Just as we can intercept the property-read action on some object, we can also intercept
the property-write action. And that’s what UI frameworks do underneath.
Assume we have a paragraph in the DOM tree, identified by the id greeting.
```html
<p id="greeting">Hi</p>
```
We can make the text update automatically whenever we make some
change to the JavaScript object.
```javascript
var element = document.querySelector('#greeting');

var handler = {
    set: function(target, property, value, receiver) {
        if (property !== 'text') return false;
        target[property] = value;
        element.textContent = value;  // react to the change
        return true;
    }
};

var greeting = new Proxy({text: element.textContent}, handler);

console.log(greeting.text);  // prints 'Hi'
greeting.text = 'Hello';     // the text in the HTML page changes accordingly
```
Once this proxy object is created, we have made the view (HTML) react to
the model (the JavaScript object), and the binding is established.
In the examples above, we may have noticed that all the intercepted objects are object literals.
That means we cannot do a similar thing with primitive types in JS, like number and string.
And that does have an impact on the design of UI frameworks.
For example, the Vue.js framework has a ref() wrapper for exactly that.
However, Proxy is still a powerful feature in JS, and probably one of the most important features
for any senior engineer who needs to work with JS.
So just get used to it. :D
The problem is… in C++, this style is just a convention; _ is still a regular variable.
So if we want to ignore two values in the same scope, it doesn’t work: the second
declaration of _ conflicts with the first.
```cpp
void foo() {
    auto [_1, done1] = someJob<int>();
    auto [_2, done2] = someJob<std::string>();
    // we need to name _2 apart from _1
}
```
That’s frustrating, especially for people with experience of pattern-matching expressions in
other languages.
So in C++26 (proposed by P2169), we have a new way to interpret the semantics of _.
The rule is simple:
If there is only one declaration of _ in some scope, everything is the same as before.
We can still reference it later if we want, although it’s probably a bad smell to use _ in this case.
If there are more declarations of _, they all refer to different objects respectively.
In this case, they can only be assigned to; trying to use them is a compile error.
And we can finally write something that looks more natural.
```cpp
void foo() {
    auto [_, done1] = someJob<int>();
    auto [_, done2] = someJob<std::string>();
}
```
Golang has had this feature from the beginning; it’s called the blank identifier.
As for Python, being a dynamically-typed language, there is no problem using _ for values of
different types. _ is also defined as a wildcard where pattern matching was introduced to Python (PEP 634).
But it’s intended to be storage accessed through the network, so any operation on it is
inherently going to fail at some point.
Also, it’s 2024 now: we love FP, prefer expressions over statements, and monadic operations
are so cool.
Thus we decided to wrap all the return types in std::optional to indicate that these actions may fail.
```
...\include\optional(100,26): error C2182: '_Value': this use of 'void' is not valid
```
Well. template stuff.
What Happened?
The problem is that void is an incomplete type in C/C++, and it always has to be treated specially
when we try to use it.
By incomplete in C/C++, we mean a type whose size is not (yet) known.
For example, if we forward-declare a struct type and define its members later, the struct type
is incomplete before the definition.
```cpp
struct Item;

Item item;  // <- invalid usage, since the size of Item is unknown yet.

struct Item {
    int price;
};

Item item;  // <- valid usage here.
```
And void is a type that can never be completed, by specification.
But can’t we have a function that returns void?
Well, we return nothing:
```cpp
void foo() { }

void foo() { return; }  // Or with an explicit return; both are equivalent.
```
BTW, C before C23 preferred putting a void in the parameter list to indicate that a function takes
nothing, e.g. int bar(void), but that’s a kind of broken design here,
since we cannot evaluate bar(foo()). There is no such thing that is a void and exists.
Conceptually, std::optional<T> is just some T with the additional information of whether the value exists.
```cpp
template <typename T>
struct MyOptional {
    T value;
    bool hasValue;
};
```
Because it’s impossible to have a member void value,
std::optional<void> is not a valid type in the first place.
(Well, we could make a specialization for void, but that’s another story.)
So, How Can We Fix It?
The problem here is that there is no valid value of void in C/C++.
At some level, a program can be thought of as a bunch of expressions, and running a program
is just the evaluation of these expressions (plus the side effects, for real products / services).
The atom of an expression is a value. If there is a concept that cannot be expressed
as a value, we are kicking ourselves.
Take Python for example: if we have a function that returns nothing, the function actually
returns None when it exits.
```python
def foo():
    pass

assert foo() is None  # the check passes here
```
```python
def foo(arg: None) -> None:
    pass

def bar(arg: None) -> None:
    pass

bar(foo(None))  # well.. if we really want to chain them together
```
So nothing itself is a thing; in Python, we call it None.
Every expression now evaluates to some value that exists, and the type system built on
that is complete in every case.
This concept of nothing itself is called the unit type in type theory.
It’s a type that has one and only one value.
In Python, the value is None; in JavaScript it’s null; in Golang … maybe struct{}{} is a
good choice, although not standardized by the language.
Unit Type in C++
Now it’s time for C++. As we already saw, void is not a good choice for the unit type because we
cannot have a value of it. Are there other choices?
Just defining an empty struct and using it is probably not a good choice, since
our custom unit type would not be compatible with the unit types from other code in the product codebase.
How about nullptr, the one and only value of std::nullptr_t
(so the type is std::optional<std::nullptr_t>)?
It’s a feasible choice, but it looks weird, since a pointer implies indirect-access semantics,
which is not the case when used with std::optional<T> here.
How about using std::nullopt_t? It’s also a unit type, but it’s even more confusing.
What does std::optional<std::nullopt_t> mean? An optional with an empty option?
There is a static assert in the std::optional<T> template that forbids this usage directly,
probably because it’s too confusing.
Maybe std::tuple<>? A tuple with zero elements, so it has only one value: the empty tuple.
That seems to be a good choice, because the canonical unit type in Haskell is (), the empty tuple,
so it looks natural to people coming from Haskell.
But personally I don’t like this either, since the type now has nested angle brackets,
as in std::optional<std::tuple<>>.
There is a type called std::monostate, which arrived at the same time as std::optional in C++17.
This candidate carries no extra implication in either its type or its name.
It’s monostate! Just a little wordy.
std::monostate was originally designed to solve the problem of letting a std::variant<...> be
default-initialized regardless of its alternatives. But its name and its characteristics all fit our
requirement here. Thus it’s a good choice for wrapping a function that may fail but
returns nothing.
For a long, long time, Golang had no standard way to represent an iterable sequence.
C++ has range adaptors and iterators (although not strictly typed, only by concept), Python has
iterable/iterator via __iter__/__next__, and JavaScript has standardized for-of and
Symbol.iterator since ES6.
Now it’s Golang’s turn. Starting from Go 1.23 (Aug. 2024), we have iterator functions.
```go
func main() {
    for idx := range Iota() {
        if idx == 3 {
            break
        }
        fmt.Println(idx)  // print 0 1 2
    }
}
```
According to the Go 1.23 Release Notes,
the range keyword now accepts three kinds of functions, which take another yield function
that yields zero/one/two values.
The loop control variable and the body of the for-loop are translated into the yield function by
the language definition. So we can still write an imperative-style loop structure even though we are
actually doing some functional-style function composition here.
Why Do We Need This?
Standardize the iterable/iterator interface is a important pre-condition for lazy evaluation.
For example, how should we do when we need to iterates through all non-negative integer,
and doing some map/filter/reduce on them? It waste space to allocate a list for all these
integers (if possible).
Someone may say, “we already have channel types.”
Well, but that requires a separate goroutine. We probably don’t want such a heavy
cost every time we do some iteration.
A separate goroutine also means additional synchronization and lifecycle control.
For example, how can we terminate the Count goroutine when we need to break early out of the loop?
```go
func main() {
    for idx := range Count(0) {
        if idx == 10 {
            break
        }
        fmt.Println("Loop: ", idx)
    }
}
```
We would need some mechanism like a context object or another channel, right?
That’s a burden for such an easy task.
On the other hand, iterator functions are just ordinary functions that accept another function
to yield/output the iterated values, so they are much more lightweight than a separate goroutine.
We want fast programs, right? :D
The Stop Design
In languages like Python and JavaScript, the iterator function (or generator, in Python terms)
is paused and control is transferred back to the function that iterates the values.
When a break/return happens and no more values are required, the iterator function just
gets collected by the runtime, since there are no more references to the function object.
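This cleanup is observable in Python: a suspended generator is finalized (a GeneratorExit is raised at the paused yield) when it is closed or garbage-collected. A small sketch, with names of my own choosing:

```python
log = []

def counter():
    val = 0
    try:
        while True:
            yield val
            val += 1
    finally:
        # runs when the generator is exhausted, closed, or collected
        log.append("cleaned up")

gen = counter()
for idx in gen:
    if idx == 3:
        break  # gen is merely suspended here, not finalized yet

assert log == []
gen.close()  # raises GeneratorExit at the paused yield; finally runs
assert log == ["cleaned up"]
```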
But how do we break out of the iteration early, if control is transferred
into the iterator function? Let’s look at the function signature again
(taking the one-value iterator function as an example).
```go
func(yield func(idx int) bool)
```
The yield function returns a bool to indicate whether the loop body reached its
end or encountered a break statement. So in the normal case, we continue to the next possible
value after yield returns; but if we get false from yield, our iterator function
can return immediately.
Ecosystem around Iterator
The beauty of iterators only appears if the ecosystem, or let’s say the common operations around
iterators, is already implemented in the standard library. That means:
- Conversion from and to standard container types, like slice, map, and chan
- Operations and compositions of iterators, e.g. map/filter/reduce/chain/take …
In Python, there are generator expressions, which involve implicit map/filter.
reduce is available in the functools module, and there are many useful functions in the itertools
package, e.g. pairwise, batched, chain. Most builtin container types take an iterable as the first
argument of their constructor.
In Golang, the first part was mostly done along with the release of Go 1.23.
For example, to convert a slice from and to an iterator, we can use slices.Collect and slices.Values.
For the second part, there is a plan to add an x/exp/xiter package under the golang.org namespace.
There should be at least Concat, Map, Filter, Reduce, Zip … once it’s released.
But unfortunately it’s not complete yet.
- Any/All: short-circuit lazy evaluation of a predicate on a sequence of values
- Forward/Backward: absorb the input and iterate in reversed order
For example, if we want to define an iterator function for a binary tree in a recursive manner,
we can use Empty and Pack together with Chain to implement it easily.
```go
type Node struct {
    val   int
    left  *Node
    right *Node
}

func Traverse(node *Node) iter.Seq[int] {
    // Empty is useful as the base case during recursive generator chaining.
    if node == nil {
        return Empty[int]()
    }
    // Pack is useful to promote a single value into an iterable for chaining.
    return Chain(
        Traverse(node.left),
        Pack(node.val),
        Traverse(node.right),
    )
}
```
```python
assert 5 == (2 | Adder(3))  # Is 5 equal to 2 + 3? Yes!!
```
This works because the | operator of the integer 2 checks the type of Adder(3), finds it
is not something it recognizes, so it returns NotImplemented, and our reflected magic method kicks in.
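The Adder used above isn’t shown in this excerpt; a minimal reconstruction consistent with that behavior might look like:

```python
class Adder:
    def __init__(self, operand):
        self.operand = operand

    def __ror__(self, left):
        # Invoked when the left operand's __or__ returns NotImplemented.
        return left + self.operand


assert (2 | Adder(3)) == 5
```

Note the parentheses: `|` binds looser than `==`, so `2 | Adder(3) == 5` would be parsed as `2 | (Adder(3) == 5)`.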
In C++, the | operator is overloaded(?) on range adaptors to accept ranges.
So maybe we can make something similar: some object that implements __ror__, accepts
an iterable, and returns another value (probably an iterator).
Pipe-able Higher Order Function
So back to our motivation: Python already has things like filter, map, and reduce,
and also the powerful generator expression, to filter and/or map without an explicit function call.
```python
values = filter(lambda v: v % 2 == 0, range(10))
```
```python
values = (v for v in range(10) if v % 2 == 0)
```
But it’s just hard to chain multiple operations together while preserving readability.
So let’s make a filter object that supports the | operator.
It just takes some time to write the class representations for filter, map, reduce,
take, any … and any higher-order function you may find useful.
Wait, that looks so tedious. Python is supposed to be a powerful language, isn’t it?
Piper and Decorators
The function capturing and the __ror__ implementation can be annoying to repeat for every
higher-order function.
If we can make __ror__ only take the left operand and return the return value of the captured
function, then we can extract a common Piper class. We just need another function to produce a
function that already captures the required logic.
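A minimal sketch of such a common Piper class and two factories built on it (the helper names here are my own):

```python
class Piper:
    """Wraps a one-argument function; the argument arrives via `|`."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        return self.func(left)


# One wrapper per operation: each factory captures the required logic
# and hands it to the common Piper class.
def pipe_filter(pred):
    return Piper(lambda iterable: filter(pred, iterable))

def pipe_map(func):
    return Piper(lambda iterable: map(func, iterable))


result = list(range(10) | pipe_filter(lambda v: v % 2 == 0) | pipe_map(str))
assert result == ["0", "2", "4", "6", "8"]
```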
Now it looks a little nicer … but do we still need to implement all the wrapper functions for all
kinds of operations?
Again, the only difference between these wrapped functions is the logic inside the apply function,
so we can extract this part again, with a decorator!! :D
The on decorator accepts some function func, and returns a function that first takes the
tail arguments of func and returns a function that accepts the head argument of func through
the pipe operator.
So now we can express our thoughts in our codebase using pipeline-style code,
with just one helper class and one helper decorator! :D
```python
# filter/map/on here are the pipe-able versions from this post, shadowing the builtins.
values = range(10)
result = (
    values
    | filter(lambda val: val % 2 == 0)
    | map(str)
    | on(lambda chunks: "".join(chunks))()  # create a pipe-able object on the fly
)
assert result == "02468"
```
or
```python
for val in range(10) | filter(lambda val: val % 2 == 0):
    print(val)
```
```python
def on(func: Callable[Concatenate[_T, _P], _R]) -> Callable[_P, Piper[_T, _R]]:
    """
    "on" decorates a func into a pipe-style function.

    The resulting function first takes the arguments, excluding the first, and
    returns an object that takes the first argument through the "|" operator.
    """
```
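A minimal implementation consistent with that docstring might look like this (Piper is repeated so the sketch is self-contained, and the decorated example function is my own):

```python
class Piper:
    """Wraps a one-argument function; the argument arrives via `|`."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        return self.func(left)


def on(func):
    """Decorate func into a pipe-style function."""
    def take_tail(*args, **kwargs):
        # Capture the tail arguments now; the head argument arrives via `|`.
        return Piper(lambda head: func(head, *args, **kwargs))
    return take_tail


@on
def join_with(chunks, sep):
    return sep.join(chunks)


assert (["a", "b", "c"] | join_with("-")) == "a-b-c"
```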
Thanks to the lazy symbol loading in Unix environments,
we can do many interesting things to the functions of a shared library
when executing an executable.
All we need to do is:
1. Implement a shared library that contains the functions we want to hijack.
2. Run the executable with our magic library inserted.
Make a Shared Library
If we want to replace some function with a stub / fake implementation,
we can just implement a function with the same name and the same signature.
For example, if we want to fix the clock during a unit test …
```c
// in hijack.c

#include <time.h>

// a "time" function which always returns the timestamp of the epoch.
time_t time(time_t* arg) {
    if (arg) *arg = 0;
    return 0;
}
```
If we want to observe some function call but still delegate the
call to the original function, we can implement a function that loads the
corresponding function at runtime and passes the call through.
For example, say we want to monitor the call sequence of file-open actions.
```python
class Container:
    def invoke(self, t: type[T]) -> T:
        ...  # search for and build an instance of type t

c = Container()
instance: SomeType = c.invoke(SomeType)  # ok, we can infer instance is SomeType
```
Here, type[T] (or typing.Type[T], the old way) expresses that we are using t to pass
some type T itself, rather than an instance whose type is T.
From typing document:
A variable annotated with C may accept a value of type C. In contrast, a variable annotated with
type[C] (or typing.Type[C]) may accept values that are classes themselves
The Problem
OK, so we have type[T] to use. A DI library author can use t, typed as type[T], for
indexing, and library users enjoy the type safety brought by a static type checker.
So we take this library into a real-world scenario … and immediately run into a problem.
When we define an interface type and ask the DI container for the implementation instance
corresponding to that interface, since an interface is usually an abstract class (or protocol),
the mypy type checker reports an error
(mypy: type-abstract).
Variables and parameters annotated with Type[Proto] accept only concrete (non-protocol) subtypes of Proto.
The main reason for this is to allow instantiation of parameters with such type. For example:
```python
def fun(cls: Type[Proto]) -> int:
    return cls().meth()  # OK
```
Since PEP 544 was finished back in 2017, and mypy has run this check by default for years,
it’s probably too late to change the behavior now.
To solve this problem, a new issue (9773 · python/mypy) was opened in 2020,
aiming to define a new annotation, TypeForm[T]/TypeExpr[T], to express “the type of an arbitrary type”.
As of now (2024-06), the corresponding draft, PEP 747, has also been proposed.
If all goes well, we will use TypeExpr[T] to express this kind of generic function:
```python
class Container:
    def invoke(self, t: TypeExpr[T]) -> T:
        ...  # search for and build an instance of type t

class SomeType(Protocol): ...

c = Container()
instance = c.invoke(SomeType)               # ok, we find an object of type SomeType for you!
operator = c.invoke(Callable[[int], bool])  # you need an (int -> bool)? no problem!
```
As for now … library users can just add the following line to any file that uses such a library.
The scope of the change and its impact should both be acceptable.