If you know Haskell, you are familiar with the following expressions
Several of the above expressions produce infinite lists. For those of you who are used to mainstream programming voices may be confused as to how the concept of infinity can be expressed inside a finite amount of memory. The main reason is that Haskell is a language that uses an inert evaluation strategy by default, and the part that is not used is just an expression inside the memory and does not really do the computation.
If we just look at the above expressions, many of you may say that it doesn’t feel like there’s anything magical about it, and it doesn’t seem to do anything. Let’s look at the following code again.
The fibonacci
series in Haskell.
|
|
Here fibonacci
itself is an inert structure, so in the calculation, it will first count the two 1
s in front of the list, to get 1 : 1...
, and then how to express the fib(n) = fib(n - 1) + fib(n - 2)
property of fibonacci
? We can notice that n - 1
and n - 2
are exactly one place apart in the series, so n
can be seen as the result of the sum of the wrong places in the series.
Let’s look at a sieve method for finding prime numbers. If you are not familiar with the sieve method, you can click wiki
to see how the algorithm works. The following code is a simple implementation of Haskell
.
So, Why Lazy?
Inert lists make the code and structure more flexible for certain list operations of indefinite length. To use the primes
list as an example, in a traditional C or Java implementation, we would have to declare a maximum length or a maximum range of values, such as prime numbers up to 10000. If the later calculation is to use more than this range, we will have to re-call the generation function to regenerate a longer list. The problem here is: first, to actively call this factory function, and second, to manually maintain a cache list if we want to reuse the already computed data, which inevitably increases the complexity of the code. Another possible scenario is that we pre-generate a very long list and only use a drop of data from the head of the list in the later calculations, which is also a great waste.
The use of inert lists increases the expressiveness of our programming, allowing us to focus more on the properties of the data structure itself, rather than wasting time on how to manage the stack. Because the inert value feature ensures that we only compute a value when we need it, we can automatically minimize our computation and save resources.
For example, we can read and write files with lazy byteString, which itself does not load the whole file into our memory, but reads it on demand. Sometimes we can read a large file and filter out only the first few dozens of data we need, but we have to put hundreds of M or even G files into memory.
Here is also an article from more than 10 years ago How to Speed Up Lo-Dash ×100? Introducing Lazy Evaluation, if you are interested, you can click on it.
Implementing Lazy Lists in JavaScript
Is there an inert structure in JavaScript? Let’s look at the following example.
The fetch
method itself is executed immediately, but if the condition is met, the fetch
method here will be executed twice. This is not the behavior we expect, and the usual way to make this fetch
action execute when we need it, instead of starting it when it is declared, is to change it to look like the following.
Inspired by this, we can roughly implement the following structure.
List<T>
is essentially a single-linked table, and the tail
passed into the constructor is a factory function that builds new List
nodes. Only when we access a node, we evaluate its head
, and when we access its next node, we evaluate tail
, otherwise both head
and tail
are just expressions to be evaluated.
This seems to have solved my problem, but this structure has a lot of unnecessary extra overhead when converting to and from normal Array
.
So is there a more natural structure in JavaScript
that would save us from having to construct such a complex object, simplifying the code while making it more generalizable?
Getting Started with Iterable
A new feature of ES6 gave me the answer I was looking for, Iteration Protocols. If the MDN description is too long, you can just look at the equivalent type declaration below.
|
|
All objects that implement an Iterable interface can be used by means of an object such as for... ...of...
, ... .itor
and Array.from
. The iteration ends when done
is true
in the return value of the next
method. And only when we access the next
method will we move to the next iteration, the ideal Lazy
structure.
At this point we look at how our fibonacci
should be written.
|
|
At this point, we can already express an inert infinite series. But the above code is too cumbersome, so it’s a good thing that ES6
also provides us with Generator
, which makes it easy to write IterableItorator
, and in a sense, Generator
is syntactic sugar for the above code.
Using Generator
, the above code can be simplified to look like the following.
If you don’t know the syntax of Generator, you can read this article A Simple Guide to Understanding Javascript (ES6) Generators first.
Define Infinite List
Moving on from the above code, the following code implements the repeat
, cycle
, iterate
, range
and other methods from the beginning of the article.
|
|
As you can see, the code is very intuitive and easy to understand.
Define Operator
Once we have the list, we need to operate on top of it. The following code implements the map/filter/take/takeWhile
methods respectively.
|
|
The above code is relatively simple. The more difficult part is to implement the zip
method, that is, how to merge two lists into one?
The difficulty is that if you receive an Iterable
object, you don’t necessarily need to implement the next method, such as Array
, String
, etc. Also, not all Iterable
objects can be accessed by index
. In addition, if we want to turn Array.from
into an array first and then operate on the array, we will encounter a situation where the Iterable
object we pass in is infinite, like fibonacci
above, in which case we can’t use Array.from
.
One of my ideas is to elevate an Iterable
object to an IterableItorator
object, and then iterate through it one by one using the next method.
How
? Fortunately, Generator
gives us a yield*
operator that allows us to easily define a lift
method.
With this lift
method, it is easy to write zip
methods and zipWith
methods.
|
|
More methods can be found at the bottom of my repo
, so I won’t list them all here.
Summary
Generator
and Iterator
are very powerful language-level capabilities that ES6
brings to the table, and its own evaluation can be seen as inert.
When co
from TJ
first came out around 13 years ago, the code was amazingly short and concise. However, in our use, one is limited by browser compatibility, and the other is limited by our use scenario, I think we have not developed its features far enough. Combining IO
, network
, Generator
and Iterator
can do a lot more for us.
On the contrary, the abuse of inert structures will cache a lot of thunk during the execution of the program and increase the overhead in memory.
- GitHub lazyList。
- Slide Lazy List With Generator and Iterator
Reference https://tech.youzan.com/lazy-list-with-generator-and-iterator/