Friday, 1 November 2019

Lazy Split

Those working with C# should be quite familiar with the idea of performing lazy operations on collections. With LINQ to Objects, when you chain operations (provided as extension methods by System.Linq.Enumerable) on an IEnumerable, these operations are not executed until the first time you iterate the resulting enumerable (this is called deferred execution). Furthermore, whenever possible, the operations are performed one element at a time, as we retrieve each element during the iteration. For example, for the Select (equivalent to map in other languages) or Where (equivalent to filter in other languages) methods, when we retrieve the first element, the delegate provided to Select is executed for only one element, and the delegate provided to Where is executed only until we obtain the first element fulfilling the condition. This is called lazy evaluation. However, a method like OrderBy needs to process all the elements before it can return the first item; this is called eager evaluation. All this is much better explained in this post.
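
To make the lazy behaviour of Select and Where more concrete, here is a minimal sketch of the same idea in JavaScript using generators. The names lazyMap and lazyFilter are made up for this illustration (they are not part of LINQ or of any library), and the calls array is only there to record which delegates actually run.

```javascript
// Hypothetical lazy counterparts of Select and Where, written as generators.
function* lazyMap(source, fn) {
  for (const item of source) {
    yield fn(item);
  }
}

function* lazyFilter(source, predicate) {
  for (const item of source) {
    if (predicate(item)) {
      yield item;
    }
  }
}

// Record every delegate call so we can see how much work was really done.
const calls = [];
const doubled = lazyMap([1, 2, 3, 4, 5], x => { calls.push("map " + x); return x * 2; });
const big = lazyFilter(doubled, x => { calls.push("filter " + x); return x > 4; });

// Deferred execution: building the chain has not invoked any delegate yet.
const callsBeforeIteration = calls.length;

// Lazy evaluation: pulling one element only processes the items 1, 2 and 3.
const first = big.next().value;

console.log(callsBeforeIteration); // 0
console.log(first);                // 6
console.log(calls.length);         // 6 (three map calls, three filter calls)
```

The elements 4 and 5 are never touched, because nobody asked for a second result.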

The other day I came across lazy.js, which brings this concept of deferred execution of operations on collections to the JavaScript world. Basically, they have implemented the methods provided by Underscore/Lodash in a "lazy manner". They are all deferred, and I guess that whenever possible the evaluation will be lazy rather than eager. Lazy.js is not only about collections; it also includes lazy functionality for strings, such as a lazy split. Honestly, I had never thought about something like this, but it makes a lot of sense for huge strings.

Out of curiosity, I decided to implement my own lazy split, which is pretty straightforward thanks to generators:

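function* lazySplit(source, separator){
 let curPos = 0;
 // declared with let, otherwise curPart leaks as an implicit global
 let curPart = "";
 while (curPos < source.length){
  // accumulate characters until the buffer ends with the separator
  curPart += source[curPos];
  if (curPart.endsWith(separator)){
   // yield the part without the trailing separator and start a new one
   yield curPart.substring(0, curPart.length - separator.length);
   curPart = "";
  }
  curPos++;
 }
 // whatever remains after the last separator (possibly an empty string)
 yield curPart;
}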

function doTest(str){
 console.log("- splitting: " + str);
 let generatorObj = lazySplit(str, ";");
 for (let it of generatorObj){
  console.log(it);
 }
 console.log("--------------\n");
}

doTest("a;bc;de");
doTest("a;bc;");
doTest("a;;de");
doTest(";;");
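
The benefit of laziness shows up when we only need the first few parts of a huge string. The following is a sketch of a variant under my own assumptions, not the way lazy.js does it: lazySplitByIndex and take are hypothetical names, the variant locates each separator with indexOf instead of growing a buffer character by character, and it rejects an empty separator (a design choice for this sketch). For non-empty separators it should yield the same parts as the version above.

```javascript
// Hypothetical variant: finds each separator with indexOf instead of
// accumulating characters one by one.
function* lazySplitByIndex(source, separator) {
  if (separator === "") {
    // Assumption for this sketch: an empty separator is not supported.
    throw new Error("separator must not be empty");
  }
  let start = 0;
  while (true) {
    const end = source.indexOf(separator, start);
    if (end === -1) {
      break;
    }
    yield source.substring(start, end);
    start = end + separator.length;
  }
  // Whatever follows the last separator (possibly an empty string).
  yield source.substring(start);
}

// Hypothetical helper: stop pulling items after the first n.
function* take(iterable, n) {
  if (n <= 0) return;
  let count = 0;
  for (const item of iterable) {
    yield item;
    if (++count >= n) return;
  }
}

const hugeString = "a;bc;de;" + "x;".repeat(100000);

// Only the beginning of hugeString is scanned to produce these three parts.
console.log([...take(lazySplitByIndex(hugeString, ";"), 3)]); // [ 'a', 'bc', 'de' ]
console.log([...lazySplitByIndex(";;", ";")]);                // [ '', '', '' ]
```

Because take returns as soon as it has enough items, the underlying generator is closed and the rest of the string is never examined.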

.NET lacks a lazy split in its standard library, so other people have implemented it.
