Monday, February 1, 2021

batchgrep: how to match many regexps 25x faster!

I ran into a situation where I had to search a large file for 1100+ regexps.
Running grep 1100 times is trivial but slow: it took 5+ minutes (using MSYS2 on Windows 10), which is unacceptable in my build scenario.

So I came up with a script that merges as many regexp patterns into one run of grep as the platform's command-line length limit allows (32000 on Cygwin / Windows, 131072 on Linux).

TL;DR: the same amount of pattern matching now takes 10-12 seconds, which is about 25 times faster! 😎

(N.B.: I know that dividing the limit by the length of the longest pattern line does not use all of the available length, but it is safer this way, and the extra speed gain was not worth a more complex solution for me.)
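
A minimal sketch of the batching idea described above might look like the following. It is an illustration under assumptions, not necessarily the original script: the function name `batchgrep`, its arguments, and the default budget of 32000 characters are hypothetical, and the patterns are assumed to be extended regexps (one per line) that can safely be joined with `|`.

```shell
# batchgrep PATTERN_FILE TARGET_FILE [MAX_LEN]
# Hypothetical helper: searches TARGET_FILE for every regexp listed in
# PATTERN_FILE, merging the patterns into as few grep runs as possible.
batchgrep() {
    local patterns_file="$1"
    local target="$2"
    local max_len="${3:-32000}"   # assumed command-line budget in characters
    local longest per_batch line
    local joined="" count=0 status=1

    # Length of the longest pattern line, plus 1 for the '|' separator.
    longest=$(awk '{ if (length($0) > m) m = length($0) } END { print m + 1 }' "$patterns_file")

    # Conservative batch size: the budget divided by the longest pattern.
    per_batch=$(( max_len / longest ))
    [ "$per_batch" -ge 1 ] || per_batch=1

    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue            # an empty pattern would match every line
        joined="${joined:+$joined|}$line"     # append with '|' between patterns
        count=$(( count + 1 ))
        if [ "$count" -ge "$per_batch" ]; then
            grep -E -e "$joined" -- "$target" && status=0
            joined=""
            count=0
        fi
    done < "$patterns_file"

    # Flush the last, possibly partial, batch.
    if [ "$count" -gt 0 ]; then
        grep -E -e "$joined" -- "$target" && status=0
    fi

    return "$status"   # 0 if any batch matched, 1 otherwise (like grep)
}
```

A hypothetical call would be `batchgrep patterns.txt big.log 131072`. Two trade-offs of this sketch: a line that matches patterns from different batches is printed once per batch, and the output no longer says which pattern matched.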
