Week 36 note: Writing a parser is hard
This week I had the honor of replacing libphonenumber-js in our frontend application. It was used to check the validity of phone numbers and to format them in E.164 format.
We wanted to replace the library because it was hurting our bundle size and we could not tree-shake it down to its bare essentials. We also didn't find any good alternative npm libraries, so we opted to write our own instead. The end result was 30KB saved, with the custom implementation taking roughly 2KB.
I relearned that writing a parser is hard, as always. And, having already written one formatter as an exercise, I can say formatters are even harder†.
For example, while implementing my custom parser, I had to ask myself, do I want to support the following use case: Phone number: +3581234567? Some would say, hell yes. Whereas I said: please file a feature request in Jira, satan.
How model-complete do you need the parser to be? Perhaps you are just looking for simple validation in the frontend and your backend will do a more thorough/spec-complete parse of the number (where you can use, say, libphonenumber-js library for a model-complete check). This was the case for us here.
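To make the "simple validation in the frontend" idea concrete, here is a minimal sketch of what such a check might look like. It is not the implementation we shipped and not libphonenumber-js behavior. The name toE164, the set of separators it strips, and the handling of a leading "00" are all assumptions made for illustration.

```typescript
// Hypothetical helper for illustration only; deliberately not spec-complete.
// E.164 shape: a "+" followed by at most 15 digits, the first of which is non-zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function toE164(input: string): string | null {
  // Drop common visual separators: whitespace, dashes, dots, parentheses.
  const stripped = input.trim().replace(/[\s\-.()]/g, "");
  // Assumption: treat a leading "00" as the international prefix "+".
  // This is true in many countries, but not all of them.
  const candidate = stripped.startsWith("00") ? "+" + stripped.slice(2) : stripped;
  // Anything that is not purely "+digits" at this point is rejected, not repaired.
  return E164_PATTERN.test(candidate) ? candidate : null;
}
```

With this sketch, toE164("+358 12 345 67") returns "+3581234567", while input with surrounding text such as "Phone number: +3581234567" returns null. It knows nothing about country-specific lengths or numbering plans, which is exactly the part left to the backend.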
Sometimes you are forced to support non-spec-compliant features. For example, I once had to add support for parsing a query parameter key as an opaque string due to the way that advertisers used a non-percent-encoded URL after the ? for tracking purposes. In the end, the library had to be vendored because the behavior was too non-compliant (and we couldn't find a way to land a change to upstream).
Sometimes an email is just a .+\@.+ regular expression. Or maybe it truly needs to be fully compliant. Or maybe you can just take a non-empty string, attempt to send an email, and show the user a generic message.
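For the loosest of those options, a sketch could be as small as the following. The name looksLikeEmail is made up for illustration, and the anchors and trimming are my additions on top of the bare pattern.

```typescript
// Hypothetical helper: mirrors the "something, an @, then something" idea.
// It says nothing about whether the address is valid or deliverable.
const LOOSE_EMAIL = /^.+@.+$/;

function looksLikeEmail(input: string): boolean {
  // Trim so that stray whitespace around a pasted address is tolerated.
  return LOOSE_EMAIL.test(input.trim());
}
```

This accepts "a@b" and rejects "a@", "@b", and strings with no @ at all. Whether that is enough depends entirely on what happens to the address afterwards.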
All this is to say, writing a parser is tricky business. Most of the time you're better off not writing one unless you absolutely need to.
† Writing a formatter is not an easy undertaking. You should definitely write a formatter at least once in your life to learn how to write parsers, how to deal with tricky formatting requirements in particular (such as softlines), not to mention recursion, loads of recursion, and tricky regular expressions (especially if you are writing a context-free grammar using instaparse).