Date: 2021-08

Understand WordPiece, the subword-based tokenization algorithm used by state-of-the-art NLP models.

Over the past few years, there has been a lot of buzz in the field of AI, and especially in NLP. 😎 Understanding and analyzing human language is not only challenging but fascinating as well. Human language looks simple but is very complicated: even a short text can contain references to both personal life and the outside world. 🧐 This complexity brings a lot of challenges. Researchers around the world are working to overcome them and to build smarter real-world applications. 👩‍💻