Meet Ruud Goorden - data scientist and cycling enthusiast. By day, Ruud applies his expertise in Natural Language Processing (NLP) to analyze customer feedback and improve services at Essent. But when he's not optimizing decision-making processes, he runs a training consultancy for cyclists. With "data-driven cycling training" as his motto, Ruud combines his passion for cycling with a data-driven approach to help fellow cyclists achieve their performance goals.
At Essent, Ruud's expertise in NLP and data analysis extends beyond his work with customer feedback. The company is constantly seeking ways to improve and optimize its decision-making processes, and text data is an important component of this effort. Text data offers a rich source of information that can be used to serve customers better and improve services. But protecting customer privacy while doing so is extremely important. Traditional methods for identifying privacy-sensitive information - typically matching against fixed lists of names, streets, and the like - are often incomplete and quickly out of date. That's why Essent has turned to Transformer models, which offer a more accurate and up-to-date way to identify and remove privacy-sensitive information.
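To make the idea of "removing privacy-sensitive information" concrete, here is a minimal sketch of the final step: once a recognition model has predicted which character spans in a text are sensitive, each span is replaced with a placeholder tag. This is an illustration only, not Essent's actual pipeline; the labels and the example sentence are made up.

```python
def anonymize(text, entities):
    """Replace predicted entity spans with placeholder tags.

    entities: list of (start, end, label) character spans, non-overlapping.
    """
    # Replace from the end of the text backwards, so that earlier
    # character offsets remain valid after each substitution.
    for start, end, label in sorted(entities, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

message = "Jan Jansen woont op Dorpsstraat 12 in Utrecht."
spans = [(0, 10, "NAME"), (20, 31, "STREET"), (32, 34, "HOUSENR"), (38, 45, "PLACE")]
print(anonymize(message, spans))
# → [NAME] woont op [STREET] [HOUSENR] in [PLACE].
```

The hard part, of course, is producing the spans in the first place - that is where the Transformer models described below come in.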
Entities in project
How Transformer Models are Revolutionizing Textual Data Analysis
Transformer models - the buzzword in the world of natural language processing (NLP) - are large neural networks trained on vast text corpora. They have taken the NLP world by storm because they can be fine-tuned for specific tasks or domains, making them highly versatile. These models have a unique ability to comprehend the context of a sentence, paragraph, or document, allowing them to handle grammatical errors and understand the meaning of the text better than traditional techniques. In fact, Transformer models are the building blocks of Large Language Models (LLMs), a new generation of language models like ChatGPT.
But how well do these Transformer models perform? To find out, we conducted a Proof of Concept (POC). The results left us in awe: the Transformer models outperformed the traditional list method by a significant margin. Why? Because they don't just match individual words - they understand the context in which those words are used. This makes them incredibly powerful tools for any NLP task.
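A toy example makes the limitation of the list method clear. The sketch below is purely illustrative - the name list and the sentences are invented - but it shows why a matcher that ignores context cannot cope with misspellings or unseen names, while a context-aware model can.

```python
# A naive list-based matcher, standing in for the "traditional list method".
KNOWN_NAMES = {"jan", "kees"}

def list_method(tokens):
    # Flags a token as a person name only if it appears verbatim in the list,
    # ignoring all surrounding context.
    return ["PER" if t.lower() in KNOWN_NAMES else "O" for t in tokens]

# A name that happens to be on the list is caught...
print(list_method("Jan heeft gebeld".split()))   # → ['PER', 'O', 'O']
# ...but a misspelled or unlisted name slips through entirely, because the
# list cannot use the sentence context ("... heeft gebeld") as evidence.
print(list_method("Jann heeft gebeld".split()))  # → ['O', 'O', 'O']
```

A Transformer-based recognizer, by contrast, learns from context that the token before "heeft gebeld" is likely a person, whatever its spelling.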
Excited to share our findings, we presented our research at the highly anticipated BrabanDerS meetup. In front of the Data Science and Analytics community, we delved into the limitless possibilities of entity recognition and Transformer models. The response was overwhelmingly positive, as attendees were blown away by the potential of these technologies. Our research sparked a lively discussion, leaving everyone buzzing with ideas and inspiration.
Percentage of correctly predicted entities per type.
STR = street, PL = place, PER = person name, PC = postal code, O = no entity, HNR = house number, DAT = date, BD = date of birth
The table above shows, for example, that we correctly predicted 95.7% of the person names.
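For the curious, a percentage like the one above can be computed per entity type by comparing the predicted label of each token with its true label. The sketch below shows one simple way to do this; the toy data is made up and is not the POC result.

```python
from collections import Counter

def per_label_accuracy(true_labels, pred_labels):
    """Fraction of tokens of each true label that were predicted correctly."""
    totals, correct = Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        totals[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / totals[label] for label in totals}

# Invented toy data: one of three PER tokens is missed.
true = ["PER", "PER", "O", "STR", "PER", "O"]
pred = ["PER", "O",   "O", "STR", "PER", "O"]
print(per_label_accuracy(true, pred))
```

In practice, entity-level scores (counting whole entities rather than single tokens) give a stricter picture, but the token-level version above conveys the idea.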
While entity recognition is a powerful tool in the field of Natural Language Processing (NLP), it's just the tip of the iceberg when it comes to unlocking the full potential of textual data - we're only scratching the surface of what's possible.
That's why we're happy to announce that we'll be hosting the next BrabanDerS meetup at our headquarters in Den Bosch on Thursday, June 1st. The event promises to be a fantastic opportunity for Data Science and Analytics enthusiasts to come together and share their knowledge and expertise. We'll be giving two presentations on our Data Science work, and two other companies will be sharing their insights too. The event is designed to be accessible to everyone, regardless of their level of knowledge in the field. This is the perfect chance to showcase Essent as a leader in Data Science & Analytics and put us on the map as an employer of choice. Don't miss out - register now to secure your spot!