Recent advances in artificial intelligence and machine learning have made natural language processing so powerful that state-of-the-art models have surpassed human performance on existing benchmark datasets.
In the education space, we’ve seen NLP used in several powerful ways: automating translation, helping students improve their writing, and enhancing learning experiences. For example, Google Translate makes educational content accessible to more students around the world. Duolingo uses AI to determine the difficulty of language-learning content. Grammarly helps students write mistake-free prose, and TurnItIn helps teachers detect plagiarism. At Quizlet, we leverage ML and NLP for grading written answers, generating questions, and understanding our content, among other applications.
Having spent the majority of my career applying ML and NLP (or leading teams that do) to solve problems for users and businesses, I recommend keeping the following guidelines in mind when approaching NLP projects.
Know your problem: When starting a machine learning project, it’s easy to get lost in the theory and code. Make sure you understand the problem and your hypotheses well by writing them out and doing exploratory data analysis.
Collect your data: The data you use to train and validate NLP models is crucial to their success, so it’s worth taking this step seriously and thinking through creative solutions. For example, for our Subject Classifier training data, we used existing user-generated content that contained subject names in the titles. (For example, we could infer that content titled “Photosynthesis Chapter 3” was about photosynthesis.) For other problems, we’ve collected training data through human annotation or by asking our users. Some models, like OpenAI’s GPT-3, need only a few data points to learn a task, but these come with trade-offs.
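The title-based labeling idea above can be sketched as a simple weak-labeling function. The keyword-to-subject mapping here is purely illustrative; the actual Subject Classifier's label set and matching logic aren't public.

```python
import re
from typing import Optional

# Hypothetical keyword -> subject mapping (illustrative, not Quizlet's real one).
SUBJECT_KEYWORDS = {
    "photosynthesis": "Biology",
    "mitosis": "Biology",
    "algebra": "Math",
    "derivatives": "Math",
}

def weak_label(title: str) -> Optional[str]:
    """Infer a subject label from a study-set title, if a known keyword appears."""
    lowered = title.lower()
    for keyword, subject in SUBJECT_KEYWORDS.items():
        if re.search(rf"\b{keyword}\b", lowered):
            return subject
    return None  # no keyword match -> leave unlabeled

# Titles without a recognizable keyword simply stay unlabeled.
labeled = [(t, weak_label(t)) for t in [
    "Photosynthesis Chapter 3",
    "Algebra II Midterm Review",
    "Spanish Vocab Week 4",
]]
```

Weak labels like these are noisy, which is one reason to validate a sample by hand before training on them.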
Share example outputs: One of the best ways for others to grasp exactly what you’re working on is to share example results. When we generated advanced questions, the examples clarified to everyone the value this new feature could provide and were crucial to getting the project prioritized on the product roadmap. Looking through results yourself also helps you come up with ideas for improving the algorithm.
Agree on success metrics: In addition to sharing examples, measure and share holistic performance. To estimate the quality of an algorithm, we’ve often labeled a sample of hundreds of outputs. Agree on which metrics matter (e.g., false positives, coverage) and on acceptable thresholds. For example, we built a semantic (“smart”) grader to grade freeform text answers. We decided to aim to maximize coverage of truly correct answers while keeping “False Corrects” under 3%.
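The two metrics above can be computed directly from a hand-labeled sample. The sketch below uses made-up labels; each pair records what the grader said versus what a human reviewer said.

```python
# Each record: (grader_said_correct, human_said_correct) -- illustrative data only.
labels = [
    (True, True), (True, True), (True, False),   # third entry is a "False Correct"
    (False, True),                               # a truly correct answer the grader missed
    (False, False), (True, True), (False, False),
]

true_corrects = sum(1 for _, human in labels if human)
caught = sum(1 for grader, human in labels if grader and human)
false_corrects = sum(1 for grader, human in labels if grader and not human)
graded_correct = sum(1 for grader, _ in labels if grader)

# Share of truly correct answers the grader accepts (maximize this).
coverage = caught / true_corrects
# Share of grader-accepted answers that were actually wrong (keep under threshold).
false_correct_rate = false_corrects / graded_correct
```

Framing the threshold on the false-correct rate (rather than raw counts) keeps the target meaningful as the sample size changes.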
Start simple (if you can): Some problems don’t need a fancy algorithm. For example, our “definition suggestions” are just the most common definitions for a given word, computed with a simple count.
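A count-based approach like that fits in a few lines. This is a minimal sketch with a hypothetical corpus of (term, definition) pairs, not the production implementation.

```python
from collections import Counter

# Hypothetical (term, definition) pairs drawn from user-created content.
pairs = [
    ("mitochondria", "the powerhouse of the cell"),
    ("mitochondria", "the powerhouse of the cell"),
    ("mitochondria", "organelle that produces ATP"),
    ("osmosis", "diffusion of water across a membrane"),
]

def suggest_definition(term, pairs):
    """Return the most common definition seen for a term, or None if unseen."""
    counts = Counter(definition for t, definition in pairs if t == term)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A simple baseline like this is also a useful yardstick: a fancier model is only worth shipping if it measurably beats it.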
Stay vigilant: If you’re generating content, be aware of bias and of offensive or inaccurate output. Cutting-edge NLP models are trained on internet text, i.e., human behavior, which can be problematic. We used OpenAI to generate example sentences for language learning and had to use their content filter (plus our own filter on top of it) to exclude potentially offensive content. It’s also important to have guardrails and opportunities for users to provide feedback.
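One way to layer your own check on top of a provider’s content filter is a simple in-house blocklist pass over generated sentences. The sketch below is an assumption about how such a layer might look; the placeholder terms and the token-level matching are illustrative, not Quizlet’s actual filter.

```python
# Placeholder blocklist -- stand-in terms, not a real moderation list.
BLOCKLIST = {"badword1", "badword2"}

def is_safe(sentence: str) -> bool:
    """Reject a generated sentence if any token matches the blocklist."""
    tokens = {tok.strip(".,!?").lower() for tok in sentence.split()}
    return not (tokens & BLOCKLIST)

generated = ["The cat sat on the mat.", "This contains badword1 here."]
safe_sentences = [s for s in generated if is_safe(s)]
```

A blocklist is deliberately blunt: it misses paraphrases and context-dependent problems, which is why it works as a second layer behind a model-based filter rather than a replacement for one.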
NLP has the power to enhance a user experience and to create features that weren’t previously possible. There are many courses and technical resources to help you learn the technology and tooling, and these steps will help you apply them in real-world settings.