How Advanced Is That Course? Building an AI Model for Learning Resource Levels

May 25, 2021

One of the coolest features that we are building as part of LearnerShape’s AI-powered learning infrastructure is the ability to recognize the level of a learning resource automatically. In deployment, it appears as a simple flag next to a recommended course.

This simple interface conceals a lot going on under the hood to provide the “For beginners” recommendation, from both conceptual and machine learning perspectives. We are not aware of any other provider offering this automatic capability.

Learning Resource Recommendation

Learning resource recommendation is one of the main features of LearnerShape. There are multiple things to consider when choosing a learning resource for a new skill. The resource has to be relevant to the skill, but there is rarely only one relevant course. Other factors that learners typically must consider include the quality of the learning provider and of the specific resource, the level of the material and any associated prerequisites, time requirements, formats, and more. Our infrastructure includes features to address all of these factors, which we are steadily improving. Recently, we have given a lot of focus to learning levels.

Levels and Learning Resources

Learning a new skill takes time and is not a binary state of either knowing or not knowing a skill. We may start with zero knowledge, perhaps even lacking the context for a skill, and progressively advance until we have a thorough familiarity. The LearnerShape infrastructure currently uses five levels, running from beginner to expert (although we will support other level frameworks).

For learning resource recommendations it is important to match the level of the learner with the intended audience of a learning resource. If a learner is still a beginner then an advanced course on a narrow area of the skill is likely to confuse rather than educate. Conversely, showing a course developed as a first introduction to a skill to someone already advanced or expert will communicate little or no new information.

Using Data from Course Providers

When we began studying resource levels, we started with information supplied by course providers. Our learning resource catalog includes a variety of different sources. Some of these sources, such as Coursera, edX, and Udemy, provide level information for many of their courses. Initially, these platform-supplied levels seemed to be an excellent resource: where available we could use them directly, and they could serve as training data for sources that do not provide course levels.

This first approach quickly hit a roadblock when we realized that platforms assess levels in significantly different ways. When we compared the platform-supplied levels against a manually curated dataset, the disagreement was substantial.

The reason for these discrepancies soon became evident, and relates to how levels are used. LearnerShape’s goal is to generate a sensible pathway of learning resources for specific skills, while the primary goal of platform-supplied levels is to enable comparison of courses on the platform. Put differently, the inherent complexity of a skill being taught is independent of the ‘level’ at which a course teaches the skill. To illustrate, an ‘advanced’ course on spreadsheet basics may still require less background than a ‘beginner’ course on deep learning, because each platform labels its courses relative to other courses on the same topic.

As a result, while platform-supplied levels provide useful information for sequencing skills, we needed to develop our own model to predict levels associated with teaching specific skills.

Building an AI Model for Learning Resource Levels

To build our own AI model, we started with the academic literature on automated recognition of text and resource levels. This search spanned many decades of research. We first looked at readability indexes: developed mainly between the 1950s and 1970s, these simple formulas estimate the age at which a reader could be expected to understand a piece of text on a first reading. We next looked at studies using classical machine learning techniques, and then at more modern NLP features and deep learning approaches. Finally, we reviewed options for directly sequencing a collection of courses. While this previous work is highly interesting and helpful, no one seems to have cracked the problem of automatically recognizing learning resource levels. So we decided to dive in.
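
To make the readability-index idea concrete, here is a minimal sketch of one classic measure from that era, the Flesch-Kincaid grade level. The syllable counter is a deliberately rough heuristic for illustration, not the tooling we use.

```python
import re

def count_syllables(word: str) -> int:
    """Very rough syllable estimate: count runs of consecutive vowels (heuristic only)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: higher values suggest harder-to-read text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

print(round(flesch_kincaid_grade(
    "Neural networks learn hierarchical representations directly from data."), 1))
```

Measures like this gauge the surface reading difficulty of the text itself rather than how advanced the material is for a particular skill, which is one reason richer approaches are needed.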

As a first step in level evaluation, we have focused on identifying courses suitable for a beginner. This required us to augment our training dataset of labelled learning resource and skill pairs. The dataset now includes information on whether each learning resource is introductory or not, following the approach to levels described above.
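
To give a sense of the shape of such a dataset (the column names and titles here are hypothetical, not our internal schema), each example pairs a learning resource with a skill and a binary introductory label:

```python
import pandas as pd

# Hypothetical labelled examples: each (resource, skill) pair is marked as
# introductory (1) or not (0). Titles and column names are illustrative only.
labelled = pd.DataFrame([
    {"resource_title": "Python for Absolute Beginners",
     "skill": "Python", "is_introductory": 1},
    {"resource_title": "Advanced Concurrency Patterns in Python",
     "skill": "Python", "is_introductory": 0},
    {"resource_title": "Getting Started with Data Visualisation",
     "skill": "Data visualisation", "is_introductory": 1},
])

print(labelled.groupby("is_introductory").size())
```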

Our model development then progressed from relatively simple models focusing on specific text features (inspired in part by the literature described above) to leveraging the advances we have made on relevance by applying attention-based natural language processing models.
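
As a simplified sketch of the simpler end of that spectrum (not our production pipeline, and with toy data), a bag-of-words featuriser and a linear classifier can already predict whether a course description is introductory; later iterations would replace the featuriser with representations from an attention-based (transformer) model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: course descriptions labelled introductory (1) or not (0).
descriptions = [
    "A gentle introduction to Python with no prior programming experience needed.",
    "Deep dive into asyncio internals and lock-free concurrency patterns.",
    "Start visualising data from scratch with simple charts and tables.",
    "Research-level seminar on attention mechanisms and transformer variants.",
]
is_introductory = [1, 0, 1, 0]

# Simple text features plus a linear model; a transformer-based encoder would
# slot in where the vectoriser sits in this pipeline.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(descriptions, is_introductory)

print(model.predict(["An introductory overview of core machine learning concepts."]))
```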

With these models we are now able to report level information and include it when making recommendations. Beginner level courses can be prioritised and are now highlighted by a flag on our platform. We will continue to develop our approach, with the aim of supporting more granular determination of levels over time.
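
In recommendation terms, the level signal can then act as a simple flagging and re-ranking step. A minimal sketch, with hypothetical field names, might look like this:

```python
def rank_for_beginner(resources, threshold=0.5):
    """Flag likely beginner courses and list them first, then sort by relevance.

    Each resource is a dict with (hypothetical) keys:
    'title', 'relevance', and 'beginner_probability' (the level model's output).
    """
    for r in resources:
        r["beginner_flag"] = r["beginner_probability"] >= threshold
    return sorted(resources, key=lambda r: (not r["beginner_flag"], -r["relevance"]))

courses = [
    {"title": "Applied Transformers", "relevance": 0.92, "beginner_probability": 0.10},
    {"title": "Machine Learning 101", "relevance": 0.85, "beginner_probability": 0.95},
]
for course in rank_for_beginner(courses):
    print(course["title"], "(For beginners)" if course["beginner_flag"] else "")
```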

These learning resource level functions are already available in our AI-powered learning infrastructure, along with many other capabilities. If this type of nuanced recommendation is appealing, please contact us to discuss your organization's educational and learning needs.

Dr Jonathan Street, Head of Data Science, LearnerShape
Maury Shenk, Co-Founder & CEO, LearnerShape