
Natural Language Processing for Beginners: 5 Cool Projects for Students

Updated: Apr 11, 2022

The best way to learn a new concept is to get hands-on experience building projects with it. We follow the 4Cs approach to learning AI (Concepts, Context, Capability, Creativity) and describe how to introduce NLP to kids at different grade levels. We provide five sample projects that kids can build and interact with, along with the grades each project is suitable for. All the projects can be built entirely with online tools; all students need is an internet connection and a web browser.


Figure: Types of Natural Language Processing (NLP)

Concepts: What is natural language processing (NLP)?


Natural Language Processing (NLP) refers to techniques focused on helping a computer understand human language. Language can take the form of text, an audio signal, and so on. The goal is to automate tasks such as transcribing voicemails, answering questions (as digital assistants do), and translating between languages.


What is the difference between NLP and AI?


Artificial Intelligence and Machine Learning refer to a large group of technologies that help computers recognize patterns from data and automate tasks that typically need human intelligence. You can find more information about this topic here. Natural language processing (NLP) is a subset of the techniques in AI that focus on human languages and linguistics. NLP is a very broad and deep field with a wide range of techniques within the machine learning literature.


Is NLP easy to learn and how to start?


How hard or easy NLP is to learn depends on your background and what you are trying to accomplish. If you are completely new to machine learning and have no coding background, an easy way to start is by building projects that use existing implementations of NLP algorithms. We share five example projects you can use to get started and to understand how an AI processes natural language.



Figure: Stages of Natural Language Processing (NLP)

What are the components of natural language processing?


Building an NLP-based AI consists of several steps. The AI model is first trained on data. After training, the model can be used in the real world to make predictions. In the case of NLP, the data is either text or an audio signal, and it needs to be converted to numbers before the AI can process it. The accompanying figure describes the high-level steps involved in building an NLP application.
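To make the "converted to numbers" step concrete, here is a minimal sketch of one common approach, called bag of words, where each sentence becomes a list of word counts. The function names and the sample sentences are our own illustration, not part of any tool mentioned in this article, and real NLP systems often use more sophisticated encodings.

```python
from collections import Counter

def tokenize(sentence):
    """Lowercase a sentence and split it into words, dropping punctuation."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in sentence.lower()
    )
    return cleaned.split()

def build_vocabulary(sentences):
    """Give every distinct word a column number, in alphabetical order."""
    words = sorted({word for s in sentences for word in tokenize(s)})
    return {word: index for index, word in enumerate(words)}

def to_vector(sentence, vocabulary):
    """Count how often each vocabulary word appears in the sentence."""
    counts = Counter(tokenize(sentence))
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        if word in vocabulary:  # words never seen before are ignored
            vector[vocabulary[word]] = count
    return vector

# Made-up example sentences.
sentences = ["I am happy", "I am not happy", "I am sad"]
vocabulary = build_vocabulary(sentences)
vectors = [to_vector(s, vocabulary) for s in sentences]

print(vocabulary)  # {'am': 0, 'happy': 1, 'i': 2, 'not': 3, 'sad': 4}
for sentence, vector in zip(sentences, vectors):
    print(vector, sentence)
```

Notice that a word the vocabulary has never seen is simply dropped, which is one reason a model trained on a small dataset struggles with new sentences.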


Context: What are some examples of NLP in real life?


Natural language processing is used in many everyday applications. For example, the spam filter in your email works by processing the text in the subject line and body of each email and classifying it as spam or not spam. We increasingly use language to interact with devices such as phones and digital assistants, and NLP works behind the scenes in all of them. Auto-complete when you type emails or text messages, spelling correction suggestions, and grammar checkers all use NLP techniques as well.
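As a rough illustration of how a spam filter can classify text, the sketch below uses a classic technique called naive Bayes: it counts which words show up in spam and in normal email ("ham"), then picks the label whose words best match a new message. The training messages are made up for this example, and real email providers use far larger datasets and more advanced models.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(examples):
    """Count words per label. `examples` is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the higher naive Bayes log-probability."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_examples = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for word in tokenize(text):
            # Add-one smoothing so an unseen word does not zero out the score.
            seen = word_counts[label][word] + 1
            score += math.log(seen / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training emails (subject line and body joined into one string).
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to monday", "ham"),
    ("see you at lunch", "ham"),
    ("homework is due monday", "ham"),
]
word_counts, label_counts = train(examples)

print(classify("free prize inside", word_counts, label_counts))  # spam
print(classify("see you monday", word_counts, label_counts))     # ham
```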


Capability: What are some NLP projects students can build?

Students in middle and high school can build NLP projects using simple tools available online. Being able to build something cultivates interest and inspires them to learn more. This can be done in several ways depending on their grade level and programming experience. We describe five projects along with some guidance on what grades each one of them might be suitable for. All the projects can be built completely online.


Sentiment Analysis from scratch

In this project, kids build a sentiment analysis AI completely online. Kids in grades 4-12 can easily build one and customize it to their satisfaction. For this activity, we will use a free tool called Navigator, available at https://www.corp.aiclub.world/. The cool thing about this tool is that you can also download the entire code or run it in Colab. To get started, click on Sign Up and select the option that works for you: `Get an account`, or `use as a guest` if you don't want to create an account with AIClub. Then click on the avatar in the top right corner and select `Go to Navigator`, and you are ready to build a language AI.


Steps to build the AI are shown in the video below:


Things to try:

  • Try different sentences and record the ones that you think the AI predicted incorrectly

  • Enter sentences with negation, like `I am not happy`. How does the AI behave?

  • Try trick sentences like "I am so happy I am crying", or "I am crying with joy". Does the AI get confused by the word "crying"?

  • Try neutral sentences like "It is 4pm in the afternoon". Does the AI know that this is neither happy nor sad? Can it give a third answer?

  • Most students will notice that the AI gets some obvious sentences incorrect.

  • Point out that the AI learnt from existing data, and that if students don't like something it predicted, they can re-train it.

Re-train the AI:

  • Click on the re-train button under the monitor tab

  • When the AI predicts an answer you don't like, click on the `This prediction is wrong` button

  • Provide your feedback and continue to try other sentences

  • At any point, hit the re-train button and test whether the AI is better at predicting the sentiment of these sentences

If you are familiar with coding and want to try it in your own Python environment, you can download the code from the train tab in Navigator.
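To see why feedback and re-training change what the AI predicts, here is a toy sketch. It is not the algorithm Navigator uses; it is a deliberately simple word-scoring model of our own, with made-up training sentences, that shows how adding corrected examples and training again shifts the result.

```python
from collections import Counter

# Very common words carry little feeling, so this toy model ignores them.
STOP_WORDS = {"i", "am", "so", "a", "the", "with", "was", "she", "all", "of"}

def words_of(sentence):
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def train(examples):
    """Score each word: +1 per happy sentence it appears in, -1 per sad one."""
    scores = Counter()
    for sentence, label in examples:
        for word in words_of(sentence):
            scores[word] += 1 if label == "happy" else -1
    return scores

def predict(sentence, scores):
    """Add up the word scores. The model can only answer happy or sad."""
    total = sum(scores[word] for word in words_of(sentence))
    return "happy" if total >= 0 else "sad"

# Made-up starting dataset.
training_data = [
    ("I am happy today", "happy"),
    ("I am so glad", "happy"),
    ("I am sad today", "sad"),
    ("I am crying", "sad"),
    ("She was crying all night", "sad"),
]
model = train(training_data)
before = predict("I am crying with joy", model)

# Feedback: sentences the model got wrong, with the label we think is right.
feedback = [("I am crying with joy", "happy"), ("tears of joy", "happy")]
model = train(training_data + feedback)
after = predict("I am crying with joy", model)

print(before, "->", after)  # sad -> happy
```

The first model has only ever seen "crying" in sad sentences, so it is fooled by the trick sentence. After re-training with the feedback, the word "joy" outweighs "crying", while a plain "I am crying" is still predicted as sad.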


Sentiment Analysis using pre-built models

This project is built using Python code and is made available through Colab, a free hosted Jupyter (IPython) notebook service from Google. It is suitable for students in grades 6-12. Even though the project is implemented in Python, students do not need programming experience to interact with it. You will notice that in the previous project, the vocabulary understood by the AI is limited to the sentences in the training dataset. Since sentiment analysis is a popular application, we can use a pre-built model instead. Using a pre-built model skips the training step: you simply download the model and start using it. Here is a link to code in Colab that you can use to interact with such an AI.
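For readers who want to see what "download the pre-built model and start using it" can look like in code, here is a minimal sketch. It assumes the open-source Hugging Face `transformers` package (plus a backend such as PyTorch) is installed, and it uses whatever default sentiment model that library selects; this may not be the model used in the Colab notebook linked above. The helper names `load_pretrained_classifier` and `describe` are our own.

```python
def load_pretrained_classifier():
    """Download and return a pre-trained sentiment model.

    Assumption: the `transformers` package is installed. The first call
    downloads the model files, so it needs an internet connection.
    """
    from transformers import pipeline
    return pipeline("sentiment-analysis")

def describe(sentences, classifier):
    """Run the classifier on each sentence and format the results."""
    results = classifier(sentences)  # one {"label": ..., "score": ...} per sentence
    return [
        f"{sentence!r}: {result['label']} ({result['score']:.2f})"
        for sentence, result in zip(sentences, results)
    ]

# Typical use; note there is no training step:
# classifier = load_pretrained_classifier()
# for line in describe(["I am crying with joy", "I am not happy"], classifier):
#     print(line)
```

Try the same trick sentences from the previous project and compare how the two AIs behave.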


Text classification using custom data

This project is suitable for students in grades 6-12. So far, we have built NLP models for one very specific application: detecting emotion in sentences. However, NLP can be used for text classification in many different contexts. Two popular publicly available datasets are (a) Spam or Ham and (b) IMDB reviews. You can download these datasets and use the Navigator tool described in the first project to build the AI. The only difference is that you upload a file instead of using a pre-loaded dataset.
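If students want to build a dataset of their own instead of downloading one, a text classification dataset is often stored as a CSV file with one column for the text and one for the label. The sketch below writes such a file and checks how many examples each label has. The column names `text` and `label`, the file name, and the sample rows are assumptions made for illustration; check what format your tool expects before uploading.

```python
import csv
from collections import Counter

def write_dataset(rows, path):
    """Write (text, label) pairs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text", "label"])  # assumed column names
        writer.writerows(rows)

def label_counts(path):
    """Count how many examples each label has, to check the data is balanced."""
    with open(path, newline="", encoding="utf-8") as handle:
        return Counter(row["label"] for row in csv.DictReader(handle))

# Made-up examples. The csv module adds quotes around text containing commas.
rows = [
    ("Congratulations, you won a free prize", "spam"),
    ("Click now to claim your reward", "spam"),
    ("Are we still meeting for lunch?", "ham"),
    ("Your homework is due on Monday", "ham"),
]
write_dataset(rows, "spam_or_ham_sample.csv")
print(label_counts("spam_or_ham_sample.csv"))
```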


Question Answering

This project is built using Python code and is made available through Colab, a free hosted Jupyter (IPython) notebook service from Google. It is suitable for students in grades 6-12. Having a little programming experience will help them interact with the model better and change its parameters. Natural Language Processing can also be used for question answering. We will describe how one can easily build such an application for any custom data. We will use a model called BERT (Bidirectional Encoder Representations from Transformers, introduced in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"), which uses a type of architecture called the transformer. This is again a pre-trained model that we will download from a public repository. Here is the link to Colab that demonstrates how it can be used for any custom data and questions.
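As a rough idea of what question answering with a pre-trained model looks like in code, here is a minimal sketch. It assumes the Hugging Face `transformers` package is installed and uses the library's default question answering model, which may differ from the BERT model in the Colab notebook. The helper names and the sample passage are our own.

```python
def load_question_answerer():
    """Download and return a pre-trained question answering model.

    Assumption: the `transformers` package is installed. The first call
    downloads the model files, so it needs an internet connection.
    """
    from transformers import pipeline
    return pipeline("question-answering")

def ask(question, context, answerer):
    """Find the answer to `question` inside the passage `context`."""
    result = answerer(question=question, context=context)
    # The model returns the answer text and a confidence score between 0 and 1.
    return result["answer"], result["score"]

# Typical use with your own passage and question:
# answerer = load_question_answerer()
# passage = ("Natural Language Processing refers to techniques that help "
#            "a computer understand human language.")
# print(ask("What does NLP help a computer understand?", passage, answerer))
```

The model does not make up an answer; it picks out the span of the passage it thinks answers the question, so the answer has to be present in the text you provide.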


Text Summarization

This project is built using Python code and is made available through Colab, a free hosted Jupyter (IPython) notebook service from Google. It is suitable for students in grades 6-12. Having a little programming experience will help them interact with the model better and change its parameters. Just like the question answering project, we will use a pre-built model based on BERT. However, the model itself is very different, and so is the way it is used. Here is the link to Colab that demonstrates how it can be used for any custom data.
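To build intuition for what a summarizer does, here is a much simpler technique than the BERT-based model in the notebook: a frequency-based extractive summarizer. It scores each sentence by how many frequently used words it contains and keeps the top-scoring ones. This is our own toy sketch with a made-up sample text, and it is not how BERT works internally.

```python
import re
from collections import Counter

# A tiny, hand-picked list of common words to ignore when scoring.
STOP_WORDS = {"a", "an", "the", "of", "is", "was", "to", "and", "in", "it"}

def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(text):
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def summarize(text, max_sentences=1):
    """Keep the sentences whose words occur most often in the whole text."""
    sentences = split_sentences(text)
    frequency = Counter(content_words(text))

    def score(sentence):
        return sum(frequency[word] for word in content_words(sentence))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore the original order
    return " ".join(sentences[i] for i in chosen)

text = ("Spam filters read the text of an email. "
        "Spam filters decide whether the email is spam. "
        "The weather was nice.")
print(summarize(text))  # Spam filters decide whether the email is spam.
```

One weakness of this simple scoring is that it tends to favor longer sentences. Students can compare its output with the pre-built model on the same text and discuss the differences.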


Creativity: What can you do?

The projects described above demonstrate the kinds of techniques and tools available for building NLP applications. Students can use their creativity to build custom projects using their own data. Some example projects built by students are listed below.


Saathee

Meenakshi, a middle school student, built an app called Saathee: Your New Companion. She built several NLP-based AIs to power her chatbot. The app aims to address loneliness in the elderly population. She was inspired to build it after observing how the pandemic was affecting her grandparents. She collected the data by crowdsourcing with Google Forms and used it to train her AI. You can find more information about this project here.


Factopionator

A high schooler built an app that takes the URL of a news article as input and reports whether the article contains facts or opinions. All the data needed to train this AI was collected manually by the student. You can try her app out here.


3 Subject Detection

Shriya, a middle schooler, built an app to predict which subject a given paragraph belongs to. She used a publicly available dataset on Kaggle to train her AI. You can try her AI here.

