Congratulations

Congratulations! You’ve just finished this workshop.

You should now be able to:

Describe how NER identifies possible entities within a text corpus
Identify potential names, places and organizations using NER tools
Explain why different NER tools may produce different results

Additional Resources

To learn more about any particular topic, take a look at the links below.

Developing your NER expertise

The current workshop is intended to provide an accessible introduction to the concepts and practices of named entity recognition. Once you are comfortable with the basics, William Mattingly’s Introduction to Named Entity Recognition is a great next step in reinforcing and building on what you have learned. Mattingly’s lesson explores how to train your own model in SpaCy and takes you through an actual use case.

SpaCy

SpaCy’s documentation is extensive because the library is designed for use by application developers. A few resources you may find particularly helpful:

Advanced NLP with SpaCy course, which goes over more complex tasks in SpaCy like training a language model
Linguistic Features Guide, if you would like to explore other natural language processing tasks you can perform with SpaCy, such as part-of-speech tagging
Training Models, if you’ve decided to go for it!

SpaCy’s developers, Explosion AI, also have a YouTube channel with numerous videos on the design and use of SpaCy.

Natural Language Toolkit (NLTK)

NLTK, briefly referenced in “Other NER Tools,” is another natural language processing library for Python that is widely used within the academic Digital Scholarship community. If you are using Spyder through the Anaconda environment, NLTK will already be installed for you.
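If you are not sure whether NLTK is already available in your environment, a quick check like the one below can tell you before you start on the resources that follow. This is a minimal sketch that uses only Python's standard library; the helper name `is_installed` is made up for illustration and is not part of NLTK or SpaCy.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found for import."""
    # find_spec returns None when no importable package with that name exists.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Report on the two libraries discussed in this workshop.
    for name in ("nltk", "spacy"):
        status = "installed" if is_installed(name) else "not found"
        print(f"{name}: {status}")
```

Run it in the Spyder console or from a terminal. If a package is reported as not found, the installation instructions in that library's documentation are the place to go next.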

Read more about NLTK, including installation instructions and numerous examples of use for different NLP tasks
Get an overview of working with NLTK for text preparation and analysis from the Sherman Centre’s very own Jay Brodeur (with Alexandra Provo of NYU Libraries)
Tutorials from the Programming Historian on performing sentiment analysis (Z. Saldaña) and stylometric analysis (F. Laramée) with NLTK
Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit, a free online textbook on NLTK that is actively maintained