OCR Error Correction with Python
We have discussed how you can export your pre-processing steps from OpenRefine as JSON, which serves as a program of sorts. If you are already familiar with programming concepts, or would like to learn, you may find it easier to perform your pre-processing tasks in Python. Python also has a number of natural language processing (NLP) libraries that can be used for exploratory data analysis.
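To give a sense of what pre-processing in Python might look like, here is a minimal sketch that uses only the standard library. It is an illustration rather than a method taken from this workshop: the function name clean_ocr_text, the CORRECTIONS table and the sample sentence are all hypothetical, and the misreadings listed are assumed examples that you would replace with the errors you actually find in your own text.

```python
import re

# Hypothetical correction table: OCR misreadings mapped to the intended words.
# Replace these entries with errors observed in your own corpus.
CORRECTIONS = {
    "tbe": "the",
    "aud": "and",
    "wbich": "which",
}


def clean_ocr_text(text, corrections=CORRECTIONS):
    """Apply a few common OCR clean-up steps to a string."""
    # Rejoin words split by a hyphen at a line break, e.g. "book-\nseller".
    # Caution: this also joins genuine hyphenated compounds that happen
    # to break across lines.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)

    # Replace the long s found in early printed books with a regular s.
    text = text.replace("ſ", "s")

    def fix_word(match):
        word = match.group(0)
        fixed = corrections.get(word.lower())
        if fixed is None:
            return word  # not a known misreading, leave it alone
        # Keep an initial capital if the original word had one.
        return fixed.capitalize() if word[0].isupper() else fixed

    # Look up every alphabetic word in the correction table.
    text = re.sub(r"[A-Za-z]+", fix_word, text)

    # Collapse runs of spaces, tabs and line breaks into single spaces.
    return re.sub(r"\s+", " ", text).strip()


sample = "Tbe printer aud tbe book-\nseller met in tbe ſhop."
print(clean_ocr_text(sample))
# prints: The printer and the bookseller met in the shop.
```

Each step is roughly comparable to a single transformation you might apply in OpenRefine, but because the steps live in one function you can rerun them on new files without repeating the work by hand.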
The Sherman Centre’s Jay Brodeur and Alexandra Provo of NYU Libraries have created a Text Preparation and Analysis workshop with a section on programmatic approaches with Python. The workshop also has a tutorial on OpenRefine if you would like more practice and want to build on the skills you have developed in the current workshop.