Getting started with spaCy

spaCy is an open-source Natural Language Processing library for Python. It helps you build applications that process and understand large volumes of text.

  • spaCy is not “an API”
  • spaCy is not a chatbot
  • spaCy is not research software, nor is it a company
  • It is a library that only provides text-processing capabilities.

spaCy is compatible with 64-bit CPython 2.7 / 3.5+ and runs on Unix/Linux, macOS/OS X, and Windows. It can be installed with either pip or conda:

  • pip install spacy
  • conda install -c conda-forge spacy
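
After either command finishes, a quick way to confirm that the installation worked is to import the library and print its version. This is only a minimal check; the version string you see depends on what was installed.

```python
import spacy

# The package exposes its version as a plain string.
installed_version = spacy.__version__
print("spaCy version:", installed_version)
```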

Tokenization: Tokenization is the process of splitting text into smaller units called tokens. Depending on the task, a token can be a word, a sentence, or a paragraph; spaCy's tokenizer splits text into words and punctuation marks.
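
As a minimal sketch of tokenization, the example below uses a blank English pipeline, which contains only the tokenizer and therefore needs no model download. The sample sentence is made up for illustration.

```python
import spacy

# A blank pipeline has language-specific tokenization rules but no
# trained components (no tagger, parser, or entity recognizer).
nlp = spacy.blank("en")

# Calling the pipeline on a string returns a Doc, a sequence of tokens.
doc = nlp("spaCy splits text into tokens.")

# Each token exposes its original text through the .text attribute.
tokens = [token.text for token in doc]
print(tokens)
```

Note that the final period becomes its own token rather than staying attached to the last word.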

As of version 2.0, the library ships with two popular visualizers, displaCy and displaCy ENT. They visualize the dependency parse and the named entities in a text, and they help speed up development and the debugging of your code and training process.

visualizing the dependency parse
visualizing the entity recognizer
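
Both visualizers can be tried from Python through the `displacy` module. The sketch below is illustrative only: to avoid depending on a downloaded model, it passes hand-written annotations with `manual=True`, so the words, arcs, and entity offsets are assumptions standing in for what a trained pipeline would predict.

```python
from spacy import displacy

# Hand-written dependency annotations (illustrative, not model output).
dep_example = {
    "words": [
        {"text": "This", "tag": "DT"},
        {"text": "is", "tag": "VBZ"},
        {"text": "a", "tag": "DT"},
        {"text": "sentence", "tag": "NN"},
    ],
    "arcs": [
        {"start": 0, "end": 1, "label": "nsubj", "dir": "left"},
        {"start": 2, "end": 3, "label": "det", "dir": "left"},
        {"start": 1, "end": 3, "label": "attr", "dir": "right"},
    ],
}

# Hand-written entity annotations; start/end are character offsets.
ent_example = {
    "text": "But Google is starting from behind.",
    "ents": [{"start": 4, "end": 10, "label": "ORG"}],
    "title": None,
}

# jupyter=False makes render() return the markup as a string
# instead of trying to display it in a notebook.
dep_svg = displacy.render(dep_example, style="dep", manual=True, jupyter=False)
ent_html = displacy.render(ent_example, style="ent", manual=True, jupyter=False)

print(dep_svg[:60])
print(ent_html[:60])
```

With a trained model loaded, you would normally pass the processed `Doc` directly and drop `manual=True`; the returned strings can be saved to a file and opened in a browser.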


This article is inspired by Ines Montani, keynote speaker at PyCon India 2019. Thank you for reading. Please give spaCy a try, have fun, and let me know your feedback!

Software Developer | Technical Writer