Posted on Jul 10 • Edited on Jul 11

What is POS tagging in NLP with Python using spaCy

How does an AI know that ‘run’ is a verb and ‘quick’ is an adjective? That’s the magic of Part Of Speech Tagging – teaching machines grammar!

"A woman without her man is nothing."
"A woman, without her, man is nothing."

One comma changes the whole meaning of the sentence. The same goes for an AI model: if it doesn't understand these basics, it can misinterpret a sentence or a whole text. Part-of-speech (POS) tagging, one of the core NLP tasks, helps the model interpret text accurately.

What is Part-of-Speech Tagging?

POS tagging is an NLP task in which each word in a text is assigned a grammatical tag (noun, verb, adjective, etc.). This helps a computer understand the syntactic structure of a sentence and the role of each word, which is crucial for many other NLP tasks.

Many words can have multiple meanings depending on their context. For example:

"Book a flight"
* Book -> verb (an action)
"Read the book"
* Book -> noun (an object)

Without POS tagging, an NLP system might treat both occurrences of "book" the same way and get confused. POS tagging helps resolve these ambiguities.
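To make the ambiguity concrete, here is a toy sketch of how context can decide the tag for "book". The `tag_book` function and its single hand-written rule are hypothetical and only for illustration. spaCy does not work this way (it uses a trained statistical model), but the idea that neighbouring words drive the decision is the same.

```python
# Toy illustration only: a hand-written rule, NOT how spaCy tags words.
DETERMINERS = {"a", "an", "the", "this", "that", "my", "your"}

def tag_book(sentence):
    """Return a coarse tag for each occurrence of 'book', based on the previous word."""
    words = sentence.lower().split()
    tags = []
    for i, word in enumerate(words):
        if word != "book":
            continue
        previous = words[i - 1] if i > 0 else None
        # After a determiner ("the book") it names a thing; otherwise assume an action.
        tags.append("NOUN" if previous in DETERMINERS else "VERB")
    return tags

print(tag_book("Book a flight"))   # ['VERB']
print(tag_book("Read the book"))   # ['NOUN']
```

A rule like this breaks quickly on real text, which is why a trained tagger is used in practice.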

Code

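Import the necessary library and initialise the text

 import spacy

 # Load the small English pipeline (install it once with: python -m spacy download en_core_web_sm)
 nlp = spacy.load('en_core_web_sm')

 text = "Steve Jobs was a founder of Apple. He created the company on April 1, 1976. The company is now headquartered in Cupertino, California, United States."
 d = nlp(text)  # run the pipeline; d is a Doc made of tokens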

Part of Speech

 print(d[0].text, d[0].pos_, d[0].tag_)

Output

 Steve PROPN NNP

d[0].text is the first token of the text, Steve. d[0].pos_ is its coarse-grained part-of-speech category, PROPN (proper noun). d[0].tag_ is its fine-grained tag, NNP (proper noun, singular).
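The relationship between the two attributes, where `pos_` is the broad category and `tag_` the more detailed one, can be sketched with a plain dictionary. The mapping below is a small hand-copied subset based on the tags that appear in this post, not spaCy's internal table, and the helper name `coarse_tag` is hypothetical. It shows that several detailed tags can collapse into one broad category.

```python
# Hand-copied subset for illustration; not spaCy's internal tag map.
FINE_TO_COARSE = {
    "NNP": "PROPN",  # proper noun, singular
    "NN": "NOUN",    # noun, singular or mass
    "VB": "VERB",    # verb, base form
    "VBP": "VERB",   # verb, non-3rd person singular present
    "PRP": "PRON",   # personal pronoun
    "TO": "PART",    # infinitival "to"
}

def coarse_tag(fine_tag):
    """Look up the broad category for a detailed tag, or 'X' (other) if unknown."""
    return FINE_TO_COARSE.get(fine_tag, "X")

# Two different detailed tags share the same broad category.
print(coarse_tag("VB"), coarse_tag("VBP"))  # VERB VERB
print(coarse_tag("NNP"))                    # PROPN
```

Treat this table as a reading aid, not as a description of how spaCy computes `pos_`.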

Print the tags for every word

 text = "I like to play cricket"
 d = nlp(text)
 # Print each token with its coarse tag, fine-grained tag, and the tag's description
 for token in d:
     print(f"{token.text:{15}}{token.pos_:{15}}{token.tag_:{15}}{spacy.explain(token.tag_)}")

Output

 I              PRON           PRP            pronoun, personal
 like           VERB           VBP            verb, non-3rd person singular present
 to             PART           TO             infinitival "to"
 play           VERB           VB             verb, base form
 cricket        NOUN           NN             noun, singular or mass

token iterates over each token (word) in d.
token.text:{15} prints the word itself, padded with spaces to a width of 15 characters.
token.pos_:{15} prints the coarse-grained part-of-speech category, also padded to 15 characters.
token.tag_:{15} prints the fine-grained tag, padded the same way.
spacy.explain(token.tag_) returns a short human-readable description of the tag.
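The effect of the `:{15}` format spec can be checked on its own, without spaCy. This small sketch uses plain example strings and shows that the spec pads a value to a minimum width of 15 characters rather than appending 15 spaces.

```python
word = "cricket"

padded = f"{word:{15}}"     # nested braces: the width is itself an expression
same = f"{word:15}"         # equivalent shorthand with a literal width

print(repr(padded))         # 'cricket        '
print(len(padded))          # 15
print(padded == same)       # True

# A value longer than the width is not truncated.
long_word = "internationalization"
print(len(f"{long_word:15}"))  # 20
```

Because the width is only a minimum, a very long token would push the later columns to the right. A larger width avoids that if your text contains long words.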