datatoinfinity

spaCy Library for NLP

Think of spaCy as a smarter alternative to NLTK: it ships pre-trained pipelines that handle tokenization, tagging, and parsing in a single call. Let's start by installing it.

You can use Google Colab to avoid the hassle of installing it locally.

Install it from your code editor's terminal:

 pip install spacy

Then import it in your Python code:

 import spacy

spacy.load('en_core_web_lg') loads a large pre-trained English pipeline and makes it available for natural language processing tasks. This model, en_core_web_lg, provides tokenization, part-of-speech tagging, dependency parsing, and named entity recognition. Download it once from the terminal, then load it in Python:

 python -m spacy download en_core_web_lg

 import spacy
 nlp = spacy.load('en_core_web_lg')
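
If you are not sure whether the model has been downloaded yet, a small guard can help. The sketch below is an illustration, not part of the original walkthrough: `load_english` is a hypothetical helper name, and falling back to `spacy.blank("en")` (an English pipeline with a tokenizer but no tagger or parser) is an assumption about what you would want when the model is missing.

```python
import spacy
from spacy.util import is_package


def load_english(name="en_core_web_lg"):
    """Load the named pipeline if it is installed, else a blank English one.

    Hypothetical helper: the blank fallback can tokenize text but has
    no tagger, parser, or named entity recognizer.
    """
    if is_package(name):
        return spacy.load(name)
    return spacy.blank("en")


nlp = load_english()
# Lists the pipeline components; the list is empty for the blank fallback.
print(nlp.pipe_names)
```

With the full model installed, `nlp.pipe_names` should list the components behind the features described above.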

Tokenisation

nltk.tokenize

 import nltk
 from nltk.tokenize import word_tokenize

 # word_tokenize needs NLTK's Punkt tokenizer data to be downloaded first
 txt = "Hello How it going U.S.A."
 print(word_tokenize(txt))

 Output:
 ['Hello', 'How', 'it', 'going', 'U.S.A', '.']

NLTK's word_tokenize splits the final full stop '.' off as a separate token, which breaks up the abbreviation U.S.A.

spaCy tokenizer

 import spacy

 nlp = spacy.load('en_core_web_lg')
 text = nlp("Hello How it going U.S.A.")
 for token in text:
     print(token.text)

 Output:
 Hello
 How
 it
 going
 U.S.A.

spaCy does not split off the full stop here; it keeps U.S.A. as a single token.

Here is a question for you.

 text = nlp("I can't came there")
 for token in text:
     print(token.text)

 Output:
 I
 ca
 n't
 came
 there

Why does spaCy split "can't" into the two tokens "ca" and "n't", and how would you handle it?
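
One way to investigate this, as a hedged sketch: spaCy's English tokenizer ships with special-case rules for contractions, so the verb and the negation each become their own token. `Tokenizer.explain` shows which rule produced each piece, and `Doc.retokenize` can merge the pieces back together if you really need "can't" as one token. The example uses `spacy.blank("en")`, which should apply the same English tokenizer rules without needing the large model; that equivalence is an assumption here.

```python
import spacy

# A blank English pipeline contains only the tokenizer; no model download needed.
nlp = spacy.blank("en")
sentence = "I can't came there"

# Each entry is (rule_name, substring): shows which tokenizer rule fired.
for rule, piece in nlp.tokenizer.explain(sentence):
    print(rule, piece)

doc = nlp(sentence)
# norm_ holds the normalized form spaCy assigns to each token.
print([(token.text, token.norm_) for token in doc])

# Merge "ca" + "n't" (tokens 1 and 2) back into a single token.
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[1:3])

merged = [token.text for token in doc]
print(merged)  # ['I', "can't", 'came', 'there']
```

Keeping the split is usually the better default, since the negation carries meaning of its own; merging makes sense only when a later step expects whole words.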

Part of Speech (POS).

 import spacy

 nlp = spacy.load('en_core_web_lg')
 text = nlp("Hello How it going U.S.A. we are 83 block")
 for token in text:
     print(token.text, token.pos)

 Output:
 Hello 91
 How 98
 it 95
 going 100
 U.S.A. 96
 we 95
 are 87
 83 93
 block 92

token.pos returns spaCy's internal integer ID for each part-of-speech tag, which is not very readable.
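
To see what those integers stand for, you can look them up in the pipeline's string store. This is a small sketch, assuming spaCy v3; it uses a blank English pipeline only because this lookup should not depend on the trained model.

```python
import spacy

nlp = spacy.blank("en")

# IDs taken from the output above; the string store maps them to tag names.
for pos_id in (91, 98, 95, 100, 96, 87, 93, 92):
    print(pos_id, nlp.vocab.strings[pos_id])

# The lookup also works in the other direction (name to ID).
intj_id = nlp.vocab.strings["INTJ"]
print(intj_id)
```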

 import spacy

 nlp = spacy.load('en_core_web_lg')
 text = nlp("Hello How it going U.S.A. we are 83 block")
 for token in text:
     print(token.text, token.pos_)

 Output:
 Hello INTJ
 How SCONJ
 it PRON
 going VERB
 U.S.A. PROPN
 we PRON
 are AUX
 83 NUM
 block NOUN

With token.pos_ (note the trailing underscore) you get the readable tag instead: Hello is an interjection (INTJ), it is a pronoun (PRON), and so on.
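
If a tag abbreviation is unfamiliar, `spacy.explain` returns a short description for it. A quick sketch using the tags from the output above:

```python
import spacy

tags = ["INTJ", "SCONJ", "PRON", "VERB", "PROPN", "AUX", "NUM", "NOUN"]

# spacy.explain looks each label up in spaCy's built-in glossary.
descriptions = {tag: spacy.explain(tag) for tag in tags}
for tag, meaning in descriptions.items():
    print(f"{tag:6} {meaning}")
```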

Sentence Tokenisation

 s = nlp("This is the first sentence. I gave given fullstop please check. Let's study now")
 for sentence in s.sents:
     print(sentence)

 Output:
 This is the first sentence.
 I gave given fullstop please check.
 Let's study now
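
The loop above relies on the trained pipeline (by default, its dependency parser) to find sentence boundaries. If you only need sentence splitting, a lighter option is spaCy's rule-based `sentencizer` component, which splits on sentence-final punctuation. The sketch below assumes spaCy v3, where `add_pipe` takes the component name as a string, and uses a blank pipeline instead of the large model.

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")  # rule-based splitter; no model download needed

doc = nlp("This is the first sentence. I gave given fullstop please check. Let's study now")
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```

Because the sentencizer only looks at punctuation, it is fast but can be less accurate than the parser on text with unusual or missing punctuation.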
