Processing texts

Wednesday, May 8, 2019

In this class, we will learn how to enrich text with linguistic knowledge (POS tags, syntactic structure, …) using NLTK (Natural Language Toolkit), spaCy and Stanford CoreNLP. We will also look at some standard pre-processing operations (lowercasing, punctuation removal) that are frequently used to normalize textual data.

Pre-processing
Tokenization and Sentence splitting
Part-of-speech (POS) tagging
Morphological analysis, Stemming and Lemmatization
Stop word recognition
Named Entity Recognition (NER)
Constituency/dependency parsing
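
As a first taste of the normalization operations mentioned above (lowercasing, punctuation removal), here is a minimal sketch using only the Python standard library. The function names `normalize` and `tokenize` are illustrative choices, not part of NLTK, spaCy or CoreNLP, and the whitespace split is a naive stand-in for the proper tokenizers those libraries provide.

```python
import string


def normalize(text):
    """Lowercase the text and strip ASCII punctuation."""
    lowered = text.lower()
    # With three arguments, str.maketrans maps every character in the
    # third argument to None, so translate() deletes it.
    table = str.maketrans("", "", string.punctuation)
    return lowered.translate(table)


def tokenize(text):
    """Naive tokenization: normalize, then split on whitespace."""
    return normalize(text).split()


if __name__ == "__main__":
    print(tokenize("Hello, World! NLP is fun."))
    # ['hello', 'world', 'nlp', 'is', 'fun']
```

Note that `string.punctuation` only covers ASCII punctuation, so characters such as the ellipsis "…" or typographic quotes are left untouched in this sketch. Whether to remove punctuation at all depends on the task: it is usually harmless for bag-of-words models but can hurt sentence splitting and parsing.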
