NLP in Python with Deep Learning
Jupyter Notebook Python HTML
Latest commit d139cad Mar 31, 2020

Type Name Latest commit message Commit time
Part-08 Web Deployments Bump flask from 0.12.3 to 1.0 in /Part-08 Web Deployments Sep 12, 2019
examples Add vectors, text classification, chat Jul 8, 2019
.gitignore Add Web Deployments Oct 4, 2018
LICENSE Initial commit Mar 11, 2018
Part-01.ipynb Remove Verbose explanations Nov 19, 2018
Part-02-A.ipynb Remove Verbose explanations Nov 19, 2018
Part-02-B.ipynb Remove Verbose explanations Nov 19, 2018
Part-03 NLP with spaCy and Textacy.ipynb Remove Verbose explanations Nov 19, 2018
Part-04 Text Representations.ipynb Remove Verbose explanations Nov 19, 2018
Part-05 Modern Text Classification.ipynb Remove Verbose explanations Nov 19, 2018
Part-06 Deep Learning for NLP.ipynb Remove Verbose explanations Nov 19, 2018
Part-07 Building your own Chatbot in 30 minutes.ipynb Remove Verbose explanations Nov 19, 2018
Part-10 Bonus Content.ipynb Better filenames Jun 14, 2018
Playground.ipynb Better filenames Jun 14, 2018
README.md Update README.md Dec 17, 2018
READTHIS.md Start linguistics rewrite Aug 7, 2018
_config.yml Set theme jekyll-theme-tactile May 25, 2018
environment.yml Update environment.yml and requirements.txt Jul 25, 2018
requirements.txt Bump bleach from 3.1.2 to 3.1.4 Mar 30, 2020
sherlock.txt Create Part-03 and 04 notebooks Apr 1, 2018
tokenization.png Add download dataset in Part-05 Apr 10, 2018
tokenization.svg Add download dataset in Part-05 Apr 10, 2018

README.md

Natural Language Processing Notebooks

Available as a Book: NLP in Python - Quickstart Guide

Written for Practicing Engineers

This work builds on the outstanding existing work on Natural Language Processing, which ranges from classics like Jurafsky's Speech and Language Processing to more recent work such as The Deep Learning Book by Ian Goodfellow et al.

While those are great introductory textbooks for college students, this is intended for practitioners to read quickly, skim, select what is useful, and then move on. The notebooks are divided into 7 logical themes.

Each section builds on ideas and code from previous notebooks, but you can fill in the gaps mentally and jump directly to what interests you.

Chapter 01

Introduction To Text Processing, with Text Classification

  • Perfect for Getting Started! We learn better with code-first approaches

Chapter 02

  • Text Cleaning notebook, code-first approaches with supporting explanation. Covers some simple ideas like:
    • Stop words removal
    • Lemmatization
  • Spell Correction notebook, covering what you need to get started with spell correction, similar-word problems and so on
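
As a rough illustration of two of the ideas listed above, here is a minimal sketch of stop word removal and a one-edit spell corrector in plain Python. It is not taken from the notebooks: the stop word list, the toy corpus and every function name are assumptions made for this example, and the corrector only considers words one edit away. Lemmatization is left out because it needs a dictionary or a library.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop every token that appears in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]

def edits1(word):
    """All strings one edit (delete, swap, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    swaps = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)

def correct(word, counts):
    """Return word if known, else the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.get) if candidates else word

# Toy corpus standing in for a real word-frequency source (hypothetical).
corpus = "the detective examined the letter and the detective smiled"
counts = Counter(tokenize(corpus))

cleaned = remove_stop_words(tokenize("The detective is in the garden"))
fixed = correct("detectve", counts)
```

With this toy corpus, `cleaned` keeps only the content words and `fixed` recovers the misspelled word. A realistic corrector would need a much larger corpus for the frequency counts.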

Chapter 03

Leveraging Linguistics is an important tool in any practitioner's toolkit. Using spaCy and textacy, we look at two interesting challenges and how to tackle them:

  • Redacting names
    • Named Entity Recognition
  • Question and Answer Generation
    • Part of Speech Tagging
    • Dependency Parsing
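
To make the name redaction idea concrete, here is a minimal sketch of the redaction step on its own, assuming entity spans are already available as character offsets. The `redact` function and the hand-written spans are hypothetical, not the notebook's code. With spaCy, similar tuples could be built from `doc.ents` using each entity's `start_char`, `end_char` and `label_` attributes.

```python
def redact(text, entities, labels=("PERSON",), mask="[REDACTED]"):
    """Replace every entity span whose label is in `labels` with `mask`.

    `entities` is an iterable of (start_char, end_char, label) tuples.
    """
    # Work from the end of the string so earlier offsets stay valid.
    for start, end, label in sorted(entities, reverse=True):
        if label in labels:
            text = text[:start] + mask + text[end:]
    return text

text = "Sherlock Holmes met Watson in London."

# Hand-written spans standing in for NER output (an assumption).
# With spaCy these could come from:
#   [(e.start_char, e.end_char, e.label_) for e in nlp(text).ents]
entities = [(0, 15, "PERSON"), (20, 26, "PERSON"), (30, 36, "GPE")]

redacted = redact(text, entities)
```

Replacing spans from right to left avoids recomputing offsets after each substitution, and the `labels` parameter controls which entity types are masked.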

Chapter 04

Text Representations is about converting text to numerical representations aka vectors

  • Covers the popular celebrities: word2vec, fastText and doc2vec, plus document similarity using them
  • Programmer's Guide to gensim
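
As a sketch of how word vectors can feed document similarity, the example below averages word vectors into a document vector and compares documents with cosine similarity. The 3-dimensional vectors are made up for illustration. In practice they would come from a trained model such as word2vec or fastText, and doc2vec learns document vectors directly rather than averaging.

```python
import math

# Toy 3-dimensional word vectors, invented for this example.
WORD_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "pet": [0.7, 0.3, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "market": [0.1, 0.0, 0.8],
}

def document_vector(tokens, vectors):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

pets = document_vector("my cat and dog".split(), WORD_VECTORS)
animals = document_vector("a pet dog".split(), WORD_VECTORS)
finance = document_vector("stock market".split(), WORD_VECTORS)

pets_vs_animals = cosine_similarity(pets, animals)
pets_vs_finance = cosine_similarity(pets, finance)
```

With these toy vectors the two pet-related documents score much closer to each other than either does to the finance document.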

Chapter 05

Modern Methods for Text Classification is simple, exploratory and talks about:

  • Simple Classifiers and How to Optimize Them from scikit-learn
  • How to combine and ensemble them for increased performance
  • Builds intuition for ensembling, so that you can write your own ensembling techniques
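
To give a feel for what writing your own ensembling technique might look like, here is a minimal majority-vote sketch in plain Python. The keyword classifiers are toy stand-ins invented for this example. In the notebook the individual models would be scikit-learn classifiers, and this is not the notebook's own code.

```python
from collections import Counter

def keyword_classifier(keywords, label, default):
    """Build a toy classifier: predict `label` if any keyword appears."""
    def predict(text):
        words = set(text.lower().split())
        return label if words & keywords else default
    return predict

def majority_vote(classifiers, text):
    """Return the label predicted by the most classifiers."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Three deliberately weak, hypothetical sentiment classifiers.
classifiers = [
    keyword_classifier({"great", "love"}, "pos", "neg"),
    keyword_classifier({"good", "great"}, "pos", "neg"),
    keyword_classifier({"awful", "bad"}, "neg", "pos"),
]

prediction = majority_vote(classifiers, "a great film with a bad ending")
```

Here two classifiers vote "pos" and one votes "neg", so the ensemble predicts "pos". Using an odd number of binary classifiers avoids ties.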

Chapter 06

Deep Learning for NLP is less about fancy data modeling, and more about engineering for Deep Learning

  • From scratch code tutorial with Text Classification as an example
  • Using PyTorch and torchtext
  • Write our own data loaders, pre-processing, training loop and other utilities
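
As a hedged sketch of what writing our own data loader and pre-processing can involve, the example below builds a vocabulary, converts text to integer ids, and yields padded batches. It is framework-free on purpose and all names are assumptions for this example. In a PyTorch version each padded batch would typically be wrapped in a tensor before being passed to the model.

```python
def build_vocab(texts, specials=("<pad>", "<unk>")):
    """Map each token to an integer id, reserving ids for special tokens."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(text, vocab):
    """Convert text to a list of ids; unseen tokens map to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in text.lower().split()]

def batches(texts, labels, vocab, batch_size):
    """Yield (padded_ids, labels) pairs, padding to the longest row per batch."""
    pad = vocab["<pad>"]
    for start in range(0, len(texts), batch_size):
        chunk = [encode(t, vocab) for t in texts[start:start + batch_size]]
        width = max(len(ids) for ids in chunk)
        padded = [ids + [pad] * (width - len(ids)) for ids in chunk]
        yield padded, labels[start:start + batch_size]

# Toy labelled data (hypothetical): 1 = positive, 0 = negative.
texts = ["good movie", "really bad movie", "good"]
labels = [1, 0, 1]

vocab = build_vocab(texts)
all_batches = list(batches(texts, labels, vocab, batch_size=2))
```

Padding per batch rather than to a global maximum keeps batches small when sequence lengths vary widely.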

Chapter 07

Building your own Chatbot from scratch in 30 minutes. We use this to explore unsupervised learning and put together several of the ideas we have already seen.

  • A simpler, more direct problem formulation than the complicated chatbot tutorials commonly seen
  • Intents, responses and templates in chatbot parlance
  • Hacking a word-based similarity engine to work with little to no training data
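
The ideas above (intents, response templates and word-based similarity) can be sketched roughly as follows. This is an illustrative assumption, not the notebook's implementation: it swaps in plain Jaccard overlap between word sets for whatever similarity the notebook uses, and the intents, templates and threshold are invented for this example.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical intents: a few example phrasings and one response template each.
INTENTS = {
    "greeting": {
        "examples": ["hello there", "hi"],
        "response": "Hello! How can I help?",
    },
    "hours": {
        "examples": ["when are you open", "opening hours"],
        "response": "We are open {hours}.",
    },
}
FALLBACK = "Sorry, I did not understand that."

def reply(message, threshold=0.2, **slots):
    """Pick the intent whose examples best overlap the message, fill its template."""
    tokens = message.lower().split()
    best_intent, best_score = None, 0.0
    for name, intent in INTENTS.items():
        score = max(jaccard(tokens, ex.split()) for ex in intent["examples"])
        if score > best_score:
            best_intent, best_score = name, score
    if best_intent is None or best_score < threshold:
        return FALLBACK
    return INTENTS[best_intent]["response"].format(**slots)

greeting = reply("hello", hours="9 to 5")
hours = reply("what are your opening hours", hours="9 to 5")
unknown = reply("tell me a joke", hours="9 to 5")
```

Because matching relies only on a handful of example phrasings per intent, no training step is needed. Replacing `jaccard` with a word-vector similarity would let the bot match paraphrases that share no exact words.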