scikit-learn
- Applying scikit-learn TfidfVectorizer on tokenized text (28 Feb 2018)
An example showing how to use the scikit-learn TfidfVectorizer class on text that is already tokenized, i.e., given as a list of tokens.
- Hyperparameter optimization across multiple models in scikit-learn (23 Feb 2018)
This blog post shows how to perform hyperparameter optimization across multiple models in scikit-learn. A helper class tunes several models at once and prints a report with the results and parameter settings.
- Document Classification (01 Apr 2017)
An introduction to the Document Classification task, here in a multi-class and multi-label scenario. Proposed solutions include TF-IDF weighted vectors, an average of word2vec word embeddings, and a single vector representation of the document using doc2vec. Includes code using the Pipeline and GridSearchCV classes from scikit-learn.