I study and work on NLP, and have recently been most interested in speech processing.
Contact
@xujinghua on Kaggle
View my LinkedIn Profile
Tweet
A class to represent a tweet is included.
Bayesian Regression SoundInventoryPopulation
A series of Bayesian regression analyses, run with the rethinking package, to test whether an association between two variables (sound inventory size vs. population size) is genuine.
Survival Analysis ld.chin
A survival analysis of Mandarin Chinese lexical decision data with R.
Survival Analysis blp
A survival analysis (with R) of time-to-event data from the British Lexicon Project.
Tweets Clustering
Dimensionality reduction with PCA. Clustering tweets with k-means.
Segmentation as Sequence labeling
Segmentation as sequence labeling with a gated RNN.
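To illustrate what "segmentation as sequence labeling" means in practice, here is a minimal sketch of how a segmented sentence can be turned into one label per character and back again. The B/I tag scheme and the helper names `words_to_labels` and `labels_to_words` are assumptions for illustration, not necessarily what the project uses. A tagger such as a gated RNN would then be trained to predict the labels from the characters.

```python
from typing import List, Tuple


def words_to_labels(words: List[str]) -> Tuple[List[str], List[str]]:
    """Flatten segmented words into characters with B/I labels.

    "B" marks the first character of a word, "I" any later character.
    (Hypothetical tag scheme, chosen for illustration.)
    """
    chars: List[str] = []
    labels: List[str] = []
    for word in words:
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append("B" if i == 0 else "I")
    return chars, labels


def labels_to_words(chars: List[str], labels: List[str]) -> List[str]:
    """Rebuild the segmentation from characters and their B/I labels."""
    words: List[str] = []
    for ch, label in zip(chars, labels):
        # Start a new word on "B", or if the sequence starts with a stray "I".
        if label == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


chars, labels = words_to_labels(["the", "cat", "sat"])
restored = labels_to_words(chars, labels)
```

Framing the task this way means a predicted label sequence always maps back to a valid segmentation, so any per-token tagger can be used as a segmenter.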
POS Tagging as Sequence labeling
POS tagging as sequence labeling with a gated RNN.
NER on Reddit posts/comments with spaCy
NER experiments: recognizing DRUG entities in Reddit drug posts/comments.
Speaker Verification with NeMo
A speaker verification experiment with NeMo.
NLP without a ready-made annotated dataset: experiments with spaCy and Snorkel
NER with spaCy, information extraction with Snorkel.
English and German NER models trained on CoNLL-2003 data with spaCy v3
I trained a series of language-dependent spaCy v3.0 NER models (English and German) with different configurations to achieve the best possible F-score. The best English NER model (the benchmark model) reached an F-score of 89.22 and the best German NER model reached 83.29, each evaluated on its respective testb data.
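For readers unfamiliar with how an NER F-score is computed, the sketch below shows entity-level precision, recall and F1 under the exact-match convention usual for CoNLL-2003: a predicted entity counts only if its span and its type both match the gold annotation. The `entity_prf` helper and the toy spans are hypothetical and are not taken from the project. The function returns values between 0 and 1, whereas the scores above are on a 0 to 100 scale.

```python
from typing import Set, Tuple

# An entity is (start_token, end_token, label); toy representation for illustration.
Span = Tuple[int, int, str]


def entity_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall and F1."""
    true_positives = len(gold & pred)  # span and label must both match
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: the second prediction has the right span but the wrong type.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")}
pred = {(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "ORG")}
precision, recall, f1 = entity_prf(gold, pred)
```

In this toy case two of the three predictions are correct and two of the three gold entities are found, so precision, recall and F1 all equal 2/3.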
…
Page template forked from evanca