February 18th 2018

W.P. McNeill:

I want to train a TextCategorizer model with the following (text, label) pairs.

Label COLOR:

The door is brown.
The barn is red.
The flower is yellow.

Label ANIMAL:

The horse is running.
The fish is jumping.
The chicken is asleep.

I am copying the example code in the documentation for TextCategorizer.


textcat = TextCategorizer(nlp.vocab)
losses = {}
optimizer = nlp.begin_training()
textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer)

The doc variables will presumably be just nlp("The door is brown.") and so on. What should be in gold1 and gold2? I'm guessing they should be GoldParse objects, but I don't see how you represent text categorization information in those.
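A minimal sketch of one possible shape for the gold data, assuming spaCy 2.x, where GoldParse appears to accept a cats keyword that maps each label to True or False. The names LABELS, EXAMPLES and make_cats are hypothetical helpers introduced only for illustration. The block itself uses plain Python so that it runs without spaCy installed, and the spaCy calls are shown only in comments as an unverified assumption.

```python
# Labels and (text, label) pairs taken from the question above.
LABELS = ("COLOR", "ANIMAL")

EXAMPLES = [
    ("The door is brown.", "COLOR"),
    ("The barn is red.", "COLOR"),
    ("The flower is yellow.", "COLOR"),
    ("The horse is running.", "ANIMAL"),
    ("The fish is jumping.", "ANIMAL"),
    ("The chicken is asleep.", "ANIMAL"),
]


def make_cats(label, labels=LABELS):
    """Hypothetical helper: mark `label` True and every other label False."""
    if label not in labels:
        raise ValueError("unknown label: %r" % (label,))
    return {name: name == label for name in labels}


# One (text, annotations) pair per sentence. The {"cats": {...}} shape is an
# assumption modelled on spaCy 2.x conventions, not something the post confirms.
train_data = [(text, {"cats": make_cats(label)}) for text, label in EXAMPLES]

# Assumed spaCy 2.x usage (not executed here):
#   from spacy.gold import GoldParse
#   doc1 = nlp("The door is brown.")
#   gold1 = GoldParse(doc1, cats={"COLOR": True, "ANIMAL": False})
```

Under that assumption, gold1 and gold2 would each carry one such cats dictionary, and the labels would presumably also need to be registered on the component (for example with textcat.add_label) before training.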


Posted in S.E.F
via StackOverflow & StackExchange Atomic Web Robots

This post first appeared on Stack Solved; please read the original post: here