
SOLVED: How do I create gold data for TextCategorizer training?

W.P. McNeill:

I want to train a TextCategorizer model with the following (text, label) pairs.

Label COLOR:

  • The door is brown.
  • The barn is red.
  • The flower is yellow.

Label ANIMAL:

  • The horse is running.
  • The fish is jumping.
  • The chicken is asleep.

I am copying the example code in the documentation for TextCategorizer.


textcat = TextCategorizer(nlp.vocab)
losses = {}
optimizer = nlp.begin_training()
textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer)

The doc variables will presumably be just nlp("The door is brown.") and so on. What should be in gold1 and gold2? I'm guessing they should be GoldParse objects, but I don't see how you represent text categorization information in those.
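For reference, here is a minimal sketch of how the category information for the pairs above might be laid out. It assumes the spaCy v2 API, where GoldParse accepts a `cats` keyword argument: a dictionary mapping each label to a true/false (or 0.0 to 1.0) value. The names `EXAMPLES`, `make_cats` and `build_training_data` are hypothetical helpers made up for illustration, and the spaCy calls are left as comments because they depend on the installed version.

```python
# Labeled sentences from the question, grouped by category.
EXAMPLES = {
    "COLOR": ["The door is brown.", "The barn is red.", "The flower is yellow."],
    "ANIMAL": ["The horse is running.", "The fish is jumping.", "The chicken is asleep."],
}


def make_cats(label, all_labels):
    """Map every known label to True/False for a single example."""
    return {name: name == label for name in all_labels}


def build_training_data(examples):
    """Turn {label: [texts]} into a list of (text, {"cats": {...}}) pairs."""
    labels = sorted(examples)
    return [
        (text, {"cats": make_cats(label, labels)})
        for label, texts in examples.items()
        for text in texts
    ]


train_data = build_training_data(EXAMPLES)

# Assumption: with spaCy v2 installed, each pair could become a gold object
# along these lines (not executed here):
#   from spacy.gold import GoldParse
#   textcat.add_label("COLOR"); textcat.add_label("ANIMAL")
#   doc = nlp(text)
#   gold = GoldParse(doc, cats=annotations["cats"])
```

Under that assumption, gold1 for "The door is brown." would carry cats equal to {"ANIMAL": False, "COLOR": True}. Listing every label explicitly, including the negative ones, gives the model a target for each category.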





This post first appeared on Stack Solved; please read the original post: here
