This module contains custom models, loss functions, splitters, etc. for token classification tasks such as named entity recognition (NER).
 
What we're running with at the time this documentation was generated:
torch: 1.9.0+cu102
fastai: 2.5.2
transformers: 4.10.0

Token classification

The objective of token classification is to predict the correct label for each token provided in the input. In the computer vision world, this is akin to segmentation tasks, where we attempt to predict the class/label for each pixel in an image. Named entity recognition (NER) is an example of token classification in the NLP space.
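For example, in the common BIO tagging scheme each token gets exactly one label, with "B-" marking the beginning of an entity span, "I-" its continuation, and "O" a token outside any entity. A purely illustrative snippet (not taken from the dataset we load below):

example_tokens = ['Angela', 'Merkel', 'visited', 'Paris', '.']
example_labels = ['B-PER',  'I-PER',  'O',       'B-LOC', 'O']

for tok, lbl in zip(example_tokens, example_labels): print(f'{tok:>8} -> {lbl}')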

import ast
import pandas as pd

# the tokens/labels columns are stored as stringified lists, so convert them back to Python lists on load
df_converters = {'tokens': ast.literal_eval, 'labels': ast.literal_eval, 'nested-labels': ast.literal_eval}

# full nlp dataset
# germ_eval_df = pd.read_csv('./data/task-token-classification/germeval2014ner_cleaned.csv', converters=df_converters)

# demo nlp dataset
germ_eval_df = pd.read_csv('./germeval2014_sample.csv', converters=df_converters)

print(len(germ_eval_df))
germ_eval_df.head()
1000
id source tokens labels nested-labels ds_type
0 0 n-tv.de vom 26.02.2005 [2005-02-26] [Schartau, sagte, dem, ", Tagesspiegel, ", vom, Freitag, ,, Fischer, sei, ", in, einer, Weise, aufgetreten, ,, die, alles, andere, als, überzeugend, war, ", .] [B-PER, O, O, O, B-ORG, O, O, O, O, B-PER, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O] [O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O] train
1 1 welt.de vom 29.10.2005 [2005-10-29] [Firmengründer, Wolf, Peter, Bree, arbeitete, Anfang, der, siebziger, Jahre, als, Möbelvertreter, ,, als, er, einen, fliegenden, Händler, aus, dem, Libanon, traf, .] [O, B-PER, I-PER, I-PER, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, B-LOC, O, O] [O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O] train
2 2 http://www.stern.de/sport/fussball/krawalle-in-der-fussball-bundesliga-dfb-setzt-auf-falsche-konzepte-1553657.html#utm_source=standard&utm_medium=rss-feed&utm_campaign=sport [2010-03-25] [Ob, sie, dabei, nach, dem, Runden, Tisch, am, 23., April, in, Berlin, durch, ein, pädagogisches, Konzept, unterstützt, wird, ,, ist, allerdings, zu, bezweifeln, .] [O, O, O, O, O, O, O, O, O, O, O, B-LOC, O, O, O, O, O, O, O, O, O, O, O, O] [O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, O] train
3 3 stern.de vom 21.03.2006 [2006-03-21] [Bayern, München, ist, wieder, alleiniger, Top-, Favorit, auf, den, Gewinn, der, deutschen, Fußball-Meisterschaft, .] [B-ORG, I-ORG, O, O, O, O, O, O, O, O, O, B-LOCderiv, O, O] [B-LOC, B-LOC, O, O, O, O, O, O, O, O, O, O, O, O] train
4 4 http://www.fr-online.de/in_und_ausland/sport/aktuell/1618625_Frings-schaut-finster-in-die-Zukunft.html [2008-10-24] [Dabei, hätte, der, tapfere, Schlussmann, allen, Grund, gehabt, ,, sich, viel, früher, aufzuregen, .] [O, O, O, O, O, O, O, O, O, O, O, O, O, O] [O, O, O, O, O, O, O, O, O, O, O, O, O, O] train

We are only going to be working with a small sample from the GermEval 2014 dataset ... so the results might not be all that great :).

labels = sorted(list(set([lbls for sublist in germ_eval_df.labels.tolist() for lbls in sublist])))
print(labels)
['B-LOC', 'B-LOCderiv', 'B-LOCpart', 'B-ORG', 'B-ORGpart', 'B-OTH', 'B-OTHderiv', 'B-OTHpart', 'B-PER', 'B-PERderiv', 'B-PERpart', 'I-LOC', 'I-LOCderiv', 'I-ORG', 'I-ORGpart', 'I-OTH', 'I-PER', 'O']
model_cls = AutoModelForTokenClassification
pretrained_model_name = "bert-base-multilingual-cased"
config = AutoConfig.from_pretrained(pretrained_model_name)

config.num_labels = len(labels)

Notice above how I set the config.num_labels attribute to the number of labels we want our model to be able to predict. The model's last layer (the classification head) will be sized accordingly when it is created (this concept is essentially transfer learning).

hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(pretrained_model_name, 
                                                                  model_cls=model_cls, 
                                                                  config=config)
hf_arch, type(hf_config), type(hf_tokenizer), type(hf_model)
('bert',
 transformers.models.bert.configuration_bert.BertConfig,
 transformers.models.bert.tokenization_bert_fast.BertTokenizerFast,
 transformers.models.bert.modeling_bert.BertForTokenClassification)
test_eq(hf_config.num_labels, len(labels))
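Because num_labels was set on the config before the model was built, the token classification head should already have one output per label. A quick sanity check (note that the .classifier attribute is how BERT-style models expose that head; other architectures may name it differently):

print(hf_model.classifier)                              # e.g. Linear(in_features=768, out_features=18, bias=True)
test_eq(hf_model.classifier.out_features, len(labels))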
before_batch_tfm = HF_TokenClassBeforeBatchTransform(hf_arch, hf_config, hf_tokenizer, hf_model,
                                                     is_split_into_words=True, 
                                                     tok_kwargs={ 'return_special_tokens_mask': True })

blocks = (
    HF_TextBlock(before_batch_tfm=before_batch_tfm, input_return_type=HF_TokenClassInput), 
    HF_TokenCategoryBlock(vocab=labels)
)

def get_y(inp):
    # for each original token, return its label along with how many subtokens the tokenizer splits it into
    return [(label, len(hf_tokenizer.tokenize(str(entity)))) for entity, label in zip(inp.tokens, inp.labels)]

dblock = DataBlock(blocks=blocks, 
                   get_x=ColReader('tokens'),
                   get_y=get_y,
                   splitter=RandomSplitter())

We have to define a get_y that creates the same number of labels as there are subtokens for a particular token. For example, my name "Wayde" gets split into two subtokens, "Way" and "##de". The label for "Wayde" is "B-PER", and we simply repeat it for each subtoken. This all gets cleaned up when we show results and get predictions.
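To make that concrete, here is what the tokenizer does to "Wayde" and what get_y therefore returns for a hypothetical single-token row (the exact sub-word pieces depend on the tokenizer's vocabulary, so the values in the comments are indicative):

print(hf_tokenizer.tokenize('Wayde'))   # e.g. ['Way', '##de']

example = pd.Series({'tokens': ['Wayde'], 'labels': ['B-PER']})
print(get_y(example))                   # e.g. [('B-PER', 2)] -> 'B-PER' will be applied to both subtokens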

dls = dblock.dataloaders(germ_eval_df, bs=2)
dls.show_batch(dataloaders=dls, max_n=2)
token / target label
0 [('Helbig', 'B-OTH'), ('et', 'I-OTH'), ('al.', 'I-OTH'), ('(', 'O'), ('1994', 'O'), (')', 'O'), ('S.', 'O'), ('593.', 'O'), ('Wink', 'O'), ('&', 'B-OTH'), ('Seibold', 'I-OTH'), ('et', 'I-OTH'), ('al.', 'I-OTH'), ('(', 'I-OTH'), ('1998', 'O'), (')', 'O'), ('S.', 'O'), ('32', 'O'), ('Inwieweit', 'O'), ('noch', 'O'), ('andere', 'O'), ('Falken,', 'O'), ('wie', 'O'), ('der', 'O'), ('Afrikanische', 'O'), ('Baumfalke', 'O'), ('(', 'O'), ('Falco', 'B-LOCderiv'), ('cuvieri', 'O'), (')', 'O'), ('oder', 'O'), ('der', 'O'), ('Malaienbaumfalke', 'O'), ('(', 'O'), ('Falco', 'O'), ('serverus', 'O'), (')', 'O'), ('dieser', 'O'), ('Gruppe', 'O'), ('zuzuzählen', 'O'), ('sind,', 'O'), ('ist', 'O'), ('Gegenstand', 'O'), ('der', 'O'), ('Forschung.', 'O')]
1 [('Zugang', 'O'), ('und', 'O'), ('Engagement', 'O'), (':', 'O'), ('das', 'O'), ('eigentlich', 'O'), ('Neue', 'O'), ('an', 'O'), ('der', 'O'), ('Netz', 'O'), ('(', 'O'), ('werk', 'O'), (')', 'O'), ('kunst,', 'O'), ('in', 'O'), (':', 'O'), ('Medien', 'O'), ('Kunst', 'O'), ('Netz,', 'O'), ('2004,', 'O'), ('URL', 'O'), (':', 'O'), ('*', 'O'), ('Arns,', 'B-PER'), ('Inke', 'O'), (':', 'B-PER'), ('Netzkulturen,', 'O'), ('Hamburg', 'O'), ('(', 'O'), ('eva', 'B-LOC'), ('),', 'O'), ('2002,', 'O'), ('S.', 'O'), ('46', 'O'), ('und', 'O'), ('81', 'O'), ('*', 'O'), ('Armin', 'O'), ('Medosch', 'O'), (':', 'O'), ('Public', 'B-PER'), ('Netbase', 'I-PER'), ('Wien.', 'O')]

Metrics

In this section, we'll add helpful metrics for token classification tasks.

calculate_token_class_metrics[source]

calculate_token_class_metrics(pred_toks, targ_toks, metric_key)

Parameters:

  • pred_toks : <class 'inspect._empty'>

  • targ_toks : <class 'inspect._empty'>

  • metric_key : <class 'inspect._empty'>

Training

class HF_TokenClassMetricsCallback[source]

HF_TokenClassMetricsCallback(tok_metrics=['accuracy', 'precision', 'recall', 'f1'], **kwargs) :: Callback

A fastai-friendly callback that computes accuracy, precision, recall, and F1 metrics using the seqeval library. Additionally, it knows how to exclude your 'ignore_token' from its calculations.

See the seqeval project for more information; a short standalone example follows the parameter list below.

Parameters:

  • tok_metrics : <class 'list'>, optional

  • kwargs : <class 'inspect._empty'>
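To see what seqeval computes on its own, here is a minimal standalone example (a toy sketch, assuming seqeval is installed; it is not part of this module's code):

from seqeval.metrics import accuracy_score, classification_report, f1_score

# seqeval expects one list of BIO tags per sentence and scores at the entity level
y_true = [['B-PER', 'I-PER', 'O', 'B-LOC'], ['O', 'B-ORG', 'I-ORG', 'O']]
y_pred = [['B-PER', 'I-PER', 'O', 'O'],     ['O', 'B-ORG', 'I-ORG', 'O']]

print(accuracy_score(y_true, y_pred))   # token-level accuracy
print(f1_score(y_true, y_pred))         # entity-level (micro) F1
print(classification_report(y_true, y_pred))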

model = HF_BaseModelWrapper(hf_model)
learn_cbs = [HF_BaseModelCallback]
fit_cbs = [HF_TokenClassMetricsCallback()]

learn = Learner(dls, model, opt_func=partial(Adam), cbs=learn_cbs, splitter=hf_splitter)

learn.freeze()
learn.summary()
b = dls.one_batch()
preds = learn.model(b[0])
len(preds),preds[0].shape
(1, torch.Size([2, 76, 18]))
len(b), len(b[0]), b[0]['input_ids'].shape, len(b[1]), b[1].shape
(2, 3, torch.Size([2, 76]), 2, torch.Size([2, 76]))
print(preds[0].view(-1, preds[0].shape[-1]).shape, b[1].view(-1).shape)
test_eq(preds[0].view(-1, preds[0].shape[-1]).shape[0], b[1].view(-1).shape[0])
torch.Size([152, 18]) torch.Size([152])
print(len(learn.opt.param_groups))
3
learn.unfreeze()
learn.lr_find(suggest_funcs=[minimum, steep, valley, slide])
SuggestedLRs(minimum=0.005754399299621582, steep=3.630780702224001e-05, valley=0.0002290867705596611, slide=0.0020892962347716093)
learn.fit_one_cycle(1, lr_max=3e-5, moms=(0.8, 0.7, 0.8), cbs=fit_cbs)
epoch train_loss valid_loss accuracy precision recall f1 time
0 0.192442 0.143492 0.959979 0.631179 0.570447 0.599278 00:34
/home/wgilliam/miniconda3/envs/blurr/lib/python3.9/site-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
print(learn.token_classification_report)

Showing results

Below we'll add in additional functionality to more intuitively show the results of our model.

learn.show_results(learner=learn, max_n=2, trunc_at=10)
token / target label / predicted label
0 [('Scenes', 'B-OTH', 'O'), ('of', 'I-OTH', 'B-OTH'), ('a', 'I-OTH', 'I-OTH'), ('Sexual', 'I-OTH', 'I-OTH'), ('Nature', 'I-OTH', 'I-OTH'), ('(', 'O', 'I-OTH'), ('GB', 'O', 'I-OTH'), ('2006', 'O', 'O'), (')', 'O', 'B-LOC'), ('-', 'O', 'O')]
1 [('Der', 'O', 'O'), ('notenbeste', 'O', 'O'), ('Zweitligaspieler', 'O', 'O'), ('(', 'O', 'O'), ('2,', 'O', 'O'), ('91', 'O', 'O'), ('),', 'O', 'O'), ('der', 'O', 'O'), ('seine', 'O', 'O'), ('persönliche', 'O', 'O')]
res = learn.blurr_predict('My name is Wayde and I live in San Diego'.split())
print(res[0][0])
("['O', 'O', 'O', 'O', 'B-PER', 'B-PER', 'O', 'O', 'O', 'O', 'B-LOC', 'B-LOC', 'O']",)

The default blurr_predict method returns a prediction per subtoken, including the special tokens added by each architecture's tokenizer.
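To see where the 13 predictions for these 10 words come from, note that the tokenizer adds the architecture's special tokens ([CLS] and [SEP] for BERT) and may split a word into several sub-word pieces. A quick check (the exact counts depend on the tokenizer's vocabulary):

words = 'My name is Wayde and I live in San Diego'.split()
enc = hf_tokenizer(words, is_split_into_words=True)
print(len(words), len(enc['input_ids']))                 # e.g. 10 vs. 13
print(hf_tokenizer.convert_ids_to_tokens(enc['input_ids']))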

Learner.blurr_predict_tokens[source]

Learner.blurr_predict_tokens(items:Union[str, List[str]], **kwargs)

Parameters:

  • items : typing.Union[str, typing.List[str]]

    The str (or list of strings) you want to get token classification predictions for

  • kwargs : <class 'inspect._empty'>
txt ="Hi! My name is Wayde Gilliam from ohmeow.com. I live in California."
txt2 = "I wish covid was over so I could go to Germany and watch Bayern Munich play in the Bundesliga."
res = learn.blurr_predict_tokens(txt.split())
for r in res: print(f'{[(tok, lbl) for tok,lbl in zip(r[0],r[1]) ]}\n')
[('Hi!', 'O'), ('My', 'O'), ('name', 'O'), ('is', 'O'), ('Wayde', 'B-PER'), ('Gilliam', 'I-PER'), ('from', 'O'), ('ohmeow.com.', 'B-ORG'), ('I', 'O'), ('live', 'O'), ('in', 'O'), ('California.', 'B-LOC')]

res = learn.blurr_predict_tokens([txt.split(), txt2.split()])
for r in res: print(f'{[(tok, lbl) for tok,lbl in zip(r[0],r[1]) ]}\n')
[('Hi!', 'O'), ('My', 'O'), ('name', 'O'), ('is', 'O'), ('Wayde', 'B-PER'), ('Gilliam', 'I-PER'), ('from', 'O'), ('ohmeow.com.', 'B-ORG'), ('I', 'O'), ('live', 'O'), ('in', 'O'), ('California.', 'B-LOC')]

[('I', 'O'), ('wish', 'O'), ('covid', 'O'), ('was', 'O'), ('over', 'O'), ('so', 'O'), ('I', 'O'), ('could', 'O'), ('go', 'O'), ('to', 'O'), ('Germany', 'B-LOC'), ('and', 'O'), ('watch', 'O'), ('Bayern', 'B-ORG'), ('Munich', 'B-LOC'), ('play', 'O'), ('in', 'O'), ('the', 'O'), ('Bundesliga.', 'B-LOC')]

It's interesting (and very cool) how well this model performs on English even though it was fine-tuned on a German corpus.

Inference

export_fname = 'tok_class_learn_export'
learn.export(fname=f'{export_fname}.pkl')
inf_learn = load_learner(fname=f'{export_fname}.pkl')

res = inf_learn.blurr_predict_tokens([txt.split(), txt2.split()])
for r in res: print(f'{[(tok, lbl) for tok,lbl in zip(r[0],r[1]) ]}\n')
[('Hi!', 'O'), ('My', 'O'), ('name', 'O'), ('is', 'O'), ('Wayde', 'B-PER'), ('Gilliam', 'I-PER'), ('from', 'O'), ('ohmeow.com.', 'B-ORG'), ('I', 'O'), ('live', 'O'), ('in', 'O'), ('California.', 'B-LOC')]

[('I', 'O'), ('wish', 'O'), ('covid', 'O'), ('was', 'O'), ('over', 'O'), ('so', 'O'), ('I', 'O'), ('could', 'O'), ('go', 'O'), ('to', 'O'), ('Germany', 'B-LOC'), ('and', 'O'), ('watch', 'O'), ('Bayern', 'B-ORG'), ('Munich', 'B-LOC'), ('play', 'O'), ('in', 'O'), ('the', 'O'), ('Bundesliga.', 'B-LOC')]

High-level API

BlearnerForTokenClassification

class BlearnerForTokenClassification[source]

BlearnerForTokenClassification(dls:DataLoaders, hf_model:PreTrainedModel, base_model_cb:HF_BaseModelCallback=HF_BaseModelCallback, loss_func=None, opt_func=Adam, lr=0.001, splitter=trainable_params, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95)) :: Blearner

Group together a model, some dls and a loss_func to handle training

Parameters:

  • dls : <class 'fastai.data.core.DataLoaders'>

  • hf_model : <class 'transformers.modeling_utils.PreTrainedModel'>

  • kwargs : <class 'inspect._empty'>

learn = BlearnerForTokenClassification.from_dataframe(germ_eval_df, 'bert-base-multilingual-cased', 
                                                      tokens_attr='tokens', token_labels_attr='labels', 
                                                      dblock_splitter=RandomSplitter(), 
                                                      dl_kwargs={'bs':2})

learn.unfreeze()
learn.dls.show_batch(dataloaders=learn.dls, max_n=2)
token / target label
0 [('(', 'O'), ('Standard', 'B-ORG'), ('Oil', 'I-ORG'), ('of', 'I-ORG'), ('New', 'I-ORG'), ('Jersey', 'I-ORG'), ('),', 'O'), ('die', 'O'), ('ausgesprochen', 'O'), ('„', 'O'), ('Esso', 'O'), ('ergeben', 'B-ORG'), ('(', 'O'), ('heute', 'O'), ('ExxonMobil', 'O'), (').', 'O'), (';', 'B-ORG'), ('Exxon', 'O'), (':', 'O'), ('Ein', 'O'), ('Name,', 'B-ORG'), ('der', 'O'), ('in', 'O'), ('den', 'O'), ('frühen', 'O'), ('1970ern', 'O'), ('von', 'O'), ('Esso', 'O'), ('erfunden', 'O'), ('wurde,', 'O'), ('um', 'O'), ('ein', 'B-ORG'), ('neutrales', 'O'), ('aber', 'O'), ('eindeutiges', 'O'), ('Markenzeichen', 'O'), ('für', 'O'), ('das', 'O'), ('Unternehmen', 'O'), ('zu', 'O'), ('haben.', 'O')]
1 [('Der', 'O'), ('notenbeste', 'O'), ('Zweitligaspieler', 'O'), ('(', 'O'), ('2,', 'O'), ('91', 'O'), ('),', 'O'), ('der', 'O'), ('seine', 'O'), ('persönliche', 'O'), ('Bilanz', 'O'), ('auf', 'O'), ('sieben', 'O'), ('Tore', 'O'), ('und', 'O'), ('13', 'O'), ('Assists', 'O'), ('aufstockte', 'O'), ('und', 'O'), ('schon', 'O'), ('vor', 'O'), ('Wochen', 'O'), ('seinen', 'O'), ('Wechsel', 'O'), ('zu', 'O'), ('Dortmund', 'B-ORG'), ('bekannt', 'O'), ('gegeben', 'O'), ('hatte,', 'O'), ('war', 'O'), ('zuletzt', 'O'), ('in', 'O'), ('Mainz', 'O'), ('in', 'B-LOC'), ('die', 'O'), ('Kritik', 'O'), ('geraten.', 'O')]
learn.fit_one_cycle(1, lr_max=3e-5, moms=(0.8, 0.7, 0.8), cbs=[BlearnerForTokenClassification.get_metrics_cb()])
epoch train_loss valid_loss accuracy precision recall f1 time
0 0.195376 0.180452 0.945981 0.548638 0.550781 0.549708 00:35
/home/wgilliam/miniconda3/envs/blurr/lib/python3.9/site-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
learn.show_results(learner=learn, max_n=2, trunc_at=10)
token / target label / predicted label
0 [('Helbig', 'B-OTH', 'O'), ('et', 'I-OTH', 'B-PER'), ('al.', 'I-OTH', 'I-PER'), ('(', 'O', 'I-PER'), ('1994', 'O', 'O'), (')', 'O', 'O'), ('S.', 'O', 'O'), ('593.', 'O', 'O'), ('Wink', 'O', 'O'), ('&', 'B-OTH', 'O')]
1 [('NEWSru.', 'B-OTH', 'O'), ('ua', 'O', 'B-ORG'), ('/', 'O', 'B-ORG'), (':', 'B-OTH', 'O'), ('Политисполком', 'I-OTH', 'O'), ('СПУ', 'I-OTH', 'O'), ('отказал', 'I-OTH', 'O'), ('Морозу', 'I-OTH', 'O'), ('в', 'I-OTH', 'O'), ('отставке', 'O', 'O')]
print(learn.token_classification_report)
              precision    recall  f1-score   support

         LOC       0.76      0.48      0.59        79
    LOCderiv       0.48      0.73      0.58        15
     LOCpart       0.00      0.00      0.00         0
         ORG       0.42      0.31      0.35        75
     ORGpart       0.00      0.00      0.00         0
         OTH       0.03      0.25      0.05         4
    OTHderiv       0.00      0.00      0.00         0
         PER       0.86      0.82      0.84        83
    PERderiv       0.00      0.00      0.00         0
     PERpart       0.00      0.00      0.00         0

   micro avg       0.55      0.55      0.55       256
   macro avg       0.25      0.26      0.24       256
weighted avg       0.66      0.55      0.59       256

txt ="Hi! My name is Wayde Gilliam from ohmeow.com. I live in California."
txt2 = "I wish covid was over so I could watch Lewandowski score some more goals for Bayern Munich in the Bundesliga."
res = learn.blurr_predict_tokens(txt.split())
for r in res: print(f'{[(tok, lbl) for tok,lbl in zip(r[0],r[1]) ]}\n')
[('Hi!', 'O'), ('My', 'O'), ('name', 'O'), ('is', 'O'), ('Wayde', 'B-PER'), ('Gilliam', 'I-PER'), ('from', 'O'), ('ohmeow.com.', 'O'), ('I', 'O'), ('live', 'O'), ('in', 'O'), ('California.', 'B-LOC')]

res = learn.blurr_predict_tokens([txt.split(), txt2.split()])
for r in res: print(f'{[(tok, lbl) for tok,lbl in zip(r[0],r[1]) ]}\n')
[('Hi!', 'O'), ('My', 'O'), ('name', 'O'), ('is', 'O'), ('Wayde', 'B-PER'), ('Gilliam', 'I-PER'), ('from', 'O'), ('ohmeow.com.', 'O'), ('I', 'O'), ('live', 'O'), ('in', 'O'), ('California.', 'B-LOC')]

[('I', 'O'), ('wish', 'O'), ('covid', 'O'), ('was', 'O'), ('over', 'O'), ('so', 'O'), ('I', 'O'), ('could', 'O'), ('watch', 'O'), ('Lewandowski', 'B-PER'), ('score', 'O'), ('some', 'O'), ('more', 'O'), ('goals', 'O'), ('for', 'O'), ('Bayern', 'B-ORG'), ('Munich', 'B-LOC'), ('in', 'O'), ('the', 'O'), ('Bundesliga.', 'B-ORG')]

Tests

The tests below ensure that the token classification training code above works for all pretrained token classification models available in Hugging Face. They are excluded from the CI workflow because of how long they would take to run and the amount of data they would need to download.

Note: Feel free to modify the code below to test whatever pretrained token classification models you are working with ... and if any of them fail, please submit a GitHub issue (or a PR if you'd like to fix it yourself).
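For reference, the test loop essentially boils down to something like the sketch below (the checkpoint names are just examples; the real tests also build the DataLoaders and run a short fit_one_cycle for each model, which is elided here):

test_results = []
for name in ['bert-base-multilingual-cased', 'distilbert-base-multilingual-cased']:
    try:
        config = AutoConfig.from_pretrained(name)
        config.num_labels = len(labels)
        hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(name, model_cls=AutoModelForTokenClassification, config=config)
        # ... build the DataBlock/Learner and run a quick fit_one_cycle as shown above ...
        test_results.append((hf_arch, type(hf_tokenizer).__name__, type(hf_model).__name__, 'PASSED', ''))
    except Exception as err:
        test_results.append((name, '', '', 'FAILED', str(err)))

pd.DataFrame(test_results, columns=['arch', 'tokenizer', 'model_name', 'result', 'error'])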

arch tokenizer model_name result error
0 albert AlbertTokenizerFast AlbertForTokenClassification PASSED
1 bert BertTokenizerFast BertForTokenClassification PASSED
2 camembert CamembertTokenizerFast CamembertForTokenClassification PASSED
3 distilbert DistilBertTokenizerFast DistilBertForTokenClassification PASSED
4 electra ElectraTokenizerFast ElectraForTokenClassification PASSED
5 flaubert FlaubertTokenizer FlaubertForTokenClassification PASSED
6 funnel FunnelTokenizerFast FunnelForTokenClassification PASSED
7 longformer LongformerTokenizerFast LongformerForTokenClassification PASSED
8 mpnet MPNetTokenizerFast MPNetForTokenClassification PASSED
9 mobilebert MobileBertTokenizerFast MobileBertForTokenClassification PASSED
10 roberta RobertaTokenizerFast RobertaForTokenClassification PASSED
11 squeezebert SqueezeBertTokenizerFast SqueezeBertForTokenClassification PASSED
12 xlm XLMTokenizer XLMForTokenClassification PASSED
13 xlm_roberta XLMRobertaTokenizerFast XLMRobertaForTokenClassification PASSED
14 xlnet XLNetTokenizerFast XLNetForTokenClassification PASSED

Summary

This module includes all the low-, mid-, and high-level API bits for training and running inference on token classification tasks.