The text.modeling.seq2seq.core module contains the core custom models, loss functions, and other building blocks for Seq2Seq tasks (e.g., language modeling, summarization, translation).
Mid-level API
We add a custom param splitter to give us a bit more depth in applying discriminative learning rates for Seq2Seq tasks.
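To make the idea concrete, here is a minimal, library-free sketch of what a parameter splitter for discriminative learning rates does. It is not BLURR's actual splitter: the function names, the parameter names, the four group labels, and the decay factor are all hypothetical, chosen only for illustration.

```python
def split_param_names(names):
    """Group parameter names into ordered groups, earliest layers first.

    The substring rules below are an assumption for illustration; a real
    splitter would inspect the modules of the specific architecture.
    """
    groups = {"embeddings": [], "encoder": [], "decoder": [], "head": []}
    for name in names:
        if "embed" in name:
            groups["embeddings"].append(name)
        elif "encoder" in name:
            groups["encoder"].append(name)
        elif "decoder" in name:
            groups["decoder"].append(name)
        else:
            groups["head"].append(name)
    return groups


def discriminative_lrs(max_lr, n_groups, factor=2.0):
    """Return one learning rate per group, smallest for the earliest group.

    The last group trains at max_lr; each earlier group is divided by
    `factor` once more (the factor of 2.0 is an arbitrary example value).
    """
    return [max_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]


# Hypothetical parameter names, loosely modeled on an encoder-decoder model
names = [
    "model.encoder.embed_tokens.weight",
    "model.encoder.layers.0.fc1.weight",
    "model.decoder.layers.0.fc1.weight",
    "lm_head.weight",
]
groups = split_param_names(names)
for (label, members), lr in zip(groups.items(), discriminative_lrs(1e-3, len(groups))):
    print(label, lr, members)
```

The more groups the splitter returns, the finer the control over how quickly each part of the pretrained model is allowed to change during fine-tuning.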
A dictionary of the seq2seq metrics we want to use. See below, and the various task-specific seq2seq docs, for examples of how to configure this per task.
Calculating these metrics requires text generation, which is expensive. You can choose to calculate them on every epoch (‘epoch’), every other epoch (‘other_epoch’), or only on the last epoch (‘last_epoch’). The default is ‘epoch’.
The token ID that should be ignored when calculating the loss
Any keyword arguments to pass to the hf_model.generate method
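The scheduling and ignore-token parameters above can be sketched as two small helpers. This is an illustration under stated assumptions, not the callback's actual code: both function names are hypothetical, the choice of which epochs count for ‘other_epoch’ is a guess, and -100 is used as the default only because it is the default ignore_index of PyTorch's cross-entropy loss.

```python
def should_calc_metrics(epoch, n_epochs, calc_every="epoch"):
    """Decide whether to run the (expensive) generation-based metrics.

    `epoch` is zero-based. Which epochs 'other_epoch' selects is an
    assumption here: the 2nd, 4th, 6th, ... epoch.
    """
    if calc_every == "epoch":
        return True
    if calc_every == "other_epoch":
        return (epoch + 1) % 2 == 0
    if calc_every == "last_epoch":
        return epoch == n_epochs - 1
    raise ValueError(f"Unknown calc_every value: {calc_every!r}")


def strip_ignored(token_ids, ignore_token_id=-100):
    """Drop positions holding the ignore token ID before decoding targets."""
    return [t for t in token_ids if t != ignore_token_id]


# Which of 4 epochs would trigger metric calculation under each setting
for setting in ("epoch", "other_epoch", "last_epoch"):
    print(setting, [should_calc_metrics(e, 4, setting) for e in range(4)])

# Padded target positions are removed so they are never decoded
print(strip_ignored([0, 713, 16, 2, -100, -100]))
```

Choosing ‘last_epoch’ keeps training fast while still reporting the metrics once at the end, which is often enough when each evaluation requires generating text for the whole validation set.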
BLURR provides a special callback for seq2seq models that calculates a variety of useful metrics, each of which requires decoding both the target and the predicted input_ids.
It's official: U.S. President Barack Obama wants lawmakers to weigh in on whether to use military force in Syria. Obama sent a letter to the heads of the House and Senate on Saturday night, hours after announcing that he believes military action against Syrian targets is the right step to take over the alleged use of chemical weapons. The proposed legislation from Obama asks Congress to approve the use of military force "to deter, disrupt, prevent and degrade the potential for future uses of chemical weapons or other weapons of mass destruction." It's a step that is set to turn an internat...
Syrian official: Obama climbed to the top of the tree, "doesn't know how to get down"\nObama sends a letter to the heads of the House and Senate .\nObama to seek congressional approval on military action against Syria .\nAim is to determine whether CW were used, not by whom, says U.N. spokesman .
(CNN) -- Usain Bolt rounded off the world championships Sunday by claiming his third gold in Moscow as he anchored Jamaica to victory in the men's 4x100m relay. The fastest man in the world charged clear of United States rival Justin Gatlin as the Jamaican quartet of Nesta Carter, Kemar Bailey-Cole, Nickel Ashmeade and Bolt won in 37.36 seconds. The U.S finished second in 37.56 seconds with Canada taking the bronze after Britain were disqualified for a faulty handover. The 26-year-old Bolt has now collected eight gold medals at world championships, equaling the record held by American trio...
Usain Bolt wins third gold of world championship .\nAnchors Jamaica to 4x100m relay victory .\nEighth gold at the championships for Bolt .\nJamaica double up in women's 4x100m relay .
<s> (CNN) -- When Ji Yeqing awakened, she was already in the recovery room. Chinese authorities had dragged her out of her home and down four flights of stairs, she said, restraining and beating her husband as he tried to come to her aid. They whisked her into a clinic, held her down on a bed and forced her to undergo an abortion. Her offense? Becoming pregnant with a second child, in violation of China's one-child policy. "After the abortion, I felt empty, as if something was scooped out of me," Ji told a congressional panel in September. "My husband and I had been so excited for our new baby. Now suddenly all that hope and joy and excitement disappeared.... I was very depressed and despondent. For a long time, whenever I thought about my lost child, I would cry." As she lay unconscious, she said, an IUD to prevent future pregnancies was inserted. The issue of forced abortions -- and in some cases, forced sterilizations -- in China has seized the spotlight in recent days with news of escaped activist Chen Guangcheng. Chen, a blind, self-taught lawyer, rose to fame in the late 1990s because of his advocacy for what he calls victims</s>
China's one-child policy results in forced abortions and sterilizations, activists say.\nWomen tell of emotional and physical consequences from the procedures.\nActivist Chen Guangcheng works to advocate for victims of such practices.
<s> Few question that there was a major chemical attack in Syria last week, and the United States has made clear that it blames the government of President Bashar al-Assad. Now, the question is how President Barack Obama will respond. For almost two years, Obama has avoided direct military involvement in Syria's civil war, only escalating aid to rebel fighters in June after suspected smaller-scale chemical weapons attacks by Syrian government forces. However, last week's attack on a Damascus suburb that reportedly killed and wounded more than 3,000 people obliterated the "red line" Obama set just over a year ago against the use of Syria's chemical weapons stocks. At the White House, spokesman Jay Carney told reporters Monday that Obama was evaluating "a response to the clear use on a mass scale with repugnant results of chemical weapons," adding that "there is very little doubt that the Syrian regime... used those weapons." Meanwhile, U.S. Secretary of State John Kerry called the attack "inexcusable" and "undeniable," and said there was "a clear reason that the world has banned entirely chemical weapons." He said that evidence "strongly indicates" chemical weapons were used in Syria and that "we know the Syrian regime maintains custody" of such weapons and has</s>
U.S. evidence includes satellite imagery, official says.\nObama is considering how to respond to Syrian chemical attack.\nOfficial: Obama could be presented with options within days.\nA U.S. strike "can't just be one and done," a Middle East analyst says.
b = dls.one_batch()
preds = learn.model(b[0])
len(preds), preds["loss"].shape, preds["logits"].shape
(4, torch.Size([]), torch.Size([2, 58, 50264]))
b = dls.one_batch()
preds = learn.model(b[0])
len(preds), preds["loss"].shape, preds["logits"].shape
(4, torch.Size([]), torch.Size([2, 69, 50264]))
learn.lr_find(suggest_funcs=[minimum, steep, valley, slide])
Below we’ll add functionality that takes advantage of Hugging Face’s PreTrainedModel.generate method, which makes it easy to use beam search, top-k/nucleus sampling, etc., so that we get more human-sounding results.
To make things even easier, for text generation tasks you can simply call the Learner.blurr_generate method, optionally passing in whatever text generation kwargs you wish, to accomplish the same as above.
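For intuition about the sampling strategies mentioned here, the following toy sketch shows what top-k and nucleus (top-p) filtering do to a next-token distribution. It is not how Hugging Face implements them; the function names and the example probabilities are made up. In generate, these strategies are selected with the top_k and top_p kwargs together with do_sample=True.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = order[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


# A made-up distribution over a 4-token vocabulary
probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, k=2))          # only tokens 0 and 1 survive
print(nucleus_filter(probs, top_p=0.9))  # tokens 0, 1 and 2 survive
```

Top-k always keeps a fixed number of candidates, while nucleus sampling adapts the number of candidates to how peaked the distribution is; sampling from the filtered distribution is what tends to make the generated text read less repetitively than greedy decoding.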
test_article = """About 10 men armed with pistols and small machine guns raided a casino in Switzerland and made off into France with several hundred thousand Swiss francs in the early hours of Sunday morning, police said. The men, dressed in black clothes and black ski masks, split into two groups during the raid on the Grand Casino Basel, Chief Inspector Peter Gill told CNN. One group tried to break into the casino's vault on the lower level but could not get in, but they did rob the cashier of the money that was not secured, he said. The second group of armed robbers entered the upper level where the roulette and blackjack tables are located and robbed the cashier there, he said. As the thieves were leaving the casino, a woman driving by and unaware of what was occurring unknowingly blocked the armed robbers' vehicles. A gunman pulled the woman from her vehicle, beat her, and took off for the French border. The other gunmen followed into France, which is only about 100 meters (yards) from the casino, Gill said. There were about 600 people in the casino at the time of the robbery. There were no serious injuries, although one guest on the Casino floor was kicked in the head by one of the robbers when he moved, the police officer said. Swiss authorities are working closely with French authorities, Gill said. The robbers spoke French and drove vehicles with French license plates. CNN's Andreena Narayan contributed to this report."""
b = dls.valid.one_batch()

# Grab the tokenizer and ignore token ID from the first blurr transform
tfm = first_blurr_tfm(dls)
b_hf_tokenizer = tfm.hf_tokenizer
b_ignore_token_id = tfm.ignore_token_id

# Take the first example of the batch and move it to the model's device
test_input_ids = b[0]["input_ids"][0].unsqueeze(0).to(learn.model.hf_model.device)
test_trg_ids = b[1][0].unsqueeze(0).to(learn.model.hf_model.device)

# Drop ignored positions from the targets so they can be decoded
test_trg_ids = [trg[trg != b_ignore_token_id] for trg in test_trg_ids]

gen_text = learn.model.hf_model.generate(test_input_ids, num_beams=4, max_length=130, min_length=30)

print("=== Target ===")
print(f"{b_hf_tokenizer.decode(test_trg_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)}\n")
print("=== Prediction ===")
print(b_hf_tokenizer.decode(gen_text[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
=== Target ===
A winter storm slams the northeastern United States.
The U.S. House of Representatives condemns the Arizona shooting.
Massive floods leave vast areas of Australia underwater.
Use the Daily Discussion to help students understand today's featured news stories.
=== Prediction ===
This is a RUSH transcript of today's CNN Student News show.
Use the Transcript to help students with reading comprehension and vocabulary.
The Story of a problem that won't be solved, even if the solution is clear.
A look at the storm system that iced out the southeast.
Learn about the problem that will never be solved in Australia.
Explore the story and the reasons why a problem won't solve itself.
[{'generated_texts': [" The robbers made off with several hundred thousand Swiss francs in the early hours of Sunday morning .\nThe men, dressed in black clothes and black ski masks, split into two groups during the raid on the Grand Casino Basel .\nAs the thieves were leaving the casino, a woman driving by unknowingly blocked the armed robbers' vehicles .\nA gunman pulled the woman from her vehicle, beat her up, and took off for the French border .\nThere were about 600 people in the casino at the time of the robbery .",
" The robbers made off with several hundred thousand Swiss francs in the early hours of Sunday morning .\nThe men, dressed in black clothes and black ski masks, split into two groups during the raid on the Grand Casino Basel .\nAs the thieves were leaving the casino, a woman driving by unknowingly blocked the armed robbers' vehicles .\nA gunman pulled the woman from her vehicle, beat her up, and took off for the French border .\nThere were about 600 people in the casino at the time of the raid .",
" The robbers made off with several hundred thousand Swiss francs in the early hours of Sunday morning .\nThe men, dressed in black clothes and black ski masks, split into two groups during the raid on the Grand Casino Basel .\nAs the thieves were leaving the casino, a woman driving by unknowingly blocked the armed robbers' vehicles .\nA gunman pulled the woman from her vehicle, beat her up, and took off for the French border .\nFrench authorities are working closely with Swiss authorities ."]}]
Showing results
Much nicer!!! Now, we can update our @typedispatched show_results to use this new method.
(CNN Student News) -- January 13, 2011. Download PDF maps related to today's show:. • Arizona • Australia. Transcript. THIS IS A RUSH TRANSCRIPT. THIS COPY MAY NOT BE IN ITS FINAL FORM AND MAY BE UPDATED. CARL AZUZ, CNN STUDENT NEWS ANCHOR: A problem that won't be solved, even if the solution is clear. The story and the reasons, leading off today's broadcast of CNN Student News! My name is Carl Azuz! First Up: Winter Storm Woes. AZUZ: Florida is the only state in the union without snow on the g
A winter storm slams the northeastern United States.\nThe U.S. House of Representatives condemns the Arizona shooting.\nMassive floods leave vast areas of Australia underwater.\nUse the Daily Discussion to help students understand today's featured news
[ This is a RUSH transcript of today's CNN Student News show .\nUse the Transcript to help students with reading comprehension and vocabulary .\nThe Story of a problem that won't be solved, even if the solution is clear .\nA look at the storm system that iced out the southeast .\nLearn about the problem that will never be solved in Australia .\nExplore the story and the reasons why a problem won't solve itself ., The Cotswolds are a slice of picture-postcard England .\nThe wool trade boomed in these rolling hills in medieval times and today the region is littered with achingly pretty villages .\nLeading members of the arts and crafts movement were among the first to visit Chipping Campden with its long curving high street .]
[{'generated_texts': " The robbers made off with several hundred thousand Swiss francs in the early hours of Sunday morning .\nThe men, dressed in black clothes and black ski masks, split into two groups during the raid on the Grand Casino Basel .\nAs the thieves were leaving the casino, a woman driving by unknowingly blocked the armed robbers' vehicles .\nA gunman pulled the woman from her vehicle, beat her up, and took off for the French border .\nThere were about 600 people in the casino at the time of the robbery ."}]