Departament de Llenguatges i Sistemes Informàtics

Announcement

Title: Automated Fact Checking
Type: Research talk
By: James Thorne (University of Cambridge), researching Natural Language Processing
Venue: Claude Shannon Room, Escola Politècnica Superior IV (basement)
Date/time: 08/04/2019, 12:00
More information: https://jamesthorne.co.uk/
Contact person:

Saquete Boró, Estela (stela@dlsi.ua.es)
Abstract:
With billions of individual pages on the web providing information on almost
every conceivable topic, we should have the ability to collect facts that
answer almost every conceivable question. However, only a small fraction
of this information is contained in structured sources (Wikidata, Freebase,
etc.) – we are therefore limited by our ability to transform free-form text
to structured knowledge. There is, however, another problem that has become
the focus of a lot of recent research and media coverage: false information
coming from unreliable sources.

In this presentation, I will discuss some of the unique challenges involved
in Fact Checking: the task of predicting whether written information is true
or false given evidence. I will first compare the task to fake news detection,
highlighting the need for a rational explanation of the predictions made by
the model and the trade-off between the types of evidence that can be used.
I will use this as motivation for the Fact Extraction and VERification (FEVER)
dataset and challenge, which we released in 2018.
in 2018. The dataset comprises more than 185,000 human-generated claims
extracted from Wikipedia pages. False claims were generated by mutating true
claims in a variety of ways, some of which were meaning-altering. During the
verification step, annotators were required to label a claim for its validity
and also supply full-sentence textual evidence from (potentially multiple)
Wikipedia articles for the label. 
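
Concretely, each record in the dataset pairs a claim with its validity label
and sentence-level evidence pointers. The Python sketch below is illustrative
only: the field layout follows the publicly released FEVER JSONL format, while
the ID values and the claim text are invented for this example.

    # Illustrative FEVER-style record (field layout per the public JSONL
    # release; the IDs and claim text here are invented).
    record = {
        "id": 12345,
        "label": "SUPPORTS",  # or "REFUTES" / "NOT ENOUGH INFO"
        "claim": "Wikipedia was launched in 2001.",
        # Evidence is grouped into sets; each item points at one sentence:
        # [annotation id, evidence id, Wikipedia page title, sentence index].
        # A single set may span several pages, matching the "potentially
        # multiple Wikipedia articles" requirement above.
        "evidence": [[[100001, 200001, "Wikipedia", 0]]],
    }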

I will explore some of the challenges of performing multi-sentence natural
language inference on these diverse texts for the FEVER challenge and present
two ways in which the models can be diagnosed: the first is through generating
token-level explanations from the model without explicitly labelled training
data, and the second is through adversarial evaluation. Finally, I will
conclude by discussing the future challenges in the task and the next
iteration of the FEVER shared task. With FEVER, we aim to help create a
new generation of transparent and interpretable knowledge extraction systems.
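
As a rough illustration of the retrieve-then-verify setup the talk addresses,
the Python sketch below wires a toy word-overlap retriever to a stand-in
verdict rule. Both components are hypothetical placeholders: a real FEVER
system would use a trained document and sentence retriever and a
multi-sentence natural language inference model in their place.

    from typing import Dict, List

    def retrieve_evidence(claim: str, corpus: Dict[str, List[str]],
                          k: int = 5) -> List[str]:
        # Toy retriever: rank every corpus sentence by word overlap
        # with the claim and keep the top-k non-zero matches.
        claim_words = set(claim.lower().split())
        scored = sorted(
            ((len(claim_words & set(sent.lower().split())), sent)
             for sents in corpus.values() for sent in sents),
            reverse=True,
        )
        return [sent for score, sent in scored[:k] if score > 0]

    def verify(claim: str, evidence: List[str]) -> str:
        # Stand-in for the multi-sentence NLI step: a crude negation
        # heuristic sits where a trained model would score the claim
        # against the concatenated evidence.
        if not evidence:
            return "NOT ENOUGH INFO"
        text = " " + " ".join(evidence).lower() + " "
        return "REFUTES" if (" not " in text or " never " in text) else "SUPPORTS"

    corpus = {"Wikipedia": ["Wikipedia was launched in 2001.",
                            "It is a free online encyclopedia."]}
    claim = "Wikipedia was launched in 2001."
    print(verify(claim, retrieve_evidence(claim, corpus)))  # -> SUPPORTS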
