
presentation: NTCIR-11 Temporalia

Using machine learning to predict temporal orientation
of search engines' queries in the Temporalia challenge
Michele Filannino, Goran Nenadic
filannim@cs.man.ac.uk, g.nenadic@manchester.ac.uk

Tokyo, 11/12/2014


temporal query intent (TQI)


Given a user query and its submission time, can a
system predict its temporal intent?

input: queries & submission date


output: temporal intent
PAST, RECENCY, FUTURE or ATEMPORAL

easy for people


hard for machines

TQI: recency

Source: https://www.google.co.uk/search?q=google+stock+price


TQI: future

Source: https://www.google.co.uk/search?q=weather+forecast+manchester


TQI: past

Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars


TQI: atemporal

Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars


the data
training set: 80 instances + 20 instances (released as preliminary test set)
test set: 300 instances

<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>recency</temporal_class>
</query>
<query>
<id>005</id>
<query_string>i am a gummy bear</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>atemporal</temporal_class>
</query>
<query>
<id>006</id>
<query_string>madden 2014 release date</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>future</temporal_class>
</query>
<query>
<id>007</id>
<query_string>comet coming in 2013</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>past</temporal_class>
</query>
<query>
<id>008</id>
<query_string>french open 2013 official</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>future</temporal_class>
</query>
<query>
<id>009</id>
<query_string>daylight savings time 2010 united...
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>past</temporal_class>
</query>
<query>
<id>010</id>
<query_string>attack synonym</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>atemporal</temporal_class>
</query>
<query>
<id>011</id>
<query_string>may 2013 calendar printable</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
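A minimal sketch of how records in this format could be loaded, using only the Python standard library. The enclosing `<queries>` root element and the `load_queries` name are assumptions made for illustration; the slide shows only individual `<query>` elements.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Two records copied from the slide, wrapped in a hypothetical <queries> root.
SAMPLE = """<queries>
<query>
<id>006</id>
<query_string>madden 2014 release date</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>future</temporal_class>
</query>
<query>
<id>010</id>
<query_string>attack synonym</query_string>
<query_issue_time>May 1, 2013 GMT+0</query_issue_time>
<temporal_class>atemporal</temporal_class>
</query>
</queries>"""


def load_queries(xml_text):
    """Return one dict per <query> element."""
    records = []
    for node in ET.fromstring(xml_text).iter("query"):
        # Drop the fixed time-zone suffix, then parse e.g. "May 1, 2013".
        issued = node.findtext("query_issue_time").replace(" GMT+0", "")
        records.append({
            "id": node.findtext("id"),
            "text": node.findtext("query_string"),
            "issued": datetime.strptime(issued, "%b %d, %Y").date(),
            "label": node.findtext("temporal_class"),
        })
    return records


queries = load_queries(SAMPLE)
```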


proposed approach
data-driven rather than rule-based
low-sparsity attributes
external resources:

TempoWordNet [1], a temporal lexical KB

ManTIME, a temporal expression extraction system

NLTK

[1] G. H. Dias, M. Hasanuzzaman, S. Ferrari, and Y. Mathet. TempoWordNet for sentence time tagging. In
Proceedings of the 23rd International Conference on World Wide Web Companion, pages 833-838, Republic
and Canton of Geneva, Switzerland, 2014.


ManTIME [1] usage

an ML-based temporal expression extraction system

madden 2014 release date
madden <TIMEX3 value="2014-XX-XX" type="DATE">2014</TIMEX3> release date

drudge report 2013 september
drudge report <TIMEX3 value="2013-09-XX" type="DATE">2013 september</TIMEX3>

[1] M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and
normalization in the TempEval-3 challenge. In Proceedings of SemEval 2013, pages 53-57,
Atlanta, USA, June 2013. ACL.


trigger classes
feature selection: RELIEF algorithm
BOW representation
4 dictionaries (1 per class)

PAST: ancient, days, death, did, history, last, months, ...
RECENCY: actual, cost, costs, current, daily, day, direction, ...
FUTURE: agenda, calendar, chance, coming, dates, forecast, forthcoming, ...
ATEMPORAL: chords, lyrics

dictionary sizes shown: 21 triggers, 44 triggers, 2 triggers, 27 triggers
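The dictionary lookup can be sketched as follows. This is an illustration only: the dictionaries contain just the trigger words visible on the slide, and the function names, the `"none"` default and the choice to keep repeated classes in the footprint are assumptions, not the authors' implementation.

```python
from collections import Counter

# Partial dictionaries: only the trigger words visible on the slide.
TRIGGERS = {
    "past": {"ancient", "days", "death", "did", "history", "last", "months"},
    "recency": {"actual", "cost", "costs", "current", "daily", "day", "direction"},
    "future": {"agenda", "calendar", "chance", "coming", "dates", "forecast", "forthcoming"},
    "atemporal": {"chords", "lyrics"},
}


def trigger_classes(query):
    """Return the trigger class of each matching token, in query order."""
    found = []
    for token in query.lower().split():
        for label, words in TRIGGERS.items():
            if token in words:
                found.append(label)
    return found


def most_frequent_trigger_class(query, default="none"):
    """Most frequent class among the matched triggers, or `default` if none match."""
    classes = trigger_classes(query)
    if not classes:
        return default
    return Counter(classes).most_common(1)[0][0]


def trigger_footprint(query):
    """Matched classes joined with '-' in order of appearance."""
    return "-".join(trigger_classes(query))
```

For instance, `trigger_footprint("weather forecast last days")` yields `future-past-past` with these partial dictionaries.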


attributes

#  | Attribute description                     | Sparsity | Example input (query/time)     | Attribute value
1  | Is it a Wikipedia page title?             | 2        | New York Times                 | YES
2  | Does it contain a temporal expression?    | 2        | june 2013 movies               | YES
3  | Submission's term                         | 3        | Feb 28, 2013 GMT+0             | B
4  | Submission's trimester                    | 4        | Aug 26, 2013 GMT+0             | M2
5  | Timing                                    | 4        | Movies 2012, Feb 28, 2013      | past
6  | Most frequent trigger class               | 5        | peso dollar exchange rate      | present
7  | Wh type                                   | 5        | how did hitler die             | how
8  | Most frequent TempoWordNet class          | 5        | current stock prices           | present
9  | Most frequent POS tag tense               | 7        | what is stop kony 2012         | VBZ
10 | Most frequent coarse-grained POS tag      | 8        | kony 2012 fake                 | N
11 | Trigger classes footprint                 | 11       | what was I thinking lyrics     | past-atemporal
12 | Temporal distance between submission and query | 16  | fathers day 2010, Feb 28, 2013 | 36.0
13 | Tenses footprint                          | 18       | when does fall start           | VBZ-VB
14 | Ordered TempoWordNet classes              | 18       | the last song                  | past-future-present
15 | Most frequent fine-grained POS tag        | 21       | kony 2012 fake                 | NN
16 | Coarse-grained POS tag ordered footprint  | 119      | when is labour day             | N-W-V
17 | Fine-grained POS tag ordered footprint    | 202      | when is labour day             | NN-WRB-VBZ
18 | Coarse-grained POS tag footprint          | 204      | when is labour day             | W-V-N-N
19 | Fine-grained POS tag footprint            | 265      | when is labour day             | WRB-VBZ-NN-NN

Sparsity is measured on the full data set: training + test
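The POS tag footprints for "when is labour day" can be reproduced with a short sketch. The rules used here are inferred from that single example and are assumptions: a footprint is taken to be the tag sequence in order of appearance, an ordered footprint the distinct tags sorted by decreasing frequency, and a coarse-grained tag the first letter of the fine-grained one. The tags themselves are taken from the slide rather than computed with a tagger.

```python
from collections import Counter


def coarse(tag):
    """Map a fine-grained tag to a coarse one by keeping its first letter (assumption)."""
    return tag[0]


def footprint(tags):
    """Tags joined in order of appearance."""
    return "-".join(tags)


def ordered_footprint(tags):
    """Distinct tags, most frequent first; ties keep first-appearance order."""
    return "-".join(tag for tag, _ in Counter(tags).most_common())


# Fine-grained tags for "when is labour day", as listed on the slide.
fine = ["WRB", "VBZ", "NN", "NN"]
coarse_tags = [coarse(tag) for tag in fine]
```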


run 1: minimal

classifier:
SVM with polynomial kernel

* default parameters (C and gamma)


run 2: intermediate

classifier:
SVM with polynomial kernel

* default parameters (C and gamma)


run 3: full

classifier:
Random Forests

1000 random trees


results (submitted runs)

[bar chart: accuracy (%) of the Minimal, Intermediate and Full runs and of the 1st ranked system; value labels shown: 55.00%, 61.33%, 66.33%]


results: 5 x 10-fold cross-validation

[bar chart: accuracy (%) of the Minimal, Intermediate and Full runs and of the 1st ranked system; value labels shown: 73.18%, 77.95%, 78.33%]
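A sketch of what a 5 x 10 protocol could look like, reading it as 5 repetitions of a shuffled 10-fold split. This is plain, non-stratified splitting written with the standard library; `repeated_kfold`, the seed and the item count of 100 (80 + 20 training instances) are illustrative assumptions, and the authors' tooling is not shown on the slides.

```python
import random


def repeated_kfold(n_items, k=10, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)
        # Every k-th shuffled index goes to the same fold.
        folds = [order[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx


splits = list(repeated_kfold(100))
```

Averaging the accuracy over the 50 resulting train/test pairs gives one cross-validated estimate per run.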


a posteriori fix

[bar chart: accuracy (%) of the Minimal, Intermediate and Full runs plus a "Minimal fixed" bar annotated "best combination of attributes"; value labels shown: 55.00%, 61.33%, 66.33%, 72.33%]

how to reach the peak?



confusion matrix (minimal run)

[table: classes Recency, Past, Future, Atemporal against "Classified as" Recency, Past, Future, Atemporal; cell values shown: 43, 21, 11, 60, 38, 35, 61]
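As a worked check, accuracy is the sum of the diagonal of a confusion matrix divided by the number of instances. The sketch below assumes that 43, 60, 35 and 61 are the four diagonal cells (the cell positions did not survive) and that the matrix covers the 300 test instances; under those assumptions the result coincides with the 66.33% reported among the submitted runs.

```python
# Assumed diagonal cells (correctly classified counts); positions are not recoverable.
diagonal = [43, 60, 35, 61]
n_test = 300  # size of the test set

accuracy = sum(diagonal) / n_test
print(f"{accuracy:.2%}")  # 66.33%
```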



difficult queries
iPhone 5 release date

it can be FUTURE or PAST according to the submission time

keywords don't help here

2061: Odyssey Three

keywords can lie!

season 2 dexter

use of external sources of knowledge

difficult queries
iPhone 5 release date

it can be FUTURE or PAST

keywords don't help here

Ventura Stern 2016

keywords could possibly lie

season 2 dexter

use of external sources of knowledge


temporal footprint
a continuous period on the time-line that temporally
defines the existence of a particular concept.

Source: Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of
the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland,
August 2014), ACL.


online material

Source: http://www.cs.man.ac.uk/~filannim/projects/temporalia/


Thank you

QUESTIONS

Contact:

filannim@cs.man.ac.uk


Machine
Learning

Natural Language
Processing

Statistics

Semi-structured
data

Text
Mining

Linguistics

Parallel
computing


the task
Temporal aspects of events provide a natural
mechanism for organising information

source: written texts


goal: a (machine-understandable)
temporal representation of the texts

easy for people


hard for machines

linguistic key concepts


temporal expressions: phrases denoting a temporal entity such as an interval or a time point

01/05/2014, March 15, the next week, Saturday, at that time, yesterday, 5 o'clock, 3 days, every 4 hours

events: phrases denoting eventualities and states

inflected verbs and nouns: spoken, deliver, will be published

links: temporal relations between two phrases

BEFORE, AFTER, INCLUDES, ENDS, DURING, BEGINS

Source: ISO-TimeML (ISO/TC37/SC 4 N412 ), rev. 12, 2007


example

Yesterday, Deutsche Bank released a note saying


that China's current economic policies would result in an
enormous surge in coal consumption over the next
decade.

Source: CNN news article published on 28th February 2010.


example: temporal expressions


value: 2010-02-27
type: DATE

Yesterday(T), Deutsche Bank released a note saying


that China's current economic policies would result in
an enormous surge in coal consumption over the next
decade(T).
value: P10Y
type: DURATION

Source: CNN news article published on 28th February 2010.
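The value 2010-02-27 follows from anchoring "Yesterday" on the utterance date, 28 February 2010. A minimal sketch of that step is given below; the function name and the three-word lookup table are illustrative assumptions and cover only a tiny fraction of deictic expressions.

```python
from datetime import date, timedelta


def resolve_deictic(expression, utterance):
    """Resolve a few deictic expressions against the utterance (document creation) date."""
    offsets = {"yesterday": -1, "today": 0, "tomorrow": 1}
    key = expression.lower()
    if key not in offsets:
        return None  # anything else is out of scope for this sketch
    return (utterance + timedelta(days=offsets[key])).isoformat()


print(resolve_deictic("Yesterday", date(2010, 2, 28)))  # 2010-02-27
```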


example: events
class: REPORTING
class: OCCURRENCE

Yesterday(T), Deutsche Bank released(E) a note saying(E)


that China's current economic policies would result(E) in
an enormous surge(E) in coal consumption over the next
decade(T).
class: OCCURRENCE
class: OCCURRENCE

Source: CNN news article published on 28th February 2010.


example: links
is included

is included

Yesterday(T), Deutsche Bank released(E) a note saying(E)


that China's current economic policies would result(E) in
after
an enormous surge(E) in coal consumption over the next
decade(T).
is included

Source: CNN news article published on 28th February 2010.


example: ISO-TimeML output


<TimeML xsi:noNamespaceSchemaLocation="http://timeml.org/timeMLdocs/TimeML_1.2.1.xsd">
<DOCID>nyt_20100228_china_pollution</DOCID>
<DCT><TIMEX3 functionInDocument="CREATION_TIME" tid="t0" type="DATE"
value="2010-02-28">2010-02-28</TIMEX3></DCT>
<TITLE>As Pollution Worsens in China, Solutions Succumb to Infighting</TITLE>
<TEXT>
<TIMEX3 tid="t1" type="DATE" value="2010-02-27">Yesterday</TIMEX3>, Deutsche Bank <EVENT
class="OCCURRENCE" eid="e1">released</EVENT> a note <EVENT class="REPORTING" eid="e2">saying</EVENT>
that China's <TIMEX3 tid="t2" type="DATE" value="PRESENT_REF">current</TIMEX3> economic policies
would <EVENT class="OCCURRENCE" eid="e3">result</EVENT> in an enormous <EVENT class="OCCURRENCE"
eid="e4">surge</EVENT> in coal <EVENT class="OCCURRENCE" eid="e5">consumption</EVENT> over <TIMEX3
tid="t3" type="DURATION" value="P10Y">the next decade</TIMEX3>.
</TEXT>
<TLINK eventInstanceID="ei1" lid="l52" relType="IS_INCLUDED" relatedToTime="t1"/>
<TLINK eventInstanceID="ei4" lid="l53" relType="IS_INCLUDED" relatedToTime="t2"/>
<TLINK eventInstanceID="ei2" lid="l54" relType="IS_INCLUDED" relatedToTime="t1"/>
<TLINK eventInstanceID="ei4" lid="l59" relType="AFTER" relatedToEventInstance="ei1"/>
</TimeML>


visual representation

released,
saying

27 Feb. 2010 now

2020

surge

Utterance time: 28th February 2010.


TempEval-3 results

Research group                            | Identification Prec. | Rec. | F1   | Normalisation accuracy | Overall score
The University of Heidelberg              | 0.93 | 0.88 | 0.9  | 0.86 | 0.776
US Naval Academy                          | 0.89 | 0.91 | 0.9  | 0.79 | 0.71
The University of Manchester              | 0.95 | 0.85 | 0.9  | 0.77 | 0.69
Stanford University                       | 0.89 | 0.91 | 0.9  | 0.75 | 0.674
AT&T Lab Research                         | 0.98 | 0.75 | 0.85 | 0.77 | 0.656
University of Colorado Boulder            | 0.94 | 0.87 | 0.9  | 0.72 | 0.647
Jadavpur University                       | 0.93 | 0.8  | 0.86 | 0.74 | 0.638
Katholieke Universiteit Leuven            | 0.93 | 0.76 | 0.84 | 0.75 | 0.63
Joint Research Centre European Commission | 0.9  | 0.8  | 0.85 | 0.68 | 0.582

legend: Rule-based, Machine learning-based


model selection
93 features, 4 models:

M1: morpho-lexical only
M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet

*5x10-fold cross validation

Source: Filannino, M., and Nenadic G. ManTIME: Temporal expression extraction with
systematic feature type selection and a posteriori label adjustment. Journal of Information
Processing and Management: Special Issue on Time and Information Retrieval, (2014),
Elsevier. (under review)

Better software, better research



temporal footprint

A temporal footprint is a continuous period


on the time-line that temporally defines
the existence of a particular concept.

Source: Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of
the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland,
August 2014), ACL.


evaluation
subjects: people who lived between 1000 AD and 2014

text from Wikipedia web pages

year of birth and death from DBpedia

228,824 people collected


simple definition of temporal footprint

birth and death dates



results
Galileo Galilei (1564-1642), prediction: 1556-1654

Error: 0.204


results
Computer (1940-today), prediction: 1882-1982

Source: http://www.cs.man.ac.uk/~filannim/projects/temporal_footprints/


application?

Source: http://start.csail.mit.edu/answer.php?query=


i2b2 shared Task 12


ADMISSION DATE: 2011-02-06;
DISCHARGE DATE: 2011-02-08;
HISTORY OF PRESENT ILLNESS: Mr. Pohl is a 53 - year-old male with history of alcohol use
and hypertension. Blood alcohol level was 383. Agitated in emergency room requiring 4
leather restraints, received 5 mg of Haldol, 2 mg of Ativan. He became hypotensive in the
emergency room with a systolic blood pressure in the 80 's and had decreased respiratory
rate. He received a normal saline bolus of 2 litres of good blood pressure response. The
patient was then admitted to the medical Intensive Care Unit for observation and then
transferred to our service on medicine when the blood pressures remained stable
overnight...

[timeline figure: dates 06/02/2011, 07/02/2011, 08/02/2011; tracks: General, Tests, Treatments, Problems; event labels: admission, transfer, discharge, BAL 383, SBP ~80, SBP stable, Haldol 4mg, Ativan 2mg, Saline bolus 2l, blood pressure medications, hypotensive, decreased respiratory rate, hands tremor, stable, improved]

Source: Kovačević, A., Dehghan, A., Filannino, M., Keane, J. A., and Nenadic, G. Combining
rules and machine learning for extraction of temporal expressions and events from clinical
narratives. Journal of the American Medical Informatics Association (2013).


clinical data
disease progression modelling

analysis of the effectiveness of treatments

extraction of patients' clinical pathway

1st year backup

identification techniques

[timeline figure, 2000 to 2013]

resources and challenges: TimeML (standard), TimeBank (corpus), ACE-2004 dev & eval (TERN2004 corpus), TempEval Task#15 (in SemEval07), TempEval-2 Task#13 (in SemEval10), TempEval-3 Task#1 (in SemEval13)

techniques: Hand grammar approach (rule-based), SVM (machine learning), Maximum Entropy Class. (machine learning), Conditional Random Fields (machine learning), Markov logic network (machine learning)


scientific interest
temporal expressions AND clinical

[bar chart: Google Scholar results per year, 2000 to 2011, y-axis from 0 to 70; bar labels shown: 10, 16, 15, 12, 18, 25, 38, 43, 46, 46, 49, 61]

Source: Google Scholar (27/02/2012)


conferences & journals

SemEval: International Workshop on Semantic Evaluation

TempEval: Temporal Evaluation Task

TIME: International Symposium on Temporal Representation and Reasoning

JAMIA: Journal of the American Medical Informatics Association
COLING: International Conference on Computational Linguistics
IJHI: International Journal of Health Informatics


ISO-TimeML
DATE: [YYYY-MM-DD]
TIME: [date]T[hh:mm:ss]
DURATION: P[[n][Y/M/D/w/h/m/s]]
SET: R[n][set]


temporal forms
time or date references: 11pm, February 14th
time references that anchor on another time: one hour after midnight
durations: two days, five years
recurring times: twice in the hour
context-dependent times: today, last year
vague references: the near future
times indicated by an event: the day after Silvio Berlusconi resigned

J. Poveda, M. Surdeanu, and J. Turmo, An analysis of Bootstrapping for the Recognition of
Temporal Expressions, 2009


temporal binding
fully-qualified: no reference to any other temporal entity

March 15, 2001

deictic: reference to the time of utterance

today, yesterday, three weeks ago, last Thursday

anaphoric: reference to a temporal expression previously evoked in the text

March 15, the next week, Saturday, at that time

D. Ahn, S. Fissaha Adafre, and M. de Rijke, Towards Task-Based Temporal Extraction and
Recognition, 2005


NorMA architecture
design, implement and evaluate a novel:

identification architecture

normalisation architecture

investigate the difference between the general and clinical domains

investigate the use of the proposed frameworks in the general domain

suggest a more temporally-aware error measure for the normalisation phase


clinical NorMA architecture




clinical NorMA pipeline




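example of clinical rule

import re

# Excerpt from the body of a normalisation rule. add_date, expression,
# raw_expression and reference_date come from the surrounding system
# and are not shown on the slide.
pattern = re.findall(
    r"^(?:the |her |his |their )?([0-9][0-9])(?:st|nd|rd |th)"
    r"(?:post-|post|day)? ?(?:pod| operative |op| hospital |hsp|day|hd)(?:ly)?"
    r"(?:day|night|afternoon)?$", raw_expression)
if pattern:
    # shift the reference date by the matched number of days
    value = add_date(reference_date, int(pattern[0]))
    return expression, "DATE", value, "postoperative_literals3"

returned tuple: temporal expression, type, ISO-8601 representation (value), rule name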
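A self-contained sketch in the spirit of the rule above. The regular expression is a simplified stand-in rather than the slide's pattern, `add_date` is a hypothetical helper that shifts a date by a number of days, and treating the reference date as the day of the operation is an assumption made for illustration.

```python
import re
from datetime import date, timedelta

# Simplified stand-in pattern: "<n>st/nd/rd/th post(-)operative day".
POD = re.compile(
    r"^(?:the |her |his |their )?(\d{1,2})(?:st|nd|rd|th) post-?\s?operative day$"
)


def add_date(reference_date, days):
    """Hypothetical helper: shift a date by `days` and return it in ISO-8601 form."""
    return (reference_date + timedelta(days=days)).isoformat()


def postoperative_day(raw_expression, reference_date):
    """Return (expression, type, value, rule name), or None if the rule does not fire."""
    match = POD.match(raw_expression.lower().strip())
    if match is None:
        return None
    value = add_date(reference_date, int(match.group(1)))
    return raw_expression, "DATE", value, "postoperative_day_sketch"
```

Returning the rule name alongside the value keeps track of which rule produced each normalisation, which is what a per-rule activation count needs.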


Rules activation distribution





example: raw text


Admission Date :
02/01/2002
Discharge Date :
02/08/2002
HISTORY OF PRESENT ILLNESS :
Saujule Study is a 77-year-old woman with a history of obesity and
hypertension who presents with increased shortness of breath x 5
days. Her shortness of breath has been progressive over the last
2-3 years. On admission , she was diuresed with Lasix and was
negative 1-2 liters per day for several days.

Source: i2b2 2012 clinical corpus


example: identification
Admission Date :
02/01/2002
Discharge Date :
02/08/2002
HISTORY OF PRESENT ILLNESS :
Saujule Study is a 77-year-old woman with a history of obesity and
hypertension who presents with increased shortness of breath x 5
days. Her shortness of breath has been progressive over the last
2-3 years. On admission , she was diuresed with Lasix and was
negative 1-2 liters per day for several days.

Source: i2b2 2012 clinical corpus


example: normalisation

<TIMEX3 type="DATE" val="2002-02-01" mod="NA">02/01/2002</TIMEX3>


<TIMEX3 type="DATE" val="2002-02-08" mod="NA">02/08/2002</TIMEX3>
<TIMEX3 type="DURATION" val="P5D" mod="NA">5 days</TIMEX3>
<TIMEX3 type="DURATION" val="P2.5Y" mod="APPROX">2-3 years</TIMEX3>
<TIMEX3 type="DURATION" val="P3D" mod="APPROX">several days</TIMEX3>

Source: i2b2 2012 clinical corpus
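Some of these normalisations can be reproduced with a few lines of Python. The sketch below is not the system's rule set: it assumes US-style MM/DD/YYYY dates (consistent with 02/08/2002 mapping to 2002-02-08), takes the midpoint of a numeric range and marks it APPROX, and leaves vague quantities such as "several days" unhandled. The `normalise` name and the `UNITS` table are illustrative.

```python
import re
from datetime import datetime

UNITS = {"day": "D", "week": "W", "month": "M", "year": "Y"}


def normalise(expression):
    """Return (type, val, mod) for a few simple patterns, or None."""
    text = expression.strip().lower()
    # Dates written as MM/DD/YYYY.
    if re.fullmatch(r"\d{2}/\d{2}/\d{4}", text):
        parsed = datetime.strptime(text, "%m/%d/%Y")
        return "DATE", parsed.strftime("%Y-%m-%d"), "NA"
    # Durations such as "5 days" or "2-3 years".
    m = re.fullmatch(r"(\d+)(?:-(\d+))? (day|week|month|year)s?", text)
    if m:
        low = int(m.group(1))
        high = int(m.group(2)) if m.group(2) else low
        amount = (low + high) / 2  # midpoint of the range
        mod = "APPROX" if high != low else "NA"
        return "DURATION", f"P{amount:g}{UNITS[m.group(3)]}", mod
    return None
```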


2nd year backup

ml-driven identification phase


Conditional Random Fields

Features: harvested from the literature

Tagging scheme: BIO (beginning, inside, outside)

Factor graph:


factor graph

... was | discovered | in | 1977 | , | Feynman | immediately ...
w-3 | w-2 | w-1 | w0 | w+1 | w+2 | w+3

Source: Richard P. Feynman's page
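The BIO tagging scheme and the seven-token window around w0 can be illustrated with a short sketch. It shows only how labels and context tokens might be laid out for a sequence labeller; the function names, the span representation and the padding token are assumptions, and no CRF is trained here.

```python
def bio_tags(tokens, spans):
    """Label tokens with B/I/O given (start, end) token spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"  # beginning of a temporal expression
        for i in range(start + 1, end):
            tags[i] = "I"  # inside the same expression
    return tags


def window(tokens, centre, size=3, pad="<PAD>"):
    """Tokens at offsets -size..+size around position `centre`, keyed w-3 .. w0 .. w+3."""
    features = {}
    for offset in range(-size, size + 1):
        key = f"w{offset:+d}" if offset else "w0"
        position = centre + offset
        features[key] = tokens[position] if 0 <= position < len(tokens) else pad
    return features


tokens = ["was", "discovered", "in", "1977", ",", "Feynman", "immediately"]
tags = bio_tags(tokens, [(3, 4)])  # "1977" is the only temporal expression
context = window(tokens, 3)
```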


unique values per feature

[bar chart: number of unique values per feature, y-axis from 0 to 12000; the features on the x-axis are listed below]


phon_length
phon_last_phoneme
phon_form
phon_first_phoneme
lex_pnp
lex_is_space
lex_chunk
temp_year
temp_weekday
temp_time
temp_temporal_prepositions
temp_temporal_conjunctives
temp_temporal_co-reference
temp_temporal_adverbs
temp_temporal_adjectives
temp_signal
temp_season
temp_present_ref
temp_pod
temp_period
temp_past_ref
temp_ordinal
temp_number
temp_month
temp_modifier
temp_literal_number
temp_fuzzy_quantifier
temp_future_ref
temp_festivity
temp_digit
temp_compound
temp_cardinal
lex_unusual
lex_last_s
lex_is_upper
lex_is_title
lex_is_numeric
lex_is_lower
lex_is_digit
lex_is_decimal
lex_is_alpha
lex_is_alnum
lex_is_all_digits_and_dots
lex_is_all_caps_and_dots
lex_has_symbols
lex_has_digit
lex_first_upper
gaz_stopword
gaz_male_names
gaz_festivities
gaz_female_names
TIMEX3 (class)
lex_polarity
gaz_uscities
gaz_nationalities
gaz_iso_countries
gaz_countries
lex_tense
lex_token_with_no_letters_and_numbers
lex_treetagger_pos
lex_pattern
lex_vocal_pattern
lex_extended_pattern
lex_token_with_no_letters
lex_suffix
lex_lancaster_stem
lex_prefix
lex_treetagger_lemma
lex_porter_stem
lex_lemma
_word_preprocessed
_word


Post-processing analysis


Temporal

Galileo Galilei
(1564-1642)

ManTIME
wikipedia pages
using dates only

Dante
(1265-1321)

gaussian fit

to be improved


ManTIME architecture

feature type selection


93 features

morpho-lexical, syntactic, gazetteers and WordNet

4 models

M1: morpho-lexical only
M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet

model selection


model selection result

M1: morpho-lexical only
M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet

That means Unisys must pay about $100 million in interest every
quarter, on top of $27 million in dividends on preferred stock.

Silver + Gold, 5x10-fold cross validation


TempEval-3

temporal information extraction challenge
organised every 3 years in SemEval (ACL)

Corpus            | # documents | # words | annotation source | purpose
AQUAINT           | 73          | 33,973  | experts           | training
TimeBank          | 183         | 61,418  | experts           | training
TempEval-3 silver | 2,452       | 666,309 | systems           | training
TempEval-3 eval   | 20          | 6,375   | experts           | testing

Source: TempEval-3 challenge; Corpora released in October 2012 (except the eval).


identification post-processing

Probabilistic correction module (PCM)
BIO fixer
Threshold-based label switcher (TbLS)

pipeline: CRFs, PCM, BIO fixer, TbLS, BIO fixer

Silver + Gold; 4x10-fold cross validation


TempEval-3: results (Task A)

Identification (strict and lenient matching), normalisation accuracy (Type, Value) and overall score

Training data (post-processing) | Strict Prec | Rec  | F1   | Lenient Prec | Rec  | F1   | Type | Value | Overall score
Human&Silver (no)               | 0.79 | 0.64 | 0.7  | 0.97 | 0.79 | 0.87 | 0.89 | 0.77 | 0.672
Human&Silver (yes)              | 0.8  | 0.66 | 0.72 | 0.97 | 0.8  | 0.88 | 0.87 | 0.76 | 0.667
Human (no)                      | 0.76 | 0.64 | 0.7  | 0.95 | 0.8  | 0.87 | 0.87 | 0.77 | 0.675
Human (yes)                     | 0.79 | 0.7  | 0.74 | 0.95 | 0.85 | 0.9  | 0.86 | 0.77 | 0.69
Silver (no)                     | 0.78 | 0.63 | 0.7  | 0.97 | 0.8  | 0.87 | 0.89 | 0.77 | 0.672
Silver (yes)                    | 0.82 | 0.66 | 0.73 | 0.98 | 0.79 | 0.88 | 0.91 | 0.78 | 0.683

investigate semi-supervised techniques
approach the normalisation phase in a novel way
investigate the differences between general and clinical domain
investigate the use of the proposed framework to other domains
suggest a more temporally-aware error measure in the normalisation

Source: M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and
normalization in the TempEval-3 challenge. Proceedings of the Seventh International Workshop on
Semantic Evaluation (SemEval 2013), pages 53-57, Atlanta, Georgia, USA, June 2013. ACL.


TempEval-3: ranking (Task A)

Identification (strict and lenient matching), normalisation accuracy (Type, Value) and overall score

System (best run only) | Strict Prec | Rec  | F1   | Lenient Prec | Rec  | F1   | Type | Value | Overall score
HeidelTime             | 0.84 | 0.79 | 0.81 | 0.93 | 0.88 | 0.9  | 0.91 | 0.86 | 0.776
NavyTime               | 0.79 | 0.8  | 0.8  | 0.89 | 0.91 | 0.9  | 0.89 | 0.79 | 0.71
ManTIME                | 0.79 | 0.7  | 0.74 | 0.95 | 0.85 | 0.9  | 0.86 | 0.77 | 0.69
SUTime                 | 0.79 | 0.8  | 0.8  | 0.89 | 0.91 | 0.9  | 0.89 | 0.75 | 0.674
ATT                    | 0.91 | 0.7  | 0.79 | 0.98 | 0.75 | 0.85 | 0.91 | 0.77 | 0.656
ClearTK                | 0.86 | 0.8  | 0.83 | 0.94 | 0.87 | 0.9  | 0.93 | 0.72 | 0.647
JU-CSE                 | 0.82 | 0.7  | 0.75 | 0.93 | 0.8  | 0.86 | 0.87 | 0.74 | 0.638
KUL                    | 0.77 | 0.63 | 0.69 | 0.93 | 0.76 | 0.84 | 0.89 | 0.75 | 0.63
FSS-TimEx              | 0.52 | 0.46 | 0.49 | 0.9  | 0.8  | 0.85 | 0.81 | 0.68 | 0.582

Source: Naushad UzZaman, Hector Llorens, Leon Derczynski, James Allen, Marc Verhagen,
and James Pustejovsky. Semeval-2013 task 1: Tempeval-3: Evaluating time expressions,
events, and temporal relations. Proceedings of the Seventh International Workshop on
Semantic Evaluation (SemEval 2013), pages 1-9, Atlanta, Georgia, USA, June 2013. ACL.

presentation: NTCIR-11 Temporalia

TempEval-3: results (Task A)


investigate semi-supervised techniques
Normalisation
approach
the
normalisation
phase
in
a
novel
way

Training data (postaccuracy


strict matching
lenient matching
Identification

processing)

Overall
score

Prec
Rec
F1
Prec
Rec
F1
Type
Value
investigate
the
dierences
between
general
and
clinical

Human&Silver (no)

0.79

0.64

0.7

0.97

0.79

0.87

0.89

0.77

0.672

Human&Silver (yes)

0.8

0.66

0.72

0.97

0.8

0.88

0.87

0.76

0.667

Human (no)

0.76

0.64

0.7

0.95

0.8

0.87

0.87

0.77

0.675

0.78

0.63

0.7

0.97

0.8

0.87

0.89

0.77

0.672

0.66

0.73

0.98

0.79

0.88

0.91

0.78

0.683

domain

use of0.74
the proposed
other0.69
0.79 the 0.7
0.95
0.85 framework
0.9
0.86 to 0.77
investigate

Human (yes)
Silver (no)

domains0.82

Silver (yes)

suggest a more temporally-aware error measure in the


normalisation
Source: M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and
normalization in the TempEval-3 challenge. Proceedings of the Seventh International Workshop on
Semantic Evaluation (SemEval 2013), pages 5357, Atlanta, Georgia, USA, June 2013. ACL.

11/12/2014, Tokyo

76 /25

presentation: NTCIR-11 Temporalia

TempEval-3: ranking (Task A)


                        Identification
System                  Strict matching           Lenient matching          Normalisation accuracy   Overall
(best run only)         Prec    Rec     F1        Prec    Rec     F1        Type    Value            score
HeidelTime              0.84    0.79    0.81      0.93    0.88    0.90      0.91    0.86             0.776
NavyTime                0.79    0.80    0.80      0.89    0.91    0.90      0.89    0.79             0.71
ManTIME                 0.79    0.70    0.74      0.95    0.85    0.90      0.86    0.77             0.69
SUTime                  0.79    0.80    0.80      0.89    0.91    0.90      0.89    0.75             0.674
ATT                     0.91    0.70    0.79      0.98    0.75    0.85      0.91    0.77             0.656
ClearTK                 0.86    0.80    0.83      0.94    0.87    0.90      0.93    0.72             0.647
JU-CSE                  0.82    0.70    0.75      0.93    0.80    0.86      0.87    0.74             0.638
KUL                     0.77    0.63    0.69      0.93    0.76    0.84      0.89    0.75             0.63
FSS-TimEx               0.52    0.46    0.49      0.90    0.80    0.85      0.81    0.68             0.582

Source: Naushad UzZaman, Hector Llorens, Leon Derczynski, James Allen, Marc Verhagen,
and James Pustejovsky. Semeval-2013 task 1: Tempeval-3: Evaluating time expressions,
events, and temporal relations. Proceedings of the Seventh International Workshop on
Semantic Evaluation (SemEval 2013), pages 1-9, Atlanta, Georgia, USA, June 2013. ACL.
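
The F1 columns on the ranking slide can be sanity-checked from the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name and the choice of three systems are illustrative only. Because the published inputs are already rounded to two decimals, a recomputed value may occasionally differ from the slide by 0.01 for other rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) under strict matching, copied from the ranking slide
strict = {
    "HeidelTime": (0.84, 0.79),
    "ManTIME": (0.79, 0.70),
    "FSS-TimEx": (0.52, 0.46),
}

# Recompute F1 and round to the two decimals used on the slide
f1_by_system = {name: round(f1_score(p, r), 2) for name, (p, r) in strict.items()}
print(f1_by_system)
```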


feature type selection


93 features

morpho-lexical, syntactic, gazetteers and WordNet

4 models

M1: morpho-lexical only


M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet

model selection

model selection result

M1: morpho-lexical only


M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet

That means Unisys must pay about $100 million in interest every
quarter, on top of $27 million in dividends on preferred stock.

Silver + Gold, 5x10-fold cross validation
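
As a rough illustration of what "5x10-fold cross validation" involves, the sketch below generates five independent shuffles of the data, each split into ten folds, giving 50 train/test splits. It is a minimal stdlib-only sketch: the function name, the seed, and the item count of 100 are assumptions, and the slides do not say whether splitting was done per document or per sentence, or whether folds were stratified.

```python
import random


def repeated_kfold(n_items, k=10, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` shuffles of k folds each."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)
        # Deal the shuffled indices into k folds of near-equal size
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
            yield train, test


# 5 repeats x 10 folds over a hypothetical collection of 100 items
splits = list(repeated_kfold(n_items=100))
print(len(splits))
```

Each model (M1 to M4) would be trained and scored once per split, and the scores averaged over all splits.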


TempEval-3

temporal information extraction challenge
organised every 3 years in SemEval (ACL)

Corpus              # documents   # words    Annotation source   Purpose
AQUAINT             73            33,973     experts             training
TimeBank            183           61,418     experts             training
TempEval-3 silver   2,452         666,309    systems             training
TempEval-3 eval     20            6,375      experts             testing

Source: TempEval-3 challenge; Corpora released in October 2012 (except the eval).


identification post-processing

Probabilistic correction module


BIO fixer
Threshold-based label switcher

CRFs -> PCM -> BIO fixer -> TbLS -> BIO fixer

Silver + Gold; 4x10-fold cross validation
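
The slide names a BIO fixer but does not list its rules. The sketch below shows one common repair under that caveat: an "I" (inside) label that does not follow a "B" or another "I" is promoted to "B", so every tagged span has an explicit beginning. The function name and the single-letter labels are assumptions for illustration, not the module used in the system.

```python
def fix_bio(labels):
    """Repair a BIO label sequence so that every span starts with 'B'."""
    fixed = []
    previous = "O"
    for label in labels:
        # An 'I' right after 'O' (or at sequence start) has no opening 'B'
        if label == "I" and previous == "O":
            label = "B"
        fixed.append(label)
        previous = label
    return fixed


# Two orphan 'I' labels (positions 1 and 7) are promoted to 'B'
repaired = fix_bio(["O", "I", "I", "O", "B", "I", "O", "I"])
print(repaired)
```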


3rd year backup

why is it challenging?

1. Matt exercised during his lunch break.


2. He stretched, lifted weights, and ran.
3. He showered, got dressed and returned to work.

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


linguistic knowledge
1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


linguistic knowledge
1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised
stretch, lift, run

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


linguistic knowledge
1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised

shower, dress, return

stretch, lift, run

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


common sense knowledge


1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised

shower, dress, return

stretch, lift, run

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


common sense knowledge


1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised

shower

dress

return

stretch, lift, run

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


domain knowledge
1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.

lunch break
exercised
stretch

run

stretch

shower

dress

return

lift

Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012


Temporal footprint

results

Robin Williams (1951 - 2014), prediction: 1953-2006

E: 0.159


other types of temporal footprint?

HA!

Christopher Columbus will die in 2057 ?!

Prediction: 1366-2057 (1451-1506), E: 0.92


physical existence vs. social coverage

Anne Frank's footprint is shifted into the future


Temporalia

data
training set: 100 queries
benchmark test set: 300 queries

Query                        Submission date   Class
Movies 2012                  Feb 28, 2013      past
Upcoming Movies in 2013      Jan 1, 2013       future
2013 MLB Playoff Schedule    Jan 1, 2013       future
current price of gold        Feb 28, 2013      present
Amazon Deal of the Day       Feb 28, 2013      present
Number of Neck Muscles       Feb 28, 2013      atemporal


attributes
ID   Query attribute
1    Is it a Wikipedia page title?
2    Does the query contain a temporal expression?
3    Submission's term
4    Submission's trimester
5    Timing
6    Most frequent trigger class
7    Wh type
8    Most frequent TempoWordNet class
9    Most frequent POS tag tense
10   Most frequent coarse-grained POS tag
11   Trigger classes footprint
12   Temporal between submission and query
13   Tenses footprint
14   Ordered TempoWordNet classes
15   Most frequent fine-grained POS tag
16   Coarse-grained POS tag ordered footprint
17   Fine-grained POS tag ordered footprint
18   Coarse-grained POS tag footprint
19   Fine-grained POS tag footprint

Submitted runs: Minimal, Intermediate, Full


error measure

gold
prediction

union

overlap

Fatima De Carvalho. 1996. Histogrammes et indices de proximité en analyse données symboliques.
Actes de l'école d'été sur l'analyse des données symboliques. LISE-CEREMADE, Université de Paris IX Dauphine, pages 101-127.
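
The slide gives only the ingredients of the error measure (gold, prediction, union, overlap), not the formula. The sketch below assumes E = 1 - overlap / union over year intervals, which reproduces the two errors reported on the temporal footprint slides (0.159 for 1951-2014 against 1953-2006, and 0.92 for 1451-1506 against 1366-2057). The function name and the handling of zero-length intervals are assumptions.

```python
def footprint_error(gold, predicted):
    """Return 1 - overlap/union for two year intervals given as (start, end)."""
    g_start, g_end = gold
    p_start, p_end = predicted
    # Length of the shared stretch; zero when the intervals do not intersect
    overlap = max(0, min(g_end, p_end) - max(g_start, p_start))
    # Total covered length (for disjoint intervals, the sum of both lengths)
    union = (g_end - g_start) + (p_end - p_start) - overlap
    if union == 0:
        # Both intervals are single points: perfect only if they coincide
        return 0.0 if gold == predicted else 1.0
    return 1 - overlap / union


robin_williams = footprint_error(gold=(1951, 2014), predicted=(1953, 2006))
columbus = footprint_error(gold=(1451, 1506), predicted=(1366, 2057))
print(round(robin_williams, 3), round(columbus, 2))
```

Under this reading, 0 means the predicted footprint matches the gold interval exactly and 1 means the two do not overlap at all.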
